Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

How can I get only part of the HTML source with Python Selenium?

When I use driver.page_source I get the source of the full page. Is there any way to get only a specific part of the HTML?

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

chrome_options = webdriver.ChromeOptions()

# Start Chrome using the chromedriver binary at the given path
driver = webdriver.Chrome(executable_path="/selenium/chromedriver", options=chrome_options)
driver.get("https://news.creaders.net/us/2021/01/27/2315313.html")

# page_source returns the HTML of the entire page
content = driver.page_source

This gives me the HTML of the whole page.

But I only need the HTML inside <div id="newsContent"> </div>:

<div id="newsContent">

<p></p><p>content</p><p style="text-align: center;"><img src="https://pub.creaders.net/upload_files/image/202101/20210127_16117914118079.png" title="20210127_16117914118079.png" alt="image.png"></p>

</div>

Question from: https://stackoverflow.com/questions/65930322/how-to-use-python-selenium-get-partial-html-source


1 Reply


Try parsing the HTML returned by driver.page_source with BeautifulSoup, then select the div by its id.

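from bs4 import BeautifulSoup

# content is the driver.page_source string from the question
soup = BeautifulSoup(content, 'html.parser')
div = soup.find('div', id='newsContent')

# Join the div's children to get its inner HTML as one string
print(''.join(map(str, div.contents)))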
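
As an alternative that skips the extra parser, Selenium can hand back an element's markup directly: locate the div, then read its innerHTML attribute. The sketch below is only one way to do it. The helper name get_inner_html is hypothetical, and the locator string "id" is the value behind Selenium's By.ID constant, written out here so the snippet needs no imports (in real code you would normally pass By.ID).

```python
def get_inner_html(driver, element_id):
    """Return the markup inside the element with the given id.

    `driver` is assumed to be a Selenium WebDriver, or any object that
    exposes the same find_element / get_attribute interface.
    """
    # "id" is the string behind selenium.webdriver.common.by.By.ID
    element = driver.find_element("id", element_id)
    # innerHTML excludes the element's own tag; "outerHTML" would include it
    return element.get_attribute("innerHTML")
```

With the driver from the question this would be called as content = get_inner_html(driver, "newsContent"). If no element with that id is present when the call runs, Selenium raises NoSuchElementException, so the page needs to have finished loading first.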
