Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

css - Use R to scrape information from interactive embedded popups in a dynamic website

I'm trying to scrape the information attached to the small yellow cars on a dynamic website: Trunk road gritter tracker

The site shows many yellow cars, and each car has its own information, such as 'Vehicle', 'Age(range)', 'source date', etc. When you click on a car, its information pops up.

However, I can only open the information bar for one car at a time, so each scrape returns the popup data for that one car only. How can I scrape the information bar for all cars?

In the initial state of the web page no car has been clicked, so my idea is to use remDr$findElement() to locate a car and then clickElement() to simulate a click. After the click a popup appears, and I use html_nodes() and html_text() to scrape its contents. After scraping, I simulate a click on the popup's close button. That completes one car; repeating the process for every car would give me the information for all of them.
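The click, scrape, close cycle described above can be sketched as a loop over every matched element. This is only a rough sketch, not a tested solution for this site: `scrape_all_popups` is a hypothetical helper, and `marker_xpath`, `value_css` and `close_css` are placeholders that have to be filled in after inspecting the live page. It relies on `findElements()` (plural), which returns a list of elements, and wraps each iteration in `tryCatch()` because a click on a map marker can fail when the element is covered or has gone stale.

```r
library(RSelenium)
library(rvest)

# Hypothetical helper: click every marker, read its popup, then close it.
# marker_xpath, value_css and close_css are assumptions, not values
# confirmed against the real page.
scrape_all_popups <- function(remDr, marker_xpath, value_css, close_css,
                              wait = 1) {
  # findElements() returns a list with one webElement per match
  markers <- remDr$findElements(using = "xpath", value = marker_xpath)
  results <- vector("list", length(markers))

  for (i in seq_along(markers)) {
    results[[i]] <- tryCatch({
      markers[[i]]$clickElement()
      Sys.sleep(wait)  # crude fixed wait for the popup to render

      # Parse the current DOM and extract the popup values
      page <- read_html(remDr$getPageSource()[[1]])
      values <- page %>% html_nodes(value_css) %>% html_text()

      # Close the popup if a close button is present
      close_btn <- remDr$findElements(using = "css selector", value = close_css)
      if (length(close_btn) > 0) close_btn[[1]]$clickElement()

      values
    }, error = function(e) NA_character_)  # keep going if one marker fails
  }
  results
}
```

With an open session it might be called as `scrape_all_popups(remDr, "//*[@id='TSWT_VehiclesAndTrail_2020_1393_layer']/*[name()='image']", ".attrValue", close_css)`, where `close_css` is whatever selector the popup's close button turns out to have. A fixed `Sys.sleep()` is the simplest way to wait; if the map pans or redraws after a click, the remaining elements may need to be looked up again.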

I used RSelenium and rvest, and here is my code:

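library(RSelenium)
library(rvest)

# Connect to a Selenium server that is already running locally
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4444L, browserName = "chrome")
remDr$open(silent = TRUE)
url <- "https://www.arcgis.com/apps/webappviewer/index.html?id=2de764a9303848ffb9a4cac0bd0b1aab"
remDr$navigate(url)

# Locate a car image inside the SVG layer and click it to open its popup
image_button <- remDr$findElement('xpath', value = "//*[@id='TSWT_VehiclesAndTrail_2020_1393_layer']/*[name()='image']")
image_button$clickElement()

# Parse the rendered page and pull the attribute values out of the popup
webpage <- read_html(remDr$getPageSource()[[1]])
data <- webpage %>% html_nodes(".attrValue") %>% html_text()
data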

However, there are some problems. I don't know how to use findElement() to locate the cars when many of them are on the same page. The structure of the car images is shown below. Although each image has its own XPath, the images sit in a nested structure, and searching for an image's XPath directly does not find the element. Following these two answers, Finding SVG Elements using RSelenium and XPath and trigger all pop-ups, I instead use remDr$findElement('xpath', value = "//*[@id='TSWT_VehiclesAndTrail_2020_1393_layer']/*[name()='image']"). But this only works when I zoom in until a single car is on screen; then I get the information successfully. When more than one car is on screen, nothing is returned. I also tried remDr$findElement()[[1]], but it still doesn't work.

<svg>
    <g id=xxx>
        <image></image>
        <image></image>
        <image></image>
            .
            .
            .
    </g>
</svg>
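To see how the `name()='image'` XPath behaves on a structure like this without a browser, the small offline check below uses the xml2 package on a made-up SVG fragment. The `id` and `href` values are placeholders, not taken from the real page. The sketch assumes the page's SVG elements live in the SVG namespace, which is the usual reason a plain `//image` step finds nothing while a `name()` test does.

```r
library(xml2)

# Made-up stand-in for the page's SVG layer; the id and href values are
# placeholders, not copied from the real site.
svg_text <- '<svg xmlns="http://www.w3.org/2000/svg">
  <g id="layer">
    <image href="a.png"/>
    <image href="b.png"/>
    <image href="c.png"/>
  </g>
</svg>'
doc <- read_xml(svg_text)

# A plain element step only matches elements in no namespace,
# so it does not see the namespaced <image> nodes.
plain <- xml_find_all(doc, "//image")

# name() compares the element's qualified name, so the namespace does
# not get in the way and every <image> child of the group is matched.
by_name <- xml_find_all(doc, "//*[@id='layer']/*[name()='image']")

length(plain)
length(by_name)
xml_attr(by_name, "href")
```

The same XPath therefore selects all the images in the group, not just one. In RSelenium, `findElement()` hands back only a single element for such a query, while `findElements()` returns a list with one entry per match, which is what a loop over all cars would need.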

How can I get all the different information from different popups at once? Is my idea feasible, or is there any other simpler way?

question from:https://stackoverflow.com/questions/65912925/use-r-to-scrape-information-from-interacitve-embedded-popups-in-dynamic-website


