Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - RSelenium: hangs in navigate to direct pdf download

I am using RSelenium via Docker Toolbox for Windows with the selenium/standalone-firefox-debug container, and everything works fine:

docker run -d -v //c/test/://home/seluser/Downloads -p 4445:4444 -p 5901:5900 selenium/standalone-firefox-debug

I have set up a Firefox profile to download PDFs directly:

fprof <- makeFirefoxProfile(list(browser.startup.homepage = "about:blank"
                                 , startup.homepage_override_url = "about:blank"
                                 , startup.homepage_welcome_url = "about:blank"
                                 , startup.homepage_welcome_url.additional = "about:blank"
                                 , browser.download.dir = "/home/seluser/Downloads"
                                 , browser.download.folderList = 2L
                                 , browser.download.manager.showWhenStarting = FALSE
                                 , browser.download.manager.focusWhenStarting = FALSE
                                 , browser.download.manager.closeWhenDone = TRUE
                                 , browser.helperApps.neverAsk.saveToDisk = "application/pdf, application/octet-stream"
                                 , pdfjs.disabled = TRUE
                                 , plugin.scan.plid.all = FALSE
                                 , plugin.scan.Acrobat = 99L))

With the following code, navigating directly to the PDF downloads it to the specified directory, but the call then hangs and none of the subsequent code executes.

library(RSelenium)

remDr <- remoteDriver(remoteServerAddr = "*docker-ip*", port = 4445L, extraCapabilities = fprof)
remDr$open()
remDr$navigate("http://www.equibase.com/premium/eqbPDFChartPlus.cfm?RACE=A&BorP=P&TID=BEL&CTRY=USA&DT=09/12/2015&DAY=D&STYLE=EQB")

I have to stop the R code manually, and the error displayed is:

Error in checkError(res) : 
Undefined error in httr call. httr output: Operation was aborted by an application callback

If I VNC into the container and look at what is displayed in the browser, the file has downloaded but the address bar is empty.

Any ideas? I assume it has something to do with the httr/RSelenium packages not receiving some sort of 'loaded' signal from the browser, but this is beyond my troubleshooting ability. This method previously worked using the selenium-standalone-server .jar file with RSelenium.

The sessionInfo() and remDr$open() output is below:

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RSelenium_1.7.1

loaded via a namespace (and not attached):
 [1] httr_1.2.1     R6_2.2.0       assertthat_0.1 tools_3.3.2    wdman_0.2.2    binman_0.1.0  
 [7] curl_2.3       Rcpp_0.12.9    jsonlite_1.2   caTools_1.17.1 openssl_0.9.6  bitops_1.0-6  
[13] semver_0.2.0   XML_3.98-1.5  



> remDr$open()
[1] "Connecting to remote server"
$rotatable
[1] FALSE

$raisesAccessibilityExceptions
[1] FALSE

$firefoxOptions
$firefoxOptions$args
list()

$firefoxOptions$profile
[1] "UEsDBBQACAgIAEwPW0oAAAAAAAAAAAAAAAAIAAAAcHJlZnMuanOlkU9LAzEQxe+C36HsSaFmwVv1tNBjbwoey2wy242dZsJM0v36JqJYimWV3vLn/R4z72VF2UbB4a7phadyM5pAUo5m5ANG2GGzXDTQc05PPUHYN/fPtzf5BzuXb/mIIt7hNgv9l52QbDlfiRpwzifPAeZcvnd2PAVicMZ5qUhbbVtFqtp2/fWrs/jA5FA2XlNxeZxTHyCUyUviI09vI4aXupMPu8IOQIp/5Qe2Wa8xsMSK1WDNofadJF9iR6SI0sWoJmBputO9UTjiK6+97j/jjpG8hZp/G92wXJw+sE2YHjQJwuE8zSJ+19KAQk/ofh8jUt75YNRCMMXVGSC6sO2ptLPCPdRSVqsq+wBQSwcII+hBcQsBAAD3AgAAUEsDBBQACAgIAEwPW0oAAAAAAAAAAAAAAAAHAAAAdXNlci5qc51WTW/bMAy971cMOW3AKqTretlOXdcBA4Z1aFDsKMgSbauRJU0fcfPvR/mjSRNHbndKbJMS+fj4yOjBUeugfLconGnxiXhWQvdf6oo0TLXMAQHNCgVi8eFtyZSH91/exJ2nYAFtrHEhudTAVKj7Z4JGG8ln/DWE1rg1qUOwxNbS19uz9Nky788U6CrU6Pjx8vK52xiwAybwR0AAHkB8l86HK4yFK0C34OJhuKbBvB4pr51pgHrupA3URU2DbJLLxXL6osAKTxAOfauvlfEwnc1oLUyrlWEC79KsSsDWpv1Tg14hWgmpaXeLQdng02W0MYKpGexhE4xRnoBzxnGjvVH7cB+n72WljUbUGmgKcKvu0edz8eC9RKtgkAsOfETcSgyUcsd8nfdVUq+JsaApPAZwmqlUzFczqExlvYt6+rIWCuHkBp8Z54DljBoz90gHysEFP4nEU6Wkt4ptQdycL1e/DDInlfbTtDG+Erf6j9RYX3++JBIvMvd3P9FjwQoTw+dCMb1... <truncated>


$appBuildId
[1] "20170125094131"

$version
[1] ""

$platform
[1] "LINUX"

$proxy
named list()

$command_id
[1] 1

$nativeEvents
[1] TRUE

$specificationLevel
[1] 0

$acceptSslCerts
[1] FALSE

$processId
[1] 3012

$webdriver.remote.sessionid
[1] "6263b5ab-9375-425e-aa00-8fc632dc492e"

$browserVersion
[1] "51.0.1"

$platformVersion
[1] "4.4.47-boot2docker"

$XULappId
[1] "{ec8030f7-c20a-464f-9b0e-13a3a9e97384}"

$browserName
[1] "firefox"

$takesScreenshot
[1] TRUE

$javascriptEnabled
[1] TRUE

$takesElementScreenshot
[1] TRUE

$platformName
[1] "linux"

$cssSelectorsEnabled
[1] TRUE

$firefox_profile
[1] "UEsDBBQAAgAIAJRZW0oj6EFxCwEAAPcCAAAIAAAAcHJlZnMuanOlkU9LAzEQxe+C36HsSaFmwVv1tNBjbwoey2wy242dZsJM0v36JqJYimWV3vLn/R4z72VF2UbB4a7phadyM5pAUo5m5ANG2GGzXDTQc05PPUHYN/fPtzf5BzuXb/mIIt7hNgv9l52QbDlfiRpwzifPAeZcvnd2PAVicMZ5qUhbbVtFqtp2/fWrs/jA5FA2XlNxeZxTHyCUyUviI09vI4aXupMPu8IOQIp/5Qe2Wa8xsMSK1WDNofadJF9iR6SI0sWoJmBputO9UTjiK6+97j/jjpG8hZp/G92wXJw+sE2YHjQJwuE8zSJ+19KAQk/ofh8jUt75YNRCMMXVGSC6sO2ptLPCPdRSVqsq+wBQSwECHgAUAAIACACUWVtKI+hBcQsBAAD3AgAACAAAAAAAAAABACAAAAAAAAAAcHJlZnMuanNQSwUGAAAAAAEAAQA2AAAAMQEAAAAA"

$id
[1] "6263b5ab-9375-425e-aa00-8fc632dc492e"


1 Reply


I had the same problem using the most recent version of Firefox (51.0.1). This was on a Windows machine, and the issue seemed to be the pdfjs.disabled flag. The issue was not present in older versions of Firefox. The Docker image tagged 2.53.1 runs Firefox 47, for example. If possible, run an older version using (on a Linux box):

docker run -d -p 4445:4444 -p 5901:5900 -v /home/john/test:/home/seluser/Downloads selenium/standalone-firefox-debug:2.53.1

Running your code against that container, the download completes and the file appears in the mounted directory:

fprof <- makeFirefoxProfile(list(browser.startup.homepage = "about:blank"
                                 , startup.homepage_override_url = "about:blank"
                                 , startup.homepage_welcome_url = "about:blank"
                                 , startup.homepage_welcome_url.additional = "about:blank"
                                 , browser.download.dir = "/home/seluser/Downloads"
                                 , browser.download.folderList = 2L
                                 , browser.download.manager.showWhenStarting = FALSE
                                 , browser.download.manager.focusWhenStarting = FALSE
                                 , browser.download.manager.closeWhenDone = TRUE
                                 , browser.helperApps.neverAsk.saveToDisk = "application/pdf, application/octet-stream"
                                 , pdfjs.disabled = TRUE
                                 , plugin.scan.plid.all = FALSE
                                 , plugin.scan.Acrobat = 99L))
library(RSelenium)

remDr <- remoteDriver(port = 4445L, extraCapabilities = fprof)
remDr$open()
remDr$navigate("http://www.equibase.com/premium/eqbPDFChartPlus.cfm?RACE=A&BorP=P&TID=BEL&CTRY=USA&DT=09/12/2015&DAY=D&STYLE=EQB")

> list.files("/home/john/test/")
[1] "eqbPDFChartPlus.cfm"
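
The listing above is a one-off check. In a script, one option is to poll the mounted directory until the file shows up. The following is a minimal sketch, not part of RSelenium: wait_for_download() is a hypothetical helper, and it assumes Firefox marks an in-progress download with a temporary file ending in ".part" in the same directory.

```r
# Hypothetical helper: wait until `filename` exists in `dir`, is non-empty,
# and no "*.part" file (assumed to mark an unfinished Firefox download)
# remains in the directory. Returns TRUE on success, FALSE on timeout.
wait_for_download <- function(dir, filename, timeout = 30, interval = 0.5) {
  deadline <- Sys.time() + timeout
  target <- file.path(dir, filename)
  repeat {
    in_progress <- list.files(dir, pattern = "\\.part$")
    done <- file.exists(target) &&
      file.size(target) > 0 &&
      length(in_progress) == 0
    if (done) {
      return(TRUE)
    }
    if (Sys.time() >= deadline) {
      return(FALSE)
    }
    Sys.sleep(interval)
  }
}
```

With the volume mapping used above, this could be called as wait_for_download("/home/john/test", "eqbPDFChartPlus.cfm") right after remDr$navigate() returns.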

The PDF would need to be renamed (it is being saved with the name of the ColdFusion .cfm page that serves it).
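
The renaming can be done from R. This is a rough sketch under stated assumptions: rename_to_pdf() is a hypothetical helper, and it only renames the file when its first bytes are the "%PDF-" signature that PDF files start with, so an HTML error page saved under the same name is not relabelled by mistake.

```r
# Hypothetical helper: swap the extension of a downloaded file for ".pdf",
# but only if the file begins with the PDF signature "%PDF-".
rename_to_pdf <- function(path) {
  header <- readBin(path, what = "raw", n = 5L)
  if (!identical(header, charToRaw("%PDF-"))) {
    stop("Not a PDF file: ", path)
  }
  new_path <- paste0(tools::file_path_sans_ext(path), ".pdf")
  if (!file.rename(path, new_path)) {
    stop("Could not rename ", path)
  }
  new_path
}
```

For the download shown above, rename_to_pdf("/home/john/test/eqbPDFChartPlus.cfm") would leave eqbPDFChartPlus.pdf in the same directory.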

As to what is happening with more recent versions of Firefox, that would most likely need to be referred to the geckodriver project. Users of clients other than RSelenium have also had similar issues recently; see the question "Can't download PDF with selenium webdriver + firefox".

