Although it is a horrible solution (a workaround, really), I finally decided to disable the automatic loading of frames in HtmlUnit, as advised by one of the HtmlUnit developers. This is what I did in detail:
- Downloaded the HtmlUnit source
- Downloaded Maven
- Commented out the body (not the declaration) of the loadFrames() method of the HtmlPage class, located in htmlunit-2.9/src/main/java/com/gargoylesoftware/htmlunit/html
- Compiled this custom code skipping tests with:
mvn -Dmaven.test.skip=true clean compile package
- Took the new htmlunit-2.9.jar from htmlunit-2.9/artifacts and used it to replace the htmlunit-2.9.jar library file my application was using
- This step might be the most delicate one, as it depends on each application. However, I'll show you the changes I needed to make to mine.
You already know what my original code looked like (see the question): it downloaded every frame and iframe on the page. Here is an example of fetching a page with frames while loading only the frames you want:
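    try {
        // webClient is the WebClient instance from the question.
        // Load the outer page; with the patched jar its frames are left empty.
        HtmlPage page = webClient.getPage("http://www.w3schools.com/HTML/tryit.asp?filename=tryhtml_noframes");
        // Locate the one iframe we actually want.
        HtmlInlineFrame frame = page.getFirstByXPath("//iframe[@name='view']");
        if (frame != null) {
            // Fetch its content manually, resolving src against the page URL.
            page = webClient.getPage(page.getFullyQualifiedUrl(frame.getSrcAttribute()));
            System.out.println(page.asXml());
        }
    } catch (Exception e) {
        e.printStackTrace();
    }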
After this library change, the content of the frame will be empty once the getPage() method returns. Note that the frame itself won't be null; it looks like an empty frame is returned instead. So the frames we are interested in have to be downloaded manually, which is why I call getPage() a second time.
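The second getPage() call needs an absolute URL, which is why the frame's src attribute is passed through getFullyQualifiedUrl() first. As a rough, library-free illustration of that resolution step, here is a sketch using only java.net.URI. It is not HtmlUnit's actual implementation (which may, for instance, also take a base element into account), and the class name FrameUrlDemo and the sample src values are made up for the example:

```java
import java.net.URI;

public class FrameUrlDemo {
    // Resolve a frame's src attribute against the URL of the page that contains it.
    // Relative values are resolved against the page URL; absolute ones are kept as-is.
    public static String resolveFrameSrc(String pageUrl, String src) {
        return URI.create(pageUrl).resolve(src).toString();
    }

    public static void main(String[] args) {
        String pageUrl = "http://www.w3schools.com/HTML/tryit.asp?filename=tryhtml_noframes";
        // Hypothetical src values, one relative to the page and one relative to the host.
        System.out.println(resolveFrameSrc(pageUrl, "tryit_view.asp?filename=tryhtml_noframes"));
        System.out.println(resolveFrameSrc(pageUrl, "/other/frame.html"));
    }
}
```

The resulting string is what would then be handed to webClient.getPage() to download just that frame.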
This is how I managed to selectively download frames and iframes with HtmlUnit. Any ideas on how to improve it are appreciated. I hope HtmlUnit itself will eventually offer a way to disable frame loading, maybe through a method such as getPage(URL url, boolean downloadFrames) or something similar.
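Until something like that exists, the decision about which frames to load has to live in the application. Here is a minimal sketch of that selection logic, assuming the frame names and src values have already been read from the page (for example via XPath). FrameSelector, selectFrameUrls and the sample data are hypothetical names for illustration, not part of HtmlUnit, and the actual download would still be done with webClient.getPage():

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FrameSelector {
    // Return absolute URLs for only those frames whose name is in 'wanted',
    // keeping the order in which the frames were listed.
    public static List<String> selectFrameUrls(String pageUrl,
                                               Map<String, String> srcByFrameName,
                                               Set<String> wanted) {
        URI base = URI.create(pageUrl);
        List<String> urls = new ArrayList<String>();
        for (Map.Entry<String, String> entry : srcByFrameName.entrySet()) {
            if (wanted.contains(entry.getKey())) {
                urls.add(base.resolve(entry.getValue()).toString());
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        // Hypothetical frames found on a page: one we want, one we would rather skip.
        Map<String, String> frames = new LinkedHashMap<String, String>();
        frames.put("view", "tryit_view.asp?filename=tryhtml_noframes");
        frames.put("ads", "http://ads.example.com/banner.html");

        List<String> toFetch = selectFrameUrls(
                "http://www.w3schools.com/HTML/tryit.asp?filename=tryhtml_noframes",
                frames,
                Collections.singleton("view"));
        System.out.println(toFetch);
    }
}
```

Keeping the filter separate from the download means the list of wanted frame names can change without touching the code that calls HtmlUnit.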
Hope this helps someone out there!