Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

screen scraping - CasperJS click event having AJAX call

I am trying to fetch data from a site by simulating events using CasperJS with PhantomJS 1.7.0.

I am able to simulate normal click events and select events, but my code fails in the following scenario:

When I click a button, anchor, etc. on the remote page, the click triggers either a plain JS call or an AJAX call (depending on how the page's programmer implemented it).

In the case of a plain JS call, my code works and I get the changed data. But for clicks where an AJAX call is initiated, I do not get the updated data.

For debugging, I tried to get the page source of the container element before and after the click, but I see no change in the HTML.

I tried wait times ranging from 10 seconds down to 1 ms, but that too makes no difference in behavior.

Below is my code for clicking. I am using an array of CSS paths that identifies which element(s) to click.

/* Click each element in G_TAGS (an array of CSS paths), then collect the data. */
fn_click = function () {
    casper.each(G_TAGS, function (casper, cssPath, count1) {
        casper.then(function () {
            this.click(cssPath);

            this.echo('DEBUG AFTER CLICKING - START HTML');
            //this.echo(this.getHTML("CONTAINER WHERE DETAILS CHANGE"));
            this.echo('DEBUG AFTER CLICKING - END HTML');

            // Fixed 5-second pause, then schedule the step that reads the data.
            this.wait(5000, function () {
                this.then(fn_getData);
            });
        });
    });
};
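
For reference, the before/after comparison I described above looks roughly like the sketch below. It only uses `casper.getHTML`, `casper.click` and `casper.wait`; the function name `clickAndCompare` and the parameters `containerSelector`, `waitMs` and `report` are placeholders I made up for this illustration, not part of my real script.

```javascript
// Sketch: compare a container's HTML before and after a click.
// `containerSelector`, `waitMs` and `report` are hypothetical names.
function clickAndCompare(casper, cssPath, containerSelector, waitMs, report) {
    casper.then(function () {
        // Snapshot the container before the click.
        var before = casper.getHTML(containerSelector);

        casper.click(cssPath);

        // Fixed pause, then snapshot again and report whether anything changed.
        casper.wait(waitMs, function () {
            var after = casper.getHTML(containerSelector);
            report({
                cssPath: cssPath,
                changed: before !== after,
                before: before,
                after: after
            });
        });
    });
}
```

With the real page, `report` receives `changed: false` for the AJAX-driven links, whatever value I pass as `waitMs`.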

UPDATE:

I tried to use PhantomJS's remote debugging option to debug the above script, but it is not working. I am on Windows. I will try remote debugging on Ubuntu as well.

Please help me. I would appreciate any help on this.

UPDATE:

Please have a look at the following code as a sample:

https://gist.github.com/4441570

The content before and after the click is the same.

I am clicking the sorting options shown under a tag (votes, activity, etc.).


1 Reply


I had the same problem today. I found this post, which pointed me in the direction of jQuery.

After some testing I found out that the webpage had already loaded its own jQuery (a pretty old version, though).

Injecting another jQuery on top of that broke the page's own JS calls, including the link that makes the AJAX call.

To solve this I found http://api.jquery.com/jQuery.noConflict/

and I added the following to my code, right after injecting my jQuery:

    this.evaluate(function () { window.jq = $.noConflict(true); });

This restores whatever was previously assigned to $ (and, because of the true argument, to jQuery as well). The jQuery that you injected is now available as 'jq'.
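
To see why the page's own `$` comes back, here is a simplified model of what `noConflict(true)` does. It is not jQuery's actual source: `loadLibrary`, `win` and the `version` fields are names I made up for the illustration, and `win` just stands in for the page's `window`.

```javascript
// Simplified model of jQuery.noConflict(true); not jQuery's actual source.
// `win` stands in for the page's window object.
function loadLibrary(win, lib) {
    // Remember whatever the globals pointed to before this copy loaded.
    var previous$ = win.$;
    var previousJQuery = win.jQuery;

    lib.noConflict = function (deep) {
        // Hand `$` back to its previous owner if this copy still holds it.
        if (win.$ === lib) {
            win.$ = previous$;
        }
        // With deep === true, hand back `jQuery` as well.
        if (deep && win.jQuery === lib) {
            win.jQuery = previousJQuery;
        }
        return lib;
    };

    // Loading a copy takes over both globals.
    win.$ = win.jQuery = lib;
    return lib;
}

// The page already ships an old jQuery; the scraper injects a newer one on top.
var win = {};
var pageJQuery = loadLibrary(win, { version: 'old (page)' });
var injectedJQuery = loadLibrary(win, { version: 'new (injected)' });

// Equivalent of: jq = $.noConflict(true)
var jq = win.$.noConflict(true);
```

After the last line, `win.$` and `win.jQuery` point at the page's own copy again, so the page's click handlers keep working, while `jq` holds the injected copy.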

Hope this helps you guys.

