I am trying to scrape a few sites. Here is my code:
for (var i = 0; i < urls.length; i++) {
    url = urls[i];
    console.log("Start scraping: " + url);
    page.open(url, function () {
        waitFor(function() {
            return page.evaluate(function() {
                return document.getElementById("progressWrapper").childNodes.length == 1;
            });
        }, function() {
            var price = page.evaluate(function() {
                // do something
                return price;
            });
            console.log(price);
            result = url + " ; " + price;
            output = output + "\n" + result;
        });
    });
}
fs.write('test.txt', output);
phantom.exit();
I want to scrape all sites in the array urls, extract some information and then write this information to a text file.
But there seems to be a problem with the for loop. When I scrape only one site without a loop, everything works as I want. With the loop, nothing happens at first, and then the line
console.log("Start scraping: " + url);
is printed, but one time too many.
If urls = [a, b, c], then PhantomJS prints:
Start scraping: a
Start scraping: b
Start scraping: c
Start scraping:
It seems that page.open isn't called at all.
I am new to JS, so I apologize if this is a basic question.
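A minimal sketch of what may be going on, assuming page.open is asynchronous and only registers its callback before returning. Here fakeOpen and pending are hypothetical stand-ins, not PhantomJS APIs, so the sketch runs in plain JavaScript without PhantomJS. It shows the loop finishing before any callback runs, and every callback seeing the last value of the shared url variable:

```javascript
// Hypothetical stand-in for page.open: it only queues the callback,
// the way an asynchronous call returns before its callback runs.
var pending = [];
function fakeOpen(url, callback) {
    pending.push(callback);
}

var urls = ["a", "b", "c"];
var log = [];
var url; // one shared variable, as in the loop above

for (var i = 0; i < urls.length; i++) {
    url = urls[i];
    log.push("Start scraping: " + url);
    fakeOpen(url, function () {
        // Runs later, after the loop has already moved on.
        log.push("callback sees: " + url);
    });
}
log.push("loop finished");

// Only now do the queued callbacks run.
pending.forEach(function (cb) { cb(); });

console.log(log.join("\n"));
// Start scraping: a
// Start scraping: b
// Start scraping: c
// loop finished
// callback sees: c
// callback sees: c
// callback sees: c
```

If the real script behaves the same way, fs.write and phantom.exit() after the loop would run before any page has finished loading, which would be consistent with the callbacks never printing a price.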