Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c# - HTMLAgilityPack SelectNodes to select all <img> elements

I am making a project in C# that's basically an image screen scraper for an image-search-related game. I'm trying to use HTMLAgilityPack to select all the image elements and put them in an HtmlNodeCollection, like this:

//set up for checking autos

HtmlNodeCollection imgs = new HtmlNodeCollection(doc.DocumentNode.ParentNode);
imgs = doc.DocumentNode.SelectNodes("//img");

foreach (HtmlNode img in imgs)
{
    HtmlAttribute src = img.Attributes["@src"];
    urls.Add(src.Value);
}

Note that urls is a public List<string> field:

public List<string> urls = new List<string>();

My foreach loop is throwing an exception:

Object reference not set to an instance of an object.

Checking the autos, sure enough, imgs is null. Is there a better way to track down the source of this problem? I have no idea whether it's my XPath or something else.

The most frustrating part is that I had already gotten it to work, but messed up my file versions and lost my work. Derp.


1 Reply


You might have a typo in the following line, where the attribute name carries an XPath-style @ prefix:

HtmlAttribute src = img.Attributes["@src"];

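I got this to work for me (notice the @ position). Here @"src" is a C# verbatim string, so it is the same as "src", whereas "@src" looks up an attribute literally named @src and the indexer returns null: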

HtmlAttribute src = img.Attributes[@"src"];
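
As for imgs itself being null: as far as I know, SelectNodes returns null rather than an empty collection when the XPath matches nothing, so it is worth checking that the document actually loaded and contains <img> tags. Below is a minimal sketch of a more defensive version of the loop. The ImageScraper class and the GetImageUrls helper are hypothetical names used only for illustration, and it assumes the HTML is already available as a string:

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class ImageScraper
{
    // Hypothetical helper: collects the src value of every <img> in the given HTML.
    public static List<string> GetImageUrls(string html)
    {
        var urls = new List<string>();

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes yields null when no node matches, so guard before iterating.
        HtmlNodeCollection imgs = doc.DocumentNode.SelectNodes("//img");
        if (imgs == null)
        {
            return urls;
        }

        foreach (HtmlNode img in imgs)
        {
            // The attribute name is plain "src"; the "@" prefix belongs to XPath only.
            // GetAttributeValue returns the supplied default when the attribute is missing.
            string src = img.GetAttributeValue("src", string.Empty);
            if (!string.IsNullOrEmpty(src))
            {
                urls.Add(src);
            }
        }

        return urls;
    }
}
```

If this returns an empty list for a page you know has images, inspecting doc.DocumentNode.OuterHtml in the debugger should show whether the markup you expected was loaded at all.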
