Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


C# - How can I get a List<string> of images from a website link?

I want to build a List of images from a website link and then save/download the images to the hard disk.

This is how I'm getting the links from a website/URL:

private List<string> getLinks(HtmlAgilityPack.HtmlDocument document)
{
    List<string> mainLinks = new List<string>();
    var linkNodes = document.DocumentNode.SelectNodes("//a[@href]");
    if (linkNodes != null)
    {
        foreach (HtmlNode link in linkNodes)
        {
            var href = link.Attributes["href"].Value;
            // Keep only links that start with http://, https://, or www.
            if (href.StartsWith("http://") || href.StartsWith("https://") || href.StartsWith("www."))
            {
                mainLinks.Add(href);
            }
        }
    }
    return mainLinks;
}

Then I'm using it in this test function to build a List of all the links from the URL:

private List<string> test(string url, int levels, DoWorkEventArgs eve)
{
    HtmlWeb hw = new HtmlWeb();
    List<string> webSites;

    try
    {
        doc = hw.Load(url);
        webSites = getLinks(doc);
        //retrieveImages();

So webSites, which is a List, will hold all the links from the main website URL. For example, if the URL is google.com, then webSites will have 19 items, each one a link from google.com.

And retrieveImages() is:

private void retrieveImages()
{
    var nodes = doc.DocumentNode.SelectNodes("//img");
    foreach (var node in nodes)
    {
        List<string> images = new List<string>();
        images.Add(node.Name);
    }
}

retrieveImages() is certainly not good code, and it does not work. If I remove the // and call retrieveImages(), nothing happens and the webSites List is empty.

What I want retrieveImages() to do is build a list of the images on the site/link I am currently on, and then download the images in that List to the hard disk.


1 Reply

// Requires: using System.Collections.Generic; using HtmlAgilityPack;
private List<string> retrieveImages()
{
    List<string> imgList = new List<string>();
    HtmlDocument doc = new HtmlDocument();
    doc.Load("file.htm"); // or whatever HTML file you have

    // Select only <img> elements that carry a src attribute.
    HtmlNodeCollection imgs = doc.DocumentNode.SelectNodes("//img[@src]");
    if (imgs == null) return imgList; // SelectNodes returns null when nothing matches

    foreach (HtmlNode img in imgs)
    {
        HtmlAttribute src = img.Attributes["src"];
        if (src == null)
            continue;

        imgList.Add(src.Value);
        // Do something with src.Value, such as getting the image and saving it locally:
        // Image image = GetImage(src.Value);
        // image.Save(aLocalFilePath);
    }
    return imgList;
}

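// Requires: using System.Drawing;
// The url must be absolute; WebRequest.Create cannot resolve a relative src value.
private Image GetImage(string url)
{
    System.Net.WebRequest request = System.Net.WebRequest.Create(url);

    // Dispose the response and its stream even if decoding throws.
    using (System.Net.WebResponse response = request.GetResponse())
    using (System.IO.Stream responseStream = response.GetResponseStream())
    using (Bitmap streamed = new Bitmap(responseStream))
    {
        // A Bitmap built from a stream needs that stream open for its whole lifetime,
        // so return an independent copy before the stream is closed.
        return new Bitmap(streamed);
    }
}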
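
One thing the code above does not cover: src values are often relative (for example /logo.png), while GetImage needs an absolute URL. The following is only a minimal sketch of how the list could be resolved against the page URL and then saved to disk. The class ImageDownloadSketch and its methods ResolveImageUrls and DownloadAll are hypothetical names that do not appear in the original post, and the sketch assumes the page URL that was passed to HtmlWeb.Load is still available. It uses WebClient to match the WebRequest-era style of the answer; newer .NET versions mark both as obsolete in favor of HttpClient.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;

public static class ImageDownloadSketch
{
    // Hypothetical helper: turns raw src values into absolute http(s) URLs,
    // resolving relative values against the URL of the page they came from.
    public static List<string> ResolveImageUrls(string pageUrl, IEnumerable<string> srcValues)
    {
        var baseUri = new Uri(pageUrl);
        var result = new List<string>();
        foreach (string src in srcValues)
        {
            // An empty src would resolve to the page itself, so skip it.
            if (string.IsNullOrWhiteSpace(src))
                continue;

            Uri absolute;
            // Skip values that cannot be combined with the page URL.
            if (!Uri.TryCreate(baseUri, src, out absolute))
                continue;

            // Skip non-web schemes such as data: or javascript:.
            if (absolute.Scheme != Uri.UriSchemeHttp && absolute.Scheme != Uri.UriSchemeHttps)
                continue;

            result.Add(absolute.AbsoluteUri);
        }
        return result;
    }

    // Hypothetical helper: downloads every URL into targetFolder.
    public static void DownloadAll(IEnumerable<string> imageUrls, string targetFolder)
    {
        Directory.CreateDirectory(targetFolder);
        using (var client = new WebClient())
        {
            int index = 0;
            foreach (string url in imageUrls)
            {
                string name = Path.GetFileName(new Uri(url).LocalPath);
                if (string.IsNullOrEmpty(name))
                    name = "image";

                // Prefix an index so that equal file names do not overwrite each other.
                string path = Path.Combine(targetFolder, index + "_" + name);
                client.DownloadFile(url, path);
                index++;
            }
        }
    }
}
```

Assuming the list returned by retrieveImages() is stored in imgList and the page address in url, usage could look like ImageDownloadSketch.DownloadAll(ImageDownloadSketch.ResolveImageUrls(url, imgList), @"C:\images"), where the target folder is just an example path. Downloading the raw bytes with DownloadFile keeps the original file format, whereas going through GetImage and Image.Save re-encodes the picture.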
