c# - I have a link to a website. How do I download all the files from the website?

Question

Example: http://www.test.com. My program is a crawler that digs through the site, so it needs to download all of the files every time.

For example:

using (WebClient Client = new WebClient ())
{
    Client.DownloadFile("http://www.abc.com/file/song/a.mpeg", "a.mpeg");
}

This downloads only the one specific file, a.mpeg. I want to do something like this:

using (WebClient Client = new WebClient ())
{
    Client.DownloadFile(address, "*.*");
}

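The address changes all the time, and I want to download every file rather than one specific type such as mpeg, jpg, or avi, so the extension can be anything.

Is passing "*.*" the right way to do that?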

EDIT

This is how I download images today:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using HtmlAgilityPack;
using System.IO;
using System.Text.RegularExpressions;
using System.Xml.Linq;
using System.Net;
using System.Web;
using System.Threading;
using DannyGeneral;
using GatherLinks;

namespace GatherLinks
{
    class RetrieveWebContent
    {
        HtmlAgilityPack.HtmlDocument doc;
        string imgg;
        int images;

        public RetrieveWebContent()
        {
            images = 0;
        }

        public List<string> retrieveFiles(string address)
        {

        }

        public List<string> retrieveImages(string address)
        {

            System.Net.WebClient wc = new System.Net.WebClient();
            List<string> imgList = new List<string>();
            try
            {
                doc = new HtmlAgilityPack.HtmlDocument();
                doc.Load(wc.OpenRead(address));
                string t = doc.DocumentNode.InnerText;
                HtmlNodeCollection imgs = doc.DocumentNode.SelectNodes("//img[@src]");
                if (imgs == null) return new List<string>();

                foreach (HtmlNode img in imgs)
                {
                    if (img.Attributes["src"] == null)
                        continue;
                    HtmlAttribute src = img.Attributes["src"];
                    imgList.Add(src.Value);
                    if (src.Value.StartsWith("http") || src.Value.StartsWith("https") || src.Value.StartsWith("www"))
                    {
                        images++;
                        string[] arr = src.Value.Split('/');
                        imgg = arr[arr.Length - 1];
                        //imgg = Path.GetFileName(new Uri(src.Value).LocalPath);
                        //wc.DownloadFile(src.Value, @"d:\MyImages\" + imgg);
                        wc.DownloadFile(src.Value, "d:\\MyImages\\" + Guid.NewGuid() + ".jpg");
                    }
                }
                return imgList;
            }
            catch
            {
                Logger.Write("There Was Problem Downloading The Image: " + imgg);
                return null;

            }
        }
    }
}

At this place in the code:

public List<string> retrieveFiles(string address)
{

}

I want to download files of any type, not only jpg files. And if the link is, for example, http://tes.com \i.jpg, why do I need to parse the website instead of simply saving the file somehow?

score 3 · Accepted Answer

No, WebClient.DownloadFile will never behave like a crawler. You need to download the page, run a C# HTML parser over the returned HTML, enumerate the resources you are interested in, and then download each of them individually.
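
The steps described above can be sketched roughly as follows. This is only a minimal illustration, not a full crawler: to keep it free of external packages it uses a simple regular expression in place of a real HTML parser such as HtmlAgilityPack, so it will miss or misread links in unusual markup. The class name SiteFileDownloader and its method names are made up for this example, and it only looks at the single page it is given.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of any library.
static class SiteFileDownloader
{
    // Simplification: a regex stands in for a real HTML parser.
    // It captures the quoted value of every href="..." or src="..." attribute.
    static readonly Regex LinkPattern = new Regex(
        "(?:href|src)\\s*=\\s*[\"']([^\"']+)[\"']",
        RegexOptions.IgnoreCase);

    // Returns the absolute http/https URLs referenced by the page, without duplicates.
    public static List<Uri> ExtractResourceUris(string html, Uri pageUri)
    {
        var seen = new HashSet<string>();
        var result = new List<Uri>();
        foreach (Match m in LinkPattern.Matches(html))
        {
            string raw = WebUtility.HtmlDecode(m.Groups[1].Value.Trim());

            // Resolve relative links such as "song/a.mpeg" against the page address.
            Uri absolute;
            if (!Uri.TryCreate(pageUri, raw, out absolute)) continue;

            // Skip mailto:, javascript: and other non-web schemes.
            if (absolute.Scheme != Uri.UriSchemeHttp &&
                absolute.Scheme != Uri.UriSchemeHttps) continue;

            string key = absolute.GetLeftPart(UriPartial.Query); // drops any #fragment
            if (seen.Add(key)) result.Add(new Uri(key));
        }
        return result;
    }

    // Uses the last path segment as the local name, so the extension is
    // whatever the URL has. Falls back to a GUID when the path has no file name.
    public static string LocalFileName(Uri uri)
    {
        string name = Path.GetFileName(uri.LocalPath);
        return string.IsNullOrEmpty(name) ? Guid.NewGuid().ToString("N") : name;
    }

    // Downloads the page, enumerates its links, and saves each one individually.
    public static void DownloadAll(string address, string targetDirectory)
    {
        var pageUri = new Uri(address);
        Directory.CreateDirectory(targetDirectory);
        using (var client = new WebClient())
        {
            string html = client.DownloadString(pageUri);
            foreach (Uri resource in ExtractResourceUris(html, pageUri))
            {
                try
                {
                    string path = Path.Combine(targetDirectory, LocalFileName(resource));
                    client.DownloadFile(resource, path);
                }
                catch (WebException ex)
                {
                    // One failed link should not stop the remaining downloads.
                    Console.WriteLine("Skipped " + resource + ": " + ex.Message);
                }
            }
        }
    }
}
```

A call might look like SiteFileDownloader.DownloadAll("http://www.test.com", @"d:\MyFiles"), where the target folder is just a placeholder. Files that share a name overwrite one another in this sketch, and links to other HTML pages are saved like any other file rather than followed, so going deeper into the site would mean repeating the same steps on each page that is found.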
