c# - Get values from a string using a specific condition

Question

I have HTML data in a string, and I need to get only the paragraph values from it. Below is a sample of the HTML.

<html>
  <head>
    <title>
       <script>
          <div>
               Some contents
           </div>
          <div>
            <p> This is what i want </p>
            <p> Select all data from p </p>
            <p> Upto this is required </p>
          </div>
         <div>
          Other html elements
         </div>

So how can I get the data from the paragraphs using string manipulation?

Required output

<div>
  <p> This is what i want    </p>
  <p> Select all data from p </p>
  <p> Upto this is required  </p>
</div>

score 1 · Accepted Answer

Give the div an ID, for example:

<div id="test">
<p> This is what i want </p>
<p> Select all data from p </p>
<p> Upto this is required </p>
</div>

Then use the XPath expression //div[@id='test']/p.

Breakdown of the solution:

//div          - All div elements
[@id='test']   - With an ID attribute whose value is test
/p             - The p elements that are direct children of that div
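To make the breakdown concrete, here is a minimal sketch that evaluates the same expression with the built-in System.Xml classes. It assumes the input is well-formed XML (every tag closed, a single root element), which the sample in the question is not. The class and method names are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XPathParagraphDemo
{
    // Returns the markup of every <p> that is a direct child of <div id="test">.
    // Assumption: the input is well-formed XML; LoadXml throws XmlException otherwise.
    public static List<string> SelectParagraphs(string xhtml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xhtml);

        List<string> result = new List<string>();
        // Same expression as in the answer: any div with id 'test', then its p children.
        XmlNodeList nodes = doc.SelectNodes("//div[@id='test']/p");
        foreach (XmlNode node in nodes)
        {
            result.Add(node.OuterXml);
        }
        return result;
    }
}
```

Paragraphs inside other div elements are not returned, because the [@id='test'] predicate filters them out before the /p step is applied.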

score 0 · Accepted Answer

I have used the Html Agility Pack for this kind of thing. You can then use LINQ to get what you need.
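A rough sketch of what the Html Agility Pack plus LINQ approach could look like. It assumes the HtmlAgilityPack package is referenced and that LINQ is available (.NET 3.5 or later, so it would not apply as written to a .NET 2.0 project). The class name, method name, and divId parameter are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class AgilityPackLinqDemo
{
    // Returns the trimmed text of each <p> directly inside the div with the given id.
    // Assumption: the target div carries an id attribute, as the accepted answer suggests.
    public static List<string> ParagraphTexts(string html, string divId)
    {
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html); // parses an HTML string, tolerating unclosed tags

        return doc.DocumentNode
            .Descendants("div")
            .Where(d => d.GetAttributeValue("id", "") == divId)
            .SelectMany(d => d.Elements("p"))
            .Select(p => p.InnerText.Trim())
            .ToList();
    }
}
```

Unlike an XML parser, the Agility Pack does not require the markup to be well-formed, which is why it is usually suggested for HTML like the sample in the question.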

score 0 · Accepted Answer

XPath is the obvious answer (provided the HTML is decent, has a root element, and so on); failing that, use a third-party widget such as Chilkat.
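One way to check the "decent HTML" condition before choosing an approach is to see whether the markup parses as XML at all. The following is only a sketch using the built-in System.Xml parser; the class and method names are hypothetical.

```csharp
using System.Xml;

public static class WellFormedCheck
{
    // Returns true when the markup parses as XML, meaning plain
    // System.Xml XPath queries can be run against it directly.
    public static bool IsWellFormedXml(string markup)
    {
        try
        {
            new XmlDocument().LoadXml(markup);
            return true;
        }
        catch (XmlException)
        {
            // Unclosed tags, multiple roots, and similar problems end up here.
            return false;
        }
    }
}
```

If this returns false, as it would for the sample in the question with its unclosed tags, a tolerant HTML parser is the safer choice.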

score 0 · Accepted Answer

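If you use HtmlAgilityPack as described in the other posts, you can get all the paragraph elements in the HTML with the following:

// Requires: using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
// LoadHtml parses an HTML string; Load expects a file path or a stream.
doc.LoadHtml("your html string");
var pNodes = doc.DocumentNode.SelectNodes("//div[@id='id of the div']/p");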

Because you are using .NET Framework 2.0, you will need an older version of the Agility Pack. It is available from HTMLAgilityPack.

If you only need the text inside the paragraphs, you can use:

var pNodes = doc.DocumentNode.SelectNodes("//div[@id='id of the div']/p/text()");
