php - How to parse HTML content

Translated from: https://stackoverflow.com/questions/18282380 2013-08-16T21:28:01.787


I want to build a web crawler for news. I want to load the content from http://vnexpress.net/tin-tuc/ban-doc-viet/xa-hoi/chay-xe-may-theo-taxi-moi-biet-bi-chem-60-000-dong-2865724.html, get the div with the class fck_detail, and keep the original HTML tags inside it. How can I do this?

    <div class="fck_detail">
    <p class="Normal" style="text-align:justify;">Some texts</p>
    <p class="Normal" style="text-align:justify;">some texts</p>
    <p class="Normal" style="text-align:justify;">Some texts</p>
    <p class="Normal" style="text-align:justify;">Some texts</p>
    </div>

I tried the following, but it did not work:

    $doc = new DOMDocument();
    $doc->loadHTMLFile("http://example.com/some.html");
    $selector = new DOMXpath($doc);   
    $node = $selector->query('//div[@class="fck_detail"]')->item(0);
    echo trim($node->nodeValue);

The code above gives me only plain text with all the HTML stripped out. However, I want to keep the HTML.
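
A possible approach, offered as a minimal sketch rather than a confirmed answer: `nodeValue` returns only the text content of a node, which is why the tags disappear. `DOMDocument::saveHTML()` accepts a node argument (PHP 5.3.6 and later) and serializes that node with its markup. The inline `$html` string and the variable names `$outerHtml` and `$innerHtml` are assumptions made for illustration; they stand in for the real page so the example runs without network access.

```php
<?php
// Illustrative stand-in for the downloaded page (assumption: mirrors the markup in the question).
$html = '<html><body><div class="fck_detail">'
      . '<p class="Normal" style="text-align:justify;">Some texts</p>'
      . '<p class="Normal" style="text-align:justify;">some texts</p>'
      . '</div></body></html>';

// Real-world pages are rarely valid HTML; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$node = $xpath->query('//div[@class="fck_detail"]')->item(0);

$outerHtml = '';
$innerHtml = '';
if ($node !== null) {
    // Passing a node to saveHTML() serializes that node including its own tags.
    $outerHtml = $doc->saveHTML($node);

    // For the content without the wrapping <div>, serialize each child node and concatenate.
    foreach ($node->childNodes as $child) {
        $innerHtml .= $doc->saveHTML($child);
    }
}

echo trim($innerHtml), "\n";
```

To fetch a live page instead of the inline string, `$doc->loadHTMLFile($url)` can replace the `loadHTML()` call, provided `allow_url_fopen` is enabled. Note that `@class="fck_detail"` matches only when the class attribute is exactly that value; a div carrying additional classes would need a looser XPath test.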
