php - PHP regex: match everything inside a div

Question

I'm trying to find everything inside a div using a regular expression. I know there is probably a smarter way to do this, but I chose regex.

So currently my regex pattern looks like this:

$gallery_pattern = '/<div class="gallery">([\s\S]*)<\/div>/';

And it does the trick, somewhat.

The problem is when I have two divs one after another, like this:

<div class="gallery">text to extract here</div>
<div class="gallery">text to extract from here as well</div>

I want to extract the information from both divs, but when I test it I don't get the text of each div as a separate result. Instead I get something like this:

"text to extract here </div>  
<div class="gallery">text to extract from here as well"

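To sum up: it skips the first closing tag of the div and carries on to the next one. The text inside the div can contain <, / and line breaks, just so you know!

Does anyone have a simple solution to this problem? I'm still a regex novice.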

score 12 · Accepted Answer

You shouldn't use regex to parse HTML when there's a convenient DOM library available:

$str = '
<div class="gallery">text to extract here</div>
<div class="gallery">text to extract from here as well</div>
';

$doc = new DOMDocument();
$doc->loadHTML($str);
$divs = $doc->getElementsByTagName('div');

// DOMNodeList exposes its size as ->length; count() only works on it from PHP 7.2 on.
if ( $divs->length ) {
    foreach ( $divs as $div ) {
        echo $div->nodeValue . '<br>';
    }
}
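The snippet above returns every div in the document, not only the gallery ones. As a minimal sketch, assuming you only want `<div class="gallery">` elements and that the contents may include markup, DOMXPath can do the filtering. The extra `other` div, the `<b>` tag, and the variable names `$galleryTexts` and `$galleryHtml` are made up for illustration.

```php
<?php
// Sketch: select only <div class="gallery"> elements with DOMXPath.
$str = '
<div class="gallery">text to extract here</div>
<div class="other">not wanted</div>
<div class="gallery">text with <b>markup</b>
and a line break</div>
';

$doc = new DOMDocument();
// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($str);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Match the "gallery" class token even if the attribute lists several classes.
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " gallery ")]';

$galleryTexts = []; // plain text of each gallery div
$galleryHtml  = []; // inner HTML of each gallery div
foreach ($xpath->query($query) as $div) {
    $galleryTexts[] = $div->textContent;

    // Rebuild the inner HTML by serializing each child node.
    $inner = '';
    foreach ($div->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    $galleryHtml[] = $inner;
}

print_r($galleryTexts);
print_r($galleryHtml);
```

Use `textContent` when only the text matters, and the serialized children when the tags inside the div need to be preserved.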

score 9 · Accepted Answer

How about something like this:

$str = <<<HTML
<div class="gallery">text to extract here</div>
<div class="gallery">text to extract from here as well</div>
HTML;

$matches = array();
preg_match_all('#<div[^>]*>(.*?)</div>#s', $str, $matches);

var_dump($matches[1]);

Note the '?' in the regex: it makes the quantifier "not greedy".

Which will get you:

array
  0 => string 'text to extract here' (length=20)
  1 => string 'text to extract from here as well' (length=33)

This should work fine... as long as you don't have nested divs. If you do... well... are you really sure you want to use regular expressions to parse HTML, which is not that regular itself?
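To make the greedy versus non-greedy difference and the nested-div caveat concrete, here is a small sketch. It applies the pattern from the question, with and without the `?`, to the question's input, then to a hypothetical nested example (`$nested` is made up for illustration).

```php
<?php
$str = <<<HTML
<div class="gallery">text to extract here</div>
<div class="gallery">text to extract from here as well</div>
HTML;

// Greedy: [\s\S]* runs to the LAST </div>, so both divs end up in one capture.
preg_match('/<div class="gallery">([\s\S]*)<\/div>/', $str, $greedy);

// Lazy: [\s\S]*? stops at the FIRST </div> after each opening tag.
preg_match_all('/<div class="gallery">([\s\S]*?)<\/div>/', $str, $lazy);

// Nested divs: the lazy match still stops at the first </div>,
// so the outer div's content is cut short.
$nested = '<div class="gallery">outer <div>inner</div> tail</div>';
preg_match_all('/<div class="gallery">([\s\S]*?)<\/div>/', $nested, $truncated);

var_dump($greedy[1], $lazy[1], $truncated[1]);
```

With the nested input, the only capture is `outer <div>inner`: the regex cannot tell which `</div>` closes which `<div>`. That is the case where a DOM parser is the safer choice.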

score 0 · Accepted Answer

A possible answer to this problem can be found at http://simplehtmldom.sourceforge.net/. That class helped me solve a similar problem quickly.
