php - simplehtmldom: extracting URLs and matching a pattern

Question

I have the following code, which extracts all hyperlinks (URLs) from a given web page.

<?php include "simple_html_dom.php";

$html=new simple_html_dom();
$html->load_file('http://www.indiastudychannel.com/colleges');
$links=$html->find('a');

foreach($links as $l)
{

    $path='http://www.indiastudychannel.com/colleges'.$l->href;
    //doScrape($path);
    echo $path."<br>";
}
?>

The code above extracts every hyperlink, but I only want the links that follow this pattern:

http://www.indiastudychannel.com/colleges/54499-Godavari-College-Nursing.aspx
http://www.indiastudychannel.com/colleges/54489-Rvs-College-Arts-And-Science.aspx
http://www.indiastudychannel.com/colleges/54488-Sankara-Institute-Management.aspx

I know this can be done with a regular expression, but I don't know exactly how. Please give an example so it is easy to follow.

score 0 · Accepted Answer

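Try

preg_match('#^http://www\.indiastudychannel\.com/colleges/\d+-[A-Za-z0-9-]+\.aspx$#', $path);

Using # as the delimiter means the slashes in the URL do not need escaping, and \d+ matches any numeric ID rather than only 54489. preg_match returns 1 when $path matches the pattern and 0 when it does not.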

Also take a look at

http://php.net/manual/en/function.preg-match.php

and

http://weblogtoolscollection.com/regex/regex.php
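
To make the idea concrete, here is a minimal sketch of filtering a list of URLs down to the ones that look like the three examples in the question. The function name filterCollegeLinks and the sample $urls array are made up for illustration, and the pattern assumes every wanted link has the form <digits>-<Name>.aspx directly under /colleges/; adjust it if the real links differ.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): keeps only URLs shaped like
// http://www.indiastudychannel.com/colleges/<digits>-<Name>.aspx
function filterCollegeLinks(array $urls): array
{
    // '#' is the delimiter, so the slashes in the URL need no escaping.
    // Assumption: the name part contains only letters, digits and hyphens.
    $pattern = '#^http://www\.indiastudychannel\.com/colleges/\d+-[A-Za-z0-9-]+\.aspx$#';

    // preg_grep returns the entries that match (keeping their original keys);
    // array_values reindexes the result from 0.
    return array_values(preg_grep($pattern, $urls));
}

// Sample input, invented for this example.
$urls = [
    'http://www.indiastudychannel.com/colleges/54499-Godavari-College-Nursing.aspx',
    'http://www.indiastudychannel.com/colleges/',
    'http://www.indiastudychannel.com/colleges/54489-Rvs-College-Arts-And-Science.aspx',
    'http://www.indiastudychannel.com/about.aspx',
];

foreach (filterCollegeLinks($urls) as $url) {
    echo $url, "\n";
}
```

In the loop from the question, the same check could be applied to each $path before calling doScrape($path). Note that $l->href may be relative or already absolute depending on the page, so it is worth echoing a few values first and building the full URL accordingly before matching.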
