php - Trimming hyperlinks out of a text block using PHP

Question

I have the following HTML on a web page.

<p>This is a <a href="http://www.google.com/">hyperlink</a> and this is another <a href="http://www.bing.com/">hyperlink</a>. There are many like it, but <a href="http://en.wikipedia.org/wiki/Full_Metal_Jacket">this one is mine</a>.</p>

Now, I was wondering...

Is there a way to split this text block into an array using a PHP function, like this?

$html[0] = "<p>This is a and this is another . There are many like it, but .</p>";
$html[1] = "http://www.google.com/";
$html[2] = "http://www.bing.com/";
$html[3] = "http://en.wikipedia.org/wiki/Full_Metal_Jacket";

So basically, strip all the hyperlinks out of the original block of text (which becomes the first element) and store each link address in its own array element.

Thanks for any help with this.

score 1 · Accepted Answer

Use this regular expression to get the URLs from the HTML.

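<?php
  $url = "http://www.example.net/somepage.html";
  // Fetch the page; stop with a message if it cannot be read.
  $input = @file_get_contents($url) or die("Could not access file: $url");
  // Group 2 captures the href value, group 3 the link text.
  $regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
  // s: dot matches newlines, i: case-insensitive, U: quantifiers are ungreedy by default.
  if(preg_match_all("/$regexp/siU", $input, $matches)) {
    // $matches[2] = array of link addresses
    // $matches[3] = array of link text - including HTML code
  }
?>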
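
The snippet above only collects the link addresses. To get the array layout asked for in the question, the same pattern could also be passed to preg_replace to strip the links from the text. The following is a minimal sketch of that idea, not part of the original answer: the variable name $pattern and the way the array is assembled are assumptions, and the input is the paragraph from the question rather than a fetched page.

```php
<?php
// Sample input copied from the question (assumption: used instead of a fetched page).
$input = '<p>This is a <a href="http://www.google.com/">hyperlink</a> and this is another <a href="http://www.bing.com/">hyperlink</a>. There are many like it, but <a href="http://en.wikipedia.org/wiki/Full_Metal_Jacket">this one is mine</a>.</p>';

// Same regular expression as in the answer, with delimiters and flags attached.
$pattern = "/<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>/siU";

$html = array();
if (preg_match_all($pattern, $input, $matches)) {
    // Element 0: the original text with every matched link removed.
    $html[] = preg_replace($pattern, '', $input);
    // Elements 1..n: the link addresses, in the order they appear.
    foreach ($matches[2] as $href) {
        $html[] = $href;
    }
}

print_r($html);
```

Note that the pattern expects href values that are double-quoted or unquoted; a single-quoted value would be captured with its quotes still attached.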
