Description
This regular expression will:
- match the first anchor tag after <div id="right_song"> that has an href attribute whose value ends with .mp3
- avoid many of the edge cases that make matching HTML text with a regular expression very difficult
<div\sid="right_song">.*?<a(?=\s|>)(?=(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?\shref=(['"]?)(.*?\.mp3)\1(?:\s|\/>|>))(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*>.*?<\/a>
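To make the pieces easier to follow, here is a sketch of the same expression laid out in extended (x) mode with one comment per part. The `~` delimiter, the nowdoc layout, and the shortened `$html` sample are choices made for this illustration, not part of the original expression.

```php
<?php
// Annotated layout of the expression above. "x" mode ignores the whitespace
// and the "#" comments inside the pattern.
$pattern = <<<'PATTERN'
~
<div\sid="right_song">     # the literal opening div tag
.*?                        # lazily skip ahead to a candidate anchor
<a(?=\s|>)                 # "<a" followed by whitespace or ">", so "<abbr" does not qualify
(?=                        # lookahead: the tag must carry a real href ending in .mp3
  (?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?   # step over attribute names and whole values
  \shref=(['"]?)           # group 1: optional opening quote of the href value
  (.*?\.mp3)               # group 2: the href value
  \1(?:\s|\/>|>)           # the same quote again, then whitespace or the end of the tag
)
(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*>     # consume the rest of the opening tag
.*?<\/a>                   # anchor content and closing tag
~imsx
PATTERN;

// Shortened version of the sample text (assumption: trimmed for brevity).
$html = <<<'HTML'
<a href="http://newday.com/song.mp3">First Link</a>
<div id="right_song">
<div style="float:left;">
<a onmouseover=' href="bad.mp3" ; if ( 6 > x ) {funRotate(href); } ; ' href="http://secondurl.com/thisoneshouldonlyoutput.mp3">First Link</a>
</div>
HTML;

preg_match($pattern, $html, $matches);
echo $matches[2], "\n"; // prints: http://secondurl.com/thisoneshouldonlyoutput.mp3
```

The key idea is that every attribute value is consumed as a unit, so an `href=` that sits inside another attribute's quoted value is never examined on its own.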
Example
Sample Text
Note the difficult edge cases in the second anchor tag: a string that looks like href="bad.mp3"
is nested inside an attribute value; there is a JavaScript greater-than sign >
inside that value; and the real href attribute only appears after that value.
<a href="http://newday.com/song.mp3">First Link</a>
<div id="right_song">
<div style="font-size:15px;"><b>Pitbull ft. Chris Brown - Pitbull feat. Chris Brown - International Love mp3</b></div>
<div style="clear:both;"></div>
<div style="float:left;">
<div style="float:left; height:27px; font-size:13px; padding-top:2px;">
<div style="float:left;">
<a onmouseover=' href="bad.mp3" ; if ( 6 > x ) {funRotate(href); } ; ' href="http://secondurl.com/thisoneshouldonlyoutput.mp3">First Link</a>
</div>
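To see why this anchor tag is a hard case, it may help to run a simpler pattern against it. The sketch below uses a hypothetical naive pattern (`$naive`, not part of this answer) that takes the first `href="...mp3"` it finds; it picks up the decoy inside the onmouseover value.

```php
<?php
// The second anchor tag from the sample text.
$tag = <<<'HTML'
<a onmouseover=' href="bad.mp3" ; if ( 6 > x ) {funRotate(href); } ; ' href="http://secondurl.com/thisoneshouldonlyoutput.mp3">First Link</a>
HTML;

// Hypothetical naive pattern: match the first href="...mp3" anywhere in the text.
$naive = '~href="([^"]*\.mp3)"~i';

preg_match($naive, $tag, $m);
echo $m[1], "\n"; // prints: bad.mp3
```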
Code
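<?php
// Replace this placeholder with the HTML to search.
$sourcestring="your source string";
// Flags: i = case-insensitive, m = multiline, s = dot also matches newlines, x = ignore whitespace in the pattern.
preg_match('/<div\sid="right_song">.*?<a(?=\s|>)(?=(?:[^>=]|=\'[^\']*\'|="[^"]*"|=[^\'"][^\s>]*)*?\shref=([\'"]?)(.*?\.mp3)\1(?:\s|\/>|>))(?:[^>=]|=\'[^\']*\'|="[^"]*"|=[^\'"][^\s>]*)*>.*?<\/a>/imsx',$sourcestring,$matches);
echo "<pre>".print_r($matches,true)."</pre>";
?>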
Match
Group 0 gets the text from the <div through to and including the full matching anchor tag
Group 1 gets the opening quote around the href value, which is back-referenced later
Group 2 gets the href value
[0] => <div id="right_song">
<div style="font-size:15px;"><b>Pitbull ft. Chris Brown - Pitbull feat. Chris Brown - International Love mp3</b></div>
<div style="clear:both;"></div>
<div style="float:left;">
<div style="float:left; height:27px; font-size:13px; padding-top:2px;">
<div style="float:left;">
<a onmouseover=' href="bad.mp3" ; if ( 6 > x ) {funRotate(href); } ; ' href="http://secondurl.com/thisoneshouldonlyoutput.mp3">First Link</a>
[1] => "
[2] => http://secondurl.com/thisoneshouldonlyoutput.mp3
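If only the URL is needed, group 2 can be returned directly. The following is a minimal sketch that assumes a hypothetical helper named `first_mp3_href` and made-up example.com input; it also exercises an unquoted href, in which case group 1 is simply empty.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the href
// captured in group 2, or null when the expression does not match.
function first_mp3_href(string $html): ?string
{
    $pattern = '~<div\sid="right_song">.*?<a(?=\s|>)(?=(?:[^>=]|=\'[^\']*\'|="[^"]*"|=[^\'"][^\s>]*)*?\shref=([\'"]?)(.*?\.mp3)\1(?:\s|\/>|>))(?:[^>=]|=\'[^\']*\'|="[^"]*"|=[^\'"][^\s>]*)*>.*?<\/a>~is';
    return preg_match($pattern, $html, $m) === 1 ? $m[2] : null;
}

// Unquoted attribute values (illustrative input only).
$unquoted = '<div id="right_song"><a class=x href=http://example.com/a.mp3>Song</a></div>';
// An anchor whose href does not end in .mp3.
$noSong = '<div id="right_song"><a href="page.html">No song</a></div>';

var_dump(first_mp3_href($unquoted));
var_dump(first_mp3_href($noSong));
```

The first call should yield the example.com URL and the second should yield null, since no anchor after the div has an href ending in .mp3.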