php - Regex: detecting tags that are not nested inside another pattern

Question

I am trying to count the total number of videos on a dynamically generated page. To do this I parse the page's HTML and search for all <object>, <iframe> and <embed> tags. The page won't contain any iframe content other than video embed codes, so I can be sure that any <iframe> tag is a video. The problem is that some embed codes, Hulu's for example, have the <embed> tag inside the <object> tag. So with my current regex:

'/(<iframe|<object|<embed)/i'

this Hulu embed code is seen as 2 videos instead of one:

<object id="videoplayer1" width="728" height="407">
   <param name="movie" value='http://www.hulu.com/embed/7qXAa2z1zXKPMw4mBakrRw'></param>
   <param name="allowFullScreen" value="true"></param>
   <param name="allowScriptAccess" value="never"></param>
   <embed src='http://www.hulu.com/embed/7qXAa2z1zXKPMw4mBakrRw' type="application/x-shockwave-flash" allowfullscreen="true" width="728" height="407" allowscriptaccess='never'></embed>
</object>

Rather than searching for all <embed> tags, I only want to match the ones that aren't enclosed in <object> tags. The <embed> in the Hulu code above would then be skipped, but one like this would be counted:

<embed src="http://www.ebaumsworld.com/player.swf" allowScriptAccess="always" flashvars="id1=81748652" wmode="opaque" width="567" height="345" allowfullscreen="true" />

What would the regex pattern look like for this? I'm using PHP.

score 0 · Accepted Answer

I would also go with an XML parser and XPath.
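
As a rough illustration of the parser-plus-XPath idea, a minimal sketch using PHP's built-in DOMDocument and DOMXPath might look like the following. The function name countVideos is made up for this example, and DOMDocument::loadHTML is used instead of a strict XML loader on the assumption that the page is ordinary, possibly malformed HTML rather than well-formed XML.

```php
<?php
// Hypothetical helper (not from the original post): counts video embeds
// in an HTML string using DOMDocument and DOMXPath.
function countVideos(string $html): int
{
    // loadHTML() rejects an empty string on PHP 8, so bail out early.
    if (trim($html) === '') {
        return 0;
    }

    $doc = new DOMDocument();

    // Embed codes are rarely valid markup, so collect libxml's parse
    // warnings internally instead of letting them be emitted.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Every <iframe> and <object> counts as a video; an <embed> counts
    // only when it has no <object> ancestor.
    $nodes = $xpath->query('//iframe | //object | //embed[not(ancestor::object)]');

    return $nodes === false ? 0 : $nodes->length;
}
```

With this query the Hulu snippet from the question should count as one video, because its <embed> has an <object> ancestor, while the standalone ebaumsworld <embed> should still be counted. The ancestor:: axis is what expresses "not enclosed in <object>", which is awkward to state reliably in a regex.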

answered 2012-04-09T22:58:11.570
