I am trying to get the total number of videos on a dynamically generated page. To do this I parse the page's HTML and search for all <object>, <iframe> and <embed> tags. The page won't contain any iframe content other than video embed codes, so I can be sure that any iframe tag is a video. The problem is that some embed codes, Hulu's for example, have the <embed> tag inside the <object> tag. So with my current regex:

'/(<iframe|<object|<embed)/i'

this Hulu embed code is seen as 2 videos instead of one:

<object id="videoplayer1" width="728" height="407">
   <param name="movie" value='http://www.hulu.com/embed/7qXAa2z1zXKPMw4mBakrRw'></param>
   <param name="allowFullScreen" value="true"></param>
   <param name="allowScriptAccess" value="never"></param>
   <embed src='http://www.hulu.com/embed/7qXAa2z1zXKPMw4mBakrRw' type="application/x-shockwave-flash" allowfullscreen="true" width="728" height="407" allowscriptaccess='never'></embed>
</object>

Rather than searching for all embed tags, I only want to match the ones that aren't enclosed in <object> tags. So the Hulu one above will be skipped, but one like this will be counted:

<embed src="http://www.ebaumsworld.com/player.swf" allowScriptAccess="always" flashvars="id1=81748652" wmode="opaque" width="567" height="345" allowfullscreen="true" />

What would the regex pattern look like for this? I'm using PHP.


Converting a string of digits to an integer is fairly simple: you read one digit at a time and convert it to its numeric value (normally by subtracting '0' from the character code). Then you take your existing value, multiply it by ten, and add the value of the current digit.
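The loop described above can be sketched in PHP roughly as follows. The function name parseDigits is made up for illustration, and the sketch assumes the input holds only ASCII digits and is short enough not to overflow PHP's native integer.

```php
<?php
// Hypothetical helper (not from the original post): converts a string of
// ASCII decimal digits to an integer, one digit at a time.
function parseDigits(string $s): int
{
    $value = 0;
    for ($i = 0, $n = strlen($s); $i < $n; $i++) {
        // Subtract the character code of '0' to get the digit's numeric value.
        $digit = ord($s[$i]) - ord('0');
        if ($digit < 0 || $digit > 9) {
            throw new InvalidArgumentException("not a digit: {$s[$i]}");
        }
        // Shift the running value one decimal place left, then add the digit.
        $value = $value * 10 + $digit;
    }
    return $value;
}

echo parseDigits('1234'), "\n";
```

For "1234" the running value goes 1, 12, 123, 1234.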

Dealing with negative numbers adds just a bit more difficulty to that. Most people do it by keeping a flag to indicate the number is negative if it starts with a '-'. Then, when they've converted the number, they negate it if that flag is set.

That does, however, have one problem: converting the most negative number takes some extra work, because (in 2's complement) the most negative number has a larger magnitude than you can represent as a positive number (without using more bits). For example, 16-bit 2's complement numbers range from -32768 to +32767, but you need either (at least) 17 bits or an unsigned 16-bit number to represent +32768.
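One common way around this (a swapped-in technique, not something the text above prescribes) is to accumulate the value as a negative number and negate at the end only for positive input. The sketch below emulates a 16-bit range inside PHP's wider native integers. The names parseInt16, INT16_MIN and INT16_MAX are hypothetical, chosen only for this example.

```php
<?php
// Emulated 16-bit two's-complement limits (PHP's own ints are wider).
const INT16_MIN = -32768;
const INT16_MAX = 32767;

// Hypothetical helper: parses an optionally signed decimal string into the
// emulated 16-bit range, accumulating downward so that -32768 fits.
function parseInt16(string $s): int
{
    $n = strlen($s);
    $i = 0;
    $negative = false;
    if ($n > 0 && ($s[0] === '-' || $s[0] === '+')) {
        $negative = ($s[0] === '-');
        $i = 1;
    }
    if ($i === $n) {
        throw new InvalidArgumentException('no digits');
    }

    $value = 0; // stays <= 0 throughout the loop
    for (; $i < $n; $i++) {
        $digit = ord($s[$i]) - ord('0');
        if ($digit < 0 || $digit > 9) {
            throw new InvalidArgumentException("not a digit: {$s[$i]}");
        }
        // Reject before the step if value * 10 - digit would fall below INT16_MIN.
        if ($value < intdiv(INT16_MIN + $digit, 10)) {
            throw new OverflowException('out of 16-bit range');
        }
        $value = $value * 10 - $digit;
    }

    if (!$negative) {
        // +32768 has no 16-bit signed representation.
        if ($value === INT16_MIN) {
            throw new OverflowException('out of 16-bit range');
        }
        $value = -$value;
    }
    return $value;
}

echo parseInt16('-32768'), "\n";
echo parseInt16('32767'), "\n";
```

With the flag-and-negate approach, "-32768" would first have to pass through +32768, which the 16-bit range cannot hold. Accumulating downward never needs that intermediate value.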

Edit: Once you've converted the decimal digits to an integer, you'll need to convert the integer to hexadecimal digits to display it in hex. That conversion is a little bit easier. You repeatedly divide by 16, and each remainder becomes the next hexadecimal digit. You'll normally use a table like "0123456789abcdef" and use the remainder to index into the table to get the digit for display. You repeat the division, using each remainder, until the quotient is zero. The one trick is that this produces the digits in reverse order (from least to most significant), so you normally put them into a buffer, starting from the end of the buffer and working your way toward the beginning.
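A minimal PHP sketch of that divide-by-16 loop, filling a buffer from the end toward the beginning, might look like this. The name toHex and the 16-character buffer (enough for a non-negative 64-bit integer) are assumptions for the example, and negative input is simply rejected.

```php
<?php
// Hypothetical helper: converts a non-negative integer to lowercase hex.
function toHex(int $n): string
{
    if ($n < 0) {
        throw new InvalidArgumentException('negative values are not handled here');
    }
    $digits = '0123456789abcdef';   // lookup table indexed by the remainder
    $buf = str_repeat('0', 16);     // assumed big enough for a 64-bit value
    $pos = 16;                      // next write position, moving toward the front
    do {
        $pos--;
        $buf[$pos] = $digits[$n % 16]; // least significant digit first
        $n = intdiv($n, 16);
    } while ($n > 0);
    // Only the part of the buffer that was written holds the result.
    return substr($buf, $pos);
}

echo toHex(32767), "\n";
```

Using do/while means an input of 0 still produces one digit.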

1 Answer

An XML parser with XPath would also be my way to go.
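A rough sketch of what the XPath route could look like in PHP, using the built-in DOMDocument and DOMXPath classes. The function name countVideos and the sample markup (including the example.com iframe) are invented for illustration. The assumption is that every iframe is a video, and that an object or embed counts only when it is not nested inside another object.

```php
<?php
// Hypothetical helper (not from the original post): counts video embeds,
// skipping <object>/<embed> tags that sit inside another <object>.
function countVideos(string $html): int
{
    if (trim($html) === '') {
        return 0; // loadHTML() does not accept empty input
    }
    // Collect parser warnings internally; embed markup is rarely valid HTML.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query(
        '//iframe | //object[not(ancestor::object)] | //embed[not(ancestor::object)]'
    );
    return $nodes->length;
}

// Sample page: one iframe, one Hulu-style object wrapping an embed,
// and one standalone embed.
$html = <<<'HTML'
<iframe src="http://example.com/player"></iframe>
<object id="videoplayer1" width="728" height="407">
   <param name="movie" value="http://www.hulu.com/embed/7qXAa2z1zXKPMw4mBakrRw"></param>
   <embed src="http://www.hulu.com/embed/7qXAa2z1zXKPMw4mBakrRw" type="application/x-shockwave-flash"></embed>
</object>
<embed src="http://www.ebaumsworld.com/player.swf" width="567" height="345" />
HTML;

echo countVideos($html), "\n";
```

The not(ancestor::object) predicate is what keeps the Hulu-style embed from being counted twice, which is hard to express reliably in a single regex.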

answered 2012-04-09T22:58:11.570