php - Strip the tags from a page and get an array or separated list of the returned page numbers

Question

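I am trying to strip the tags from this page so that I can get a list of page numbers. That way my curl program knows the highest page number it should keep crawling up to. At the moment I can strip the tags down to the point where I get the numbers, but I don't know how to separate each number so that I can determine the highest page number.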

The current return value I get is:

12

Here is my code:

<?php
// Defining the basic pruning function
function scrape_between($data, $start, $end){
    $data = stristr($data, $start); // Stripping all data from before $start
    $data = substr($data, strlen($start));  // Stripping $start
    $stop = stripos($data, $end);   // Getting the position of the $end of the data to scrape
    $data = substr($data, 0, $stop);    // Stripping all data from after and including the $end of the data to scrape
    return $data;   // Returning the scraped data from the function
}

ob_start();
?>
<span class="current">1</span><a href="javascript:__doPostBack('ctl00$phCenterColumn$motoSearchResults$gvCatalog$ctl01$ctl03','')">2</a><a href="javascript:__doPostBack('ctl00$phCenterColumn$motoSearchResults$gvCatalog$ctl01$ctl04','')">

<?php
$variable = ob_get_clean();

$startend5 = Array('">' => '</a>');

foreach($startend5 as $o => $p){
   $data = scrape_between($variable, $o, $p);
}
$data = strip_tags($data);
echo $data;
?>

FYI, the ob_start(); and ob_get_clean(); calls are only there for the example, because I didn't want to include all the curl commands and make the code longer than necessary.
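
One possible way to keep the numbers separate, as a minimal sketch: instead of stripping the tags and ending up with one joined string, capture each run of digits that sits between a ">" and the next "<" with preg_match_all(), convert the captures to integers, and take max(). This assumes every page number is the only text inside its tag, as in the sample markup. The function names extract_page_numbers and max_page_number are hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper (not part of the original code): collect every number
// that appears as the only text between a ">" and the next "<".
function extract_page_numbers(string $html): array
{
    // (\d+) captures the digits; the lookahead (?=<) leaves the "<" unconsumed.
    preg_match_all('/>\s*(\d+)\s*(?=<)/', $html, $matches);
    return array_map('intval', $matches[1]);
}

// Hypothetical helper: highest page number, or null when none was found.
function max_page_number(string $html): ?int
{
    $pages = extract_page_numbers($html);
    return $pages === [] ? null : max($pages);
}

// Sample markup from the question; nowdoc keeps the "$" characters literal.
$html = <<<'HTML'
<span class="current">1</span><a href="javascript:__doPostBack('ctl00$phCenterColumn$motoSearchResults$gvCatalog$ctl01$ctl03','')">2</a><a href="javascript:__doPostBack('ctl00$phCenterColumn$motoSearchResults$gvCatalog$ctl01$ctl04','')">
HTML;

echo implode(', ', extract_page_numbers($html)), "\n"; // prints "1, 2"
echo max_page_number($html), "\n";                     // prints "2"
```

If the pager can also contain non-numeric links (for example "Next" or "..."), those are simply skipped, because the pattern only matches digits.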
