php - Why doesn't this PHP crawler work?

Question

In my localhost document root I have:

crawl.html

<html>
<body>
<p>
<form action="welcome.php" method="get">
Site to crawl: <input type="text" name="crawlThis">
<input type="submit">
</form>
</p>

</body>
</html>

welcome.php

 <html>
 <body>

 <?php 
 include ("crawler.php");

 echo $crawl = new Crawler($_GET["crawlThis"]);

 $images = $crawl->get("images");

 $links = $crawl->get("links"); 

 echo $links;
 echo $images;

 ?>
 <br>

</body>
</html>

and crawler.php

<?php

class Crawler {

protected $markup = '';

public function __construct($uri) {

$this->markup = $this->getMarkup($uri);

}

public function getMarkup($uri) {

return file_get_contents($uri);

}

public function get($type) {

$method = "_get_{$type}";

if (method_exists($this, $method)){

return call_user_method($method, $this);

}

}

protected function _get_images() {

if (!empty($this->markup)){

preg_match_all('/<img([^>]+)\/>/i', $this->markup, $images);

return !empty($images[1]) ? $images[1] : FALSE;

}

}

protected function _get_links() {

if (!empty($this->markup)){

preg_match_all('/<a([^>]+)\>(.*?)\<\/a\>/i', $this->markup, $links);

return !empty($links[1]) ? $links[1] : FALSE;

}

}

}


/*$crawl = new Crawler($);

$images = $crawl->get('images');

$links = $crawl->get('links');*/

?>

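The result page is empty. I don't know whether I simply can't echo $images or whether my logic is wrong. I expect a list of images and a list of links.

Also, do I need to include crawler.php, or does PHP search the containing directory for a class of the same name?

Sorry, moving from Java to PHP is a bit of a pain.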

score 1 · Accepted Answer

answered 2012-12-24T21:44:49.143

score 0 · Accepted Answer

I'm all for writing everything yourself, but there are plenty of documented examples of how to do this. Here is a good one that you can follow or use.

Crawler example
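
For reference, here is a minimal sketch of one way the crawler from the question could be made to print its results. It is not the example linked above. It assumes PHP 7.4 or later with the DOM extension enabled, and the class name `SimpleCrawler`, its `TARGETS` table, and `fromUri()` are hypothetical names chosen for illustration. It swaps the regular expressions for `DOMDocument`, because the pattern `/<img([^>]+)\/>/i` only matches self-closing tags, and it avoids `call_user_method()`, which was removed in PHP 7. The results are arrays, so they are joined with `implode()` instead of being passed straight to `echo`. The original `echo $crawl = new Crawler(...)` also fails on its own, since that object cannot be converted to a string.

```php
<?php
// Sketch only: SimpleCrawler, TARGETS and fromUri() are hypothetical names,
// not part of the original post or of any library.
class SimpleCrawler
{
    // Maps a result type to the tag and attribute that carry its URL.
    private const TARGETS = [
        'images' => ['img', 'src'],
        'links'  => ['a', 'href'],
    ];

    private DOMDocument $dom;

    public function __construct(string $html)
    {
        $this->dom = new DOMDocument();
        if ($html === '') {
            return; // Nothing to parse; every get() call returns [].
        }
        // Real-world HTML is rarely valid, so collect parser warnings
        // internally instead of printing them.
        $previous = libxml_use_internal_errors(true);
        $this->dom->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
    }

    // Fetches a page first. Remote URLs require allow_url_fopen to be on.
    public static function fromUri(string $uri): self
    {
        $html = file_get_contents($uri);
        if ($html === false) {
            throw new RuntimeException("Could not fetch {$uri}");
        }
        return new self($html);
    }

    // Returns the attribute values for a known type, or [] otherwise.
    public function get(string $type): array
    {
        if (!isset(self::TARGETS[$type])) {
            return [];
        }
        [$tag, $attribute] = self::TARGETS[$type];

        $values = [];
        foreach ($this->dom->getElementsByTagName($tag) as $node) {
            if ($node->hasAttribute($attribute)) {
                $values[] = $node->getAttribute($attribute);
            }
        }
        return $values;
    }
}

// Usage with an inline document, so the sketch runs without network access.
// In welcome.php this would be SimpleCrawler::fromUri($_GET['crawlThis']).
$crawler = new SimpleCrawler('<p><a href="/about">About</a><img src="logo.png"></p>');

// Escape each value before printing it back into an HTML page.
echo implode("<br>\n", array_map('htmlspecialchars', $crawler->get('links'))), "<br>\n";
echo implode("<br>\n", array_map('htmlspecialchars', $crawler->get('images'))), "<br>\n";
```

On the include question: PHP does not go looking for a class file by itself, so `include` (or `require_once`) is needed unless an autoloader has been registered with `spl_autoload_register()`.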
