
I am building an RSS feed aggregator that visits each link and fetches the full content of the post, not just the description. Here is my code:

<?php
function getcontent($l,$b,$c)
{
    $dom=file_get_html($l);
    $atitle=$dom->find($b);
    $content=$dom->find($c);
    $contents=implode(" ",$content);
    foreach($atitle as $t)
    {
        echo "<b>".$t."</b>";
    }
    echo "<br /><br />";
    echo $contents;
    echo "<br />";
}
function filtercontent($strip,$l,$b,$c)
{
    $dom=file_get_html($l);
    $atitle=$dom->find($b);
    $content=$dom->find($c);
    $contents=implode(" ",$content);
    // keep only the part of the content before the first occurrence of $strip
    $contents=stristr($contents,$strip,true);
    foreach($atitle as $t)
    {
        echo "<b>".$t."</b>";
    }
    echo "<br />";
    echo $contents;
    echo "<br /><br />";
}
ini_set('default_charset', 'UTF-8');
ini_set('max_execution_time',0);
ini_set('memory_limit', -1);
include("simple_html_dom.php");

$url=array("http://www.deccanherald.com/rss/news.rss","http://syndication.indianexpress.com/rss/798/latest-news.xml");

$atitle=NULL;
$content=NULL;
foreach($url as $feed)
{
    $f=$feed;
    $feed=simplexml_load_file($feed);
    //echo $feed;
    if($feed)
    {
        //$feed_title=$feed->channel->title;
        //echo "<br />".$feed_title."<br />";
        $items=$feed->channel->item;
        foreach($items as $item)
        {
            //foreach($keywords as $key)
            //{
            //if(strtolower($item->description)==$key || strtolower($item->title)==$key)
            //{

        $title=$item->title;
        //echo "<h1><b>".$title."</b></h1><br />";
        $link=$item->link;
        //echo "<a href='".$link."'>".$link."</a><br />";
        $des=$item->description;
        //echo "<br />".$des."<br />";


            if($f=="http://beta.thehindu.com/news/?service=rss")
            {
            $title_class=".detail-title";
            $content_class=".body";
            getcontent($link,$title_class,$content_class);

            }
            if($f=="http://in.news.yahoo.com/rss/national/")
            {
            $title_class=".headline";
            $content_class=".yom-art-content";
            getcontent($link,$title_class,$content_class);
            }


        if($f=="http://syndication.indianexpress.com/rss/798/latest-news.xml")
            {

            $link=$link."0";
            $title_class=".headstory";
            $content_class=".contentLeftbigstory";
            $strip='<div class="paginationNew">';
            filtercontent($strip,$link,$title_class,$content_class);

            }
            if($f=="http://www.indiatvnews.com/rssfeed/india_news.xml")
            {

            $title_class=".topstorytitsub";
            $content_class=".standard";
            foreach($link as $post)
            {
                $dom=file_get_html($link);
                $title=$dom->find($title_class);
                $content=$dom->find('div[style=min-height:350px]');
                foreach($title as $t)
                echo "<b>".$t."</b><br />";
                foreach($content as $c)
                {
                    echo $c;

                }

            }


            }
            if($f=="http://beta.thehindu.com/news/?service=rss")
            {
            $title_class=".detail-title";
            $content_class=".body";
            getcontent($link,$title_class,$content_class);

            }
            if($f=="http://www.deccanherald.com/rss/news.rss")
            {
            $title_class=".newsText";
            $content_class=".postedBy";
            $strip='<a href="#top" class="gototop">Go to Top</a>';
            filtercontent($strip,$link,$title_class,$content_class);            
            }


            }
    }
        }


?> 

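I use Simple HTML DOM Parser to parse the HTML. In addition to the other inputs, the filtercontent function takes a string called strip; it is used to cut the content and return only what comes before the first occurrence of that string. It works perfectly for the syndication.indianexpress.com feed but fails for the deccanherald.com feed. I have left out the other feeds to make this easier to follow. There are also feeds that use the getcontent function, and those work fine. Here is a sample of the source of a Deccan Herald post: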

<h1>Crazy star Ravichandran takes potshots at TV channels</h1>

                                                            <div class="postedBy">Mysore, September 28, 2012, DHNS:
                                                                                            <p>Actor opens ‘Conflux 2012’ media fest at Mahajana’s college in city</p>
                                                        <a name="top"></a>

                                                        <p><p><strong>When actor, director and producer of Kannada filmdom V&#8200;Ravichandran was invited to inaugurate &lsquo;Conflux 2012&rsquo; a two-day inter-collegiate media and communication fest of&#8200;SBRR&#8200;Mahajana First&#8200;Grade College in the city on Friday, many would have thought it contrasting.</strong><br /><br />However, when Ravi as he is popular among his acolytes, took over the dais and addressed the gathering where youngsters topped others, the choice of selecting Ravichandran to open the fest seemed apt. <br /><br />Mincing no words, the actor nick named &lsquo;Crazy Star&rsquo; made a relevant remark taking potshots at the electronic media for opting negativism rather than positive aspects to up their television rating points (TRP). Taking the names of two channels in Kannada, the actor said they are indulging in taking the people for a ride with concocted facts.<br /><br /> More than that, almost all the channels are airing moribund programmes. Said&#8200;Ravichandran; &ldquo; Pen is mightier than sword and show your talent in reaching the people and guide them.&rdquo;<br /><br />On filmdom, Ravichandran said that the fans still want him to romance heroines like what he did in Premaloka and other flicks. &ldquo;&#8200;I have already turned 50&rdquo;, said&#8200;Ravichandran making it clear that he cannot redo what he did in the past.&#8200;Referring to &lsquo;Manjina Hani&rsquo; the most awaited movie from his banner from the past several years, the actor said &lsquo;he is discovering the man in him&rsquo;.  <br /><br />Earlier, it was a filmy welcome to the actor. No sooner he entered the hall, pat filled the air an all time hit song from Ranadheera; baa baaro ranadheera...  
<br /><br />Principal of the college&#8200;Prof K&#8200;V&#8200;Prabhakar said students from as many as 18 colleges from several parts of the State are participating in the fest.</p><p>To avoid chaos, the management had prohibited the entry of outsiders (especially students). <br /><br />Barring the participants, dignitaries and media, others were not allowed with students of the college keeping a tab on the visitors at the main gate of Vivekananda Hall of the college.<br /><br />Jayalakshmipuram police had to disperse the mad crowd who had dared to assemble in front of the hall.<br /><br />Chairman of&#8200;Mahajana Education Society R&#8200;Vasudevamurthy, HoD, mass communication and journalism Nivedita and others were present.<br /><br /><strong>Supports Cauvery stir</strong><br /><br />Actor&#8200;Ravichandran on&#8200;Friday extended support to ongoing agitation against the centre&rsquo;s directive to State to release 9,000 cusec of water to Tamil Nadu. On Karnataka bandh call given by various organisations on October 6 over the same issue, the actor said he too will support following Karnataka&#8200;Film&#8200;Chamber of Commerce&rsquo;s (KFCC) similar announcement. &ldquo;When the State itself is facing acute water shortage, how can we release water to them&rdquo;, the actor asserted. He also denied any interests to join politics saying; nange rajakeeya barolla (I don&rsquo;t know politics).</p></p>

                            <p class="gotoTop"><a href="#top" class="gototop">Go to Top</a></p>


                            <div class="socialNetworkingLinks">
                                 <a href="http://www.deccanherald.com/tell_a_friend.php?id=281782" style="margin-left:-5px;"><img src="http://www.deccanherald.com/images/email.jpg" alt="" border="0" /></a> 
                                <a href="#" onClick="javascript:window.print();"><img src="http://www.deccanherald.com/images/print.jpg" alt="" border="0" onClick="javascript:window.print();" /></a> 
                                <a href="javascript:addToFavorites()"><img src="http://www.deccanherald.com/images/bookmark.jpg" alt="" border="0" /></a>

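I have also tried $strip='<p class="gotoTop">', $strip='<div class="socialNetworkingLinks">' and $strip="Go to Top", but none of them work. The result still includes everything, from "Go to Top" through the social toolbar. Why doesn't it work? What is wrong with the code? It works for one feed but not for the other. Please help me fix this.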

Screenshot: [image omitted]

I want to remove the content starting from "Go to Top".


1 Answer


I think the problem is with $content_class=".postedBy";. The only thing in that class is Mysore, September 28, 2012, DHNS:, which does not contain $strip.
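
To see what happens when the marker is missing from the selected content, here is a minimal sketch of the cutting step on its own. The helper name cut_before is hypothetical (it is not part of the question's code), and the sample strings are made up for illustration. Note that stristr() with its third argument set to true returns false, not the original string, when the marker is not found:

```php
<?php
// Hypothetical helper, not from the question's code: returns everything
// before the first case-insensitive occurrence of $marker. Unlike
// stristr($html, $marker, true), it returns the input unchanged when the
// marker is absent instead of returning false.
function cut_before(string $html, string $marker): string
{
    if ($marker === '') {
        return $html; // nothing to search for
    }
    $pos = stripos($html, $marker);
    if ($pos === false) {
        return $html; // marker not present: keep everything
    }
    return substr($html, 0, $pos);
}

// Made-up sample content for illustration.
$sample = '<p>Body text</p><p class="gotoTop"><a href="#top" class="gototop">Go to Top</a></p>';

// Marker present: only the part before it is kept.
echo cut_before($sample, '<p class="gotoTop">'), "\n";   // prints <p>Body text</p>

// Marker absent: stristr() signals this with false.
var_dump(stristr($sample, '<div class="paginationNew">', true)); // bool(false)
```

Checking for a missing marker explicitly makes it easier to tell "the selector matched the wrong element" apart from "the marker string is wrong".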

Edit:

The postedBy DIV looks like this:

<div class="postedBy">Mysore, September 28, 2012, DHNS:</div>

It does not contain the body of the article.
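
One way to check this is to print exactly what the selector matches before trying to cut it. The sketch below is only an illustration: it uses PHP's built-in DOMDocument and DOMXPath rather than Simple HTML DOM, and the markup is an assumed, shortened version of the page in which the postedBy DIV closes right after the dateline.

```php
<?php
// Assumed, simplified markup: the postedBy DIV holds only the dateline,
// and the article body and "Go to Top" link are its siblings.
$html = '<html><body>'
      . '<div class="postedBy">Mysore, September 28, 2012, DHNS:</div>'
      . '<p>Article body text.</p>'
      . '<p class="gotoTop"><a href="#top" class="gototop">Go to Top</a></p>'
      . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// XPath equivalent of the CSS selector "div.postedBy".
$nodes = $xpath->query(
    "//div[contains(concat(' ', normalize-space(@class), ' '), ' postedBy ')]"
);

// Collect the HTML of every matched node, as implode() does in the question.
$selected = '';
foreach ($nodes as $node) {
    $selected .= $doc->saveHTML($node);
}

echo $selected, "\n"; // shows what the selector really captured

// The marker is outside the matched element, so it cannot be found in it.
$strip = '<a href="#top" class="gototop">Go to Top</a>';
var_dump(stripos($selected, $strip)); // bool(false)
```

If the real page behaves like this markup, the fix is to select an element that actually wraps the article body; changing $strip alone cannot help.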

answered 2012-09-28T19:13:57.217