php - Parsing XML in PHP

Question

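I am parsing XML, and one tag contains both an image and text. I want to separate the image and the text so that they appear in different columns of a table in my design layout, but I don't know how to do that. Please help me. My PHP file is as follows: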

<?php
$RSS_Content = array();

function RSS_Tags($item, $type)
{
    $y = array();
    $tnl = $item->getElementsByTagName("title");
    $tnl = $tnl->item(0);
    $title = $tnl->firstChild->textContent;

    $tnl = $item->getElementsByTagName("link");
    $tnl = $tnl->item(0);
    $link = $tnl->firstChild->textContent;
    $tnl = $item->getElementsByTagName("description");
    $tnl = $tnl->item(0);
    $img = $tnl->firstChild->textContent;

    $y["title"]  = $title;
    $y["link"] = $link;
    $y["description"] = $img;
    $y["type"] = $type;

    return $y;
}

function RSS_Channel($channel)
{
    global $RSS_Content;

    $items = $channel->getElementsByTagName("item");

    // Processing channel

    $y = RSS_Tags($channel, 0);     // get description of channel, type 0
    array_push($RSS_Content, $y);

    // Processing articles

    foreach($items as $item)
    {
        $y = RSS_Tags($item, 1);    // get description of article, type 1
        array_push($RSS_Content, $y);
    }
}

function RSS_Retrieve($url)
{
    global $RSS_Content;

    $doc  = new DOMDocument();
    $doc->load($url);

    $channels = $doc->getElementsByTagName("channel");

    $RSS_Content = array();

    foreach($channels as $channel)
    {
        RSS_Channel($channel);
    }

}

function RSS_RetrieveLinks($url)
{
    global $RSS_Content;

    $doc  = new DOMDocument();
    $doc->load($url);

    $channels = $doc->getElementsByTagName("channel");

    $RSS_Content = array();

    foreach($channels as $channel)
    {
        $items = $channel->getElementsByTagName("item");
        foreach($items as $item)
        {
            $y = RSS_Tags($item, 1);
            array_push($RSS_Content, $y);
        }
    }

}

function RSS_Links($url, $size = 15)
{
    global $RSS_Content;

    $page = "<ul>";

    RSS_RetrieveLinks($url);
    if($size > 0)
    $recents = array_slice($RSS_Content, 0, $size + 1);

    foreach($recents as $article)
    {
        $type = $article["type"];
        if($type == 0) continue;
        $title = $article["title"];
        $link = $article["link"];
        $img = $article["description"];
        $page .= "<a href=\"#\">$title</a>\n";
    }

    $page .="</ul>\n";

    return $page;

}

function RSS_Display($url, $click, $size = 8, $site = 0, $withdate = 0)
{
    global $RSS_Content;

    $opened = false;
    $page = "";
    $site = (intval($site) == 0) ? 1 : 0;

    RSS_Retrieve($url);
    if($size > 0)
    $recents = array_slice($RSS_Content, $site, $size + 1 - $site);

    foreach($recents as $article)
    {
        $type = $article["type"];
        if($type == 0)
        {
            if($opened == true)
            {
                $page .="</ul>\n";
                $opened = false;
            }
            $page .="<b>";
        }
        else
        {
            if($opened == false)
            {
                $page .= "<table width='369' border='0'>
            <tr>";
                $opened = true;
            }
        }
        $title = $article["title"];
        $link = $article["link"];
        $img = $article["description"];
        $page .= "<td width='125' align='center' valign='middle'>
              <div align='center'>$img</div></td>                    
        <td width='228' align='left' valign='middle'><div align='left'><a 
                  href=\"$click\" target='_top'>$title</a></div></td>";
        if($withdate)
        {
            $date = $article["date"];
            $page .=' <span class="rssdate">'.$date.'</span>';
        }
        if($type==0)
        {
            $page .="<br />";
        }
    }

    if($opened == true)
    {
        $page .="</tr>
                </table>";
    }
    return $page."\n";

}
?>

score 0 · Accepted Answer

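To separate the image from the description, you need to parse the HTML stored inside the description element again as XML. Luckily, the content of that element is valid XML, so this is straightforward with SimpleXML. The following code example loads the URL, reduces each item's *description* to its text only, and extracts the src attribute of the image to store it as an image element, like this: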

<item>
    <title>Fake encounter: BJP backs Kataria, says CBI targeting Modi</title>
    <link>http://ibnlive.in.com/news/fake-encounter-bjp-backs-kataria-says-cbi-targeting-modi/391802-37-64.html</link>
    <description>The BJP lashed out at the CBI and questioned its 'shoddy investigation' into the Sohrabuddin fake encounter case.</description>
    <pubDate>Wed, 15 May 2013 13:48:56 +0530</pubDate>
    <guid>http://ibnlive.in.com/news/fake-encounter-bjp-backs-kataria-says-cbi-targeting-modi/391802-37-64.html</guid>
    <image>http://static.ibnlive.in.com/ibnlive/pix/sitepix/05_2013/bjplive_kataria3.jpg</image>
</item>

The code example is as follows:

$url  = 'http://ibnlive.in.com/ibnrss/top.xml';
$feed = simplexml_load_file($url);

$items = $feed->xpath('(//channel/item)');

foreach ($items as $item) {
    list($description, $image) =
        simplexml_load_string("<r>$item->description</r>")
            ->xpath('(/r|/r//@src)');
    $item->description = (string)$description;
    $item->image       = (string)$image;
}
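
If you want to try the same idea without hitting the network, here is a minimal sketch on a made-up feed. It assumes the description holds escaped HTML with a single img tag. The helper name splitDescription() and the sample data are my own, not something from the real feed, and the plain-text fallback for a description that is not well-formed XML is an extra assumption on my part:

```php
<?php
// Sketch only: splitDescription() is a hypothetical helper name.

function splitDescription(string $html): array
{
    // Collect libxml errors quietly instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $fragment = simplexml_load_string("<r>$html</r>");
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($fragment === false) {
        // Not well-formed XML: fall back to plain text and no image.
        return ['text' => trim(strip_tags($html)), 'image' => ''];
    }

    // First src attribute of any img element inside the wrapper.
    $src = $fragment->xpath('//img/@src');

    return [
        // textContent gathers all text below the wrapper element.
        'text'  => trim(dom_import_simplexml($fragment)->textContent),
        'image' => $src ? (string) $src[0] : '',
    ];
}

// Made-up feed; the description holds escaped HTML, as assumed above.
$rss = <<<'XML'
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>First story</title>
      <link>http://example.com/1</link>
      <description>&lt;img src="http://example.com/1.jpg" /&gt;Text of the first story.</description>
    </item>
  </channel>
</rss>
XML;

$feed = simplexml_load_string($rss);

foreach ($feed->xpath('//channel/item') as $item) {
    $parts = splitDescription((string) $item->description);
    $item->description = $parts['text'];   // text only
    $item->image       = $parts['image'];  // new <image> child element
}

echo $feed->channel->item[0]->image, "\n";
echo $feed->channel->item[0]->description, "\n";
```

Querying the src attribute separately, rather than unpacking one combined XPath result with list(), avoids depending on the order in which the nodes come back.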

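You can then import the SimpleXML element into a DOMElement with dom_import_simplexml(). Honestly, though, I would just stay with SimpleXML: LimitIterator works for paging just as it does with DOMDocument, and the data you access is easy to get at through SimpleXML. You can simply pass the XML elements around as SimpleXMLElement objects instead of first parsing them into an array and then processing that array, which is pointless.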
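
As a rough sketch of the paging idea, assuming each item already carries an image child as shown earlier: wrap the array that xpath() returns in an ArrayIterator, limit it with LimitIterator, and write the image and the title into separate table cells. The function name renderRows(), its parameters, and the sample feed are hypothetical, and the markup is only one possible layout:

```php
<?php
// Sketch only: renderRows() and the sample feed are hypothetical.

function renderRows(array $items, int $offset, int $count): string
{
    // xpath() returns a plain array, so wrap it before limiting.
    $page = new LimitIterator(new ArrayIterator($items), $offset, $count);

    $html = "<table>\n";
    foreach ($page as $item) {
        // Escape every value before it goes into the markup.
        $image = htmlspecialchars((string) $item->image, ENT_QUOTES);
        $link  = htmlspecialchars((string) $item->link, ENT_QUOTES);
        $title = htmlspecialchars((string) $item->title, ENT_QUOTES);

        // One row per item: image in the first cell, linked title in the second.
        $html .= "<tr><td><img src=\"$image\" alt=\"\"></td>"
               . "<td><a href=\"$link\">$title</a></td></tr>\n";
    }

    return $html . "</table>\n";
}

// Made-up feed with the <image> element already in place.
$rss = <<<'XML'
<rss version="2.0">
  <channel>
    <item><title>One</title><link>http://example.com/1</link><image>http://example.com/1.jpg</image></item>
    <item><title>Two</title><link>http://example.com/2</link><image>http://example.com/2.jpg</image></item>
    <item><title>Three</title><link>http://example.com/3</link><image>http://example.com/3.jpg</image></item>
  </channel>
</rss>
XML;

$feed = simplexml_load_string($rss);

// Skip the first item and show the next two.
echo renderRows($feed->xpath('//channel/item'), 1, 2);
```

The items stay SimpleXMLElement objects all the way to the output, so there is no intermediate array of titles, links, and descriptions to maintain.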
