php - PHP strip_tags except '<>' (book names)

Question

How can I remove all HTML tags in PHP except text wrapped in <> characters (book names)?

//There's other HTML tags, like h1, div, etc.
echo strip_tags('<gone with the wind> <p>a hotest book</p>');

This returns a hotest book, but I need to keep the book name. I need a function that returns <gone with the wind> a hotest book.

score 5 · Accepted Answer


You should consider using &lt; (<) and &gt; (>) for the book titles.

answered 2013-01-04T15:28:03.790
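A minimal sketch of this idea, assuming the data producer can encode the brackets around each title before the HTML is assembled. The helper name formatBookTitle is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: wrap a title in entity-encoded angle brackets
// so that strip_tags() does not mistake it for a tag.
function formatBookTitle(string $title): string
{
    return '&lt;' . htmlspecialchars($title) . '&gt;';
}

$html = formatBookTitle('gone with the wind') . ' <p>a hotest book</p>';

// strip_tags() removes <p>...</p> but leaves the entities untouched.
$stripped = strip_tags($html);

// Decode only if literal angle brackets are needed (e.g. plain-text output).
$plain = htmlspecialchars_decode($stripped);

echo $plain, PHP_EOL;
```

If the result is going back into an HTML page, the decoding step can be skipped, since the browser renders the entities as < and >.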

score 3 · Accepted Answer

The following uses DOM to find elements that are not valid HTML4 elements and treats them as book titles. Those elements are then whitelisted in strip_tags.

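// Example input from the question; $html was not defined in the original snippet.
$html = '<gone with the wind> <p>a hotest book</p>';

// Collect libxml parse errors instead of emitting warnings.
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);

// Turn each "Tag xyz invalid" error into "<xyz>" and allow those tags in strip_tags.
echo strip_tags($html, implode('', 
    array_map(
        function($error) {
            return '<' . sscanf($error->message, 'Tag %s invalid')[0] . '>';
        },
        libxml_get_errors()
    )
));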


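Note that book titles starting with a valid HTML tag name are treated as valid HTML and are therefore removed (for example, "Body of Evidence" or "Head First PHP"). Also note that <gone with the wind> is treated as an element "gone" with the attributes "with", "the" and "wind". For valid elements, you could check whether they have only empty attributes and remove them if they do not, but that is still not 100% accurate when a title consists solely of valid element names. You could additionally check for closing tags, but I don't know how to do that with DOM (XMLParser can detect them).

In any case, finding a better format for these book titles, such as using a namespace or a delimiter other than angle brackets, would greatly improve your chances of doing this properly.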
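As a rough illustration of the "different delimiter" suggestion, the sketch below assumes titles arrive wrapped in double square brackets. That [[...]] convention is purely hypothetical; any delimiter that cannot be confused with HTML would work the same way.

```php
<?php
// Assumed input format: titles are delimited with [[ ]] instead of < >.
$data = '[[gone with the wind]] <p>a hotest book</p>';

// strip_tags() now only sees real HTML, so no whitelist is needed.
$plain = strip_tags($data);

// Convert the delimiter back to angle brackets for display.
$out = preg_replace('/\[\[(.+?)\]\]/', '<$1>', $plain);

echo $out, PHP_EOL;
```

With an unambiguous delimiter, titles such as "Head" or "Body of Evidence" no longer collide with element names.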

score 1 · Accepted Answer

This is a simple solution, but not a foolproof one.

PHP

$data = "<gone with the wind> <p>a hotest book</p>";
$out = preg_replace("/\<\w+\>|\<\/\w+\>/im", "", $data);

var_dump($out);

Output

string '<gone with the wind> a hotest book' (length=34)

Matches

<p>text</p>
<anything>text</anything>

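Does not match

<img src="url">

As I said before, there is no way for the code to know what a book title looks like. However, if the data is expected to contain only simple tags such as <p>, this will work.

It's a crazy solution, but I thought I'd throw it out there.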
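The limits described above can be checked directly with the same pattern. The one-word title <Dune> is a hypothetical input added here to show a case the pattern gets wrong.

```php
<?php
// Same pattern as in the answer: only bare <word> and </word> tags match.
$pattern = "/\<\w+\>|\<\/\w+\>/im";

// Simple tags are removed.
$simple = preg_replace($pattern, "", "<p>text</p>");

// A tag with attributes does not match and is left in place.
$withAttribute = preg_replace($pattern, "", '<img src="url">');

// A one-word title looks exactly like a simple tag and is removed too.
$oneWordTitle = preg_replace($pattern, "", "<Dune> <p>a novel</p>");

var_dump($simple, $withAttribute, $oneWordTitle);
```

So the approach depends on titles containing at least one non-word character (such as a space) and on the markup containing no attributes.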

score 1 · Accepted Answer

You can also do it quite simply like this.

   <?php
   $string = htmlspecialchars("<gone with the wind>");
   echo strip_tags( "$string <p>a hotest book</p>");
   ?>

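When rendered in a browser, this outputs the following (the raw string contains &lt; and &gt; instead of the literal brackets):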

   <gone with the wind> a hotest book


score 0 · Accepted Answer

The best thing I can think of is to do something like this. I didn't know which types of tags would be used, so I assumed all of them. This removes anything that is a valid HTML tag and leaves anything that is not.

<?php
$tags = array("!DOCTYPE","a","abbr","acronym","address","applet","area","article","aside","audio","b","base","basefont","bdi","bdo","big","blockquote","body","br","button","canvas","caption","center","cite","code","col","colgroup","command","datalist","dd","del","details","dfn","dir","div","dl","dt","em","embed","fieldset","figcaption","figure","font","footer","form","frame","frameset","h1","h2","h3","h4","h5","h6","head","header","hgroup","hr","html","i","iframe","img","input","ins","kbd","keygen","label","legend","li","link","map","mark","menu","meta","meter","nav","noframes","noscript","object","ol","optgroup","option","output","p","param","pre","progress","q","rp","rt","ruby","s","samp","script","section","select","small","source","span","strike","strong","style","sub","summary","sup","table","tbody","td","textarea","tfoot","th","thead","time","title","tr","track","tt","u","ul","var","video","wbr");

$string = "<gone with the wind> <p>a hotest book</p>";


// \b keeps a tag name from matching as a mere prefix, e.g. "th" in <the wind>.
echo preg_replace("/<(\/|)(".implode("|", $tags).")\b.*>/iU", "", $string);

The final output looks like this:

<gone with the wind> a hotest book

score 0 · Accepted Answer

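There is no way to know which <> are HTML tags and which are book titles, so you are out of luck. You can't even write something that looks for things that look like tags but aren't actually valid HTML tags, because you might get a record for the Monkees' 1968 movie "Head", and <Head> is a valid HTML tag.

You need to work this out with whoever provides the data, so that you can then use PHP's strip_tags function.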

score 0 · Accepted Answer

$string = '<gone with the wind> <p>a hotest book</p>';
$string = strip_tags(preg_replace("/<([\w\s\d]{6,})>/", "&lt;$1&gt;", $string));
$string = html_entity_decode($string);

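The above converts the <> around any "tag" of at least 6 characters (word characters and spaces only) to &lt; and &gt; so that strip_tags can be used, then decodes the entities back to <>.

Depending on the incoming data, you may need to experiment with the value 6. If you receive tags such as <article>, you may need to push it higher.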
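To see how the threshold matters, here is a small sketch that wraps the same steps in a function with the minimum length as a parameter. The function name stripTagsKeepingLong and the sample input containing <article> are assumptions made for illustration.

```php
<?php
// Hypothetical wrapper around the answer's approach with a tunable threshold.
function stripTagsKeepingLong(string $input, int $minLength): string
{
    // Protect "<...>" runs of at least $minLength word/space characters.
    $pattern = '/<([\w\s]{' . $minLength . ',})>/';
    $escaped = preg_replace($pattern, '&lt;$1&gt;', $input);

    // Strip the remaining tags, then restore the protected brackets.
    // Note: this also decodes any entities already present in the input.
    return html_entity_decode(strip_tags($escaped));
}

$input = '<gone with the wind> <article>a hotest book</article>';

// "article" has 7 characters, so a threshold of 6 wrongly protects <article>.
$tooLow = stripTagsKeepingLong($input, 6);

// Raising the threshold to 8 lets strip_tags() remove it.
$higher = stripTagsKeepingLong($input, 8);

var_dump($tooLow, $higher);
```

Raising the threshold fixes <article> but starts stripping short titles, so the value is a trade-off that depends on the data.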
