c++ - Boost Spirit alternative operator '|' fails when there are two possible rules to go

Question

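I am working on an HTTP parser, and I ran into a problem when parsing with the alternative operator. It is not about the values in the attribute, which can be fixed with hold[]. The problem occurs when two rules start out the same way. Here are some simple rules that demonstrate it: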

qi::rule<string_iterator> some_rule(
        (char_('/') >> *char_("0-9")) /*first rule accept  /123..*/
      | (char_('/') >> *char_("a-z")) /*second rule accept /abc..*/
    );

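I then parse with this rule using qi::parse, which fails when the input string is something like "/abcd".

However, when I put the second rule before the first one, the parser returns true. I think the problem is that the parser consumes input with the first rule and then finds that the first rule fails. It does not go back to the second rule, which is the alternative to the first.

I tried applying hold[] to the first rule, but that only helps with generating the attribute. It does not fix this problem. I don't know how to fix it, because HTTP has many rules whose beginning is the same as that of other rules.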

=========== More info about my code ============================
Here is my function for parsing a string:

typedef std::string::const_iterator string_iterator;
typedef qi::rule<string_iterator, std::string()> rules_t;
void parse_to_string(const std::string& s, rules_t& r, std::string& result)
{
    using namespace rule;
    using qi::parse;

    std::string::const_iterator iter = s.begin();
    std::string::const_iterator end = s.end();

    bool err = parse(iter, end, r, result);

    if ( err && (iter==end) )
    {
           std::cout << "[correct]" << result << std::endl;
    }
    else
    {
          std::cout << "[incorrect]" << s << std::endl;
          std::cout << "[dead with]" << result << std::endl;
    }
}

In main I have this code:

std::string result;
result = "";
str = "/htmlquery?";
qi::rule<string_iterator, std::string()> rule_wo_question( char_('/') >> *char_("a-z"));
qi::rule<string_iterator, std::string()> rule_w_question( char_('/') >> *char_("a-z") >> char_('?'));
qi::rule<string_iterator, std::string()> whatever_rule( rule_wo_question
                                                        | rule_w_question
                                                       );
parse_to_string(str, whatever_rule, result);

And I got this result:

[incorrect]/htmlquery?
[dead with]/htmlquery    <= you can see that it cannot consume '?'

However, when I switch the rules like this (I put "rule_w_question" before "rule_wo_question"):

std::string result;
    result = "";
    str = "/htmlquery?";
    qi::rule<string_iterator, std::string()> rule_wo_question( char_('/') >> *char_("a-z"));
    qi::rule<string_iterator, std::string()> rule_w_question( char_('/') >> *char_("a-z") >> char_('?'));
    qi::rule<string_iterator, std::string()> whatever_rule( rule_w_question
                                                            | rule_wo_question
                                                           );
    parse_to_string(str, whatever_rule, result);

The output is: [correct]/htmlquery?

In the first version (the wrong one), the parser seems to consume '/htmlquery' ("rule_wo_question") and then finds that it cannot consume '?', which makes this rule fail. It then cannot move on to the alternative rule ("rule_w_question"). In the end the program reports "[incorrect]".

In the second version I put "rule_w_question" before "rule_wo_question". That is why the parser reports "[correct]" as the result.

==============================================================
My whole code, built with Boost 1.47 and linked with pthread and boost_filesystem. Here is my main .c:

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/network/protocol.hpp>
#include <boost/spirit/include/phoenix_stl.hpp>
#include <boost/spirit/include/phoenix_fusion.hpp>
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/phoenix_object.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/io.hpp>
#include <boost/bind.hpp>
#include <boost/spirit/include/qi_uint.hpp>
#include <iostream>
#include <string>

using namespace boost::spirit::qi;
namespace qi = boost::spirit::qi;

typedef std::string::const_iterator string_iterator;
typedef qi::rule<string_iterator, std::string()> rules_t;
void parse_to_string(const std::string& s, rules_t& r, std::string& result)
{
    using qi::parse;

    std::string::const_iterator iter = s.begin();
    std::string::const_iterator end = s.end();

    bool err = parse(iter, end, r, result);

    if ( err && (iter==end) )
    {
           std::cout << "[correct]" << result << std::endl;
    }
    else
    {
          std::cout << "[incorrect]" << s << std::endl;
          std::cout << "[dead with]" << result << std::endl;
    }
}





int main()
{
    std::string str, result;
    result = "";
    str = "/htmlquery?";
    qi::rule<string_iterator, std::string()> rule_wo_question( char_('/') >> *char_("a-z"));
    qi::rule<string_iterator, std::string()> rule_w_question( char_('/') >> *char_("a-z") >> char_('?'));
    qi::rule<string_iterator, std::string()> whatever_rule( rule_wo_question
                                                           | rule_w_question
                                                           );
    parse_to_string(str, whatever_rule, result);
    return 0;
}

The result is:

[incorrect]/htmlquery?

[dead with]/htmlquery

score 3 · Accepted Answer

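Spirit tries the given alternatives in the order they are specified and stops parsing after the first one that matches. No exhaustive matching is performed: once one alternative matches, it stops looking. IOW, the order of the alternatives is important. You should always list the "longest" alternatives first.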

score 1 · Accepted Answer

Is there any reason not to do this instead?

some_rule(
     char_('/')
     >> (
         *char_("0-9")  /*first rule accept /123..*/
       | *char_("a-z")  /*second rule accept /abc..*/
     )
);

Edit: Actually, that would match '/' followed by nothing ("0-9" zero times) and would never bother to look for "a-z". Change * to +.

score 0 · Accepted Answer

qi::rule<string_iterator> some_rule(
    (char_('/') >> *char_("0-9")) >> qi::eol /*first rule accept  /123..*/
  | (char_('/') >> *char_("a-z")) >> qi::eol /*second rule accept /abc..*/
);

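Instead of eol you could use ',' or some other terminator. The problem is that char_('/') >> *char_("0-9") matches a '/' followed by zero or more digits. So "/abcd" matches the "/" and then parsing stops. K-ballo's solution is how I would do it in this case, but this one is offered as an alternative in case (for some reason) his is not acceptable.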

score 0 · Accepted Answer

That's because there is a match for your first rule, and Spirit is greedy.

(char_('/') >> *char_("0-9"))

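Feeding "/abcd" into this rule results in the following logic:

"/abcd" -> Is '/' the next character? Yes. Subrule matches. -> "abcd" remains.
"abcd" -> Zero or more digits? Yes, there are zero digits. Subrule matches. -> "abcd" remains.
The first clause of the alternative ('|') statement matches; the remaining alternative clauses are skipped. -> "abcd" remains.
The rule matches with "abcd" remaining. That leftover is probably never parsed, which is what causes the failure.

Consider changing the '*', which means "0 or more", to a '+', which means "1 or more".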
