c++ - Removing C/C++ style comments using boost::regex

Question

I'm trying to remove C and C++ style comments from a string using a regular expression. I found one for Perl that appears to handle both:

s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//([^\\]|[^\n][\n]?)*?\n|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $3 ? $3 : ""#gse;

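But I'm not sure how to use it in a boost::regex code block, or what I need to change to make it usable with boost::regex.

FYI: I found the regex in perlfaq6, and it seems to cover all the cases I need.

I would prefer not to use boost::spirit::qi, because it adds a lot of time to the project's compilation.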

Edit:

std::string input = "hello /* world */ world";

boost::regex reg("(/\\*([^*]|(\\*+[^*/]))*\\*+/)|(//.*)");

input = boost::regex_replace(input, reg, "");

So the short regex does work, but the long one does not.
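For anyone who wants to try the short pattern without Boost, here is a minimal sketch that feeds the same pattern string to std::regex instead of boost::regex (a swapped-in library, not the code from the question). The function name strip_comments_short is hypothetical. Note that the `//.*` branch knows nothing about string literals, so it also cuts a `//` that appears inside a quoted string.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: applies the short pattern from the question,
// using std::regex (ECMAScript syntax) in place of boost::regex.
std::string strip_comments_short(const std::string& input) {
    // Same pattern string as in the question; each regex backslash is
    // doubled because it sits inside an ordinary C++ string literal.
    static const std::regex reg("(/\\*([^*]|(\\*+[^*/]))*\\*+/)|(//.*)");
    // Every match (block comment or line comment) is replaced by nothing.
    return std::regex_replace(input, reg, "");
}
```

With this sketch, "hello /* world */ world" becomes "hello  world" (two spaces remain), while a line such as `s = "http://a";` is truncated to `s = "http:`, which is the kind of case the longer perlfaq6 pattern is meant to handle.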

score 3 · Accepted Answer

It seems a bit odd to use a regex for this when Boost already has a C++ preprocessor library (Boost.Wave) that you can use to strip comments.

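#include <string>
#include <boost/wave.hpp>
#include <boost/wave/cpplexer/cpp_lex_token.hpp>    // lex_token
#include <boost/wave/cpplexer/cpp_lex_iterator.hpp> // lex_iterator

std::string strip_comments(std::string const& input) {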
    std::string output;
    typedef boost::wave::cpplexer::lex_token<> token_type;
    typedef boost::wave::cpplexer::lex_iterator<token_type> lexer_type;
    typedef token_type::position_type position_type;

    position_type pos;

    lexer_type it = lexer_type(input.begin(), input.end(), pos, 
        boost::wave::language_support(
            boost::wave::support_cpp|boost::wave::support_option_long_long));
    lexer_type end = lexer_type();

    for (;it != end; ++it) {
        if (*it != boost::wave::T_CCOMMENT
         && *it != boost::wave::T_CPPCOMMENT) {
            output += std::string(it->get_value().begin(), it->get_value().end());
        }
    }
    return output;
}
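The advantage of a lexer over a pattern is that it knows when it is inside a string or character literal. The sketch below is not Boost.Wave and not part of the answer; it is a small hand-written scanner, with the hypothetical name strip_comments_scan, that only illustrates that idea in standard C++. It assumes plain C-style source: raw string literals, trigraphs, and backslash line continuations inside `//` comments are not handled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not Boost.Wave): drops /* ... */ and // comments
// while copying string and character literals through unchanged.
std::string strip_comments_scan(const std::string& in) {
    enum class State { Code, Block, Line, Str, Chr };
    State st = State::Code;
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const char c = in[i];
        const bool has_next = i + 1 < in.size();
        const char next = has_next ? in[i + 1] : '\0';
        switch (st) {
        case State::Code:
            if (c == '/' && has_next && next == '*') {
                st = State::Block; ++i;          // skip the "/*" opener
            } else if (c == '/' && has_next && next == '/') {
                st = State::Line; ++i;           // skip the "//" opener
            } else {
                if (c == '"') st = State::Str;
                else if (c == '\'') st = State::Chr;
                out += c;
            }
            break;
        case State::Block:
            if (c == '*' && has_next && next == '/') {
                st = State::Code; ++i;           // skip the "*/" closer
            }
            break;
        case State::Line:
            if (c == '\n') { st = State::Code; out += c; }  // keep the newline
            break;
        case State::Str:
        case State::Chr:
            out += c;
            if (c == '\\' && has_next) {
                out += next; ++i;                // copy the escaped character
            } else if ((st == State::Str && c == '"') ||
                       (st == State::Chr && c == '\'')) {
                st = State::Code;                // literal closed
            }
            break;
        }
    }
    return out;
}
```

Unlike a pattern that matches `//.*` anywhere, this leaves `s = "http://a";` intact and only removes the trailing comment. For real C++ input, the Boost.Wave version above is the more complete choice.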

score 0 · Accepted Answer

If \* becomes \\* then why shouldn't [^\\] become [^\\\\]?
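To spell the escaping rule out: inside an ordinary C++ string literal every backslash that the regex engine should see has to be doubled, so a regex that already contains a doubled backslash ends up with four. A minimal sketch, assuming a C++11 compiler so that raw string literals are available for comparison; the function names are hypothetical and exist only for this illustration.

```cpp
#include <string>

// Escaped form: each regex backslash is written twice for the compiler.
std::string escaped_star()          { return "\\*"; }      // regex sees: \*
std::string escaped_not_backslash() { return "[^\\\\]"; }  // regex sees: [^\\]

// Raw string form (C++11): the regex is written exactly as the engine sees it.
std::string raw_star()          { return R"(\*)"; }
std::string raw_not_backslash() { return R"([^\\])"; }
```

Both spellings produce the same characters, so either can be passed to the regex constructor. With raw strings a Perl pattern can be pasted in without re-escaping its backslashes, though any Perl-specific syntax and the replacement expression would still need to be adapted separately.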
