regex - Splitting an extended CSV notation with a regex

Question

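I have a custom transport format that packages data in the following form:

[a:000,"Name","Field","Field","Field"]

I'm trying to split each row apart to get the first character after the left bracket and all of the CSV values: a, 000, "Name", "Field", "Field", and so on.

I've cobbled together: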

[^?,:\[\]]

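This matches every individual character rather than the colon- and comma-delimited fields. I understand that it does not account for commas inside quotes, so it's clearly rubbish!

Embedded commas aren't really a big problem, because I control the data on both ends and can escape them.

Thanks for any insight!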

score 2 · Accepted Answer

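Instead of trying to split on multiple characters and ignore some of them, try matching what you want to match. You didn't specify an implementation language, so I'm posting this for Perl, but it applies to any flavor that supports lookbehind and lookahead.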

while ($subject =~ m/(\w+(?=:)|(?<=:)\d+|(?<=,")[^"]*?(?="))/g) {
    # matched text = $&
}
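
For readers not using Perl, here is a minimal sketch of the same idea in Python. It is a port, not part of the original answer: the names `FIELD_PATTERN` and `extract_fields` are made up for illustration, and the outer capture group is dropped so that `findall` returns the whole match. Python's `re` module only accepts fixed-width lookbehinds, which both `(?<=:)` and `(?<=,")` are.

```python
import re

# Same three alternatives as the Perl pattern above, without the outer capture group.
FIELD_PATTERN = re.compile(r'\w+(?=:)|(?<=:)\d+|(?<=,")[^"]*?(?=")')


def extract_fields(line):
    """Return the record type, the number, and every quoted value, in order."""
    return FIELD_PATTERN.findall(line)


record = '[a:000,"Name","Field","Field","Field"]'
print(extract_fields(record))
# ['a', '000', 'Name', 'Field', 'Field', 'Field']
```

Because the quoted alternative only stops at the next double quote, a comma inside a quoted value such as "Smith, John" stays part of that value.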

Explanation:

# (\w+(?=:)|(?<=:)\d+|(?<=,")[^"]*?(?="))
# 
# Match the regular expression below and capture its match into backreference number 1 «(\w+(?=:)|(?<=:)\d+|(?<=,")[^"]*?(?="))»
#    Match either the regular expression below (attempting the next alternative only if this one fails) «\w+(?=:)»
#       Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
#          Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
#       Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=:)»
#          Match the character “:” literally «:»
#    Or match regular expression number 2 below (attempting the next alternative only if this one fails) «(?<=:)\d+»
#       Assert that the regex below can be matched, with the match ending at this position (positive lookbehind) «(?<=:)»
#          Match the character “:” literally «:»
#       Match a single digit 0..9 «\d+»
#          Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
#    Or match regular expression number 3 below (the entire group fails if this one fails to match) «(?<=,")[^"]*?(?=")»
#       Assert that the regex below can be matched, with the match ending at this position (positive lookbehind) «(?<=,")»
#          Match the characters “,"” literally «,"»
#       Match any character that is NOT a “"” «[^"]*?»
#          Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
#       Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=")»
#          Match the character “"” literally «"»
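
To see what each of the three alternatives contributes, they can be run separately against the sample record. The following is a small Python sketch; the labels `type`, `number`, and `quoted` are hypothetical names chosen here, not something the format defines.

```python
import re

record = '[a:000,"Name","Field","Field","Field"]'

# One entry per alternative of the combined pattern; the labels are illustrative only.
alternatives = {
    "type": r'\w+(?=:)',              # word characters directly before a colon
    "number": r'(?<=:)\d+',           # digits directly after a colon
    "quoted": r'(?<=,")[^"]*?(?=")',  # text between ," and the next "
}

matches_by_alternative = {
    label: re.findall(pattern, record)
    for label, pattern in alternatives.items()
}

for label, matches in matches_by_alternative.items():
    print(label, matches)
# type ['a']
# number ['000']
# quoted ['Name', 'Field', 'Field', 'Field']
```

Joining the three patterns with `|` gives the combined expression, and the matches then come back in the order they appear in the record.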

See it working.
