sed - What is regular expression for first field containing alpha-numeric?

Question

I have data that starts out like this in a .csv file

"684MF7","684MF7","RN"

The first field "684MF7" should only contain numeric characters; no alpha characters should be present in the first field. I have other checks for the second field, which in this case is also "684MF7", which is a legitimate value for the second field.

I want to find any alpha in the first field, and print that line. I invoke this sed file

{
        /^".*[^0-9]*.*",/p
}

with -n and -f (for the file name).

What regular expression isolates the first field only? I am getting a match on everything, which isn't what I want. Is my problem because I am trying to match zero or more instead of 1 or more alpha characters?

score 2 · Accepted Answer

The first field (with any content) is selected by:

/^"[^"]*"/

You want at least one of the characters in the field to be alphabetic (though it may be better to think of it as "non-digit"), in which case you need one of:

/^"[^"]*[A-Za-z][^"]*"/
/^"[^"]*[^0-9"][^"]*"/
/^"[^"]*[[:alpha:]][^"]*"/
/^"[^"]*[^"[:digit:]][^"]*"/

Note that the negated classes must not match the double quote either (one reason why you should always test your answers: the first version of the script below listed both input lines).

Then convert one of those into a sed command:

sed -n '/^"[^"]*[^"[:digit:]][^"]*"/p' <<EOF
"684MF7","684MF7","RN"
"684007","684MF7","RN"
EOF
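To see why the negated class has to exclude the double quote, here is a minimal sketch comparing the two forms on the same sample rows. The variable names (`data`, `loose`, `strict`) are only for illustration.

```shell
# Two sample rows: only the first has a non-digit in field 1.
data='"684MF7","684MF7","RN"
"684007","684MF7","RN"'

# Negated class WITHOUT the quote: [^0-9] can match the closing quote of
# field 1, so the match spills into field 2 and both rows are printed.
loose=$(printf '%s\n' "$data" | sed -n '/^"[^"]*[^0-9][^"]*"/p')

# Negated class that also excludes the quote: the non-digit must sit
# inside field 1, so only the first row is printed.
strict=$(printf '%s\n' "$data" | sed -n '/^"[^"]*[^0-9"][^"]*"/p')

printf 'loose:\n%s\n' "$loose"
printf 'strict:\n%s\n' "$strict"
```

In the loose form, `[^"]*` consumes `684007`, `[^0-9]` consumes the closing `"`, `[^"]*` consumes the comma, and the final `"` matches the opening quote of the second field.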

Another way of looking at the problem is "print any line where the first field is not all digits (with at least one digit present)". That is:

sed -n '/^"[[:digit:]]\{1,\}"/!p' <<EOF
"684MF7","684MF7","RN"
"684007","684MF7","RN"
EOF

On the whole, this is probably the better solution to use (and I won't complain if you use [0-9] in place of [[:digit:]]).
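The two approaches differ on one edge case that is easy to check: an empty first field. The sketch below adds a third, hypothetical row with an empty first field (not part of the original data) and runs both commands over it.

```shell
# Sample rows plus a hypothetical row whose first field is empty.
data='"684MF7","684MF7","RN"
"684007","684MF7","RN"
"","684MF7","RN"'

# "Contains a non-digit": an empty field has no non-digit, so it is NOT printed.
positive=$(printf '%s\n' "$data" | sed -n '/^"[^"]*[^"[:digit:]][^"]*"/p')

# "Is not one or more digits": an empty field fails the all-digits test,
# so it IS printed.
negated=$(printf '%s\n' "$data" | sed -n '/^"[[:digit:]]\{1,\}"/!p')

printf 'positive:\n%s\n' "$positive"
printf 'negated:\n%s\n' "$negated"
```

Which behavior is right depends on whether an empty first field counts as invalid in your data; if it does, the negated form catches it without a separate check.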

score 1 · Accepted Answer

In general, wrapping .* around other expressions tends to match more than you expect. Try writing a more specific expression that uses less greedy wildcard matches.
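As an illustration of that tendency, the following sketch runs the pattern from the question against one bad row and one good row. It also tries a variant that demands one or more non-digits, to show that the `*` on `[^0-9]` is only part of the problem. The variable names are illustrative.

```shell
data='"684MF7","684MF7","RN"
"684007","684MF7","RN"'

# Pattern from the question: [^0-9]* may match nothing at all, and .* is
# free to run across quotes, so every row with a '",' in it matches.
zero_or_more=$(printf '%s\n' "$data" | sed -n '/^".*[^0-9]*.*",/p')

# Requiring at least one non-digit still matches both rows, because .*
# can skip past field 1 and find the non-digit (a quote, a comma, or a
# letter in field 2) later in the line.
one_or_more=$(printf '%s\n' "$data" | sed -n '/^".*[^0-9][^0-9]*.*",/p')

printf 'zero or more:\n%s\n' "$zero_or_more"
printf 'one or more:\n%s\n' "$one_or_more"
```

Both commands print both rows, which suggests that the unbounded `.*` needs to be replaced by something that cannot cross a quote, such as `[^"]*`.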

I found that this works:

> sed -n '/^".*[A-Z].*",".*",".*"/p' <(echo '"684MF7","684MF7","RN"')
> "684MF7","684MF7","RN"
> sed -n '/^".*[A-Z].*",".*",".*"/p' <(echo '"684117","684MF7","RN"')
>

It picks up all of the groups enclosed in double quotes.
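One caveat worth testing: this pattern appears to rely on the line having exactly three fields, because the two `","` separators are what pin `[A-Z]` to the first field. The sketch below uses a hypothetical four-field row (not from the question) to show the difference, along with a variant that uses `[^"]*` so the match cannot leave the first field.

```shell
pattern='/^".*[A-Z].*",".*",".*"/p'

# Three fields, digits only in field 1: no output, as shown above.
three=$(printf '%s\n' '"684117","684MF7","RN"' | sed -n "$pattern")

# Hypothetical four-field row: [A-Z] can now match the M in field 2,
# and two "," separators still follow it, so the row is printed.
four=$(printf '%s\n' '"684117","684MF7","RN","X"' | sed -n "$pattern")

# Restricting the wildcards to [^"]* keeps the match inside field 1,
# regardless of how many fields follow.
anchored=$(printf '%s\n' '"684117","684MF7","RN","X"' |
    sed -n '/^"[^"]*[A-Z][^"]*",/p')

printf 'three: [%s]\nfour: [%s]\nanchored: [%s]\n' "$three" "$four" "$anchored"
```

Note also that `[A-Z]` covers uppercase letters only; lowercase letters or punctuation in the first field would go unreported.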
