bash - How do I delete all lines of a CSV file whose second column is longer than 3 characters?

Question

How can I delete all lines of a CSV file whose second column is longer than 3 characters? For example:

cave,ape,1
tree,monkey,2

The second line would be deleted because its second column contains more than 3 characters.

score 9 · Accepted Answer

awk -F, 'length($2)<=3' input.txt
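
As a minimal sketch, the same filter can be wrapped in a function and run on a few sample rows to check its behavior. The function name `keep_short_second_field` and the third sample row are assumptions added for illustration. Whether `length` counts bytes or characters for non-ASCII text depends on the awk implementation and the locale.

```shell
#!/bin/sh
# Hypothetical helper: print only the rows whose second comma-separated
# field is at most 3 characters long (reads stdin, writes stdout).
keep_short_second_field() {
    awk -F, 'length($2) <= 3'
}

# Sample rows from the question, plus one row with an empty first column.
printf '%s\n' 'cave,ape,1' 'tree,monkey,2' ',cat,3' | keep_short_second_field
```

Because awk splits on every comma, a row with an empty first column is handled the same way as any other row.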

answered 2012-04-12T12:55:39.743

score 2 · Accepted Answer

Nobody has offered a sed answer yet, so here is one:

sed -e '/^[^,]*,[^,]\{4\}/d' animal.csv

And here is some test data:

>animal.csv cat <<'.'      
cave,ape,0
,cat,1
,orangutan,2
large,wolf,3
,dog,4,happy
tree,monkey,5,sad
.

And now to test it:

sed -i'' -e '/^[^,]*,[^,]\{4\}/d' animal.csv
cat animal.csv

Only the ape, cat, and dog rows appear in the output.
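
The `-i` option is handled differently by GNU sed and BSD/macOS sed (the latter expects the backup suffix as a separate argument), so a sketch that avoids in-place editing may be easier to carry between systems. The function name `drop_long_second_field` and the temporary file name in the comment are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: the same sed pattern, used as a stdin-to-stdout filter
# so that the input file is never modified in place.
drop_long_second_field() {
    sed -e '/^[^,]*,[^,]\{4\}/d'
}

# Assumed usage on a real file: write to a temporary file, then replace the
# original only if sed succeeded.
# drop_long_second_field < animal.csv > animal.csv.tmp && mv animal.csv.tmp animal.csv

# Quick check on a few of the test rows.
printf '%s\n' 'cave,ape,0' ',orangutan,2' 'large,wolf,3' ',dog,4,happy' | drop_long_second_field
```

Of these four rows, only the ape and dog rows are printed.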

score 2 · Accepted Answer

Here is a filter script for this kind of data. It assumes the data is UTF-8.

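#!/bin/bash
# px: print the bytes encoded by a string of hex digits, two digits at a time.
function px {
 local a="$@"
 local i=0
 while [ $i -lt ${#a}  ]
  do
   printf \\x${a:$i:2}
   i=$(($i+2))
  done
}
# Convert stdin to UTF-16 and emit one 16-bit code unit (4 hex digits) per line.
# -v keeps od from collapsing repeated output lines into "*".
(iconv -f UTF8 -t UTF16 | od -v -x |  cut -b 9- | xargs -n 1) |
if read utf16header
then
 # The first code unit is the byte-order mark; pass it through unchanged.
 px $utf16header
 cnt=0   # code units counted after the first comma, including the closing comma or newline
 out=''  # hex digits of the current line
 st=0    # number of commas seen so far on the current line
 while read line
  do
   if [ "$st" -eq 1 ] ; then
     cnt=$(($cnt+1))
   fi
   if [ "$line" == "002c" ] ; then
     st=$(($st+1))
   fi
   if [ "$line" == "000a" ]
    then
     out=$out$line
     # Print the line only if the second field has at most 3 code units.
     if [[ $cnt -le 3+1 ]] ; then
        px $out
     fi
     cnt=0
     out=''
     st=0
   else
    out=$out$line
   fi
  done
fi | iconv -f UTF16 -t UTF8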

score 2 · Accepted Answer

You can use the following command:

grep -vE "^[^,]+,[^,]{4,}," test.csv > filtered.csv

Breakdown of the grep options:

-v = remove lines matching
-E = extended regular expression syntax (also -P is perl syntax)

The bash part:

> filename = overwrite/create a file and fill it with the standard out

Breakdown of the regular expression:

"^[^,]+,[^,]{4,},"

^ = beginning of line
[^,] = anything except commas
[^,]+ = 1 or more of anything except commas
, = comma
[^,]{4,} = 4 or more of anything except commas

Also note that the above is simplified and will not work if the data in the first two columns contains commas. (It cannot tell an escaped comma from a raw one.)
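
The pattern as written also requires a non-empty first column (`[^,]+`) and a comma after the second column, so rows such as `,orangutan,3` or `a,wolf` are not removed. A possible variant, sketched here as an assumption rather than as the answer's own command, relaxes both requirements; the function name `drop_long_second_field_grep` is hypothetical.

```shell
#!/bin/sh
# Hypothetical variant (not the answer's original command): drop rows whose
# second field has 4 or more characters, even when the first field is empty
# or the second field is the last one on the line.
drop_long_second_field_grep() {
    grep -vE '^[^,]*,[^,]{4}'
}

printf '%s\n' 'cave,ape,1' 'tree,monkey,2' ',orangutan,3' 'a,wolf' | drop_long_second_field_grep
```

Matching only the first four characters of the second field is enough, because any field longer than 3 characters necessarily starts with 4 non-comma characters. Like the original, this variant still does not handle quoted or escaped commas.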
