awk - Counting occurrences of fields

Question

How can I use awk to count the unique strings across multiple columns and display only those counts?

My input file, c.txt:

US A one
IN A two
US B one
LK C one
US B two
US A three
IN A three
US B one
LK C two
US B three
US A one
IN A one
US B three
LK C three
US B two
US A two
IN A two
US B two
LK C three
US B two
US A one
IN A two
US B one
LK C one
US B two
US A three
IN A three
US B one
LK C two
US B three
US A one
IN A one
US B three
LK C three
US B two
US A two
IN A two
US B two
LK C three
US B two
US A one
IN A two
US B one
LK C one
US B two
US A three
IN A three
US B one
LK C two
US B three
US A one
IN A one
US B three
LK C three
US B two
US A two
IN A two
US B two
LK C three
US B two
US A one
IN A two
US B one
LK C one
US B two
US A three
IN A three
US B one
LK C two
US B three
US A one
IN A one
US B three
LK C three
US B two
US A two
IN A two
US B two
LK C three
US B two
US A one
IN A two
US B one
LK C one
US B two
US A three
IN A three
US B one
LK C two
US B three
US A one
IN A one
US B three
LK C three
US B two
US A two
IN A two
US B two
LK C three
US B two

I was able to achieve this, but only with three separate commands. Is it possible to get all of the output from a single command?

awk '{a[$1]++}END{for (i in a)print i,a[i]}' c.txt
awk '{a[$1" "$2]++}END{for (i in a)print i,a[i]}' c.txt
awk '{a[$1" "$2" "$3]++}END{for (i in a)print i,a[i]}' c.txt

My desired output is:

IN 20 A 20 one 5
IN 20 A 20 three 5
IN 20 A 20 two 10
LK 20 C 20 one 5
LK 20 C 20 three 10
LK 20 C 20 two 5
US 60 A 20 one 10
US 60 A 20 three 5
US 60 A 20 two 5
US 60 B 40 one 10
US 60 B 40 three 10
US 60 B 40 two 20

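Column 2 is the number of occurrences of each unique value in column 1 of the input file.

Column 4 is the number of occurrences of each unique combination of columns 1 and 2 of the input file.

Column 6 is the number of occurrences of each unique combination of columns 1, 2, and 3 of the input file.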

score 3 · Accepted Answer

You can use the following GNU awk script (it relies on asorti, which is a gawk extension).

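$ cat count.awk
{
    lines[$0]=$0             # remember each distinct input line
    count1[$1]++             # occurrences of column 1
    count2[$1,$2]++          # occurrences of columns 1 and 2 together
    count3[$1,$2,$3]++       # occurrences of columns 1, 2 and 3 together
}
END{
    # gawk only: lines[1..n] now hold the sorted distinct input lines
    n = asorti(lines)
    for (i=1;i<=n;i++) {
        split(lines[i],field,FS)
        total1 = count1[field[1]]
        total2 = count2[field[1],field[2]]
        total3 = count3[field[1],field[2],field[3]]

        print field[1],total1,field[2],total2,field[3],total3
    }
}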

To run the script on the file:

$ awk -f count.awk file
IN 20 A 20 one 5
IN 20 A 20 three 5
IN 20 A 20 two 10
LK 20 C 20 one 5
LK 20 C 20 three 10
LK 20 C 20 two 5
US 60 A 20 one 10
US 60 A 20 three 5
US 60 A 20 two 5
US 60 B 40 one 10
US 60 B 40 three 10
US 60 B 40 two 20
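
If GNU awk is not available, a similar table can probably be produced without asorti by reading the file twice: the first pass fills the counters, and the second pass prints each distinct row once. This is a sketch of a different technique from the script above, not part of the original answer. The function name count_fields and the four-line sample are made up for illustration, the sorted order comes from the external sort command, and the input must be a regular file because it is read twice.

```shell
# Hypothetical helper: count_fields FILE
# Pass 1 (NR == FNR) only counts; pass 2 prints each distinct row once.
count_fields() {
    awk '
        NR == FNR {                  # first read of the file
            c1[$1]++
            c2[$1,$2]++
            c3[$1,$2,$3]++
            next
        }
        !seen[$1,$2,$3]++ {          # second read: first time this row is seen
            print $1, c1[$1], $2, c2[$1,$2], $3, c3[$1,$2,$3]
        }
    ' "$1" "$1" | LC_ALL=C sort
}

# Small illustrative input (not the 100-line c.txt from the question)
sample=$(mktemp)
printf '%s\n' 'US A one' 'US B two' 'IN A one' 'US A one' > "$sample"
count_fields "$sample"
rm -f "$sample"
```

On this sample, US appears three times, "US A" twice and "US B" once, so the US rows carry 3 in the second column and 2 or 1 in the fourth.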

score 2 · Accepted Answer

Try this awk one-liner:

$ awk '{a[$1]++;b[$1,$2]++;c[$1,$2,$3]++}END{for (i in c) {split (i, d, SUBSEP); print d[1],a[d[1]],d[2],b[d[1],d[2]],d[3],c[d[1],d[2],d[3]] } }' file | sort
IN 20 A 20 one 5
IN 20 A 20 three 5
IN 20 A 20 two 10
LK 20 C 20 one 5
LK 20 C 20 three 10
LK 20 C 20 two 5
US 60 A 20 one 10
US 60 A 20 three 5
US 60 A 20 two 5
US 60 B 40 one 10
US 60 B 40 three 10
US 60 B 40 two 20

Or, in a more readable form:

$ awk '
    {
        a[$1]++
        b[$1,$2]++
        c[$1,$2,$3]++
    }
    END{
        for (i in c) {
            split(i, d, SUBSEP)
            print d[1], a[d[1]],
                  d[2], b[d[1], d[2]],
                  d[3], c[d[1], d[2], d[3]]
        }
    }' file | sort
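
The split(i, d, SUBSEP) step works because awk stores a subscript such as c[$1,$2,$3] as a single string, with the fields joined by SUBSEP (a non-printing character by default). Here is a minimal sketch that makes the key visible. The function name show_key and the "|" stand-in are only for illustration.

```shell
# Hypothetical helper: reads rows on stdin and prints each composite key
# with SUBSEP shown as "|", then the number of parts and the row count.
show_key() {
    awk '
        { c[$1,$2,$3]++ }
        END {
            for (k in c) {
                n = split(k, d, SUBSEP)   # d[1], d[2], d[3] are the original fields
                shown = k
                gsub(SUBSEP, "|", shown)  # replace the separator for display only
                print shown, n, c[k]
            }
        }
    ' | LC_ALL=C sort
}

printf '%s\n' 'US A one' 'US A one' 'IN A two' | show_key
```

Because the loop variable already holds the full key, c[k] returns the same count as c[d[1], d[2], d[3]].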
