awk - Awk: character frequency from a single text file?

Question

Given a multilingual .txt file such as:

But where is Esope the holly Bastard
But where is 생 지 옥 이 군
지 옥 이
지 옥
지
我 是 你 的 爸 爸 ！
爸 爸 ！ ！ ！
你 不 會 的 ！

I used this awk command to count the frequency of space-separated words:

$ awk '{a[$1]++}END{for(k in a)print a[k],k}' RS=" |\n" myfile.txt | sort

Which elegantly gives:

1 생
1 군
1 Bastard
1 Esope
1 holly
1 the
1 不
1 我
1 是
1 會
2 이
2 But
2 is
2 where
2 你
2 的
3 옥
4 지
4 爸
5 ！

How can I change it to count character frequency instead?

EDIT: For character frequency, I used (@Sudo_O's answer):

$ grep -o '\S' myfile.txt | awk '{a[$1]++}END{for(k in a)print a[k],k}' | sort > myoutput.txt

For word frequency, use:

$ grep -o '\w*' myfile.txt | awk '{a[$1]++}END{for(k in a)print a[k],k}' | sort > myoutput.txt
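
A note on ordering: plain `sort` compares lines as text, so a count of 10 would be listed before a count of 2. If numeric ordering matters, sorting on the first field numerically is one option. Below is a minimal sketch, assuming a `sort` that supports `-k` with the `n` and `r` modifiers; the function name `freq_sorted` is made up for illustration and is not part of the original commands.

```shell
# freq_sorted (hypothetical helper): reads one token per line on stdin and
# prints "count token" lines, highest count first, ties broken by token.
freq_sorted() {
    awk '{a[$1]++} END {for (k in a) print a[k], k}' | sort -k1,1nr -k2
}

# Small demo: ten "x" tokens and two "y" tokens.
printf '%s\n' x x x x x x x x x x y y | freq_sorted
# prints:
# 10 x
# 2 y
```

With the file from the question this would be used as `grep -o '\S' myfile.txt | freq_sorted > myoutput.txt`.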

score 3 · Accepted Answer

One way:

$ grep -o '\S' file | awk '{a[$1]++}END{for(k in a)print a[k],k}' 
3 옥
4 h
2 u
2 i
3 B
5 ！
2 w
4 爸
1 군
4 지
1 y
2 l
1 E
1 會
2 你
1 是
2 a
1 不
2 이
2 o
1 p
2 的
1 d
1 생
3 r
6 e
4 s
1 我
4 t

Use redirection to save the output to a file:

$ grep -o '\S' file | awk '{a[$1]++}END{for(k in a)print a[k],k}' > output

And for sorted output:

$ grep -o '\S' file | awk '{a[$1]++}END{for(k in a)print a[k],k}' | sort > output
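
Since the question asked about changing the awk command itself, here is a sketch of an awk-only variant that avoids `grep`. It relies on `split()` with an empty separator yielding one element per character, which is an extension found in gawk, mawk and some other awks rather than something POSIX guarantees. For the multilingual file, this assumes gawk running in a UTF-8 locale; a byte-oriented awk would split multibyte characters into separate bytes. Only ASCII spaces and tabs are skipped here, and the name `char_freq` is hypothetical.

```shell
# char_freq (hypothetical helper): counts non-blank characters in the files
# given as arguments, or in stdin when no file is given.
char_freq() {
    awk '{
        # Empty separator: one array element per character (awk extension).
        n = split($0, chars, "")
        for (i = 1; i <= n; i++)
            if (chars[i] != " " && chars[i] != "\t")
                count[chars[i]]++
    }
    END { for (c in count) print count[c], c }' "$@" | sort -k1,1nr -k2
}

# Small ASCII demo:
printf 'aab\nb c\n' | char_freq
# prints:
# 2 a
# 2 b
# 1 c
```

Usage on the question's file would be `char_freq myfile.txt > output`.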
