awk - Compare files with awk

Question

Hi, I have two similar files (both with 3 columns). I'd like to check whether these two files contain the same elements (but listed in a different order). First of all, I'd like to compare only the first column.

file1.txt

"aba" 0 0
"abc" 0 1
"abd" 1 1
"xxx" 0 0

file2.txt

"xyz" 0 0
"aba" 0 0
"xxx" 0 0
"abc" 1 1

How can I do it using awk? I had a look around, but I've only found complicated examples. What if I also want to include the other two columns in the comparison? The output should give me the number of matching elements.

score 29 · Accepted Answer

To print the elements common to both files:

$ awk 'NR==FNR{a[$1];next}$1 in a{print $1}' file1 file2
"aba"
"abc"
"xxx"

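Explanation:

NR and FNR are awk variables that hold the total number of records read so far and the number of records read from the current file, respectively (by default, a record is a line).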

NR==FNR {     # Only true while reading the first file
    a[$1]     # Record the first column as a key of an associative array
    next      # Skip the remaining blocks and move on to the next line
}
$1 in a {     # True if column one of the current line (second file) is a key in the array
    print $1  # If so, print it
}
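To see the two counters side by side, here is a small sketch. The scratch files `first` and `second` and their contents are made up for illustration and are not part of the question.

```shell
#!/bin/sh
# Hypothetical scratch files, used only to show how NR and FNR evolve.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/first"
printf 'c\nd\ne\n' > "$tmp/second"

# NR keeps counting across all input files; FNR restarts at 1 for each file,
# so NR == FNR holds only while the first file is being read.
nr_fnr=$(awk '{ print $0, NR, FNR, (NR == FNR ? "first file" : "later file") }' \
    "$tmp/first" "$tmp/second")
printf '%s\n' "$nr_fnr"

rm -rf "$tmp"
```

This prints `a 1 1 first file`, `b 2 2 first file`, `c 3 1 later file`, `d 4 2 later file` and `e 5 3 later file`. One caveat with the idiom: if the first file is empty, NR==FNR is also true while the second file is read, so the array is never filled the way you expect.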

If you want to match on the whole line, use $0:

$ awk 'NR==FNR{a[$0];next}$0 in a{print $0}' file1 file2
"aba" 0 0
"xxx" 0 0

Or a specific set of columns:

$ awk 'NR==FNR{a[$1,$2,$3];next}($1,$2,$3) in a{print $1,$2,$3}' file1 file2
"aba" 0 0
"xxx" 0 0
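Since the question is really about whether both files hold the same elements, it may also help to negate the test and list what is missing on either side. This is only a sketch that reuses the same idiom on the sample data from the question. The temporary directory and the variable names `only_in_2` and `only_in_1` are my own.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
"aba" 0 0
"abc" 0 1
"abd" 1 1
"xxx" 0 0
EOF
cat > "$tmp/file2" <<'EOF'
"xyz" 0 0
"aba" 0 0
"xxx" 0 0
"abc" 1 1
EOF

# First column values of the second argument that never occur in the first.
only_in_2=$(awk 'NR==FNR { a[$1]; next } !($1 in a) { print $1 }' "$tmp/file1" "$tmp/file2")
# Swap the arguments to get the other direction.
only_in_1=$(awk 'NR==FNR { a[$1]; next } !($1 in a) { print $1 }' "$tmp/file2" "$tmp/file1")

printf 'only in file2: %s\n' "$only_in_2"
printf 'only in file1: %s\n' "$only_in_1"

rm -rf "$tmp"
```

With the sample data this reports `"xyz"` as only in file2 and `"abd"` as only in file1. If both lists come back empty, the first columns of the two files contain the same set of values.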

score 6 · Accepted Answer

To print the number of matching elements, here's one way using awk:

awk 'FNR==NR { a[$1]; next } $1 in a { c++ } END { print c }' file1.txt file2.txt

Results with your input:

3
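One detail to be aware of: when nothing matches, c is never assigned, so print c emits an empty line rather than 0. A minimal sketch of the difference, using two made-up one-line files (`left` and `right` are hypothetical names) that share no element:

```shell
#!/bin/sh
# Two hypothetical files with no first-column value in common.
tmp=$(mktemp -d)
printf '"aba" 0 0\n' > "$tmp/left"
printf '"xyz" 0 0\n' > "$tmp/right"

# c is never incremented, so it is still uninitialized in END and prints as "".
plain=$(awk 'FNR==NR { a[$1]; next } $1 in a { c++ } END { print c }' "$tmp/left" "$tmp/right")
# Adding 0 forces a numeric context, so an unset c prints as 0.
forced=$(awk 'FNR==NR { a[$1]; next } $1 in a { c++ } END { print c+0 }' "$tmp/left" "$tmp/right")

printf 'plain=[%s] forced=[%s]\n' "$plain" "$forced"

rm -rf "$tmp"
```

This prints `plain=[] forced=[0]`. If the count feeds another script, `print c+0` is probably the safer form.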

If you want to include extra columns (for example, columns 1, 2, and 3), use a pseudo-multidimensional array:

awk 'FNR==NR { a[$1,$2,$3]; next } ($1,$2,$3) in a { c++ } END { print c }' file1.txt file2.txt

Results with your input:

2
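"Pseudo-multidimensional" means that awk still stores a single string key: the subscripts in a[$1,$2,$3] are joined with the built-in SUBSEP variable. The sketch below feeds one sample line from the question through awk to make that visible. The variable names `part`, `found` and `subsep_demo` are mine.

```shell
#!/bin/sh
subsep_demo=$(printf '"abc" 0 1\n' | awk '{
    a[$1,$2,$3]                      # stored under the key $1 SUBSEP $2 SUBSEP $3
    for (k in a) {
        n = split(k, part, SUBSEP)   # split the joined key back into its fields
        print n, part[1], part[2], part[3]
    }
    # Building the key by hand finds the same array element.
    found = (($1 SUBSEP $2 SUBSEP $3) in a) ? "yes" : "no"
    print "joined key present:", found
}')
printf '%s\n' "$subsep_demo"
```

This prints `3 "abc" 0 1` followed by `joined key present: yes`. A practical consequence is that the field-based key ignores differences in spacing between columns, whereas comparing $0 requires the lines to be identical character for character, trailing blanks included.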
