awk - Wrong formatting when using print in awk

Question

I have a 2-column file that comes from http://snap.stanford.edu/data/cit-HepPh.html. The file I downloaded is cit-HepPh.txt.gz. I deleted everything that is not a number (the first 4 lines of the file), then replaced the tab between the numbers with a single space using:

awk '{print $1,$2}' Cit-HepPh.txt > 1

Then I tried to reverse the elements in the file and write them in another file. I used

awk '{print $2,$1}' 1 > 2

but what I obtain is something like

Instead of something like

why?

I did

head -2 Cit-HepPh.txt | od -a

and I get in return

0000000   9   9   0   7   2   3   3  ht   9   3   0   1   2   5   3  cr
0000020  nl   9   9   0   7   2   3   3  ht   9   5   0   4   3   0   4
0000040  cr  nl
0000042

What does it mean?

score 3 · Accepted Answer

It looks like the file contains other (non-printing) characters.

Try posting the output of

head -2 Cit-HepPh.txt | od -a

head takes the first 2 lines of the input, and od -a prints each character (or its name, if the character is non-printing).
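
As a rough illustration of what to look for, the sketch below builds a two-line sample with a tab separator and DOS line endings, then dumps it the same way. The file name sample.txt is made up for this example; the numbers are the ones from the question. A cr right before each nl in the dump is the sign of DOS line endings.

```shell
# Hypothetical sample file: tab-separated fields, each line ending in CR LF.
printf '9907233\t9301253\r\n9907233\t9504304\r\n' > sample.txt

# od -a names each byte: ht = tab, cr = carriage return, nl = newline.
dump=$(head -2 sample.txt | od -a)
printf '%s\n' "$dump"
```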

Once you have identified the problem, you can remove the offending characters with sed or awk.
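
If the offending character turns out to be a carriage return, tr is another option alongside sed and awk. This is only a sketch under that assumption; sample.txt and clean.txt are placeholder names.

```shell
# Hypothetical sample file with CR LF line endings.
printf '9907233\t9301253\r\n9907233\t9504304\r\n' > sample.txt

# Delete every carriage return byte and write the result to a new file.
tr -d '\r' < sample.txt > clean.txt

# Count the carriage returns that remain: keep only CR bytes, then count them.
cr_left=$(tr -cd '\r' < clean.txt | wc -c | tr -d ' ')
echo "carriage returns left: $cr_left"
```

Note that tr -d '\r' removes every carriage return in the file, not only those at the end of a line, which is fine for a file that contains nothing but numbers.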

Edit

If you have cr nl as line endings (i.e. DOS line endings), you need to remove the cr, either with something like dos2unix or directly in awk:

awk '{sub(/\r$/,"");print $2,$1}'
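
To see what the sub() call changes, here is a small sketch on a made-up file (sample.txt is a placeholder; the numbers come from the question). awk splits on the tab, so without sub() the carriage return stays attached to $2 and ends up in the middle of the swapped line, where a terminal treats it as "go back to the start of the line" and the following text overwrites what was already printed.

```shell
# Hypothetical sample file with CR LF line endings.
printf '9907233\t9301253\r\n9907233\t9504304\r\n' > sample.txt

# Strip the trailing CR from each record, then print the fields swapped.
fixed=$(awk '{sub(/\r$/,""); print $2,$1}' sample.txt)
printf '%s\n' "$fixed"
# 9301253 9907233
# 9504304 9907233
```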

score 0 · Accepted Answer

I can't reproduce this:

$ cat in.txt 
1 2
2 3
4 5
$ awk '{print $1,$2}' <in.txt
1 2
2 3
4 5
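
The sample above has Unix line endings, which would explain why nothing goes wrong. A sketch of a reproduction, assuming the original file has DOS line endings (in_dos.txt is a made-up name): piping the awk output through od -c shows the misplaced \r instead of letting the terminal act on it.

```shell
# Hypothetical input: same shape as in.txt, but the line ends in CR LF.
printf '1 2\r\n' > in_dos.txt

# Swap the fields and dump the result byte by byte.
broken=$(awk '{print $2,$1}' in_dos.txt | od -c)
printf '%s\n' "$broken"
```

The \r appears right after the 2, before the space and the 1, so on a terminal the second half of the line is drawn over the first.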
