bash - How to transpose a list into a table in bash

Question

I would like to transpose a list of items (key/value pairs) into a table format. The solution can be a bash script, awk, sed, or some other method.

Suppose I have a long list, like this:

date and time: 2013-02-21 18:18 PM
file size: 1283483 bytes
key1: value
key2: value

date and time: 2013-02-21 18:19 PM
file size: 1283493 bytes
key2: value

...

I would like to transpose it into a table format, using tabs or some other separator, to look like this:

date and time   file size   key1    key2
2013-02-21 18:18 PM 1283483 bytes   value   value
2013-02-21 18:19 PM 1283493 bytes       value
...

Or like this:

date and time|file size|key1|key2
2013-02-21 18:18 PM|1283483 bytes|value|value
2013-02-21 18:19 PM|1283493 bytes||value
...

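I have looked at solutions such as "An efficient way to transpose a file in Bash", but this seems to be a different case. The awk solution below works partially for me: it keeps outputting all the rows as one long list of columns, whereas I need the columns to be constrained to a unique list.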

awk -F': ' '
{ 
    for (i=1; i<=NF; i++)  {
        a[NR,i] = $i
    }
}
NF>p { p = NF }
END {    
    for(j=1; j<=p; j++) {
        str=a[1,j]
        for(i=2; i<=NR; i++){
            str=str" "a[i,j];
        }
        print str
    }
}' filename

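Update

Thanks to everyone who provided a solution. Some of them look very promising, but I think my versions of the tools may be outdated, and I am getting syntax errors. What I see now is that I did not start with very clear requirements. Kudos to the first person who offered a solution before I had spelled out the full requirements. I had had a long day when I wrote the question, so it was not very clear.

My goal is to come up with a very generic solution for parsing multiple lists of items into column format. I imagine the solution does not need to support more than 255 columns. Column names will not be known ahead of time, so that the solution works for anyone, not just me. The two known things are the separator between key/value pairs (": ") and the separator between lists (an empty line). It would be nice to have variables for those, so that others can configure them and reuse the solution.

From looking at the proposed solutions, I realize that a good approach is to make two passes over the input file. The first pass gathers all the column names, optionally sorts them, and then prints the header. The second pass grabs the column values and prints them.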
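
The two-pass idea described above can be sketched in POSIX awk, which may help where newer GNU awk features trigger syntax errors. This is only a minimal sketch, not one of the posted answers: the function name transpose_kv and the variables kv_sep and out_sep are hypothetical names chosen for illustration, columns are kept in first-seen order rather than sorted, and the input file is assumed to be a regular file that can be read twice.

```shell
#!/bin/sh
# Hypothetical helper (not from the answers): transpose blank-line-separated
# "key: value" lists into one delimited row per list.
#   $1 = input file (read twice), $2 = key/value separator, $3 = output delimiter
transpose_kv() {
    awk -v kv_sep="${2:-: }" -v out_sep="${3:-|}" '
    # Split a line at the first kv_sep only; sets key and val, returns 0 if absent.
    function split_kv(line,    p) {
        p = index(line, kv_sep)
        if (p == 0) return 0
        key = substr(line, 1, p - 1)
        val = substr(line, p + length(kv_sep))
        return 1
    }
    # Print the current record using the column order found in pass 1.
    function emit_row(    k, row) {
        if (!have) return
        row = ""
        for (k = 1; k <= ncols; k++)
            row = row (k > 1 ? out_sep : "") rec[cols[k]]
        print row
        split("", rec)   # portable way to clear an array
        have = 0
    }
    # Pass 1: collect the unique column names in first-seen order.
    NR == FNR {
        if (split_kv($0) && !(key in seen)) { seen[key] = 1; cols[++ncols] = key }
        next
    }
    # Start of pass 2: print the header once.
    FNR == 1 {
        hdr = ""
        for (k = 1; k <= ncols; k++) hdr = hdr (k > 1 ? out_sep : "") cols[k]
        print hdr
    }
    # Pass 2: a blank line closes a record; other lines fill it in.
    /^[ \t]*$/ { emit_row(); next }
    split_kv($0) { rec[key] = val; have = 1 }
    END { emit_row() }
    ' "$1" "$1"
}

# Usage (assumed file name):
#   transpose_kv filename            # "|" delimited
#   transpose_kv filename ': ' '\t'  # tab delimited
```

Passing the same file name twice lets awk make both passes in a single invocation; the NR == FNR test is true only while the first copy is being read. Lines without the separator (such as a literal "...") are ignored.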

score 2 · Accepted Answer

Here's one way using GNU awk. Run it like this:

awk -f script.awk file

Contents of script.awk:

BEGIN {
    # change this to OFS="\t" for tab delimited output
    OFS="|"

    # treat each record as a set of lines
    RS=""
    FS="\n"
}

{
    # keep a count of the records
    ++i

    # loop through each line in the record
    for (j=1;j<=NF;j++) {

        # split each line in two
        split($j,a,": ")

        # just holders for the first two lines in the record
        if (j==1) { date = a[1] }
        if (j==2) { size = a[1] }

        # keep a tally of the unique key names
        if (j>=3) { !x[a[1]] }

        # the data in a multidimensional array:
        # record number . key = value
        b[i][a[1]]=a[2]
    }
}

END {

    # sort the unique keys
    m = asorti(x,y)

    # add the two strings to a numerically indexed array
    c[1] = date
    c[2] = size

    # set a variable to continue from
    f=2

    # loop through the sorted array of unique keys
    for (j=1;j<=m;j++) {

        # build the header line from the file by adding the sorted keys
        r = (r ? r : date OFS size) OFS y[j]

        # continue to add the sorted keys to the numerically indexed array
        c[++f] = y[j]
    }

    # print the header and empty
    print r
    r = ""

    # loop through the records ('i' is the number of records)
    for (j=1;j<=i;j++) {

        # loop through the subrecords ('f' is the number of unique keys)
        for (k=1;k<=f;k++) {

            # build the output line
            r = (r ? r OFS : "") b[j][c[k]]
        }

        # and print and empty it ready for the next record
        print r
        r = ""
    }
}

Here are the contents of the test file, file:

date and time: 2013-02-21 18:18 PM
file size: 1283483 bytes
key1: value1
key2: value2

date and time: 2013-02-21 18:19 PM
file size: 1283493 bytes
key2: value2
key1: value1
key3: value3

date and time: 2013-02-21 18:20 PM
file size: 1283494 bytes
key3: value3
key4: value4

date and time: 2013-02-21 18:21 PM
file size: 1283495 bytes
key5: value5
key6: value6

Results:

date and time|file size|key1|key2|key3|key4|key5|key6
2013-02-21 18:18 PM|1283483 bytes|value1|value2||||
2013-02-21 18:19 PM|1283493 bytes|value1|value2|value3|||
2013-02-21 18:20 PM|1283494 bytes|||value3|value4||
2013-02-21 18:21 PM|1283495 bytes|||||value5|value6

score 1 · Accepted Answer

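This makes no assumptions about the column structure, so it does not try to order the columns; however, all fields are printed in the same order for every record.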

use strict;
use warnings;

my (@db, %f, %fields);
my $counter = 1;
while (<>) {
  my ($field, $value) = (/([^:]*):\s*(.*)\s*$/);
  if (not defined $field) {
    push @db, { %f } if %f;  # skip empty records caused by repeated blank lines
    %f = (); 
  } else {
    $f{$field} = $value;
    $fields{$field} = $counter++ if not defined $fields{$field};
  }
}
push @db, { %f } if %f;  # last record, unless the file ended with a blank line

#my @fields = sort keys %fields; # alphabetical order
my @fields = sort { $fields{$a} <=> $fields{$b} } keys %fields; # first-seen order (numeric compare)

# print header
print join("|", @fields), "\n";

# print rows
for my $row (@db) {
  print join("|", map { defined $row->{$_} ? $row->{$_} : "" } @fields), "\n";
}

score 1 · Accepted Answer

Here is a pure awk solution:

# split lines on ": " and use "|" for output field separator
BEGIN { FS = ": "; i = 0; h = 0; ofs = "|" }

# empty line - increment item count and skip it
/^[ \t]*$/ { i++ ; next }

# normal line - add the item to the object and the header to the header list
# and keep track of first seen order of headers
{
   current[i, $1] = $2
   if (!($1 in headers)) {headers_ordered[h++] = $1}
   headers[$1]
}

END {
   h--

   # print headers
   for (k = 0; k <= h; k++)
   {
      printf "%s", headers_ordered[k]
      if (k != h) {printf "%s", ofs}
   } 
   print "" 

   # print the items for each object
   for (j = 0; j <= i; j++)
   {
      for (k = 0; k <= h; k++)
      {
         printf "%s", current[j, headers_ordered[k]]
         if (k != h) {printf "%s", ofs}
      }
      print ""
   }
}

Example input (note that there is a newline after the last item):

foo: bar
foo2: bar2
foo1: bar

foo: bar3
foo3: bar3
foo2: bar3

Example output:

foo|foo2|foo1|foo3
bar|bar2|bar|
bar3|bar3||bar3

Note: you will probably need to alter this if your data has ": " embedded in it.
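
One possible way to cope with an embedded ": " is to split each line at the first occurrence of the separator only, instead of relying on FS. The sketch below is an illustration rather than part of the answer: the function name kv_first_split and the variable sep are hypothetical, and it prints key and value joined by "|" just to make the split visible.

```shell
#!/bin/sh
# Hypothetical helper: split each input line at the FIRST ": " only,
# so that any later ": " stays inside the value.
kv_first_split() {
    awk -v sep=": " '{
        p = index($0, sep)                  # position of the first separator, 0 if none
        if (p) print substr($0, 1, p - 1) "|" substr($0, p + length(sep))
    }'
}

printf 'note: see chapter 2: intro\n' | kv_first_split
# prints: note|see chapter 2: intro
```

With FS = ": " the same line would be split into three fields, and $2 would contain only "see chapter 2".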

score 0 · Accepted Answer

Using perl

use strict; use warnings;

# read the file paragraph by paragraph
$/ = "\n\n";

print "date and time|file size|key1|key2\n";

# parsing the whole file with the magic diamond operator
while (<>) {
    if (/^date and time:\s+(.*)/m) {
        print "$1|";
    }

    if (/^file size:(.*)/m) {
        print "$1|";
    }

    if (/^key1:(.*)/m) {
        print "$1|";
    }
    else {
        print "|";
    }

    if (/^key2:(.*)/m) {
        print "$1\n";
    }
    else {
        print "\n";
    }
}

Usage

perl script.pl file

Output

date and time|file size|key1|key2
2013-02-21 18:18 PM| 1283483 bytes| value| value
2013-02-21 18:19 PM| 1283493 bytes|| value

score 0 · Accepted Answer

Example:

> ls -aFd * | xargs -L 5 echo | column -t
bras.tcl@      Bras.tpkg/           CctCc.tcl@       Cct.cfg      consider.tcl@
cvsknown.tcl@  docs/                evalCmds.tcl@    export/      exported.tcl@
IBras.tcl@     lastMinuteRule.tcl@  main.tcl@        Makefile     Makefile.am
Makefile.in    makeRule.tcl@        predicates.tcl@  project.cct  sourceDeps.tcl@
tclIndex
