git - Efficiently retrieving the releases that contain a commit

Question

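When I run the following on the command line

git tag --contains {commit}

to get the list of releases that contain a given commit, it takes about 11 to 20 seconds per commit. The target code base has more than 300,000 commits, so retrieving this information for every commit would take a very long time.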
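As a rough back-of-the-envelope check of what that means, here is a small sketch using only the figures quoted above (300,000 commits, 11 to 20 seconds each). The helper name `total_days` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def total_days(commits, seconds_per_commit):
    """Wall-clock days needed when every commit is queried one at a time."""
    return commits * seconds_per_commit / SECONDS_PER_DAY


# Figures taken from the question: 300,000 commits, 11 to 20 s per query.
best_case = total_days(300_000, 11)
worst_case = total_days(300_000, 20)
print(f"{best_case:.1f} to {worst_case:.1f} days")
```

Under those assumptions a per-commit loop would run for more than a month, which is why a single-pass approach is needed.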

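gitk, however, seems to retrieve this data quickly. From what I found while searching, it uses a cache for that purpose.

I have two questions:

How can I interpret that cache format?
Is there a way to get a dump from the git command-line tools that produces the same information?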

score 5 · Accepted Answer

You can get this almost directly from git rev-list.

latest.awk:

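# Input: output of `git rev-list --children --format=%d ...`.
# Header lines look like "commit <sha> <child-sha>...", optionally followed
# by a decoration line such as " (tag: v1.0, origin/master)".
BEGIN { thiscommit = "" }
$1 == "commit" {
    # Flush the previous commit before starting a new one.
    if ( thiscommit != "" )
        print thiscommit, tags[thiscommit]
    thiscommit = $2
    line[$2] = NR
    # Inherit the refs of the child that appeared latest in the output.
    latest = 0
    for ( i = 3 ; i <= NF ; ++i ) if ( line[$i] > latest ) {
        latest = line[$i]
        tags[$2] = tags[$i]
    }
    next
}
# A decoration line replaces whatever was inherited from the children.
$1 != "commit"  { tags[thiscommit] = $0 }
END { if ( thiscommit != "" ) print thiscommit, tags[thiscommit] }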

Sample command:

git rev-list --date-order --children --format=%d --all | awk -f latest.awk

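You can also use --topo-order. Because %d prints every decoration (branches, remote refs, and HEAD as well as tags), you will probably need to weed out the unwanted refs in the $1 != "commit" logic.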
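To illustrate that weeding-out step, here is a minimal Python sketch that keeps only the tags from one decoration line. It assumes the `%d` layout of a leading space, parentheses, and entries separated by ", ", with tags prefixed by "tag: ". The function name `tags_from_decoration` is hypothetical, and tag names that themselves contain ", " are not handled.

```python
from typing import List


def tags_from_decoration(line: str) -> List[str]:
    """Return only the tag names from a `%d` decoration line."""
    text = line.strip()
    # Drop the surrounding parentheses, if present.
    if text.startswith("(") and text.endswith(")"):
        text = text[1:-1]
    tags = []
    for ref in text.split(", "):
        # Branches, remote refs, and "HEAD -> ..." entries are skipped.
        if ref.startswith("tag: "):
            tags.append(ref[len("tag: "):])
    return tags


print(tags_from_decoration(" (HEAD -> master, tag: v1.0, origin/master)"))
```

The same filter could be written inside the awk rule. Python is used here only to keep the example self-contained.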

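Depending on what kind of transitivity you want and how explicit the listing has to be, accumulating the tags might need a dictionary. Here is one that produces an explicit listing of all the refs for every commit.

all.awk: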

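BEGIN {
    thiscommit = ""
}
$1 == "commit" {
    # Flush the previous commit before starting a new one.
    if ( thiscommit != "" )
        print thiscommit, tags[thiscommit]
    thiscommit = $2
    line[$2] = NR
    split("", seen)    # clear the per-commit "already added" set
    # Union the refs of every child, skipping duplicates.
    # Refs are stored space-separated, so "tag: v1.0" is re-split
    # here into the two words "tag:" and "v1.0".
    for ( i = 3 ; i <= NF ; ++i ) {
        nnew = split(tags[$i], new)
        for ( n = 1 ; n <= nnew ; ++n ) {
            if ( !seen[new[n]] ) {
                tags[$2] = tags[$2]" "new[n]
                seen[new[n]] = 1
            }
        }
    }
    next
}
$1 != "commit"  {
    # Decoration line, e.g. " (tag: v1.0, origin/master)":
    # split on ", ", then strip the leading " (" and the trailing ")".
    nnew = split($0, new, ", ")
    new[1] = substr(new[1], 3)
    new[nnew] = substr(new[nnew], 1, length(new[nnew]) - 1)
    for ( n = 1 ; n <= nnew ; ++n )
        tags[thiscommit] = tags[thiscommit]" "new[n]
}
END { if ( thiscommit != "" ) print thiscommit, tags[thiscommit] }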

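all.awk took a few minutes to process the 322K commits of the Linux kernel repository, roughly a thousand commits per second or thereabouts (there are lots of duplicated strings and redundant processing), so if you are really after the complete cross-product you would probably want to rewrite it in C++. That said, I don't think gitk shows that, only the nearest neighbors, right?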
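For readers who want the same accumulation without the duplicated strings, here is a minimal Python sketch. It is not the answer's own code. It assumes the input is the text output of the sample command above, with children listed before their parents and a decoration line only for decorated commits, and it keeps only "tag: " entries. The name `tags_containing` is made up for illustration.

```python
import re
from typing import Dict, Iterable, Set

# Matches each "tag: <name>" entry inside a %d decoration line.
TAG_RE = re.compile(r"tag: ([^,)]+)")


def tags_containing(rev_list_lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each commit to the set of tags whose history contains it."""
    containing: Dict[str, Set[str]] = {}
    current = None
    for line in rev_list_lines:
        if line.startswith("commit "):
            fields = line.split()
            current = fields[1]
            tags: Set[str] = set()
            # Children were seen earlier, so their sets are complete.
            for child in fields[2:]:
                tags |= containing.get(child, set())
            containing[current] = tags
        elif current is not None and line.strip():
            # Decoration line: add the tags that point at this commit.
            containing[current].update(TAG_RE.findall(line))
    return containing


sample = [
    "commit c3",
    " (HEAD -> master, tag: v2.0)",
    "commit c2 c3",
    "commit c1 c2",
    " (tag: v1.0)",
    "commit c0 c1",
]
for sha, tags in tags_containing(sample).items():
    print(sha, sorted(tags))
```

On a real repository the lines could come from running the sample command through `subprocess.run(..., capture_output=True, text=True)` and splitting its stdout into lines. Holding one set per commit is still the full cross-product, so memory use on a large history is an open question for this sketch.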
