git - Can Git really track the movement of a single function from one file to another? If so, how?

Question

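Several times I have come across the claim that if you move a single function from one file to another, Git can track it. For example, this entry says, "Linus says that if you move a function from one file to another, Git will tell you the history of that single function across the move."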

But I know a little about Git's internal design, and I don't see how this is possible. So I'm wondering... is this a correct statement? And if so, how is it possible?

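My understanding is that Git stores the contents of each file as a blob, and each blob has a globally unique ID derived from the SHA hash of its contents and size. Git then represents folders as trees. All filename information belongs to the tree, not the blob, so a file rename, for example, shows up as a change to a tree, not to a blob.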
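The blob/tree split described above can be checked directly. This is a minimal sketch in a throwaway repository; the file names and the committer identity are placeholders, not anything from the original post.

```shell
# Throwaway repository; names and identity below are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'int answer(void) { return 42; }\n' > foo.c
git add foo.c
git commit -q -m "Add foo.c"

git mv foo.c renamed.c
git commit -q -m "Rename foo.c to renamed.c"

# The blob ID is the same before and after the rename...
blob_before=$(git rev-parse HEAD~1:foo.c)
blob_after=$(git rev-parse HEAD:renamed.c)
echo "blob before: $blob_before"
echo "blob after:  $blob_after"

# ...while the root tree, which holds the file name, gets a new ID.
tree_before=$(git rev-parse 'HEAD~1^{tree}')
tree_after=$(git rev-parse 'HEAD^{tree}')
echo "tree before: $tree_before"
echo "tree after:  $tree_after"
```

The two blob lines should print the same ID and the two tree lines different ones, which matches the idea that a rename only touches the tree.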

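So if I have a file called "foo" with 20 functions in it and a file called "bar" with 5 functions in it, and I move one of the functions from foo to bar (leaving 19 and 6, respectively), how can Git detect that I moved that function from one file to the other?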

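As I understand it, this would produce 2 new blobs (one for the modified foo and one for the modified bar). I realize that a diff could be computed to show that the function was moved from one file to the other. But I don't see how the history of the function could become associated with bar instead of foo (not automatically, anyway).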

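If Git actually looked inside individual files and computed a blob per function (which would be crazy/infeasible, because it would have to know how to parse every possible language), then I could see how this might be possible.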

So... is the statement correct or not? And if it is correct, what is my understanding missing?

score 35 · Accepted Answer

This functionality is provided through git blame -C <file>.

The -C option makes Git look for matches between chunks of text added to the file being reviewed and text in other files modified in the same commit. Additional options extend the search: -C -C also looks at other files in the commit that created the file, and -C -C -C looks at other files in any commit.

Try it yourself in a test repo with git blame -C, and you'll see that the block of code you just moved is attributed to the file it originally came from.
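A minimal sketch of such a test repo follows. The file contents, function names, and committer identity are made up for illustration. The moved function is kept reasonably long on purpose, because git blame ignores very small moved chunks by default.

```shell
# Throwaway repository; names and identity below are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

cat > foo.c <<'EOF'
int add_numbers(int first, int second)
{
    int total_of_both = first + second;
    return total_of_both;
}

int multiply_numbers(int first, int second)
{
    int product_of_both = first * second;
    return product_of_both;
}
EOF
cat > bar.c <<'EOF'
int negate_number(int value)
{
    return -value;
}
EOF
git add foo.c bar.c
git commit -q -m "Add foo.c and bar.c"

# Move multiply_numbers (lines 7-11 of foo.c) to the end of bar.c.
{ echo; sed -n '7,11p' foo.c; } >> bar.c
sed -n '1,5p' foo.c > foo.c.tmp && mv foo.c.tmp foo.c
git commit -q -am "Move multiply_numbers from foo.c to bar.c"

# Plain blame attributes the moved lines to the second commit and bar.c.
git blame -s -f bar.c

# With -C, the moved lines are traced back to foo.c in the first commit.
git blame -C -s -f bar.c
```

Here -s drops the author and timestamp columns and -f always prints the originating file name, so the difference between the two runs is easy to see.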

From the git help blame manual page:

The origin of lines is automatically followed across whole-file renames (currently there is no option to turn the rename-following off). To follow lines moved from one file to another, or to follow lines that were copied and pasted from another file, etc., see the -C and -M options.

score 20 · Accepted Answer

As of Git 2.15, git diff now supports detection of moved lines with the --color-moved option. It works for moves across files.

It works, obviously, for colorized terminal output. As far as I can tell, there is no option to indicate moves in plain text patch format, but that makes sense.

For default behavior, try

git diff --color-moved

The option also takes a mode, currently one of no, default, plain, blocks, zebra and dimmed-zebra (spelled dimmed_zebra in older releases; use git help diff to get the latest modes and their descriptions). For example:

git diff --color-moved=zebra
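To see the effect on a cross-file move, here is a small sketch in a throwaway repository. The files and function names are invented for the example, and --color=always is only there so that the colors survive when the output is captured in a variable.

```shell
# Throwaway repository; names and identity below are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

cat > foo.c <<'EOF'
int add_numbers(int first, int second)
{
    int total_of_both = first + second;
    return total_of_both;
}

int multiply_numbers(int first, int second)
{
    int product_of_both = first * second;
    return product_of_both;
}
EOF
printf 'int negate_number(int value)\n{\n    return -value;\n}\n' > bar.c
git add foo.c bar.c
git commit -q -m "Add foo.c and bar.c"

# Move multiply_numbers (lines 7-11 of foo.c) to the end of bar.c.
{ echo; sed -n '7,11p' foo.c; } >> bar.c
sed -n '1,5p' foo.c > foo.c.tmp && mv foo.c.tmp foo.c
git commit -q -am "Move multiply_numbers from foo.c to bar.c"

# Ordinary coloring: the move looks like unrelated deleted and added lines.
plain_output=$(git diff --color=always --color-moved=no HEAD~1 HEAD)

# With move detection, the moved block is painted in its own colors.
moved_output=$(git diff --color=always --color-moved=zebra HEAD~1 HEAD)
printf '%s\n' "$moved_output"
```

Requires Git 2.15 or later. The text of the patch is the same in both runs; only the color escape sequences differ.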

As to how it is done, you can glean some understanding from this email exchange by the author of the functionality.

score 8 · Accepted Answer

A bit of this functionality is in git gui blame (+ filename). It shows an annotation of the lines of a file, each indicating when it was created and when it was last changed. For code moved between files, it shows the commit of the original file as the creation, and the commit where the code was added to the current file as the last change. Try it.

What I really wanted was to give git log a line-number range in addition to a file path, so that it would show the history of that code block. Git 1.8.4 and later provide this as git log -L <start>,<end>:<file> (or git log -L :<funcname>:<file>); earlier versions had no such option. From Linus' statement I, too, would expect such a command to be readily available.
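For reference, a minimal sketch of line-range history with git log -L, assuming Git 1.8.4 or later. The repository, file contents, and commit messages are invented for the example.

```shell
# Throwaway repository; names and identity below are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

cat > foo.c <<'EOF'
int add_numbers(int first, int second)
{
    return first + second;
}

int multiply_numbers(int first, int second)
{
    return first * second;
}
EOF
git add foo.c
git commit -q -m "Add foo.c"

# Touch only add_numbers (line 3).
sed '3s/.*/    return second + first;/' foo.c > foo.c.tmp && mv foo.c.tmp foo.c
git commit -q -am "Reorder operands in add_numbers"

# Touch only multiply_numbers (line 8).
sed '8s/.*/    return second * first;/' foo.c > foo.c.tmp && mv foo.c.tmp foo.c
git commit -q -am "Reorder operands in multiply_numbers"

# History of one function, selected by name...
git log --format='commit %h %s' -L :add_numbers:foo.c

# ...or of an explicit line range.
git log --format='commit %h %s' -L 1,4:foo.c
```

Both commands list only the commits that touched the selected lines, each with the relevant part of the patch, so the commit that changed multiply_numbers does not appear.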

score 5 · Accepted Answer

Git doesn't actually track renames at all. A rename is just a delete and an add, that's all. Any tool that shows renames reconstructs them from this history information.

As such, tracking function renames is a simple matter of analyzing the diffs of all files in each commit after the fact. There's nothing particularly impossible about it: the existing rename tracking already handles 'fuzzy' renames, in which the file is changed as well as renamed, and this requires looking at the contents of the files. It would be a simple extension to look for function renames as well.
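The after-the-fact reconstruction of file renames can be seen in a small experiment. This is only a sketch; the file names and contents are placeholders.

```shell
# Throwaway repository; names and identity below are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Ten numbered lines as sample content.
for i in 1 2 3 4 5 6 7 8 9 10; do echo "line number $i"; done > old_name.txt
git add old_name.txt
git commit -q -m "Add old_name.txt"

# Rename the file and edit one line in the same commit: a 'fuzzy' rename.
git mv old_name.txt new_name.txt
sed '10s/.*/line number ten, edited/' new_name.txt > tmp.txt && mv tmp.txt new_name.txt
git commit -q -am "Rename and edit"

# With rename detection off, the commit is just a delete plus an add.
git diff --no-renames --name-status HEAD~1 HEAD

# With detection on, the same pair is reported as a rename with a similarity score.
git diff -M --name-status HEAD~1 HEAD
```

Both commands read the same two commits; the rename status in the second output is computed by comparing file contents at diff time, not read from anything stored in the commit.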

I don't know if the base git tools actually do this however - they try to be language neutral, and function identification is very much not language neutral.

score 2 · Accepted Answer

There's git diff that will show you that certain lines disappeared from foo and reappeared in bar. If there are no other changes in these files in the same commit, the change will be easy to spot.

An intelligent Git client would be able to show you how lines moved from one file to another. A language-aware IDE would be able to correlate this change with a particular function.

A very similar thing happens when a file gets renamed. It just disappears under one name and reappears under another, but any reasonable tool is able to notice this and represent it as a rename.
