I'm developing a program which is able to find the difference in files between to folders for instance. I've made a method which traverses the folder structure of a given folder, and builds a tree for each subfolder. Each node contains a list of files, which is the files in that folder. Each node has an amount of children, which corresponds to folders in that folder.
Now the problem is to find the files present in one tree, but not in the other. I have a method: "private List Diff(Node index1, Node index2)", which should do this. But the problem is the way that I'm comparing the trees. To compare two trees takes a huge amount of times - when each of the input nodes contains about 70,000 files, the Diff method takes about 3-5 minutes to complete.
I'm currently doing it this way:
private List<MyFile> Diff(Node index1, Node index2)
{
List<MyFile> DifferentFiles = new List<MyFile>();
List<MyFile> Index1Files = FindFiles(index1);
List<MyFile> Index2Files = FindFiles(index2);
List<MyFile> JoinedList = new List<MyFile>();
JoinedList.AddRange(Index1Files);
JoinedList.AddRange(Index2Files);
List<MyFile> JoinedListCopy = new List<MyFile>();
JoinedListCopy.AddRange(JoinedList);
List<string> ChecksumList = new List<string>();
foreach (MyFile m in JoinedList)
{
if (ChecksumList.Contains(m.Checksum))
{
JoinedListCopy.RemoveAll(x => x.Checksum == m.Checksum);
}
else
{
ChecksumList.Add(m.Checksum);
}
}
return JoinedListCopy;
}
And the Node class looks like this:
class Node
{
private string _Dir;
private Node _Parent;
private List<Node> _Children;
private List<MyFile> _Files;
}