graph - How does representing all information in Nodes vs Attributes affect storage and computation?

Question

When using a graph database (Neo4j in my case), the same information can be represented in many ways: either make each entity a node and connect the entities through relationships, or simply add the entities to the property list of a node.

Following are two different representations of the same data:

1. All info represented as nodes
2. Info represented as properties of specific nodes

Overall, which mechanism is suitable under which conditions?

My use case involves traversing the database from different nodes up to a depth of 4 and examining the information through connected nodes or properties (depending on which approach is used). One query of interest may be, "Who are the friends of John who went to Stanford?"

What is the difference in terms of storage and computation?

score 1 · Accepted Answer

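Usually properties are loaded lazily and are more expensive to hold in cache, especially strings. Nodes and relationships are most effective for traversal, particularly because the relationship type is stored together with the relationship record, so using it during a traversal does not trigger a property load.

Also, a balanced graph (that is, one without many dense nodes that have more than 10,000 relationships) is the most effective to traverse.

I would try to model most of the repeatedly occurring properties as nodes connected to the entities, and then use the graph itself to index on these values. That way there is no need to filter on property values or to index properties with expensive index lookups.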
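To make the contrast concrete, here is a minimal sketch of the query from the question, "Who are the friends of John who went to Stanford?", against both models. It is a toy in-memory stand-in written with plain Python dictionaries, not the Neo4j API, and it does not reproduce Neo4j's storage costs; it only shows which data each query has to touch. The relationship names FRIEND_OF and ATTENDED, the people Mary, Raj and Lena, and the school MIT are made up for illustration.

```python
# Toy in-memory stand-in for a graph store; this is not the Neo4j API.

# Shared FRIEND_OF relationships (hypothetical sample data).
friend_of = {"John": {"Mary", "Raj", "Lena"}}

# Model 1: the school is its own node, linked by ATTENDED relationships.
attended = {"Mary": {"Stanford"}, "Raj": {"MIT"}, "Lena": {"Stanford"}}


def friends_at_school_nodes(person, school):
    """Follow FRIEND_OF, then ATTENDED; no property values are read."""
    return sorted(
        friend
        for friend in friend_of.get(person, set())
        if school in attended.get(friend, set())
    )


# Model 2: the school is a string property on each person node.
person_props = {
    "Mary": {"school": "Stanford"},
    "Raj": {"school": "MIT"},
    "Lena": {"school": "Stanford"},
}


def friends_at_school_props(person, school):
    """Follow FRIEND_OF, then load and compare a property on every friend."""
    return sorted(
        friend
        for friend in friend_of.get(person, set())
        if person_props.get(friend, {}).get("school") == school
    )


print(friends_at_school_nodes("John", "Stanford"))  # ['Lena', 'Mary']
print(friends_at_school_props("John", "Stanford"))  # ['Lena', 'Mary']
```

Both functions return the same friends. The difference is that the second one has to read a string property for every friend it visits, which is the lazy property load described above.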

score 0 · Accepted Answer

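The first one is much better, since you are querying on an entity such as Stanford and that entity is related to many person nodes. In my opinion, modeling as nodes is more intuitive and easier to query. "Find all persons who went to Stanford" is not very easy to do in your second model, because you have no place to start traversing from. I would use attributes mainly to describe the node or entity, for example: who are the friends of John who went to Stanford in 2010? In this case, the year attribute is only used to trim the results. It depends on your use case. If the year is really important and you run many queries on it, or use it to represent a timeline, you could even model the year as a node connected to Stanford.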
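The "place to start traversing from" point can be sketched as follows. Again this is a toy in-memory illustration in plain Python, not Neo4j code; the sample people, the years, and the choice to store the year on the ATTENDED relationship are assumptions made for the example.

```python
# Toy in-memory stand-in; names and data are made up for illustration.

# Model 1: Stanford is a node, so a query can start there.
# Each ATTENDED relationship carries a year (assumed placement).
attended_by = {
    "Stanford": [("Mary", 2010), ("Lena", 2012)],
    "MIT": [("Raj", 2010)],
}
friend_of = {"John": {"Mary", "Raj", "Lena"}}


def people_at(school, year=None):
    """Start at the school node and follow its ATTENDED relationships.

    The optional year only trims the result; it is not a starting point.
    """
    return sorted(
        person
        for person, attended_year in attended_by.get(school, [])
        if year is None or attended_year == year
    )


def friends_at(person, school, year=None):
    """Intersect the school's attendees with the person's friends."""
    return sorted(set(people_at(school, year)) & friend_of.get(person, set()))


# Model 2: the school is a property on each person. Without a property
# index there is no node to start from, so every person is examined.
person_props = {
    "Mary": {"school": "Stanford", "year": 2010},
    "Lena": {"school": "Stanford", "year": 2012},
    "Raj": {"school": "MIT", "year": 2010},
}


def people_at_scan(school):
    """Scan all person nodes and compare the school property."""
    return sorted(
        name
        for name, props in person_props.items()
        if props.get("school") == school
    )


print(people_at("Stanford"))                 # ['Lena', 'Mary']
print(friends_at("John", "Stanford", 2010))  # ['Mary']
print(people_at_scan("Stanford"))            # ['Lena', 'Mary']
```

In the first model the work is proportional to the number of relationships on the Stanford node, while the scan in the second model grows with the total number of people unless the property is indexed.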
