database - Titan index updating takes too long

Question

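Even on an empty database, creating an index in Titan 1.0 takes several minutes. The duration appears to be exactly the same every time, which suggests a fixed, unnecessary delay rather than real work.

My question is this: how can I shorten or eliminate the time Titan takes to reindex? Conceptually, no work is being done, so the time should be minimal, certainly not 4 minutes.

(N.B. I have previously been pointed to a solution that simply makes Titan wait out the full delay without timing out. That is the wrong solution. I want to eliminate the delay entirely.)

Here is the code I use to set up the database from scratch: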

graph = ... a local cassandra instance ...
graph.tx().rollback()

// 1. Check if the index already exists
mgmt = graph.openManagement()
i = mgmt.getGraphIndex('byIdent')
if(! i) {
  // 1a. If the index does not exist, add it
  idKey = mgmt.getPropertyKey('ident')
  idKey = idKey ? idKey : mgmt.makePropertyKey('ident').dataType(String.class).make()
  mgmt.buildIndex('byIdent', Vertex.class).addKey(idKey).buildCompositeIndex()
  mgmt.commit()
  graph.tx().commit()

  mgmt  = graph.openManagement()
  idKey = mgmt.getPropertyKey('ident')
  idx   = mgmt.getGraphIndex('byIdent')
  // 1b. Wait for index availability
  if ( idx.getIndexStatus(idKey).equals(SchemaStatus.INSTALLED) ) {
    mgmt.awaitGraphIndexStatus(graph, 'byIdent').status(SchemaStatus.REGISTERED).call()
  }
  // 1c. Now reindex, even though the DB is usually empty.
  mgmt.updateIndex(mgmt.getGraphIndex('byIdent'), SchemaAction.REINDEX).get()
  mgmt.commit()
  mgmt.awaitGraphIndexStatus(graph, 'byIdent').status(SchemaStatus.ENABLED).call()
} else { mgmt.commit() }

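The updateIndex...REINDEX call appears to block until it times out. Is this a known issue, or one that will not be fixed? Am I doing something wrong?

EDIT: As discussed in the comments, disabling the REINDEX is not actually a fix, because the index never appears to become active. I now see: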

WARN  com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx  - Query requires iterating over all vertices [(myindexedkey = somevalue)]. For better performance, use indexes

score 3 · Accepted Answer

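The delay is entirely unnecessary and is caused by a misuse of Titan (although the pattern does appear in Chapter 28 of the Titan 1.0.0 documentation).

Do not block inside a transaction! Commit the management transaction before waiting for the index status to change.

Instead of: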

  mgmt  = graph.openManagement()
  idKey = mgmt.getPropertyKey('ident')
  idx   = mgmt.getGraphIndex('byIdent')
  // 1b. Wait for index availability
  if ( idx.getIndexStatus(idKey).equals(SchemaStatus.INSTALLED) ) {
    mgmt.awaitGraphIndexStatus(graph, 'byIdent').status(SchemaStatus.REGISTERED).call()
  }

Consider:

  mgmt  = graph.openManagement()
  idKey = mgmt.getPropertyKey('ident')
  idx   = mgmt.getGraphIndex('byIdent')
  // Wait for index availability
  if ( idx.getIndexStatus(idKey).equals(SchemaStatus.INSTALLED) ) {
    mgmt.commit()
    mgmt.awaitGraphIndexStatus(graph, 'byIdent').status(SchemaStatus.REGISTERED).call()
  } else { mgmt.commit() }

Use ENABLE_INDEX

Not: mgmt.updateIndex(mgmt.getGraphIndex('byIdent'), SchemaAction.REINDEX).get()

Rather: mgmt.updateIndex(mgmt.getGraphIndex('byIdent'), SchemaAction.ENABLE_INDEX).get()
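
Putting the two changes together, the setup script from the question might look like the sketch below. This is a consolidation for illustration and has not been run against a live cluster. It assumes `graph` is an already opened TitanGraph, as in the question. It calls `awaitGraphIndexStatus` through the `ManagementSystem` class rather than through the committed `mgmt` instance, and it opens a fresh management transaction before `updateIndex`. Both of those are choices made here, not steps the answer spells out. Since ENABLE_INDEX does not index existing data, the sketch only fits the case described in the question, where the database is empty when the index is created.

```groovy
import com.thinkaurelius.titan.core.schema.SchemaAction
import com.thinkaurelius.titan.core.schema.SchemaStatus
import com.thinkaurelius.titan.graphdb.database.management.ManagementSystem
import org.apache.tinkerpop.gremlin.structure.Vertex

// Assumption: `graph` is an open TitanGraph (e.g. backed by a local Cassandra instance).
graph.tx().rollback()

mgmt = graph.openManagement()
if (mgmt.getGraphIndex('byIdent')) {
  // The index is already defined; nothing to do.
  mgmt.commit()
} else {
  // Define the property key (if missing) and the composite index, then commit.
  idKey = mgmt.getPropertyKey('ident') ?: mgmt.makePropertyKey('ident').dataType(String.class).make()
  mgmt.buildIndex('byIdent', Vertex.class).addKey(idKey).buildCompositeIndex()
  mgmt.commit()

  // Read the current status, then commit BEFORE blocking on the status change.
  mgmt = graph.openManagement()
  status = mgmt.getGraphIndex('byIdent').getIndexStatus(mgmt.getPropertyKey('ident'))
  mgmt.commit()
  if (status == SchemaStatus.INSTALLED) {
    ManagementSystem.awaitGraphIndexStatus(graph, 'byIdent').status(SchemaStatus.REGISTERED).call()
  }

  // Enable the index directly instead of running a REINDEX job.
  mgmt = graph.openManagement()
  mgmt.updateIndex(mgmt.getGraphIndex('byIdent'), SchemaAction.ENABLE_INDEX).get()
  mgmt.commit()

  // As in the question, wait until the index reports ENABLED before using it.
  ManagementSystem.awaitGraphIndexStatus(graph, 'byIdent').status(SchemaStatus.ENABLED).call()
}
```

If vertices with an 'ident' property already exist when the index is created, a REINDEX is presumably still needed so that they are covered by the index.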
