lucene - Calling Commit on a Lucene index that is currently being merged

Question

My question concerns Lucene .NET 2.9.2.

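Suppose I update an index using an IndexWriter, and the scheduler starts merging segments in the background. What happens if I call Commit before the merge completes? Will the thread that called Commit block and wait for the merge to finish, or are the two threads independent?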

The answer is very important for my search implementation, because I rely on the FieldCache for performance reasons, and if Commit does not wait for the merge to complete, I might get wrong DocIds...

Update:

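What I'm trying to do is map DocIds to application IDs, so that I don't have to fetch the stored value of the application ID when using the IndexSearcher Search method. To do that, I build the mapping during indexing and save it to a binary file. At search time, I load that file into an array (in memory...). So the version of the file has to match the IndexReader (I hope that's clear...).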

Example: (indexing process code)

IndexWriter writer = //initialize writer

//modify index using the writer add\delete\update doc methods...

//get updated reader to the index
IndexReader r1 = writer.GetReader();

//read all values for all documents for specific field name.

long[] ids = FieldCache_Fields.DEFAULT.GetLongs(r1, "ID");

//serialize the array to a file (code not provided)

Dictionary<string,string> metaData = new Dictionary<string,string>();
metaData.Add("FileName", /*full path to the serialized file*/);
writer.Commit(metaData);

(Searcher process code)

IndexReader r2 = //IndexReader.Open...
IDictionary<string,string> metaData = r2.GetCommitUserData();

string fullPathToFile = metaData["FileName"];  //get the file name that was serialized

//load the array from the file (=deserialize file)
long[] ids = //load from file

//now I can convert internal DocId to my Application Id, and save time instead of fetching data from the stored field (which takes more time...)

Basically, my question is: is it possible for the DocIds of the two readers, r1 and r2, not to match, assuming no other changes were made to the index?

score 1 · Accepted Answer

Commit is not blocked by a background merge.

However, I don't understand your FieldCache concern: IndexReaders are immutable, so a FieldCache instance never becomes invalid.
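For the serialization step that the question leaves out ("code not provided"), something like the following might work. It is only a minimal sketch: the IdMapFile class, its Save and Load methods, and the length-prefixed file layout are all hypothetical and are not part of Lucene.NET. It uses only System.IO.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of Lucene.NET.
// Assumed file layout: an Int32 element count followed by that many Int64 values.
public static class IdMapFile
{
    // Writes the DocId -> application ID array to a binary file.
    public static void Save(string path, long[] ids)
    {
        using (FileStream stream = File.Create(path))
        using (BinaryWriter writer = new BinaryWriter(stream))
        {
            writer.Write(ids.Length);
            foreach (long id in ids)
            {
                writer.Write(id);
            }
        }
    }

    // Reads the array back; the array index is the DocId.
    public static long[] Load(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        using (BinaryReader reader = new BinaryReader(stream))
        {
            int count = reader.ReadInt32();
            long[] ids = new long[count];
            for (int i = 0; i < count; i++)
            {
                ids[i] = reader.ReadInt64();
            }
            return ids;
        }
    }
}
```

On the indexing side you would call IdMapFile.Save(path, ids) just before writer.Commit(metaData), and on the searcher side IdMapFile.Load(fullPathToFile); after that, ids[docId] yields the application ID. The array is only meaningful for the reader whose DocIds it was built from.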
