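I have a Python script that simply indexes Unicode sentences into a Lucene index. It works fine in my trial runs with 100 and with 1,000 sentences. But when I try to index all 200,000 sentences, I get a merge error at the 4,514th sentence. What is the problem, and how can I fix it?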
The error:
Exception in thread "Thread-4543" org.apache.lucene.index.MergePolicy$MergeException: java.io.FileNotFoundException: /home/alvas/europarl/index/_70g.tii (Too many open files)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
Caused by: java.io.FileNotFoundException: /home/alvas/europarl/index/_70g.tii (Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:593)
at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:435)
at org.apache.lucene.index.TermInfosWriter.initialize(TermInfosWriter.java:91)
at org.apache.lucene.index.TermInfosWriter.<init>(TermInfosWriter.java:83)
at org.apache.lucene.index.TermInfosWriter.<init>(TermInfosWriter.java:77)
at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:381)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:134)
at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3109)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)
Traceback (most recent call last):
  File "indexer.py", line 183, in <module>
    incrementalIndexing(sfile,tfile,indexDir)
  File "indexer.py", line 141, in incrementalIndexing
    writer.optimize(); writer.close()
lucene.JavaError: java.io.IOException: background merge hit exception: _70e:c4513 _70f:c1 into _70g [optimize]
Java stacktrace:
java.io.IOException: background merge hit exception: _70e:c4513 _70f:c1 into _70g [optimize]
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1749)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1689)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1669)
Caused by: java.io.FileNotFoundException: /home/alvas/europarl/index/_70g.tii (Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:593)
at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:435)
at org.apache.lucene.index.TermInfosWriter.initialize(TermInfosWriter.java:91)
at org.apache.lucene.index.TermInfosWriter.<init>(TermInfosWriter.java:83)
at org.apache.lucene.index.TermInfosWriter.<init>(TermInfosWriter.java:77)
at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:381)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:134)
at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3109)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)
My code: http://pastebin.com/Ep133W5f
Sample input files: http://pastebin.com/r5qE4qpt, http://pastebin.com/wxCU277x
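
Since the root cause reported in the trace is "Too many open files", one thing worth checking is how many file descriptors the process is allowed to open and how many it currently holds. The following is only a diagnostic sketch, assuming a Unix-like system where Python's standard `resource` module is available. The helper names `open_file_limits` and `count_open_fds` are hypothetical and are not part of indexer.py, and the `/proc/self/fd` lookup is Linux-specific.

```python
import os
import resource


def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_open_fds():
    """Approximate number of descriptors currently open in this process.

    Relies on /proc/self/fd, which exists on Linux; returns None elsewhere.
    The count may include the descriptor used to list the directory itself.
    """
    fd_dir = "/proc/self/fd"
    if not os.path.isdir(fd_dir):
        return None
    return len(os.listdir(fd_dir))


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("soft limit:", soft, "hard limit:", hard)
    print("currently open:", count_open_fds())
```

Calling these from inside the indexing loop (for example, every few hundred sentences) would show whether the open-descriptor count keeps climbing toward the soft limit before the merge fails. If it does, the likely suspects are either index segment files accumulating or handles in the script that are opened repeatedly and never closed.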