I am currently trying to index data from PDF files, but Solr returns the following response:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
</lst>
<lst name="initArgs">
<lst name="defaults">
<str name="config">data-config.xml</str>
</lst>
</lst>
<str name="command">full-import</str>
<str name="status">idle</str>
<str name="importResponse"/>
<lst name="statusMessages">
<str name="Time Elapsed">0:0:1.236</str>
<str name="Total Requests made to DataSource">0</str>
<str name="Total Rows Fetched">1</str>
<str name="Total Documents Processed">0</str>
<str name="Total Documents Skipped">0</str>
<str name="Full Dump Started">2012-05-11 15:45:01</str>
<str name="">Indexing failed. Rolled back all changes.</str>
<str name="Rolledback">2012-05-11 15:45:01</str>
</lst>
<str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
</response>
The log file shows this:
org.apache.solr.common.SolrException log
SEVERE: Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NoClassDefFoundError: org/apache/tika/parser/AutoDetectParser
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:264)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:375)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:445)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:426)
Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NoClassDefFoundError: org/apache/tika/parser/AutoDetectParser
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:621)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:327)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:225)
... 3 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NoClassDefFoundError: org/apache/tika/parser/AutoDetectParser
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:759)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:619)
... 5 more
Caused by: java.lang.NoClassDefFoundError: org/apache/tika/parser/AutoDetectParser
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Unknown Source)
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:388)
at org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:1100)
at org.apache.solr.handler.dataimport.DocBuilder.getEntityProcessor(DocBuilder.java:912)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:635)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:709)
... 6 more
Caused by: java.lang.ClassNotFoundException: org.apache.tika.parser.AutoDetectParser
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.net.FactoryURLClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
... 13 more
My configuration files look like this:
data-config.xml:
<?xml version="1.0" encoding="utf-8"?>
<dataConfig>
<dataSource type="BinFileDataSource" name="binary" />
<document>
<entity name="f" dataSource="binary" rootEntity="false" processor="FileListEntityProcessor" baseDir="C:\solr\solr\docu" fileName=".*pdf" recursive="true">
<entity name="tika" processor="TikaEntityProcessor" url="${f.fileAbsolutePath}" format="text">
<field column="id" name="id" meta="true" />
<field column="fake_id" name="fake_id" />
<field column="model" name="model" meta="true" />
<field column="text" name="biog" />
</entity>
</entity>
</document>
</dataConfig>
schema.xml:
<fields>
<field name="id" type="string" indexed="true" stored="true" />
<field name="fake_id" type="string" indexed="true" stored="true" />
<field name="model" type="text_en" indexed="true" stored="true" />
<field name="firstname" type="text_en" indexed="true" stored="true"/>
<field name="lastname" type="text_en" indexed="true" stored="true"/>
<field name="title" type="text_en" indexed="true" stored="true"/>
<field name="biog" type="text_en" indexed="true" stored="true"/>
</fields>
<uniqueKey>fake_id</uniqueKey>
<defaultSearchField>biog</defaultSearchField>
Finally, the Tika jars I have are:
tika-core-1.0.jar and tika-parsers-1.0.jar
What is going wrong? Thanks.