lucene - Saving RDBMS table data through Lucene to a text file on the hard disk

Question

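I want to use Lucene to save the result of an RDBMS SQL query (3.2 million records) to a text file on disk and then search it. I saw an example of how to integrate a RAMDirectory into an FSDirectory in the question "How to integrate RAMDirectory into FSDirectory in Lucene".

I have this code, which works for me: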

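    import java.io.File;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TopScoreDocCollector;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public class lucetest {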
        public static void main(String args[]) {
            lucetest lucetestObj = new lucetest();
            lucetestObj.main1(lucetestObj);
        }

        public void main1(lucetest lucetestObj) {
            final File INDEX_DIR = new File(
                    "C:\\Documents and Settings\\44444\\workspace\\lucenbase\\bin\\org\\lucenesample\\index");

            try {
                Connection conn;
                Class.forName("com.teradata.jdbc.TeraDriver").newInstance();
                conn = DriverManager.getConnection(
                        "jdbc:teradata://x.x.x.x/CHARSET=UTF16", "aaa", "bbb");
                StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);

                // Directory index = new RAMDirectory(); // To use RAM space
                Directory index = FSDirectory.open(INDEX_DIR); // To use hard disk; this will not consume RAM

                IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35,
                        analyzer);
                IndexWriter writer = new IndexWriter(index, config);

                // IndexWriter writer = new IndexWriter(INDEX_DIR, analyzer, true);
                System.out.println("Indexing to directory '" + INDEX_DIR + "'...");

                lucetestObj.indexDocs(writer, conn);
                writer.optimize();
                writer.close();
                System.out.println("pepsi");
                lucetestObj.searchDocs(index, analyzer, "india");
                try {
                    conn.close();
                } catch (SQLException e2) {
                    // TODO Auto-generated catch block
                    e2.printStackTrace();
                }
            } catch (Exception e) {
                e.printStackTrace();

            } finally {

            }

        }

        void indexDocs(IndexWriter writer, Connection conn) throws Exception {
            // CFMASTERID is selected because it is read below as the "id" field.
            // Note the trailing space after the table name, before the SAMPLE clause.
            String queryy = "SELECT CFMASTERID, CFMASTERNAME, ULTIMATEPARENTID, "
                    + "ULTIMATEPARENT, LONG_NAMEE FROM XCUST_SRCH_SRCH "
                    + "sample 100000;";
            Statement stmt = conn.createStatement();
            ResultSet rs = stmt.executeQuery(queryy);
            while (rs.next()) {
                Document d = new Document();
                d.add(new Field("id", rs.getString("CFMASTERID"), Field.Store.YES,
                        Field.Index.NO));
                d.add(new Field("name", rs.getString("CFMASTERNAME"),
                        Field.Store.YES, Field.Index.ANALYZED));
                d.add(new Field("color", rs.getString("LONG_NAMEE"),
                        Field.Store.YES, Field.Index.ANALYZED));
                writer.addDocument(d);
            }
            // Release the JDBC resources once every row has been indexed.
            rs.close();
            stmt.close();
        }

        void searchDocs(Directory index, StandardAnalyzer analyzer,
                String searchstring) throws Exception {

            String querystr = searchstring.length() > 0 ? searchstring : "lucene";
            Query q = new QueryParser(Version.LUCENE_35, "name", analyzer)
                    .parse(querystr);

            int hitsPerPage = 10;
            IndexReader reader = IndexReader.open(index);
            IndexSearcher searcher = new IndexSearcher(reader);
            TopScoreDocCollector collector = TopScoreDocCollector.create(
                    hitsPerPage, true);
            searcher.search(q, collector);
            ScoreDoc[] hits = collector.topDocs().scoreDocs;
            System.out.println("Found " + hits.length + " hits.");
            for (int i = 0; i < hits.length; ++i) {
                int docId = hits[i].doc;
                Document d = searcher.doc(docId);
                System.out.println((i + 1) + ".CFMASTERNAME " + d.get("name")
                        + " ****LONG_NAMEE**" + d.get("color") + "****ID******"
                        + d.get("id"));
            }

            searcher.close();
            // Closing the searcher does not close a reader that was passed in.
            reader.close();
        }
    }

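I can't work out how to change this code so that the SQL result table is saved to the hard disk at the specified path instead of in a RAMDirectory. My requirement is that this table data, stored on disk through Lucene, returns search results very quickly. That is why I am storing the data on disk in a Lucene index.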

score 1 · Accepted Answer

    Directory index = FSDirectory.open(INDEX_DIR);

You mention saving the SQL result to a text file, but that is unnecessary overhead. As you iterate through the ResultSet, add the rows directly to the Lucene index.

As an aside, not that it matters much, but naming your local var (final or otherwise) in all caps is against the convention. Use camelCase. All caps is only for class-level constants (static final members of a class).
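
To illustrate the naming point, here is a minimal sketch. The class name `NamingExample`, the method `resolveIndexDir`, and the path values are hypothetical and are not taken from the question's code.

```java
import java.io.File;

public class NamingExample {
    // Class-level constant (static final): UPPER_SNAKE_CASE is conventional.
    // The value "index" is a placeholder chosen for illustration.
    static final String DEFAULT_INDEX_PATH = "index";

    // Parameters and local variables use camelCase, even when declared final.
    static File resolveIndexDir(String basePath) {
        final File indexDir = new File(basePath, DEFAULT_INDEX_PATH);
        return indexDir;
    }

    public static void main(String[] args) {
        // "workspace" is a hypothetical base directory.
        System.out.println(resolveIndexDir("workspace").getPath());
    }
}
```

Applied to the question's code, the local `INDEX_DIR` inside `main1` would either become `indexDir` or move to a `static final` field of the class.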
