1. The lifecycle of IndexReader and IndexWriter
1) Opening and closing a reader or a writer is expensive.
2) This is especially true for IndexReader, whose open and close operations take a long time.
3) So by convention, the application keeps a single IndexReader instance (a singleton).
package edu.xmu.lucene.Lucene_ModuleOne;

import java.io.File;
import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

/**
 * Hello world!
 *
 */
public class App {
    private Directory dir = null;
    private static IndexReader reader = null;

    public App() {
        try {
            dir = FSDirectory.open(new File("E:/LuceneIndex"));
            reader = IndexReader.open(dir);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public IndexSearcher getSearcher() {
        IndexSearcher searcher = new IndexSearcher(reader);
        return searcher;
    }

    /**
     * Search
     *
     */
    public void search() {
        IndexSearcher searcher = getSearcher();
        TermQuery query = new TermQuery(new Term("name", "Davy"));
        TopDocs topDocs;
        try {
            topDocs = searcher.search(query, 10);
            for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
                Document doc = searcher.doc(scoreDoc.doc);
                String score = doc.get("score");
                String date = doc.get("date");
                float boost = doc.getBoost();
                System.out.println("Score = " + score + ", Date = " + date
                        + ", Boost = " + boost);
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                // We only have to close searcher and don't have to close
                // reader.
                searcher.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
2. Other questions:
1) Only one reader is created for the whole application.
2) Will that reader detect and reflect changes whenever an IndexWriter modifies the index files?
--> No. Update and delete operations do not affect a reader that was opened before those operations were executed.
package edu.xmu.lucene.Lucene_ModuleOne;

import org.junit.Test;

/**
 * Unit test for simple App.
 */
public class AppTest {
    @Test
    public void testSearch() {
        App app = new App();
        for (int i = 0; i < 5; i++) {
            app.search();
        }
    }
}
Comments: If we run the test case above, the result is the same no matter how many times the search is executed, because only one reader is created during the whole process. No additional index files are created either.
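The point-in-time behaviour described above can be modelled in plain Java. This is only a sketch of the idea, not Lucene code: ToyIndex, ToyWriter and ToyReader are hypothetical names, and the assumption is simply that a reader copies the committed state once, when it is opened.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of point-in-time readers. All names are hypothetical;
// nothing here is a Lucene class.
class SnapshotDemo {

    /** Stand-in for the index directory: holds only committed documents. */
    static class ToyIndex {
        private List<String> committed = new ArrayList<String>();

        synchronized List<String> copyOfCommitted() {
            return new ArrayList<String>(committed);
        }

        synchronized void publish(List<String> docs) {
            committed = new ArrayList<String>(docs);
        }
    }

    /** Buffers changes and makes them visible only on commit(). */
    static class ToyWriter {
        private final ToyIndex index;
        private final List<String> pending;

        ToyWriter(ToyIndex index) {
            this.index = index;
            this.pending = index.copyOfCommitted();
        }

        void addDocument(String doc) {
            pending.add(doc);
        }

        void commit() {
            index.publish(pending);
        }
    }

    /** Captures the committed state once, at open time. */
    static class ToyReader {
        private final List<String> docs;

        ToyReader(ToyIndex index) {
            this.docs = index.copyOfCommitted();
        }

        int numDocs() {
            return docs.size();
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        ToyWriter writer = new ToyWriter(index);
        writer.addDocument("Davy");
        writer.commit();

        ToyReader oldReader = new ToyReader(index); // opened before the next commit
        writer.addDocument("Jones");
        writer.commit();
        ToyReader newReader = new ToyReader(index); // opened after the commit

        System.out.println("old reader: " + oldReader.numDocs());
        System.out.println("new reader: " + newReader.numDocs());
    }
}
```

In this model the old reader keeps reporting the document count it saw when it was opened, while a reader opened after the second commit sees both documents. That is the same effect the test case above shows with the single shared IndexReader.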
3) How can we make the reader detect index changes during its whole lifecycle?
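public IndexSearcher getSearcher() {
    try {
        if (reader == null) {
            reader = IndexReader.open(dir);
        } else {
            // Call openIfChanged only once: every non-null result is a
            // newly opened reader, so a second call would open another one.
            IndexReader newReader = IndexReader.openIfChanged(reader);
            if (newReader != null) {
                // The previous reader is still open at this point; it
                // should be closed once no search is using it any more.
                reader = newReader;
            }
        }
    } catch (CorruptIndexException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    IndexSearcher searcher = new IndexSearcher(reader);
    return searcher;
}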
Comments:
1) In class IndexReader: static IndexReader openIfChanged(IndexReader oldReader)
-> If the index has changed since the provided reader was opened, open and return a new reader; otherwise return null.
2) By using this method, we can refresh the reader whenever the index changes.
3) By convention, there is only one IndexReader during the whole lifecycle of the application.
4) In some applications, the IndexWriter is required to be a singleton as well.
-> So how can we commit the changes we made to the index files if we cannot close the writer?
-> Use writer.commit();
-> If we don't commit, the index files do not change and the modifications made through the IndexWriter do not take effect.
5) As we can see, IndexReader.openIfChanged creates a new IndexReader whenever the index has changed.
-> So what about the old reader? It is not closed automatically. Close it once no search is still using it, otherwise the resources it holds are never released.
6) Deletion can be done in two ways.
-> We can use writer.deleteDocuments(new Term(key, value)); or reader.deleteDocuments(new Term(key, value));
-> Remember that the easiest way to commit changes made through a reader is reader.close();
-> So what is the difference between the two approaches?
-> By default, a reader is read-only. Use IndexReader.open(dir, false) to make it writable.
-> If we use the reader to modify the index, it actually creates a writer and uses that writer to execute the modification.
-> The benefit is that the modification is reflected in the reader immediately, so we don't have to call IndexReader.openIfChanged(IndexReader reader);
-> But committing changes through a reader is somewhat awkward.
-> So by convention, we don't use this approach.
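Comments 1) to 5) can be sketched as a small holder that owns the single shared reader. This is a plain-Java model, not Lucene code: ToyIndex, ToyReader and ReaderHolder are hypothetical names, ToyReader.openIfChanged only imitates the "null means unchanged" contract from comment 1), and closing the old reader at swap time assumes that no search is still running on it.

```java
import java.util.concurrent.atomic.AtomicLong;

// Plain-Java model of the "one shared reader, reopened on change" pattern.
// All names are hypothetical; nothing here is a Lucene class.
class ReaderHolderDemo {

    /** Stand-in for the index: a version number bumped on every commit. */
    static class ToyIndex {
        private final AtomicLong version = new AtomicLong();

        void commit() {
            version.incrementAndGet();
        }

        long version() {
            return version.get();
        }
    }

    /** Stand-in for a reader pinned to the version it was opened at. */
    static class ToyReader {
        final ToyIndex index;
        final long version;
        private boolean closed;

        ToyReader(ToyIndex index) {
            this.index = index;
            this.version = index.version();
        }

        /** Returns null when the index is unchanged, else a fresh reader. */
        static ToyReader openIfChanged(ToyReader old) {
            return old.index.version() == old.version ? null : new ToyReader(old.index);
        }

        void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Owns the single shared reader and swaps it when the index changes. */
    static class ReaderHolder {
        private ToyReader reader;

        ReaderHolder(ToyIndex index) {
            this.reader = new ToyReader(index);
        }

        synchronized ToyReader current() {
            ToyReader newer = ToyReader.openIfChanged(reader); // call once only
            if (newer != null) {
                reader.close(); // release the old reader (assumes no search is in flight)
                reader = newer;
            }
            return reader;
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        ReaderHolder holder = new ReaderHolder(index);
        ToyReader first = holder.current();
        index.commit();                      // a writer commits a change
        ToyReader second = holder.current(); // the holder swaps in a fresh reader
        System.out.println("reopened: " + (first != second));
        System.out.println("old reader closed: " + first.isClosed());
    }
}
```

The design choice worth noting is that the reopen call is made exactly once and its result is kept in a local variable, so only one new reader is ever created per change, and the old one is closed at the moment it is replaced. In a multi-threaded application the close would have to wait until running searches have finished with the old reader.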