Lucene

About

Lucene 1) is a text search engine library.

The following applications are Lucene applications (i.e. built on it):

Structure

The text data model of Lucene is based on the following concepts 2):

An index contains a sequence of documents.

Document

A document is a basic unit of information that can be indexed.

For example, you can have a document for:

Index

An index is a collection of documents that have somewhat similar characteristics.

Lucene's terms index falls into the family of indexes known as an inverted index because it can list, for a term, the documents that contain it. This is the inverse of the natural relationship, in which documents list terms.
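The inverse relationship described above can be sketched with a toy structure in plain Java. This is only an illustration of the idea, not how Lucene stores its index: the class name ToyInvertedIndex, its methods, and the naive lowercase-and-split tokenization are all assumptions made for this example.

```java
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy inverted index (hypothetical, for illustration): term -> ids of documents containing it. */
class ToyInvertedIndex {
    // For each term, the sorted ids of the documents that contain it.
    private final Map<String, SortedSet<Integer>> postings = new TreeMap<>();
    // Ids are handed out in insertion order, so the index sees a sequence of documents.
    private int nextDocId = 0;

    /** Adds a document and returns its id. */
    int addDocument(String text) {
        int docId = nextDocId++;
        // Naive analysis (an assumption): lowercase, then split on anything that is not a letter or digit.
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    /** Inverse lookup: for a term, the ids of the documents that contain it. */
    SortedSet<Integer> documentsContaining(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
    }
}
```

After adding "Abalone chowder" (id 0) and "Clam chowder recipe" (id 1), documentsContaining("chowder") returns both ids, while documentsContaining("clam") returns only id 1.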

For example, you can have an index for:

Query

Lucene comes with a rich query language 3).

Syntax:

[field:]expression

where:

Cheatsheet:

Relation          Expression
equals            attribute:"value"
does not equal    attribute:-"value"
contains          attribute:*value*
does not contain  attribute:-*value*
starts with       attribute:value*
ends with         attribute:*value
has               has:attribute
missing           missing:attribute

Example:

text:go
# same as (when text is the default field)
go
title:"The Right Way" AND text:go
# same as
title:"The Right Way" AND go
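To make the [field:]expression rule concrete, here is a minimal sketch in plain Java of how a clause without a field prefix falls back to a default field. It is a toy, not Lucene's QueryParser: the class ToyClauseSplitter, its split method, and the simplified grammar (an optional field prefix followed by a quoted phrase or a single token, with uppercase AND, OR and NOT skipped) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy illustration of the [field:]expression syntax (hypothetical; not Lucene's QueryParser). */
class ToyClauseSplitter {
    // Optional "field:" prefix, then either a quoted phrase or a single unquoted token.
    private static final Pattern CLAUSE =
            Pattern.compile("(?:(\\w+):)?(\"[^\"]*\"|[^\\s\"]+)");
    // Uppercase boolean operators are skipped here rather than interpreted.
    private static final Set<String> OPERATORS = Set.of("AND", "OR", "NOT");

    /** Returns each clause as "field:expression", filling in the default field when none is given. */
    static List<String> split(String query, String defaultField) {
        List<String> clauses = new ArrayList<>();
        Matcher m = CLAUSE.matcher(query);
        while (m.find()) {
            String field = m.group(1);
            String expression = m.group(2);
            if (field == null && OPERATORS.contains(expression)) {
                continue; // an operator, not a clause
            }
            clauses.add((field != null ? field : defaultField) + ":" + expression);
        }
        return clauses;
    }
}
```

With "text" as the default field, split("go", "text") and split("text:go", "text") produce the same single clause text:go, and title:"The Right Way" AND go is split into title:"The Right Way" and text:go.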

Anatomy of a Lucene Application

To create a Lucene application, you should 4):

Example:

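// Imports used by this example. assertEquals is assumed to come from JUnit 4.
import static org.junit.Assert.assertEquals;

import java.nio.file.Files;
import java.nio.file.Path;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.IOUtils;

// Index one document into a temporary on-disk index:
Analyzer analyzer = new StandardAnalyzer();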

Path indexPath = Files.createTempDirectory("tempIndex");
Directory directory = FSDirectory.open(indexPath);
IndexWriterConfig config = new IndexWriterConfig(analyzer);
IndexWriter iwriter = new IndexWriter(directory, config);
Document doc = new Document();
String text = "This is the text to be indexed.";
doc.add(new Field("fieldname", text, TextField.TYPE_STORED));
iwriter.addDocument(doc);
iwriter.close();

// Now search the index:
DirectoryReader ireader = DirectoryReader.open(directory);
IndexSearcher isearcher = new IndexSearcher(ireader);
// Parse a simple query that searches for "text":
QueryParser parser = new QueryParser("fieldname", analyzer);
Query query = parser.parse("text");
ScoreDoc[] hits = isearcher.search(query, 10).scoreDocs;
assertEquals(1, hits.length);
// Iterate through the results:
for (int i = 0; i < hits.length; i++) {
    // Newer Lucene releases deprecate doc(int) in favor of isearcher.storedFields().document(int).
    Document hitDoc = isearcher.doc(hits[i].doc);
    assertEquals("This is the text to be indexed.", hitDoc.get("fieldname"));
}
ireader.close();
directory.close();
IOUtils.rm(indexPath);

Example on how to index and query

Simple examples in the repository 5) are:

Usage:

java -cp lucene-core.jar:lucene-demo.jar:lucene-analysis-common.jar \
    org.apache.lucene.demo.IndexFiles \
    -index index \
    -docs your/directory/path
adding rec.food.recipes/soups/abalone-chowder
      [ ... ]

java -cp lucene-core.jar:lucene-demo.jar:lucene-queryparser.jar:lucene-analysis-common.jar \
   org.apache.lucene.demo.SearchFiles
Query: chowder
Searching for: chowder
34 total matching documents
...

4)
This example comes from the package index - minimal application.