Package org.apache.lucene.search.spans

The calculus of spans.

A span is a <doc,startPosition,endPosition> tuple that is enumerated by class Spans.

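The following span query operators are implemented:

SpanTermQuery matches all spans containing a particular Term.
SpanNearQuery matches spans which occur near one another. It can implement phrase search (when constructed from SpanTermQuerys) and inter-phrase proximity (when constructed from other SpanNearQuerys).
SpanOrQuery merges spans from a number of other SpanQuerys.
SpanNotQuery removes spans matching one SpanQuery which overlap another.
SpanFirstQuery matches spans matching q whose end position is less than n. It can constrain matches to the first part of the document.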

In all cases, output spans are minimally inclusive. In other words, a span formed by matching a span from subquery x and a span from subquery y starts at the lesser of the two starts and ends at the greater of the two ends.
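
The minimally inclusive rule can be illustrated without Lucene itself. The sketch below is only an illustration: the SpanCover class, its nested Span type, and the cover method are hypothetical names that are not part of this package, and the positions in main are made up.

```java
public class SpanCover {
    /** Hypothetical value type mirroring the <doc,startPosition,endPosition> tuple. */
    static final class Span {
        final int doc, start, end;

        Span(int doc, int start, int end) {
            this.doc = doc;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "<" + doc + "," + start + "," + end + ">";
        }
    }

    /** Smallest span containing both inputs; both must come from the same document. */
    static Span cover(Span x, Span y) {
        if (x.doc != y.doc) {
            throw new IllegalArgumentException("spans belong to different documents");
        }
        // Lesser of the two starts, greater of the two ends.
        return new Span(x.doc, Math.min(x.start, y.start), Math.max(x.end, y.end));
    }

    public static void main(String[] args) {
        Span x = new Span(7, 4, 6);
        Span y = new Span(7, 12, 14);
        System.out.println(cover(x, y)); // prints <7,4,14>
    }
}
```

The order of the arguments does not matter here: cover(y, x) yields the same span.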

For example, a span query which matches "John Kerry" within ten words of "George Bush" within the first 100 words of the document could be constructed with:

 SpanQuery john   = new SpanTermQuery(new Term("content", "john"));
 SpanQuery kerry  = new SpanTermQuery(new Term("content", "kerry"));
 SpanQuery george = new SpanTermQuery(new Term("content", "george"));
 SpanQuery bush   = new SpanTermQuery(new Term("content", "bush"));
 
 SpanQuery johnKerry =
    new SpanNearQuery(new SpanQuery[] {john, kerry}, 0, true);
 
 SpanQuery georgeBush =
    new SpanNearQuery(new SpanQuery[] {george, bush}, 0, true);
 
 SpanQuery johnKerryNearGeorgeBush =
    new SpanNearQuery(new SpanQuery[] {johnKerry, georgeBush}, 10, false);
 
 SpanQuery johnKerryNearGeorgeBushAtStart =
    new SpanFirstQuery(johnKerryNearGeorgeBush, 100);
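
To make the two numeric parameters (10 and 100) concrete, the following sketch models the position arithmetic on plain integers. It is a simplified illustration rather than Lucene's implementation: the ProximityModel class and its methods are hypothetical, end positions are assumed to be exclusive, and the slop is assumed to count the positions inside the combined span that neither phrase covers.

```java
public class ProximityModel {
    /** Unmatched positions inside the minimal span covering [s1,e1) and [s2,e2). */
    static int gap(int s1, int e1, int s2, int e2) {
        int width = Math.max(e1, e2) - Math.min(s1, s2);
        return width - ((e1 - s1) + (e2 - s2));
    }

    /**
     * Assumed model of the nested query: the gap must not exceed the slop,
     * and the combined span must end within the first `limit` positions.
     */
    static boolean matches(int s1, int e1, int s2, int e2, int slop, int limit) {
        return gap(s1, e1, s2, e2) <= slop && Math.max(e1, e2) <= limit;
    }

    public static void main(String[] args) {
        // "john kerry" at positions 4 and 5, "george bush" at positions 12 and 13.
        System.out.println(gap(4, 6, 12, 14));              // prints 6
        System.out.println(matches(4, 6, 12, 14, 10, 100)); // prints true
        // Same phrases, but "george bush" moved to positions 40 and 41.
        System.out.println(matches(4, 6, 40, 42, 10, 100)); // prints false
    }
}
```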
 

Span queries may be freely intermixed with other Lucene queries. So, for example, the above query can be restricted to documents which also use the word "iraq" with:

 BooleanQuery query = new BooleanQuery();
 // add(query, required, prohibited): true, false marks the clause as required
 query.add(johnKerryNearGeorgeBushAtStart, true, false);
 query.add(new TermQuery(new Term("content", "iraq")), true, false);
 