org.apache.lucene.search.suggest.analyzing

Class AnalyzingSuggester

  • All Implemented Interfaces:
    Accountable
    Direct Known Subclasses:
    FuzzySuggester


    public class AnalyzingSuggester
    extends Lookup
    implements Accountable
    Suggester that first analyzes the surface form, adds the analyzed form to a weighted FST, and then does the same thing at lookup time. This means lookup is based on the analyzed form while suggestions are still the surface form(s).

    This can result in powerful suggester functionality. For example, if you use an analyzer removing stop words, then the partial text "ghost chr..." could see the suggestion "The Ghost of Christmas Past". Note that position increments MUST NOT be preserved for this example to work, so you should call the constructor with preservePositionIncrements parameter set to false

    If SynonymFilter is used to map wifi and wireless network to hotspot then the partial text "wirele..." could suggest "wifi router". Token normalization like stemmers, accent removal, etc., would allow suggestions to ignore such variations.

    When two matching suggestions have the same weight, they are tie-broken by the analyzed form. If their analyzed form is the same then the order is undefined.

    There are some limitations:

    • A lookup from a query like "net" in English won't be any different than "net " (ie, user added a trailing space) because analyzers don't reflect when they've seen a token separator and when they haven't.
    • If you're using StopFilter, and the user will type "fast apple", but so far all they've typed is "fast a", again because the analyzer doesn't convey whether it's seen a token separator after the "a", StopFilter will remove that "a" causing far more matches than you'd expect.
    • Lookups with the empty string return no results instead of all results.