org.apache.lucene.search.suggest.analyzing

Class SuggestStopFilterFactory

  • All Implemented Interfaces:
    ResourceLoaderAware


    public class SuggestStopFilterFactory
    extends TokenFilterFactory
    implements ResourceLoaderAware
    Factory for SuggestStopFilter.
     <fieldType name="autosuggest" class="solr.TextField" 
                positionIncrementGap="100" autoGeneratePhraseQueries="true">
       <analyzer>
         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
         <filter class="solr.LowerCaseFilterFactory"/>
         <filter class="solr.SuggestStopFilterFactory" ignoreCase="true"
                 words="stopwords.txt" format="wordset"/>
       </analyzer>
     </fieldType>

    All attributes are optional:

    • ignoreCase defaults to false
    • words should be the name of a stopwords file to parse. If it is not specified, the factory uses StopAnalyzer.ENGLISH_STOP_WORDS_SET.
    • format defines how the words file will be parsed, and defaults to wordset. If words is not specified, then format must not be specified.

    The valid values for the format option are:

    • wordset - This is the default format, which supports one word per line (including any intra-word whitespace) and allows whole-line comments beginning with the "#" character. Blank lines are ignored. See WordlistLoader.getLines for details.
    • snowball - This format allows for multiple words specified on each line, and trailing comments may be specified using the vertical line ("|"). Blank lines are ignored. See WordlistLoader.getSnowballWordSet for details.
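
    The difference between the two formats can be illustrated with a minimal Java sketch that uses only the standard library. This is not Lucene's WordlistLoader code: the class name StopwordFormatsSketch and both method names are hypothetical, and the logic only approximates the rules described above (whole-line "#" comments and one entry per line for wordset; "|" trailing comments and whitespace-separated words for snowball).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Lucene or Solr.
public class StopwordFormatsSketch {

    // "wordset" rules as described above: one entry per line,
    // lines starting with '#' are comments, blank lines are skipped.
    public static List<String> parseWordset(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            if (line.startsWith("#")) {
                continue; // whole-line comment
            }
            String word = line.trim();
            if (word.isEmpty()) {
                continue; // blank line
            }
            words.add(word); // intra-word whitespace is preserved
        }
        return words;
    }

    // "snowball" rules as described above: several words per line,
    // everything after '|' is a trailing comment, blank lines are skipped.
    public static List<String> parseSnowball(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            int comment = line.indexOf('|');
            String content = comment >= 0 ? line.substring(0, comment) : line;
            for (String word : content.split("\\s+")) {
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [a, new york]
        System.out.println(parseWordset(Arrays.asList("# articles", "a", "", "new york")));
        // prints [a, an, the, of]
        System.out.println(parseSnowball(Arrays.asList("a an the | articles", "", "of")));
    }
}
```

    Note that the same line, "new york", yields a single entry under the wordset rules but two separate words under the snowball rules, so the format attribute should match how the stopwords file was written.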