org.apache.lucene.analysis.shingle

Class ShingleAnalyzerWrapper

    • Constructor Detail

      • ShingleAnalyzerWrapper

        public ShingleAnalyzerWrapper(Analyzer delegate,
                                      int minShingleSize,
                                      int maxShingleSize,
                                      String tokenSeparator,
                                      boolean outputUnigrams,
                                      boolean outputUnigramsIfNoShingles,
                                      String fillerToken)
        Creates a new ShingleAnalyzerWrapper that wraps the given delegate Analyzer.
        Parameters:
        delegate - Analyzer whose TokenStream is to be filtered
        minShingleSize - Minimum shingle (token n-gram) size
        maxShingleSize - Maximum shingle size
        tokenSeparator - String used to separate input stream tokens in output shingles
        outputUnigrams - Whether the filter passes the original tokens (unigrams) to the output stream
        outputUnigramsIfNoShingles - Overrides the behavior of outputUnigrams==false when no shingles are available (because there are fewer than minShingleSize tokens in the input stream). Note that if outputUnigrams==true, unigrams are always output, regardless of whether any shingles are available.
        fillerToken - Filler token inserted into shingles in place of skipped positions, that is, when a token's positionIncrement is more than 1
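
        The interaction between minShingleSize, maxShingleSize, tokenSeparator, outputUnigrams and outputUnigramsIfNoShingles can be illustrated with a small standalone sketch. This is not Lucene's implementation: the class ShingleSketch and its shingles method are hypothetical names, the input is a plain list of strings rather than a TokenStream, fillerToken and position increments are not modelled, and the output order is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, stdlib-only illustration of the constructor parameters.
// Hypothetical helper; NOT Lucene's ShingleFilter or ShingleAnalyzerWrapper.
class ShingleSketch {

    // Assumes 2 <= minShingleSize <= maxShingleSize.
    static List<String> shingles(List<String> tokens,
                                 int minShingleSize,
                                 int maxShingleSize,
                                 String tokenSeparator,
                                 boolean outputUnigrams,
                                 boolean outputUnigramsIfNoShingles) {
        List<String> out = new ArrayList<>();

        // No shingle can be formed when the input has fewer tokens than minShingleSize.
        boolean noShingles = tokens.size() < minShingleSize;

        // outputUnigrams==true always emits unigrams; otherwise the override
        // applies only when no shingles are available.
        boolean emitUnigrams = outputUnigrams || (outputUnigramsIfNoShingles && noShingles);

        for (int start = 0; start < tokens.size(); start++) {
            if (emitUnigrams) {
                out.add(tokens.get(start));
            }
            // Emit every shingle of an allowed size that starts at this token.
            for (int size = minShingleSize; size <= maxShingleSize; size++) {
                int end = start + size;
                if (end > tokens.size()) {
                    break;
                }
                out.add(String.join(tokenSeparator, tokens.subList(start, end)));
            }
        }
        return out;
    }
}
```

        In this sketch, a single-token input with minShingleSize of 2 yields no output when both boolean flags are false, and yields the token itself when outputUnigramsIfNoShingles is true.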