public class JapaneseAnalyzer extends StopwordAnalyzerBase
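See Also: JapaneseTokenizer
Nested classes/interfaces inherited from class Analyzer: Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
Fields inherited from class StopwordAnalyzerBase: stopwords
Fields inherited from class Analyzer: GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY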
Constructor and Description |
---|
JapaneseAnalyzer() |
JapaneseAnalyzer(UserDictionary userDict, JapaneseTokenizer.Mode mode, CharArraySet stopwords, Set<String> stoptags) |
Modifier and Type | Method and Description |
---|---|
protected Analyzer.TokenStreamComponents | createComponents(String fieldName) Creates a new Analyzer.TokenStreamComponents instance for this analyzer. |
static CharArraySet | getDefaultStopSet() |
static Set<String> | getDefaultStopTags() |
protected TokenStream | normalize(String fieldName, TokenStream in) Wrap the given TokenStream in order to apply normalization filters. |
Methods inherited from class StopwordAnalyzerBase: getStopwordSet, loadStopwordSet, loadStopwordSet, loadStopwordSet
Methods inherited from class Analyzer: attributeFactory, close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, getVersion, initReader, initReaderForNormalization, normalize, setVersion, tokenStream, tokenStream
public JapaneseAnalyzer()
public JapaneseAnalyzer(UserDictionary userDict, JapaneseTokenizer.Mode mode, CharArraySet stopwords, Set<String> stoptags)
public static CharArraySet getDefaultStopSet()
public static Set<String> getDefaultStopTags()
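The `stopwords` and `stoptags` arguments of the four-argument constructor suggest two ways of dropping tokens after tokenization: by surface form and by part-of-speech tag. The sketch below illustrates that idea with JDK collections only; it is not Lucene code. The `Token` class, the `filter` and `demo` helpers, and the romanized sample words and English tag names are all hypothetical placeholders, and the real analyzer works on a `TokenStream` rather than a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StopFilterSketch {
    /** Hypothetical token: a surface form plus a part-of-speech tag. */
    static final class Token {
        final String surface;
        final String posTag;

        Token(String surface, String posTag) {
            this.surface = surface;
            this.posTag = posTag;
        }
    }

    /** Keeps the surface forms of tokens that match neither a stop tag nor a stop word. */
    static List<String> filter(List<Token> tokens, Set<String> stopwords, Set<String> stoptags) {
        List<String> kept = new ArrayList<>();
        for (Token t : tokens) {
            if (stoptags.contains(t.posTag)) {
                continue; // dropped because of its part-of-speech tag
            }
            if (stopwords.contains(t.surface)) {
                continue; // dropped because the word itself is a stop word
            }
            kept.add(t.surface);
        }
        return kept;
    }

    /** Runs the filter on a small, made-up token list (placeholder data). */
    static List<String> demo() {
        List<Token> tokens = Arrays.asList(
                new Token("sushi", "noun"),
                new Token("ga", "particle"),
                new Token("taberu", "verb"),
                new Token("koto", "noun"));
        Set<String> stopwords = new HashSet<>(Arrays.asList("koto"));
        Set<String> stoptags = new HashSet<>(Arrays.asList("particle"));
        return filter(tokens, stopwords, stoptags);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch a token is removed as soon as either set matches, so the two sets act independently: `ga` is dropped by tag and `koto` by surface form.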
protected Analyzer.TokenStreamComponents createComponents(String fieldName)
Description copied from class: Analyzer
Creates a new Analyzer.TokenStreamComponents instance for this analyzer.
Specified by: createComponents in class Analyzer
Parameters: fieldName - the name of the field whose content is passed to the Analyzer.TokenStreamComponents sink as a reader
Returns: the Analyzer.TokenStreamComponents for this analyzer.
protected TokenStream normalize(String fieldName, TokenStream in)
Description copied from class: Analyzer
Wrap the given TokenStream in order to apply normalization filters. The default implementation returns the TokenStream as-is. This is used by Analyzer.normalize(String, String).
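Normalization filters of the kind mentioned here typically work at the character level, for example folding character widths and lowercasing, so that query terms and indexed terms compare equal. The sketch below assumes such a chain and approximates it with JDK classes only: Unicode NFKC normalization stands in for width folding (NFKC also performs other compatibility mappings), followed by lowercasing. The class and method names are hypothetical, and this is an illustration of the concept rather than the analyzer's implementation.

```java
import java.text.Normalizer;
import java.util.Locale;

public class NormalizeSketch {
    /** Approximates width folding with NFKC, then lowercases the result. */
    static String normalize(String term) {
        String folded = Normalizer.normalize(term, Normalizer.Form.NFKC);
        return folded.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // "Lucene" written with full-width Latin letters (U+FF2C, U+FF55, ...).
        System.out.println(normalize("\uFF2C\uFF55\uFF43\uFF45\uFF4E\uFF45"));
        // "ka" and "na" in half-width katakana (U+FF76, U+FF85).
        System.out.println(normalize("\uFF76\uFF85"));
    }
}
```

The string literals use Unicode escapes so the file compiles regardless of source encoding. The first call yields the ASCII string `lucene`; the second yields the same two syllables in full-width katakana (U+30AB, U+30CA).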