org.apache.lucene.analysis.ja

Class Token



  • public class Token
    extends Object
    Analyzed token with morphological data from its dictionary.
    • Method Detail

      • getSurfaceForm

        public char[] getSurfaceForm()
        Returns:
        the surfaceForm character array
      • getOffset

        public int getOffset()
        Returns:
        offset into surfaceForm
      • getLength

        public int getLength()
        Returns:
        length of this token's text within surfaceForm, counted from the offset
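
The three accessors above are usually read together: the token's text is the slice of the character array that starts at the offset and runs for the given length. A minimal sketch of that slicing, assuming the array is a buffer that may hold more than one token; `SurfaceSliceSketch` and `tokenText` are hypothetical names and not part of this class:

```java
class SurfaceSliceSketch {
    // Hypothetical helper (not part of Token): copies the token's chars out of
    // the buffer, assuming the token occupies [offset, offset + length).
    public static String tokenText(char[] surfaceForm, int offset, int length) {
        return new String(surfaceForm, offset, length);
    }

    public static void main(String[] args) {
        // Six kanji written as Unicode escapes so the file stays ASCII-only.
        char[] buffer = "\u95a2\u897f\u56fd\u969b\u7a7a\u6e2f".toCharArray();
        // Slice the third and fourth chars of the buffer.
        String text = tokenText(buffer, 2, 2);
        System.out.println(text.length()); // prints 2
    }
}
```

With a real token the call would take the shape `tokenText(token.getSurfaceForm(), token.getOffset(), token.getLength())`.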
      • getReading

        public String getReading()
        Returns:
        the reading, or null if the token has no reading
      • getPronunciation

        public String getPronunciation()
        Returns:
        the pronunciation, or null if the token has no pronunciation
      • getBaseForm

        public String getBaseForm()
        Returns:
        the base form, or null if the token is not inflected
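
Because getReading, getPronunciation and getBaseForm can each return null, callers need a fallback before using the value. The sketch below shows one possible policy, not a recommendation from this API; the class and method names are hypothetical, and the sample strings are romanized placeholders:

```java
class NullableFieldsSketch {
    // Hypothetical fallback: use the surface text when no base form is recorded.
    public static String lemmaOrSurface(String baseForm, String surfaceText) {
        return baseForm != null ? baseForm : surfaceText;
    }

    // Hypothetical fallback: treat a missing reading as an empty string.
    public static String readingOrEmpty(String reading) {
        return reading != null ? reading : "";
    }

    public static void main(String[] args) {
        System.out.println(lemmaOrSurface(null, "inu"));        // prints inu
        System.out.println(lemmaOrSurface("taberu", "tabeta")); // prints taberu
        System.out.println(readingOrEmpty(null).isEmpty());     // prints true
    }
}
```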
      • isKnown

        public boolean isKnown()
        Returns true if this token is a known word.
        Returns:
        true if this token is in the standard dictionary, false otherwise
      • isUnknown

        public boolean isUnknown()
        Returns true if this token is an unknown word.
        Returns:
        true if this token is an unknown word, false otherwise
      • isUser

        public boolean isUser()
        Returns true if this token is defined in the user dictionary.
        Returns:
        true if this token is in the user dictionary, false otherwise
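
The three boolean checks above can be folded into a single label when a caller wants to branch on where a token came from. This is only a sketch: the `Source` enum and `classify` are hypothetical, and it assumes exactly one of the three flags is true for any token, which the descriptions above do not state explicitly.

```java
class TokenSourceSketch {
    // Hypothetical labels; this enum is not part of the documented API.
    public enum Source { STANDARD_DICTIONARY, USER_DICTIONARY, UNKNOWN_WORD }

    // Maps the three flags to one label. Assumes exactly one flag is set;
    // the user-dictionary flag is checked first as an arbitrary tie-break.
    public static Source classify(boolean known, boolean unknown, boolean user) {
        if (user) {
            return Source.USER_DICTIONARY;
        }
        if (known) {
            return Source.STANDARD_DICTIONARY;
        }
        if (unknown) {
            return Source.UNKNOWN_WORD;
        }
        throw new IllegalArgumentException("no flag set");
    }

    public static void main(String[] args) {
        System.out.println(classify(true, false, false)); // prints STANDARD_DICTIONARY
        System.out.println(classify(false, true, false)); // prints UNKNOWN_WORD
    }
}
```

With a real token the call would take the shape `classify(token.isKnown(), token.isUnknown(), token.isUser())`.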
      • getPosition

        public int getPosition()
        Gets the index of this token in the input text.
        Returns:
        position of the token
      • setPositionLength

        public void setPositionLength(int positionLength)
        Sets the position length (in tokens) of this token. For normal tokens this is 1; for compound tokens it is greater than 1.
      • getPositionLength

        public int getPositionLength()
        Gets the position length (in tokens) of this token. For normal tokens this is 1; for compound tokens it is greater than 1.
        Returns:
        position length of the token
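
The position length rule can be illustrated with a little arithmetic. In this sketch `startSlot` is a hypothetical token-slot counter kept by the caller; it is deliberately not the value of getPosition(), which is documented above as an index into the input text. The class and method names are assumptions for illustration only.

```java
class PositionLengthSketch {
    // A token is treated as a compound when it spans more than one token slot.
    public static boolean isCompound(int positionLength) {
        return positionLength > 1;
    }

    // Hypothetical helper: the first slot after a token that starts at
    // startSlot and spans positionLength slots.
    public static int endSlot(int startSlot, int positionLength) {
        return startSlot + positionLength;
    }

    public static void main(String[] args) {
        // A normal token at slot 0 ends at slot 1.
        System.out.println(endSlot(0, 1));  // prints 1
        // A compound at slot 0 covering three component tokens ends at slot 3.
        System.out.println(endSlot(0, 3));  // prints 3
        System.out.println(isCompound(1));  // prints false
        System.out.println(isCompound(3));  // prints true
    }
}
```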