org.apache.lucene.analysis.path

Class PathHierarchyTokenizerFactory



  • public class PathHierarchyTokenizerFactory
    extends TokenizerFactory
    Factory for PathHierarchyTokenizer.

    This factory is typically configured for use in only one of the two analyzers: either the index Analyzer or the query Analyzer, but never both.

    For example, in the configuration below a query for Books/NonFic will match documents indexed with values like Books/NonFic, Books/NonFic/Law, Books/NonFic/Science/Physics, etc., because each indexed path is expanded into all of its ancestor prefixes while the query is kept as a single token. It will not match documents indexed with values like Books or Books/Fic...

     <fieldType name="descendent_path" class="solr.TextField">
       <analyzer type="index">
         <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/" />
       </analyzer>
       <analyzer type="query">
         <tokenizer class="solr.KeywordTokenizerFactory" />
       </analyzer>
     </fieldType>
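
    To make the index-side behavior concrete, the following is a minimal sketch in plain Java that imitates the tokens described above for simple paths (no leading or trailing delimiter). The class PathPrefixSketch and the helper prefixes are hypothetical names used only for illustration; this is not Lucene's implementation and does not use the Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

public class PathPrefixSketch {

    // Hypothetical helper (not part of Lucene): returns every ancestor prefix
    // of the path followed by the full path, in order from shortest to longest.
    // Assumes a simple path with no leading or trailing delimiter.
    static List<String> prefixes(String path, char delimiter) {
        List<String> tokens = new ArrayList<>();
        int from = 0;
        int idx;
        while ((idx = path.indexOf(delimiter, from)) >= 0) {
            if (idx > 0) {
                tokens.add(path.substring(0, idx)); // prefix ending before this delimiter
            }
            from = idx + 1;
        }
        if (!path.isEmpty()) {
            tokens.add(path); // the full path is always the last token
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Tokens an index-side path hierarchy analysis would store for one document.
        List<String> indexed = prefixes("Books/NonFic/Science/Physics", '/');
        System.out.println(indexed);
        // prints [Books, Books/NonFic, Books/NonFic/Science, Books/NonFic/Science/Physics]

        // The query side uses a keyword tokenizer, so the query is one exact token.
        System.out.println(indexed.contains("Books/NonFic")); // prints true
        System.out.println(indexed.contains("Books/Fic"));    // prints false
    }
}
```

    Because the query is a single token, it matches only when it equals one of the stored prefixes, which is why Books/NonFic finds its descendants but Books/Fic does not.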
     

    The next example shows the opposite configuration: a query for Books/NonFic/Science/Physics will match documents containing Books/NonFic, Books/NonFic/Science, or Books/NonFic/Science/Physics, but not Books/NonFic/Science/Physics/Theory or Books/NonFic/Law.

     <fieldType name="ancestor_path" class="solr.TextField">
       <analyzer type="index">
         <tokenizer class="solr.KeywordTokenizerFactory" />
       </analyzer>
       <analyzer type="query">
         <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/" />
       </analyzer>
     </fieldType>
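
    The reversed configuration can be modeled the same way: the indexed value stays a single token and the query is expanded into its prefixes. The sketch below is plain Java under that assumption, with the simplification that a document matches when its stored value equals any one of the query tokens. The names AncestorMatchSketch, queryTokens, and matches are hypothetical and are not part of Lucene or Solr.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class AncestorMatchSketch {

    // Hypothetical helper: expands a query path into its ancestor prefixes plus
    // the path itself. Assumes no leading or trailing delimiter.
    static Set<String> queryTokens(String path, char delimiter) {
        Set<String> tokens = new LinkedHashSet<>();
        int from = 0;
        int idx;
        while ((idx = path.indexOf(delimiter, from)) >= 0) {
            if (idx > 0) {
                tokens.add(path.substring(0, idx));
            }
            from = idx + 1;
        }
        if (!path.isEmpty()) {
            tokens.add(path);
        }
        return tokens;
    }

    // The indexed value is kept whole (keyword-style), so it matches only if it
    // equals one of the tokens produced from the query.
    static boolean matches(String indexedValue, String query) {
        return queryTokens(query, '/').contains(indexedValue);
    }

    public static void main(String[] args) {
        String query = "Books/NonFic/Science/Physics";
        System.out.println(matches("Books/NonFic", query));                        // prints true
        System.out.println(matches("Books/NonFic/Science/Physics", query));        // prints true
        System.out.println(matches("Books/NonFic/Science/Physics/Theory", query)); // prints false
        System.out.println(matches("Books/NonFic/Law", query));                    // prints false
    }
}
```

    In this direction a query finds the document for the path itself and for each of its ancestors, but never for deeper or sibling paths.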