org.apache.lucene.search.join

Class JoinUtil



  • public final class JoinUtil
    extends Object
    Utility for query time joining.
    • Method Detail

      • createJoinQuery

        public static Query createJoinQuery(String fromField,
                                            boolean multipleValuesPerDocument,
                                            String toField,
                                            Query fromQuery,
                                            IndexSearcher fromSearcher,
                                            ScoreMode scoreMode)
                                     throws IOException
        Method for query time joining.

        Execute the returned query with an IndexSearcher to retrieve all documents whose terms in the to field equal the terms in the from field of documents matching the specified fromQuery.

        If a single document relates to more than one document, set the multipleValuesPerDocument option to true. When multipleValuesPerDocument is true, only the score from the first encountered join value originating from the 'from' side is mapped into the 'to' side, even when a second join value related to a specific document yields a higher score. This does not apply when ScoreMode.None is used, since no scores are computed at all.

        Memory considerations: During joining, all unique join values are kept in memory. In addition, when scoreMode is not ScoreMode.None, a float value per unique join value is kept in memory for computing scores. When scoreMode is ScoreMode.Avg, an additional integer value is kept in memory per unique join value.

        Parameters:
        fromField - The from field to join from
        multipleValuesPerDocument - Whether the from field has multiple terms per document
        toField - The to field to join to
        fromQuery - The query to match documents on the from side
        fromSearcher - The searcher that executed the specified fromQuery
        scoreMode - Instructs how scores from the fromQuery are mapped to the returned query
        Returns:
        a Query instance that can be used to join documents based on the terms in the from and to field
        Throws:
        IOException - If I/O related errors occur
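
        The term-based join and its memory profile described above can be modelled in a few lines. The following is a conceptual sketch in plain Java, not Lucene's implementation: TermJoinSketch, FromHit, and Mode are hypothetical names, and Mode merely stands in for ScoreMode. It keeps one map entry per unique join value, a float per value when scoring, and an extra integer per value for averaging.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Conceptual model of a term-based join; all names here are hypothetical. */
final class TermJoinSketch {
    /** Stand-in for ScoreMode; not the Lucene enum. */
    enum Mode { NONE, MAX, TOTAL, AVG }

    /** The join term and score of one document matched by the from query. */
    static final class FromHit {
        final String joinTerm;
        final float score;

        FromHit(String joinTerm, float score) {
            this.joinTerm = joinTerm;
            this.score = score;
        }
    }

    /** Collects one score per unique join term found on the "from" side. */
    static Map<String, Float> collect(List<FromHit> hits, Mode mode) {
        Map<String, Float> scores = new HashMap<>();    // one float per unique join value
        Map<String, Integer> counts = new HashMap<>();  // extra integer, used only for AVG
        for (FromHit hit : hits) {
            switch (mode) {
                case NONE:
                    scores.put(hit.joinTerm, 0f);       // term is recorded, score is unused
                    break;
                case MAX:
                    scores.merge(hit.joinTerm, hit.score, Float::max);
                    break;
                default:                                // TOTAL and AVG both accumulate a sum
                    scores.merge(hit.joinTerm, hit.score, Float::sum);
                    break;
            }
            if (mode == Mode.AVG) {
                counts.merge(hit.joinTerm, 1, Integer::sum);
            }
        }
        if (mode == Mode.AVG) {
            scores.replaceAll((term, sum) -> sum / counts.get(term));
        }
        return scores;
    }

    /** Matches "to" documents (index = doc id) whose term was collected above. */
    static Map<Integer, Float> join(Map<String, Float> scoresByTerm, String[] toTerms) {
        Map<Integer, Float> result = new LinkedHashMap<>();
        for (int doc = 0; doc < toTerms.length; doc++) {
            Float score = scoresByTerm.get(toTerms[doc]);
            if (score != null) {
                result.put(doc, score);
            }
        }
        return result;
    }
}
```

        In this model, calling collect and then join plays the role of running the fromQuery and then executing the returned query against the "to" side. The sketch assumes one term per document and so does not model multipleValuesPerDocument.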
      • createJoinQuery

        public static Query createJoinQuery(String joinField,
                                            Query fromQuery,
                                            Query toQuery,
                                            IndexSearcher searcher,
                                            ScoreMode scoreMode,
                                            MultiDocValues.OrdinalMap ordinalMap,
                                            int min,
                                            int max)
                                     throws IOException
        A query time join using global ordinals over a dedicated join field. This join has certain restrictions and requirements:

        1) A document can only refer to one other document (but can be referred to by one or more documents).
        2) Documents on each side of the join must be distinguishable. Typically this is done by adding an extra field that identifies the "from" and "to" side; the fromQuery and toQuery must then take this into account.
        3) There must be a single sorted doc values join field used by both the "from" and "to" documents. This join field should store the join values as UTF-8 strings.
        4) An ordinal map must be provided that is created on top of the join field.

        Note: min and max filtering and the avg score mode require this join to keep track of the number of times a document matches per join value. This increases the per-join cost in terms of execution time and memory.
        Parameters:
        joinField - The SortedDocValues field containing the join values
        fromQuery - The query containing the actual user query. The fromQuery must match only "from" documents.
        toQuery - The query identifying all documents on the "to" side.
        searcher - The index searcher used to execute the from query
        scoreMode - Instructs how scores from the fromQuery are mapped to the returned query
        ordinalMap - The ordinal map constructed over the joinField. In case of a single segment index, no ordinal map needs to be provided.
        min - Optionally the minimum number of "from" documents that are required to match for a "to" document to be a match. The min is inclusive. Setting min to 0 and max to Integer.MAX_VALUE disables the min and max "from" documents filtering
        max - Optionally the maximum number of "from" documents that are allowed to match for a "to" document to be a match. The max is inclusive. Setting min to 0 and max to Integer.MAX_VALUE disables the min and max "from" documents filtering
        Returns:
        a Query instance that can be used to join documents based on the join field
        Throws:
        IOException - If I/O related errors occur
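
        The min and max filtering described above amounts to counting matching "from" documents per join value. The following plain-Java sketch illustrates the idea only; it is not Lucene's implementation. MinMaxJoinSketch and its parameters are hypothetical, join values are represented as small integer ordinals, and the sketch assumes that a "to" document always needs at least one matching "from" document.

```java
import java.util.ArrayList;
import java.util.List;

/** Conceptual model of min/max filtering over join ordinals; names are hypothetical. */
final class MinMaxJoinSketch {
    /**
     * @param fromOrds join ordinal of each document matched by the from query
     * @param toOrds   join ordinal of each "to" document (array index = doc id)
     * @param ordCount number of distinct join ordinals
     * @param min      inclusive lower bound on matching "from" documents
     * @param max      inclusive upper bound on matching "from" documents
     * @return ids of the "to" documents that pass the filter
     */
    static List<Integer> matchingToDocs(int[] fromOrds, int[] toOrds,
                                        int ordCount, int min, int max) {
        // This per-join-value counter is the extra bookkeeping the note refers to.
        int[] counts = new int[ordCount];
        for (int ord : fromOrds) {
            counts[ord]++;
        }
        List<Integer> matches = new ArrayList<>();
        for (int doc = 0; doc < toOrds.length; doc++) {
            int count = counts[toOrds[doc]];
            // Assumption: a join match requires at least one "from" document.
            if (count >= 1 && count >= min && count <= max) {
                matches.add(doc);
            }
        }
        return matches;
    }
}
```

        With min set to 0 and max set to Integer.MAX_VALUE, the bounds check never rejects a document, so the sketch reduces to a plain join on the ordinal.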