Class BM25Similarity

  • public class BM25Similarity
    extends Similarity
    BM25 Similarity. Introduced in Stephen E. Robertson, Steve Walker, Susan Jones, Micheline Hancock-Beaulieu, and Mike Gatford. Okapi at TREC-3. In Proceedings of the Third Text REtrieval Conference (TREC 1994). Gaithersburg, USA, November 1994.
    • Field Detail

      • discountOverlaps

        protected boolean discountOverlaps
        True if overlap tokens (tokens with a position increment of zero) are discounted from the document's length.
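        As a rough illustration of what discounting overlaps means, the sketch below counts a document's length from a list of token position increments. The class OverlapLengthSketch and the method documentLength are hypothetical names used only for this example; they are not part of this class, and the actual length computation may differ in detail.

```java
final class OverlapLengthSketch {

    // Hypothetical helper: counts the tokens that contribute to document length.
    // A token with a position increment of zero sits at the same position as
    // the previous token (for example, an injected synonym).
    static int documentLength(int[] positionIncrements, boolean discountOverlaps) {
        int length = 0;
        for (int increment : positionIncrements) {
            if (discountOverlaps && increment == 0) {
                continue; // overlap token: not counted toward the length
            }
            length++;
        }
        return length;
    }

    public static void main(String[] args) {
        // Three tokens; the second one overlaps the first (increment 0).
        int[] increments = {1, 0, 1};
        System.out.println(documentLength(increments, true));
        System.out.println(documentLength(increments, false));
    }
}
```

        With discounting enabled, stacked tokens such as synonyms do not make a document look longer, so they do not change its length normalization.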
    • Constructor Detail

      • BM25Similarity

        public BM25Similarity(float k1,
                              float b)
        BM25 with the supplied parameter values.
        Parameters:
        k1 - Controls non-linear term frequency normalization (saturation).
        b - Controls to what degree document length normalizes tf values.
        Throws:
        IllegalArgumentException - if k1 is infinite or negative, or if b is not within the range [0..1]
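        To make the roles of k1 and b concrete, here is a minimal, self-contained sketch. It uses the textbook BM25 term-frequency component, tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avgdl)), which is an assumption for illustration and not necessarily the exact expression this class evaluates. The class Bm25ParamsSketch and its methods checkParameters and tfNorm are hypothetical names; the checks only mirror the conditions documented above.

```java
final class Bm25ParamsSketch {

    // Mirrors the documented constraints: k1 must not be infinite or negative,
    // and b must lie within [0..1]. (Hypothetical helper, not the class's code.)
    static void checkParameters(float k1, float b) {
        if (Float.isInfinite(k1) || k1 < 0) {
            throw new IllegalArgumentException("illegal k1 value: " + k1);
        }
        if (!(b >= 0 && b <= 1)) { // also rejects NaN, which is outside the range
            throw new IllegalArgumentException("illegal b value: " + b);
        }
    }

    // Textbook BM25 term-frequency component (assumed form, see lead-in).
    // freq: term frequency in the document
    // docLen / avgDocLen: document length and average length in the collection
    static double tfNorm(double freq, double docLen, double avgDocLen, float k1, float b) {
        checkParameters(k1, b);
        double lengthFactor = 1 - b + b * (docLen / avgDocLen);
        return freq * (k1 + 1) / (freq + k1 * lengthFactor);
    }

    public static void main(String[] args) {
        float k1 = 1.2f;
        float b = 0.75f;
        // Saturation: each extra occurrence adds less, bounded above by k1 + 1.
        for (int freq = 1; freq <= 16; freq *= 2) {
            System.out.println("freq=" + freq + " -> " + tfNorm(freq, 100, 100, k1, b));
        }
        // Length normalization: the same frequency counts for less in a longer document.
        System.out.println("short doc: " + tfNorm(3, 50, 100, k1, b));
        System.out.println("long doc:  " + tfNorm(3, 400, 100, k1, b));
    }
}
```

        In this form, b = 0 turns length normalization off entirely, b = 1 applies it in full, and k1 = 0 reduces the component to a binary signal that ignores how often the term occurs.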
      • BM25Similarity

        public BM25Similarity()
        BM25 with these default values:
        • k1 = 1.2
        • b = 0.75