BM25

A probabilistic information retrieval model.

class src.features.bm25.BM25[source]

Bases: object

A class to create BM25 features.

Methods:

fit(corpus: pd.Series)

Train the model on a corpus.

predict_proba(query, document)

Return a confidence score for a query and a document.

bm25(word, document, k: int = 1, b: float = 0.75)

Compute the BM25 weight of a word in a document.

bm25(word, document, k: int = 1, b: float = 0.75)[source]

Compute BM25 weight.

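Parameters
  • word – the term to weight

  • document – the document in which the term is weighted

  • k (int) – term-frequency saturation parameter (often written k1)

  • b (float) – document-length normalization parameter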

Returns

weight (float)
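For orientation, the following is a minimal sketch of a standard Okapi BM25 term weight. It is not taken from this module's source: the function name bm25_weight, its explicit statistics arguments, and the smoothed IDF variant are all assumptions chosen for illustration. Only the defaults k = 1 and b = 0.75 come from the signature above.

```python
import math


def bm25_weight(tf, doc_len, avg_len, n_docs, doc_freq, k=1, b=0.75):
    """Illustrative BM25 weight of one term in one document (hypothetical helper).

    tf       -- occurrences of the term in the document
    doc_len  -- number of tokens in the document
    avg_len  -- average document length in the corpus
    n_docs   -- number of documents in the corpus
    doc_freq -- number of documents containing the term
    """
    if tf == 0:
        return 0.0
    # Assumed IDF variant: the "+ 1" inside the log keeps the value non-negative.
    idf = math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1)
    # Length normalization: b = 0 disables it, b = 1 applies it fully.
    norm = 1 - b + b * doc_len / avg_len
    # Term-frequency saturation controlled by k.
    return idf * tf * (k + 1) / (tf + k * norm)
```

With tf = 1 and a document of exactly average length, the saturation factor equals 1, so the weight reduces to the IDF term alone.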

corpus = None
corpus_length = None
fit(corpus: pandas.core.series.Series)[source]

Fits the model.

Parameters

corpus (pd.Series) – the collection of documents used to fit the model

Returns

None
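The class attributes corpus_length, l_avg and occurrences suggest that fitting collects corpus-level statistics. The sketch below shows one plausible way to compute such statistics. It assumes that corpus_length is the number of documents, l_avg the average document length in tokens, and occurrences a per-term document frequency. None of this is confirmed by this page, and the helper name corpus_statistics and the whitespace tokenization are hypothetical.

```python
from collections import Counter


def corpus_statistics(corpus):
    """Collect BM25 corpus statistics (hypothetical helper, not the module's fit).

    corpus -- any iterable of strings; a pd.Series of documents iterates the same way.
    """
    # Assumed tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in corpus]
    corpus_length = len(tokenized)
    total_tokens = sum(len(tokens) for tokens in tokenized)
    l_avg = total_tokens / corpus_length if corpus_length else 0.0
    # Document frequency: count each term at most once per document.
    occurrences = Counter()
    for tokens in tokenized:
        occurrences.update(set(tokens))
    return corpus_length, l_avg, dict(occurrences)
```

Counting each term once per document is what the IDF component of BM25 needs, which is why the tokens are passed through set() before updating the counter.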

l_avg = None
occurrences = {}
predict_proba(query, document)[source]

Predict with confidence score.

Parameters
  • query – the query to score the document against

  • document – the document to score

Returns

score (float)
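A common way to score a document against a query is to sum the BM25 weights of the query terms that occur in the document. The sketch below illustrates that idea under stated assumptions: the function bm25_score, its corpus argument, the whitespace tokenization and the smoothed IDF are hypothetical, and the page does not say whether predict_proba normalizes or otherwise transforms the raw sum.

```python
import math
from collections import Counter


def bm25_score(query, document, corpus, k=1, b=0.75):
    """Sum of BM25 term weights for a query (illustrative, assumes a non-empty corpus)."""
    docs = [d.lower().split() for d in corpus]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency of every term in the corpus.
    doc_freq = Counter(term for d in docs for term in set(d))

    tokens = document.lower().split()
    term_freq = Counter(tokens)
    norm = 1 - b + b * len(tokens) / avg_len

    score = 0.0
    # Each distinct query term contributes once (an assumption of this sketch).
    for term in set(query.lower().split()):
        f = term_freq[term]
        if f == 0:
            continue
        idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
        score += idf * f * (k + 1) / (f + k * norm)
    return score


corpus = ["the cat sat on the mat", "the dog barked", "cats and dogs"]
print(bm25_score("cat", corpus[0], corpus) > bm25_score("cat", corpus[1], corpus))
```

The raw sum is unbounded above, so a value returned this way is a relevance score rather than a probability.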