TFIDF

Used to generate TF-IDF features for queries and documents.

class src.embeddings.tfidf.TFIDF(path: Optional[str] = None)[source]

Bases: object

A class to create tfidf embeddings.

path

Type: str

Methods:

  • fit(text_in_tokens: pd.Series, store: str = 'models/tfidf.pkl'): Fits the tfidf model to the data.

  • transform(text_in_tokens: pd.Series, store: Optional[str] = None): Transforms a series of preprocessed tokens to tfidf embeddings.

fit(text_in_tokens: pandas.core.series.Series, store: str = 'models/tfidf.pkl')[source]

Fits the tfidf model to the data.

Parameters
  • text_in_tokens (pd.Series) – Series of preprocessed tokens

  • store (str) – Path to store the fitted model at

Returns

None
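The reference above does not say how fit computes or stores the model. As a rough illustration only, the sketch below derives smoothed IDF weights from pre-tokenized documents and saves them with pickle, which the `.pkl` default path suggests but does not confirm. It uses plain lists in place of pd.Series and only the standard library; `fit_idf`, the sample documents, and the temporary path are hypothetical and not part of this module.

```python
import math
import os
import pickle
import tempfile
from collections import Counter


def fit_idf(docs):
    """Compute smoothed IDF weights from a list of token lists (hypothetical helper)."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter(term for tokens in docs for term in set(tokens))
    # Smoothed IDF, ln((1 + n) / (1 + df)) + 1, so that no term gets a zero weight.
    return {
        term: math.log((1 + n_docs) / (1 + count)) + 1.0
        for term, count in doc_freq.items()
    }


# Illustrative, already-tokenized documents (assumed input shape).
docs = [["cheap", "flights"], ["cheap", "hotels"], ["flights", "to", "rome"]]
idf = fit_idf(docs)

# Persist the fitted weights and load them back, standing in for the `store` argument.
store = os.path.join(tempfile.mkdtemp(), "tfidf.pkl")
with open(store, "wb") as fh:
    pickle.dump(idf, fh)
with open(store, "rb") as fh:
    restored = pickle.load(fh)
```

Terms that occur in fewer documents receive a larger weight, so here "rome" is weighted higher than "cheap".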

fitted = False
transform(text_in_tokens: pandas.core.series.Series, store: Optional[str] = None)[source]

Transform a series of preprocessed tokens to tfidf embeddings.

Parameters
  • text_in_tokens (pd.Series) – Series of preprocessed tokens

  • store (Optional[str]) – Path to store the tfidf embeddings to

Returns

tf_idf_vec – Array containing tfidf embeddings

Return type

np.array

vectorizer = None
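To make the transform step more concrete, here is a minimal standard-library sketch of how one token list can be mapped to a TF-IDF vector once a vocabulary and IDF weights are known. It is not the implementation behind this class, which the reference does not describe. The function `tfidf_vector`, the vocabulary, and the IDF values are hypothetical, and the L2 normalisation is an assumption.

```python
import math
from collections import Counter


def tfidf_vector(tokens, vocabulary, idf):
    """Map one token list to an L2-normalised TF-IDF vector (hypothetical helper)."""
    counts = Counter(tokens)
    # Raw weight per vocabulary term: term count times IDF.
    # Tokens outside the vocabulary are ignored.
    raw = [counts[term] * idf[term] for term in vocabulary]
    norm = math.sqrt(sum(w * w for w in raw))
    # Leave an all-zero vector untouched to avoid dividing by zero.
    return [w / norm for w in raw] if norm > 0 else raw


# Illustrative vocabulary and weights, as if produced by an earlier fit step.
vocabulary = ["cheap", "flights", "rome"]
idf = {"cheap": 1.0, "flights": 1.0, "rome": 2.0}

vec = tfidf_vector(["cheap", "flights", "rome", "rome"], vocabulary, idf)
unseen = tfidf_vector(["paris"], vocabulary, idf)
```

Each vector has one entry per vocabulary term, so stacking the vectors for a whole series of documents gives a two-dimensional array with one row per document.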