Training and Evaluation
Training and Evaluation of models.
- class src.models.training.Evaluation(previous_results: str = 'data/results/results.pkl')[source]
Bases: object
A class to perform model evaluations.
- previous_results
Path to previously stored results
- Type
str
Methods:
- __call__(X_y_train: pd.DataFrame, X_test: pd.DataFrame, qrels: pd.DataFrame, k: int = 50,
components_pca: int = 0, model=GaussianNB(), pairwise_model=None, pairwise_top_k: int = 50, pairwise_train: bool = True, name: str = None, save_result: bool = True):
Runs a model evaluation with the given data and settings.
- hyperparameter_optimization(model, search_space, X_y_train: pd.DataFrame, X_test: pd.DataFrame,
X_val: pd.DataFrame, qrels: pd.DataFrame, qrels_val: pd.DataFrame, k: int = 50, components_pca: int = 0, pairwise_model=None, pairwise_top_k: int = 50, pairwise_train: bool = True, trials: int = 50, name: str = None, save_result: bool = True):
Performs hyperparameter optimization.
- feature_selection(model, X_y_train: pd.DataFrame, X_test: pd.DataFrame,
qrels: pd.DataFrame, k: int = 50, components_pca: int = 0, save_results: bool = True, name: str = None):
Performs feature selection.
- compute_metrics(model, X: pd.DataFrame, y, X_test, test_pair, qrels: pd.DataFrame, k: int = 50,
components_pca: int = 0, pairwise_model=None, pairwise_top_k: int = 50, pairwise_train: bool = True, name: str = None, save_result: bool = False):
Calculates metrics and saves them in a dataframe locally.
- calculate_ranks(results: pd.DataFrame):
Returns relevant documents with their corresponding rank
- average_precision_score(results: pd.DataFrame):
Calculates Average Precision
- mean_average_precision_score(results: pd.DataFrame):
Calculates Mean Average Precision for a set of queries
- metrics(results: pd.DataFrame, k: int = None):
Calculates accuracy, precision, recall and f1 globally and in the top-k area
- normalized_discounted_cumulative_gain(results: pd.DataFrame):
Calculates Normalized Discounted Cumulative Gain
- mean_normalized_discounted_cumulative_gain_score(results: pd.DataFrame):
Calculates Mean Normalized Discounted Cumulative Gain
- mean_reciprocal_rank(results: pd.DataFrame, threshold: int = 3):
Calculates Mean Reciprocal Rank
- average_precision_score(results: pandas.core.frame.DataFrame)[source]
Calculates average precision score.
- Parameters
results (pd.DataFrame) –
- Return type
AP (float)
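The column layout expected in `results` is not documented here, so the following is only a minimal sketch of the standard average-precision definition for a single query. It works on a plain list of binary relevance labels in ranked order; the name `average_precision` and the list input are assumptions made for illustration, not the signature of the method above.

```python
from typing import Sequence


def average_precision(relevance: Sequence[int]) -> float:
    """AP for one ranked list (hypothetical helper, not the library method).

    relevance[i] is 1 if the item at rank i + 1 is relevant, else 0.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    # Assumption: normalise by the relevant items present in the list.
    return precision_sum / hits if hits else 0.0
```

Note that some definitions divide by the total number of relevant documents for the query, including those never retrieved; which variant the method uses is not stated in this page.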
- calculate_ranks(results: pandas.core.frame.DataFrame)[source]
Calculates ranks.
- Parameters
results (pd.DataFrame) –
- Return type
ranks (pd.DataFrame)
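As a rough illustration of what "relevant documents with their corresponding rank" could look like, the sketch below ranks documents per query by score and keeps only the relevant ones. The column names `query_id`, `doc_id`, `score` and `relevant`, as well as the helper name `ranks_of_relevant`, are hypothetical; the actual schema of `results` is not given in this page.

```python
import pandas as pd


def ranks_of_relevant(results: pd.DataFrame) -> pd.DataFrame:
    """Rank documents within each query and return the relevant ones.

    Assumes hypothetical columns: query_id, doc_id, score, relevant.
    """
    ranked = results.copy()
    # Rank 1 is the highest score within each query; ties keep input order.
    ranked["rank"] = (
        ranked.groupby("query_id")["score"]
        .rank(method="first", ascending=False)
        .astype(int)
    )
    relevant = ranked[ranked["relevant"] == 1]
    return (
        relevant[["query_id", "doc_id", "rank"]]
        .sort_values(["query_id", "rank"])
        .reset_index(drop=True)
    )


# Small made-up example input.
example = pd.DataFrame(
    {
        "query_id": ["q1", "q1", "q1", "q2", "q2"],
        "doc_id": ["a", "b", "c", "d", "e"],
        "score": [0.2, 0.9, 0.5, 0.1, 0.7],
        "relevant": [1, 0, 1, 1, 0],
    }
)
example_ranks = ranks_of_relevant(example)
```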
- compute_metrics(model, X: pandas.core.frame.DataFrame, y, X_test, test_pair, qrels: pandas.core.frame.DataFrame, k: int = 50, components_pca: int = 0, pairwise_model=None, pairwise_top_k: int = 50, pairwise_train: bool = True, name: Optional[str] = None, save_result: bool = False)[source]
Computes metrics.
- Parameters
model –
X (pd.DataFrame) –
y (pd.Series) –
X_test (pd.DataFrame) –
test_pair (pd.DataFrame) –
qrels (pd.DataFrame) –
k (int) –
components_pca (int) –
pairwise_model (str) –
pairwise_top_k (int) –
pairwise_train (Boolean) –
name (str) –
save_result (Boolean) –
- Return type
MRR (float)
- feature_selection(model, X_y_train: pandas.core.frame.DataFrame, X_test: pandas.core.frame.DataFrame, qrels: pandas.core.frame.DataFrame, k: int = 50, components_pca: int = 0, save_results: bool = True, name: Optional[str] = None)[source]
Performs feature selection.
- Parameters
model –
X_y_train (pd.DataFrame) –
X_test (pd.DataFrame) –
qrels (pd.DataFrame) –
k (int) –
components_pca (int) –
name (str) –
save_results (Boolean) –
- Return type
Selected Features (list)
- hyperparameter_optimization(model, search_space, X_y_train: pandas.core.frame.DataFrame, X_test: pandas.core.frame.DataFrame, X_val: pandas.core.frame.DataFrame, qrels: pandas.core.frame.DataFrame, qrels_val: pandas.core.frame.DataFrame, k: int = 50, components_pca: int = 0, pairwise_model=None, pairwise_top_k: int = 50, pairwise_train: bool = True, trials: int = 50, name: Optional[str] = None, save_result: bool = True)[source]
Performs hyperparameter optimization.
- Parameters
model –
search_space –
X_y_train (pd.DataFrame) –
X_test (pd.DataFrame) –
X_val (pd.DataFrame) –
qrels (pd.DataFrame) –
qrels_val (pd.DataFrame) –
k (int) –
components_pca (int) –
pairwise_model (str) –
pairwise_top_k (int) –
pairwise_train (Boolean) –
trials (int) –
name (str) –
save_result (Boolean) –
- Returns
MRR and nDCG
- Return type
tuple (float)
- mean_average_precision_score(results: pandas.core.frame.DataFrame)[source]
Calculates mean average precision score.
- Parameters
results (pd.DataFrame) –
- Return type
MAP (float)
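Mean average precision is conventionally the arithmetic mean of the per-query average precision. The sketch below shows that relationship on plain Python data; the dictionary input and both function names are assumptions for illustration and do not reflect how `results` is actually structured.

```python
from typing import Dict, Sequence


def average_precision(relevance: Sequence[int]) -> float:
    """AP for one ranked list of binary relevance labels (hypothetical helper)."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(per_query: Dict[str, Sequence[int]]) -> float:
    """Mean of the per-query AP values; 0.0 when there are no queries."""
    if not per_query:
        return 0.0
    scores = [average_precision(labels) for labels in per_query.values()]
    return sum(scores) / len(scores)
```

For example, a query whose only relevant document is ranked first contributes 1.0, and one whose only relevant document is ranked second contributes 0.5, giving a MAP of 0.75 over the two.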
- mean_normalized_discounted_cumulative_gain_score(results: pandas.core.frame.DataFrame)[source]
Calculates mean normalized discounted cumulative gain score.
- Parameters
results (pd.DataFrame) –
- Return type
Mean of nDCG (float)
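For orientation, here is a minimal sketch of one common nDCG formulation (linear gain, log2 rank discount) and its mean over queries. Which gain and discount the method above uses is not stated here, so treat the formula choice, the list inputs and the names `dcg`, `ndcg` and `mean_ndcg` as assumptions.

```python
import math
from typing import Sequence


def dcg(gains: Sequence[float]) -> float:
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains: Sequence[float]) -> float:
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


def mean_ndcg(per_query_gains: Sequence[Sequence[float]]) -> float:
    """Arithmetic mean of nDCG over queries; 0.0 for an empty input."""
    if not per_query_gains:
        return 0.0
    return sum(ndcg(g) for g in per_query_gains) / len(per_query_gains)
```

A list that is already in ideal order scores 1.0, and a query with no relevant items is scored 0.0 in this sketch rather than being skipped.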
- mean_reciprocal_rank(results: pandas.core.frame.DataFrame, threshold: int = 3)[source]
Calculates mean reciprocal rank.
- Parameters
results (pd.DataFrame) –
threshold (int) –
- Return type
MRR (float)
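The usual definition of mean reciprocal rank averages 1 / rank of the first relevant item over all queries. The sketch below implements that textbook definition on lists of binary labels; it deliberately leaves out the `threshold` parameter, whose role is not described in this page, and the names `reciprocal_rank` and `mrr_sketch` are hypothetical.

```python
from typing import Sequence


def reciprocal_rank(relevance: Sequence[int]) -> float:
    """1 / rank of the first relevant item, or 0.0 if none is relevant."""
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def mrr_sketch(rankings: Sequence[Sequence[int]]) -> float:
    """Mean of the reciprocal ranks over all queries; 0.0 for no queries."""
    if not rankings:
        return 0.0
    return sum(reciprocal_rank(r) for r in rankings) / len(rankings)
```

Queries with no relevant item contribute 0.0 here, which lowers the mean; other implementations drop such queries instead.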
- metrics(results: pandas.core.frame.DataFrame, k: Optional[int] = None)[source]
Calculates metrics (accuracy, precision, recall, f1).
- Parameters
results (pd.DataFrame) –
k (int) –
- Returns
accuracy (float): accuracy score of the model
precision (float): precision score of the model
recall (float): recall score of the model
f_score (float): F-score of the model
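To make the four scores concrete, the following sketch computes accuracy, precision, recall and F1 from binary labels, optionally restricted to the first `k` entries. It assumes the two lists are already ordered by rank and that "top-k" simply means the first `k` items; both are assumptions, as is the helper name `binary_metrics`.

```python
from typing import Optional, Sequence, Tuple


def binary_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], k: Optional[int] = None
) -> Tuple[float, float, float, float]:
    """Return (accuracy, precision, recall, f1) for binary labels.

    If k is given, only the first k (true, predicted) pairs are scored.
    """
    pairs = list(zip(y_true, y_pred))
    if k is not None:
        pairs = pairs[:k]
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return accuracy, precision, recall, f1
```

Calling `binary_metrics(y_true, y_pred)` gives the global scores, while `binary_metrics(y_true, y_pred, k=50)` gives the same scores over the top 50 positions only.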