unitxt.evaluate_cli module
- unitxt.evaluate_cli.cli_load_dataset(args: Namespace) → Dataset
Loads the dataset based on command line arguments.
- Parameters:
args (argparse.Namespace) – Parsed command-line arguments.
- Returns:
The loaded dataset.
- Return type:
HFDataset
- Raises:
UnitxtArtifactNotFoundError – If the specified card or template artifact is not found.
FileNotFoundError – If a specified file (e.g., in a local card path) is not found.
AttributeError – If there’s an issue accessing attributes during loading.
ValueError – If there’s a value-related error during loading (e.g., parsing).
- unitxt.evaluate_cli.configure_unitxt_settings(args: Namespace)
Configures unitxt settings and returns a context manager.
- Parameters:
args (argparse.Namespace) – Parsed command-line arguments.
- Returns:
A context manager for applying unitxt settings.
- Return type:
ContextManager
- unitxt.evaluate_cli.initialize_inference_engine(args: Namespace, model_args_dict: Dict[str, Any], chat_kwargs_dict: Dict[str, Any]) → InferenceEngine
Initializes the appropriate inference engine based on arguments.
- Parameters:
args (argparse.Namespace) – Parsed command-line arguments.
model_args_dict (Dict[str, Any]) – Processed model arguments.
chat_kwargs_dict (Dict[str, Any]) – Processed chat arguments.
- Returns:
The initialized inference engine instance.
- Return type:
InferenceEngine
- Raises:
SystemExit – If required dependencies are missing for the selected model type.
ValueError – If required keys are missing in model_args for the selected model type.
- unitxt.evaluate_cli.prepare_kwargs(kwargs: dict) → Dict[str, Any]
Prepares the model arguments dictionary.
- Parameters:
kwargs (dict) – Parsed command-line arguments.
- Returns:
The processed model arguments dictionary.
- Return type:
Dict[str, Any]
- unitxt.evaluate_cli.prepare_output_paths(output_path: str, prefix: str) → Tuple[str, str]
Creates output directory and defines file paths.
- Parameters:
output_path (str) – The directory where output files will be saved.
prefix (str) – The prefix for the output file names.
- Returns:
A tuple containing the path for the results summary file and the path for the detailed samples file.
- Return type:
Tuple[str, str]
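The behaviour described above (create the directory, then derive two file paths from the prefix) can be sketched as follows. This is an illustrative stand-in, not the unitxt implementation: the function name `prepare_output_paths_sketch` and the file-name scheme (`<prefix>.json` and `<prefix>_samples.json`) are assumptions made for the example.

```python
import os
from typing import Tuple


def prepare_output_paths_sketch(output_path: str, prefix: str) -> Tuple[str, str]:
    """Hypothetical stand-in for prepare_output_paths (not the library code)."""
    # Create the output directory, including parents, if it does not exist yet.
    os.makedirs(output_path, exist_ok=True)
    # Assumed naming scheme; the real file names may differ.
    results_path = os.path.join(output_path, f"{prefix}.json")
    samples_path = os.path.join(output_path, f"{prefix}_samples.json")
    return results_path, samples_path
```

The two returned paths are in the order the documentation gives: summary results first, detailed samples second.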
- unitxt.evaluate_cli.prepend_timestamp_to_path(original_path, timestamp)
Takes a path string and a timestamp string, prepends the timestamp to the filename part of the path, and returns the new path string.
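A minimal sketch of this kind of path manipulation, assuming the timestamp and file name are joined with an underscore (the separator is not stated in the description, and `prepend_timestamp_sketch` is a hypothetical name, not the library function):

```python
import os


def prepend_timestamp_sketch(original_path: str, timestamp: str) -> str:
    """Hypothetical stand-in: put the timestamp in front of the file name only."""
    # Split into the directory part and the file-name part.
    directory, filename = os.path.split(original_path)
    # Assumed separator "_"; the directory part is left unchanged.
    return os.path.join(directory, f"{timestamp}_{filename}")
```

With this sketch, a path whose file name is `results.json` and a timestamp of `2024-01-01` yields a path in the same directory whose file name is `2024-01-01_results.json`.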
- unitxt.evaluate_cli.process_and_save_results(args: Namespace, evaluation_results: EvaluationResults, results_path: str, samples_path: str) → None
Processes, prints, and saves the evaluation results.
- Parameters:
args (argparse.Namespace) – Parsed command-line arguments.
evaluation_results (EvaluationResults) – The list of evaluated instances.
results_path (str) – Path to save the summary results JSON file.
samples_path (str) – Path to save the detailed samples JSON file.
- Raises:
Exception – If an error occurs during result processing or saving (re-raised).
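The "log, then re-raise" behaviour noted under Raises can be illustrated with a small sketch. Everything here is an assumption for illustration: the name `save_results_sketch`, the plain dict and list inputs (standing in for an EvaluationResults object), and the JSON layout are not taken from the unitxt source.

```python
import json
import logging
from typing import Any, Dict, List

logger = logging.getLogger(__name__)


def save_results_sketch(
    summary: Dict[str, Any],
    samples: List[Dict[str, Any]],
    results_path: str,
    samples_path: str,
) -> None:
    """Hypothetical stand-in: write two JSON files, re-raising on failure."""
    try:
        # Summary scores go to one file, per-instance details to the other.
        with open(results_path, "w", encoding="utf-8") as f:
            json.dump(summary, f, indent=2)
        with open(samples_path, "w", encoding="utf-8") as f:
            json.dump(samples, f, indent=2)
    except Exception:
        # Record the failure with its traceback, then let the caller see it.
        logger.exception("Failed to save evaluation results")
        raise
```

Re-raising keeps the original exception type and traceback, so a caller can still distinguish, for example, a missing directory from a serialization error.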
- unitxt.evaluate_cli.run_evaluation(predictions: List[Any], dataset: Dataset) → EvaluationResults
Runs evaluation on the predictions.
- Parameters:
predictions (List[Any]) – The list of predictions from the model.
dataset (HFDataset) – The dataset containing references and other data.
- Returns:
The evaluated dataset (list of instances with scores).
- Return type:
EvaluationResults
- Raises:
RuntimeError – If evaluation returns no results or an unexpected type.
Exception – If any other error occurs during evaluation.
- unitxt.evaluate_cli.run_inference(engine: InferenceEngine, dataset: Dataset) → List[Any]
Runs inference using the initialized engine.
- Parameters:
engine (InferenceEngine) – The inference engine instance.
dataset (HFDataset) – The dataset to run inference on.
- Returns:
A list of predictions.
- Return type:
List[Any]
- Raises:
Exception – If an error occurs during inference.