📄 End To End

This task corresponds to an end-to-end RAG evaluation. It assumes that the user provides a question and that the RAG system returns an answer together with a set of retrieved contexts (documents or passages). For details on RAG, see: https://www.unitxt.ai/en/latest/docs/rag_support.html.

tasks.rag.end_to_end

Task(
    input_fields={
        "question": "Union[str, Dialog]",
        "question_id": "Any",
        "metadata_tags": "Dict[str, str]",
    },
    reference_fields={
        "reference_answers": "List[str]",
        "reference_contexts": "List[str]",
        "reference_context_ids": "Union[List[int], List[str]]",
        "is_answerable_label": "bool",
    },
    metrics=[
        "metrics.rag.end_to_end.answer_correctness",
        "metrics.rag.end_to_end.answer_faithfulness",
        "metrics.rag.end_to_end.answer_reward",
        "metrics.rag.end_to_end.context_correctness",
        "metrics.rag.end_to_end.context_relevance",
    ],
    prediction_type="RagResponse",
    augmentable_inputs=[
        "question",
    ],
    defaults={
        "question_id": "",
        "metadata_tags": {},
        "reference_answers": [],
        "reference_contexts": [],
        "reference_context_ids": [],
        "is_answerable_label": True,
    },
    default_template="templates.rag.end_to_end.json_predictions",
)
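
To make the schema concrete, here is a minimal sketch of one instance and one prediction shaped to fit this task. All values are made up for illustration. The keys of the prediction (`answer`, `contexts`, `context_ids`, `is_answerable`) are an assumption about the `RagResponse` type, which this page names but does not define.

```python
# Field names copied from the task definition above.
INPUT_FIELDS = {"question", "question_id", "metadata_tags"}
REFERENCE_FIELDS = {
    "reference_answers",
    "reference_contexts",
    "reference_context_ids",
    "is_answerable_label",
}

# A hypothetical instance; the values are illustrative only.
instance = {
    "question": "What is the capital of France?",
    "question_id": "q-001",
    "metadata_tags": {"domain": "geography"},
    "reference_answers": ["Paris"],
    "reference_contexts": ["Paris is the capital and largest city of France."],
    "reference_context_ids": ["doc-17"],
    "is_answerable_label": True,
}

# A hypothetical system output. These keys are an assumption about RagResponse.
prediction = {
    "answer": "Paris",
    "contexts": ["Paris is the capital and largest city of France."],
    "context_ids": ["doc-17"],
    "is_answerable": True,
}

# Every instance key should belong to one of the two roles.
unknown_keys = set(instance) - INPUT_FIELDS - REFERENCE_FIELDS
print(sorted(unknown_keys))
```

Only `question` has no entry in `defaults`, so a real instance could omit every other field shown here.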

Explanation about Task

Task packs the different instance fields into dictionaries by their roles in the task.

Args:

input_fields (Union[Dict[str, str], List[str]]):
Dictionary with string names of instance input fields and types of respective values. In case a list is passed, each type will be assumed to be Any.

reference_fields (Union[Dict[str, str], List[str]]):
Dictionary with string names of instance output fields and types of respective values. In case a list is passed, each type will be assumed to be Any.

metrics (List[str]):
List of names of metrics to be used in the task.

prediction_type (Optional[str]):
Needs to be consistent with all the metrics used. Defaults to None, which means that it will be set to Any.

defaults (Optional[Dict[str, Any]]):
An optional dictionary with default values for chosen input/reference keys. Needs to be consistent with the names and types provided in the ‘input_fields’ and/or ‘reference_fields’ arguments. Will not overwrite values that are already provided in a given instance.
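
The rule that defaults never overwrite provided values can be sketched in plain Python. This is an illustration of the described behavior, not unitxt's actual implementation; `apply_defaults` is a hypothetical helper, and the `DEFAULTS` dictionary is copied from the task definition on this page.

```python
import copy
from typing import Any, Dict


def apply_defaults(instance: Dict[str, Any], defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `instance` with missing keys filled in from `defaults`."""
    filled = dict(instance)
    for name, value in defaults.items():
        if name not in filled:
            # Deep-copy so instances never share a mutable default ({} or []).
            filled[name] = copy.deepcopy(value)
    return filled


# Defaults as listed in the task definition.
DEFAULTS = {
    "question_id": "",
    "metadata_tags": {},
    "reference_answers": [],
    "reference_contexts": [],
    "reference_context_ids": [],
    "is_answerable_label": True,
}

# An instance that supplies only the question and one reference answer.
partial = {
    "question": "Who wrote Hamlet?",
    "reference_answers": ["William Shakespeare"],
}

filled = apply_defaults(partial, DEFAULTS)
print(filled["reference_answers"])    # the provided value is kept
print(filled["is_answerable_label"])  # the missing value comes from the defaults
```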

The output instance contains three fields:

“input_fields” – a sub-dictionary of the input instance, consisting of all the fields listed in Arg ‘input_fields’.

“reference_fields” – a sub-dictionary of the input instance, consisting of all the fields listed in Arg ‘reference_fields’.

“metrics” – the value of Arg ‘metrics’.
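
The packing described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: `pack_by_role` is a hypothetical helper, it skips type checking, and it assumes every listed field is already present in the instance (for example, after defaults have been applied).

```python
from typing import Any, Dict, List


def pack_by_role(
    instance: Dict[str, Any],
    input_fields: List[str],
    reference_fields: List[str],
    metrics: List[str],
) -> Dict[str, Any]:
    """Group a flat instance into the three output fields described above."""
    return {
        "input_fields": {name: instance[name] for name in input_fields},
        "reference_fields": {name: instance[name] for name in reference_fields},
        "metrics": list(metrics),
    }


# A small flat instance using a subset of this task's fields for brevity.
flat = {
    "question": "Who wrote Hamlet?",
    "question_id": "q-002",
    "reference_answers": ["William Shakespeare"],
    "is_answerable_label": True,
}

packed = pack_by_role(
    flat,
    input_fields=["question", "question_id"],
    reference_fields=["reference_answers", "is_answerable_label"],
    metrics=["metrics.rag.end_to_end.answer_correctness"],
)
print(sorted(packed))
```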

References: metrics.rag.end_to_end.context_correctness, metrics.rag.end_to_end.answer_faithfulness, metrics.rag.end_to_end.answer_correctness, templates.rag.end_to_end.json_predictions, metrics.rag.end_to_end.context_relevance, metrics.rag.end_to_end.answer_reward