📄 Supervised

A task for testing tool-calling capabilities. It assumes the model is given a query and is asked to invoke a single tool from the list of provided tools.

reference_calls is a list of ground-truth tool calls against which the predicted call is compared.
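As an illustration of the data this task expects, the sketch below builds one instance with the three fields named in the task (query, tools, reference_calls) and checks a predicted call against the references. The layout of a Tool (name, description, parameters) and of a ToolCall (name, arguments) is an assumption made for this example, the sample query and tools are invented, and matches_any_reference is a hypothetical exact-match helper, not the logic of metrics.tool_calling.

```python
from typing import Any, Dict, List

# Hypothetical stand-ins for the Tool and ToolCall types. The keys used below
# (name, description, parameters, arguments) are assumptions for illustration.
Tool = Dict[str, Any]
ToolCall = Dict[str, Any]

# One illustrative instance carrying the task's input and reference fields.
instance = {
    "query": "What is the weather in Paris?",
    "tools": [
        {
            "name": "get_weather",
            "description": "Return the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
        {
            "name": "get_time",
            "description": "Return the current time in a time zone.",
            "parameters": {
                "type": "object",
                "properties": {"timezone": {"type": "string"}},
                "required": ["timezone"],
            },
        },
    ],
    "reference_calls": [
        {"name": "get_weather", "arguments": {"city": "Paris"}},
    ],
}


def matches_any_reference(prediction: ToolCall, references: List[ToolCall]) -> bool:
    """Simplified exact-match check (an assumption, not the real metric)."""
    return any(
        prediction.get("name") == ref.get("name")
        and prediction.get("arguments") == ref.get("arguments")
        for ref in references
    )


# A single predicted tool call, as the task requests from the model.
prediction = {"name": "get_weather", "arguments": {"city": "Paris"}}
is_correct = matches_any_reference(prediction, instance["reference_calls"])
```

Because reference_calls is a list, a prediction can be accepted if it agrees with any one of several acceptable ground-truth calls; the actual metric may score name and arguments separately.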

tasks.tool_calling.supervised

Task(
    input_fields={
        "query": "str",
        "tools": "List[Tool]",
    },
    reference_fields={
        "reference_calls": "List[ToolCall]",
    },
    prediction_type="ToolCall",
    metrics=[
        "metrics.tool_calling",
    ],
    default_template="templates.tool_calling.base",
    requirements=[
        "jsonschema-rs",
    ],
)

Explanation about Task

Task packs the different instance fields into dictionaries by their roles in the task.

Args:

input_fields (Union[Dict[str, str], List[str]]):
Dictionary with string names of instance input fields and types of respective values. In case a list is passed, each type will be assumed to be Any.

reference_fields (Union[Dict[str, str], List[str]]):
Dictionary with string names of instance output fields and types of respective values. In case a list is passed, each type will be assumed to be Any.

metrics (List[str]):
List of names of metrics to be used in the task.

prediction_type (Optional[str]):
Needs to be consistent with all metrics used. Defaults to None, in which case it is set to Any.

defaults (Optional[Dict[str, Any]]):
An optional dictionary with default values for chosen input/reference keys. Needs to be consistent with the names and types provided in the ‘input_fields’ and/or ‘reference_fields’ arguments. Will not overwrite values that are already provided in a given instance.

The output instance contains three fields:

“input_fields” whose value is a sub-dictionary of the input instance, consisting of all the fields listed in Arg ‘input_fields’.

“reference_fields” – likewise, a sub-dictionary consisting of the fields listed in Arg ‘reference_fields’.

“metrics” – contains the value of Arg ‘metrics’.
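The packing behavior described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: normalize_fields and pack_instance are hypothetical helper names, type verification is omitted, and the sample instance is invented.

```python
from typing import Any, Dict, List, Optional, Union

FieldSpec = Union[Dict[str, str], List[str]]


def normalize_fields(fields: FieldSpec) -> Dict[str, str]:
    """Turn a list of field names into a dict that maps each name to 'Any'."""
    if isinstance(fields, dict):
        return dict(fields)
    return {name: "Any" for name in fields}


def pack_instance(
    instance: Dict[str, Any],
    input_fields: FieldSpec,
    reference_fields: FieldSpec,
    metrics: List[str],
    defaults: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical sketch: group instance fields by their role in the task."""
    inputs = normalize_fields(input_fields)
    references = normalize_fields(reference_fields)
    # Defaults only fill in missing keys; values present in the instance win.
    merged = {**(defaults or {}), **instance}
    return {
        "input_fields": {name: merged[name] for name in inputs},
        "reference_fields": {name: merged[name] for name in references},
        "metrics": list(metrics),
    }


packed = pack_instance(
    {
        "query": "What is the weather in Paris?",
        "reference_calls": [{"name": "get_weather", "arguments": {"city": "Paris"}}],
        "extra": 1,  # not listed in any role, so it is left out of the output
    },
    input_fields={"query": "str", "tools": "List[Tool]"},
    reference_fields=["reference_calls"],  # list form: types assumed to be Any
    metrics=["metrics.tool_calling"],
    defaults={"tools": [], "query": "ignored default"},
)
```

In this sketch the default for tools is used because the instance lacks that key, while the default for query is ignored because the instance already supplies a value.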

References: templates.tool_calling.base, metrics.tool_calling