📄 Squad

metrics.squad

MetricPipeline(
    main_score="f1",
    preprocess_steps=[
        AddID(),
        Set(
            use_deepcopy=True,
            fields={
                "prediction_template": {
                    "prediction_text": "PRED",
                    "id": "ID",
                },
                "reference_template": {
                    "answers": {
                        "answer_start": [
                            -1,
                        ],
                        "text": "REF",
                    },
                    "id": "ID",
                },
            },
        ),
        Copy(
            field_to_field=[
                ["references", "reference_template/answers/text"],
                ["prediction", "prediction_template/prediction_text"],
                ["id", "prediction_template/id"],
                ["id", "reference_template/id"],
            ],
        ),
        Copy(
            field_to_field=[
                ["reference_template", "references"],
                ["prediction_template", "prediction"],
            ],
        ),
    ],
    metric=Squad(),
)

from unitxt.metrics import MetricPipeline, Squad
from unitxt.operators import AddID, Copy, Set
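
To make the effect of the preprocessing steps concrete, the following plain-Python sketch mimics the Set step and the two Copy steps of the pipeline above on a single instance. It does not import unitxt; the helper name `to_squad_format`, the sample instance, and its id value are hypothetical (in the pipeline the id is generated by AddID).

```python
import copy

# Templates as declared in the Set step of the pipeline above.
PREDICTION_TEMPLATE = {"prediction_text": "PRED", "id": "ID"}
REFERENCE_TEMPLATE = {"answers": {"answer_start": [-1], "text": "REF"}, "id": "ID"}


def to_squad_format(instance):
    """Hypothetical helper mimicking the Set and Copy steps on one instance."""
    out = dict(instance)
    # Set(use_deepcopy=True): every instance receives its own copy of the templates.
    out["prediction_template"] = copy.deepcopy(PREDICTION_TEMPLATE)
    out["reference_template"] = copy.deepcopy(REFERENCE_TEMPLATE)
    # First Copy: fill the placeholders from the instance fields.
    out["reference_template"]["answers"]["text"] = out["references"]
    out["prediction_template"]["prediction_text"] = out["prediction"]
    out["prediction_template"]["id"] = out["id"]
    out["reference_template"]["id"] = out["id"]
    # Second Copy: overwrite the original fields with the filled templates.
    out["references"] = out["reference_template"]
    out["prediction"] = out["prediction_template"]
    return out


sample = {"id": "abc123", "prediction": "Paris", "references": ["Paris", "paris"]}
converted = to_squad_format(sample)
print(converted["prediction"])  # {'prediction_text': 'Paris', 'id': 'abc123'}
print(converted["references"])
```

After these steps, "prediction" and "references" hold dictionaries with the keys declared in the templates, and the shared templates themselves are left unchanged because each instance works on a deep copy.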

Explanation about Set

Sets specified fields in each instance, in a given stream or in all streams (default), to specified values. Fields that already exist are updated; fields that do not exist are added.

Args:

fields (Dict[str, object]): The fields to add to each instance. Use "/" to access inner fields.

use_deepcopy (bool): Deep copy the input value, so that later modifications of the original object do not affect the instances.

Examples:

# Set field "classes" of every instance in all streams to a list consisting of "positive" and "negative"
Set(fields={"classes": ["positive", "negative"]})

# In every instance of all streams, make field "span" a dictionary containing a field "start" whose value is 0
Set(fields={"span/start": 0})

# Only in instances of stream "train", set field "classes" to a list consisting of "positive" and "negative"
Set(fields={"classes": ["positive", "negative"]}, apply_to_streams=["train"])

# Set field "classes" to the value of a given list, so that later changes to the original list do not change the instances
Set(fields={"classes": alist}, use_deepcopy=True)
# If alist is modified afterwards, the instances remain intact.
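
The path syntax and the effect of use_deepcopy can be illustrated with a small plain-Python stand-in. This is only a sketch: `set_fields` is a hypothetical helper, not the unitxt implementation, it handles dictionary paths only, and it assumes that values are stored by reference when use_deepcopy is off.

```python
import copy


def set_fields(instance, fields, use_deepcopy=False):
    """Hypothetical stand-in for Set: write each value at its '/'-separated path."""
    for path, value in fields.items():
        if use_deepcopy:
            value = copy.deepcopy(value)
        *parents, leaf = path.split("/")
        target = instance
        for key in parents:
            # Create intermediate dictionaries when they are missing.
            target = target.setdefault(key, {})
        target[leaf] = value
    return instance


alist = ["positive", "negative"]
shared = set_fields({}, {"classes": alist})
isolated = set_fields({}, {"classes": alist, "span/start": 0}, use_deepcopy=True)

alist.append("neutral")  # modify the original list afterwards

print(shared["classes"])    # ['positive', 'negative', 'neutral']
print(isolated["classes"])  # ['positive', 'negative']
print(isolated["span"])     # {'start': 0}
```

The instance built without the deep copy sees the later change to alist, while the instance built with use_deepcopy=True does not.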

Explanation about Copy

Copies values from specified fields to specified fields.

Args (of parent class):

field_to_field (Union[List[List], Dict[str, str]]): A list of lists, where each sublist contains the source field and the destination field, or a dictionary mapping source fields to destination fields.

Examples:

An input instance {"a": 2, "b": 3}, when processed by Copy(field_to_field={"a": "b"}), would yield {"a": 2, "b": 2}, and when processed by Copy(field_to_field={"a": "c"}) would yield {"a": 2, "b": 3, "c": 2}.

With field names containing "/", we can also copy from inside a field: Copy(field="a/0", to_field="a") would turn the instance {"a": [1, 3]} into {"a": 1}.
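
The examples above can be reproduced with a short plain-Python emulation. The helpers `get_path`, `set_path`, and `copy_fields` are hypothetical names for this sketch and are not part of unitxt; the sketch assumes that pairs are processed in order and that the copied value is a deep copy of the source.

```python
import copy


def get_path(instance, path):
    """Follow a '/'-separated path; components index into lists when the value is a list."""
    value = instance
    for key in path.split("/"):
        value = value[int(key)] if isinstance(value, list) else value[key]
    return value


def set_path(instance, path, value):
    """Write value at a '/'-separated path, creating missing dictionaries."""
    *parents, leaf = path.split("/")
    target = instance
    for key in parents:
        target = target.setdefault(key, {})
    target[leaf] = value


def copy_fields(instance, field_to_field):
    """Hypothetical stand-in for Copy; accepts a dict or a list of [source, destination] pairs."""
    if isinstance(field_to_field, dict):
        pairs = list(field_to_field.items())
    else:
        pairs = field_to_field
    for source, destination in pairs:
        # Assumption of this sketch: the destination receives a deep copy.
        set_path(instance, destination, copy.deepcopy(get_path(instance, source)))
    return instance


print(copy_fields({"a": 2, "b": 3}, {"a": "b"}))   # {'a': 2, 'b': 2}
print(copy_fields({"a": 2, "b": 3}, {"a": "c"}))   # {'a': 2, 'b': 3, 'c': 2}
print(copy_fields({"a": [1, 3]}, [["a/0", "a"]]))  # {'a': 1}
```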

Explanation about AddID

Stores a unique id value in the designated "id_field_name" field of the given instance.
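
As a rough illustration, the behavior can be sketched in plain Python. The helper `add_id` is hypothetical, and generating the value with uuid4 is an assumption of this sketch; the description above only states that the id is unique. The default field name "id" matches the field that the Copy steps of the pipeline read.

```python
import uuid


def add_id(instance, id_field_name="id"):
    """Hypothetical stand-in for AddID: store a fresh unique string in the id field."""
    # uuid4 is an assumption made for this sketch, not a documented detail.
    instance[id_field_name] = uuid.uuid4().hex
    return instance


instances = [add_id({"prediction": "Paris"}) for _ in range(3)]
ids = [inst["id"] for inst in instances]
print(len(set(ids)) == len(ids))  # True: every instance received a distinct id
```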

Read more about catalog usage here.