📄 Piqa

To apply eyeshadow without a brush, should I use a cotton swab or a toothpick? Questions requiring this kind of physical commonsense pose a challenge to state-of-the-art natural language understanding systems. PIQA (Physical Interaction: Question Answering) introduces the task of physical commonsense reasoning together with a corresponding benchmark dataset. Physical commonsense knowledge is a major challenge on the road to true AI-completeness, including robots that interact with the world and understand natural language. PIQA focuses on everyday situations with a preference for atypical solutions. The dataset is inspired by instructables.com, which provides users with instructions on how to build, craft, bake, or manipulate objects using everyday materials. The underlying task is formulated as multiple-choice question answering: given a question q and two possible solutions s1 and s2, exactly one of which is correct, a model or a human must choose the more appropriate solution. The dataset is further cleaned of basic artifacts using the AFLite algorithm, an improvement on adversarial filtering. The dataset contains 16,000 examples for training, 2,000 for development and 3,000 for testing.

Tags: annotations_creators:crowdsourced, arxiv:['1911.11641', '1907.10641', '1904.09728', '1808.05326'], croissant:True, language:en, language_creators:['crowdsourced', 'found'], license:unknown, multilinguality:monolingual, region:us, size_categories:10K<n<100K, source_datasets:original, task_categories:question-answering, task_ids:multiple-choice-qa

Note

ID: cards.piqa | Type: TaskCard

{
    "__description__": "To apply eyeshadow without a brush, should I use a cotton swab or a toothpick?\nQuestions requiring this kind of physical commonsense pose a challenge to state-of-the-art\nnatural language understanding systems. The PIQA dataset introduces the task of physical commonsense reasoning\nand a corresponding benchmark dataset Physical Interaction: Question Answering or PIQA.\nPhysical commonsense knowledge is a major challenge on the road to true AI-completeness,\nincluding robots that interact with the world and understand natural language.\nPIQA focuses on everyday situations with a preference for atypical solutions.\nThe dataset is inspired by instructables.com, which provides users with instructions on how to build, craft,\nbake, or manipulate objects using everyday materials.\nThe underlying task is formualted as multiple choice question answering:\ngiven a question `q` and two possible solutions `s1`, `s2`, a model or\na human must choose the most appropriate solution, of which exactly one is correct.\nThe dataset is further cleaned of basic artifacts using the AFLite algorithm which is an improvement of\nadversarial filtering. The dataset contains 16,000 examples for training, 2,000 for development and 3,000 for testing.",
    "__tags__": {
        "annotations_creators": "crowdsourced",
        "arxiv": [
            "1911.11641",
            "1907.10641",
            "1904.09728",
            "1808.05326"
        ],
        "croissant": true,
        "language": "en",
        "language_creators": [
            "crowdsourced",
            "found"
        ],
        "license": "unknown",
        "multilinguality": "monolingual",
        "region": "us",
        "size_categories": "10K<n<100K",
        "source_datasets": "original",
        "task_categories": "question-answering",
        "task_ids": "multiple-choice-qa"
    },
    "loader": {
        "path": "piqa",
        "type": "load_hf"
    },
    "preprocess_steps": [
        {
            "fields": [
                "sol1",
                "sol2"
            ],
            "to_field": "choices",
            "type": "list_field_values"
        },
        {
            "field_to_field": {
                "goal": "question",
                "label": "answer"
            },
            "type": "rename_fields"
        }
    ],
    "task": "tasks.qa.multiple_choice.open",
    "templates": "templates.qa.multiple_choice.open.all",
    "type": "task_card"
}
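
The effect of the two preprocess_steps in the card can be illustrated with a plain-Python sketch. It does not use unitxt: the helper functions below are hypothetical stand-ins for the list_field_values and rename_fields operators, and the sample record is made up from the question in the description (only the field names goal, sol1, sol2 and label come from the card). The sketch also assumes that the source fields sol1 and sol2 are kept after being collected.

```python
def list_field_values(record, fields, to_field):
    """Collect the values of `fields`, in order, into a list stored under `to_field`."""
    out = dict(record)  # shallow copy so the input record is left untouched
    out[to_field] = [record[name] for name in fields]
    return out


def rename_fields(record, field_to_field):
    """Move each value from its old top-level key to its new key."""
    out = dict(record)
    for old, new in field_to_field.items():
        out[new] = out.pop(old)
    return out


# Illustrative record only; it is not taken from the dataset.
raw = {
    "goal": "To apply eyeshadow without a brush, should I use a cotton swab or a toothpick?",
    "sol1": "Use a cotton swab.",
    "sol2": "Use a toothpick.",
    "label": 0,
}

# Step 1 of the card: sol1 and sol2 become the list field "choices".
with_choices = list_field_values(raw, fields=["sol1", "sol2"], to_field="choices")

# Step 2 of the card: goal -> question, label -> answer.
processed = rename_fields(with_choices, {"goal": "question", "label": "answer"})

print(processed["question"])
print(processed["choices"])
print(processed["answer"])
```

After both steps the record carries the question, choices and answer fields that the card's rename and list steps produce, with answer being the index of the correct entry in choices.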

Explanation about TaskCard

A TaskCard delineates the phases in transforming the source dataset into model input, and specifies the metrics for evaluating the model output.

Attributes:
loader: specifies the source address and the loading operator that can access that source and transform it into a unitxt multistream.

preprocess_steps: a list of unitxt operators that process the data source into model input.

task: specifies which fields of the preprocessed instance make up the inputs, which make up the outputs, and which metrics are used to evaluate the model output.

templates: format strings applied to the input fields (specified by the task) and the output fields. The template also carries the instructions and the list of postprocessing steps to apply to the model output.
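
As a rough illustration of what "format strings applied to the input and output fields" can mean for a multiple-choice instance, here is a minimal plain-Python sketch. The function render_multiple_choice and both format strings are hypothetical; they are not the templates stored under templates.qa.multiple_choice.open.all, and the lettering scheme is an assumption.

```python
import string


def render_multiple_choice(instance, input_format, target_format):
    """Fill format strings with a preprocessed instance (question, choices, answer)."""
    letters = string.ascii_uppercase
    # Enumerate the choices as "A. ...", "B. ...", one per line.
    options = "\n".join(
        f"{letters[i]}. {choice}" for i, choice in enumerate(instance["choices"])
    )
    source = input_format.format(question=instance["question"], choices=options)
    # The answer field is an index into choices; render it as the matching letter.
    target = target_format.format(answer=letters[instance["answer"]])
    return {"source": source, "target": target}


# Illustrative instance, shaped like the output of the card's preprocess_steps.
instance = {
    "question": "To apply eyeshadow without a brush, should I use a cotton swab or a toothpick?",
    "choices": ["Use a cotton swab.", "Use a toothpick."],
    "answer": 0,
}

rendered = render_multiple_choice(
    instance,
    input_format="Question: {question}\nChoices:\n{choices}\nAnswer:",  # hypothetical
    target_format="{answer}",  # hypothetical
)
print(rendered["source"])
print(rendered["target"])
```

Keeping the format strings separate from the data is what lets one card be paired with many templates without touching the preprocessing.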

Explanation about RenameFields

Renames fields.

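Moves the value from one field to another. If a field name contains a /, the name is treated as a path into a nested dictionary, so the value can move from one branch into another. The source field is then removed; when from_field contains a /, only the nested part it points to is removed.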

Examples:
RenameFields(field_to_field={"b": "c"}) will change inputs [{"a": 1, "b": 2}, {"a": 2, "b": 3}] to [{"a": 1, "c": 2}, {"a": 2, "c": 3}]

RenameFields(field_to_field={"b": "c/d"}) will change inputs [{"a": 1, "b": 2}, {"a": 2, "b": 3}] to [{"a": 1, "c": {"d": 2}}, {"a": 2, "c": {"d": 3}}]

RenameFields(field_to_field={"b": "b/d"}) will change inputs [{"a": 1, "b": 2}, {"a": 2, "b": 3}] to [{"a": 1, "b": {"d": 2}}, {"a": 2, "b": {"d": 3}}]

RenameFields(field_to_field={"b/c/e": "b/d"}) will change inputs [{"a": 1, "b": {"c": {"e": 2, "f": 20}}}] to [{"a": 1, "b": {"c": {"f": 20}, "d": 2}}]
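
The path semantics in the examples above can be reproduced with a small plain-Python sketch. This is not the unitxt implementation: pop_path, set_path and rename_fields are hypothetical helpers, and the sketch assumes that a dictionary left empty after removing the source field is kept as is.

```python
from copy import deepcopy


def pop_path(record, path):
    """Remove and return the value at a '/'-separated path."""
    *parents, leaf = path.split("/")
    node = record
    for key in parents:
        node = node[key]
    return node.pop(leaf)


def set_path(record, path, value):
    """Store a value at a '/'-separated path, creating nested dicts as needed."""
    *parents, leaf = path.split("/")
    node = record
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def rename_fields(record, field_to_field):
    """Move each source path to its destination path on a copy of the record."""
    result = deepcopy(record)
    for source, destination in field_to_field.items():
        # Pop first, so that a destination such as "b/d" can reuse the freed key "b".
        value = pop_path(result, source)
        set_path(result, destination, value)
    return result


print(rename_fields({"a": 1, "b": 2}, {"b": "c/d"}))
print(rename_fields({"a": 1, "b": {"c": {"e": 2, "f": 20}}}, {"b/c/e": "b/d"}))
```

Removing the source before writing the destination is what makes the "b" to "b/d" case work, since the old scalar under b is gone by the time the nested dictionary is created.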

Explanation about ListFieldValues

Collects the values of multiple fields into a list, in the order the fields are given, and assigns the list to a new field.

References: templates.qa.multiple_choice.open.all, tasks.qa.multiple_choice.open