📄 Binary

This is a binary text classification task.

The ‘class’ field is the name of the class being classified for and must be the same in all instances. The ‘text_type’ field is optional and describes the type of text being classified (e.g. “document”, “review”, etc.). The template can use it to customize the prompt.

The expected output is a list: either the empty list [] or a list whose single element is the class name.
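To make the output format concrete, here is a minimal sketch of a check for a binary output. The helper name `is_valid_binary_output` and the class name "toxic" are hypothetical, chosen only for illustration; they are not part of the library.

```python
from typing import List


def is_valid_binary_output(output: List[str], class_name: str) -> bool:
    """Return True if the output is [] or exactly [class_name]."""
    return output == [] or output == [class_name]


# "toxic" is an illustrative class name, not one defined by the task.
print(is_valid_binary_output([], "toxic"))          # negative instance
print(is_valid_binary_output(["toxic"], "toxic"))   # positive instance
print(is_valid_binary_output(["other"], "toxic"))   # wrong class name
```

The first two calls describe well-formed outputs; the third does not, because the single element differs from the class name.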

The default reported metrics are the classical f1_micro, f1_macro and accuracy.
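As a rough illustration of what these scores measure on list-valued outputs, the sketch below computes exact-match accuracy and the F1 of the positive outcome on toy data. It is not the library's metric code and does not reproduce the micro/macro averaging of the listed metrics; the function name `accuracy_and_f1` and the data are assumptions made for this example.

```python
from typing import List, Tuple


def accuracy_and_f1(
    predictions: List[List[str]], references: List[List[str]]
) -> Tuple[float, float]:
    """Exact-match accuracy and F1 of the positive (non-empty) outcome."""
    pairs = list(zip(predictions, references))
    correct = sum(pred == ref for pred, ref in pairs)
    # A non-empty list counts as "positive", an empty list as "negative".
    tp = sum(bool(pred) and bool(ref) for pred, ref in pairs)
    fp = sum(bool(pred) and not ref for pred, ref in pairs)
    fn = sum(not pred and bool(ref) for pred, ref in pairs)
    accuracy = correct / len(pairs)
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return accuracy, f1


# Toy data with a hypothetical class name "toxic".
predictions = [["toxic"], ["toxic"], [], []]
references = [["toxic"], [], ["toxic"], []]
accuracy, f1 = accuracy_and_f1(predictions, references)
print(accuracy, f1)  # 0.5 0.5
```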

tasks.classification.binary

Task(
    input_fields={
        "text": "str",
        "text_type": "str",
        "class": "str",
    },
    reference_fields={
        "class": "str",
        "label": "List[str]",
    },
    prediction_type="List[str]",
    metrics=[
        "metrics.f1_micro_multi_label",
        "metrics.f1_macro_multi_label",
        "metrics.accuracy",
    ],
    augmentable_inputs=[
        "text",
    ],
    defaults={
        "text_type": "text",
    },
)

Explanation about Task

Task packs the different instance fields into dictionaries by their roles in the task.

Args:
input_fields (Union[Dict[str, str], List[str]]):

Dictionary with string names of instance input fields and types of respective values. In case a list is passed, each type will be assumed to be Any.

reference_fields (Union[Dict[str, str], List[str]]):

Dictionary with string names of instance output fields and types of respective values. In case a list is passed, each type will be assumed to be Any.

metrics (List[str]):

List of names of metrics to be used in the task.

prediction_type (Optional[str]):

Needs to be consistent with all used metrics. Defaults to None, which means that it will be set to Any.

defaults (Optional[Dict[str, Any]]):

An optional dictionary with default values for chosen input/reference keys. Needs to be consistent with the names and types provided in the ‘input_fields’ and/or ‘reference_fields’ arguments. Will not overwrite values already provided in a given instance.
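The "will not overwrite" behaviour of defaults can be sketched as a plain dictionary merge in which instance values take precedence. This is an illustration of the described semantics, not the library's implementation; `apply_defaults` and the sample instances are hypothetical.

```python
from typing import Any, Dict


def apply_defaults(instance: Dict[str, Any], defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Fill in missing keys from defaults; values already present are kept."""
    return {**defaults, **instance}


defaults = {"text_type": "text"}  # same default as in the task definition

# Instance without "text_type": the default is filled in.
without_type = apply_defaults({"text": "Great phone.", "class": "positive"}, defaults)
# Instance that already sets "text_type": its value is preserved.
with_type = apply_defaults(
    {"text": "Great phone.", "text_type": "review", "class": "positive"}, defaults
)

print(without_type["text_type"])  # text
print(with_type["text_type"])     # review
```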

The output instance contains three fields:
  1. “input_fields”, whose value is a sub-dictionary of the input instance, consisting of all the fields listed in Arg ‘input_fields’.

  2. “reference_fields”, for the fields listed in Arg ‘reference_fields’.

  3. “metrics”, which contains the value of Arg ‘metrics’.
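The packing described in the three items above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's code: `pack_instance` is a hypothetical helper, the sample instance is invented, and type checking and defaults are left out.

```python
from typing import Any, Dict, List

# Field names and metrics copied from the task definition on this page.
INPUT_FIELDS = ["text", "text_type", "class"]
REFERENCE_FIELDS = ["class", "label"]
METRICS = [
    "metrics.f1_micro_multi_label",
    "metrics.f1_macro_multi_label",
    "metrics.accuracy",
]


def pack_instance(
    instance: Dict[str, Any],
    input_fields: List[str],
    reference_fields: List[str],
    metrics: List[str],
) -> Dict[str, Any]:
    """Group an instance's fields into dictionaries by their role in the task."""
    return {
        "input_fields": {name: instance[name] for name in input_fields},
        "reference_fields": {name: instance[name] for name in reference_fields},
        "metrics": list(metrics),
    }


# Invented sample instance; "positive" is an illustrative class name.
instance = {
    "text": "Great phone.",
    "text_type": "review",
    "class": "positive",
    "label": ["positive"],
}
packed = pack_instance(instance, INPUT_FIELDS, REFERENCE_FIELDS, METRICS)
print(sorted(packed))  # ['input_fields', 'metrics', 'reference_fields']
```

Note that ‘class’ appears under both roles here, because the task definition lists it in both ‘input_fields’ and ‘reference_fields’.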

References: metrics.f1_macro_multi_label, metrics.f1_micro_multi_label, metrics.accuracy

Read more about catalog usage here.