πŸ“„ Key Value ExtractionΒΆ

This is a key value extraction task, where a specific list of possible β€˜keys’ need to be extracted from the input. The ground truth is provided key-value pairs in the form of the dictionary. The results are evaluating using F1 score metric, that expects the predictions to be converted into a list of (key,value) pairs.

tasks.key_value_extraction

Task(
    input_fields={
        "input": "Any",
        "keys": "List[str]",
    },
    reference_fields={
        "key_value_pairs_answer": "Dict[str, str]",
    },
    prediction_type="Dict[str, str]",
    metrics=[
        "metrics.key_value_extraction.accuracy",
        "metrics.key_value_extraction.token_overlap",
    ],
    default_template="templates.key_value_extraction.extract_in_json_format",
)
[source]

Explanation about TaskΒΆ

Task packs the different instance fields into dictionaries by their roles in the task.

Args:
input_fields (Union[Dict[str, str], List[str]]):

Dictionary with string names of instance input fields and types of respective values. In case a list is passed, each type will be assumed to be Any.

reference_fields (Union[Dict[str, str], List[str]]):

Dictionary with string names of instance output fields and types of respective values. In case a list is passed, each type will be assumed to be Any.

metrics (List[str]):

List of names of metrics to be used in the task.

prediction_type (Optional[str]):

Need to be consistent with all used metrics. Defaults to None, which means that it will be set to Any.

defaults (Optional[Dict[str, Any]]):

An optional dictionary with default values for chosen input/output keys. Needs to be consistent with names and types provided in β€˜input_fields’ and/or β€˜output_fields’ arguments. Will not overwrite values if already provided in a given instance.

The output instance contains three fields:
  1. β€œinput_fields” whose value is a sub-dictionary of the input instance, consisting of all the fields listed in Arg β€˜input_fields’.

  2. β€œreference_fields” – for the fields listed in Arg β€œreference_fields”.

  3. β€œmetrics” – to contain the value of Arg β€˜metrics’

References: templates.key_value_extraction.extract_in_json_format, metrics.key_value_extraction.token_overlap, metrics.key_value_extraction.accuracy

Read more about catalog usage here.