Normalized Sacrebleu
metrics.normalized_sacrebleu
type: MetricPipeline
main_score: sacrebleu
prediction_type: str
preprocess_steps:
  - type: Copy
    field: task_data/target_language
    to_field: task_data/tokenize
    not_exist_ok: True
    get_default: en
  - type: Lower
    field: task_data/tokenize
  - type: MapInstanceValues
    mappers:
      task_data/tokenize:
        german: None
        deutch: None
        de: None
        french: None
        fr: None
        romanian: None
        ro: None
        english: None
        en: None
        spanish: None
        es: None
        portuguese: None
        pt: None
        arabic: intl
        ar: intl
        korean: ko-mecab
        ko: ko-mecab
        japanese: ja-mecab
        ja: ja-mecab
    strict: True
metric:
  type: NormalizedSacrebleu
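The three preprocess steps above can be read as: take the target language (falling back to "en"), lowercase it, and look it up in a language-to-tokenizer table. The following is a minimal plain-Python sketch of that reading, not the library's own code. The names `TOKENIZER_BY_LANGUAGE` and `resolve_tokenizer` are hypothetical, and treating `None` in the listing as Python `None` is an assumption.

```python
# Hypothetical stand-in for the three preprocess steps; this is NOT unitxt code.
# The table mirrors the mappers listed in the catalog entry above
# (including the "deutch" key exactly as it appears there).
TOKENIZER_BY_LANGUAGE = {
    "german": None, "deutch": None, "de": None,
    "french": None, "fr": None,
    "romanian": None, "ro": None,
    "english": None, "en": None,
    "spanish": None, "es": None,
    "portuguese": None, "pt": None,
    "arabic": "intl", "ar": "intl",
    "korean": "ko-mecab", "ko": "ko-mecab",
    "japanese": "ja-mecab", "ja": "ja-mecab",
}


def resolve_tokenizer(task_data: dict) -> dict:
    """Return a copy of task_data with a 'tokenize' entry added."""
    result = dict(task_data)
    # Copy step: use target_language if present, otherwise the default "en".
    language = result.get("target_language", "en")
    # Lower step: normalize case so "Japanese" and "japanese" match.
    language = str(language).lower()
    # MapInstanceValues step with strict=True: an unknown language raises KeyError.
    result["tokenize"] = TOKENIZER_BY_LANGUAGE[language]
    return result


print(resolve_tokenizer({"target_language": "Japanese"}))
print(resolve_tokenizer({}))
```

Because the lookup is strict, a language outside the table fails loudly instead of silently using an unsuitable tokenizer.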
Explanation about Copy
Copies values from specified fields to specified fields.
- Args (of parent class):
field_to_field (Union[List[List], Dict[str, str]]): A list of lists, where each sublist contains the source field and the destination field, or a dictionary mapping source fields to destination fields.
- Examples:
An input instance {"a": 2, "b": 3}, when processed by Copy(field_to_field={"a": "b"}), would yield {"a": 2, "b": 2}, and when processed by Copy(field_to_field={"a": "c"}), would yield {"a": 2, "b": 3, "c": 2}.
With field names containing "/", we can also copy from inside a field: Copy(field="a/0", to_field="a") would process instance {"a": [1, 3]} into {"a": 1}.
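The examples above can be reproduced with a small plain-Python sketch. It only imitates the documented behavior and is not the operator's implementation. The helpers `get_path`, `set_path`, and `copy_fields` are hypothetical names, and the path handling is deliberately simplified: a path component is treated as a list index only when the current value is a list.

```python
import copy


def get_path(instance, path):
    """Follow a '/'-separated path; parts index into lists when the value is a list."""
    value = instance
    for part in path.split("/"):
        value = value[int(part)] if isinstance(value, list) else value[part]
    return value


def set_path(instance, path, value):
    """Set a '/'-separated path, creating intermediate dicts as needed (dicts only)."""
    *parents, last = path.split("/")
    target = instance
    for part in parents:
        target = target.setdefault(part, {})
    target[last] = value


def copy_fields(instance, field_to_field):
    """Hypothetical helper mimicking the documented Copy examples."""
    result = copy.deepcopy(instance)
    for source, destination in field_to_field.items():
        # Read from the original instance so earlier copies do not affect later ones.
        set_path(result, destination, copy.deepcopy(get_path(instance, source)))
    return result


print(copy_fields({"a": 2, "b": 3}, {"a": "b"}))
print(copy_fields({"a": 2, "b": 3}, {"a": "c"}))
print(copy_fields({"a": [1, 3]}, {"a/0": "a"}))
```

In this sketch a single source/destination pair such as field="a/0", to_field="a" is expressed as the one-entry mapping {"a/0": "a"}.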
Explanation about MapInstanceValues
A class used to map instance values into other values.
This class is a type of InstanceOperator; it maps values of instances in a stream using predefined mappers.
- Args:
- mappers (Dict[str, Dict[str, Any]]):
The mappers to use for mapping instance values. Keys are the names of the fields to undergo mapping, and values are dictionaries that define the mapping from old values to new values. Note that mapped values are defined by their string representation, so mapped values are converted to strings before being looked up in the mappers.
- strict (bool):
If True, the mapping is applied strictly. That means if a value does not exist in the mapper, it will raise a KeyError. If False, values that are not present in the mapper are kept as they are.
- process_every_value (bool):
If True, all fields to be mapped should be lists, and the mapping is to be applied to their individual elements. If False, mapping is only applied to a field containing a single value.
- Examples:
MapInstanceValues(mappers={"a": {"1": "hi", "2": "bye"}}) replaces "1" with "hi" and "2" with "bye" in field "a" in all instances of all streams: instance {"a": 1, "b": 2} becomes {"a": "hi", "b": 2}. Note that the value of "b" remained intact, since field name "b" does not participate in the mappers, and that 1 was cast to "1" before being looked up in the mapper of "a".
MapInstanceValues(mappers={"a": {"1": "hi", "2": "bye"}}, process_every_value=True): Assuming field "a" is a list of values, potentially including "1"-s and "2"-s, this replaces each such "1" with "hi" and each "2" with "bye" in all instances of all streams: instance {"a": ["1", "2"], "b": 2} becomes {"a": ["hi", "bye"], "b": 2}.
MapInstanceValues(mappers={"a": {"1": "hi", "2": "bye"}}, strict=True): To ensure that all values of field "a" are mapped in every instance, use strict=True. Input instance {"a": "3", "b": 2} will raise an exception per the above call, because "3" is not a key in the mapper of "a".
MapInstanceValues(mappers={"a": {str([1,2,3,4]): "All", str([]): "None"}}, strict=True) replaces a list [1,2,3,4] with the string "All" and an empty list with the string "None".
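The documented behavior (string-based lookup, strict, and process_every_value) can be illustrated with a short plain-Python sketch. This is an illustration of the semantics described above, not the class's implementation. The function name `map_instance_values` is hypothetical, and both the default values chosen here and the decision to skip fields absent from the instance are assumptions of the sketch.

```python
def map_instance_values(instance, mappers, strict=True, process_every_value=False):
    """Hypothetical helper mimicking the documented MapInstanceValues examples."""
    result = dict(instance)
    for field, mapper in mappers.items():
        if field not in result:
            continue  # assumption of this sketch: missing fields are skipped

        def lookup(value):
            key = str(value)  # values are looked up by their string representation
            if key in mapper:
                return mapper[key]
            if strict:
                raise KeyError(f"{key!r} is not in the mapper of {field!r}")
            return value  # non-strict: unmapped values are kept as they are

        if process_every_value:
            # The field is expected to hold a list; map each element.
            result[field] = [lookup(v) for v in result[field]]
        else:
            result[field] = lookup(result[field])
    return result


mappers = {"a": {"1": "hi", "2": "bye"}}
print(map_instance_values({"a": 1, "b": 2}, mappers))
print(map_instance_values({"a": ["1", "2"], "b": 2}, mappers, process_every_value=True))
print(map_instance_values({"a": [1, 2, 3, 4]}, {"a": {str([1, 2, 3, 4]): "All", str([]): "None"}}))
```

Looking values up via str() is what lets a whole list such as [1, 2, 3, 4] serve as a mapper key, since its string form "[1, 2, 3, 4]" is hashable while the list itself is not.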