unitxt.templates module¶

class unitxt.templates.ApplyRandomTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, demos_field: str | NoneType = None, templates: List[unitxt.templates.Template] = __required__)[source]¶: Bases: ApplyTemplate

class unitxt.templates.ApplySingleTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, demos_field: str | NoneType = None, template: unitxt.templates.Template = __required__)[source]¶: Bases: ApplyTemplate

class unitxt.templates.ApplyTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, demos_field: str | NoneType = None)[source]¶: Bases: InstanceOperator

class unitxt.templates.DialogFieldsData(data_classification_policy: List[str] = None, user_role_label: str = __required__, assistant_role_label: str = __required__, system_role_label: str = __required__, dialog_field: str = __required__)[source]¶: Bases: Artifact

class unitxt.templates.DialogPairwiseChoiceTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = None, input_format: str = __required__, choice_a_field: str = __required__, choice_b_field: str = __required__, answer_field: str = __required__, choice_a_label: str = __required__, choice_b_label: str = __required__, choice_tie_label: str = __required__, shuffle: bool = __required__, dialog_fields: List[unitxt.templates.DialogFieldsData] = __required__, turns_separator: str = '\n\n', label_separator: str = ' ')[source]¶: Bases: DialogTemplate, PairwiseChoiceTemplate

class unitxt.templates.DialogTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = None, input_format: str = __required__, dialog_fields: List[unitxt.templates.DialogFieldsData] = __required__, turns_separator: str = '\n\n', label_separator: str = ' ')[source]¶: Bases: InputOutputTemplate

class unitxt.templates.InputFormatTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, input_format: str = __required__)[source]¶: Bases: Template

class unitxt.templates.InputOutputTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = None, input_format: str = __required__)[source]¶

Bases: InputFormatTemplate, OutputFormatTemplate

Generate field ‘source’ from fields designated as input, and fields ‘target’ and ‘references’ from fields designated as output, of the processed instance.

Args specify the formatting strings with which to glue together the input and reference fields of the processed instance into one string (‘source’ and ‘target’), and into a list of strings (‘references’).
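The gluing described above can be pictured with plain Python string formatting. This is a minimal sketch, not unitxt's implementation: the helper `render_input_output` and the example field names are hypothetical, and it assumes a single gold answer so that 'references' is a one-element list.

```python
from typing import Any, Dict, List, Tuple


def render_input_output(
    input_fields: Dict[str, Any],
    reference_fields: Dict[str, Any],
    input_format: str,
    output_format: str,
) -> Tuple[str, str, List[str]]:
    """Hypothetical helper: build 'source', 'target' and 'references'."""
    # 'source' is the input format string filled with the input fields.
    source = input_format.format(**input_fields)
    # 'target' is the output format string filled with the output fields.
    target = output_format.format(**reference_fields)
    # Assumption: one gold answer, so 'references' holds just the target.
    return source, target, [target]


source, target, references = render_input_output(
    input_fields={"text": "I loved this movie."},
    reference_fields={"label": "positive"},
    input_format="Review: {text}\nSentiment:",
    output_format="{label}",
)
```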

class unitxt.templates.InputOutputTemplateWithCustomTarget(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = None, input_format: str = __required__, reference: str = __required__)[source]¶: Bases: InputOutputTemplate

class unitxt.templates.KeyValTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, pairs_separator: str = ', ', key_val_separator: str = ': ', use_keys_for_inputs: bool = True, outputs_key_val_separator: str = ': ', use_keys_for_outputs: bool = False)[source]¶

Bases: Template

Generate field ‘source’ from fields designated as input, and fields ‘target’ and ‘references’ from fields designated as output, of the processed instance.

Args specify with what separators to glue together the input and output designated fields of the processed instance into one string (‘source’ and ‘target’), and into a list of strings (‘references’).
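As a rough illustration of the separator-based gluing, the following plain-Python sketch joins fields using the default separators from the signature above. The helper `join_key_vals` and the example fields are hypothetical and only approximate the documented behaviour (keys shown for inputs, keys omitted for outputs).

```python
from typing import Any, Dict


def join_key_vals(
    fields: Dict[str, Any],
    pairs_separator: str = ", ",
    key_val_separator: str = ": ",
    use_keys: bool = True,
) -> str:
    """Hypothetical helper: glue fields into one string with separators."""
    parts = []
    for key, value in fields.items():
        if use_keys:
            parts.append(f"{key}{key_val_separator}{value}")
        else:
            parts.append(str(value))
    return pairs_separator.join(parts)


# Inputs keep their keys (use_keys_for_inputs defaults to True).
source = join_key_vals({"premise": "The street is wet.", "hypothesis": "It rained."})
# Outputs drop their keys (use_keys_for_outputs defaults to False).
target = join_key_vals({"label": "entailment"}, use_keys=False)
```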

class unitxt.templates.MultiLabelTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_list_by_comma'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = '{labels}', input_format: str = __required__, labels_field: str = 'labels', labels_separator: str = ', ', empty_label: str = 'None')[source]¶

Bases: InputOutputTemplate

postprocessors: List[str] = ['processors.to_list_by_comma']¶

class unitxt.templates.MultiReferenceTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = None, input_format: str = __required__, references_field: str = 'references', random_reference: bool = False)[source]¶: Bases: InputOutputTemplate

class unitxt.templates.MultipleChoiceTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, input_format: str = __required__, choices_field: str = 'choices', target_field: str = 'label', choices_separator: str = ', ', source_choice_format: str = '{choice_numeral}. {choice_text}', target_choice_format: str = '{choice_numeral}', enumerator: str = 'capitals', shuffle_choices: bool = False)[source]¶

Bases: InputFormatTemplate

Formats the input (that specifies the question), the multiple choices to select the answer from, and specifies the field with the correct answer.
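To make the choice formatting concrete, here is a small sketch in plain Python that uses the default `source_choice_format`, `target_choice_format` and `choices_separator` from the signature above. The helper `render_choices` is hypothetical, and using uppercase letters for `enumerator='capitals'` is an assumption about what that setting means.

```python
import string
from typing import List, Tuple


def render_choices(
    choices: List[str],
    answer_index: int,
    choices_separator: str = ", ",
    source_choice_format: str = "{choice_numeral}. {choice_text}",
    target_choice_format: str = "{choice_numeral}",
) -> Tuple[str, str]:
    """Hypothetical helper: format the choices and the correct answer."""
    # Assumption: the 'capitals' enumerator numbers choices A, B, C, ...
    numerals = string.ascii_uppercase
    rendered = [
        source_choice_format.format(choice_numeral=numerals[i], choice_text=text)
        for i, text in enumerate(choices)
    ]
    target = target_choice_format.format(
        choice_numeral=numerals[answer_index],
        choice_text=choices[answer_index],
    )
    return choices_separator.join(rendered), target


choices_text, target = render_choices(["Paris", "Rome", "Oslo"], answer_index=1)
```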

class unitxt.templates.OutputFormatTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = None)[source]¶: Bases: Template

class unitxt.templates.OutputQuantizingTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.MultiTypeSerializer = None, output_format: str = None, input_format: str = __required__, quantum: float | int = 0.1)[source]¶: Bases: InputOutputTemplate

class unitxt.templates.PairwiseChoiceTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = None, input_format: str = __required__, choice_a_field: str = __required__, choice_b_field: str = __required__, answer_field: str = __required__, choice_a_label: str = __required__, choice_b_label: str = __required__, choice_tie_label: str = __required__, shuffle: bool = __required__)[source]¶

Bases: InputOutputTemplate

PairwiseChoiceTemplate.

Requirements: The answer field value should be of type Literal["choice_a", "choice_b", "tie"]

Parameters:

choice_a_field (str) – The field which contains choice_a value
choice_b_field (str) – The field which contains choice_b value
answer_field (str) – The field which contains the answer value. Should be of type Literal["choice_a", "choice_b", "tie"]
choice_a_label (str) – The label of choice A answer as it is verbalized in the template.
choice_b_label (str) – The label of choice B answer as it is verbalized in the template.
choice_tie_label (str) – The label of a tie answer as it should be verbalized in the template.
shuffle (bool) – whether to shuffle the choices or not. This is done to take into account position bias.

When shuffle is enabled, 50% of the time:

The values of choice_a_field and choice_b_field are swapped.
If the value of answer_field is choice_a_label, it is set to choice_b_label.
Else if the value of answer_field is choice_b_label, it is set to choice_a_label.
Else if the value of answer_field is choice_tie_label, nothing is changed.
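The swap performed when shuffling can be sketched in plain Python as follows. This is not unitxt's code: `swap_choices` and the example field names are hypothetical, the answer is assumed to be stored as one of the literals "choice_a", "choice_b" or "tie" (as stated under Requirements), and the coin flip uses the standard `random` module purely for illustration.

```python
import random
from typing import Any, Dict

# Mapping applied to the answer when the two choices trade places.
FLIPPED_ANSWER = {"choice_a": "choice_b", "choice_b": "choice_a", "tie": "tie"}


def swap_choices(
    instance: Dict[str, Any],
    choice_a_field: str,
    choice_b_field: str,
    answer_field: str,
) -> Dict[str, Any]:
    """Hypothetical helper: swap the two choices and flip the answer."""
    swapped = dict(instance)
    swapped[choice_a_field] = instance[choice_b_field]
    swapped[choice_b_field] = instance[choice_a_field]
    swapped[answer_field] = FLIPPED_ANSWER[instance[answer_field]]
    return swapped


example = {"response_a": "Short answer", "response_b": "Long answer", "winner": "choice_a"}
swapped = swap_choices(example, "response_a", "response_b", "winner")

# Apply the swap to roughly half of the instances to counter position bias.
rng = random.Random(0)
result = swap_choices(example, "response_a", "response_b", "winner") if rng.random() < 0.5 else example
```

Because the answer is flipped together with the choices, the gold preference still points at the same underlying response after the swap.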

class unitxt.templates.PairwiseComparativeRatingTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = None, input_format: str = __required__, choice_a_field: str = __required__, choice_b_field: str = __required__, choice_a_id_field: str = __required__, choice_b_id_field: str = __required__, answer_field: str = __required__, shuffle: bool = __required__)[source]¶

Bases: InputOutputTemplate

PairwiseComparativeRatingTemplate.

Parameters:

choice_a_field (str) – The field which contains choice_a value
choice_b_field (str) – The field which contains choice_b value
answer_field (str) – The field which contains the answer value. The value should be an int: positive when choice_a is preferred, and negative when choice_b is preferred.
shuffle (bool) – whether to shuffle the choices or not. This is done to take into account position bias.

When shuffle is enabled, 50% of the time:

The values of choice_a_field and choice_b_field are swapped.
The value of answer_field is replaced with its mapped value according to the reverse_preference_map Dict.
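A sketch of this swap in plain Python is shown below. The contents of `reverse_preference_map` are not given here, so the example assumes the map simply negates the rating (a positive preference for choice_a becomes the same-sized preference for choice_b); the rating range, the helper `swap_rated_choices` and the field names are likewise hypothetical.

```python
from typing import Any, Dict

# Assumption: ratings lie in -3..3 and the reverse map negates them.
REVERSE_PREFERENCE_MAP = {rating: -rating for rating in range(-3, 4)}


def swap_rated_choices(
    instance: Dict[str, Any],
    choice_a_field: str,
    choice_b_field: str,
    answer_field: str,
) -> Dict[str, Any]:
    """Hypothetical helper: swap the choices and reverse the rating."""
    swapped = dict(instance)
    swapped[choice_a_field] = instance[choice_b_field]
    swapped[choice_b_field] = instance[choice_a_field]
    swapped[answer_field] = REVERSE_PREFERENCE_MAP[instance[answer_field]]
    return swapped


rated = {"answer_a": "First reply", "answer_b": "Second reply", "rating": 2}
reversed_rated = swap_rated_choices(rated, "answer_a", "answer_b", "rating")
```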

class unitxt.templates.SpanLabelingBaseTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_list_by_comma'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = '{labels}', input_format: str = __required__, labels_field: str = 'labels', labels_separator: str = ', ', empty_label: str = 'None', spans_starts_field: str = 'spans_starts', spans_ends_field: str = 'spans_ends', text_field: str = 'text', labels_support: list = None)[source]¶: Bases: MultiLabelTemplate

class unitxt.templates.SpanLabelingJsonTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.load_json', 'processors.dict_of_lists_to_value_key_pairs'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = '{labels}', input_format: str = __required__, labels_field: str = 'labels', labels_separator: str = ', ', empty_label: str = 'None', spans_starts_field: str = 'spans_starts', spans_ends_field: str = 'spans_ends', text_field: str = 'text', labels_support: list = None)[source]¶

Bases: SpanLabelingBaseTemplate

postprocessors: List[str] = ['processors.load_json', 'processors.dict_of_lists_to_value_key_pairs']¶

class unitxt.templates.SpanLabelingTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_span_label_pairs'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = '{labels}', input_format: str = __required__, labels_field: str = 'labels', labels_separator: str = ', ', empty_label: str = 'None', spans_starts_field: str = 'spans_starts', spans_ends_field: str = 'spans_ends', text_field: str = 'text', labels_support: list = None, span_label_format: str = '{span}: {label}', escape_characters: List[str] = [':', ','])[source]¶

Bases: SpanLabelingBaseTemplate

escape_characters: List[str] = [':', ',']¶

postprocessors: List[str] = ['processors.to_span_label_pairs']¶

class unitxt.templates.Template(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None)[source]¶

Bases: InstanceOperator

The role of a template is to take the fields of every instance and verbalize them.

That is, the template takes the instance and generates its source, target and references.

Parameters:

skip_rendered_instance (bool) – if “source”, “target”, and “references” are already defined fields in the instance, skip its processing
postprocessors – a list of strings being artifact names of text processors, to be applied on the model output
instruction – a formatting string that yields an instruction with potential participation of values from the “input_fields” part of the instance
target_prefix – a string to be used to format the prompt. Not a formatting string.
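The effect of `skip_rendered_instance` can be illustrated with a short plain-Python check. This is only a sketch of the rule as described above, with a hypothetical helper name `should_skip`; it is not taken from the library.

```python
from typing import Any, Dict

# The three fields a template is expected to produce.
RENDERED_FIELDS = ("source", "target", "references")


def should_skip(instance: Dict[str, Any], skip_rendered_instance: bool = True) -> bool:
    """Hypothetical helper: True when an already rendered instance is left as-is."""
    already_rendered = all(field in instance for field in RENDERED_FIELDS)
    return skip_rendered_instance and already_rendered


rendered = {"source": "Q: 2+2?", "target": "4", "references": ["4"]}
unrendered = {"question": "2+2?", "answer": "4"}
```

Here `should_skip(rendered)` is true, while `should_skip(unrendered)` and `should_skip(rendered, skip_rendered_instance=False)` are both false.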

exception unitxt.templates.TemplateFormatKeyError(template, data, data_type, format_str, format_name)[source]¶: Bases: UnitxtError

class unitxt.templates.TemplatesDict(data_classification_policy: List[str] = None, items: Dict[str, unitxt.artifact.Artifact] = {})[source]¶: Bases: DictCollection

class unitxt.templates.TemplatesList(data_classification_policy: List[str] = None, items: List[unitxt.artifact.Artifact] = [])[source]¶: Bases: ListCollection

class unitxt.templates.YesNoTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, input_format: str = None, class_field: str = None, label_field: str = None, yes_answer: str = 'Yes', no_answer: str = 'No')[source]¶

Bases: InputFormatTemplate

A template for generating binary Yes/No questions asking whether an input text is of a specific class.

input_format – Defines the format of the question.
class_field – Defines the field that contains the name of the class the template asks about.
label_field – Defines the field that contains the true label of the input text. If a gold label is equal to the value in class_field, the correct output is self.yes_answer (by default, “Yes”). Otherwise the correct output is self.no_answer (by default, “No”).
yes_answer – The output value when the gold label equals the value in class_field. Defaults to “Yes”.
no_answer – The output value when the gold label differs from the value in class_field. Defaults to “No”.
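The Yes/No decision described above reduces to a membership test, sketched here in plain Python. The helper `yes_no_target` is hypothetical, and the sketch assumes the label field holds a list of gold labels while the class field resolves to a single class name.

```python
from typing import List


def yes_no_target(
    queried_class: str,
    gold_labels: List[str],
    yes_answer: str = "Yes",
    no_answer: str = "No",
) -> str:
    """Hypothetical helper: answer whether the queried class is a gold label."""
    # Assumption: the text may carry several gold labels; one match is enough.
    return yes_answer if queried_class in gold_labels else no_answer


match = yes_no_target("sports", ["sports", "news"])
miss = yes_no_target("politics", ["sports", "news"])
custom = yes_no_target("news", ["news"], yes_answer="true", no_answer="false")
```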

unitxt.templates.escape_chars(s, chars_to_escape)[source]¶

unitxt.templates.random() → x in the interval [0, 1).¶