unitxt.templates module
- class unitxt.templates.ApplyRandomTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, demos_field: str | NoneType = None, templates: List[unitxt.templates.Template] = __required__)[source]¶
Bases:
ApplyTemplate
- class unitxt.templates.ApplySingleTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, demos_field: str | NoneType = None, template: unitxt.templates.Template = __required__)[source]¶
Bases:
ApplyTemplate
- class unitxt.templates.ApplyTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, demos_field: str | NoneType = None)[source]¶
Bases:
InstanceOperator
- class unitxt.templates.DialogFieldsData(data_classification_policy: List[str] = None, user_role_label: str = __required__, assistant_role_label: str = __required__, system_role_label: str = __required__, dialog_field: str = __required__)[source]¶
Bases:
Artifact
- class unitxt.templates.DialogPairwiseChoiceTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = None, input_format: str = __required__, choice_a_field: str = __required__, choice_b_field: str = __required__, answer_field: str = __required__, choice_a_label: str = __required__, choice_b_label: str = __required__, choice_tie_label: str = __required__, shuffle: bool = __required__, dialog_fields: List[unitxt.templates.DialogFieldsData] = __required__, turns_separator: str = '\n\n', label_separator: str = ' ')[source]¶
Bases:
DialogTemplate, PairwiseChoiceTemplate
- class unitxt.templates.DialogTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = None, input_format: str = __required__, dialog_fields: List[unitxt.templates.DialogFieldsData] = __required__, turns_separator: str = '\n\n', label_separator: str = ' ')[source]¶
Bases:
InputOutputTemplate
- class unitxt.templates.InputFormatTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, input_format: str = __required__)[source]¶
Bases:
Template
- class unitxt.templates.InputOutputTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = None, input_format: str = __required__)[source]¶
Bases:
InputFormatTemplate, OutputFormatTemplate
Generate field ‘source’ from fields designated as input, and fields ‘target’ and ‘references’ from fields designated as output, of the processed instance.
Args specify the formatting strings with which to glue together the input and reference fields of the processed instance into one string (‘source’ and ‘target’), and into a list of strings (‘references’).
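The gluing described above can be pictured with plain `str.format`. This is a minimal sketch, not the unitxt implementation: the helper `render_input_output` and the field names `question` and `answer` are hypothetical and exist only for illustration.

```python
# Sketch only: mimics the documented behavior with str.format.
# render_input_output and the field names are hypothetical, not unitxt API.
def render_input_output(input_fields, reference_fields, input_format, output_format):
    # 'source' is built from the fields designated as input.
    source = input_format.format(**input_fields)
    # 'target' is built from the fields designated as output.
    target = output_format.format(**reference_fields)
    # 'references' is a list of strings; here it holds the single target.
    return {"source": source, "target": target, "references": [target]}


rendered = render_input_output(
    input_fields={"question": "What is 2 + 2?"},
    reference_fields={"answer": "4"},
    input_format="Question: {question}",
    output_format="{answer}",
)
print(rendered)
```

In the library itself, the equivalent configuration would presumably be expressed through the `input_format` and `output_format` arguments listed in the signature above.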
- class unitxt.templates.InputOutputTemplateWithCustomTarget(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = None, input_format: str = __required__, reference: str = __required__)[source]¶
Bases:
InputOutputTemplate
- class unitxt.templates.KeyValTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, pairs_separator: str = ', ', key_val_separator: str = ': ', use_keys_for_inputs: bool = True, outputs_key_val_separator: str = ': ', use_keys_for_outputs: bool = False)[source]¶
Bases:
Template
Generate field ‘source’ from fields designated as input, and fields ‘target’ and ‘references’ from fields designated as output, of the processed instance.
Args specify with what separators to glue together the input and output designated fields of the processed instance into one string (‘source’ and ‘target’), and into a list of strings (‘references’).
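As a rough illustration of how the separators combine, the sketch below joins fields the way the defaults in the signature suggest (keys shown for inputs, keys omitted for outputs). The helper `render_key_val` and the sample fields are hypothetical; the real class may serialize values differently.

```python
# Sketch only: render_key_val is a hypothetical helper, not unitxt API.
def render_key_val(fields, pairs_separator=", ", key_val_separator=": ", use_keys=True):
    parts = []
    for key, value in fields.items():
        text = str(value)
        # With use_keys, each pair is "key<key_val_separator>value"; otherwise only the value.
        parts.append(f"{key}{key_val_separator}{text}" if use_keys else text)
    return pairs_separator.join(parts)


# Inputs keep their keys (mirrors use_keys_for_inputs=True).
source = render_key_val({"premise": "It rains.", "hypothesis": "It is wet."})
# Outputs drop their keys (mirrors use_keys_for_outputs=False).
target = render_key_val({"label": "entailment"}, use_keys=False)
print(source)
print(target)
```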
- class unitxt.templates.MultiLabelTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_list_by_comma'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = '{labels}', input_format: str = __required__, labels_field: str = 'labels', labels_separator: str = ', ', empty_label: str = 'None')[source]¶
Bases:
InputOutputTemplate
- postprocessors: List[str] = ['processors.to_list_by_comma']¶
- class unitxt.templates.MultiReferenceTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = None, input_format: str = __required__, references_field: str = 'references', random_reference: bool = False)[source]¶
Bases:
InputOutputTemplate
- class unitxt.templates.MultipleChoiceTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, input_format: str = __required__, choices_field: str = 'choices', target_field: str = 'label', choices_separator: str = ', ', source_choice_format: str = '{choice_numeral}. {choice_text}', target_choice_format: str = '{choice_numeral}', enumerator: str = 'capitals', shuffle_choices: bool = False)[source]¶
Bases:
InputFormatTemplate
Formats the input (that specifies the question), the multiple choices to select the answer from, and specifies the field with the correct answer.
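The default formatting arguments in the signature can be read as follows. This is a sketch under assumptions: `render_choices` is a hypothetical helper, the `capitals` enumerator is taken to mean A, B, C, and so on, and the correct answer is assumed to be given as an index into the choices.

```python
import string


# Sketch only: render_choices is hypothetical and assumes the answer is an index.
def render_choices(
    choices,
    label_index,
    choices_separator=", ",
    source_choice_format="{choice_numeral}. {choice_text}",
    target_choice_format="{choice_numeral}",
):
    numerals = string.ascii_uppercase  # assumed meaning of enumerator='capitals'
    formatted = [
        source_choice_format.format(choice_numeral=numerals[i], choice_text=text)
        for i, text in enumerate(choices)
    ]
    # The choices shown to the model, joined with the separator.
    source_choices = choices_separator.join(formatted)
    # The expected answer, formatted with the target format.
    target = target_choice_format.format(
        choice_numeral=numerals[label_index], choice_text=choices[label_index]
    )
    return source_choices, target


source_choices, target = render_choices(["red", "green", "blue"], label_index=1)
print(source_choices)
print(target)
```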
- class unitxt.templates.OutputFormatTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = None)[source]¶
Bases:
Template
- class unitxt.templates.OutputQuantizingTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.MultiTypeSerializer = None, output_format: str = None, input_format: str = __required__, quantum: float | int = 0.1)[source]¶
Bases:
InputOutputTemplate
- class unitxt.templates.PairwiseChoiceTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = None, input_format: str = __required__, choice_a_field: str = __required__, choice_b_field: str = __required__, answer_field: str = __required__, choice_a_label: str = __required__, choice_b_label: str = __required__, choice_tie_label: str = __required__, shuffle: bool = __required__)[source]¶
Bases:
InputOutputTemplate
PairwiseChoiceTemplate.
- Requirements:
The answer field value should be of type Literal[“choice_a”, “choice_b”, “tie”]
- Parameters:
choice_a_field (str) – The field which contains choice_a value
choice_b_field (str) – The field which contains choice_b value
answer_field (str) – The field which contains the answer value. Should be of type Literal["choice_a", "choice_b", "tie"]
choice_a_label (str) – The label of choice A answer as it is verbalized in the template.
choice_b_label (str) – The label of choice B answer as it is verbalized in the template.
choice_tie_label (str) – The label of a tie answer as it should be verbalized in the template.
shuffle (bool) – whether to shuffle the choices or not. This is done to take into account position bias.
- shuffle: when enabled, 50% of the time:
The values of choice_a_field and choice_b_field are swapped.
- If the value of answer_field is choice_a_label, it is set to choice_b_label.
Else if the value of answer_field is choice_b_label, it is set to choice_a_label. Else if the value of answer_field is choice_tie_label, it is left unchanged.
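The shuffle rule can be sketched in a few lines. The helpers `swap_choices` and `shuffle_choices` and the sample field names are hypothetical; the sketch follows the description above and is not the library's code.

```python
import random


# Sketch only: hypothetical helpers illustrating the documented swap rule.
def swap_choices(instance, choice_a_field, choice_b_field, answer_field,
                 choice_a_label, choice_b_label):
    swapped = dict(instance)
    # Exchange the two candidate values.
    swapped[choice_a_field] = instance[choice_b_field]
    swapped[choice_b_field] = instance[choice_a_field]
    # Flip the answer label so it still points at the same content; a tie is unchanged.
    answer = instance[answer_field]
    if answer == choice_a_label:
        swapped[answer_field] = choice_b_label
    elif answer == choice_b_label:
        swapped[answer_field] = choice_a_label
    return swapped


def shuffle_choices(instance, rng, **fields):
    # Swap roughly half of the time to counter position bias.
    return swap_choices(instance, **fields) if rng.random() < 0.5 else dict(instance)


fields = dict(choice_a_field="answer_a", choice_b_field="answer_b",
              answer_field="winner", choice_a_label="A", choice_b_label="B")
swapped = swap_choices({"answer_a": "Paris", "answer_b": "Lyon", "winner": "A"}, **fields)
tied = swap_choices({"answer_a": "Paris", "answer_b": "Lyon", "winner": "Tie"}, **fields)
maybe = shuffle_choices({"answer_a": "Paris", "answer_b": "Lyon", "winner": "A"},
                        random.Random(0), **fields)
print(swapped, tied, maybe)
```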
- class unitxt.templates.PairwiseComparativeRatingTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = None, input_format: str = __required__, choice_a_field: str = __required__, choice_b_field: str = __required__, choice_a_id_field: str = __required__, choice_b_id_field: str = __required__, answer_field: str = __required__, shuffle: bool = __required__)[source]¶
Bases:
InputOutputTemplate
PairwiseComparativeRatingTemplate.
- Parameters:
choice_a_field (str) – The field which contains choice_a value
choice_b_field (str) – The field which contains choice_b value
answer_field (str) – The field which contains the answer value. The value should be an int. Positive for preferring choice_a, and negative for preferring choice_b
shuffle (bool) – whether to shuffle the choices or not. This is done to take into account position bias.
- shuffle: 50% of the time:
The values of choice_a_field and choice_b_field will be swapped.
The value of answer_field is replaced with its mapped value according to the reverse_preference_map dict.
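For integer ratings, reversing the preference after a swap might look like the sketch below. The contents of `reverse_preference_map` are not shown on this page, so the sketch assumes the map simply negates the rating; `swap_with_rating` and the field names are hypothetical.

```python
# Sketch only: assumes reversing a preference means negating the rating.
# The actual reverse_preference_map in unitxt may be defined differently.
def swap_with_rating(instance, choice_a_field, choice_b_field, answer_field):
    swapped = dict(instance)
    swapped[choice_a_field] = instance[choice_b_field]
    swapped[choice_b_field] = instance[choice_a_field]
    # Positive prefers choice_a, negative prefers choice_b, so a swap flips the sign.
    swapped[answer_field] = -instance[answer_field]
    return swapped


rated = swap_with_rating(
    {"response_a": "x", "response_b": "y", "rating": 2},
    choice_a_field="response_a",
    choice_b_field="response_b",
    answer_field="rating",
)
print(rated)
```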
- class unitxt.templates.SpanLabelingBaseTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_list_by_comma'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = '{labels}', input_format: str = __required__, labels_field: str = 'labels', labels_separator: str = ', ', empty_label: str = 'None', spans_starts_field: str = 'spans_starts', spans_ends_field: str = 'spans_ends', text_field: str = 'text', labels_support: list = None)[source]¶
Bases:
MultiLabelTemplate
- class unitxt.templates.SpanLabelingJsonTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.load_json', 'processors.dict_of_lists_to_value_key_pairs'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = '{labels}', input_format: str = __required__, labels_field: str = 'labels', labels_separator: str = ', ', empty_label: str = 'None', spans_starts_field: str = 'spans_starts', spans_ends_field: str = 'spans_ends', text_field: str = 'text', labels_support: list = None)[source]¶
Bases:
SpanLabelingBaseTemplate
- postprocessors: List[str] = ['processors.load_json', 'processors.dict_of_lists_to_value_key_pairs']¶
- class unitxt.templates.SpanLabelingTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_span_label_pairs'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, output_format: str = '{labels}', input_format: str = __required__, labels_field: str = 'labels', labels_separator: str = ', ', empty_label: str = 'None', spans_starts_field: str = 'spans_starts', spans_ends_field: str = 'spans_ends', text_field: str = 'text', labels_support: list = None, span_label_format: str = '{span}: {label}', escape_characters: List[str] = [':', ','])[source]¶
Bases:
SpanLabelingBaseTemplate
- escape_characters: List[str] = [':', ',']¶
- postprocessors: List[str] = ['processors.to_span_label_pairs']¶
- class unitxt.templates.Template(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None)[source]¶
Bases:
InstanceOperator
The role of a template is to take the fields of each instance and verbalize them.
That is, the template takes the instance and generates its source, target, and references.
- Parameters:
skip_rendered_instance (bool) – if “source”, “target”, and “references” are already defined fields in the instance, skip its processing
postprocessors – a list of strings being artifact names of text processors, to be applied on the model output
instruction – a formatting string that yields an instruction with potential participation of values from the “input_fields” part of the instance
target_prefix – a string to be used to format the prompt. Not a formatting string.
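The effect of skip_rendered_instance can be sketched as a guard in front of the rendering step. `apply_template` and `render` are hypothetical names; the sketch shows only the documented skip condition, not the real operator.

```python
RENDERED_KEYS = ("source", "target", "references")


# Sketch only: apply_template and render are hypothetical, not unitxt API.
def apply_template(instance, render, skip_rendered_instance=True):
    # If all three rendered fields already exist, leave the instance untouched.
    if skip_rendered_instance and all(key in instance for key in RENDERED_KEYS):
        return instance
    return {**instance, **render(instance)}


def render(instance):
    text = f"Review: {instance['text']}"
    return {"source": text, "target": instance["label"], "references": [instance["label"]]}


fresh = apply_template({"text": "Great film", "label": "positive"}, render)
already = apply_template({"source": "s", "target": "t", "references": ["t"]}, render)
print(fresh["source"], already)
```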
- exception unitxt.templates.TemplateFormatKeyError(template, data, data_type, format_str, format_name)[source]¶
Bases:
UnitxtError
- class unitxt.templates.TemplatesDict(data_classification_policy: List[str] = None, items: Dict[str, unitxt.artifact.Artifact] = {})[source]¶
Bases:
DictCollection
- class unitxt.templates.TemplatesList(data_classification_policy: List[str] = None, items: List[unitxt.artifact.Artifact] = [])[source]¶
Bases:
ListCollection
- class unitxt.templates.YesNoTemplate(data_classification_policy: List[str] = None, _requirements_list: List[str] | Dict[str, str] = [], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, skip_rendered_instance: bool = True, postprocessors: List[str] = ['processors.to_string_stripped'], instruction: str = '', target_prefix: str = '', title_fields: List[str] = [], serializer: unitxt.serializers.Serializer = None, input_format: str = None, class_field: str = None, label_field: str = None, yes_answer: str = 'Yes', no_answer: str = 'No')[source]¶
Bases:
InputFormatTemplate
A template for generating binary Yes/No questions asking whether an input text is of a specific class.
- input_format:
Defines the format of the question.
- class_field:
Defines the field that contains the name of the class the question asks about.
- label_field:
Defines the field that contains the true label of the input text. If a gold label equals the class name given in class_field, the correct output is self.yes_answer (by default, “Yes”). Otherwise the correct output is self.no_answer (by default, “No”).
- yes_answer:
The output value when a gold label equals the class name given in class_field. Defaults to “Yes”.
- no_answer:
The output value when no gold label equals the class name given in class_field. Defaults to “No”.
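The Yes/No decision described above reduces to a membership test. In this sketch, `yes_no_target` is a hypothetical helper; it assumes the class field holds a single class name and that the label field holds either one label or a list of labels.

```python
# Sketch only: yes_no_target is hypothetical and makes the assumptions stated above.
def yes_no_target(instance, class_field, label_field, yes_answer="Yes", no_answer="No"):
    queried_class = instance[class_field]
    gold = instance[label_field]
    # Accept a single label or a list of labels.
    gold_labels = gold if isinstance(gold, list) else [gold]
    return yes_answer if queried_class in gold_labels else no_answer


match = yes_no_target(
    {"text": "Stocks fell sharply.", "class": "finance", "labels": ["finance", "news"]},
    class_field="class",
    label_field="labels",
)
miss = yes_no_target(
    {"text": "Stocks fell sharply.", "class": "sports", "labels": ["finance", "news"]},
    class_field="class",
    label_field="labels",
)
print(match, miss)
```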
- unitxt.templates.random() → x in the interval [0, 1).