📄 Empty Input Output Separator

Note

ID: formats.empty_input_output_separator | Type: SystemFormat

{
    "__type__": "system_format",
    "demo_format": "{source}{target_prefix}{target}\n\n",
    "model_input_format": "{system_prompt}{instruction}\n{demos}\n{source}{target_prefix}"
}
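To get a feel for what these two format strings produce, they can be filled in with plain `str.format`. This is only a sketch: it does not call the library, and the instance values (instruction, demos, empty system prompt and target prefix) are hypothetical, chosen for illustration.

```python
# Format strings copied from the catalog entry above.
demo_format = "{source}{target_prefix}{target}\n\n"
model_input_format = "{system_prompt}{instruction}\n{demos}\n{source}{target_prefix}"

# Hypothetical instance values, used only for illustration.
fields = {
    "system_prompt": "",
    "instruction": "Solve the math exercises.",
    "target_prefix": "",
    "source": "1+1",
}
demos = [{"source": "1+2", "target": "3"}, {"source": "4-2", "target": "2"}]

# Each demo is rendered with demo_format, then the pieces are concatenated.
demos_text = "".join(
    demo_format.format(target_prefix=fields["target_prefix"], **demo)
    for demo in demos
)

# The concatenated demos are substituted into the overall format.
prompt = model_input_format.format(demos=demos_text, **fields)
print(repr(prompt))
# 'Solve the math exercises.\n1+23\n\n4-22\n\n\n1+1'
```

With an empty target prefix, each demo's source and target are joined with nothing between them, which is what the name of this format refers to.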

Explanation about SystemFormat

Generates the whole input to the model, from constant strings that are given as args, and from values found in specified fields of the instance.

Important: formats can use the ‘\N’ notation, which inserts a new-line only if the text before it is non-empty and does not already end with a new-line.
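The new-line rule above can be sketched in a few lines of plain Python. This is one reading of the rule, not the library's implementation: it assumes the marker is written as a backslash followed by `N`, and the helper name `apply_new_line_notation` is hypothetical.

```python
MARKER = "\\N"  # assumed spelling of the notation: backslash followed by N


def apply_new_line_notation(text: str) -> str:
    """Replace each marker with a new-line, unless the text built so far
    is empty or already ends with a new-line (hypothetical helper)."""
    pieces = text.split(MARKER)
    out = pieces[0]
    for piece in pieces[1:]:
        # Add a new-line only when something precedes the marker
        # and that something does not already end with a new-line.
        if out and not out.endswith("\n"):
            out += "\n"
        out += piece
    return out


print(repr(apply_new_line_notation("Question\\NAnswer")))    # 'Question\nAnswer'
print(repr(apply_new_line_notation("Question\n\\NAnswer")))  # 'Question\nAnswer'
print(repr(apply_new_line_notation("\\NAnswer")))            # 'Answer'
```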

SystemFormat expects the input instance to contain:

1. A field named “system_prompt” whose value is a string (potentially empty) that delivers a task-independent opening text.
2. A field named “source” whose value is a string verbalizing the original values in the instance (as read from the source dataset), in the context of the underlying task.
3. A field named “instruction” that contains a (non-None) string.
4. A field named with the value in arg ‘demos_field’, containing a list of dicts, each dict with fields “source” and “target”, representing a single demo.
5. A field named “target_prefix” that contains a string used both to prefix the target in each demo and to end the whole generated prompt.

SystemFormat formats the above fields into a single string that serves as the input to the model. This string overwrites field “source” of the instance. Formatting is driven by two args: ‘demo_format’ and ‘model_input_format’. SystemFormat also pops fields “system_prompt”, “instruction”, “target_prefix”, and the field containing the demos from the input instance.

Args:

demos_field (str): the name of the field that contains the demos, being a list of dicts, each with “source” and “target” keys
demo_format (str): formatting string for a single demo, combining fields “source” and “target”
model_input_format (str): overall product format, combining instruction and source (as read from fields “instruction” and “source” of the input instance), together with demos (as formatted into one string)
format_args (Dict[str, str]): additional format args to be used when formatting the different format strings

Example:

When the input instance:

{
    "source": "1+1",
    "target": "2",
    "instruction": "Solve the math exercises.",
    "demos": [{"source": "1+2", "target": "3"}, {"source": "4-2", "target": "2"}]
}

is processed by

system_format = SystemFormat(
    demos_field="demos",
    demo_format="Input: {source}\nOutput: {target}\n\n",
    model_input_format="Instruction: {instruction}\n\n{demos}Input: {source}\nOutput: ",
)

the resulting instance is:

{
    "target": "2",
    "source": "Instruction: Solve the math exercises.\n\nInput: 1+2\nOutput: 3\n\nInput: 4-2\nOutput: 2\n\nInput: 1+1\nOutput: ",
}
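The example above can be reproduced without the library by a small stand-in function. This is a sketch under stated assumptions, not the SystemFormat implementation: `render_prompt` is a hypothetical name, it relies only on `str.format`, and it treats missing “system_prompt” and “target_prefix” fields as empty strings.

```python
from typing import Any, Dict


def render_prompt(
    instance: Dict[str, Any],
    demos_field: str,
    demo_format: str,
    model_input_format: str,
) -> Dict[str, Any]:
    """Hypothetical stand-in for SystemFormat, for illustration only."""
    result = dict(instance)  # shallow copy, so the caller's dict is untouched

    # Pop the fields that are consumed by formatting (assumed defaults: empty).
    system_prompt = result.pop("system_prompt", "")
    instruction = result.pop("instruction", "")
    target_prefix = result.pop("target_prefix", "")
    demos = result.pop(demos_field, [])

    # Render every demo with demo_format and concatenate the results.
    demos_text = "".join(
        demo_format.format(
            source=demo["source"],
            target=demo["target"],
            target_prefix=target_prefix,
        )
        for demo in demos
    )

    # Overwrite "source" with the full model input.
    result["source"] = model_input_format.format(
        system_prompt=system_prompt,
        instruction=instruction,
        demos=demos_text,
        source=result["source"],
        target_prefix=target_prefix,
    )
    return result


instance = {
    "source": "1+1",
    "target": "2",
    "instruction": "Solve the math exercises.",
    "demos": [{"source": "1+2", "target": "3"}, {"source": "4-2", "target": "2"}],
}

formatted = render_prompt(
    instance,
    demos_field="demos",
    demo_format="Input: {source}\nOutput: {target}\n\n",
    model_input_format="Instruction: {instruction}\n\n{demos}Input: {source}\nOutput: ",
)
print(formatted)
```

Only “target” and the rewritten “source” remain in the returned dict, matching the resulting instance shown above.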

Read more about catalog usage here.