📄 User Assistant

formats.user_assistant

SystemFormat(
    demo_format="<|user|>
{source}
<|assistant|>
 {target_prefix}{target}

",
    model_input_format="{system_prompt}{instruction}{demos}<|user|>
{source}
<|assistant|>
{target_prefix}",
)

Explanation about SystemFormat

Generates the whole input to the model, from constant strings that are given as args, and from values found in specified fields of the instance.

Important: formats can use the '\N' notation, which stands for a new-line that is inserted only if the text before it is non-empty and does not already end with a new-line.
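The '\N' rule can be illustrated with a small stand-alone sketch. This is not the library's implementation: the helper name resolve_capital_n is hypothetical, and the handling of edge cases (for example, whitespace around the marker) is an assumption based only on the sentence above.

```python
# Hypothetical helper, NOT part of the library: a minimal reading of the
# '\N' rule described above.
MARKER = r"\N"  # raw string: a backslash followed by a capital N


def resolve_capital_n(text: str) -> str:
    # Split on the marker and re-join, adding a new-line only when the
    # text built so far is non-empty and does not already end with one.
    pieces = text.split(MARKER)
    result = pieces[0]
    for piece in pieces[1:]:
        if result and not result.endswith("\n"):
            result += "\n"
        result += piece
    return result


print(repr(resolve_capital_n(r"Instruction\NInput")))
print(repr(resolve_capital_n(r"\NInput")))
```

Under this reading, a marker at the very start of a string, or directly after an existing new-line, contributes nothing, so a format such as "{instruction}\N{demos}" would not produce a stray blank line when the instruction is empty.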

SystemFormat expects the input instance to contain:

1. A field named “system_prompt” whose value is a string (potentially empty) that delivers a task-independent opening text.
2. A field named “source” whose value is a string verbalizing the original values in the instance (as read from the source dataset), in the context of the underlying task.
3. A field named “instruction” that contains a (non-None) string.
4. A field whose name is the value of arg 'demos_field', containing a list of dicts, each dict with fields “source” and “target”, representing a single demo.
5. A field named “target_prefix” that contains a string to prefix the target in each demo, and to end the whole generated prompt.

SystemFormat formats the above fields into a single string to be input to the model. This string overwrites field “source” of the instance. Formatting is driven by two args: 'demo_format' and 'model_input_format'. SystemFormat also pops fields “system_prompt”, “instruction”, “target_prefix”, and the field containing the demos out from the input instance.

Args:

demos_field (str): the name of the field that contains the demos, being a list of dicts, each with “source” and “target” keys
demo_format (str): formatting string for a single demo, combining fields “source” and “target”
model_input_format (str): overall product format, combining instruction and source (as read from fields “instruction” and “source” of the input instance), together with demos (as formatted into one string)
format_args (Dict[str,str]): additional format args to be used when formatting the different format strings

Example:

When the input instance:

{
    "source": "1+1",
    "target": "2",
    "instruction": "Solve the math exercises.",
    "demos": [{"source": "1+2", "target": "3"}, {"source": "4-2", "target": "2"}]
}

is processed by

system_format = SystemFormat(
    demos_field=constants.demos_field,
    demo_format="Input: {source}\nOutput: {target}\n\n",
    model_input_format="Instruction: {instruction}\n\n{demos}Input: {source}\nOutput: ",
)

the resulting instance is:

{
    "target": "2",
    "source": "Instruction: Solve the math exercises.\n\nInput: 1+2\nOutput: 3\n\nInput: 4-2\nOutput: 2\n\nInput: 1+1\nOutput: ",
}
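The transformation in this example can be approximated in plain Python, without importing the library. The sketch below is an illustration of the behaviour described above, not SystemFormat itself: the function format_instance is hypothetical, it relies only on str.format, it ignores 'format_args' and the '\N' notation, and it assumes that missing “system_prompt” and “target_prefix” fields default to empty strings.

```python
# Hypothetical stand-in for the described behaviour; NOT the library's code.
def format_instance(instance, demos_field, demo_format, model_input_format):
    instance = dict(instance)  # work on a copy, leave the caller's dict intact

    # Pop the fields that the description says are removed from the instance.
    system_prompt = instance.pop("system_prompt", "")  # assumed default
    instruction = instance.pop("instruction", "")
    target_prefix = instance.pop("target_prefix", "")  # assumed default
    demos = instance.pop(demos_field, [])

    # Format each demo on its own, then join them into one string.
    demos_text = "".join(
        demo_format.format(
            source=demo["source"],
            target=demo["target"],
            target_prefix=target_prefix,
        )
        for demo in demos
    )

    # Overwrite "source" with the full model input.
    instance["source"] = model_input_format.format(
        system_prompt=system_prompt,
        instruction=instruction,
        demos=demos_text,
        source=instance["source"],
        target_prefix=target_prefix,
    )
    return instance


result = format_instance(
    {
        "source": "1+1",
        "target": "2",
        "instruction": "Solve the math exercises.",
        "demos": [{"source": "1+2", "target": "3"}, {"source": "4-2", "target": "2"}],
    },
    demos_field="demos",  # assumed to match constants.demos_field
    demo_format="Input: {source}\nOutput: {target}\n\n",
    model_input_format="Instruction: {instruction}\n\n{demos}Input: {source}\nOutput: ",
)
print(result)
```

With these inputs the sketch yields the same “source” string as the resulting instance shown above, and only the “target” and “source” fields remain.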

Read more about catalog usage here.