Llava

formats.models.llava

type: HFSystemFormat
model_name: llava-hf/llava-1.5-7b-hf

Explanation about HFSystemFormat

Formats the complete input for the model using the HuggingFace chat template of a given model.

HFSystemFormat expects the input instance to contain:

1. A field named "system_prompt" whose value is a string (potentially empty) that delivers a task-independent opening text.
2. A field named "source" whose value is a string verbalizing the original values in the instance (as read from the source dataset), in the context of the underlying task.
3. A field named "instruction" that contains a (non-None) string.
4. A field whose name is given by the argument 'demos_field', containing a list of dicts, each with the fields "source" and "target" and representing a single demo.
5. A field named "target_prefix" that contains a string used to prefix the target in each demo and to end the whole generated prompt.

HFSystemFormat formats the above fields into a single string to be passed to the model as input. This string overwrites the "source" field of the instance.
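As a rough illustration of how the fields listed above could be arranged into chat turns before a chat template is applied, here is a minimal sketch. The helper name `build_messages`, the default field name `"demos"`, and the choice to merge the system prompt and the instruction into one system turn are all assumptions made for this example; they are not taken from the library's implementation.

```python
from typing import Any, Dict, List


def build_messages(
    instance: Dict[str, Any], demos_field: str = "demos"
) -> List[Dict[str, str]]:
    """Hypothetical helper: map the expected instance fields to chat messages."""
    # Assumption: the system prompt and the instruction share the system turn,
    # joined by a newline; empty strings are skipped.
    system_parts = [instance["system_prompt"], instance["instruction"]]
    system_text = "\n".join(part for part in system_parts if part)

    messages = [{"role": "system", "content": system_text}]
    prefix = instance["target_prefix"]

    # Each demo becomes a user turn followed by an assistant turn whose
    # content starts with the target prefix.
    for demo in instance[demos_field]:
        messages.append({"role": "user", "content": demo["source"]})
        messages.append({"role": "assistant", "content": prefix + demo["target"]})

    # The instance's own verbalized input is the final user turn.
    messages.append({"role": "user", "content": instance["source"]})
    return messages


# Made-up instance, for illustration only.
example = {
    "system_prompt": "You are helpful.",
    "instruction": "Classify the sentiment.",
    "source": "Awful plot.",
    "target_prefix": "Label: ",
    "demos": [{"source": "Great film!", "target": "positive"}],
}
messages = build_messages(example)
```

The resulting list of role/content dicts is the shape that HuggingFace chat templates consume; the final string would then come from rendering that list with the model's template.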

Example:

HFSystemFormat(model_name="HuggingFaceH4/zephyr-7b-beta")

Uses the template defined in the tokenizer_config.json of the model:

"chat_template": "{% for message in messages %}\n{% if message['role'] == 'user' %}\n{{ '<|user|>\n' + message['content'] + eos_token }}\n{% elif message['role'] == 'system' %}\n{{ '<|system|>\n' + message['content'] + eos_token }}\n{% elif message['role'] == 'assistant' %}\n{{ '<|assistant|>\n' + message['content'] + eos_token }}\n{% endif %}\n{% if loop.last and add_generation_prompt %}\n{{ '<|assistant|>' }}\n{% endif %}\n{% endfor %}",

See more details in https://huggingface.co/docs/transformers/main/en/chat_templating
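To make the effect of the zephyr template quoted above more concrete, the sketch below mirrors its logic in plain Python, without loading a tokenizer. The function name `render_zephyr_style` is hypothetical, the `"</s>"` end-of-sequence token is an assumption about the model's tokenizer, and the whitespace handling assumes that newlines directly after Jinja block tags are trimmed when the template is rendered. Treat it as an approximation of the rendered prompt, not as the library's code path.

```python
from typing import Dict, List

# Roles that the quoted template has a branch for; other roles emit nothing.
KNOWN_ROLES = ("user", "system", "assistant")


def render_zephyr_style(
    messages: List[Dict[str, str]],
    eos_token: str = "</s>",  # assumed EOS token
    add_generation_prompt: bool = True,
) -> str:
    """Hypothetical pure-Python mirror of the zephyr-style chat template."""
    parts = []
    for index, message in enumerate(messages):
        if message["role"] in KNOWN_ROLES:
            # '<|role|>\n' + content + eos_token, then the newline that
            # follows the expression in the template.
            parts.append(
                f"<|{message['role']}|>\n{message['content']}{eos_token}\n"
            )
        # The generation prompt sits inside the loop in the template, so it
        # is only emitted together with the last message.
        is_last = index == len(messages) - 1
        if is_last and add_generation_prompt:
            parts.append("<|assistant|>\n")
    return "".join(parts)


chat = [
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "Hi"},
]
prompt = render_zephyr_style(chat)
print(prompt)
```

With the transformers library installed, the equivalent real call is `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` on the model's tokenizer, which renders the template stored in its tokenizer_config.json.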

Read more about catalog usage here.