πŸ“„ Chat ApiΒΆ

formats.chat_api

Explanation about ChatAPIFormatΒΆ

Formats output for LLM APIs using OpenAI’s chat schema.

Many API services use OpenAI’s chat format as a standard for conversational models. OpenAIFormat prepares the output in this API-compatible form, converting input instances into OpenAI’s structured chat messages, which support both text and multimedia elements, such as images.

The formatted output can be converted to a list of message dictionaries using json.loads(), making it ready for direct use with OpenAI’s API.

Example:

Given an input instance:

{
    "source": "<img src='https://example.com/image1.jpg'>What's in this image?",
    "target": "A dog",
    "instruction": "Help the user.",
}

When processed by:

system_format = OpenAIFormat()

The resulting formatted output is:

{
    "target": "A dog",
    "source": '[{"role": "system", "content": "Help the user."}, '
              '{"role": "user", "content": [{"type": "image_url", '
              '"image_url": {"url": "https://example.com/image1.jpg", "detail": "low"}}, '
              '{"type": "text", "text": "What\'s in this image?"}]}]'
}

The source field is a JSON-formatted string. To make it ready for OpenAI’s API, convert it to a list of messages using json.loads():

import json

from openai import OpenAI

# Assumes OPENAI_API_KEY is set in the environment.
client = OpenAI()

# Convert the JSON string into a list of message dictionaries.
messages = json.loads(formatted_output["source"])

response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
)

The resulting messages is now a list of message dictionaries ready for sending to the OpenAI API.
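In a full pipeline, this format is usually applied through its catalog entry, formats.chat_api, rather than instantiated directly. A minimal sketch, assuming unitxt’s load_dataset helper; the card and template names below are illustrative:

from unitxt import load_dataset

# "cards.sst2" and the template name are illustrative assumptions;
# substitute any catalog card and a compatible template.
dataset = load_dataset(
    card="cards.sst2",
    template="templates.classification.multi_class.default",
    format="formats.chat_api",
)

Each resulting instance then carries the chat-formatted messages in its source field, as in the example above.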

By default, the instruction in the template is placed in a turn with a β€˜system’ role. However, some chat tokenizers will not add the model’s default system prompt if there is a turn with an explicit β€˜system’ role. To keep the default system prompt, set place_instruction_in_user_turns=True. This causes the template’s instruction to be placed in a turn with a β€˜user’ role instead, as shown in the sketch below. Note that the instruction is then also placed in every demo turn (if demos are generated).
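For example, a minimal sketch reusing OpenAIFormat from the example above:

# Move the template instruction into user turns so the tokenizer's
# default system prompt is preserved.
system_format = OpenAIFormat(place_instruction_in_user_turns=True)

With this setting, the β€œHelp the user.” instruction from the input instance above is carried in a β€˜user’ turn rather than a separate β€˜system’ turn.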

Read more about catalog usage here.