📄 Generic Single Turn With Reference

templates.response_assessment.rating.generic_single_turn_with_reference

InputOutputTemplate(
    instruction="Please act as an impartial judge and evaluate the quality of the response provided by an AI assistant to the user input displayed below. Your evaluation should consider factors such as the helpfulness, relevance, accuracy, depth, creativity, and level of detail of the response. You will be given a reference answer and the assistant's answer. Begin your evaluation by comparing the assistant's answer with the reference answer. Identify and correct any mistakes. Be as objective as possible. After providing your explanation, you must rate the response on a scale of 1 to 10 by strictly following this format: "[[rating]]", for example: "Rating: [[5]]".

",
    input_format="[User input]
{question}

[The Start of Reference Answer]
{reference_answer}
[The End of Reference Answer]

[The Start of Assistant's Answer]
{answer}
[The End of Assistant's Answer]",
    output_format="[[{rating}]]",
    postprocessors=[
        "processors.extract_mt_bench_rating_judgment",
    ],
)

Explanation about InputOutputTemplate

Generate field ‘source’ from fields designated as input, and fields ‘target’ and ‘references’ from fields designated as output, of the processed instance.

The arguments specify the format strings used to glue together the input fields and the reference fields of the processed instance: into one string each for ‘source’ and ‘target’, and into a list of strings for ‘references’.
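To make the gluing step concrete, here is a minimal sketch that mimics it with plain Python string formatting. It does not call the library: the `render` helper and the sample field values are hypothetical, and only the two format strings are taken from the template above. How the library itself places the instruction and builds ‘references’ may differ from this simplification.

```python
# Format strings copied from the template above.
INPUT_FORMAT = (
    "[User input]\n{question}\n\n"
    "[The Start of Reference Answer]\n{reference_answer}\n"
    "[The End of Reference Answer]\n\n"
    "[The Start of Assistant's Answer]\n{answer}\n"
    "[The End of Assistant's Answer]"
)
OUTPUT_FORMAT = "[[{rating}]]"


def render(input_fields: dict, output_fields: dict) -> dict:
    """Hypothetical helper: fill the format strings from the instance fields."""
    source = INPUT_FORMAT.format(**input_fields)
    target = OUTPUT_FORMAT.format(**output_fields)
    # Assumption: with a single gold rating, 'references' holds just the target.
    return {"source": source, "target": target, "references": [target]}


# Made-up instance, for illustration only.
example = render(
    {
        "question": "What is 2 + 2?",
        "reference_answer": "4",
        "answer": "2 + 2 equals 4.",
    },
    {"rating": 9},
)
print(example["source"])
print(example["target"])
```

The input fields end up in ‘source’ and the output field ends up in ‘target’ and ‘references’, which is the split the explanation above describes.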

References: processors.extract_mt_bench_rating_judgment
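The instruction asks the judge to end with a rating in the form "[[rating]]", so the postprocessor has to pull that number back out of free text. The sketch below shows one way this could be done with a regular expression. It is an illustration, not the library's implementation: `extract_rating` and `RATING_PATTERN` are hypothetical names, and any rescaling or default value the real processor applies is not reproduced here.

```python
import re
from typing import Optional

# Matches a number wrapped in double square brackets, e.g. "[[5]]" or "[[7.5]]".
RATING_PATTERN = re.compile(r"\[\[(\d+(?:\.\d+)?)\]\]")


def extract_rating(judgment: str) -> Optional[float]:
    """Hypothetical helper: return the first bracketed rating, or None if absent."""
    match = RATING_PATTERN.search(judgment)
    return float(match.group(1)) if match else None


# Made-up judge outputs, for illustration only.
print(extract_rating("The answer matches the reference. Rating: [[5]]"))
print(extract_rating("The judge forgot to give a rating."))
```

Returning `None` for a missing rating keeps malformed judge outputs distinguishable from genuinely low scores; a real pipeline may prefer a numeric default instead.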

Read more about catalog usage here.