Mt Bench Single Turn
templates.response_assessment.rating.mt_bench_single_turn
type: InputOutputTemplate
instruction: "Please act as an impartial judge and evaluate the quality of the response provided by an AI assistant to the user question displayed below. Your evaluation should consider factors such as the helpfulness, relevance, accuracy, depth, creativity, and level of detail of the response. Begin your evaluation by providing a short explanation. Be as objective as possible. After providing your explanation, you must rate the response on a scale of 1 to 10 by strictly following this format: \"[[rating]]\", for example: \"Rating: [[5]]\".\n\n"
input_format: "[Question]\n{question}\n\n[The Start of Assistant's Answer]\n{answer}\n[The End of Assistant's Answer]"
output_format: "[[{rating}]]"
postprocessors:
- processors.extract_mt_bench_rating_judgment
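To illustrate what this entry does, here is a minimal sketch in plain Python. It does not use the library's own classes. The helper names `render_input` and `extract_rating` are hypothetical, and the regular expression is only an assumption about how a `[[rating]]` marker could be parsed; the actual `processors.extract_mt_bench_rating_judgment` processor may behave differently (for example, it may rescale the value).

```python
import re

# Format string copied from the input_format field of the catalog entry.
INPUT_FORMAT = (
    "[Question]\n{question}\n\n"
    "[The Start of Assistant's Answer]\n{answer}\n"
    "[The End of Assistant's Answer]"
)


def render_input(question: str, answer: str) -> str:
    """Fill the input_format placeholders (hypothetical helper)."""
    return INPUT_FORMAT.format(question=question, answer=answer)


def extract_rating(judge_output: str):
    """Return the number inside the first [[...]] marker, or None.

    Assumption: this only mimics the idea of the postprocessor; it is
    not the library's implementation.
    """
    match = re.search(r"\[\[(\d+(?:\.\d+)?)\]\]", judge_output)
    return float(match.group(1)) if match else None


prompt = render_input("What is 2 + 2?", "4")
rating = extract_rating("The answer is correct and concise. Rating: [[9]]")
```

Here `prompt` holds the text that would follow the instruction in the judge's input, and `rating` holds the numeric score parsed from the judge's reply.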
Explanation about InputOutputTemplate
Generate field "source" from fields designated as input, and fields "target" and "references" from fields designated as output, of the processed instance.
Args specify the formatting strings with which to glue together the input and reference fields of the processed instance into one string ("source" and "target"), and into a list of strings ("references").
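The description above can be sketched roughly as follows. This is a simplified illustration, not the library's code: the function `apply_template` and its parameters are hypothetical, and it assumes a single output field so that "references" is a one-element list holding the same string as "target".

```python
def apply_template(instance, input_format, output_format,
                   input_fields, output_fields):
    """Build "source", "target" and "references" from one instance.

    Hypothetical helper that mirrors the description only.
    """
    inputs = {name: instance[name] for name in input_fields}
    outputs = {name: instance[name] for name in output_fields}

    # Glue the input fields into one string.
    source = input_format.format(**inputs)
    # Glue the output fields into one string.
    target = output_format.format(**outputs)

    # Assumption: one output value yields a single reference.
    return {"source": source, "target": target, "references": [target]}


result = apply_template(
    instance={"question": "What is 2 + 2?", "answer": "4", "rating": 7},
    input_format="[Question]\n{question}\n\n[Answer]\n{answer}",
    output_format="[[{rating}]]",
    input_fields=["question", "answer"],
    output_fields=["rating"],
)
```

With these inputs, `result["target"]` is the string `[[7]]`, matching the `output_format` of the entry above.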
References: processors.extract_mt_bench_rating_judgment