π Judge Answer Relevance Verbal Good BadΒΆ
templates.rag_eval.answer_relevance.judge_answer_relevance_verbal_good_bad
InputOutputTemplateWithCustomTarget(
input_format="Question: {question}
Prediction: {answer}
",
output_format="{is_relevant}",
postprocessors=[
"processors.extract_from_double_brackets",
"processors.extract_verbal_judgement_bad_good",
"processors.cast_to_float_return_zero_if_failed",
],
reference="{number_val}",
target_prefix="Answer: ",
instruction="You are given a question and a prediction from a model. Please determine to what extent, on a scale of 0 to 10, the prediction answers the question.
Reply with your rating score without any preceding explanation.
",
)
[source]References: processors.cast_to_float_return_zero_if_failed, processors.extract_verbal_judgement_bad_good, processors.extract_from_double_brackets
Read more about catalog usage here.