Vision Full
benchmarks.vision_full
Benchmark(
    subsets={
        "doc_vqa_default": DatasetRecipe(
            card="cards.doc_vqa.lmms_eval",
        ),
        "info_vqa_default": DatasetRecipe(
            card="cards.info_vqa_lmms_eval",
        ),
        "chart_qa_default": DatasetRecipe(
            card="cards.chart_qa_lmms_eval",
        ),
        "ai2d_default": DatasetRecipe(
            card="cards.ai2d",
        ),
        "websrc_default": DatasetRecipe(
            card="cards.websrc",
        ),
        "doc_vqa_llama_vision_template": DatasetRecipe(
            card="cards.doc_vqa.lmms_eval",
            template=MultiReferenceTemplate(
                input_format="{context} Read the text in the image carefully and answer the question with the text as seen exactly in the image. For yes/no questions, just respond Yes or No. If the answer is numeric, just respond with the number and nothing else. If the answer has multiple words, just respond with the words and absolutely nothing else. Never respond in a sentence or a phrase.\nQuestion: {question}",
                references_field="answers",
            ),
            format="formats.chat_api",
        ),
        "info_vqa_llama_vision_template": DatasetRecipe(
            card="cards.info_vqa_lmms_eval",
            template=MultiReferenceTemplate(
                input_format="{context} Read the text in the image carefully and answer the question with the text as seen exactly in the image. For yes/no questions, just respond Yes or No. If the answer is numeric, just respond with the number and nothing else. If the answer has multiple words, just respond with the words and absolutely nothing else. Never respond in a sentence or a phrase.\nQuestion: {question}",
                references_field="answers",
            ),
            format="formats.chat_api",
        ),
        "chart_qa_llama_vision_template": DatasetRecipe(
            card="cards.chart_qa_lmms_eval",
            template=MultiReferenceTemplate(
                input_format="{context} {question}\nAnswer the question with a single word.",
                references_field="answers",
            ),
            format="formats.chat_api",
        ),
        "ai2d_llama_vision_template": DatasetRecipe(
            card="cards.ai2d",
            template=MultipleChoiceTemplate(
                input_format="{context} Look at the scientific diagram carefully and answer the following question: {question}\n{choices}\nRespond only with the correct option digit.",
                choices_separator="\n",
                target_field="answer",
                enumerator="capitals",
            ),
            format="formats.chat_api",
        ),
    },
)
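As a rough illustration of how the `input_format` strings in the listing turn into prompts, the sketch below fills the `chart_qa_llama_vision_template` string with Python's `str.format`. It assumes that placeholders are substituted by field name, which this page does not state explicitly. `render_prompt` is a hypothetical helper, the sample `context` and `question` values are invented, and the `<image>` marker only stands in for the image content handled by the real pipeline.

```python
# Illustrative only: mimics placeholder substitution with plain str.format.
# The helper name and the field values below are made up for this example.
CHART_QA_INPUT_FORMAT = (
    "{context} {question}\nAnswer the question with a single word."
)


def render_prompt(input_format: str, fields: dict) -> str:
    """Fill the named placeholders of an input_format string."""
    return input_format.format(**fields)


prompt = render_prompt(
    CHART_QA_INPUT_FORMAT,
    {"context": "<image>", "question": "Which year has the highest value?"},
)
print(prompt)
```

The same substitution idea applies to the longer DocVQA and InfoVQA templates, where the question is appended on its own line after the instructions.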
Explanation about MultipleChoiceTemplate
Formats the input that specifies a multiple-choice question, with a list of possible answers to choose from, and identifies the correct answer.
- Args:
- target_prefix (str): Optional prefix that can be added before the target label in
generated prompts or outputs.
- choices_field (str): The key under which the multiple choices are stored in the
input and reference dictionaries.
- target_field (str): The key under which the correct choice is stored in the
reference dictionary (can be integer index or textual label).
- choices_separator (str): A string used to join formatted
choices (e.g. ", ").
- source_choice_format (str): A Python format string used for displaying each choice
in the input fields (e.g. "{choice_numeral}. {choice_text}").
- target_choice_format (str): A Python format string used for displaying each choice
in the target or final output (e.g. "{choice_numeral}").
- enumerator (str): Determines how choice numerals are enumerated. Possible values
include "capitals", "lowercase", "numbers", or "roman".
- shuffle_choices (bool): If True, shuffle the choices. The shuffling seed can be
set with shuffle_choices_seed.
- shuffle_choices_seed (int, optional): If provided, the choices are shuffled with
this fixed integer seed for reproducibility.
- sort_choices_by_length (bool): If True, sorts choices
by their length (ascending).
- sort_choices_alphabetically (bool): If True, sorts choices
in alphabetical order.
- reverse_choices (bool): If True, reverses the order of the choices after any
sorting has been applied. Defaults to False to preserve backward compatibility.
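The arguments above can be pictured with a small stand-alone sketch. It does not call the library: `render_choices` and the `ENUMERATORS` table are hypothetical, the default format and separator are simply the example values quoted in the argument list, and the relative order of the two sort flags is an assumption. Only the rule that reversal happens after sorting comes from the description above. Shuffling is left out for brevity.

```python
import string

# Hypothetical lookup mirroring the enumerator values listed above.
ENUMERATORS = {
    "capitals": list(string.ascii_uppercase),
    "lowercase": list(string.ascii_lowercase),
    "numbers": [str(i) for i in range(1, 27)],
    "roman": ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X"],
}


def render_choices(
    choices,
    enumerator="capitals",
    source_choice_format="{choice_numeral}. {choice_text}",
    choices_separator=", ",
    sort_choices_by_length=False,
    sort_choices_alphabetically=False,
    reverse_choices=False,
):
    """Return the choices joined into one string for the prompt."""
    ordered = list(choices)
    if sort_choices_by_length:
        ordered.sort(key=len)  # ascending length
    if sort_choices_alphabetically:
        ordered.sort()
    if reverse_choices:
        ordered.reverse()  # applied after any sorting
    numerals = ENUMERATORS[enumerator]
    # zip stops at the shorter sequence, so this sketch silently drops
    # choices beyond the number of available numerals.
    return choices_separator.join(
        source_choice_format.format(choice_numeral=n, choice_text=c)
        for n, c in zip(numerals, ordered)
    )


rendered = render_choices(
    ["root", "stem", "leaf"], enumerator="capitals", choices_separator="\n"
)
print(rendered)
```

With `enumerator="capitals"` and a newline separator, as in `ai2d_llama_vision_template`, each choice lands on its own line prefixed by a capital letter.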
Explanation about DatasetRecipe
This class represents a standard recipe for data processing and preparation.
It can be used to prepare a recipe with all necessary steps, refiners, and renderers included, and it allows the various parameters and steps to be set sequentially.
- Args:
- card (TaskCard):
TaskCard object associated with the recipe.
- template (Template, optional):
Template object to be used for the recipe.
- system_prompt (SystemPrompt, optional):
SystemPrompt object to be used for the recipe.
- loader_limit (int, optional):
Specifies the maximum number of instances per stream to be returned from the loader (used to reduce loading time in large datasets)
- format (SystemFormat, optional):
SystemFormat object to be used for the recipe.
- metrics (List[str]):
list of catalog metrics to use with this recipe.
- postprocessors (List[str]):
list of catalog processors to apply at post processing. (Not recommended to use from here)
- group_by (List[Union[str, List[str]]]):
list of task_data or metadata keys to group global scores by.
- train_refiner (StreamRefiner, optional):
Train refiner to be used in the recipe.
- max_train_instances (int, optional):
Maximum training instances for the refiner.
- validation_refiner (StreamRefiner, optional):
Validation refiner to be used in the recipe.
- max_validation_instances (int, optional):
Maximum validation instances for the refiner.
- test_refiner (StreamRefiner, optional):
Test refiner to be used in the recipe.
- max_test_instances (int, optional):
Maximum test instances for the refiner.
- demos_pool_size (int, optional):
Size of the demos pool. Use -1 to take the whole of the stream "demos_taken_from".
- demos_pool (List[Dict[str, Any]], optional):
A list of instances from which to make the demos_pool.
- num_demos (int, optional):
Number of demos to add to each instance, to become part of the source to be generated for this instance.
- demos_taken_from (str, optional):
Specifies the stream from which the demos are taken. Default is "train".
- demos_field (str, optional):
Field name for demos. Default is "demos". The num_demos demos selected for an instance are stored in this field of that instance.
- demos_pool_field_name (str, optional):
field name to maintain the demos_pool, until sampled from, in order to make the demos. Defaults to constants.demos_pool_field.
- demos_removed_from_data (bool, optional):
Whether to remove the demos taken to demos_pool from the source data. Default is True.
- sampler (Sampler, optional):
The Sampler used to select the demonstrations when num_demos > 0.
- skip_demoed_instances (bool, optional):
whether to skip pushing demos to an instance whose demos_field is already populated. Defaults to False.
- steps (List[StreamingOperator], optional):
List of StreamingOperator objects to be used in the recipe.
- augmentor (Augmentor):
Augmentor to be used to pseudo-randomly augment the source text.
- instruction_card_index (int, optional):
Index of instruction card to be used for preparing the recipe.
- template_card_index (int, optional):
Index of template card to be used for preparing the recipe.
- Methods:
- prepare():
This overridden method is used for preparing the recipe by arranging all the steps, refiners, and renderers in a sequential manner.
- Raises:
- AssertionError:
If both template and template_card_index are specified at the same time.
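To make the demo-related arguments more concrete, here is a minimal sketch of the idea in plain Python. It is not the library's implementation: `build_demos` is a hypothetical helper, the pool is assumed to be taken from the head of the stream, the `seed` parameter is invented for reproducibility, and uniform random sampling stands in for whatever `sampler` is configured.

```python
import random


def build_demos(train, test, demos_pool_size, num_demos,
                demos_removed_from_data=True, seed=0):
    """Attach num_demos demonstrations to every test instance (sketch)."""
    # -1 means the whole stream becomes the demos pool.
    size = len(train) if demos_pool_size == -1 else demos_pool_size
    assert num_demos <= size, "cannot sample more demos than the pool holds"
    demos_pool = train[:size]
    # Optionally drop the pooled instances from the remaining training data.
    remaining_train = train[size:] if demos_removed_from_data else list(train)
    rng = random.Random(seed)
    with_demos = [
        # Store the sampled demos under the default "demos" field.
        {**instance, "demos": rng.sample(demos_pool, num_demos)}
        for instance in test
    ]
    return remaining_train, with_demos


train = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(5)]
test = [{"question": "q_test", "answer": "a_test"}]
remaining_train, with_demos = build_demos(
    train, test, demos_pool_size=3, num_demos=2
)
print(len(remaining_train), len(with_demos[0]["demos"]))
```

In this sketch a pool of three instances is carved out of a five-instance training stream, two demos are sampled for the single test instance, and the two unpooled training instances remain available.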
References: cards.info_vqa_lmms_eval, cards.chart_qa_lmms_eval, cards.doc_vqa.lmms_eval, formats.chat_api, cards.websrc, cards.ai2d
Read more about catalog usage here.