📄 Id

Global-MMLU-Lite is a streamlined multilingual evaluation set covering 15 languages. The dataset includes 200 Culturally Sensitive (CS) and 200 Culturally Agnostic (CA) questions per language. The samples in Global-MMLU-Lite correspond to languages that were fully human-translated or post-edited in the original dataset. This initiative was led by Cohere For AI in collaboration with external contributors from industry and academia. The test spans subjects in humanities, social sciences, hard sciences, and other areas. For more information, see: https://huggingface.co/datasets/CohereForAI/Global-MMLU-Lite

Tags: annotations_creators:expert-generated, language:id, language_creators:expert-generated, license:apache-2.0, multilinguality:multilingual, size_categories:10K<n<100K, source_datasets:original, task_categories:question-answering, task_ids:multiple-choice-qa, region:global, category:dataset

cards.global_mmlu_lite_ca.id

type: TaskCard
loader: 
  type: LoadHF
  path: CohereForAI/Global-MMLU-Lite
  name: id
  filtering_lambda: lambda x: x['cultural_sensitivity_label'] == 'CA'
preprocess_steps: 
  - type: SplitRandomMix
    mix: 
      test: test[100%]
      train: test[10%]
  - type: Deduplicate
    by: 
      - question
      - subject
      - answer
  - type: MapInstanceValues
    mappers: 
      answer: 
        A: 0
        B: 1
        C: 2
        D: 3
  - type: ListFieldValues
    fields: 
      - option_a
      - option_b
      - option_c
      - option_d
    to_field: choices
  - type: Rename
    field_to_field: 
      subject: topic
  - type: MapInstanceValues
    mappers: 
      topic: 
        abstract_algebra: abstract algebra
        anatomy: anatomy
        astronomy: astronomy
        business_ethics: business ethics
        clinical_knowledge: clinical knowledge
        college_biology: college biology
        college_chemistry: college chemistry
        college_computer_science: college computer science
        college_mathematics: college mathematics
        college_medicine: college medicine
        college_physics: college physics
        computer_security: computer security
        conceptual_physics: conceptual physics
        econometrics: econometrics
        electrical_engineering: electrical engineering
        elementary_mathematics: elementary mathematics
        formal_logic: formal logic
        global_facts: global facts
        high_school_biology: high school biology
        high_school_chemistry: high school chemistry
        high_school_computer_science: high school computer science
        high_school_european_history: high school european history
        high_school_geography: high school geography
        high_school_government_and_politics: high school government and politics
        high_school_macroeconomics: high school macroeconomics
        high_school_mathematics: high school mathematics
        high_school_microeconomics: high school microeconomics
        high_school_physics: high school physics
        high_school_psychology: high school psychology
        high_school_statistics: high school statistics
        high_school_us_history: high school us history
        high_school_world_history: high school world history
        human_aging: human aging
        human_sexuality: human sexuality
        international_law: international law
        jurisprudence: jurisprudence
        logical_fallacies: logical fallacies
        machine_learning: machine learning
        management: management
        marketing: marketing
        medical_genetics: medical genetics
        miscellaneous: miscellaneous
        moral_disputes: moral disputes
        moral_scenarios: moral scenarios
        nutrition: nutrition
        philosophy: philosophy
        prehistory: prehistory
        professional_accounting: professional accounting
        professional_law: professional law
        professional_medicine: professional medicine
        professional_psychology: professional psychology
        public_relations: public relations
        security_studies: security studies
        sociology: sociology
        us_foreign_policy: us foreign policy
        virology: virology
        world_religions: world religions
task: tasks.qa.multiple_choice.with_topic
templates: templates.qa.multiple_choice.with_topic.all
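The per-instance steps of the card above can be illustrated in plain Python. This is a minimal sketch and not unitxt code: the function name `preprocess` and the sample row are hypothetical, and the topic mapping is approximated by replacing underscores with spaces, which matches every entry listed in the card.

```python
# Plain-Python illustration of the card's per-instance steps (not unitxt code).
ANSWER_TO_INDEX = {"A": 0, "B": 1, "C": 2, "D": 3}


def preprocess(row):
    """Mimic MapInstanceValues, ListFieldValues, Rename and the topic mapping."""
    out = dict(row)
    # MapInstanceValues on "answer": letter -> zero-based index
    out["answer"] = ANSWER_TO_INDEX[str(out["answer"])]
    # ListFieldValues: gather the four option fields into "choices"
    out["choices"] = [out[f"option_{letter}"] for letter in "abcd"]
    # Rename subject -> topic, then map e.g. "college_physics" -> "college physics"
    out["topic"] = out.pop("subject").replace("_", " ")
    return out


# Hypothetical row, invented for illustration only.
row = {
    "question": "What is 2 + 2?",
    "option_a": "3",
    "option_b": "4",
    "option_c": "5",
    "option_d": "6",
    "answer": "B",
    "subject": "elementary_mathematics",
    "cultural_sensitivity_label": "CA",
}
processed = preprocess(row)
print(processed["answer"], processed["choices"], processed["topic"])
```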

Explanation about TaskCard

TaskCard delineates the phases in transforming the source dataset into model input, and specifies the metrics for evaluation of model output.

Args:
loader:

specifies the source address and the loading operator that can access that source and transform it into a unitxt multistream.

preprocess_steps:

list of unitxt operators to process the data source into model input.

task:

specifies the fields (of the already (pre)processed instance) making the inputs, the fields making the outputs, and the metrics to be used for evaluating the model output.

templates:

format strings to be applied on the input fields (specified by the task) and the output fields. The template also carries the instructions and the list of postprocessing steps, to be applied to the model output.

default_template:

a default template, for tasks that come with a single, dataset-specific template.

Explanation about MapInstanceValues

A class used to map instance values into other values.

This class is a type of InstanceOperator, it maps values of instances in a stream using predefined mappers.

Args:
mappers (Dict[str, Dict[str, Any]]):

The mappers to use for mapping instance values. Keys are the names of the fields to undergo mapping, and values are dictionaries that define the mapping from old values to new values. Note that mapped values are defined by their string representation, so mapped values are converted to strings before being looked up in the mappers.

strict (bool):

If True, the mapping is applied strictly. That means if a value does not exist in the mapper, it will raise a KeyError. If False, values that are not present in the mapper are kept as they are.

process_every_value (bool):

If True, all fields to be mapped should be lists, and the mapping is to be applied to their individual elements. If False, mapping is only applied to a field containing a single value.

Examples:

MapInstanceValues(mappers={"a": {"1": "hi", "2": "bye"}}) replaces "1" with "hi" and "2" with "bye" in field "a" in all instances of all streams: instance {"a": 1, "b": 2} becomes {"a": "hi", "b": 2}. Note that the value of "b" remains intact, since field name "b" does not appear in the mappers, and that 1 is cast to "1" before being looked up in the mapper of "a".

MapInstanceValues(mappers={"a": {"1": "hi", "2": "bye"}}, process_every_value=True): Assuming field "a" is a list of values, potentially including "1"-s and "2"-s, this replaces each such "1" with "hi" and each "2" with "bye" in all instances of all streams: instance {"a": ["1", "2"], "b": 2} becomes {"a": ["hi", "bye"], "b": 2}.

MapInstanceValues(mappers={"a": {"1": "hi", "2": "bye"}}, strict=True): To ensure that all values of field "a" are mapped in every instance, use strict=True. Input instance {"a":"3", "b": 2} will raise an exception per the above call, because "3" is not a key in the mapper of "a".

MapInstanceValues(mappers={"a": {str([1,2,3,4]): "All", str([]): "None"}}, strict=True) replaces a list [1,2,3,4] with the string "All" and an empty list by string "None".
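The lookup rules described above (string conversion before lookup, strict mode, per-element mapping) can be sketched in plain Python. This is an illustration of the documented behaviour, not the unitxt implementation; the function names, the default argument values, and the choice to skip fields missing from the instance are assumptions of this sketch.

```python
def _map_value(value, mapper, strict):
    """Look up str(value) in mapper; raise or pass through when it is absent."""
    key = str(value)
    if key in mapper:
        return mapper[key]
    if strict:
        raise KeyError(f"value {key!r} is not in the mapper")
    return value


def map_instance_values(instance, mappers, strict=True, process_every_value=False):
    """Return a copy of instance with the mapped fields replaced."""
    out = dict(instance)
    for field, mapper in mappers.items():
        if field not in out:
            continue  # assumption of this sketch: absent fields are skipped
        if process_every_value:
            out[field] = [_map_value(v, mapper, strict) for v in out[field]]
        else:
            out[field] = _map_value(out[field], mapper, strict)
    return out


mappers = {"a": {"1": "hi", "2": "bye"}}
single = map_instance_values({"a": 1, "b": 2}, mappers)
every = map_instance_values({"a": ["1", "2"], "b": 2}, mappers, process_every_value=True)
as_list = map_instance_values(
    {"a": [1, 2, 3, 4]}, {"a": {str([1, 2, 3, 4]): "All", str([]): "None"}}
)
```

Converting values with `str` before the lookup is what lets an integer `1` match the key `"1"`, and a whole list match the key `str([1, 2, 3, 4])`.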

Explanation about ListFieldValues

Concatenates values of multiple fields into a list, and assigns it to a new field.
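As a rough plain-Python picture of this step (not the unitxt implementation), the operator gathers the named fields, in order, into one list. The function name is hypothetical, and keeping the original fields in the result is an assumption of this sketch.

```python
def list_field_values(instance, fields, to_field):
    """Collect the values of `fields`, in order, into a list stored at `to_field`."""
    out = dict(instance)
    out[to_field] = [instance[name] for name in fields]
    return out


example = list_field_values(
    {"option_a": "red", "option_b": "green", "option_c": "blue", "option_d": "black"},
    fields=["option_a", "option_b", "option_c", "option_d"],
    to_field="choices",
)
```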

Explanation about Rename

Renames fields.

Moves a value from one field to another. If a field name contains a /, the value may move from one branch of a nested instance into another. The source field is removed; when from_field contains a /, only that nested part of the field is removed.

Examples:

Rename(field_to_field={"b": "c"}) will change inputs [{"a": 1, "b": 2}, {"a": 2, "b": 3}] to [{"a": 1, "c": 2}, {"a": 2, "c": 3}]

Rename(field_to_field={"b": "c/d"}) will change inputs [{"a": 1, "b": 2}, {"a": 2, "b": 3}] to [{"a": 1, "c": {"d": 2}}, {"a": 2, "c": {"d": 3}}]

Rename(field_to_field={"b": "b/d"}) will change inputs [{"a": 1, "b": 2}, {"a": 2, "b": 3}] to [{"a": 1, "b": {"d": 2}}, {"a": 2, "b": {"d": 3}}]

Rename(field_to_field={"b/c/e": "b/d"}) will change inputs [{"a": 1, "b": {"c": {"e": 2, "f": 20}}}] to [{"a": 1, "b": {"c": {"f": 20}, "d": 2}}]
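The slash-separated paths in the examples above can be pictured with a small plain-Python sketch. It reproduces the four examples but is not the unitxt implementation; the helper names are hypothetical, and leaving an emptied parent dictionary in place is an assumption of this sketch.

```python
import copy


def _pop_path(node, parts):
    """Remove and return the value found at the nested path `parts`."""
    for part in parts[:-1]:
        node = node[part]
    return node.pop(parts[-1])


def _set_path(node, parts, value):
    """Store `value` at the nested path `parts`, creating dicts as needed."""
    for part in parts[:-1]:
        child = node.get(part)
        if not isinstance(child, dict):
            child = {}
            node[part] = child
        node = child
    node[parts[-1]] = value


def rename(instance, field_to_field):
    """Return a copy of `instance` with each source path moved to its target."""
    out = copy.deepcopy(instance)
    for source, target in field_to_field.items():
        value = _pop_path(out, source.split("/"))
        _set_path(out, target.split("/"), value)
    return out


flat = rename({"a": 1, "b": 2}, {"b": "c"})
nested = rename({"a": 1, "b": 2}, {"b": "c/d"})
same_root = rename({"a": 1, "b": 2}, {"b": "b/d"})
branch = rename({"a": 1, "b": {"c": {"e": 2, "f": 20}}}, {"b/c/e": "b/d"})
```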

Explanation about Deduplicate

Deduplicate the stream based on the given fields.

Args:

by (List[str]): A list of field names to deduplicate by. The combination of these fields' values will be used to determine uniqueness.

Examples:
>>> dedup = Deduplicate(by=["field1", "field2"])
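The uniqueness rule can be sketched in plain Python as follows. This is an illustration rather than the unitxt implementation: the function name is hypothetical, and both keeping the first occurrence and comparing values by their `repr` are assumptions of this sketch.

```python
def deduplicate(stream, by):
    """Yield instances whose combination of `by` values has not been seen yet."""
    seen = set()
    for instance in stream:
        # repr() keeps the key hashable even when a field holds a list or dict
        key = tuple(repr(instance[name]) for name in by)
        if key not in seen:
            seen.add(key)
            yield instance


rows = [
    {"question": "q1", "subject": "anatomy", "answer": "A"},
    {"question": "q1", "subject": "anatomy", "answer": "A"},
    {"question": "q1", "subject": "anatomy", "answer": "B"},
]
unique_rows = list(deduplicate(rows, by=["question", "subject", "answer"]))
```

Here the second row is dropped as a repeat of the first, while the third survives because its `answer` differs.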

Explanation about SplitRandomMix

Splits a multistream into new streams (splits), whose names, source input stream, and amount of instances, are specified by arg 'mix'.

The keys of arg 'mix' are the names of the new streams, the values are of the form: 'name-of-source-stream[percentage-of-source-stream]'. Each input instance, of any input stream, is selected exactly once for inclusion in any of the output streams.

Examples: When processing a multistream made of two streams whose names are 'train' and 'test', by SplitRandomMix(mix = { "train": "train[99%]", "validation": "train[1%]", "test": "test" }) the output is a multistream, whose three streams are named 'train', 'validation', and 'test'. Output stream 'train' is made of randomly selected 99% of the instances of input stream 'train', output stream 'validation' is made of the remaining 1% instances of input 'train', and output stream 'test' is made of the whole of input stream 'test'.

When processing the above input multistream by SplitRandomMix(mix = { "train": "train[50%]+test[0.1]", "validation": "train[50%]+test[0.2]", "test": "test[0.7]" }) the output is a multistream, whose three streams are named 'train', 'validation', and 'test'. Output stream 'train' is made of randomly selected 50% of the instances of input stream 'train' + randomly selected 0.1 (i.e., 10%) of the instances of input stream 'test'. Output stream 'validation' is made of the remaining 50% instances of input 'train' + randomly selected 0.2 (i.e., 20%) of the original instances of input 'test', that were not selected for output 'train', and output stream 'test' is made of the remaining instances of input 'test'.
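The mix syntax in the examples above can be approximated with a short plain-Python sketch. It is not the unitxt implementation: the function names, the fixed seed, and the shuffle-then-slice strategy are assumptions, and the sketch only covers mixes that request at most the whole of each source stream (it raises an error otherwise).

```python
import random
import re

# Matches 'test', 'train[99%]' or 'test[0.1]'
_PART = re.compile(r"^(\w+)(?:\[(\d+(?:\.\d+)?)(%?)\])?$")


def parse_part(part):
    """Return (source stream name, fraction of that stream) for one mix term."""
    match = _PART.match(part.strip())
    if match is None:
        raise ValueError(f"cannot parse mix term {part!r}")
    name, number, percent = match.groups()
    if number is None:
        return name, 1.0
    return name, float(number) / 100 if percent else float(number)


def split_random_mix(streams, mix, seed=0):
    """Shuffle each source once, then hand out consecutive, disjoint slices."""
    rng = random.Random(seed)
    shuffled = {name: rng.sample(items, len(items)) for name, items in streams.items()}
    used = {name: 0.0 for name in streams}  # fraction of each source already taken
    output = {}
    for new_name, spec in mix.items():
        output[new_name] = []
        for part in spec.split("+"):
            source, fraction = parse_part(part)
            start, end = used[source], used[source] + fraction
            if end > 1.0 + 1e-9:
                raise ValueError(f"mix requests more than 100% of {source!r}")
            size = len(shuffled[source])
            output[new_name].extend(shuffled[source][round(start * size):round(end * size)])
            used[source] = end
    return output


streams = {"train": list(range(100)), "test": list(range(100, 110))}
mixed = split_random_mix(
    streams,
    {"train": "train[50%]+test[0.1]", "validation": "train[50%]+test[0.2]", "test": "test[0.7]"},
)
```

Because each source is shuffled once and then sliced consecutively, no instance lands in two output streams.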

Explanation about LoadHF

Loads datasets from the HuggingFace Hub.

It supports loading with or without streaming, and it can filter datasets upon loading.

Args:
path:

The path or identifier of the dataset on the HuggingFace Hub.

name:

An optional dataset name.

data_dir:

Optional directory to store downloaded data.

split:

Optional specification of which split to load.

data_files:

Optional specification of particular data files to load.

revision:

Optional. The revision of the dataset. Often the commit id. Use in case you want to set the dataset version.

streaming (bool):

indicating if streaming should be used.

filtering_lambda (str, optional):

A lambda function for filtering the data after loading.

num_proc (int, optional):

Specifies the number of processes to use for parallel dataset loading.

Example:

Loading glue's mrpc dataset

load_hf = LoadHF(path='glue', name='mrpc')
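Since filtering_lambda is passed as a string, its effect can be pictured in plain Python by turning the string into a callable and keeping only the rows it accepts. This sketch assumes the string is evaluated with `eval`; it does not call LoadHF, and the sample rows are invented for illustration.

```python
# The filter string used by the card above
filtering_lambda = "lambda x: x['cultural_sensitivity_label'] == 'CA'"

# Assumption of this sketch: the string is evaluated into a Python callable
keep = eval(filtering_lambda)

# Hypothetical rows standing in for loaded dataset records
rows = [
    {"question": "q1", "cultural_sensitivity_label": "CA"},
    {"question": "q2", "cultural_sensitivity_label": "CS"},
]
ca_rows = [row for row in rows if keep(row)]
print([row["question"] for row in ca_rows])
```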

References: templates.qa.multiple_choice.with_topic.all, tasks.qa.multiple_choice.with_topic

Read more about catalog usage here.