Normalized Sacrebleu

metrics.normalized_sacrebleu

NormalizedSacrebleu(
    language_to_tokenizer={
        "german": None,
        "deutch": None,
        "de": None,
        "french": None,
        "fr": None,
        "romanian": None,
        "ro": None,
        "english": None,
        "en": None,
        "spanish": None,
        "es": None,
        "portuguese": None,
        "pt": None,
        "arabic": "intl",
        "ar": "intl",
        "korean": "ko-mecab",
        "ko": "ko-mecab",
        "japanese": "ja-mecab",
        "ja": "ja-mecab",
    },
)

Explanation about NormalizedSacrebleu

SacreBLEU metric, implemented with the MapReduceMetric pattern.

This implementation uses the official sacrebleu library for tokenization and BLEU computation. It follows the map-reduce pattern so that the score is computed at the corpus level, matching the behavior of the HuggingFace version.
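The map-reduce idea can be illustrated with a minimal, standard-library sketch. It is not the library's code: it assumes whitespace tokenization, a single reference per prediction, and no smoothing, and the names `map_instance` and `reduce_corpus` are hypothetical. The map step collects n-gram statistics per instance, and the reduce step sums them before computing a single score, which is what distinguishes corpus-level BLEU from an average of sentence-level scores.

```python
import math
from collections import Counter

MAX_ORDER = 4  # BLEU combines 1-gram to 4-gram precisions


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def map_instance(prediction, reference):
    """Map step (hypothetical name): per-instance statistics, not a score."""
    pred, ref = prediction.split(), reference.split()  # assumption: whitespace tokens
    matches, totals = [], []
    for n in range(1, MAX_ORDER + 1):
        pred_ngrams, ref_ngrams = ngram_counts(pred, n), ngram_counts(ref, n)
        # Clipped matches: an n-gram counts at most as often as in the reference.
        matches.append(sum((pred_ngrams & ref_ngrams).values()))
        totals.append(sum(pred_ngrams.values()))
    return {"matches": matches, "totals": totals,
            "pred_len": len(pred), "ref_len": len(ref)}


def reduce_corpus(stats):
    """Reduce step (hypothetical name): sum statistics, then score once in [0, 1]."""
    matches = [sum(s["matches"][i] for s in stats) for i in range(MAX_ORDER)]
    totals = [sum(s["totals"][i] for s in stats) for i in range(MAX_ORDER)]
    pred_len = sum(s["pred_len"] for s in stats)
    ref_len = sum(s["ref_len"] for s in stats)
    if pred_len == 0 or min(matches) == 0:
        return 0.0  # this sketch applies no smoothing
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / MAX_ORDER
    brevity_penalty = 1.0 if pred_len >= ref_len else math.exp(1 - ref_len / pred_len)
    return brevity_penalty * math.exp(log_precision)


pairs = [
    ("the cat sat on the mat today", "the cat sat on the mat"),
    ("a quick brown fox jumps", "a quick brown fox jumps"),
]
score = reduce_corpus([map_instance(p, r) for p, r in pairs])
print(f"corpus-level BLEU (sketch): {score:.3f}")
```

Because the sketch skips sacrebleu's tokenizers and smoothing, its numbers will generally differ from the real metric; only the split into a per-instance map and a corpus-wide reduce is the point.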

Range: [0, 1] (higher is better)

Reference: Post, M. 2018. A Call for Clarity in Reporting BLEU Scores.
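The `language_to_tokenizer` argument in the signature maps a language name or code to a sacrebleu tokenizer name. How the metric consumes this mapping is not documented here, so the following is only a sketch under stated assumptions: that `None` means "use sacrebleu's default tokenizer" (`"13a"`), that lookup is case-insensitive, and that unknown languages are rejected. The helper `resolve_tokenizer` is hypothetical, and the mapping is an excerpt of the full default.

```python
# Excerpt of the default mapping from the signature above.
LANGUAGE_TO_TOKENIZER = {
    "english": None,
    "en": None,
    "arabic": "intl",
    "ar": "intl",
    "korean": "ko-mecab",
    "ko": "ko-mecab",
    "japanese": "ja-mecab",
    "ja": "ja-mecab",
}


def resolve_tokenizer(language, mapping=LANGUAGE_TO_TOKENIZER, default="13a"):
    """Hypothetical helper: return the tokenizer name for a language.

    Assumption: a None entry falls back to sacrebleu's default tokenizer.
    """
    key = language.strip().lower()  # assumption: lookup ignores case and padding
    if key not in mapping:
        raise ValueError(f"No tokenizer configured for language: {language!r}")
    return mapping[key] or default


print(resolve_tokenizer("Japanese"))
print(resolve_tokenizer("en"))
```

Keeping both full names and two-letter codes as keys, as the default mapping does, lets datasets that label languages either way resolve to the same tokenizer.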