Technology
BLEU
BLEU (Bilingual Evaluation Understudy) is the industry-standard metric for automatically assessing machine translation quality: it compares MT output against human reference translations using modified n-gram precision, and its scores correlate well with human judgments.
BLEU is a core metric for machine translation (MT) evaluation, introduced by IBM researchers Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu at the 2002 ACL conference. It quantifies translation quality by comparing the machine-generated text (the candidate) against one or more human-created reference translations. The algorithm relies primarily on modified n-gram precision, counting the overlap of word sequences (typically up to 4-grams) between the candidate and the references, with each candidate n-gram's count clipped to its maximum count in any single reference. A brevity penalty is applied to discourage overly short translations. The final BLEU score is a single number between 0 and 1: a score closer to 1.0 indicates higher similarity to the human references, making BLEU a quick, inexpensive alternative to costly human evaluation that correlates well with human judgment.
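The mechanics described above can be sketched in a short, self-contained implementation. This is a minimal illustration of sentence-level BLEU with uniform weights, not the reference implementation from the 2002 paper; production work typically uses an established library such as sacreBLEU, and the function names here are our own.

```python
import math
from collections import Counter

def modified_ngram_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram's count is capped
    by its maximum count in any single reference."""
    cand_counts = Counter(tuple(candidate[i:i + n])
                          for i in range(len(candidate) - n + 1))
    max_ref_counts = Counter()
    for ref in references:
        ref_counts = Counter(tuple(ref[i:i + n])
                             for i in range(len(ref) - n + 1))
        for ngram, count in ref_counts.items():
            max_ref_counts[ngram] = max(max_ref_counts[ngram], count)
    clipped = sum(min(count, max_ref_counts[ngram])
                  for ngram, count in cand_counts.items())
    return clipped, sum(cand_counts.values())

def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU: geometric mean of clipped 1..max_n-gram
    precisions, scaled by a brevity penalty."""
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        clipped, total = modified_ngram_precision(candidate, references, n)
        if clipped == 0:
            return 0.0  # any zero precision sends the geometric mean to 0
        log_precision_sum += math.log(clipped / total)
    # Brevity penalty: compare candidate length c to the closest
    # reference length r (ties broken toward the shorter reference).
    c = len(candidate)
    r = min((len(ref) for ref in references),
            key=lambda ref_len: (abs(ref_len - c), ref_len))
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(log_precision_sum / max_n)
```

For example, scoring `"the quick brown fox jumps over the lazy dog"` against the single reference `"the quick brown fox jumped over the lazy dog"` yields a score strictly between 0 and 1, since only the n-grams involving "jumps" fail to match; an identical candidate and reference score exactly 1.0. Note that because every n-gram precision must be nonzero, unsmoothed sentence-level BLEU frequently collapses to 0 on short sentences, which is why corpus-level aggregation or smoothing is used in practice.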