Meta-Evaluation of a Diagnostic Quality Metric for Machine Translation

Federico Gaspari
2013-01-01

Abstract

Diagnostic evaluation of machine translation (MT) is an approach to evaluation that provides finer-grained information compared to state-of-the-art automatic metrics. This paper evaluates DELiC4MT, a diagnostic metric that assesses the performance of MT systems on user-defined linguistic phenomena. We present the results obtained using this diagnostic metric when evaluating three MT systems that translate from English to French, with a comparison against both human judgements and a set of representative automatic evaluation metrics. In addition, as the diagnostic metric relies on word alignments, the paper compares the margin of error in diagnostic evaluation when using automatic word alignments as opposed to gold standard manual alignments. We observed that this diagnostic metric is capable of accurately reflecting translation quality, can be used reliably with automatic word alignments and, in general, correlates well with automatic metrics and, more importantly, with human judgements.
2013
ISBN: 978-3-9524207-0-6
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12078/27331