Meta-Evaluation of a Diagnostic Quality Metric for Machine Translation
Federico Gaspari
2013-01-01
Abstract
Diagnostic evaluation of machine translation (MT) is an approach to evaluation that provides finer-grained information compared to state-of-the-art automatic metrics. This paper evaluates DELiC4MT, a diagnostic metric that assesses the performance of MT systems on user-defined linguistic phenomena. We present the results obtained using this diagnostic metric when evaluating three MT systems that translate from English to French, with a comparison against both human judgements and a set of representative automatic evaluation metrics. In addition, as the diagnostic metric relies on word alignments, the paper compares the margin of error in diagnostic evaluation when using automatic word alignments as opposed to gold-standard manual alignments. We observed that this diagnostic metric is capable of accurately reflecting translation quality, can be used reliably with automatic word alignments and, in general, correlates well with automatic metrics and, more importantly, with human judgements.