Relating Translation Quality Barriers to Source-Text Properties

Federico Gaspari
2014-01-01

Abstract

This paper aims to automatically identify which linguistic phenomena represent barriers to better MT quality. We focus on the translation of news data for two bidirectional language pairs: EN↔ES and EN↔DE. Using the diagnostic MT evaluation toolkit DELiC4MT and a set of human reference translations, we relate translation quality barriers to a selection of 9 source-side PoS-based linguistic checkpoints. Using output from the winning SMT, RbMT, and hybrid systems of the WMT 2013 shared task, translation quality barriers are investigated (in relation to the selected linguistic checkpoints) according to two main variables: (i) the type of the MT approach, i.e. statistical, rule-based or hybrid, and (ii) the human evaluation of MT output, ranked into three quality groups corresponding to good, near miss and poor. We show that the combination of manual quality ranking and automatic diagnostic evaluation on a set of PoS-based linguistic checkpoints is able to identify the specific quality barriers of different MT system types across the four translation directions under consideration.
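As a rough illustration of the analysis described above, the sketch below scores each checkpoint instance by how many of its reference tokens appear in the MT output, then averages those scores per MT approach, quality group, and checkpoint. It is a minimal sketch under stated assumptions: the function names, the record layout, the token-recall scoring, and the toy German examples are all hypothetical, and it does not use or reproduce the DELiC4MT toolkit's actual interface or scoring.

```python
from collections import defaultdict


def checkpoint_recall(reference_tokens, hypothesis_tokens):
    """Fraction of the reference tokens of one checkpoint instance found in the MT output.

    Assumption: a simple token-recall measure stands in for the toolkit's scoring.
    """
    if not reference_tokens:
        return 0.0
    remaining = list(hypothesis_tokens)
    matched = 0
    for token in reference_tokens:
        if token in remaining:
            remaining.remove(token)  # each MT token may be matched only once
            matched += 1
    return matched / len(reference_tokens)


def aggregate_scores(instances):
    """Average recall per (system type, quality group, checkpoint)."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of scores, count]
    for inst in instances:
        key = (inst["system"], inst["quality"], inst["checkpoint"])
        totals[key][0] += checkpoint_recall(inst["ref_tokens"], inst["mt_tokens"])
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Hypothetical toy data: checkpoint labels and sentences are invented for illustration.
instances = [
    {"system": "SMT", "quality": "good", "checkpoint": "noun",
     "ref_tokens": ["Haus"], "mt_tokens": ["das", "Haus", "ist", "alt"]},
    {"system": "SMT", "quality": "good", "checkpoint": "noun",
     "ref_tokens": ["Regierung"], "mt_tokens": ["die", "Verwaltung"]},
    {"system": "RbMT", "quality": "poor", "checkpoint": "verb",
     "ref_tokens": ["hat", "gesagt"], "mt_tokens": ["er", "hat", "sagte"]},
]

scores = aggregate_scores(instances)
for key in sorted(scores):
    print(key, round(scores[key], 2))
```

Under this assumed measure, a low average for a given checkpoint within one system type and quality group would flag that checkpoint as a candidate quality barrier for that system type.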
2014
Keywords: MT quality barriers; diagnostic evaluation; statistical/rule-based/hybrid MT

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12078/27332