Comparing Standard and Factored Models in Statistical Machine Translation from English to Slovene Using the Moses System

Authors

  • Sašo Kuntarič
  • Simon Krek
  • Marko Robnik Šikonja

DOI:

https://doi.org/10.4312/slo2.0.2017.1.1-26

Keywords:

statistical machine translation, factored machine translation, Moses system, BLEU, human evaluation

Abstract

Machine translation is a field of computational linguistics that explores the use of software to translate text from one language into another. Factored statistical translation is an extension of statistical machine translation in which linguistic annotation is added at the word level: each word is turned into a vector of factors in an attempt to improve translation quality. We describe the use of the open-source Moses system for factored statistical machine translation from English to Slovene. We built several factored and non-factored language and translation models from a corpus of IT-related texts and used them to translate two IT-related documents. The first was marketing-oriented with a complex structure, while the second was technical with a simpler structure. We used two methods to compare the generated translations with two independent human translations and with a translation produced by the Google Translate service. The first method was the BLEU metric and the second was evaluation by human reviewers. The latter yields a subjective score, which remains very important in the machine translation field. Although the results of the two methods cannot be compared directly because they use different scales, the scores follow well-correlated trends for both texts. The only larger discrepancy appears when factored models are used to translate the second text. In the conclusion we analyse the inter-evaluator agreement and the obtained results. We found that our models are better suited to technical texts, and that factored models bring a larger improvement for complex texts.
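As an illustration of what word-level annotation looks like in practice, the following minimal Python sketch builds and parses factored tokens in the pipe-separated form that Moses reads (for example surface|lemma|tag). The choice of three factors, the English example sentence, the tag labels, and the helper names to_factored and parse_factored are assumptions made for this illustration; they are not taken from the article and do not describe the factor set the authors used.

```python
from typing import List, NamedTuple


class FactoredToken(NamedTuple):
    """One word with its factors (hypothetical three-factor setup)."""
    surface: str  # the word form as it appears in the text
    lemma: str    # the dictionary form
    pos: str      # a part-of-speech tag (illustrative labels only)


def to_factored(tokens: List[FactoredToken]) -> str:
    """Join the factors of each token with '|' and the tokens with spaces."""
    return " ".join("|".join(token) for token in tokens)


def parse_factored(line: str) -> List[FactoredToken]:
    """Split a factored line back into tokens; expects exactly three factors."""
    return [FactoredToken(*item.split("|")) for item in line.split()]


# Hand-annotated example sentence (an assumption, not corpus data).
sentence = [
    FactoredToken("files", "file", "NNS"),
    FactoredToken("are", "be", "VBP"),
    FactoredToken("saved", "save", "VBN"),
]

line = to_factored(sentence)
print(line)
```

Keeping each token as a named tuple makes it straightforward to add or drop factors when experimenting with different factored configurations.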

Published

07.03.2018

How to Cite

Kuntarič, S., Krek, S., & Robnik Šikonja, M. (2018). Comparing Standard and Factored Models in Statistical Machine Translation from English to Slovene Using the Moses System. Slovenščina 2.0: Empirical, Applied and Interdisciplinary Research, 5(1), 1–26. https://doi.org/10.4312/slo2.0.2017.1.1-26

Section

Articles