Prague hosts machine translation marathon
Prague’s Charles University recently hosted an unusual marathon that tested the capacity of various machine translation systems. The annual event is part of the Euromatrix project, which aims to establish machine translation systems for all European languages. The participants had a week to translate some 12,000 sentences from various newspapers and news sites. In the coming weeks their output will be compared with translations done by professional “human” translators. Ruth Fraňková spoke to Ondřej Bojar from the Institute of Formal and Applied Linguistics, which is taking part in the Euromatrix project:
Is Czech a difficult language compared to other European languages?
“Czech has some specific properties that make it particularly difficult for translation, for example from English. The main difference between Czech and English is the rich morphology in Czech. While in English you have just a single form of a word, say ‘green’ for the colour, in Czech you have seven cases, four genders and two numbers. Not all these combinations are different on the surface level, but the number of possible Czech word forms is much higher and the system has to choose the correct one, so this is a challenge.”
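To make the arithmetic behind that answer concrete, here is a minimal sketch that enumerates the grammatical slots a single Czech adjective has to fill. The category labels and the function name adjective_slots are assumptions chosen for illustration; they are not taken from the interview or from any Euromatrix system. As the answer notes, several slots share the same surface form, so the count is an upper bound on distinct forms, not the number of different spellings.

```python
from itertools import product

# Illustrative labels for the categories mentioned in the interview
# (seven cases, four genders, two numbers). The label names are assumptions.
CASES = ["nominative", "genitive", "dative", "accusative",
         "vocative", "locative", "instrumental"]
GENDERS = ["masculine animate", "masculine inanimate", "feminine", "neuter"]
NUMBERS = ["singular", "plural"]


def adjective_slots():
    """Return every (case, gender, number) combination an adjective can occupy."""
    return list(product(CASES, GENDERS, NUMBERS))


slots = adjective_slots()
print(len(slots))  # 7 * 4 * 2 = 56 slots, versus one form for English "green"
```

A translation system working from English has to pick one of these slots for each adjective it outputs, which is the choice the answer describes as the challenge.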
What about word order?
“The word order actually helps us when we are translating from English to Czech, because Czech allows nearly any permutation of words in the sentence as a correct word order, provided that the case markings and things like that are correct. When we are translating back from Czech into English, and the Czech is produced by native speakers, the situation is much more difficult. You have to identify where the subject is, where the verb is and where the object is, and these have to be in the canonical English order, otherwise the sentence wouldn’t be comprehensible to a speaker of English.”
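The reordering step described in that answer can be sketched with a toy example. The sentence "Muže kousl pes" puts the object first, and its role tags are written by hand here; in a real system, identifying those roles from the case markings is the hard part and is simply assumed in this sketch. The function names and the tag set are hypothetical.

```python
# Hand-tagged tokens: (Czech word, English gloss, grammatical role).
# The roles are supplied manually; a real system would have to infer them.
TOKENS = [
    ("Muže", "the man", "object"),   # accusative form, so it is the object
    ("kousl", "bit", "verb"),
    ("pes", "the dog", "subject"),   # nominative form, so it is the subject
]

# Canonical English order: subject, then verb, then object.
ENGLISH_ORDER = {"subject": 0, "verb": 1, "object": 2}


def word_for_word(tokens):
    """Keep the Czech word order and only swap in the English glosses."""
    return " ".join(gloss for _, gloss, _ in tokens)


def to_english_order(tokens):
    """Reorder the tokens into subject-verb-object before glossing."""
    ordered = sorted(tokens, key=lambda token: ENGLISH_ORDER[token[2]])
    return " ".join(gloss for _, gloss, _ in ordered)


print(word_for_word(TOKENS))     # the man bit the dog (meaning reversed)
print(to_english_order(TOKENS))  # the dog bit the man
```

Keeping the Czech order reverses the meaning in English, which is why the roles have to be found first and the words rearranged afterwards.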
“I do believe that machine translation systems can replace humans in the case of repetitive texts. For example, weather reports were already being translated from English to French in the 1970s. Now I think we are moving towards European legislation, and I estimate that 60 percent of the texts or even more can be automatically translated with no human intervention.”