Czech-designed software can translate into over 40 languages at once

Machine translation experts from Charles University have developed a new system that can translate between 43 languages in real time. Intended for international conferences, Elitr displays a speaker’s words on a screen within milliseconds, in all languages at once.

Ondřej Bojar | Photo: Charles University

Elitr is unique in that it not only accurately transcribes what the speaker is saying using speech-to-text technology, but at the same time can show it in up to 42 other languages, depending on what the user selects. And this all happens in real time. Ondřej Bojar leads the Czech team that developed it.

“We created the system in such a way that it would follow what the speaker is saying and as a backup, it takes what the human interpreters, who are physically present at the conference, are saying. And from these multiple sources, we can show the translation of the spoken words in real time.”

By combining these inputs, the speaker plus up to five human interpreters simultaneously translating into other languages, the system can draw on the huge corpora created for machine learning to iron out ambiguities caused by homonyms and homophones. Dominik Macháček, a PhD student who worked on the project, explains how this works.

“For example, the Czech word ‘zámek’ has several meanings – it can mean ‘chateau’ or ‘lock’. But if we get the English translation at the same time, we know which meaning the speaker intended, and then the word can be better translated into the other languages.”
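The idea Macháček describes can be illustrated with a toy sketch. It is not Elitr's actual code: the `SENSES` lexicon, the `disambiguate` function and the French and Spanish entries are hypothetical, chosen only to show how a word heard from an English interpreter could select the intended sense of ‘zámek’ before it is translated into other languages.

```python
# Toy illustration only; the lexicon and function names are assumptions,
# not part of Elitr. Each sense of a Czech homonym is keyed by the English
# word an interpreter would likely use for it.
SENSES = {
    "zámek": {
        "chateau": {"fr": "château", "es": "castillo"},
        "lock": {"fr": "serrure", "es": "cerradura"},
    }
}


def disambiguate(source_word, interpreter_words, target_lang):
    """Pick a translation of source_word using the interpreter's English words.

    Returns None when the word is unknown or the English channel gives
    no evidence for any sense.
    """
    senses = SENSES.get(source_word)
    if senses is None:
        return None
    for english_sense, translations in senses.items():
        if english_sense in interpreter_words:
            return translations.get(target_lang)
    return None


# Words recently heard from the English interpreter (hypothetical input).
heard = {"the", "chateau", "was", "rebuilt"}
print(disambiguate("zámek", heard, "fr"))
```

A real system would weigh evidence statistically rather than look up exact words, but the principle is the same: a second language channel narrows down which meaning the speaker intended.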

Elitr was developed at the request of the Czech Supreme Audit Office, which was organising a large international congress. Unlike the online machine translation services that we are all familiar with, Elitr has been developed specifically for the needs of conference attendees.

Unlike text translation, this involves translating live speech, with all that entails. There may be hesitations and pauses, and the speaker may stop halfway through a sentence and restart it, or trail off altogether. The translation is therefore created word by word, rather than sentence by sentence.

This also has to do with the differing language competencies of the users, says Ondřej Bojar.

“People have different preferences depending on how well they know the language which is being spoken. Those who don’t understand anything prefer to wait for the stabilised output, while those who understand a little want to see the word immediately, while they still have it in their short-term memory.”
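One common way to offer both views, sketched here under stated assumptions, is to compare successive translation hypotheses and treat the part they agree on as stabilised, while the remainder is shown as tentative. The article does not say that Elitr works this way; the `split_stable` function and the sample hypotheses are hypothetical.

```python
# Minimal sketch, assuming a system that re-translates the growing input
# and emits a fresh hypothesis (a list of words) after each update.
def split_stable(prev_hyp, curr_hyp):
    """Split curr_hyp into (stable, tentative) word lists.

    The stable part is the longest prefix shared with the previous
    hypothesis; everything after it may still change.
    """
    stable = []
    for prev_word, curr_word in zip(prev_hyp, curr_hyp):
        if prev_word != curr_word:
            break
        stable.append(curr_word)
    return stable, curr_hyp[len(stable):]


# Hypothetical sequence of hypotheses as the speaker keeps talking.
updates = [
    "the castle".split(),
    "the chateau was".split(),
    "the chateau was rebuilt in".split(),
]

previous = []
for hypothesis in updates:
    stable, tentative = split_stable(previous, hypothesis)
    # Stabilised words to the left of the bar, tentative ones to the right.
    print(" ".join(stable), "|", " ".join(tentative))
    previous = hypothesis
```

A viewer who does not know the source language could read only the stabilised part, while one who follows the speaker a little could watch the tentative words as they appear.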

In the future, Ondřej Bojar and his team hope that Elitr will be able to do even more. They are developing the system so that it can produce relevant notes in all 43 languages based on the lecture or speech, while Dominik Macháček is working on improving the quality of the translations.

However, due to the enormous amount of computing power required to run the software, Elitr won’t be available as a mobile app for the time being.