Yandex Blog

One model is better than two. Yandex.Translate launches a hybrid machine translation system

Today Yandex.Translate launched a hybrid machine translation system that combines neural and statistical approaches, drawing on the complementary strengths of both models to deliver even higher-quality translations to our users. The new system first translates each query with both a statistical and a neural machine translation model. Next, CatBoost, our gradient boosting library, ranks the outputs of the two models and selects the highest-quality translation.

There are several approaches to machine translation, and over the years a number of technological advances have improved its quality. Since its launch in 2011, Yandex.Translate has been powered by statistical machine translation, a widely used approach that works by comparing example translations to find statistical correspondences between words in the two languages.

With today’s launch, Yandex.Translate now also includes a neural machine translation component, a method that has led to more fluent, human-like translations in the last few years. The new Yandex.Translate system is unique in offering users a free machine translation service that combines these two methods.

Statistical translation and neural translation models each have different strengths that complement each other. When combined in our new hybrid machine translation system, they will produce higher quality results than either of the underlying models alone. 

Statistical models prove extremely efficient at memorizing example translations and can produce better translations of words or phrases that are seen less frequently in the training data. However, statistical machine translation breaks sentences up into words or phrases during the translation process, which sometimes makes it challenging to construct fluent translations.

Neural machine translation models, on the other hand, can process entire sentences at once.  Neural models choose a translation based on the full context of a query, often resulting in much more fluent, human-like translations. But, because the neural network uses context to understand how a word is translated, it often fails to learn reasonable translations for words that it saw very few times in the training data. By combining the two systems, which excel in different areas, we see significant improvements in translation quality over either of the individual methods. 
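The translate-with-both-then-select flow described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `toy_statistical`, `toy_neural`, `toy_score` and `hybrid_translate` are hypothetical names, the lookup tables are invented, and the hand-written scoring function merely stands in for the CatBoost ranking step, whose actual features are not described here.

```python
from typing import Callable, Sequence

UNKNOWN = "<unk>"  # placeholder a toy model emits for a word it cannot translate


def toy_statistical(source: str) -> str:
    """Hypothetical word-by-word lookup: covers rare words, ignores context."""
    table = {"good": "хороший", "morning": "утро", "samovar": "самовар"}
    return " ".join(table.get(word, UNKNOWN) for word in source.lower().split())


def toy_neural(source: str) -> str:
    """Hypothetical whole-sentence model: fluent on familiar input, weak on rare words."""
    sentences = {"good morning": "доброе утро"}
    vocabulary = {"good": "хороший", "morning": "утро"}
    key = source.lower()
    if key in sentences:
        return sentences[key]
    return " ".join(vocabulary.get(word, UNKNOWN) for word in key.split())


def toy_score(source: str, candidate: str) -> float:
    """Stand-in for the ranker: reward fluent word pairs, penalize unknown words.

    A real ranker would also use features of `source`; this toy ignores it.
    """
    fluent_bigrams = {("доброе", "утро")}
    tokens = candidate.split()
    fluency = sum(pair in fluent_bigrams for pair in zip(tokens, tokens[1:]))
    return fluency - 2.0 * tokens.count(UNKNOWN)


def hybrid_translate(
    source: str,
    models: Sequence[Callable[[str], str]],
    score: Callable[[str, str], float] = toy_score,
) -> str:
    """Translate with every model, then return the highest-scoring candidate."""
    candidates = [model(source) for model in models]
    return max(candidates, key=lambda candidate: score(source, candidate))
```

With these toy tables, a familiar sentence such as "good morning" is taken from the neural stand-in because its output scores as more fluent, while "good samovar" falls back to the statistical stand-in because only it knows the rare word.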

The hybrid system will initially be launched for the English and Russian language pair, which accounts for 80 percent of the tens of millions of daily Yandex.Translate requests. The Yandex.Translate team also hopes to add other language pairs in the near future. 

Yandex’s new Head of Machine Translation, David Talbot, explains, “We are excited to launch our new hybrid system for Yandex.Translate users. Ultimately, we want to develop a deeper understanding of how we can better assist Yandex users with their language needs, be it communication, language learning or simply accessing the huge amounts of information on the web available in other languages.”

Currently, Yandex.Translate offers users text, speech, and image translation and supports 94 language pairs. Start using Yandex.Translate today at https://translate.yandex.ru/ or download the mobile app for iOS and Android.

Simultaneous Translation with Predictive Typing on Android Smartphones

Following the success of our automated translation app, Yandex.Translate, for iOS (over 230,000 downloads since launch in March; about 100,000 translations per day), we are now bridging the communication gap for Android smartphone users by releasing the first edition of Yandex.Translate for Android.

With its English-language user interface, the app is a great help to anyone in need of a quick and accurate (well, as accurate as it gets with a machine) translation between English and eleven other languages – Russian, Ukrainian, Belarusian, Czech, French, Italian, Spanish, German, Turkish, Portuguese, Swedish, Danish and Dutch. What makes its output much better than standard robot language is our database of billions of word combinations. We scour the content of a huge number of webpages for patterns and collocations and use statistical algorithms to calculate the best possible equivalent for any given word combination.

Yandex.Translate for Android shares some of its most popular features with its iOS counterpart – predictive typing and simultaneous translation. Just like Yandex.Translate for iOS, the Android app can translate the source text as it is being typed, while predictive typing can accurately predict the next word in the source text, which cuts input time by more than half. Try typing a 40-character phrase, “My flight was delayed due to bad weather”, on an Android smartphone in 14 taps. Yandex.Translate can do that!
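To give a feel for how next-word prediction can work, here is a minimal Python sketch based on bigram counts: it suggests the word that most often followed the last typed word in some example text. This is an assumed technique for illustration only, not a description of the app's actual predictor, and the names `build_bigram_model` and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Iterable, Optional


def build_bigram_model(corpus: Iterable[str]) -> DefaultDict[str, Counter]:
    """Count, for every word, which words follow it in the example sentences."""
    model: DefaultDict[str, Counter] = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            model[current][following] += 1
    return model


def predict_next(model: DefaultDict[str, Counter], text: str) -> Optional[str]:
    """Suggest the most frequent follower of the last typed word, if any."""
    words = text.lower().split()
    if not words or words[-1] not in model:
        return None
    return model[words[-1]].most_common(1)[0][0]
```

Each accepted suggestion replaces several keystrokes with a single tap, which is how a 40-character phrase can be entered in far fewer than 40 taps.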

Instead of continually changing the translation to guess what the user wants to say with each new letter, Yandex.Translate waits until the new letter completes a word and only then translates that word. As a result, the translated text makes sense. In most cases.
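The idea of translating only finished words can be sketched as follows. The sketch assumes that a word counts as finished once it is followed by whitespace or closing punctuation; how the app actually detects the end of a word is not stated here. `completed_text` and `LiveTranslator` are hypothetical names, and `translate` is any function the caller supplies.

```python
from typing import Callable

WORD_END_PUNCTUATION = ".,!?;:"


def completed_text(typed: str) -> str:
    """Return the part of the input that consists of finished words only."""
    if not typed.strip():
        return ""
    if typed[-1].isspace() or typed[-1] in WORD_END_PUNCTUATION:
        return typed.strip()
    # The last word is still being typed, so leave it out.
    parts = typed.rsplit(None, 1)
    return parts[0].strip() if len(parts) == 2 else ""


class LiveTranslator:
    """Re-translate only when the set of finished words changes."""

    def __init__(self, translate: Callable[[str], str]) -> None:
        self._translate = translate
        self._last_source = ""
        self._last_result = ""

    def on_input(self, typed: str) -> str:
        source = completed_text(typed)
        if source != self._last_source:
            self._last_source = source
            self._last_result = self._translate(source) if source else ""
        return self._last_result
```

Fed the keystrokes "My f", "My fl", "My fli" and so on, the translator keeps showing the translation of "My" and updates only once "flight" is followed by a space.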

The Android app can voice fragments of translated text of up to 100 characters in English, Russian, Turkish, Italian, French, Spanish, German, Czech and Polish. Enough for the user to learn how to ask for a bill, say thank you or offer help in a foreign language. 

Both the iOS app and the Android app detect the input language automatically and, of course, just like the iOS app, the Android version has an extensive dictionary with full entries, which include word definitions, usage examples, transcription and the option to hear the headword pronounced.

Our mobile apps for automated translation use proprietary machine translation technology based on statistical regularities rather than sets of rules, and they complement our eponymous web-based translation service, which has been running since March 2011.

Yandex.Translate for Android is available for free on Google Play and in Yandex.Store.