Yandex’s text generator Balaboba goes bilingual

Yandex has officially launched a bilingual version of its Balaboba text generator, now available in both Russian and English. Balaboba showcases the capabilities of Yandex's YaLM family of language models, which is used in more than 20 of the company's services, including Yandex Search and the AI-based personal assistant Alice.

With Balaboba, a user enters just a couple of words in Russian or English and selects the style they want for the generated text. The generator then creates a meaningful text on virtually any subject, similar to the internet texts the model was trained on. To make the text coherent and grammatically correct, the model generates it word by word, estimating at each step how well the next predicted word fits the context.
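
For illustration only, here is a minimal Python sketch of that word-by-word scheme; the tiny vocabulary and probabilities are invented for the example and stand in for the statistics a real model such as YaLM learns from its training texts:

    # Minimal sketch of autoregressive, word-by-word generation.
    # The vocabulary and probabilities below are invented for illustration;
    # a real model learns such statistics from its training texts.
    TOY_MODEL = {
        ("once",): {"upon": 0.9, "more": 0.1},
        ("once", "upon"): {"a": 0.95, "the": 0.05},
        ("upon", "a"): {"time": 0.8, "hill": 0.2},
    }

    def generate(prompt: str, steps: int = 3) -> str:
        words = prompt.split()
        for _ in range(steps):
            context = tuple(words[-2:])          # condition on recent context
            candidates = TOY_MODEL.get(context)
            if not candidates:
                break
            # Greedy decoding: append the word that best fits the context.
            words.append(max(candidates, key=candidates.get))
        return " ".join(words)

    print(generate("once"))  # -> "once upon a time"

A real model scores every word in its vocabulary at each step and conditions on a much longer context, but the loop is the same: predict, append, repeat.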

Balaboba can write a short story, come up with a recipe, a user manual, or a piece of folk wisdom. And if the user enters the name of a movie, Balaboba can come up with a plot for it. Users are free to use the AI-generated texts at their discretion: for example, they can create product descriptions for their online store, find inspiration or ideas for advertising, or simply share fun examples with friends on social networks.

Balaboba generates its texts using Yandex's YaLM language model, which handles a range of natural language processing tasks. YaLM helps Alice keep a conversation going, recognizes the subject of user queries in Yandex Q, improves order descriptions for Yandex Services, and generates cards for quick answers in Search. YaLM also finds highlights in videos and generates advertisements and website snippets.

To help Balaboba stick to the rules of a language and pick the right words, the model relies on a large set of trainable parameters. During training, these parameters are adjusted whenever the word the model predicts differs from the actual next word in the text, gradually improving its predictions. Language models in the YaLM family contain anywhere from 1 billion to 100 billion parameters.
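
The sketch below illustrates one such adjustment step in generic PyTorch; the architecture, sizes, and data are made-up stand-ins rather than YaLM's actual code, but the principle is the same: predict the next word, measure the error, update every parameter.

    # Generic PyTorch sketch of one next-word training step (not YaLM's code;
    # the model, sizes, and batch here are made-up stand-ins).
    import torch
    import torch.nn as nn

    vocab_size, embed_dim = 1000, 64
    model = nn.Sequential(
        nn.Embedding(vocab_size, embed_dim),  # word -> vector
        nn.Linear(embed_dim, vocab_size),     # vector -> score per candidate word
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    # Random stand-in batch: current words and the actual next words.
    context = torch.randint(0, vocab_size, (32,))
    target = torch.randint(0, vocab_size, (32,))

    logits = model(context)          # predicted scores for every next word
    loss = loss_fn(logits, target)   # large when the prediction is off

    optimizer.zero_grad()
    loss.backward()                  # how much each parameter contributed to the error
    optimizer.step()                 # adjust all parameters toward better predictions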

Yandex recently open-sourced YaLM 100B, its biggest bilingual model yet, with 100 billion parameters. Balaboba uses a lighter version with 3 billion parameters. The model was trained on terabytes of text drawn in equal proportions from the English- and Russian-speaking segments of the internet.

More info on YaLM 100B can be found on Medium.

Contacts:

Press Office
Ilya Grabovskiy
Phone: +7 495 739-70-00
E-mail: pr@yandex-team.com
