Inside Balaboba is the first version of the YaLM language model (Yet another Language Model) developed by Yandex. The model is Transformer-based, just like many other large language models (BERT, GPT, LaMDA) from leading global developers. This type of model has exactly one task: to generate each subsequent word in a sentence. During training, every predicted word is checked against the real text, which is how the model learns to produce coherent and grammatically correct output. For example, when looking at “Humpty Dumpty sat on a…”, its job is to decide which word comes next: “running” or “wall”.
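As an illustration, here is a minimal sketch of that next-word scoring, using the openly available GPT-2 model as a stand-in (YaLM itself is not involved; the prompt and candidate words are purely for demonstration):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "Humpty Dumpty sat on a"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits[0, -1]   # scores for the next token
probs = torch.softmax(logits, dim=-1)

# Compare how likely the model finds each candidate continuation.
for word in [" wall", " running"]:
    token_id = tokenizer.encode(word)[0]
    print(f"P({word.strip()!r}) = {probs[token_id]:.4f}")
```

A well-trained model assigns “wall” a far higher probability than “running” in this context.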
Balaboba picks up the rules of language and chooses appropriate words thanks to YaLM’s internal parameters, which are adjusted whenever a word is predicted incorrectly. You can compare them to a set of small levers: each needs to be pulled its own way to set the mechanism in motion. Models in the YaLM family contain from 1 billion to 100 billion such “levers”.
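Here is a toy sketch of how a single prediction error “pulls the levers”; this is a hypothetical miniature setup (a 3-word vocabulary and one linear layer standing in for the model), not YaLM’s actual training code:

```python
import torch

vocab = ["wall", "running", "horse"]
model = torch.nn.Linear(4, len(vocab))                 # its weights are the "levers"
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

context = torch.randn(1, 4)                            # stand-in for an encoded sentence prefix
target = torch.tensor([vocab.index("wall")])           # the word that actually comes next

optimizer.zero_grad()
loss = torch.nn.functional.cross_entropy(model(context), target)
loss.backward()                                        # measures how far each lever was off
optimizer.step()                                       # nudges each lever its own way
```

Repeated over terabytes of text, these tiny nudges are what gradually tune billions of parameters.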
We used terabytes of text to make Balaboba’s writing both grammatical and lexically rich. The YaLM neural network is trained on a wide variety of pages indexed by Yandex, such as Wikipedia articles, news pieces, books, and texts written by users of social media sites and forums. To keep noise from degrading the model, we filtered repetitive, unfinished, and unnatural-sounding texts out of the sample.
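A simplified sketch of that kind of filtering is shown below; the heuristics here (exact-duplicate hashing, a minimum length, an end-of-sentence check) are illustrative assumptions, and a production pipeline would use far richer ones:

```python
import hashlib

def clean_corpus(texts, min_words=20):
    """Drop exact duplicates, very short fragments, and texts cut off mid-sentence."""
    seen = set()
    for text in texts:
        digest = hashlib.md5(text.strip().lower().encode()).hexdigest()
        if digest in seen:
            continue                                   # repetitive: exact duplicate
        if len(text.split()) < min_words:
            continue                                   # too short to be a finished text
        if not text.rstrip().endswith((".", "!", "?")):
            continue                                   # likely truncated mid-sentence
        seen.add(digest)
        yield text

sample = [
    "Humpty Dumpty sat on a wall.",
    "Humpty Dumpty sat on a wall.",   # duplicate, dropped
    "Humpty Dumpty sat on a",         # unfinished, dropped
]
print(list(clean_corpus(sample, min_words=3)))         # keeps only the first text
```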
Some models in the YaLM family are trained to “speak” not only Russian but English as well. Our largest bilingual model, YaLM 100B, was recently open-sourced.
Right now, Balaboba runs on a lightweight version of YaLM 100B with 3 billion “levers”. We use other models from the YaLM family in more than 20 projects. Thanks to this neural network, the voice assistant Alice can have better conversations with users, and our search engine can generate flashcards with quick answers. YaLM can also be used to generate ads and site descriptions.
YaLM’s main feature, however, is its ability to learn new things from just a few examples. To write meaningful movie plot summaries, user manuals, or folk sayings, it needs to see anywhere from five to a few dozen examples of the target text type. This is exactly what happens when you choose a style in Balaboba. When we taught it to generate folk wisdom, we showed it only a few well-known examples, including “better late than never”.
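Here is a minimal few-shot prompting sketch in that spirit, again with GPT-2 as a publicly available stand-in; the sayings in the prompt play the role of the handful of examples, and this is an illustration rather than Yandex’s actual setup:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# A few well-known examples prime the model to continue in the same style.
prompt = (
    "Folk sayings:\n"
    "Better late than never.\n"
    "A stitch in time saves nine.\n"
    "Don't count your chickens before they hatch.\n"
)
out = generator(prompt, max_new_tokens=20, do_sample=True, top_p=0.9)
print(out[0]["generated_text"])
```

No parameters change here: the examples in the prompt alone steer the model toward the desired style.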
Learn more about how we trained our largest open model — YaLM 100B — here.