Yandex Blog

Now We’re Looking for Lepton Flavour Violation

Wouldn’t we all like to think that the world we live in is more or less stable? Isn’t there a certain pleasure in being sure that our feet will be pulled to the ground as firmly tomorrow as they are today? Isn’t it reassuring to know that the cup of tea we’ve just put on our desk won’t instantly disappear and reappear at the bottom of the sea on the other side of the planet, having travelled straight through the Earth’s diameter? In classical physics, Newton’s laws give us this reassurance. They bestow predictability on objects and events as they exist or happen in our reality, on a macroscopic level. On the microscopic level, in particle physics, Fermi’s theory of the weak interaction, for instance, postulates that the laws of physics remain the same even after a particle undergoes a substantial transformation.

In 1964, however, it became apparent that this isn’t always the case. James Cronin and Val Fitch showed, by examining the decay of subatomic particles called kaons, that a reaction run in reverse does not necessarily retrace the path of the original reaction. This discovery opened a pathway to the theory of electroweak interaction, which in turn gave rise to the theory we all now know as the Standard Model of particle physics.

Although the Standard Model is currently the most convenient paradigm to live with, it leaves a number of problems unexplained, including gravity and dark matter. Other theories compete very actively for the leading role in describing the laws of nature in the most accurate and comprehensive way. To succeed, they have to provide evidence of something happening outside the limits of the Standard Model. A promising place to look for this kind of evidence is the decay of a charged lepton (the tau lepton) into three lighter leptons (muons), which carry a quantum property, flavour, that differs from the flavour of their ‘mother’ particle. According to the Standard Model, the probability of this decay is vanishingly low, but in other theories it can be much higher.

One experiment at CERN, LHCb, aims to find this τ → 3μ decay. How are they going to find it? By searching for statistically significant anomalies in an unthinkably large amount of data. How can they find statistically significant anomalies in an unthinkably large amount of data? By using algorithms. These can be trained to separate signal (lepton decays) from background (anything else, really) better than humans can. The problem, however, is not only to find these lepton decays, but to find them in statistically significant numbers. If the Standard Model is correct, τ → 3μ decays are so rare that they fall below current experimental sensitivity.
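To make the signal-versus-background idea concrete, here is a minimal sketch of that kind of machine-learned classifier, using scikit-learn’s gradient boosting on purely synthetic features. Everything here — the feature values, sample sizes and model settings — is invented for illustration; it is not LHCb’s actual selection.

```python
# Illustrative sketch: separating "signal" from "background" with a
# boosted-tree classifier. The five features are synthetic stand-ins
# for real detector variables; this is not LHCb data or LHCb's model.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 2000

# Background: features centred at 0; signal: slightly shifted.
background = rng.normal(0.0, 1.0, size=(n, 5))
signal = rng.normal(0.5, 1.0, size=(n, 5))
X = np.vstack([background, signal])
y = np.array([0] * n + [1] * n)  # 1 = signal (decay), 0 = background

# Hold out part of the data to estimate how well the classes separate.
idx = rng.permutation(len(X))
train, test = idx[:3000], idx[3000:]

clf = GradientBoostingClassifier(n_estimators=100, random_state=0)
clf.fit(X[train], y[train])

scores = clf.predict_proba(X[test])[:, 1]
auc = roc_auc_score(y[test], scores)
print(f"test ROC AUC: {auc:.3f}")
```

A ROC AUC of 0.5 would mean the classifier is no better than guessing; the closer it gets to 1.0, the cleaner the separation of decays from everything else.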

To come up with a more sensitive and scale-appropriate solution that would help physicists find evidence of the tau lepton decaying into three muons at a statistically significant level, Yandex and CERN’s LHCb experiment have launched a contest for a perfect algorithm. The contest, called ‘Flavours of Physics’, starts on July 20th, with the deadline for code submissions on October 12th. It is co-organised with an associated member of the LHCb collaboration, the Yandex School of Data Analysis, and with Yandex Data Factory, the big data analytics division of Yandex, and is hosted on Kaggle, a website for predictive modelling and analytics competitions. The winning team or participant will claim a cash prize of $7,000, with $5,000 and $3,000 awarded to the first and second runners-up. An additional prize, an opportunity to participate in an LHCb workshop at the University of Zurich plus $2,000 provided by Intel, will go to the creator of the algorithm that proves most useful to the LHCb experiment. The data used in this contest consists of both simulated and real data, acquired in 2011 and 2012, that was used for the τ → 3μ decay analysis in the LHCb experiment.

Contest participants can build on the algorithm provided by the Yandex School of Data Analysis and Yandex Data Factory to make an algorithm of their own.

The metric used to evaluate the algorithms submitted for this contest is very similar to the one physicists use to evaluate the significance of their results, but it is much simpler and more robust, thanks to the collective effort of the Yandex School of Data Analysis and LHCb specialists, who adapted procedures routinely used in the LHCb experiment specifically for this contest. We expect this metric to help scientists choose algorithms that they could apply to the data the LHCb experiment will collect in 2015, and to a wide range of other experiments.
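The contest’s exact metric is not spelled out here, but one classic significance-style figure of merit in particle physics is s / √(s + b): the expected number of signal events divided by the square root of the total expected events. A small sketch, with invented event counts, shows why such a figure rewards background rejection and not just raw signal count:

```python
# Hedged illustration: s / sqrt(s + b) is a standard rough figure of
# merit for a counting experiment. It is NOT the contest's actual
# metric; the event counts below are invented.
import math

def approximate_significance(s: float, b: float) -> float:
    """Approximate significance of s expected signal events
    on top of b expected background events."""
    if s + b == 0:
        return 0.0
    return s / math.sqrt(s + b)

# A tighter selection keeps fewer signal events but rejects far more
# background; the figure of merit picks the better trade-off.
loose = approximate_significance(s=100, b=10000)  # lots of background
tight = approximate_significance(s=60, b=400)     # cleaner selection
print(f"loose cut: {loose:.2f}, tight cut: {tight:.2f}")
```

Here the tighter cut wins even though it sacrifices 40% of the signal, which is exactly the kind of trade-off a good classifier automates.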

Finding the tau lepton decay might take us out of the comfort zone of the Standard Model, but it just as well may open the door to extra dimensions, shed light on dark matter, and finally explain how gravity works on a quantum level.


Collisions as seen within the LHCb experiment's detector (Image: LHCb/CERN)

Yandex’s School of Data Analysis Joins LHCb Collaboration

The Yandex School of Data Analysis has joined in collaboration with CERN’s Large Hadron Collider beauty (LHCb) experiment. The project is one of four large particle detector experiments at the Large Hadron Collider, and collects data to study the interactions of heavy particles, called b-hadrons.

As a result of this collaboration, the LHCb researchers will receive continuous support for existing applications (EventIndex, EventFilter) as well as new services developed for LHCb by the Yandex School of Data Analysis. YSDA will contribute its data processing skills and capabilities, and perform interdisciplinary research and development at the intersection of physics and data science that will serve the aims and needs of the LHCb experiment.


LHCb experiment. Photo by Tim Parchikov.

The researchers at the LHCb experiment are seeking, among other things, to explain the imbalance of matter and antimatter in the observable universe. This programme requires collecting, processing and analysing a very large amount of data. Yandex has already been contributing its search technologies, computing capabilities and machine-learning methods to the LHCb experiment since 2011, helping the physicists gain quick access to the data they need. Since January 2013, Yandex has been providing its core machine-learning technology MatrixNet for the needs of particle physics as an associate member of CERN openlab, CERN’s collaboration with industrial partners.

The Yandex School of Data Analysis is now part of the game, with its exceptional talent, a strong tradition in hard-core mathematics, and proven experience in converting new theoretical knowledge into practical solutions. The YSDA is the only member of the LHCb collaboration that does not specialise in physics. Other collaborators in the project include such prestigious institutions as MIT (USA), EPFL (Switzerland), the University of Oxford and Imperial College London (UK).

The Yandex School of Data Analysis is a free Master’s-level program in computer science and data analysis, offered by Yandex since 2007 to graduates in engineering, mathematics, computer science or related fields. It trains specialists in data analysis and information retrieval. The school’s program includes courses in machine learning, data structures and algorithms, computational linguistics and other related subjects. It runs a number of joint programs, at both Master’s and PhD levels, with leading educational and research institutions including the Moscow Institute of Physics and Technology, the National Research University Higher School of Economics (HSE), and the Department of Mechanics and Mathematics of Moscow State University. In seven years, the Yandex School of Data Analysis has prepared more than 320 specialists.

MatrixNet helps CERN physicists find what they are looking for

One of the four big experiments at the world’s largest and most powerful particle accelerator, the Large Hadron Collider, is now testing Yandex’s machine learning technology, MatrixNet, on their data on B-meson decay.

This is a new stage in a long-term collaboration between the European Organization for Nuclear Research (CERN) and Yandex, which began in 2011 when the LHCb experiment started using our servers for some of their data simulation and continued in 2012 with Yandex supplying a prototype of a custom-built search tool for the LHC events. Now, Yandex’s machine learning technology is expected to help the CERN physicists boost precision levels in identifying extremely rare particle decays in the vast amount of data collected by LHCb. Comparing the number of observed events against predictions, scientists can confirm or refute their theories.


Bs0->mumu decay candidate observed at the LHCb experiment (photo by CERN)


In November last year, the LHCb researchers reported that they had observed the decay of a Bs meson into a muon-antimuon pair for the first time. The statistical significance of this observation, however, was not high enough to unequivocally qualify it as a discovery. But the absence of a statistically significant number of decays in the LHCb data does not mean that they are not there. It only means that a better tool, or more data, is needed to observe them with confidence. With MatrixNet, which makes decisions about data relevancy based on a very large number of factors, the statistical significance of particle decay detection might turn out to be dramatically different. And this is one of the reasons why CERN liked the idea of using MatrixNet.
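As a rough illustration of what ‘statistical significance’ means here, consider a back-of-the-envelope counting experiment: if background processes alone predict some average number of events, how unlikely is the count we actually saw? The numbers below are invented for illustration and are not LHCb’s results; by convention, particle physics requires roughly 5 sigma to claim a discovery.

```python
# Invented numbers, for illustration only -- not LHCb's analysis.
# Question: if background alone predicts b events on average, how
# unlikely is it to observe n_obs or more events by pure chance?
import math
from statistics import NormalDist

def poisson_tail(n_obs: int, b: float) -> float:
    """P(N >= n_obs) for N ~ Poisson(b), via the complement sum."""
    term = math.exp(-b)        # P(N = 0)
    below = term
    for k in range(1, n_obs):
        term *= b / k          # P(N = k) from P(N = k - 1)
        below += term
    return 1.0 - below

b = 10.0      # hypothetical expected background events
n_obs = 20    # hypothetical observed count

p_value = poisson_tail(n_obs, b)
# Convert the one-sided p-value into the equivalent Gaussian "sigma".
significance = NormalDist().inv_cdf(1.0 - p_value)
print(f"p = {p_value:.4f}, about {significance:.1f} sigma")
```

Even doubling the background expectation, as in this toy example, lands well short of the 5-sigma bar, which is why rare-decay searches need both better classifiers and more data.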

CERN is a very large-scale international laboratory where hypotheses, theories and models in theoretical physics are tested by running experiments and accumulating data, which can then be analysed and interpreted. Since the LHC started up in 2008, the LHCb experiment has been collecting data on more than 10 billion particle events per year. When the LHC stops for an upgrade this spring, scientists will move into the analytical phase of their research.

MatrixNet is a high-precision tool that can make a difference in the quality of results obtained during data analysis. By joining CERN openlab, a framework for testing and validating cutting-edge information technologies and services in partnership with industry at CERN, we will spend this year helping scientists find what they are looking for. As a CERN openlab Associate, we aim to develop a service that would allow the CERN researchers to use MatrixNet for their purposes without additional assistance from our engineers, as is currently required. The launch of the MatrixNet service at CERN, scheduled for May 2013, will give the physicists an opportunity to detect particle decays more precisely, while we will be able to improve our machine learning technology by running it on a very large dataset. What MatrixNet does when applied to CERN’s event data is much like what it does when building a ranking formula for Yandex’s search engine. CERN’s use of MatrixNet on their data gives us an opportunity to expand the application range of our machine learning technology beyond web search into a new field: theoretical physics.