At Yandex Labs, we had the chance to work on a three-month practicum with students and Professor Ian Lane from the Silicon Valley campus of Carnegie Mellon University. The project was ambitious but also fun: we wanted to build a new TV experience, personalized and interactive. We developed a TV application that shows personalized content on the screen and lets users easily manipulate and interact with it using hand gestures. The app is still a prototype and is not available for download, but we made this video to share our ideas with you.
The app brings users’ social network streams to their TV screens and lets them navigate this information with hand gestures. It is built on the Mac OS X platform and uses a Microsoft Kinect for gesture recognition.
In a silent ‘screen saver’ mode, the application features videos, music, photos and news shared by the user’s friends on social networks. As soon as the user notices something interesting on the screen, they can play, open or otherwise interact with the current media object using hand gestures. For example, they can swipe horizontally to flip through featured content, push a “magnetic button” to play music or video, move their hands apart to open a news story and then swipe vertically to scroll through it.
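The interactions above boil down to a mapping from recognized gestures to UI actions. A minimal sketch of such a dispatch table, with gesture and action names that are purely illustrative (not from the actual app):

```python
# Hypothetical gesture-to-action mapping for the interactions described above.
# All names here are illustrative assumptions, not the app's real identifiers.
GESTURE_ACTIONS = {
    "swipe_horizontal": "flip_through_content",
    "push":             "play_media",        # the "magnetic button"
    "hands_apart":      "open_news_story",
    "swipe_vertical":   "scroll_story",
}

def handle_gesture(gesture):
    """Dispatch a recognized gesture to its UI action; ignore anything unknown."""
    return GESTURE_ACTIONS.get(gesture, "ignore")
```

Keeping the mapping in data rather than code makes it easy to swap the active gesture set depending on what is currently on screen.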
To train gesture recognition, the Carnegie Mellon students, together with Professor Ian Lane, evaluated several machine learning techniques, including Neural Networks, Hidden Markov Models and Support Vector Machines (SVMs), with SVMs showing 20% better accuracy. They put a lot of effort into building a real training set: they collected 1,500 gesture recordings, each sequenced into 90 frames, and manually labeled from 4,500 to 5,600 examples of each gesture. By limiting the set of gestures to be recognized at any given moment, based on the current type of content, the students were able to significantly improve the recognition rate.
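The approach described above can be sketched as an SVM classifier over fixed-length gesture sequences, with the candidate set restricted by the current content type. This is only a minimal illustration on synthetic data: the feature layout, gesture names, and context table are assumptions, not the students' actual pipeline.

```python
# Sketch of context-restricted SVM gesture recognition (illustrative only).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

GESTURES = ["swipe_left", "swipe_right", "push", "spread", "swipe_up"]
# Which gestures are valid for each content type (assumed, for illustration).
CONTEXT_GESTURES = {
    "video": ["swipe_left", "swipe_right", "push"],
    "news":  ["spread", "swipe_up", "swipe_left"],
}

FRAMES, COORDS = 90, 12      # 90 frames per gesture; 12 joint coordinates per frame (assumed)
DIM = FRAMES * COORDS

# Synthetic stand-in for the labeled recordings: each class clusters around its own mean.
X = np.concatenate([rng.normal(i, 0.5, size=(40, DIM)) for i in range(len(GESTURES))])
y = np.repeat(np.arange(len(GESTURES)), 40)

clf = SVC(kernel="rbf", decision_function_shape="ovr").fit(X, y)

def recognize(sequence, content_type):
    """Classify a gesture sequence, scoring only gestures valid in the current context."""
    allowed = [GESTURES.index(g) for g in CONTEXT_GESTURES[content_type]]
    scores = clf.decision_function(sequence.reshape(1, -1))[0]  # one score per class
    best = max(allowed, key=lambda i: scores[i])
    return GESTURES[best]
```

Restricting the argmax to the context-appropriate subset is one simple way to realize the "fewer gestures at a time" idea: the classifier still scores everything, but impossible gestures can never win.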
We had been thinking about controlling a social application with gestures for quite a while. When we found a team of like-minded enthusiasts, we seized the opportunity and ran a nearly three-month research project. The results were quite impressive, and now we are looking into whether we can implement them in a real-life application.