The Cyber-Circus: A Google-Powered Gesture Interface Showcase

Earlier this year, during a keynote at the Google I/O developer conference, company representatives demonstrated an under-development HTML5- and CSS-based web application called Movi.Kanti.Revo, co-developed with Cirque du Soleil. As of a week ago, it's now available for you to try out for yourself. And notably for an embedded vision audience, it's gesture-based via the webcam-tapping capabilities of WebRTC, which (quoting Wikipedia) strives to "enable applications such as voice calling, video chat and P2P file sharing without plugins." Per the Google blog post announcing the unveiling:

Breaking with the tradition of point and click web browsing, you can navigate through this unique experience simply by gesturing in front of your device’s camera. This was made possible using the getUserMedia feature of WebRTC, a technology supported by modern browsers, that, with your permission, gives web pages access to your computer’s camera and microphone without installing any additional software.
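For those who haven't worked with getUserMedia before, here's a rough sketch of what that camera request looks like in JavaScript. This is my own illustrative example, not code from the experiment; the element id is made up, and the fallback to the older, vendor-prefixed callback form (which is roughly what browsers exposed around the time of the experiment's launch) is an assumption on my part.

```javascript
// Minimal sketch: ask for camera access and show the stream in a <video> element.
// Assumes the page contains <video id="camera-preview" autoplay muted></video>;
// the element id is illustrative, not taken from the experiment's source.
function startCameraPreview() {
  var video = document.getElementById('camera-preview');

  // Modern, promise-based API.
  if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
    navigator.mediaDevices.getUserMedia({ video: true, audio: false })
      .then(function (stream) {
        video.srcObject = stream;   // render the live camera feed
      })
      .catch(function (err) {
        console.log('Camera access was denied or failed:', err);
      });
  } else {
    // Older, prefixed, callback-based form, as browsers exposed it circa 2012.
    var legacyGetUserMedia = navigator.webkitGetUserMedia || navigator.mozGetUserMedia;
    if (legacyGetUserMedia) {
      legacyGetUserMedia.call(navigator, { video: true },
        function (stream) {
          // Older browsers used object URLs rather than srcObject.
          video.src = window.URL.createObjectURL(stream);
        },
        function (err) {
          console.log('Camera access was denied or failed:', err);
        });
    }
  }
}
```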

A Chromium Blog post provides more technical details:

The experiment was created using just HTML5, and the environment is built entirely with markup and CSS. Like set pieces on stage, divs, images and other elements are positioned in a 3D space using CSS. To create movement, CSS animations and 3D transforms were applied making the elements appear closer and further away. Everything is positioned and scaled individually to create a highly realistic interactive environment. In addition, the experiment uses HTML5 <audio> to play music and sounds.

Movi.Kanti.Revo breaks with the tradition of keyboard or mouse navigation; instead users navigate through an interactive Cirque du Soleil world with their gestures. To accomplish this, the experiment asks users for permission to access their webcam using the new getUserMedia API. With this new API, the experiment renders the camera output to a small <video> element on the page. A facial detection JavaScript library then looks for movement and applies a CSS 3D transform to the elements on the page, making the environment move with the user.
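To make the "set pieces on stage" description a bit more concrete, here's a minimal JavaScript sketch of the general technique: a container with a CSS perspective, and child elements pushed to different depths with 3D transforms. The structure, sizes and distances are illustrative guesses on my part, not taken from the experiment's actual markup.

```javascript
// Minimal sketch of the "set pieces in 3D" idea: an outer stage establishes a
// CSS perspective, and individual elements are placed at different depths with
// 3D transforms. Everything here is illustrative, not the experiment's markup.
function buildScene() {
  // Outer stage establishes the perspective for everything inside it.
  var stage = document.createElement('div');
  stage.style.perspective = '1000px';
  stage.style.position = 'relative';
  stage.style.width = '100%';
  stage.style.height = '100vh';
  stage.style.overflow = 'hidden';
  document.body.appendChild(stage);

  // Inner scene holds the set pieces and is what gets moved later.
  var scene = document.createElement('div');
  scene.style.position = 'absolute';
  scene.style.width = '100%';
  scene.style.height = '100%';
  scene.style.transformStyle = 'preserve-3d';
  scene.style.transition = 'transform 0.3s ease-out';  // CSS handles the motion
  stage.appendChild(scene);

  // Three "set pieces" at increasing depth; farther ones render smaller,
  // like scenery placed upstage.
  [-200, -600, -1200].forEach(function (depth, i) {
    var piece = document.createElement('div');
    piece.style.position = 'absolute';
    piece.style.left = (20 + i * 25) + '%';
    piece.style.top = '40%';
    piece.style.width = '150px';
    piece.style.height = '150px';
    piece.style.background = ['#c33', '#3c3', '#33c'][i];
    piece.style.transform = 'translateZ(' + depth + 'px)';
    scene.appendChild(piece);
  });

  return scene;
}
```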
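And here's a similarly hedged sketch of the camera-driven part. The Chromium post doesn't name the facial-detection library, so detectFace() below stands in as a hypothetical helper that returns the face's position within the video frame; the sketch simply maps that position to a CSS 3D transform on the scene built above.

```javascript
// Sketch of driving the scene from the camera. detectFace() is a hypothetical
// helper that returns the face centre in normalised coordinates (0..1) within
// the <video> frame, or null if no face is found; it is not a real library call.
function trackAndMove(videoElement, scene) {
  function step() {
    var face = detectFace(videoElement);   // hypothetical detector
    if (face) {
      // Map the face offset from the frame centre (-0.5..0.5) to a small
      // rotation and shift of the whole scene, so the environment appears
      // to move with the user, as the Chromium post describes.
      var dx = face.x - 0.5;
      var dy = face.y - 0.5;
      scene.style.transform =
        'rotateY(' + (dx * -20) + 'deg) ' +
        'rotateX(' + (dy * 15) + 'deg) ' +
        'translateX(' + (dx * -50) + 'px)';
    }
    requestAnimationFrame(step);   // keep tracking on every animation frame
  }
  step();
}
```

Calling buildScene() and then handing the returned element to trackAndMove(), along with the <video> element fed by getUserMedia, would wire the two sketches together.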

And for even more implementation information, a detailed tutorial article is also available.

Not surprisingly, Movi.Kanti.Revo worked well for me in Chrome v22.0.1.229.79 for Mac OS X, complete with a small window at the bottom of the screen showing how my head and hands were being detected and tracked. I had no similar webcam-enabled luck in either Firefox v15.0.1 or Safari v6.0, whose WebRTC support is presumably comparatively immature, although both browsers offered a mouse-click-based navigation interface as an alternative. And my attempt to access www.movikantirevo.com via Internet Explorer 8 on (virtualized) Windows XP was even less successful, rendering only "garbage" text and graphical UI still shots on-screen.
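That graceful fallback presumably comes down to a simple capability check along these lines; a minimal sketch, with the navigation-mode functions as hypothetical placeholders rather than anything from the experiment itself:

```javascript
// Sketch of the kind of capability check behind the fallback behaviour
// described above: browsers without getUserMedia get mouse navigation.
// enableGestureNavigation() and enableMouseNavigation() are hypothetical
// placeholders, not functions from the experiment.
function chooseNavigationMode() {
  var hasGetUserMedia = !!(
    (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) ||
    navigator.webkitGetUserMedia ||
    navigator.mozGetUserMedia
  );

  if (hasGetUserMedia) {
    enableGestureNavigation();   // camera-driven, as in Chrome
  } else {
    enableMouseNavigation();     // click-based fallback, as in Firefox and Safari at the time
  }
}
```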

For more on the Movi.Kanti.Revo "Chrome experiment", see the following additional coverage:
