The IEEE Hot Chips Conference has historically been a can't-miss event for me each year, but I wasn't able to attend this year's 23rd iteration held last week. Tipped off by a 'tweet' from a former co-worker, I realized too late that Microsoft was on the program, discussing the objectives and implementation of the Kinect system design. Fortunately, Dean Takahashi from VentureBeat was there, and he subsequently filed a compelling writeup.
The article title, 'How Microsoft engineered Kinect to withstand gamers and lightning strikes,' does a solid job of summarizing the themes of the presentation (titled 'Electrons, Photons, Phonons, Waves, Bits, and Industrial Design: Microsoft Kinect') delivered by Microsoft engineers Dawson Yee and Scott McEldowney, with John Sell listed as a co-author in the program. The company had previously gotten a lot of heat, both figuratively and literally, for its decision to launch the Xbox 360 in the fall of 2005 with 90 nm-fabricated ICs and, as it turned out, a sub-optimal thermal subsystem. While doing so enabled Microsoft to start selling the Xbox 360 one critical market year ahead of Sony's PlayStation 3 (as well as ahead of the Christmas 2005 shopping season), the resulting abundance of RRODs (red rings of death) on first-generation systems prompted expensive-to-Microsoft warranty extensions and system replacements for early adopters. Admittedly, I personally saw several Xbox 360s sitting directly on thick shag carpets, "hermetically sealed" within entertainment system cabinets, left turned on 24/7, and in other airflow-constraining and otherwise excessively harsh (at least to this thermally clued-in engineer) usage settings. Nonetheless, consumer ignorance is ultimately no excuse. And Microsoft clearly had no desire to repeat its past design mistakes with Kinect, judging from these excerpts from Takahashi's piece:
Microsoft engineers said they deliberately over-designed the Kinect system so that it could withstand anything that consumers could throw at it: hot temperatures, drops, careless shipping, abusive gamers, a sudden loss of power, and even surge protection from lightning strikes. (To be clear, it won’t survive if hit by lightning. But if your house or electrical wires are hit by lightning and the power surges, then Kinect has a chance of surviving).
As for the actual design of Kinect, and how its hardware-and-software architecture implements 3D vision, facial detection and recognition, gesture identification, and audio location with background-noise suppression, a perusal of Takahashi's presentation summary may not turn up any notable surprises…at least for those of you who are already familiar with iFixit's Kinect teardown, or with past writeups in Wired Magazine and elsewhere. Nonetheless, it answered at least one question that had long nagged me: what was the function of the Marvell-developed ARM SoC in the unit, given that the bulk of the image processing code ran not in Kinect itself but in the USB2-tethered console?
The audio array was a Microsoft research invention with four microphones that enabled Microsoft to identify spoken words and tell the direction that a sound was coming from. The box includes a Marvell 88ap1 audio chip and a Texas Instruments TAS1020 motor controller. The audio has to be capable of handling speech commands, doing simultaneous voice chat between gamers, and video conferencing. For that purpose, Microsoft went with higher-quality wideband 24-bit audio. The quality had to be good enough so Kinect could pick up a quiet voice or a loud voice. The Marvell chip handled the audio.
The Microsoft presenters also confirmed, as I pointed out earlier today, that the USB2 interface to the console was left unencrypted by design, to enable subsequent hacking. And they described how Kinect not only self-calibrates for each unique operating environment but was also intentionally over-designed to allow for future software-enabled capability expansion (the system includes VGA-resolution image sensors, for example, although the various embedded vision algorithms use only downscaled QVGA-resolution frames):
Each system has to be calibrated before it ships, but the system also has to be capable of calibrating itself in tests with the user. The requirement for calibration meant that the system had to have a tilt motor which could automatically raise or lower the sensor. With the motor came more requirements for precision manufacturing and reliability. When you turn on Kinect, the first thing the camera does is look for the floor. When it finds the floor, it knows a user won’t be far away. The system was “over-designed” to be more accurate than necessary because the engineers anticipated future applications that would need the accuracy.
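To put the VGA-versus-QVGA point in concrete terms, here is a minimal Python sketch of one way a 640x480 frame could be reduced to 320x240. Averaging each 2x2 pixel block is purely my assumption for illustration; neither the presentation nor Takahashi's summary says how Kinect actually does the downscaling, and the function name and the synthetic test frame are likewise hypothetical.

```python
def downscale_2x2(frame):
    """Halve a grayscale frame (a list of rows) in each dimension
    by averaging every 2x2 block of pixels.

    Assumption: block averaging is used here only for illustration;
    the source does not say how Kinect downscales its frames.
    """
    height, width = len(frame), len(frame[0])
    if height % 2 or width % 2:
        raise ValueError("frame dimensions must be even")
    return [
        [
            (frame[y][x] + frame[y][x + 1]
             + frame[y + 1][x] + frame[y + 1][x + 1]) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


VGA_WIDTH, VGA_HEIGHT = 640, 480

# Synthetic stand-in for a sensor frame (hypothetical data, not Kinect output).
vga_frame = [[(x + y) % 256 for x in range(VGA_WIDTH)]
             for y in range(VGA_HEIGHT)]

qvga_frame = downscale_2x2(vga_frame)

# Width and height each halve, so the pixel count drops to one quarter.
print(len(qvga_frame[0]), len(qvga_frame))
print(VGA_WIDTH * VGA_HEIGHT, len(qvga_frame[0]) * len(qvga_frame))
```

A QVGA frame carries only a quarter of the pixels of its VGA parent (76,800 versus 307,200), which hints at how much headroom the sensors leave for future software.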
Speaking of Kinect, if you're interested in mating it to an existing Xbox 360 in your home, in trying out the SDK either standalone or in conjunction with the Microsoft Robotics kit, or in doing a bit of unofficial platform hacking, I encourage you to check out an in-progress one-day promotion at CowBoom (Best Buy's online outlet site). $79.99 plus $5 shipping for a refurbished Kinect is a great deal.