Tech Analysis: Kinect
Digital Foundry on latency, CPU overheads and how it all works.
It's the day after the night before, and a chance to reflect on our hands-on playtest of the new Microsoft Kinect for Xbox 360 platform, contact our sources and attempt to put together some semblance of the technical picture behind the device formerly known as Project Natal.
It's difficult to dislike what Microsoft has done, despite the fact that none of the games on offer were designed to appeal to the core audience that has loyally stuck by the platform over the last five years. Behind the Avatar-driven, cutesy, cartoon-style games is a technological masterpiece that is simply a breathtaking achievement: full-motion capture of multiple players simultaneously combined with excellent quality voice recognition, all in a consumer-level package.
So what are the crucial components within Kinect for Xbox 360, and how have we seen them implemented in the titles we got to play on Monday night?
Kinect has a traditional RGB camera in it, as found in a multitude of webcams and mobile phones, and it's capable of a standard 640x480 resolution, operating at 30 frames per second. Alongside this are the depth sensors. These bathe the area in an infra-red wash, colour-coding the scene based on how far away the objects are. This is the key to Kinect's unique capabilities. Not only does it allow games to know where everyone and everything is in 3D space, but it also means that even without the RGB data it can operate just fine in any lighting conditions - even pitch black.
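To make the "colour-coding" idea concrete, here is a minimal sketch of how a depth reading might be turned into a brightness value for display. It is not Microsoft's implementation: the function names, the millimetre units, the 800mm to 4000mm working range and the convention that 0 means "no reading" are all assumptions made for illustration.

```python
def depth_to_gray(depth_mm, near_mm=800, far_mm=4000):
    """Map a depth reading to a 0-255 grey level: nearer is brighter.

    near_mm and far_mm are hypothetical range limits, not Kinect specs.
    """
    if depth_mm <= 0:
        # Assumed convention: 0 (or less) means the sensor got no reading.
        return 0
    # Clamp into the working range, then scale so near -> 255 and far -> 0.
    clamped = min(max(depth_mm, near_mm), far_mm)
    scale = (far_mm - clamped) / (far_mm - near_mm)
    return round(255 * scale)


def colour_code(depth_map):
    """Apply depth_to_gray to every pixel of a 2D list of depth readings."""
    return [[depth_to_gray(d) for d in row] for row in depth_map]


if __name__ == "__main__":
    tiny_depth_map = [
        [0, 800, 4000],
        [1200, 2400, 5000],
    ]
    for row in colour_code(tiny_depth_map):
        print(row)
```

Because the result depends only on distance, a visualisation like this looks the same whether the room is brightly lit or pitch black.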
The depth map is the most crucial weapon in Kinect's arsenal, and it can also be integrated with the traditional RGB webcam image in a process known as registration, although combining the two planes does incur a small additional CPU load. However, even without registration, we can see that developers are making use of the depth map in the launch titles, visualising it directly in the game.
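As a rough illustration of why registration costs CPU time, the sketch below maps a single depth pixel into the RGB image using a simple pinhole-camera model. This is a textbook approach rather than anything confirmed about Kinect itself: the function name, the focal lengths, the image centre and the 25mm horizontal offset between the two cameras are all hypothetical values chosen for the example, and it assumes both cameras share the same optics and are perfectly parallel.

```python
def register_pixel(u, v, depth_mm,
                   fx=580.0, fy=580.0, cx=320.0, cy=240.0,
                   baseline_mm=25.0):
    """Return the (u, v) position in the RGB image for one depth pixel.

    All camera parameters are hypothetical; real devices need calibration.
    """
    if depth_mm <= 0:
        # Assumed convention: no valid depth reading, so nothing to map.
        return None
    # Back-project the depth pixel to a 3D point in the depth camera's frame.
    x_mm = (u - cx) * depth_mm / fx
    y_mm = (v - cy) * depth_mm / fy
    # Shift into the RGB camera's frame (assumed purely horizontal offset).
    x_mm += baseline_mm
    # Project the 3D point back onto the RGB image plane.
    return (fx * x_mm / depth_mm + cx, fy * y_mm / depth_mm + cy)


if __name__ == "__main__":
    # Nearby objects shift more between the two views than distant ones.
    print(register_pixel(320, 240, 1000))
    print(register_pixel(320, 240, 4000))
```

A calculation along these lines has to run for every pixel of a 640x480 frame, 30 times a second, which gives some feel for where the extra load comes from.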
Perhaps the most dramatic example of this is in Ubisoft's Your Shape: Fitness Evolved. Here your on-screen persona is effectively a post-processed rendition of the depth map, with the main figure (i.e. the player) cut out and additional particle effects overlaid to create a much smoother look.
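One simple way to cut a figure out of a depth map is to keep only the pixels that fall within a band of distances. The sketch below shows that idea and nothing more: it is not Ubisoft's technique, and the function name, the millimetre units and the band limits are assumptions for illustration.

```python
def cut_out_player(depth_map, near_mm, far_mm):
    """Return a mask that is True where a pixel lies inside the depth band.

    depth_map is a 2D list of depth readings; the band limits are
    hypothetical and would need tuning to where the player stands.
    """
    return [[near_mm <= d <= far_mm for d in row] for row in depth_map]


if __name__ == "__main__":
    # A toy 2x3 scene: a "player" at about 1.5m in front of a wall at 3m.
    scene = [
        [3000, 1500, 3000],
        [3000, 1600, 2900],
    ]
    tight = cut_out_player(scene, 1200, 2000)
    loose = cut_out_player(scene, 1200, 2950)
    print(tight)  # only the middle column is kept
    print(loose)  # the 2900mm background pixel is now kept as well
```

The second call shows the trade-off: widen the band and background objects start to appear in the mask alongside the player.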
We also see the depth map in effect in Harmonix's Dance Central. Occasionally the on-screen dancers fade out, to be replaced with another heavily post-processed rendition of the depth map, complete with a range of psychedelic effects. Dance Central is actually an interesting case in point because unlike Your Shape, the map isn't quite so clearly filtered: background items and players will "leak" into the image.
The question is, fancy technical trickery aside, does it actually work? Up in the massive penthouse suite Microsoft had reserved for the event, things were already getting busy when we arrived. While the gameplay areas around each pod were taped off, there was still plenty of potential interference from people wandering into the camera's field of view and also from flash photography potentially upsetting the IR beams from the depth cams.
However, in all but one instance Kinect worked beautifully, with only a single pod - running cartoon racing title Joy Ride - posing any sort of issue. Even this turned out to be a blessing in disguise. Looking to debug the problem, the game's caretaker returned to the development dash and loaded up the "NUI" debug tool. I managed to get a sneaky photo of this tool in action - essentially it shows the people in view picked up on the depth camera, and then assigns skeletal movement points to them.
So, having established that the system actually works, it was then time to revisit our thoughts on the lag. If you recall, the latency inherent within the new control scheme was one of our biggest reservations about Kinect when we saw it last year in its pre-production Project Natal guise. For comparison, we chose to run our patented wavy-arm test on the very same game, albeit an updated rendition.
So, not much has changed in terms of the performance level compared to what we played a year ago. You still need to think ahead and react in advance somewhat to make sure that you hit all those balls, with lag in the 200ms range (including the latency from the display, of course). It's the sort of shift that you're likely to make naturally as you get to grips with the way the system works.
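To put that 200ms figure in perspective, it helps to express it in frames. The quick calculation below does that for the camera's 30 frames per second, plus a 60 frames per second display refresh, which is an assumption on our part rather than a measured figure. The function name is ours too.

```python
def latency_in_frames(latency_ms, fps):
    """Convert a latency in milliseconds to a number of frames at a given rate."""
    return latency_ms * fps / 1000


if __name__ == "__main__":
    for fps in (30, 60):
        frames = latency_in_frames(200, fps)
        print(f"200ms at {fps}fps is {frames:g} frames")
```

So lag in the 200ms range works out to around six camera frames between your arm moving and the result appearing on-screen.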