What Went Wrong with Kinect?
Digital Foundry looks at the highs and lows of Microsoft's controller-free platform and considers a next-gen Kinect.
"It became very important for us to create a new control scheme where anybody - no matter what your age or gaming ability - can just get in there and play with Xbox. No instructions, just very simple and easy to use. But at the same time we wanted to give extra fidelity for core gamers. So, simple and approachable, extra fidelity - it seems like opposite things, but those are both things we can do with Project Natal."
It's almost three years now since Digital Foundry first went "hands-off" with Microsoft's seemingly magical depth camera technology, and out of all the things creative director Kudo Tsunoda said at that initial presentation, this one quote is perhaps the most useful in pinpointing exactly what went wrong with Kinect. Conceived as a device to break down barriers to gaming while at the same time improving the quality of the core game experience, it's safe to say that the fastest-selling consumer electronics device of all time achieved only some of its goals.
At its height, Kinect produced some great new mainstream-friendly games and delighted players of all backgrounds with some key exclusive titles built around the strengths - and weaknesses - of the hardware. These were games that simply wouldn't have been as much fun with the traditional controller and, while they may not have appealed to the core gamer, they did indeed manage to expand the reach of the Xbox console. However, in the wake of this week's release of the genuinely disappointing Steel Battalion, one thing is clear: Kinect categorically does not offer "extra fidelity for core gamers" and, these days, Tsunoda would be laughed out of the room for suggesting it.
Tsunoda himself would surely have been well aware of Kinect's limitations, even back then. Speaking to developers who absorbed everything Microsoft had to offer about the new technology, it's safe to say that the platform holder had quantified the camera's performance on every conceivable level at around the same time that kits were sent out to developers. It knew exactly what it could do and what it couldn't, and was busy writing white papers and devising entire seminars aimed at briefing devs on optimising for the hardware.
On the subject of Kinect's inherent latency - the key factor that kills its implementation in core titles - we were told how Microsoft broke down the entire image capture/data transmission/image recognition pipeline, providing precise latency timings for each stage, and even going to the trouble of providing best-case/worst-case lag scenarios depending on the type of processing/rendering set-up developers were using. According to our sources, Microsoft estimated around 300ms of lag in a worst-case scenario, and something close to 100ms at its absolute best. However, most game devs weren't going to build an entire game engine around Kinect, so the end result was often somewhere in between. Rare actually went on the record to say that lag in Kinect Sports was 150ms, which would make it one of the more responsive titles available for the platform.
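To put those figures into more familiar terms, the short Python sketch below converts the quoted latencies into frames of game time. Only the 100ms, 150ms and 300ms values come from the reporting above; the 30fps and 60fps frame-rates, the dictionary and the function name are our own illustrative assumptions rather than anything from Microsoft's documentation.

```python
# Latency figures quoted in the article, in milliseconds.
QUOTED_LATENCY_MS = {
    "best case": 100,
    "Kinect Sports (per Rare)": 150,
    "worst case": 300,
}


def latency_in_frames(latency_ms: float, fps: int) -> float:
    """Return how many rendered frames elapse during latency_ms at the given frame-rate."""
    return latency_ms * fps / 1000


if __name__ == "__main__":
    for label, ms in QUOTED_LATENCY_MS.items():
        # 30fps and 60fps are assumed here as typical console frame-rates.
        at_30 = latency_in_frames(ms, 30)
        at_60 = latency_in_frames(ms, 60)
        print(f"{label}: {ms}ms = {at_30:.1f} frames at 30fps, {at_60:.1f} frames at 60fps")
```

Seen this way, even the best-case figure leaves several frames between a movement and its on-screen result, before any human reaction time is taken into account.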
Even that only puts Kinect gaming at the same kind of response level as cloud gaming service OnLive at its best, but with the additional 'lag' of the human body itself, because jumping, kicking or waving your arms about simply takes that much longer than pressing a button. There are also built-in latency issues with games based around gesture commands in that the system takes a certain amount of time to figure out what you're actually doing before it can even begin to turn your input into a motion in-game.
This is not to say that Kinect can't host entertaining games. Kinect Adventures might not have been the Wii Sports-level killer app Microsoft hoped it would be, but it introduced the hardware and its capabilities admirably - Kinect Sports and its sequel likewise. In Dance Central, the technology combined with a specific style of gameplay that proved a match made in heaven: in a game defined by processing set movements, lag could be totally factored out of the equation. In Child of Eden and The Gunstringer, Kinect demonstrated it could even handle shooting titles. While precision headshots were off the table, these games proved that grand, sweeping gestures could actually translate into some satisfying gameplay.
Microsoft's depth-cam had its strengths, it had its weaknesses, but as long as the games were built around them, it could succeed and thrive as a platform in its own right. It may not have appealed to the core in the way that Kudo Tsunoda suggested it would, but Kinect initially seemed as if it could overcome the dance and fitness fads and evolve into a companion platform to the Xbox 360 aimed at the less committed gamer. But Microsoft wanted more.
Hostile Territory
E3 2011 is perhaps where things started to go wrong, where the messaging started to get rather confusing and where we saw Kinect begin to encroach into 'hostile territory' - the core market. But even at this point there was still plenty of evidence that Kinect was a thriving platform. We saw titles like Fable: The Journey, Kinect Sports: Season 2 and Disneyland Adventures - designed from the ground up for the hardware, very much in the mould of the more successful first-generation titles. Kinect Fun Labs showed another potential path for the tech in smartphone-style, "snackable" concept-driven games - a rich vein of potential that never really evolved much further.
Purpose-built titles could see developers design around the weaknesses of the platform, but in the move to integrate Kinect into core titles, it suddenly started to look wholly inadequate for the task in hand. Forza Motorsport 4 gained an almost completely pointless control implementation where the player had no say over acceleration or braking - instead Kinect only offered value for its head-tracking and Autovista modes. Ghost Recon: Future Soldier saw Ubisoft attempt to graft Kinect onto the typical third-person controller set-up, resulting in an exercise which demonstrated the lack of "extra fidelity" in spectacular style. The eagerly-awaited Kinect Star Wars - teased at the Kinect reveal a year earlier - looked risible. And then there was Mass Effect 3's voice control - another value-added extra that left most of Kinect's expensive technology unused.
Kinect and traditional Xbox 360 titles were clearly and obviously a poor fit, but Microsoft told an unconvinced audience otherwise:
"It's always great to see Kinect showing up in all different types of entertainment and gaming genres. Those new technologies - the longer that developers are able to play around with it and develop on it, the better their understanding of how to use the tech in a way that best fits the experiences they want to build is going to be," Kudo Tsunoda told VentureBeat at the time.
"You see such a wider variety of Kinect content now. It's great to see stuff showing up in more hardcore genres. And I think the way that people are using it in their experiences really shows the breadth of what Kinect can do. It enables, I think, creative people to use Kinect in a way that really enhances their experiences in a meaningful way for people who love their franchises."
"A combination of latency issues and inaccurate tracking makes Kinect fundamentally ill-suited to fast action core titles."
The reality proved to be somewhat different. Across the board, the arrival of Kinect in core titles manifested merely as novelty bonus features that could safely be ignored. In a business where development resources are at a premium, nobody outside of Microsoft really had the time to invest in getting the most out of the Kinect hardware - and they certainly weren't going to fundamentally redesign their games to suit camera functionality only a minority of the userbase could use.
In the meantime, Gears of War: Exile - hotly rumoured to be an on-rails Kinect-exclusive chapter in Epic Games' franchise - was canned. To this day we've yet to see any kind of convincing Kinect implementation that can accurately use the system for the kind of precision gunplay a traditional core shooter requires. Put simply, you can point and shoot with the Wiimote and PlayStation Move, but you can't with Kinect, which is wholly reliant on big, obvious movement.
Kinect: The Nadir
As we moved into 2012, it became clear that the platform was losing momentum and Kinect reached its nadir at E3: the big titles unveiled at the press conference were predictably fitness and dance-related - a Nike tie-in and another Dance Central game. A platform that demanded innovative thinking was declining into 'me-too' territory - and it was Microsoft itself that was championing these titles as the best the platform could offer.
Fable: The Journey - still unreleased, and possibly the only major franchise title Microsoft developed that could bridge the gap between casual and core - was criminally overlooked at the E3 conference, represented only by a short trailer. Crytek's combat title Ryse was a no-show, fuelling speculation that it's been bumped to the next-gen Durango, while Gore Verbinski's Matter sounded intriguing but, again, had nothing to show. Wreckateer - which has you hurling rocks at castles with a motion-propelled catapult - was the only title that captured even the merest shadow of the fun, concept-driven gaming that gave the Kinect format its initial burst of success.
Meanwhile, core games creators had finally settled upon a use for Kinect - in the process disregarding the vast majority of its innovative technology. In Bethesda's Skyrim, audio commands replaced convoluted menu manipulation - when the system actually recognises voice properly, of course. Audio-enabled activation of the game's "Shouts" was a given. Fus ro duh.
The concept of using voice to keep you in the game without needing to duck into menu sub-systems clearly has some merit and it's not surprising that forthcoming titles such as FIFA 13 (tactics/substitution), Forza Horizon (voice-controlled GPS), and Madden NFL (audibles) implement it, as do the new South Park and Splinter Cell titles. Assuming voice recognition works consistently, these are features worth having - but it's difficult to avoid the conclusion that in these cases Kinect is morphing into one hell of an expensive microphone in a world where almost every 360 owner already has a perfectly good one built into their Live headset. Utilisation of the RGB and depth cams in all of these core titles is notable only by its absence: game developers simply don't know how to implement them into their designs, or can't dedicate enough resources to do anything meaningful when only a minority of gamers will ever actually use them.
Microsoft's biggest failure at E3 wasn't just the complete omission of any kind of exciting "must have" Kinect title, but the impression that it didn't even seem to be trying. Just three years after Kudo Tsunoda unveiled this magical new technology that promised so much to a new generation of gamers, it was difficult to avoid the conclusion that Microsoft was out of ideas.
Microsoft vs. the Future of Motion Control Gaming
Will Kinect and core games ever truly converge? Why was Kinect shoe-horned into these titles to begin with? Perhaps it's something to do with the rumours that Microsoft aims to bundle the technology with its next console, preparing its userbase for a new launch where camera-based gameplay elements are going to be part and parcel of the overall package. Perhaps the singular lack of new ideas at this year's E3 had something to do with them being saved for the next-gen Durango.
The recently leaked summer 2010 Microsoft presentation - confirmed by sources as being genuine, but very out-of-date - throws some light on what we could expect. Diagrams in the presentation suggest that the single Kinect sensor is replaced by two smaller satellite cameras, offering a significantly improved field of view that could suit more playspaces and support up to four players simultaneously.
"Developer sources - not to mention Microsoft's own leaked documents - suggest that a new Kinect will play a major part in its plans for the next-gen Durango console project."
There's also discussion on the use of "props" - this is the idea of giving the player something to hold that can be tracked by Kinect - like a baseball bat, or tennis racquet for example. The last time we spoke to a Kinect/Xbox 360 developer, these concepts were expressly prohibited by Microsoft's Technical Requirement Checklist - the argument being that using props directly contradicted the "You Are The Controller" marketing message. Now it looks as though Microsoft accepts that completely hands-free gameplay may not be such a great idea.
Beyond that, not much is covered, but it's clear that incorporating Kinect into the console at the design stage leads to some obvious wins for performance and functionality: simply liberating the cameras from the clutches of the Xbox 360's lacklustre USB controller could cut down latency by as much as 60ms, while next-gen processing power offers the chance to take skeletal tracking and image recognition to the next level. From a developer and publisher standpoint, Kinect integration makes a lot more sense if every user has access to its features.
Despite plenty of Durango leaks floating about, there's scant detail on the next-gen Kinect. It's interesting to note that the second-generation PrimeSense camera (the original reference design is extremely similar to the existing Kinect) ups RGB resolution to 1280x960, but depth resolution remains at VGA standard 640x480. Instead, frame-rate increases to 60FPS. We can't help but feel that depth perception really needs to improve significantly to make Kinect 2 flexible enough for all gameplay scenarios - but maybe stereo viewpoints could make the difference.
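As a rough illustration of what those specs mean in raw numbers, the Python sketch below compares per-frame pixel counts and per-second depth sample rates. The 1280x960, 640x480 and 60FPS values are the ones quoted above; the 640x480 at 30FPS baseline for the existing camera, and all of the names in the code, are our own assumptions for the sake of comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    """A camera stream described by its resolution and frame-rate."""
    width: int
    height: int
    fps: int

    @property
    def pixels_per_frame(self) -> int:
        return self.width * self.height

    @property
    def pixels_per_second(self) -> int:
        return self.pixels_per_frame * self.fps


# Assumed baseline for the existing camera (not stated in the article).
CURRENT_RGB = Stream(640, 480, 30)
CURRENT_DEPTH = Stream(640, 480, 30)

# Second-generation figures quoted in the article. The 60FPS rate is applied
# to depth only; applying it to RGB as well would be a further assumption.
NEXT_DEPTH = Stream(640, 480, 60)
NEXT_RGB_WIDTH, NEXT_RGB_HEIGHT = 1280, 960

if __name__ == "__main__":
    rgb_gain = (NEXT_RGB_WIDTH * NEXT_RGB_HEIGHT) / CURRENT_RGB.pixels_per_frame
    depth_gain = NEXT_DEPTH.pixels_per_second / CURRENT_DEPTH.pixels_per_second
    print(f"RGB pixels per frame: x{rgb_gain:.0f}")
    print(f"Depth pixels per frame: {NEXT_DEPTH.pixels_per_frame} (unchanged)")
    print(f"Depth samples per second: x{depth_gain:.0f}")
```

Under those assumptions the depth camera would see the scene more often but in no finer spatial detail, which is the limitation we are concerned about.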
Elsewhere, innovation in motion control technology is progressing at an astonishing pace, with the Leap motion sensor in particular looking truly impressive. It features a level of tracking that apparently offers 100x the precision of Kinect, and could be scaled up to a living room playspace - presumably at a price. But the worry with Leap - and indeed any controller-free set-up - is the lack of tangible feedback to a basic function like a button press, not to mention the fact that waving your hands about is simply far more fatigue-inducing than handling a traditional controller. Is 3D motion-sensing really the future of gaming?
"The worry with any 'controller free' solution is the lack of feedback players get from the simple action of pressing a button, or moving a stick."
"There are some experiences that it can do that are really neat but there just weren't enough experiences that made it make enough sense as a platform-level controller," Sony's Dr Richard Marks told Digital Foundry during E3 2011, discussing his own research into 3D cameras.
"Coming back is that sometimes we need buttons to have certain kinds of experiences. Other times we need more precision than we can get out of those cameras. We need to know exactly what you're doing with your hands, especially in the more hardcore experiences... So the click of a button is equally the input, but also the feeling that it actually occurred. That's such an important thing. If you make a gesture to make something happen all the time, you don't have that immediate feeling of knowing that it worked. You have to wait and see if it happened and that just slows everything down. A click gives instant knowledge..."
Looking back at Kudo Tsunoda's vision for Kinect - a technology that opens up gaming for newcomers while offering extra fidelity for core gamers - it's difficult to avoid the conclusion that it was Microsoft's competitors that got closer to the ideal. Wii reduced the gaming interface to a simple remote control - and everyone knows how to use one of those - while PlayStation Move perhaps over-complicated things in comparison but undoubtedly had that extra fidelity that Tsunoda craved. Can a bundled Kinect 2 actually deliver immediacy and additional precision over the standard controller? We'll see. Maybe next year's E3 will be a more positive turning point for the technology.