Editorial

Xbox One has a Kinect problem, but can 'Cortana' fix that?

Microsoft announced its new Xbox One console in May with a much-ballyhooed press event that largely proclaimed the greatness of the console’s TV features, made possible by the next-generation Kinect sensor.

The not-so-subtle assertion at the event was that the bundled Kinect sensor was a large part of the Xbox One experience – a feature that would set the console apart from its competitors. So far, however, what should be one of the sensor’s major advantages has been a major drawback: its voice recognition capabilities. Microsoft surely knew it was making a big gamble by selling the console with a bundled Kinect sensor, thereby increasing the cost, and that gamble has yet to pay off in terms of capabilities.


The next-gen Kinect sensor has primarily frustrated rather than satisfied gamers so far.

When the Xbox One launched in November, reviews praised the console’s all-in-one features with one big caveat: Kinect’s voice recognition, as with the Xbox 360’s Kinect sensor, was sporadic at best. Now that users have played several games that make use of voice recognition, complaints of games consistently not recognizing speech are becoming commonplace. Wired, for instance, published an article on Thursday proclaiming that “Kinect completely ruins ‘Need for Speed [Rivals]’ on Xbox One.”

Voice recognition in Xbox One games released thus far has been alarmingly bad.

The problem boils down to poor recognition of natural language. In the “Need for Speed Rivals” case, the command “open map” was confused with similar-sounding phrases, including – amusingly enough – “the Kinect is a piece of crap.” I’ve experienced similar issues in “Dead Rising 3,” where even coughing or a chair squeaking will accidentally resume a game from the pause menu.

Obviously developers have to do better, but any help Microsoft could provide in terms of the tools that go along with Kinect would clearly make their lives easier – and soon that may just happen.

Recent reports indicate Microsoft is working on a digital assistant codenamed “Cortana” (after the Halo franchise’s A.I. companion) that will use voice recognition. Cortana is expected to first launch as part of Windows Phone 8.1 early next year before making its way to Windows 8.1 and Xbox One. It’s believed to be similar to Google Now in that it will recognize natural language – a significant step up from Kinect’s current command-based system.

As far as Windows and Windows Phone are concerned, Cortana is a much-needed feature in the battle against Android and iOS, which have Google Now and Siri, respectively. The new assistant will reportedly replace Bing as the default search service for Windows Phone, and it will be able to “learn and adapt” to user input, according to ZDNet’s Mary Jo Foley. Those are great features for personal computing devices, but they could be even more important on Xbox One, where memorized commands often aren’t recognized.

The Halo franchise's Cortana A.I. assistant

Since the first-generation Kinect sensor was introduced at E3 2009, it’s always been a technology that promised more than it delivered, thanks in large part to Microsoft’s unclear strategy. Games such as “Kinect Sports” and “Kinect Adventures” showed the accessory’s potential, but despite the sensor selling 8 million units in 60 days, the number of games released with Kinect features slowed to a trickle. “Zumba Fitness: World Party” was the only game released in 2013 that required the original Kinect, and only a few other games offered additional functionality with the sensor.

Microsoft has shown little interest in bucking that trend with Xbox One’s Kinect, as only a handful of games have been announced with extensive support for it. So far, only fitness games and the abysmal “Fighter Within” make significant use of the device; “D4,” “Fantasia: Music Evolved” and “Kinect Sports Rivals” are the only upcoming announced games that can be controlled solely with Kinect.

On the Xbox One itself, Kinect is useful for basic interactions, but users still have to memorize set commands to properly control their consoles. It’s a neat added bonus, but probably not something most people would spend $100 on when a controller serves the same purpose while being both quicker and more precise. The SmartGlass app for Android, iOS, Windows and Windows Phone also tends to have more real-world uses than Kinect at the moment. The only navigation-related task Kinect can do that a controller or SmartGlass app can’t is use Xbox One’s integrated recording feature.

Why use Kinect to navigate the Xbox One interface when either a controller or the SmartGlass app is both easier and more precise?

In an interview with canada.com published Thursday, Larry Hryb, Xbox Live’s director of programming, promised the Xbox team would “continue moving forward and refining the technology” used in Kinect, presumably referring to Cortana. There’s no reason to doubt Hryb’s statements, but that doesn’t diminish the need for a better long-term strategy for Kinect.

It’s one thing to promise improvements, but it’s far more important that those improvements are actually used for the benefit of users. Put simply, there needs to be a compelling reason for users to want to talk to their consoles. After the first-generation sensor’s initial launch, owners had little reason to use it once they beat the few good games available. The resale value of the original Kinect has plummeted to reflect that: eBay auctions for used versions of the accessory typically top out at about $50, while GameStop offers a paltry $10.

Microsoft needs a better plan if it wants the new Kinect to meet a different fate than its predecessor. It’s great that the sensor serves as the backbone for Xbox One’s TV integration, but that feature alone doesn’t warrant a significant portion of the console’s price tag.

Images via Microsoft


44 Comments


Watching YouTube videos is a joke at the moment. "Xbox, Stop Listening!" is uttered every 30 seconds in a room that isn't absolutely silent.

When I first read the statement, "Recent reports indicate Microsoft is working on a digital assistant" it reminded me of Clippy.

I have used voice control since Nov 22. Works great. The only thing I have to think about is "Xbox Pause": to pause a movie with the first command, I wait until people in the movie stop talking (this is only when it is loud).

You shouldn't as long as you're using the HDMI pass-through and did the calibration. It filters out the HDMI audio. They did a demo on it and you could hear the voice clearly even with a really loud movie.

When you set up your Kinect (calibrate it) you need to have your TV volume turned up very loudly. Then you should be able to speak to Kinect instead of yelling at Kinect.

Doing the proper calibration with the TV turned up loudly makes a world of difference in the Kinect's performance.

It is still not 100% perfect. It seems to be more of a software issue than a Kinect issue. I hope MSFT can fix it. Seems almost like the X1 has a task going on and therefore isn't ready for the voice command.

I can say Xbox Watch TV. It won't react. In the same voice and volume I repeat it and then it works.

I've found that when it doesn't respond, it is because you said the command too quickly after "xbox." That's to prevent it from picking up those sounds out of other words and going to listen at unexpected times. That's not a bug but a design parameter with no exact solution. Everything else that uses voice requires you to push a button or something to make it listen.

After Sony undercut with PS4 and the X1 had to 180 degree on other plans, they should have also said, "hey maybe we will launch at $389 without Kinect" (you know, since it's unfinished and unnecessary and undesirable to many). I definitely wouldn't have cancelled my preorder if they'd done that.

Sorry kids, but the voice stuff works great. I rarely pick up my remote anymore. It does not need some goofy Apple style novelty act and certainly shouldn't do something annoying like talk back.

My Kinect for the Xbox One has been boxed up since Christmas. For some reason it won't pick up some of my commands and at random times it would pick up our voices while just sitting around talking. I returned it and got a new one just in case. Same issue with the Kinect. The Kinect on my Xbox 360 worked better.

Does the "Cortana" product require an internet connection like Siri? If so, latency could be a big issue, as well as connectivity for some. If it does require connectivity, I would hope that it has the option to disable it as well as retain a fallback mode based on the current design. I wouldn't want everything I say to be streamed out somewhere, and even if it only sends 30 seconds after it detects the "Xbox" command or whatever, it could still get that wrong, especially if a chair squeaking could set it off. A "live" mic in my living room feels too creepy.

On a side note, I haven't had a problem with PS4 voice commands and haven't heard much criticism, can anyone comment on their experiences or find any criticisms?

Edited by Geezy, Jan 6 2014, 10:56pm :

Geezy said,

On a side note, I haven't had a problem with PS4 voice commands and haven't heard much criticism, can anyone comment on their experiences or find any criticisms?

I've been using them as well and they have been fine for me. Of course I've also not had issues with Kinect not hearing me on my X1, so take that for what you will.

The PS4's vocabulary list is smaller than the X1, so there is less you can do with the voice commands. But what they do have has worked reliably for me.

Maybe it's an accent problem, because mine works really well. Other than "Xbox on" (which usually takes 3-4 tries) I'm probably seeing about a 95% success rate with the voice commands. Being able to launch a game, switch to TV, change channels, pause, play, and ask for a medic with voice is waaaaaaay better than reaching for a controller or waiting for Smartglass to load...

I was having no issues at all but lately my Xbox is refusing to understand "Xbox On" taking several tries to turn it on.

Nimdock said,
I was having no issues at all but lately my Xbox is refusing to understand "Xbox On" taking several tries to turn it on.

I have the opposite -- Xbox On was a chore when I first got it, but either I'm learning how to say it better, or the Xbox One is learning how I talk, not sure which... But it now turns on relatively easily (never more than 3-4 tries, when it used to take 10-20 sometimes and led us to saying things like, "Xbox, turn the F on already you piece of S!"). Once it's on, it understands commands pretty well. And just yesterday, I walked over to the dinner table, realized the NFL playoff game was on, so said, "Xbox, On," followed by "Xbox, watch TV" and "Xbox, volume up" so I could watch the game while eating.

Where have I heard this line before? Oh YEAH! It was when the original Kinect launched! "Just give us a few months." and "just wait, compelling games using Kinect ARE coming!". It's sad that every sucker who bought one now gets to beta test for MS for the next year or two. And I have yet to see ONE GOOD GAME announced for it, either.

It works for me.

I need a pick up, Need Ammo and Thank You are just some of the useful voice commands I use on Battlefield 4. I've not even turned head tracking on yet, but the voice commands make playing a joy. I can keep my hands on the sticks and triggers but get me some Ammo Luv it.

Edited by Andrew, Jan 6 2014, 8:03pm :

fugee said,
Bla bla. It works for me.

I need a pick up, Need Ammo and Thank You are just some of the useful voice commands I use on Battlefield 4. I've not even turned head tracking on yet, but the voice commands make playing a joy. I can keep my hands on the sticks and triggers but get me some Ammo Luv it.


I agree. And I absolutely LOVE the motion control in Dead Rising 3 -- a zombie starts grappling with you, and you shove them off by shoving your controller. MUCH better than "Oh no, a zombie has me, let me press X or B or A to get him off of me." It's very satisfying to throw them off, IMHO. I look forward to more things like that - very natural, but very cool at the same time.

John Nemesh said,
Where have I heard this line before? Oh YEAH! It was when the original Kinect launched! "Just give us a few months." and "just wait, compelling games using Kinect ARE coming!". It's sad that every sucker who bought one now gets to beta test for MS for the next year or two. And I have yet to see ONE GOOD GAME announced for it, either.

+1
Also, like I said, Kinect 2 will also be used as an expensive mic.

I actually really like the Kinect 2 and use it every time I use my Xbox One, but for voice commands only, not for in-game controls. It works well in game too when I want to see my party, so I just say "Xbox Snap Party" and when I am done just "Xbox Unsnap". It's much easier doing it that way, and launching some apps is quicker just by telling Kinect to open them.

It's unfortunate that the feature so many complained that they don't need isn't working as flawlessly as others expected it to.

The only problem with the Kinect is that it won't come on by saying Xbox on. I've exchanged the whole system and still it doesn't work. Once it's on, though, voice commands work excellently. The UI, while good, should support small and large tiles like on phones and tablets.

TBH I don't care for tiles so much on Xbox One like I do on my phone or desktop or tablet, as you can just tell the Xbox what to do. I never search or pin BF4, I just say:
"Xbox, go to Battlefield 4"

*boing* Battlefield 4 loads. Simple stuff, this Xbox One.

I have to agree that the Xbox One's voice recognition is a downgrade from the 360. If you have the commands memorized, you still have to wait for them to be presented on the screen before it will accept them. The 360 could handle several repetitive "Next song" commands in the Music app without hesitation. Now I have to pause at least 2 seconds between each command, or the Xbox One ignores it.

Then, the command structure is inconsistent when the Xbox is already listening: sometimes you can say "Pause," other times you must say "Xbox Pause." And I can't even count the number of times I've said "Xbox Volume Up" when the Xbox was already listening and it just ignored my command and stopped listening.

Sadly, as bad as voice is on the Xbox One, SmartGlass is even worse. First, they removed the remote access capability of the app. On the 360, you could use SmartGlass over a cellular connection, but with Xbox One I now have to wait for my phone to connect to Wi-Fi every time I unlock it in order for SmartGlass to work.

With the 360 app, you could shrink the app to the 1/3 Snap view on Windows 8 and use it like a remote control while doing something else. The new Xbox One app for Windows 8/RT simply shows the green background with the app logo when you collapse it. It's Microsoft's own first-party app and isn't even compiled for Windows 8.1! This is all in spite of the app being first made available after Windows 8.1 reached GA.

The lack of preparation Microsoft has put into the Xbox One is downright disgusting, and the sluggish pace at which they are fixing these issues on software that is clearly still incomplete is alarming. People paid $499 for this console over a month ago, and they still haven't delivered a finished product. Honestly, the TV functionality is cool. It works fairly well considering the sad state of cable/satellite set-top boxes today. But everything else, the parts where Microsoft actually controls the entire experience, is shockingly poor, unpolished, and incomplete.

LOL, I presume (and hope) you don't work in IT.

Feedback -> Diagnosis -> Patch/Re-write -> Test -> Approve -> Document -> Distribute.

You can't be serious about expecting anything in 1 month? Oh wait, you are the "I want it now" generation. Come on, think about it. And the experience isn't half as bad as you make out.
And who uses SmartGlass over a data connection? (And why?)
And just turn your phone lock off. I use SmartGlass on my Surface RT and it works well.

It's a v1.0 product, just like the PS4 is and just like the 360 was. Things will be buffed and polished in time. Does it play the games? Yes. Does it all work? Yes. Just some minor polishing to do. Disgusting is an overly emotive word.

The problem here will be latency. Google Now and Siri both offload the actual voice recognition problem to backend servers. Decoding is therefore more accurate (more processing power) but takes longer, which on a phone or desktop isn't a big issue - a two-second wait is acceptable when trying to find a restaurant, for example.

That same wait is completely unacceptable when trying to throw a grenade in a multiplayer game. Kinect needs to do voice recognition quickly and respond quickly; the more hops involved, the more accurate the recognition, but the less useful it is.

So yes, this may work for the dashboard, but I wouldn't pin any hopes on this resolving in-game issues.

Part of the "XBOX Bing" recognition failure perception is due to improper documentation and marketing.
Microsoft failed to explain that the grammar used for recognition is NOT a full narrative one but rather limited to the game, music and video store content.

If you search for anything within this realm, you get excellent recognition results. If you try anything else outside this scope, the results are abysmal.

And since consumers were not instructed about this behavior, you get this catastrophic perception that the Kinect has poor speech recognition performance.

Microsoft clearly failed to properly market and present its fantastic device.

Oh yeah?
You really think consumers understand that the Bing Search grammar is limited to the store content?
Comments about the Kinect speech failure clearly show that.

Translating to Kinect for Windows, I don't really see myself waving around to control Windows, but some actual voice control built in would be great, especially if it can be integrated into games (as would make sense given its twin Xbox unit). Voice control has been promised and tried for 2 decades. Having it really work would be nice. Pausing a video with a voice command... yum.

I love the voice commands on Xbox One, however there are issues if you don't know how to speak to it. And it does not always wake up on command. Typical early adopters basically being used as "beta testers". A new ballsy trend that is somewhat annoying considering us early adopters pay full price and earlier than the other guys, we should get much more respect.

Unfortunately "Gestures" on my box at least are embarrassingly bad. Way, WAY worse than the X360. The lag of waving your hand around the screen makes it so I won't even go there. Maybe there is something wrong with mine, but I am not interested in chasing that one down until Cortana arrives.

Considering MS has been in this voice game longer than anyone, I will look at "Cortana" as the possible culmination of all these years of experience. However I am setting my expectations at an all time low. Will we be just beta testers yet again? Buck the trend MS, blow us away, I dare you, we deserve it.

Agree. Gestures aren't great. BUT...

"Xbox On"
"Xbox Mute"
"Need Ammo" (Battlefield 4)
"Xbox Turn Off"
Are already very handy and convenient ways of navigating my Xbox, and those and a few more like "Xbox Go Home" all work every time.
The browser commands sometimes take a couple of goes, but the basics work like a charm.

Voice was the only feature that I found compelling on the Xbone, very glad I waited. Would really like to see a windows offering that works...such great potential here.

Article
Why use Kinect to navigate the Xbox One interface when either a controller or the SmartGlass app is both easier and more precise?

I'm sorry, that is the dumbest thing I have read today. It is 100 times easier to say "Xbox, go to Netflix" than it is to get my stupid phone or tablet out.

Yeah, because it works so well if you don't have the English accent... This product is utter crap and in the end the controller will prevail. A 24/7 internet-plugged camera in your living room: if that makes sense to you, you're a weird person.

boumboqc said,
Yeah, because it works so well if you don't have the English accent... This product is utter crap and in the end the controller will prevail. A 24/7 internet-plugged camera in your living room: if that makes sense to you, you're a weird person.

But you have no problem with a 24/7 internet plugged in camera and microphone built into your phone?

notchinese said,

I'm sorry, that is the dumbest thing I have read today. It is 100 times easier to say "Xbox, go to Netflix" than it is to get my stupid phone or tablet out.

That's true but the nice thing is you have a lot of options. I can use the voice, my IR remote, the game controller or smartglass. Usually I find voice works the best if I know what I want while the remote is better for channel surfing.

I'd love for that to happen, it'd be especially great to have an option where "Cortana" speaks back when it recognises a command. If it used her voice I'd have a nerdgasm.

Right now I only use voice commands because it's at least twice as fast to navigate around the UI, because it's so unpolished.

Yeah, 'Xbox go to Netflix' or 'Xbox watch TV' is mainly what I use it for for now... and I like it. However I find it hard to disagree with this article. Like you though McKay, an interactive Cortana would be awesome.

I think that opening apps or starting games is just as easy on the One as the 360, but I find using voice commands to be easier than both. Yes, there are a lot of unnecessary commands (that perhaps children or the easily-entertained like to use), but there are also a lot of truly useful ones that I often employ.

Agreed, of course, though, that having Jen Taylor voice the Cortana assistant is simply a must. Her work as Princess Peach, Cortana/Halsey, and Zoey in L4D is always a crowd-pleaser.