Here is a video of Siri and TellMe in use side-by-side


Why should I have to adjust everything I know in order to use a tool? Surely a tool and its subset of features should be designed around the way we speak?

Again, if you'd read the thread you would see that this has already been discussed. "Text ccuk", "Call ccuk" and "Find ccuk" are all perfectly natural things to say, so you're hardly being forced to compromise. Anyway, whether you think natural language is important is irrelevant: TellMe doesn't offer it, and he shouldn't try to use the software incorrectly. It's the equivalent of complaining that you can't view Twitter feeds in the iPhone's contacts application. The iPhone doesn't offer this feature, so there's no point claiming that it's broken.

TellMe performs at least some of its voice recognition on the phone. I've just turned off all data connections on my phone and TellMe can launch applications, call and text contacts. The only thing it can't do is search using Bing.

That's more than can be said for crappy Vlingo on Android, at least :p Driving along the motorway with a skittish data connection renders it nearly useless.

I know you've already conceded that the recognition isn't as good, but that is exactly the reason for the failure in this video; it's not because he used invalid commands.

I think it is based on my experience of TellMe. Recognition is better when I use the correct commands or use natural language in the right place. If I say "Text Jake" it will generally text me. However, if I say "Send a text to Jake" it fails every time.

See, you keep dodging around points. Regardless of what you are saying, the system couldn't understand the words he was speaking, context aside. I personally don't think it's as good as other offerings.

My WP7 works much better than that. Holy ****, I mean it's no Siri, but that guy was just terrible.

I wonder how long MS spent on research and development for accents which aren't American.

Also, can TellMe work with foreign languages?

You're probably right, but that seems like a crazy way of doing things. It shouldn't assume the correct command will be used every time, as it's being spoken to by humans, not machines. It should decipher the sentence and then decide if a correct command is used, not decide there's no valid command and crap out.

Yes, TellMe works with foreign accents. But TellMe is NOT directly comparable to Siri. Siri is a complete solution that listens to speech with context and performs an action. TellMe itself is just voice recognition, with no context. Context is provided by the application developer using the TellMe service. It'd be perfectly easy to create something similar to Siri using TellMe. For example, people praise Kinect's voice recognition, but that's powered by the same TellMe service, just with more context provided by the actual software on the Xbox.

And of course, what makes speech recognition work properly is context. Non-contextual speech recognition is never mind-blowing, and context is what lets Siri distinguish between "10 am" and "teen anal". Siri is expecting a time to fit in with the rest of the sentence, so it assumes a time. Note that voice recognition doesn't usually return exact fixed words; it tends to return words accompanied by likelihood probabilities, i.e. it's 90% likely to be this, but maybe 70% likely to be that. With proper context, Siri does better because it can look at all those choices, then look at the types of commands it accepts, and choose appropriately. If Windows Phone was programmed to let you make appointments over speech, it would work fine. But it's not, so it doesn't.

In the end, Siri is programmed to recognise a lot more; Microsoft's Windows Phone speech is only programmed to recognise a tiny subset of commands. That's nothing to do with problems on TellMe's end, it's just the way it's been programmed on Windows Phone. Hell, even if one company knows another company's implementation is better, they're not going to be stupid enough to tell people their competition is better, are they @___@ Microsoft are hardly going to toot Apple's horn ever, and neither are Google :p
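
To make the "list of candidates plus context" idea concrete, here is a minimal Python sketch. It is not how Siri or TellMe actually work internally; the candidate list, the likelihood numbers, the `context_score` boost and the time pattern are all made up for illustration. It only shows how the same recogniser output can give a different final answer once the app says "I'm expecting a time here".

```python
import re

# Hypothetical n-best output from a recogniser: (transcript, likelihood).
# The transcripts and numbers are invented for this example.
N_BEST = [
    ("book a meeting at tenant", 0.90),
    ("book a meeting at 10 am", 0.70),
]

# Very rough pattern for a spoken time such as "10 am" or "9:30 pm".
TIME_PATTERN = re.compile(r"\b\d{1,2}(:\d{2})? ?(am|pm)\b")


def context_score(transcript, expects_time):
    """Boost candidates that contain what the current command expects."""
    if expects_time and TIME_PATTERN.search(transcript):
        return 2.0  # arbitrary boost factor, an assumption
    return 1.0


def pick_best(n_best, expects_time):
    """Return the transcript with the highest likelihood * context score."""
    best = max(n_best, key=lambda c: c[1] * context_score(c[0], expects_time))
    return best[0]


print(pick_best(N_BEST, expects_time=False))  # no context: raw likelihood wins
print(pick_best(N_BEST, expects_time=True))   # context: the time candidate wins
```

With no context the higher raw likelihood wins even though it is nonsense; with context the lower-likelihood candidate that actually contains a time is chosen.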

A crazy way of doing things? Why? You're just telling TellMe to do something (command) and not having a conversation (natural human speech) with it. In my opinion, "text John" is a lot clearer, more precise, and shorter than "Can you please send a text to John?" Don't get me wrong, Siri's speech abilities are good and, based on this video, its natural speech recognition is great, but TellMe is not as bad as the video makes it seem; it just needs to be updated.

I'm not trying to dodge around anything. You are just making points without having bothered to read the rest of the thread which is leading to confusion. Read what I've said before trying to pick my posts apart.

You need to read the comment I replied to to see the context. Jakem1 was suggesting the entire sentence was misread because no correct command was used. He said the voice recognition is more accurate when correct commands are used.

I said that is a crazy way of doing things, because the way the software recognises speech shouldn't change based on whether the correct command is used or not. It's expecting human input so occasionally incorrect commands will be used and the software should account for that.

I'm not saying that commands are a bad idea, just that the voice recognition should be equally accurate whether the correct command is used or not.

Siri is still command based - it just features a lot more commands.

Unfortunately, most current voice recognition systems ARE based on context. They don't tend to return exact matches of words - they return a list of words and probabilities, and then the software forms a sentence using the probabilities provided by the speech recognition service, plus the context and sentence structures the software supports. Speech recognition isn't advanced enough to perfectly understand speech with no context. Siri is still ultimately command based; it's just more likely to get it right because it's implicitly programmed to recognise more variations of the same thing.
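
As a rough illustration of "more variations of the same thing", here is a small Python sketch where several phrasings map to one command. The patterns and the `parse` helper are hypothetical and say nothing about how Siri is really built; the point is only that "natural language" can still be a command grammar, just a bigger one.

```python
import re

# Hypothetical phrasings per command; a real assistant would ship far more.
INTENT_PATTERNS = {
    "text": [
        r"^text (?P<name>.+)$",
        r"^send (?:a )?(?:text|message) to (?P<name>.+)$",
        r"^can you (?:please )?send a text to (?P<name>.+)$",
    ],
    "call": [
        r"^call (?P<name>.+)$",
        r"^(?:please )?phone (?P<name>.+)$",
    ],
}


def parse(utterance):
    """Return (intent, contact name) or None if no phrasing matches."""
    cleaned = utterance.lower().strip().rstrip("?.!")
    for intent, patterns in INTENT_PATTERNS.items():
        for pattern in patterns:
            match = re.match(pattern, cleaned)
            if match:
                return intent, match.group("name")
    return None


print(parse("Text Jake"))
print(parse("Send a text to Jake"))
print(parse("Can you please send a text to John?"))
print(parse("What is the weather like"))
```

Delete the second and third "text" patterns and you get roughly the behaviour people are complaining about here: "Text Jake" works, "Send a text to Jake" doesn't.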

Both Siri and TellMe can be used for transcribing. Siri is system-wide; TellMe can be used inside the Text Messaging app. How accurate are both systems when doing pure transcription? I know from using Siri it's about 98% correct. How is TellMe, though? If it's anything like what happens when you give it commands, I'm not impressed.

It works perfectly well for me. The only things it messes up tend to be odd names, but apart from that I've had no problems with it.

For what it's worth, Siri's use of Nuance is directly comparable to Windows Phone's use of TellMe for speech recognition. So for pure transcription it's really Nuance speech recognition vs TellMe speech recognition, rather than Siri vs Windows Phone :p

I agree with you, as I did read the comment and understand why you believe the software should account for any speech inconsistencies. You're right, speech software should understand both, but to be honest I prefer the command way of telling my speech recognition software what to do. However, I love the fact that companies are pushing the software in new directions, as it benefits us, the consumers.

If that was the case then transcription would suck. It's clear to me that Siri is just better... a lot better.

To text with TellMe you say "Text" and then the name. It's so dumb, because this guy did not say the commands TellMe needs. He would say something like "send this person a text" instead of just saying "text" and then the name.

Also, how it works is that whatever you say without the commands is just searched on Bing. I like this because I don't need to be like, "Siri, where is the closest Indian restaurant?" I can be like "Indian Restaurant" and Bing Local shows me.
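
The behaviour described above (a fixed command word first, everything else goes to search) can be sketched in a few lines of Python. This is only a guess at the logic based on what is described in this thread, not TellMe's real code; the `dispatch` function and the exact command list are assumptions.

```python
# Command words mentioned in this thread; the real list is an assumption.
COMMANDS = ("text", "call", "find")


def dispatch(utterance):
    """Return (action, argument): a known command, or a search fallback."""
    words = utterance.strip().split(maxsplit=1)
    if not words:
        return ("noop", "")
    first = words[0].lower()
    rest = words[1] if len(words) > 1 else ""
    if first in COMMANDS and rest:
        return (first, rest)
    # Anything that does not start with a known command is searched as-is.
    return ("search", utterance.strip())


print(dispatch("Text Jake"))
print(dispatch("Indian Restaurant"))
print(dispatch("Send a text to Jake"))
```

Under this scheme "Send a text to Jake" starts with "send", which is not a command word, so the whole sentence falls through to a search, which matches what happens in the video.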

From that video? That video evidently proves how context makes recognition better. Find a video comparing direct voice transcription on a text message where context doesn't exist for a better idea of how they compare on that front.

No it doesn't. Not everything is context based. What about Bing/Google searches? What's the context there, words?? :laugh:

If contextual expectations made that big a difference to any voice recognition software then transcription would suck, but it clearly doesn't.

Siri is incredibly accurate at sending text messages, emails etc where the message can be anything.

Your explanation doesn't wash.

Did you not see the point where I said "in this video"? Windows Phone and Android also do great transcriptions of text messages, and side by side with an iPhone give largely similar results.

The Windows Phone tried to search Bing for something completely different to what was said, and there's not really any contextual guesswork involved with Bing searches. Logic dictates transcription would offer the same poor recognition for the man in the video.

I think you may be missing Johnny's point but dictation is more accurate in TellMe than the stuff you've seen in this thread. Check out this video from 56 seconds on for an example of dictation in a text message:

Also, dictation into the Bing app for searching is mostly accurate although it does fall over on some names.

EDIT: It just occurred to me that (as I guess you'd expect) the dictation gets passed to a server for transcription, so that's one more thing that can't be done without a data connection. It might also explain why it's more accurate than some of the command-based stuff.

Unfair comparison... the person in that video is American. It tends to be other accents these types of software fail on. Show me an Australian and we'll talk :p

:laugh:

It's not Australian but this one's a little closer to home for us and it shows off search and dictation in action:

http://www.youtube.com/watch?v=0vW8vE10Snk

Or, ergo, logic dictates that the only reason Siri COULD properly understand it is because there was added context. Certainly for the very first one, you can see where context comes in handy. The second is actually phonetically very similar, and context would easily sway a speech recognition system in favour of "Send a text to <contact>" if that's programmed into its grammar. Same with the third. This video proves nothing, apart from the fact that Siri has a great context engine. (And it is in fact a better system than what Windows Phone has, but I'm only defending TellMe, not what Microsoft have done with it in Windows Phone.)

The actual underlying accuracy of the speech -> text can only be shown in a situation where no context is used, i.e. raw transcription, which that original video didn't show. And unfortunately I have no Australians in my house to make a video comparing how they work for Australians :p
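
If anyone does want to compare raw transcription properly, the usual yardstick is word error rate: the number of word substitutions, insertions and deletions needed to turn the recognised text into what was actually said, divided by the number of words actually said. A minimal Python sketch follows; the example sentences are made up, and nobody in this thread has measured either phone this way.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Classic dynamic-programming edit distance, computed row by row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: two of the five words were misheard.
print(word_error_rate("send a text to jake", "send a test to jack"))
```

Read the same set of sentences into both phones' dictation boxes, score each transcript against what was really said, and you get a comparison that doesn't depend on which commands each assistant happens to support.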

http://m.youtube.com/index?desktop_uri=%2F&gl=GB#/watch?v=E91Qu1nVQtE

I can't find a dictation video but this shows the conversation features...very accurate :p

This topic is now closed to further replies.