November 23, 2014

Android 2.1 Voice Input Feature is "Experimental", Requires Data Connection

We were digging through the user guide for the Nexus One and came across something that we hadn’t seen reported elsewhere. You know that nifty little feature that comes with Android 2.1 where you can speak instead of typing? Pretty cool, right? Did you know that in order to use it, you need a data connection available? Neither did we. Now, to be fair, this probably doesn’t affect too many people, as 3G and EDGE networks practically blanket the country. There are, however, going to be coverage holes where this feature won’t work. Google does bill the feature as “experimental”, so we can’t really complain. We’re surprised it doesn’t say “beta” anywhere near the feature. All kidding aside, a little more upfrontness would have been nice.

You might ask what the benefit is of having it work this way. Well, since the processing is handed off to the cloud, the algorithms and special formulas being used can get constant updates without pushing out incremental releases of Android. It’s probably tied to the same system that helps transcribe your voicemails in Google Voice and helps you with GOOG-411 calls. Kinda smart if you ask us.
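
To make that trade-off concrete, here is a rough sketch of the idea in Python. Everything in it is made up for illustration: the class names, the toy lookup-table "model", and the clip labels are our own assumptions, not Google's code or API. It only shows the shape of the design, where the phone ships audio off and gets text back.

```python
class CloudRecognizer:
    """Hypothetical stand-in for a server-side speech service."""

    def __init__(self):
        self.model_version = 1
        # Toy "model": maps a recorded clip's label to its text.
        self.vocabulary = {"clip-hello": "hello"}

    def update_model(self, new_entries):
        # Improvement happens on the server; the phone is not touched.
        self.vocabulary.update(new_entries)
        self.model_version += 1

    def transcribe(self, audio):
        # Unknown clips come back as an empty string.
        return self.vocabulary.get(audio, "")


class PhoneClient:
    """Hypothetical thin client: records audio, asks the server for text."""

    def __init__(self, server):
        self.server = server
        self.has_data_connection = True

    def voice_input(self, audio):
        # No connection means no recognition at all.
        if not self.has_data_connection:
            raise ConnectionError("voice input needs a data connection")
        return self.server.transcribe(audio)
```

In this sketch, calling `update_model` on the server changes what `voice_input` returns on the phone without any new Android release, and flipping `has_data_connection` to `False` makes the feature fail outright, which is the catch we ran into in the user guide.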

Anyone else have a chance to look through the guide?  Find anything else cool?  Hit us up in the comments.

  • http://intensedebate.com/people/t3mujin João Almeida

    Here in Portugal, with limited data plans, some kind of offline option would be nice.

  • Sully

    The user guide (to me) was unclear about adding POP3 and Hotmail accounts? Has anyone found any problems?

  • Ryan

    It was clear from the videos that the data was sent to Google, which returned the text. That's why it doesn't need training, and it will probably enable Google to finally perfect text-to-voice.

  • Bryan

    It's mainly because of two things:

    1) Mobile processors have a hard time doing speech synthesis.
    2) Google's speech software is highly regarded, and they don't want anyone hacking it on their phones, lest trade secrets get out.

  • PhineasJW

    Of course(!) it's processed in the cloud (on Google's bank of supercomputers).

    Do you guys have any idea how large the application would need to be to process speech? Not to mention having it run on a cell phone CPU…

    • http://www.cyberwizzard.nl Cyberwizzard

      Bollocks: ever heard of Viterbi? You can run that fine on a mobile phone (as phones have been doing for years) – especially on those high-speed CPUs that are put in Android phones…

  • bry

    All of the reports that I have seen stated that the voice-to-text transcription is handled on the back end by Google, the same as voice transcription for Google Voice messages.

    • http://tech.desiblogs.net A S

      Commenter bry is correct. That is what I've always been reading too – voice to text is processed in the back end for the Nexus One. This actually has interesting implications for VoIP: the voice input travels really fast to Google's servers and then gets served back to the phone as text, equally fast. This implies that the necessary bandwidth and back-end capacity are in place on both the phone side and Google's side to support true VoIP over a mobile data connection. It is only a matter of Google turning it on for GTalk + GVoice.

  • Rich

    The amount of data that would need to be stored onto the phone for this to work offline would be far too great, and processing it locally would be incredibly slow. Seems obvious they'd have done it this way to me, especially as it fits with Google's increasing push to move services to the cloud.

  • Joao Martins

    Joao

    Living in Portugal is a drag, isn't it, but it's the best country in the world, and maybe one of these days it will catch up with the rest of the world in technology.

    Joao Martins

  • http://intensedebate.com/people/Davest010 Davest010

    If this is going to use the same system that the GV transcriptions use, then it's going to be worthless. I've never gotten a transcription that was anywhere near legible.

    • http://intensedebate.com/people/AndroidNewb AndroidNewb

      Well, I think there are a couple of factors to consider. First, the technology is always improving, so it should get better. Also, when someone is leaving a VM they are assuming that you are going to listen to it, not that it's going to be transcribed. Whereas if you are dictating an email, text, tweet, whatever, I would assume that one would expect the need to enunciate clearly and speak a bit slower. I would expect this even when dictating to a person.

  • jeremy

    This is not a new feature… voice recognition has been in Android for months and has always worked this way. This shouldn't come as a surprise to anyone who's actually paying attention.

  • bobalot

    I thought it was common knowledge that Google's voice recognition was done in the cloud. You can't use voice dialer without being connected, so why would this be any different? I don't see how it matters anyway; most people that get smartphones do so on a contract, which almost always includes a data plan. Besides, Google's voice recognition has matured massively since it was launched, and is much better than offline systems that have to "learn" your accent.

  • David Z

    I second bobalot's observations. I have the Droid, and am amazed at how accurate the VR function is on the maps & search. It's quite obvious that the processing is in the cloud, because it sometimes fails if the connection isn't up to par. Does anyone know if the Droid's OS can/will be upgraded to handle VR for the mail as well?

  • Sokar

    This was stated at the press release. Please quit searching for some reason to flame this phone.

  • Brian

    I never thought different. Why do you think it takes so long to get the results when you say more than one line? Why does it "ping" the data connection while it's processing? Please tell me this story is a joke.

  • DevDroid

    I distinctly remember hearing Google say during the press conference that all speech processing happened on the server side, not on the client. As a few others have pointed out, mobile processors are slow when it comes to handling speech synthesis. Taxing the processor may also compromise battery life, so server-side processing was the best option. I thought it was very clear that it was all done server side and thus over the network? This was the best decision Google made in integrating voice processing IMO. Otherwise great post and keep up the good work!

  • Henrik

    That sucks when you go roaming since roaming charges are ridiculous…

  • Ryan Patterson

    “We ... came across something that we hadn’t seen reported elsewhere.”

    Sorry, wrong. This was made clear during the original Google press conference. The processing of the voice capture is done by a remote backend over the network.

  • Tim

    Does anybody know if the voice input works in browser form text input and textarea tags? They say it works everywhere, but I've learned to get independent verification before buying.