Google boosts Voice Search speech recognition

Search giant throwing resources at voice research

Google is taking advantage of its cloud infrastructure and the huge volume of typed search queries to refine its Voice Search function, part of a massive research effort in voice that spans both mobile devices and the web. Voice Search, introduced about 18 months ago, lets mobile users search the web by speaking into their phones rather than typing in a query. It's available on the iPhone, BlackBerry, Nokia Series 60 devices and some Android phones.

Accuracy is a major factor for success, driving useful results that cause users to return to the service, said Michael Cohen, manager of speech technology at Google, in a speech at the Mobile Voice Conference. The company strives to make Voice Search a "frictionless" experience for the user, with correct results obtained easily. Making speech recognition more accurate has been a decades-long effort, and Google is applying its massive scale to the problem, Cohen said.

Voice Search is based on "language models," which are statistical models of what sequences of words are most likely to occur. For example, a good language model would know that it's more likely a speaker would say "the dog barked" than "the dog talked."
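To make the idea concrete, here is a minimal sketch of a bigram language model, which scores a sentence by how often each word follows the previous one in training text. It is an illustration only, not Google's method: the toy corpus, the function names and the add-one smoothing are all assumptions chosen for brevity.

```python
from collections import Counter

def train_bigram_counts(sentences):
    """Count how often each word follows each preceding word."""
    context_counts = Counter()  # times a word appears as the preceding word
    bigram_counts = Counter()   # times a (previous, current) pair appears
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()  # <s> marks sentence start
        for prev, cur in zip(words, words[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts

def sequence_probability(sentence, context_counts, bigram_counts, vocab_size):
    """Multiply smoothed bigram probabilities over the sentence."""
    words = ["<s>"] + sentence.lower().split()
    prob = 1.0
    for prev, cur in zip(words, words[1:]):
        # Add-one smoothing (an assumption) keeps unseen pairs above zero.
        prob *= (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + vocab_size)
    return prob

# Hypothetical toy corpus; a real model would train on billions of words.
TOY_CORPUS = [
    "the dog barked",
    "the dog barked loudly",
    "the man talked",
    "the dog ran",
]

context_counts, bigram_counts = train_bigram_counts(TOY_CORPUS)
vocab_size = len({w for s in TOY_CORPUS for w in s.lower().split()})

p_barked = sequence_probability("the dog barked", context_counts, bigram_counts, vocab_size)
p_talked = sequence_probability("the dog talked", context_counts, bigram_counts, vocab_size)
print(p_barked > p_talked)
```

In this toy corpus "barked" follows "dog" twice and "talked" never does, so the model rates "the dog barked" as the more likely sequence.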

Google is constantly "training" new language models for its speech recognition engine, Cohen said. In doing so, it taps into the search terms that users type into Google.com. From 230 billion words typed in search requests at Google.com, researchers have compiled the 1 million most frequently used unique words to form a vocabulary with which to train the voice system. Both numbers are arbitrary, and 230 billion does not represent the total number of words entered at Google in any given period, Cohen said. AskOxford.com, from the publisher of the Oxford English Dictionary, estimates that there are at least 250,000 words in the English language. Cohen said the 1 million unique words include plurals and other versions of words.
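The vocabulary step described above can be sketched as a simple frequency count that keeps only the most common words. This is an assumed, simplified illustration: the sample queries and the `build_vocabulary` helper are hypothetical, and real query processing would involve far more normalisation.

```python
from collections import Counter

def build_vocabulary(queries, max_size):
    """Return the max_size most frequent words across all queries."""
    counts = Counter()
    for query in queries:
        counts.update(query.lower().split())
    # most_common sorts by descending frequency.
    return [word for word, _ in counts.most_common(max_size)]

# Hypothetical typed queries standing in for the real search logs.
SAMPLE_QUERIES = [
    "weather london",
    "weather paris",
    "london hotels",
    "weather",
]

vocabulary = build_vocabulary(SAMPLE_QUERIES, max_size=2)
print(vocabulary)
```

Because the count works on surface forms, "hotel" and "hotels" would be separate entries, which is consistent with Cohen's point that the 1 million unique words include plurals and other versions of words.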

It takes 70 "CPU years" to process those 230 billion words from Google.com and train a new language model, Cohen said. A CPU year is the amount of work one CPU can perform in a year. Google trains new language models constantly as part of its research.

"There are huge computational demands as we're taking on lots and lots of data (and) bigger and bigger models," Cohen said. "Luckily, we have a lot of compute power we can apply to that. And there are demands on infrastructure, and luckily, Google has a very well designed software infrastructure, so we can do things like quickly parallelise something," running it on thousands of computers at the same time, he said.
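As a rough back-of-the-envelope sketch, the 70 CPU-year figure can be converted into elapsed time for a given number of machines. The machine counts below are assumptions (the article says only "thousands"), and the calculation assumes perfect parallel scaling with one CPU per computer, which real jobs rarely achieve.

```python
DAYS_PER_YEAR = 365

def wall_clock_days(cpu_years, num_cpus):
    """Elapsed days if the work splits evenly across num_cpus (idealised)."""
    return cpu_years * DAYS_PER_YEAR / num_cpus

# 70 CPU years is the figure Cohen quoted; the CPU counts are hypothetical.
for num_cpus in (1000, 5000):
    days = wall_clock_days(70, num_cpus)
    print(f"{num_cpus} CPUs: about {days:.1f} days")
```

Under these assumptions, 1,000 CPUs would need roughly 25.6 days and 5,000 CPUs about 5.1 days to train one model.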

A cloud infrastructure offers other advantages in speech recognition, he said. For one thing, Google can rapidly test and refine its speech recognition software, sending out new versions while consumers are using it in the field. In addition, as consumers use Voice Search, Google learns from real-world experience.

In addition to making speech recognition easier to use, Google wants to make it ubiquitously available. A big step in that direction was a feature included in the Nexus One handset that gives the user the option of speaking instead of typing every time the keyboard pops up on the phone's screen, Cohen said.

Speech recognition is also a big part of Google Voice, powering its voicemail transcription feature. But Google's interest in voice goes beyond mobile phones, Cohen said. Voice is the biggest group in Google Research, and findings in this area can be useful in many areas, he said. The company wants to be able to understand and deliver spoken content on the Web as well as the written information it finds now through its search engine. One recent move was the addition of a closed caption option for YouTube videos. Using that capability, Google is also beginning to offer foreign-language subtitles through text-to-text translation of those captions.

Cohen was a co-founder of Nuance Communications and has been working on speech recognition for 25 years. In that time, "It's come a long way, but it has a long way to go," he said.

Microsoft is also developing voice search capabilities for its Bing search engine.



Comments

vjaivox said: This search, as well as Google recognition in general, works reasonably well, but I did some tests in 10 languages. The results are still such that recognized strings are misleading. Details are in an article at httpwwwjaivoxcomgooglela


