

Sierra Jiminez 2011-11-23


    Speech recognition is nothing new. Consumer electronics, cars and automated call centers have been "listening" to commands for years. Google has been transcribing voicemail messages since 2009, and Microsoft baked similar technology into Windows Vista three years before that. So what's the big deal about Apple's new virtual personal assistant named Siri?

    She gets you.

    In other words, Siri isn't just voice recognition technology, but voice comprehension -- and that's changing the way users interact with their mobile devices. Now, many predict Siri could provide a major boost to a perennially around-the-corner technology, much the way Apple's (AAPL) touch-based iPhone controls vaulted that technology into mainstream use. That could clear the way for a wide range of innovative applications. The voice recognition industry was worth some $2.7 billion this year, according to Opus Research. It is predicting a post-Siri boom in 2012.

    What makes Siri so different? Accuracy, according to Tim Bajarin, president of strategy firm Creative Strategies. "What Siri has really introduced is the next man-to-machine interface, and it's making a significant impact on the market of speech comprehension and accuracy," Bajarin says.

    Siri's not perfect, of course. The technology still has a hard time understanding some accents, and Apple has scrambled to fix early glitches. But for a piece of software, Siri still does pretty well. The key to that, according to Siri's original creator, the Menlo Park, California-based research lab SRI International, is natural language processing. Essentially, Siri takes speech signals, translates them directly into the text users see on their screens, and maps those words to one of its pre-programmed commands, such as "place a call" or "compose a text message."
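
    To make that last step concrete, here is a toy sketch of mapping already-transcribed text to a pre-programmed command. It is not Siri's or SRI International's actual method, which the article does not detail; the command names, trigger words and the match_command function are all hypothetical, and simple keyword matching stands in for real natural language processing.

```python
import re

# Hypothetical command table (an assumption for illustration only):
# each command name maps to words that trigger it.
COMMANDS = {
    "place_call": ("call", "dial", "phone"),
    "compose_text": ("text", "message", "sms"),
}


def match_command(transcript):
    """Return the first command with a trigger word in the transcript, or None."""
    # Lowercase the transcript and split it into words.
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    for command, triggers in COMMANDS.items():
        if words & set(triggers):
            return command
    return None


print(match_command("Call my mom"))
print(match_command("Send a message to Joe"))
print(match_command("What's the weather?"))
```

    A keyword table like this breaks down quickly on real speech, which is presumably where language understanding, rather than plain recognition, earns its keep.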

    That technology has potential outside of tablets and smartphones. Nuance (NUAN), the creator of Dragon speech recognition software, has been working in health care for a decade. Nuance's latest program runs on a physician's desktop, recording speech using a clip-on microphone. The program updates patients' electronic health records as appointments are going on. "One second the patient could be talking about the medical history of their mom, and then the next they're talking about their dad. And the application understands that," says Joe Petro, senior vice president of research and development at Nuance Communications' health care division.
