Forum for Science, Industry and Business

Sponsored by:     3M 
Search our Site:

 

Research Aims to Improve Speech Recognition Software

13.08.2010
Anyone who has used an automated airline reservation system has experienced the promise – and the frustration – inherent in today’s automatic speech recognition technology. When it works, the computer “understands” that you want to book a flight to Austin rather than Boston, for example. Research conducted by Binghamton University’s Stephen Zahorian aims to improve the accuracy of such programs.

Zahorian, a professor of electrical and computer engineering, recently received a grant of nearly half a million dollars from the Air Force Office of Scientific Research. The funds will support the two-year development of a multi-language, multi-speaker audio database that will be available for spoken-language processing research. Zahorian and his team plan to gather and annotate recordings of several hundred speakers each in English, Spanish and Mandarin Chinese.

“The challenge,” he said, “is to get speech recognition working better in real-life situations.”

That’s why the samples in the new database will come from publicly available sources such as YouTube.

Zahorian’s team will annotate each sample, creating a more detailed version of closed captioning, including time stamps and descriptions of background sounds. Once the human listener has finished with the transcription, automatic speech recognition algorithms will be used to align the recording with the captions. Next, software will be developed to verify and correct errors in the time alignment.

“Speech-recognition algorithms begin by mimicking what your ear does,” Zahorian said. “But we want the algorithms to extract just the most useful characteristics of the speech, not all of the possible data. That’s because more detail can actually hurt performance, past a certain point.”

The field of automatic speech recognition has a long history, dating back to projects at Bell Labs before the computer age. These days, much of the technology relies on algorithms that convert sounds into numbers.

In Zahorian’s research, he represents speech as a picture in a time-frequency plane. He then uses image-processing techniques to extract features of the speech, which has led him to focus more on time than on frequency.

When researchers are ready to test an algorithm, they rely on a common set of databases held by the Linguistic Data Consortium. Zahorian’s unusual image-based approach has given his team some of the best results ever reported for automatic speech recognition experiments using two of the consortium’s best-known databases.

The database Zahorian develops with the new funding will join these others, offering researchers around the world a new way to test their theories with samples of real-life speech.

Some mistakes are inevitable, given the variations in pitch, tone and pronunciation from person to person. Still, the field does have a clear standard, Zahorian said: “In order to be useful, a system should have a word-error rate of no more than 10 percent.”

Zahorian is interested in language modeling – if someone has said these three words, what’s the fourth word likely to be? – as well as conversation modeling – that is, predicting when the speakers will switch. He’s also intrigued by the potential to make advances by using established methods from other fields, including the neural networks developed by researchers working in artificial intelligence.

He sees a future in which automatic speech recognition will enable technology to extract the meaning of speech as well as the words.

“The dream,” Zahorian said, “is that someday travelers will be able to speak into a little gadget that will translate what they’ve said into another language instantly and accurately.”

For more Binghamton University research news, visit http://discovere.binghamton.edu/

Gail Glover | Newswise Science News
Further information:
http://www.binghamton.edu

More articles from Information Technology:

nachricht Magnetic Quantum Objects in a "Nano Egg-Box"
25.07.2017 | Universität Wien

nachricht 3-D scanning with water
24.07.2017 | Association for Computing Machinery

All articles from Information Technology >>>

The most recent press releases about innovation >>>

Die letzten 5 Focus-News des innovations-reports im Überblick:

Im Focus: Carbon Nanotubes Turn Electrical Current into Light-emitting Quasi-particles

Strong light-matter coupling in these semiconducting tubes may hold the key to electrically pumped lasers

Light-matter quasi-particles can be generated electrically in semiconducting carbon nanotubes. Material scientists and physicists from Heidelberg University...

Im Focus: Flexible proximity sensor creates smart surfaces

Fraunhofer IPA has developed a proximity sensor made from silicone and carbon nanotubes (CNT) which detects objects and determines their position. The materials and printing process used mean that the sensor is extremely flexible, economical and can be used for large surfaces. Industry and research partners can use and further develop this innovation straight away.

At first glance, the proximity sensor appears to be nothing special: a thin, elastic layer of silicone onto which black square surfaces are printed, but these...

Im Focus: 3-D scanning with water

3-D shape acquisition using water displacement as the shape sensor for the reconstruction of complex objects

A global team of computer scientists and engineers have developed an innovative technique that more completely reconstructs challenging 3D objects. An ancient...

Im Focus: Manipulating Electron Spins Without Loss of Information

Physicists have developed a new technique that uses electrical voltages to control the electron spin on a chip. The newly-developed method provides protection from spin decay, meaning that the contained information can be maintained and transmitted over comparatively large distances, as has been demonstrated by a team from the University of Basel’s Department of Physics and the Swiss Nanoscience Institute. The results have been published in Physical Review X.

For several years, researchers have been trying to use the spin of an electron to store and transmit information. The spin of each electron is always coupled...

Im Focus: The proton precisely weighted

What is the mass of a proton? Scientists from Germany and Japan successfully did an important step towards the most exact knowledge of this fundamental constant. By means of precision measurements on a single proton, they could improve the precision by a factor of three and also correct the existing value.

To determine the mass of a single proton still more accurate – a group of physicists led by Klaus Blaum and Sven Sturm of the Max Planck Institute for Nuclear...

All Focus news of the innovation-report >>>

Anzeige

Anzeige

Event News

Clash of Realities 2017: Registration now open. International Conference at TH Köln

26.07.2017 | Event News

Closing the Sustainability Circle: Protection of Food with Biobased Materials

21.07.2017 | Event News

»We are bringing Additive Manufacturing to SMEs«

19.07.2017 | Event News

 
Latest News

CCNY physicists master unexplored electron property

26.07.2017 | Physics and Astronomy

Molecular microscopy illuminates molecular motor motion

26.07.2017 | Life Sciences

Large-Mouthed Fish Was Top Predator After Mass Extinction

26.07.2017 | Earth Sciences

VideoLinks
B2B-VideoLinks
More VideoLinks >>>