Research Aims to Improve Speech Recognition Software

13.08.2010
Anyone who has used an automated airline reservation system has experienced the promise – and the frustration – inherent in today’s automatic speech recognition technology. When it works, the computer “understands” that you want to book a flight to Austin rather than Boston, for example. Research conducted by Binghamton University’s Stephen Zahorian aims to improve the accuracy of such programs.

Zahorian, a professor of electrical and computer engineering, recently received a grant of nearly half a million dollars from the Air Force Office of Scientific Research. The funds will support the two-year development of a multi-language, multi-speaker audio database that will be available for spoken-language processing research. Zahorian and his team plan to gather and annotate recordings of several hundred speakers each in English, Spanish and Mandarin Chinese.

“The challenge,” he said, “is to get speech recognition working better in real-life situations.”

That’s why the samples in the new database will come from publicly available sources such as YouTube.

Zahorian’s team will annotate each sample, creating a more detailed version of closed captioning, including time stamps and descriptions of background sounds. Once the human listener has finished with the transcription, automatic speech recognition algorithms will be used to align the recording with the captions. Next, software will be developed to verify and correct errors in the time alignment.
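As a rough illustration of what checking a time alignment might involve, the sketch below flags caption segments whose time stamps are inconsistent. It is not the team's software: the `Segment` record, its field names and the `find_alignment_problems` function are hypothetical, and the check assumes a single stream of captions in which segments should not overlap.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One annotated caption (hypothetical layout, not the project's format)."""
    start: float          # start time in seconds
    end: float            # end time in seconds
    text: str             # transcribed words
    background: str = ""  # description of background sounds, if any


def find_alignment_problems(segments):
    """Return (index, reason) pairs for segments with suspicious time stamps."""
    problems = []
    previous_end = 0.0
    for index, seg in enumerate(segments):
        if seg.end <= seg.start:
            problems.append((index, "end is not after start"))
        elif seg.start < previous_end:
            problems.append((index, "overlaps the previous segment"))
        # Track the latest end time seen so far.
        previous_end = max(previous_end, seg.end)
    return problems


captions = [
    Segment(0.0, 1.5, "I want to book a flight", background="traffic"),
    Segment(1.2, 2.4, "to Austin"),  # starts before the previous one ends
    Segment(3.0, 2.5, "please"),     # ends before it starts
]
for index, reason in find_alignment_problems(captions):
    print(index, reason)
```

A real tool would also have to correct the flagged segments, which this sketch does not attempt.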

“Speech-recognition algorithms begin by mimicking what your ear does,” Zahorian said. “But we want the algorithms to extract just the most useful characteristics of the speech, not all of the possible data. That’s because more detail can actually hurt performance, past a certain point.”

The field of automatic speech recognition has a long history, dating back to projects at Bell Labs before the computer age. These days, much of the technology relies on algorithms that convert sounds into numbers.

In his research, Zahorian represents speech as a picture in a time-frequency plane. He then uses image-processing techniques to extract features of the speech, an approach that has led him to focus more on time than on frequency.
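To make the idea of a time-frequency picture concrete, here is a minimal sketch of a spectrogram, the usual way of laying out sound with time along one axis and frequency along the other. It illustrates only the representation, not Zahorian's feature extraction. The frame length, hop size, window choice and the function name `spectrogram` are assumptions made for the example.

```python
import cmath
import math


def spectrogram(samples, frame_len=64, hop=32):
    """Return one list of magnitudes (bins 0..frame_len//2) per time frame."""
    # Periodic Hann window to reduce leakage between frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        magnitudes = []
        for k in range(frame_len // 2 + 1):
            # Naive discrete Fourier transform of the windowed frame.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# A 1,000 Hz test tone sampled at 8,000 Hz stands in for recorded speech.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(256)]
picture = spectrogram(tone)
# Each row is one moment in time; each column is one frequency band.
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in picture]
print(peak_bins)
```

Each bin covers sample_rate / frame_len = 125 Hz, so the tone's energy lands in bin 8 of every frame. Real speech would produce a pattern that changes from frame to frame, and that changing pattern is the "picture" to which image-processing methods can be applied.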

When researchers are ready to test an algorithm, they rely on a common set of databases held by the Linguistic Data Consortium. Zahorian’s unusual image-based approach has given his team some of the best results ever reported for automatic speech recognition experiments using two of the consortium’s best-known databases.

The database Zahorian develops with the new funding will join these others, offering researchers around the world a new way to test their theories with samples of real-life speech.

Some mistakes are inevitable, given the variations in pitch, tone and pronunciation from person to person. Still, the field does have a clear standard, Zahorian said: “In order to be useful, a system should have a word-error rate of no more than 10 percent.”
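Word-error rate is conventionally defined as the number of word substitutions, deletions and insertions needed to turn the system's output into the correct transcript, divided by the number of words in that transcript. The sketch below computes it with a standard edit-distance calculation. The function name and the lowercase, whitespace-based word splitting are assumptions for the example, and the article does not say how Zahorian's team scores its systems.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of five: Boston heard instead of Austin.
print(word_error_rate("book a flight to Austin",
                      "book a flight to Boston"))  # 0.2
```

In this example a single misheard city name yields a 20 percent error rate, twice the 10 percent ceiling Zahorian cites.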

Zahorian is interested in language modeling – if someone has said these three words, what’s the fourth word likely to be? – as well as conversation modeling – that is, predicting when the speakers will switch. He’s also intrigued by the potential to make advances by using established methods from other fields, including the neural networks developed by researchers working in artificial intelligence.
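The "three words, then the fourth" question can be illustrated with a simple counting model: tally which word follows each three-word sequence in some training text, then predict the most frequent follower. The sketch below is a toy under stated assumptions. The function names and the three-sentence corpus are invented for the example, and practical systems train on far larger collections and smooth their estimates.

```python
from collections import Counter, defaultdict


def train_four_gram(sentences):
    """Count, for every three-word context, the words that follow it."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - 3):
            context = tuple(words[i:i + 3])
            counts[context][words[i + 3]] += 1
    return counts


def predict_next(counts, w1, w2, w3):
    """Return (word, relative frequency) for the likeliest fourth word,
    or None if the three-word context was never seen."""
    followers = counts.get((w1.lower(), w2.lower(), w3.lower()))
    if not followers:
        return None
    word, count = followers.most_common(1)[0]
    return word, count / sum(followers.values())


# Invented toy corpus, for illustration only.
corpus = [
    "book a flight to Austin",
    "book a flight to Austin",
    "book a flight to Boston",
]
model = train_four_gram(corpus)
print(predict_next(model, "a", "flight", "to"))
```

With this corpus the model favors "austin" after "a flight to" because it followed that context in two of the three sentences. A recognizer can use such an estimate to choose between similar-sounding candidates.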

He sees a future in which automatic speech recognition will enable technology to extract the meaning of speech as well as the words.

“The dream,” Zahorian said, “is that someday travelers will be able to speak into a little gadget that will translate what they’ve said into another language instantly and accurately.”

For more Binghamton University research news, visit http://discovere.binghamton.edu/

Gail Glover | Newswise Science News
Further information:
http://www.binghamton.edu
