Forum for Science, Industry and Business


Research Aims to Improve Speech Recognition Software

Anyone who has used an automated airline reservation system has experienced the promise – and the frustration – inherent in today’s automatic speech recognition technology. When it works, the computer “understands” that you want to book a flight to Austin rather than Boston, for example. Research conducted by Binghamton University’s Stephen Zahorian aims to improve the accuracy of such programs.

Zahorian, a professor of electrical and computer engineering, recently received a grant of nearly half a million dollars from the Air Force Office of Scientific Research. The funds will support the two-year development of a multi-language, multi-speaker audio database that will be available for spoken-language processing research. Zahorian and his team plan to gather and annotate recordings of several hundred speakers each in English, Spanish and Mandarin Chinese.

“The challenge,” he said, “is to get speech recognition working better in real-life situations.”

That’s why the samples in the new database will come from publicly available sources such as YouTube.

Zahorian’s team will annotate each sample, creating a more detailed version of closed captioning, including time stamps and descriptions of background sounds. Once the human listener has finished with the transcription, automatic speech recognition algorithms will be used to align the recording with the captions. Next, software will be developed to verify and correct errors in the time alignment.

“Speech-recognition algorithms begin by mimicking what your ear does,” Zahorian said. “But we want the algorithms to extract just the most useful characteristics of the speech, not all of the possible data. That’s because more detail can actually hurt performance, past a certain point.”

The field of automatic speech recognition has a long history, dating back to projects at Bell Labs before the computer age. These days, much of the technology relies on algorithms that convert sounds into numbers.

In his research, Zahorian represents speech as a picture in a time-frequency plane. He then applies image-processing techniques to that picture to extract features of the speech, an approach that has led him to focus more on time than on frequency.
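
To make the idea of a time-frequency "picture" concrete, here is a minimal sketch of a spectrogram computed with a short-time Fourier transform. The article does not say how Zahorian's team builds its picture, so the transform, the function name `spectrogram`, the Hann window, and the frame sizes are all assumptions chosen for illustration.

```python
import cmath
import math


def spectrogram(samples, frame_len=64, hop=32):
    """Return a list of frames; each frame is a list of magnitudes per frequency bin.

    Rows run along time and columns along frequency, so the result can be
    treated as a grayscale image. Illustrative only, not the method used
    in the research described above.
    """
    # Hann window tapers each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        magnitudes = []
        # Plain discrete Fourier transform over the non-negative frequencies.
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Hypothetical test signal: a 1000 Hz tone sampled at 8000 Hz.
# With 64-sample frames, each frequency bin is 8000 / 64 = 125 Hz wide.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(256)]
spec = spectrogram(tone)
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in spec]
print(len(spec), "frames;", "strongest bin in each frame:", peak_bins)
```

The direct transform used here is slow and is written out only to keep the sketch dependency-free; a practical system would use a fast Fourier transform from a numerical library.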

When researchers are ready to test an algorithm, they rely on a common set of databases held by the Linguistic Data Consortium. Zahorian’s unusual image-based approach has given his team some of the best results ever reported for automatic speech recognition experiments using two of the consortium’s best-known databases.

The database Zahorian develops with the new funding will join these others, offering researchers around the world a new way to test their theories with samples of real-life speech.

Some mistakes are inevitable, given the variations in pitch, tone and pronunciation from person to person. Still, the field does have a clear standard, Zahorian said: “In order to be useful, a system should have a word-error rate of no more than 10 percent.”
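
The article does not spell out how a word-error rate is calculated. Under the usual definition, it is the number of word substitutions, deletions and insertions needed to turn the recognizer's output into the correct transcript, divided by the number of words in the correct transcript. The sketch below assumes that definition; the function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of six, echoing the Austin/Boston mix-up.
wer = word_error_rate("book a flight to austin please",
                      "book a flight to boston please")
print(f"word-error rate: {wer:.1%}")
```

In this example a single substitution in a six-word request already puts the rate above the 10 percent threshold Zahorian cites.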

Zahorian is interested in language modeling – if someone has said these three words, what’s the fourth word likely to be? – as well as conversation modeling – that is, predicting when the speakers will switch. He’s also intrigued by the potential to make advances by using established methods from other fields, including the neural networks developed by researchers working in artificial intelligence.
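
The "three words, then the fourth" question can be illustrated with a simple counting model: tally which word follows each three-word sequence in some training text, then predict the most frequent follower. This is a generic sketch, not a description of Zahorian's models; the function names and the three-sentence corpus are made up for the example.

```python
from collections import Counter, defaultdict


def train_fourgram(sentences):
    """Count which word follows each three-word context."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - 3):
            context = tuple(words[i:i + 3])
            counts[context][words[i + 3]] += 1
    return counts


def predict_next(counts, w1, w2, w3):
    """Return the most frequent follower of (w1, w2, w3), or None if unseen."""
    followers = counts.get((w1, w2, w3))
    if not followers:
        return None
    word, _ = followers.most_common(1)[0]
    return word


# Hypothetical toy corpus; a real model would be trained on far more text.
corpus = [
    "i want to book a flight to austin",
    "i want to book a hotel",
    "i want to fly to boston",
]
model = train_fourgram(corpus)
print(predict_next(model, "i", "want", "to"))
```

A model this small returns nothing for any context it has never seen, which is why practical language models add smoothing or, as the article notes, turn to neural networks.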

He sees a future in which automatic speech recognition will enable technology to extract the meaning of speech as well as the words.

“The dream,” Zahorian said, “is that someday travelers will be able to speak into a little gadget that will translate what they’ve said into another language instantly and accurately.”

Gail Glover | Newswise Science News