Engineers aim to make average singers sound like virtuosos


Karaoke may never be the same, thanks to research being presented in Nashville detailing the latest findings in efforts to create a computerized system that makes average singers sound like professionals.

"Our ultimate goal is to have a computer system that will transform a poor singing voice into a great singing voice," said Mark J.T. Smith, a professor and head of Purdue University’s School of Electrical and Computer Engineering.

To that end, Smith, a former faculty member at the Georgia Institute of Technology, is working with Georgia Tech graduate student Matthew Lee to create computer models for voice analysis and synthesis. These models, implemented as algorithms, break the human singing voice into components that can then be modified to produce a more professional-sounding rendition of the original voice.

Far more work is needed before the system is finished, Smith said. The specialized programs can, however, already alter certain important characteristics of a person’s voice, such as pitch, duration and "vibrato," the periodic modulation in frequency produced by professional singers.
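Vibrato of the kind described here can be pictured as a slow, periodic wobble in pitch. The following is a minimal sketch of that idea only, not the researchers' algorithm: it synthesizes a tone whose frequency is modulated sinusoidally. The function name `vibrato_tone` and the default rate and depth values are illustrative assumptions.

```python
import math

def vibrato_tone(f0, seconds, sample_rate=8000, rate_hz=5.5, depth=0.03):
    """Return samples of a sine tone at f0 Hz with sinusoidal vibrato.

    rate_hz is the number of pitch wobbles per second; depth is the peak
    frequency deviation as a fraction of f0. Both defaults are
    illustrative assumptions, not values from the research.
    """
    samples = []
    phase = 0.0
    for n in range(int(seconds * sample_rate)):
        t = n / sample_rate
        # Instantaneous frequency swings between f0*(1-depth) and f0*(1+depth).
        freq = f0 * (1.0 + depth * math.sin(2.0 * math.pi * rate_hz * t))
        samples.append(math.sin(phase))
        # Accumulate phase so the waveform stays continuous as freq changes.
        phase += 2.0 * math.pi * freq / sample_rate
    return samples

tone = vibrato_tone(220.0, 1.0)
```

Setting `depth=0.0` gives a plain, unmodulated tone, which makes the effect of the modulation easy to compare by ear.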

Lee will present the latest research findings on April 30 during the 145th Meeting of the Acoustical Society of America in Nashville, Tenn., the nation’s country music capital. Lee will demonstrate the system by playing before-and-after country music audio clips to researchers attending the conference.

The system first breaks the original voice down into sine wave components. The voice is then reconstructed using a mathematical method called the fast Fourier transform, which enables the system to resynthesize the voice quickly.
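As a rough illustration of resynthesis with the fast Fourier transform, the sketch below fills in a spectrum from a list of sine wave components and runs an inverse transform to get one block of samples. It is an assumption-laden toy, not the system's implementation: the helper names (`fft`, `ifft`, `synthesize_block`) are hypothetical, and each component is restricted to an exact FFT bin strictly between 0 and n/2, which a real voice would not satisfy.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def ifft(spectrum):
    """Inverse FFT computed with the forward FFT and complex conjugation."""
    n = len(spectrum)
    conj = fft([c.conjugate() for c in spectrum])
    return [c.conjugate() / n for c in conj]

def synthesize_block(partials, n=64):
    """Build n samples from (bin, amplitude, phase) partials, 0 < bin < n/2."""
    spectrum = [0j] * n
    for k, amp, phase in partials:
        # A real cosine occupies bin k and its mirror image n - k.
        spectrum[k] += (amp * n / 2) * cmath.exp(1j * phase)
        spectrum[n - k] += (amp * n / 2) * cmath.exp(-1j * phase)
    return [c.real for c in ifft(spectrum)]

# Two hypothetical components: bins 3 and 6 of a 64-sample block.
block = synthesize_block([(3, 1.0, 0.0), (6, 0.5, math.pi / 2)])
```

The speed advantage comes from the transform's cost growing as n log n, whereas summing each sine wave sample by sample grows with the number of components times the block length.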

Smith, who specializes in an area of electrical engineering known as signal processing, began working on the underlying "sinusoidal model" in the mid-1980s with former doctoral student E. Bryan George, who pioneered the method. The model enables the human singing voice to be broken into components, or sine wave segments. More recently, Smith and Lee developed a method for modifying sine wave parameters in the segments to improve the quality of singing.
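To make the idea of modifying sine wave parameters concrete, here is a minimal sketch that describes one segment as a list of (amplitude, frequency, phase) triples and raises its pitch by scaling the frequencies. The names `synthesize` and `shift_pitch` and the example values are assumptions for illustration; the article does not describe the researchers' actual modification method, which is presumably more careful than uniform scaling.

```python
import math

def synthesize(partials, num_samples, sample_rate=8000):
    """Sum sinusoids given as (amplitude, frequency_hz, phase) tuples."""
    out = []
    for n in range(num_samples):
        t = n / sample_rate
        out.append(sum(a * math.cos(2.0 * math.pi * f * t + p)
                       for a, f, p in partials))
    return out

def shift_pitch(partials, ratio):
    """Scale every partial's frequency, leaving amplitude and phase alone."""
    return [(a, f * ratio, p) for a, f, p in partials]

# A hypothetical segment: a 200 Hz fundamental and two harmonics.
segment = [(1.0, 200.0, 0.0), (0.5, 400.0, 0.0), (0.25, 600.0, 0.0)]

# Raise the pitch by one equal-tempered semitone (ratio 2**(1/12)).
sharper = shift_pitch(segment, 2.0 ** (1.0 / 12.0))
audio = synthesize(sharper, 160)
```

Because the voice is stored as parameters rather than raw samples, a correction such as nudging a flat note upward becomes a small arithmetic change to the frequency values.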

"While we have had success in improving the quality of the singing voice samples in our database, we have a way to go before we are able to handle all types of voices reliably," Smith said. "There are many challenges in developing a system of this type.

"Being able to characterize the properties of a good voice in terms of the sine wave components that we compute is not a trivial task. The problem is further complicated by the wide variety of singing styles and voice types that are present in our population."

For example, the sine wave components for male voices and female voices are significantly different.

"It turns out that we are having greater difficulty with the male singers than with the female singers," Smith said. "The higher pitched voices are easier for us to work with, in general."

Other challenges include finding ways to improve a person’s singing without dramatically altering the original voice, identifying the parameters that need to be modified for specific types of quality improvements, and then operating the system in real time on available hardware.

An important feature of the sinusoidal model technique is an "overlap-add" construction, in which a singing voice is partitioned into segments and processed in blocks. The model is designed around blocks that overlap, which results in voice synthesis that sounds natural and not choppy, Smith said.
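The overlap-add idea can be sketched in a few lines. The example below is only the framing step under stated assumptions (a Hann window, 50 percent overlap, and a hypothetical `process` hook where per-block modification would go); in the actual system each block would be synthesized from sine wave parameters rather than copied from the input.

```python
import math

def hann(n):
    """Periodic Hann window; copies offset by n // 2 sum to exactly 1."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def overlap_add(signal, block=8, process=lambda seg: seg):
    """Split signal into 50%-overlapping windowed blocks, process, re-sum."""
    hop = block // 2
    window = hann(block)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - block + 1, hop):
        seg = [signal[start + i] * window[i] for i in range(block)]
        seg = process(seg)  # per-block modification would happen here
        for i in range(block):
            out[start + i] += seg[i]
    return out

signal = [math.sin(0.3 * n) for n in range(64)]
rebuilt = overlap_add(signal)
```

With no processing applied, every sample covered by two blocks is rebuilt exactly, because each block fades out just as its neighbor fades in. That crossfade is what keeps block boundaries from sounding choppy; only the first and last half-block, covered by a single window, are attenuated.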

Singing is first converted into a sequence of numbers, which is modified into a new set of numbers that represents a more professional singing voice. The new numbers are then fed to a digital-to-analog converter and to a speaker, Smith said.
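The "sequence of numbers" at each end of this chain is typically a stream of integer samples. As a hedged illustration, the sketch below quantizes floating-point samples to 16-bit integers, a common format for digital-to-analog converters; the article does not state the sample format or rate the researchers use, so the 16-bit depth, the 8000 Hz rate and the function names are assumptions.

```python
import math
import struct

def to_pcm16(samples):
    """Quantize floats in [-1.0, 1.0] to 16-bit signed integers."""
    pcm = []
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip so out-of-range values don't wrap
        pcm.append(int(round(s * 32767)))
    return pcm

def pack_little_endian(pcm):
    """Pack the integers as raw little-endian bytes, 2 bytes per sample."""
    return struct.pack("<%dh" % len(pcm), *pcm)

# 80 samples of a 440 Hz tone at an assumed 8000 Hz sample rate.
samples = [math.sin(2.0 * math.pi * 440.0 * n / 8000) for n in range(80)]
pcm = to_pcm16(samples)
raw = pack_little_endian(pcm)
```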

The sinusoidal model Smith and Lee use could have broader applications, such as synthesizing musical instruments and improving the quality of text-to-speech programs in which words typed on a computer are automatically converted into spoken language. Former Georgia Tech doctoral student Michael Macon and his adviser Mark Clements used the sinusoidal model Smith and George developed to create a system that changes text into speech and typed lyrics into singing.

Other possible applications include programs for the hearing-impaired that make it easier to hear speech and systems that change the playback speed of digital recordings.

"The idea of digitally enhanced human singing has been brewing in my mind for a long time," Smith said. "What I would really like is for us to cut an album one of these days."

Early portions of the research were funded by the National Science Foundation.

Writer: Emil Venere, (765) 494-4709

Sources: Mark J.T. Smith, (765) 494-3539

Matthew Lee, (404) 664-8323

Purdue News Service: (765) 494-2096

Emil Venere | Purdue News