Forum for Science, Industry and Business


Engineers aim to make average singers sound like virtuosos

24.04.2003


Karaoke may never be the same, thanks to research being presented in Nashville detailing the latest findings in efforts to create a computerized system that makes average singers sound like professionals.



"Our ultimate goal is to have a computer system that will transform a poor singing voice into a great singing voice," said Mark J.T. Smith, a professor and head of Purdue University’s School of Electrical and Computer Engineering.

To that end, Smith, a former faculty member at the Georgia Institute of Technology, is working with Georgia Tech graduate student Matthew Lee to create computer models for voice analysis and synthesis. These models, implemented as programs called algorithms, break the human singing voice into components that can then be modified to produce a more professional-sounding rendition of the original voice.


Far more work is needed before the system is finished, Smith said. The specialized programs can, however, alter certain important characteristics of a person’s voice, such as pitch, duration and "vibrato," the periodic modulation in frequency that professional singers produce.
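To make the idea of vibrato as a modulation in frequency concrete, here is a minimal Python sketch. It is not the researchers' code: the function name `vibrato_tone`, the 8,000-sample-per-second rate and the example values are assumptions chosen for illustration. It generates a plain sine tone whose frequency wobbles slowly around a center pitch.

```python
import math

def vibrato_tone(f0_hz, rate_hz, depth_hz, duration_s, sample_rate=8000):
    """Return samples of a sine tone whose frequency wobbles around f0_hz.

    rate_hz is how fast the pitch wobbles, depth_hz is how far it moves.
    All names and defaults are illustrative assumptions.
    """
    samples = []
    phase = 0.0
    for n in range(int(duration_s * sample_rate)):
        t = n / sample_rate
        # Instantaneous frequency: center pitch plus a slow sinusoidal offset.
        inst_freq = f0_hz + depth_hz * math.sin(2.0 * math.pi * rate_hz * t)
        samples.append(math.sin(phase))
        # Accumulate phase so the waveform stays continuous as frequency changes.
        phase += 2.0 * math.pi * inst_freq / sample_rate
    return samples

# Half a second of a 220 Hz tone with a 5.5 Hz vibrato of +/- 4 Hz.
tone = vibrato_tone(220.0, 5.5, 4.0, 0.5)
```

Setting `depth_hz` to zero yields a steady tone, so the same routine can produce a before-and-after pair for comparison.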

Lee will present the latest research findings on April 30 during the 145th Meeting of the Acoustical Society of America in Nashville, Tenn., the nation’s country music capital. Lee will demonstrate the system by playing before-and-after country music audio clips to researchers attending the conference.

The system uses a special analysis technique to break the original voice down into components. The voice is then reconstructed using a mathematical method called the fast Fourier transform, which enables the system to resynthesize the voice quickly.
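The article does not say exactly how the fast Fourier transform is used inside the system, so the following Python sketch shows only the transform itself: a textbook radix-2 FFT and its inverse, which takes a block of samples to a spectrum and back. The function names `fft` and `ifft` and the eight-sample test block are assumptions for illustration.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def ifft(spectrum):
    """Inverse FFT via the conjugate trick: conj(fft(conj(X))) / n."""
    n = len(spectrum)
    conj = [value.conjugate() for value in spectrum]
    return [value.conjugate() / n for value in fft(conj)]

# A block holding exactly two cycles of a cosine.
block = [math.cos(2.0 * math.pi * 2 * n / 8) for n in range(8)]
spectrum = fft(block)      # energy concentrates in bins 2 and 6
rebuilt = ifft(spectrum)   # real parts match the original block
```

The speed advantage comes from the recursion: each call splits the problem in half, so the cost grows roughly as n log n rather than n squared.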

Smith, who specializes in an area of electrical engineering known as signal processing, began working on the underlying "sinusoidal model" in the mid-1980s with former doctoral student E. Bryan George, who pioneered the method. The model enables the human singing voice to be broken into components, or sine wave segments. More recently, Smith and Lee developed a method for modifying sine wave parameters in the segments to improve the quality of singing.
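As a rough illustration of what modifying sine wave parameters can mean, the Python sketch below represents a short segment of sound as a list of (amplitude, frequency, phase) triples, changes the frequencies, and resynthesizes. This is a simplified stand-in, not the Smith and George model: the helper names `synthesize` and `shift_pitch` and the example values are hypothetical.

```python
import math

def synthesize(partials, num_samples, sample_rate=8000):
    """Sum a set of sinusoids given as (amplitude, frequency_hz, phase) triples."""
    out = []
    for n in range(num_samples):
        t = n / sample_rate
        out.append(sum(a * math.sin(2.0 * math.pi * f * t + p)
                       for a, f, p in partials))
    return out

def shift_pitch(partials, ratio):
    """Scale every frequency by the same ratio; amplitudes and phases are kept."""
    return [(a, f * ratio, p) for a, f, p in partials]

# A toy "voice" segment: a 200 Hz fundamental plus one weaker overtone.
segment = [(1.0, 200.0, 0.0), (0.5, 400.0, 0.0)]
raised = shift_pitch(segment, 1.5)    # move the pitch up by a factor of 1.5
audio = synthesize(raised, 800)       # 0.1 seconds at 8,000 samples per second
```

Because the sound is stored as parameters rather than raw samples, a change such as a pitch correction becomes simple arithmetic on the frequency values.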

"While we have had success in improving the quality of the singing voice samples in our database, we have a way to go before we are able to handle all types of voices reliably," Smith said. "There are many challenges in developing a system of this type.

"Being able to characterize the properties of a good voice in terms of the sine wave components that we compute is not a trivial task. The problem is further complicated by the wide variety of singing styles and voice types that are present in our population."

For example, the sine wave components for male voices and female voices are significantly different.

"It turns out that we are having greater difficulty with the male singers than with the female singers," Smith said. "The higher pitched voices are easier for us to work with, in general."

Other challenges include finding ways to improve a person’s singing without dramatically altering the original voice, identifying the parameters that need to be modified for specific types of quality improvements, and then operating the system in real time on available hardware.

An important feature of the sinusoidal model technique is an "overlap-add" construction, in which a singing voice is partitioned into segments and processed in blocks. The model is designed around blocks that overlap, which results in voice synthesis that sounds natural and not choppy, Smith said.
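The overlap-add idea can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the researchers' implementation: it uses a Hann window and blocks that overlap by half, and the names `hann`, `overlap_add` and `process` are hypothetical. Each block is faded in and out by the window, optionally modified, and added back into the output so neighboring blocks blend smoothly.

```python
import math

def hann(n_len):
    """Periodic Hann window; copies shifted by n_len // 2 sum to exactly 1."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_len) for n in range(n_len)]

def overlap_add(signal, block_len=8, process=lambda block: block):
    """Split signal into half-overlapping windowed blocks, process, and re-add.

    block_len must be even. With the default identity process, every sample
    covered by two blocks is reconstructed exactly.
    """
    hop = block_len // 2
    window = hann(block_len)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - block_len + 1, hop):
        block = [signal[start + i] * window[i] for i in range(block_len)]
        block = process(block)
        for i, value in enumerate(block):
            out[start + i] += value
    return out

signal = [math.sin(0.3 * n) for n in range(32)]
rebuilt = overlap_add(signal)   # matches signal except at the two outer edges
```

The first and last half-blocks are covered by only one window, so they fade toward zero. Without the overlap, any change made to one block would produce an audible jump at the block boundary.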

Singing is first converted into a sequence of numbers, which is modified into a new set of numbers that represents a more professional singing voice. The new numbers are then fed to a digital-to-analog converter and to a speaker, Smith said.
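The last step, turning the new numbers into something a digital-to-analog converter can play, might look like the Python sketch below. The 16-bit sample format and the helper names `to_pcm16` and `pack_pcm16` are assumptions for illustration; the article does not specify the format the system uses.

```python
import struct

def to_pcm16(samples):
    """Map floating-point samples in [-1.0, 1.0] to 16-bit integers, clipping overshoot."""
    ints = []
    for s in samples:
        clipped = max(-1.0, min(1.0, s))
        ints.append(int(round(clipped * 32767)))
    return ints

def pack_pcm16(ints):
    """Pack integers as little-endian signed 16-bit values, a common raw audio layout."""
    return struct.pack("<%dh" % len(ints), *ints)

processed = [0.0, 0.25, -0.25, 1.0]        # stand-in for the modified voice samples
raw_bytes = pack_pcm16(to_pcm16(processed))
```

Clipping before conversion keeps any value that overshoots the valid range from wrapping around to the opposite extreme.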

The sinusoidal model Smith and Lee use could have broader applications, such as synthesizing musical instruments and improving the quality of text-to-speech programs in which words typed on a computer are automatically converted into spoken language. Former Georgia Tech doctoral student Michael Macon and his adviser Mark Clements used the sinusoidal model Smith and George developed to create a system that changes text into speech and typed lyrics into singing.

Other possible applications include programs for the hearing-impaired that make it easier to hear speech and systems that change the playback speed of digital recordings.

"The idea of digitally enhanced human singing has been brewing in my mind for a long time," Smith said. "What I would really like is for us to cut an album one of these days."

Early portions of the research were funded by the National Science Foundation.

Writer: Emil Venere, (765) 494-4709, venere@purdue.edu

Sources: Mark J.T. Smith, (765) 494-3539, mjts@purdue.edu

Matthew Lee, (404) 664-8323, mattlee@ece.gatech.edu

Purdue News Service: (765) 494-2096; purduenews@purdue.edu

Emil Venere | Purdue News
Further information:
http://news.uns.purdue.edu/html4ever/030423.Smith.singing.html
