Computer program learns language rules and composes sentences, all without outside help


Cornell University and Tel Aviv University researchers have developed a method that enables a computer program to scan text in any of a number of languages, including English and Chinese, and to infer the underlying rules of grammar autonomously, without any prior information. The rules can then be used to generate new and meaningful sentences. The method also works for such data as sheet music or protein sequences.

The development -- which has a patent pending -- has implications for speech recognition and for other applications in natural language engineering, as well as for genomics and proteomics. It also offers new insights into language acquisition and psycholinguistics.

"The algorithm -- the computational method -- for language learning and processing that we have developed can take a body of text, abstract from it a collection of recurring patterns or rules and then generate new material," explained Shimon Edelman, a computer scientist who is a professor of psychology at Cornell and co-author of a new paper, "Unsupervised Learning of Natural Languages," published in the Proceedings of the National Academy of Sciences (PNAS, Vol. 102, No. 33).

"This is the first time an unsupervised algorithm is shown capable of learning complex syntax, generating grammatical new sentences and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics," he said.

Unlike previous attempts at developing computer algorithms for language learning, the new method, called Automatic Distillation of Structure (ADIOS), successfully identifies complex patterns in raw texts. The algorithm discovers the patterns by repeatedly aligning sentences and looking for overlapping parts.

For example, the sentences "I would like to book a first-class flight to Chicago," "I want to book a first-class flight to Boston" and "Book a first-class flight for me, please" may give rise to the pattern "book a first-class flight" -- if this candidate pattern passes the novel statistical significance test that is the core of the algorithm.

If the system also encounters the sentences "I need to book a direct flight from New York to Tel Aviv" and "I would like to book an economy flight," it may infer that the phrases "first-class," "direct" and "economy" are equivalent in the context of the new pattern. "Because such equivalence sets can contain other patterns -- in turn containing further patterns, and so on -- the resulting body of knowledge grows recursively, as a sort of forest of branching trees of possibilities," said Edelman.
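The idea of an equivalence set can be sketched in the same spirit. The Python example below assumes a pattern with one open slot, "book a ___ flight," and gathers every word that fills the slot across the example sentences. It is an illustration only, not the published method: the helper names are hypothetical, and folding "an" into "a" is a simplifying assumption made so that "an economy flight" matches the same context.

```python
def tokenize(sentence):
    """Lowercase, strip punctuation, and fold "an" into "a" (a simplification)."""
    words = (w.strip(".,!?").lower() for w in sentence.split())
    return ["a" if w == "an" else w for w in words if w]

def slot_fillers(sentences, left, right):
    """Collect the single words that occur between the `left` and `right` contexts."""
    fillers = set()
    span = len(left) + 1 + len(right)  # context words plus one open slot
    for sentence in sentences:
        tokens = tuple(tokenize(sentence))
        for i in range(len(tokens) - span + 1):
            window = tokens[i:i + span]
            if window[:len(left)] == left and window[len(left) + 1:] == right:
                fillers.add(window[len(left)])
    return fillers

sentences = [
    "I would like to book a first-class flight to Chicago",
    "I want to book a first-class flight to Boston",
    "Book a first-class flight for me, please",
    "I need to book a direct flight from New York to Tel Aviv",
    "I would like to book an economy flight",
]
equivalent = slot_fillers(sentences, left=("book", "a"), right=("flight",))
print(sorted(equivalent))
```

For the five example sentences the slot is filled by "direct," "economy" and "first-class." In the recursive picture Edelman describes, a slot filler could itself be a pattern rather than a single word; this sketch handles only single words.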

He added, "ADIOS relies on a statistical method for pattern extraction and on structured generalization -- two processes that have been implicated in language acquisition. Our experiments show that it can acquire intricate structures from raw data, including transcripts of parents’ speech directed at 2- or 3-year-olds. This may eventually help researchers understand how children, who learn language in a similar item-by-item fashion and with very little supervision, eventually master the full complexities of their native tongue."

In addition to child-directed language, the algorithm has been tested on the full text of the Bible in several languages, on artificial context-free languages with thousands of rules and on musical notation. It also has been applied to biological data, such as nucleotide base pairs and amino acid sequences. In analyzing proteins, for example, the algorithm was able to extract from amino acid sequences patterns that were highly correlated with the functional properties of the proteins.

The new method was developed jointly with David Horn and Eytan Ruppin, professors of physics and computer science, respectively, at Tel Aviv University, and with Zach Solan, a doctoral student there and the lead author on the paper. Their collaboration with Edelman was supported in part by the U.S.-Israel Binational Science Foundation.

Source: EurekAlert!