Computer program learns language rules and composes sentences, all without outside help

01.09.2005


Cornell University and Tel Aviv University researchers have developed a method that enables a computer program to scan text in any of a number of languages, including English and Chinese, and to infer the underlying rules of grammar autonomously, without prior information. The rules can then be used to generate new and meaningful sentences. The method also works for other data, such as sheet music or protein sequences.



The development -- which has a patent pending -- has implications for speech recognition and for other applications in natural language engineering, as well as for genomics and proteomics. It also offers new insights into language acquisition and psycholinguistics.

"The algorithm -- the computational method -- for language learning and processing that we have developed can take a body of text, abstract from it a collection of recurring patterns or rules and then generate new material," explained Shimon Edelman, a computer scientist who is a professor of psychology at Cornell and co-author of a new paper, "Unsupervised Learning of Natural Languages," published in the Proceedings of the National Academy of Sciences (PNAS, Vol. 102, No. 33).


"This is the first time an unsupervised algorithm is shown capable of learning complex syntax, generating grammatical new sentences and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics," he said.

Unlike previous attempts at developing computer algorithms for language learning, the new method, called Automatic Distillation of Structure (ADIOS), successfully identifies complex patterns in raw texts. The algorithm discovers the patterns by repeatedly aligning sentences and looking for overlapping parts.

For example, the sentences "I would like to book a first-class flight to Chicago," "I want to book a first-class flight to Boston" and "Book a first-class flight for me, please" may give rise to the pattern "book a first-class flight" -- if this candidate pattern passes the novel statistical significance test that is the core of the algorithm.
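The idea of looking for overlapping parts can be illustrated with a short Python sketch. This is not the ADIOS algorithm: the statistical significance test described in the article is replaced here by a plain support threshold (the assumed parameter `min_support`), and the function names `tokenize` and `shared_patterns` are hypothetical, chosen only for this illustration.

```python
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into words, dropping commas and periods."""
    return sentence.lower().replace(",", " ").replace(".", " ").split()


def shared_patterns(sentences, min_support):
    """Return the longest word sequences found in at least min_support sentences.

    Assumption: a simple count threshold stands in for the significance
    test that the article says is the core of the real algorithm.
    """
    counts = Counter()
    for sentence in sentences:
        words = tokenize(sentence)
        # Collect every contiguous run of two or more words, once per sentence.
        seen = set()
        for start in range(len(words)):
            for end in range(start + 2, len(words) + 1):
                seen.add(tuple(words[start:end]))
        counts.update(seen)

    frequent = [p for p, c in counts.items() if c >= min_support]
    if not frequent:
        return []
    longest = max(len(p) for p in frequent)
    return sorted(" ".join(p) for p in frequent if len(p) == longest)


sentences = [
    "I would like to book a first-class flight to Chicago",
    "I want to book a first-class flight to Boston",
    "Book a first-class flight for me, please",
]
print(shared_patterns(sentences, min_support=3))
# prints ['book a first-class flight']
```

Lowering `min_support` to 2 yields the longer candidate "to book a first-class flight to", which only the first two sentences share. A count threshold alone cannot say which candidate is the better pattern, which is the kind of decision a significance test is meant to make.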

If the system also encounters the sentences "I need to book a direct flight from New York to Tel Aviv" and "I would like to book an economy flight," it may infer that the phrases "first-class," "direct" and "economy" are equivalent in the context of the new pattern. "Because such equivalence sets can contain other patterns -- in turn containing further patterns, and so on -- the resulting body of knowledge grows recursively, as a sort of forest of branching trees of possibilities," said Edelman.
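A minimal sketch of how such an equivalence set might be collected, assuming the surrounding pattern is already known and has a single one-word slot. The regular expression and the function name `equivalence_set` are assumptions made for this illustration. The article does not describe ADIOS in these terms, and the real method also allows whole patterns, not just single words, to fill a slot.

```python
import re


def equivalence_set(sentences, pattern):
    """Collect the words that fill the single capture group of `pattern`."""
    slot = re.compile(pattern, re.IGNORECASE)
    fillers = set()
    for sentence in sentences:
        # With one capture group, findall returns the captured strings.
        fillers.update(match.lower() for match in slot.findall(sentence))
    return fillers


corpus = [
    "I would like to book a first-class flight to Chicago",
    "I want to book a first-class flight to Boston",
    "Book a first-class flight for me, please",
    "I need to book a direct flight from New York to Tel Aviv",
    "I would like to book an economy flight",
]

# Hypothetical pattern: "book", an article, one slot word, then "flight".
fillers = equivalence_set(corpus, r"\bbook an? (\S+) flight\b")
print(sorted(fillers))
# prints ['direct', 'economy', 'first-class']
```

Each member of the resulting set can be substituted into the slot to produce a word sequence that may not appear in the corpus, which is the sense in which the learned patterns can generate new material.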

He added, "ADIOS relies on a statistical method for pattern extraction and on structured generalization -- two processes that have been implicated in language acquisition. Our experiments show that it can acquire intricate structures from raw data, including transcripts of parents’ speech directed at 2- or 3-year-olds. This may eventually help researchers understand how children, who learn language in a similar item-by-item fashion and with very little supervision, eventually master the full complexities of their native tongue."

In addition to child-directed language, the algorithm has been tested on the full text of the Bible in several languages, on artificial context-free languages with thousands of rules and on musical notation. It also has been applied to biological data, such as nucleotide base pairs and amino acid sequences. In analyzing proteins, for example, the algorithm was able to extract from amino acid sequences patterns that were highly correlated with the functional properties of the proteins.

The new method was developed jointly with David Horn and Eytan Ruppin, professors of physics and computer science, respectively, at Tel Aviv University, and with Zach Solan, a doctoral student there and the lead author on the paper. Their collaboration with Edelman was supported in part by the U.S.-Israel Binational Science Foundation.

Source: EurekAlert!
Further information:
http://www.cornell.edu
