Compression helps a computer tell Dante from Machiavelli
New computer program could settle literary debates.
To date, unlike us, computers have struggled to differentiate a page of Jane Austen from one by Jackie Collins. Now researchers in Italy have developed a program that can spot enough subtle differences between two authors’ works to attribute authorship [1].
The program can tell a text by Machiavelli from one by Pirandello, Dante or a host of other great Italian writers. It has also constructed a language tree showing the degree of affinity between 50 different languages. The tree identifies all the main linguistic groups, such as Romance, Celtic and Slavic, and highlights Maltese (an Afro-Asiatic language) and Basque as anomalies.
Clash of symbols
Dario Benedetto and colleagues at the Università ‘La Sapienza’ in Rome try a different approach. They start from the premise that written language is, in the end, no more than a string of symbols. It might look rather random, but it is not.
Some groups of characters recur commonly (such as ‘the’ in English), and particular authors favour certain constructions and turns of phrase. These regularities can be measured, rather than judged from subjective impressions or anecdotal comparisons.
The team begins from the classic insight of telecommunications engineer Claude Shannon in the 1940s that the information content of a message is related to its entropy. Roughly speaking, entropy measures how much of a message is not redundant: the more redundancy a message contains, the lower its entropy. In the algorithmic sense, it can be defined as the length of the smallest program that will produce the original message as its output.
For a random string of characters, this program would simply specify every character, so it would be the same size as the original message. For a string of just A’s, the program could be very concise: ‘repeat A’. Most real messages lie somewhere in between: they can usually be compressed a little without losing significant information. This is the basis of data-compression computer algorithms, used to make ‘zip’ files, for instance.
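The contrast between a random string and a string of A’s can be checked with any general-purpose compressor. The sketch below is only an illustration, not part of the researchers’ work: it assumes Python’s standard `zlib` module as the compressor, and the helper name `compressed_size` and the 10,000-byte test strings are made up for the example.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of `data` after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


n = 10_000

# A string of just A's: maximally redundant.
repeated = b"A" * n

# Pseudo-random bytes from a seeded generator: essentially no redundancy.
rng = random.Random(0)
noise = bytes(rng.randrange(256) for _ in range(n))

print("repeated:", compressed_size(repeated), "bytes from", n)
print("noise:   ", compressed_size(noise), "bytes from", n)
```

The repeated string should shrink to a few dozen bytes, while the pseudo-random one stays at roughly its original size. Ordinary prose would fall between the two.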
Benedetto and his colleagues borrow the principles of data-compression algorithms to calculate a kind of relative entropy for two different character strings: a measure of how much they differ. This distance between two texts is smaller for two works by the same author than for two works by different authors.
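One way to picture such a compression-based distance is to ask how many extra bytes a compressor needs for a short excerpt once it has already seen a longer reference text. The sketch below is a simplified illustration under stated assumptions, not the researchers’ actual procedure: it uses Python’s `zlib`, whose 32 KB window means only the tail of a long reference is used, and the names `extra_bytes`, `attribute`, `toy_vocab` and `toy_text` are hypothetical. The ‘authors’ are synthetic word lists generated for the example, not real literary texts.

```python
import random
import zlib


def csize(data: bytes) -> int:
    """Compressed size in bytes at zlib's highest compression level."""
    return len(zlib.compress(data, 9))


def extra_bytes(reference: bytes, excerpt: bytes) -> int:
    """Extra compressed bytes needed for `excerpt` when it follows `reference`."""
    return csize(reference + excerpt) - csize(reference)


def attribute(excerpt: bytes, references: dict) -> str:
    """Label of the reference text that makes `excerpt` cheapest to encode."""
    return min(references, key=lambda name: extra_bytes(references[name], excerpt))


def toy_vocab(seed: int, size: int = 50) -> list:
    """Synthetic vocabulary standing in for one author's habits (toy data)."""
    rng = random.Random(seed)
    letters = "abcdefghijklmnopqrstuvwxyz"
    return ["".join(rng.choice(letters) for _ in range(rng.randint(4, 9)))
            for _ in range(size)]


def toy_text(vocab: list, n_words: int, seed: int) -> bytes:
    """Random sequence of words drawn from `vocab`, encoded as bytes."""
    rng = random.Random(seed)
    return " ".join(rng.choice(vocab) for _ in range(n_words)).encode()


vocab_a, vocab_b = toy_vocab(1), toy_vocab(2)
references = {
    "A": toy_text(vocab_a, 2000, seed=10),  # about 15 KB, inside zlib's window
    "B": toy_text(vocab_b, 2000, seed=20),
}
unknown = toy_text(vocab_a, 300, seed=30)  # new text in author A's "style"

for name, ref in references.items():
    print(name, extra_bytes(ref, unknown))
print("attributed to:", attribute(unknown, references))
```

The excerpt costs fewer extra bytes after the reference that shares its vocabulary, because the compressor can reuse strings it has already seen. Real texts differ far more subtly than these toy word lists, so this only conveys the general idea.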
PHILIP BALL | © Nature News Service