Compression helps a computer tell Dante from Machiavelli
New computer program could settle literary debates.
Unlike human readers, computers have so far struggled to tell a page of Jane Austen from one by Jackie Collins. Now researchers in Italy have developed a program that can spot enough subtle differences between two authors’ works to attribute authorship [1].
The program can tell a text by Machiavelli from one by Pirandello, Dante or a host of other great Italian writers. It has also constructed a language tree showing the degree of affinity between 50 different tongues. The tree identifies all the main linguistic groups, such as Romance, Celtic and Slavic, and highlights Maltese (an Afro-Asiatic language) and Basque as anomalies.
Clash of symbols
Dario Benedetto and colleagues at the Università ‘La Sapienza’ in Rome take a different approach. They start from the premise that written language is, in the end, no more than a string of symbols. The string might look rather random, but it is not.
Some groups of characters recur frequently (such as ‘the’ in English), and particular authors favour certain constructions and turns of phrase. These regularities can be measured, rather than judged from subjective impressions or anecdotal comparisons.
The team begin from the classic insight of telecommunications engineer Claude Shannon in the 1940s that the information content of a message is related to its entropy. Roughly speaking, entropy measures how much of a message is not redundant. It can be thought of as the size of the smallest program that will produce the original message as its output.
For a random string of characters, this program would simply have to specify every character, so it would be the same size as the original message. For a string of just A’s, the program could be very concise: ‘repeat A’. Most real messages lie somewhere in between: they can usually be compressed a little without losing significant information. This is the basis of the data-compression algorithms used to make ‘zip’ files, for instance.
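The contrast between a random string and a string of A’s can be checked with any general-purpose compressor. The sketch below is only an illustration: it assumes Python’s standard zlib module as a stand-in for a ‘zip’ program, and the helper name compressed_size is made up for this example.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Return the number of bytes zlib needs to store `data` (highest effort)."""
    return len(zlib.compress(data, 9))


n = 10_000
repeated = b"A" * n           # maximally redundant: "repeat A"
random_bytes = os.urandom(n)  # no structure for the compressor to exploit

print("original size:       ", n)
print("string of A's zipped:", compressed_size(repeated))
print("random bytes zipped: ", compressed_size(random_bytes))
```

The string of A’s should shrink to a few dozen bytes, while the random bytes should stay at roughly their original size. Ordinary prose falls between the two.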
Benedetto and his colleagues borrow the principles of data-compression algorithms to calculate a kind of relative entropy for two different character strings: a measure of how much they differ. This distance between two texts is smaller for two works by the same author than for two works by different authors.
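One rough way to picture such a measure is to ask how many extra bytes a compressor needs when a fragment of unknown text is appended to a known reference text: the better the reference ‘explains’ the fragment, the fewer extra bytes are needed. The sketch below illustrates that idea only and is not the researchers’ own code. It assumes zlib as the compressor, and the function names (zipped_len, extra_cost, attribute) and the two toy reference texts are invented for the example.

```python
import zlib
from typing import Mapping


def zipped_len(data: bytes) -> int:
    """Length in bytes of `data` after zlib compression."""
    return len(zlib.compress(data, 9))


def extra_cost(reference: bytes, fragment: bytes) -> int:
    """Extra compressed bytes needed to store `fragment` after `reference`."""
    return zipped_len(reference + fragment) - zipped_len(reference)


def attribute(fragment: bytes, references: Mapping[str, bytes]) -> str:
    """Return the label of the reference text that compresses `fragment` best."""
    return min(references, key=lambda name: extra_cost(references[name], fragment))


# Toy reference texts (made up for illustration, not real authors).
references = {
    "maritime": (
        b"the sea was calm and the ship sailed slowly toward the distant "
        b"harbour while the sailors sang old songs about the wind and the waves "
    ),
    "finance": (
        b"quarterly profits exceeded forecasts because market volatility boosted "
        b"trading revenue across every major banking division globally "
    ),
}

unknown = b"the sailors sang old songs while the ship sailed slowly toward the harbour"

for name, text in references.items():
    print(name, extra_cost(text, unknown))
print("closest reference:", attribute(unknown, references))
```

zlib only looks back about 32 kilobytes, so a sketch like this is suited to short samples. Longer texts would need to be trimmed or handled with a compressor that has a larger window.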
PHILIP BALL | © Nature News Service