Information extraction can provide, for example, information on events in a company from news texts: who is leaving which post, why, and to which company and position the person is moving. In his thesis, Fredrik Olsson presents a new method that makes it easier to mark up occurrences of names in computer-based text documents.
Information extraction entails analysing texts to identify and pick out information about predefined types of entities, the events in which those entities take part, and the relationships between entities and events. In other words, it is about obtaining structured information from an apparently unstructured source.
One of the reasons that information extraction is not available to everyone is that adapting a system to new data in a new text domain takes a great deal of work and time. A system that could handle the scenario used as an example above would probably not function at all if the task were changed to identifying interactions between proteins described in biomedical text.
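To make the domain problem concrete, here is a minimal sketch in Python of a hand-written extractor for the job-change scenario. Everything in it is hypothetical and serves only as illustration: the `JobChange` record, the pattern, and the example sentences do not come from the thesis. It turns one fixed sentence shape into a structured record and finds nothing at all in a biomedical sentence.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobChange:
    """Structured record for one job-change event (illustrative only)."""
    person: str
    old_company: str
    new_company: str
    new_position: str


# Hypothetical pattern covering a single sentence shape:
# "<Person> is leaving <Company> to become <position> at <Company>"
PATTERN = re.compile(
    r"(?P<person>[A-Z]\w+(?: [A-Z]\w+)+) is leaving (?P<old>[A-Z]\w+) "
    r"to become (?P<position>[\w ]+?) at (?P<new>[A-Z]\w+)"
)


def extract_job_change(sentence: str) -> Optional[JobChange]:
    """Return a JobChange if the sentence matches the pattern, else None."""
    match = PATTERN.search(sentence)
    if match is None:
        return None
    return JobChange(
        person=match.group("person"),
        old_company=match.group("old"),
        new_company=match.group("new"),
        new_position=match.group("position"),
    )


news = "Anna Berg is leaving Acme to become chief executive at Globex."
biomedical = "The kinase phosphorylates the receptor in vitro."

print(extract_job_change(news))        # a filled-in JobChange record
print(extract_job_change(biomedical))  # None: the pattern knows nothing about proteins
```

A rule written for one domain carries no knowledge of another, which is why adapting such a system by hand is costly.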
An established way of approaching the problem of domain adaptation in information extraction systems is to build their components using machine learning, i.e. computer programs that can learn. Machine learning largely depends on having examples to learn from. A component in an extraction system needs to see examples of the phenomena it is to learn to identify, e.g. entities and the relationships between them. This type of machine learning therefore relies on access to large quantities of examples. Producing good examples is a major challenge, however: it is laborious and time-consuming, and it requires a person who knows the domain well to mark up the examples in texts.
Recognising the names of, for example, individuals, companies and locations is fundamental to information extraction. Once names have been recognised, we can also start to look for relationships, expressed in the text, between the bearers of those names.
In his thesis, Fredrik Olsson describes the development and evaluation of a method, called BootMark, for marking up occurrences of names in text documents. BootMark reduces the number of documents that a human annotator needs to mark up in order to train a name recogniser that performs as well as, or better than, a name recogniser trained on a random selection of documents from the same corpus.
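This text does not explain how BootMark chooses which documents the annotator should mark up. Purely as an illustration of how a non-random choice might be made, the sketch below ranks unannotated documents by how uncertain a current model is about them and picks the most uncertain ones. The function names, the entropy score and the toy probabilities are all assumptions, not a description of BootMark.

```python
import math
from typing import Dict, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy of a probability distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(pool: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Pick the k documents whose predictions are most uncertain.

    `pool` maps a document id to the class probabilities that some
    current model assigns to it (hypothetical input for this sketch).
    """
    ranked = sorted(pool, key=lambda doc_id: entropy(pool[doc_id]), reverse=True)
    return ranked[:k]


# Toy predictions for three unannotated documents.
pool = {
    "doc_a": [0.5, 0.5],    # model is undecided
    "doc_b": [0.99, 0.01],  # model is confident
    "doc_c": [0.7, 0.3],    # somewhere in between
}

print(select_for_annotation(pool, k=2))  # ['doc_a', 'doc_c']
```

The idea behind this kind of selection is that a document the model is already confident about teaches it little, so annotation effort goes to the documents it finds hardest.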
Eva Lundgren | idw