Forum for Science, Industry and Business

Sponsored by:     3M 
Search our Site:

 

New method to facilitate extraction

18.12.2008
With today's large flows of data-based texts it is important to produce systems that facilitate searches for the particular information that is required.

Information on, for example, events in a company from news texts; who is leaving which post, why, to which company and position the person is moving etc. In his thesis Fredrik Olsson deals with a new method of facilitating the marking up of occurrences of names in data-based textual documents.

Information extraction entails analysing texts with the aim of identifying and picking out information about predefined types of entities, events in which the entities are engaged and relationships between entities and events. In other words it is about gaining access to structured information from an apparently unstructured source of information.

One of the reasons that information extraction is not available for everyone is that it requires a lot of work and time to adapt a system to function for new data in a new text domain. A system that could handle the scenario used as an example above would probably not function at all if the data were changed to identifying interactions between proteins described in biomedical text.

An established way of approaching the problem of domain adaptation of systems for information extraction is to realise its components using machine learning, i.e. computer programs that can learn. In many respects machine learning is based on there being examples from which to learn. A component in an extraction system needs to see examples of the phenomenon it is going to learn to identify, e.g. entities and the relationships between them. The basis of this type of machine learning is thus access to large quantities of examples. However, there are major challenges in producing good examples: it is laborious, takes time and requires a person who knows the domain well to mark up examples in texts.

Recognising names of, for example individuals, companies and locations is fundamental for information extraction. By recognising names we can also start to look for, for example, relationships, expressed in the text, between the bearers of the names.

In his thesis Fredrik Olsson describes the work of developing and evaluating a method, called BootMark, of marking up the occurrence of names in textual documents. BootMark contributes to reducing the quantity of documents that a human annotator needs to mark up in order to train a name recognizer with a performance that is equally good or better than a name recognizer who is trained in a random selection of documents from the same corpus.

Title of the thesis: Bootstrapping Named Entity Annotation by Means of Active Machine Learning. A method for creating corpora.
The thesis will be public defended on Friday 19 December at 1.15 pm
Location: Lilla hörsalen, Humanisten, Renströmsgatan 6
For further information contact Fredrik Olsson, mobile: +46 (0)704 -15 54 10,
e-mail: fredriko@sics.se
Contact person: Barbro Ryder Liljegren Faculty of Arts, University of Gothenburg Tel. +46 (0)31-786 48 65, e-mail: barbro.ryder@hum.gu.se

Eva Lundgren | idw
Further information:
http://www.vr.se

More articles from Communications Media:

nachricht On patrol in social networks
25.01.2017 | Fraunhofer-Institut für Arbeitswirtschaft und Organisation IAO

nachricht Tile Based DASH Streaming for Virtual Reality with HEVC from Fraunhofer HHI
03.01.2017 | Fraunhofer-Institut für Nachrichtentechnik Heinrich-Hertz-Institut

All articles from Communications Media >>>

The most recent press releases about innovation >>>

Die letzten 5 Focus-News des innovations-reports im Überblick:

Im Focus: Breakthrough with a chain of gold atoms

In the field of nanoscience, an international team of physicists with participants from Konstanz has achieved a breakthrough in understanding heat transport

In the field of nanoscience, an international team of physicists with participants from Konstanz has achieved a breakthrough in understanding heat transport

Im Focus: DNA repair: a new letter in the cell alphabet

Results reveal how discoveries may be hidden in scientific “blind spots”

Cells need to repair damaged DNA in our genes to prevent the development of cancer and other diseases. Our cells therefore activate and send “repair-proteins”...

Im Focus: Dresdner scientists print tomorrow’s world

The Fraunhofer IWS Dresden and Technische Universität Dresden inaugurated their jointly operated Center for Additive Manufacturing Dresden (AMCD) with a festive ceremony on February 7, 2017. Scientists from various disciplines perform research on materials, additive manufacturing processes and innovative technologies, which build up components in a layer by layer process. This technology opens up new horizons for component design and combinations of functions. For example during fabrication, electrical conductors and sensors are already able to be additively manufactured into components. They provide information about stress conditions of a product during operation.

The 3D-printing technology, or additive manufacturing as it is often called, has long made the step out of scientific research laboratories into industrial...

Im Focus: Mimicking nature's cellular architectures via 3-D printing

Research offers new level of control over the structure of 3-D printed materials

Nature does amazing things with limited design materials. Grass, for example, can support its own weight, resist strong wind loads, and recover after being...

Im Focus: Three Magnetic States for Each Hole

Nanometer-scale magnetic perforated grids could create new possibilities for computing. Together with international colleagues, scientists from the Helmholtz Zentrum Dresden-Rossendorf (HZDR) have shown how a cobalt grid can be reliably programmed at room temperature. In addition they discovered that for every hole ("antidot") three magnetic states can be configured. The results have been published in the journal "Scientific Reports".

Physicist Dr. Rantej Bali from the HZDR, together with scientists from Singapore and Australia, designed a special grid structure in a thin layer of cobalt in...

All Focus news of the innovation-report >>>

Anzeige

Anzeige

Event News

Booth and panel discussion – The Lindau Nobel Laureate Meetings at the AAAS 2017 Annual Meeting

13.02.2017 | Event News

Complex Loading versus Hidden Reserves

10.02.2017 | Event News

International Conference on Crystal Growth in Freiburg

09.02.2017 | Event News

 
Latest News

Stingless bees have their nests protected by soldiers

24.02.2017 | Life Sciences

New risk factors for anxiety disorders

24.02.2017 | Life Sciences

MWC 2017: 5G Capital Berlin

24.02.2017 | Trade Fair News

VideoLinks
B2B-VideoLinks
More VideoLinks >>>