Forum for Science, Industry and Business

New method to facilitate extraction

18.12.2008
With today's large flows of electronic text, it is important to build systems that make it easier to find the specific information required.

One example is information about events in a company, drawn from news texts: who is leaving which post, why, and to which company and position the person is moving. In his thesis, Fredrik Olsson presents a new method that makes it easier to mark up occurrences of names in electronic text documents.

Information extraction entails analysing texts with the aim of identifying and picking out information about predefined types of entities, the events in which those entities take part, and the relationships between entities and events. In other words, it is about gaining access to structured information from an apparently unstructured source.

One of the reasons that information extraction is not available to everyone is that adapting a system to new data in a new text domain takes a great deal of work and time. A system that could handle the news scenario used as an example above would probably not function at all if the task were changed to identifying interactions between proteins described in biomedical text.
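The news scenario and its brittleness can be made concrete with a small Python sketch. It is only an illustration, not a system described in the thesis: the pattern, the `JobMove` record, the function name and the example sentences and names are all hypothetical. It turns one sentence shape into a structured record and finds nothing in a biomedical sentence.

```python
import re
from typing import NamedTuple, Optional


class JobMove(NamedTuple):
    """Structured record for one 'person changes job' event (hypothetical schema)."""
    person: str
    old_company: str
    new_company: str
    new_position: str


# Hand-written pattern covering a single sentence shape (an assumption for
# illustration). A name is one or more capitalised words.
_NAME = r"[A-Z][\w&-]*(?: [A-Z][\w&-]*)*"
MOVE_PATTERN = re.compile(
    rf"(?P<person>{_NAME}) is leaving (?P<old>{_NAME}) "
    rf"to join (?P<new>{_NAME}) as (?P<position>[a-z][a-z ]*[a-z])"
)


def extract_job_move(sentence: str) -> Optional[JobMove]:
    """Return a JobMove if the sentence fits the pattern, otherwise None."""
    match = MOVE_PATTERN.search(sentence)
    if match is None:
        return None
    return JobMove(match["person"], match["old"], match["new"], match["position"])


# Invented news sentence: the pattern applies and yields a structured record.
record = extract_job_move(
    "Anna Berg is leaving Nordbank to join Alfa Invest as chief financial officer."
)

# Illustrative biomedical sentence: the same pattern has nothing to match.
protein = extract_job_move("RAD51 interacts with BRCA2 in human cells.")
```

Moving such a system to the protein domain would mean writing new patterns by hand, which is the adaptation cost that the machine learning approach tries to reduce.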

An established way of approaching the problem of adapting information extraction systems to new domains is to build their components using machine learning, i.e. computer programs that can learn. Machine learning largely depends on there being examples from which to learn. A component in an extraction system needs to see examples of the phenomenon it is going to learn to identify, e.g. entities and the relationships between them. This type of machine learning therefore requires access to large quantities of examples. Producing good examples is a major challenge, however: it is laborious and time-consuming, and it requires a person who knows the domain well to mark up examples in texts.

Recognising names of, for example, individuals, companies and locations is fundamental to information extraction. Once names have been recognised, we can also start to look for relationships, expressed in the text, between the bearers of those names.

In his thesis, Fredrik Olsson describes the development and evaluation of a method, called BootMark, for marking up occurrences of names in text documents. BootMark reduces the number of documents that a human annotator needs to mark up in order to train a name recognizer that performs as well as, or better than, a name recognizer trained on a random selection of documents from the same corpus.
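The thesis title refers to active machine learning. The following Python sketch shows the general idea of one common form of it, pool-based uncertainty sampling, in which the documents the current model is least sure about are sent to the annotator first. It is not BootMark itself: the toy `NameScorer` model, its two features, the `annotate` stand-in for the human annotator and the tiny corpus are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Document = List[str]
Annotated = List[Tuple[str, bool]]  # (token, is_name)


def features(token: str) -> List[str]:
    """Tiny hypothetical feature set: capitalisation and a two-letter suffix."""
    return ["cap" if token[:1].isupper() else "nocap", "suffix=" + token[-2:].lower()]


class NameScorer:
    """Toy model: averages smoothed per-feature rates of 'is a name'."""

    def __init__(self) -> None:
        self.counts: Dict[str, List[int]] = defaultdict(lambda: [0, 0])  # [non-name, name]

    def fit(self, annotated: Annotated) -> None:
        self.counts.clear()
        for token, is_name in annotated:
            for f in features(token):
                self.counts[f][int(is_name)] += 1

    def prob_name(self, token: str) -> float:
        rates = []
        for f in features(token):
            non, name = self.counts.get(f, (0, 0))
            rates.append((name + 1) / (non + name + 2))  # add-one smoothing
        return sum(rates) / len(rates)


def doc_uncertainty(model: NameScorer, doc: Document) -> float:
    """Mean token uncertainty: 1.0 when the model is at 0.5, 0.0 when it is certain."""
    scores = [1 - 2 * abs(model.prob_name(t) - 0.5) for t in doc]
    return sum(scores) / max(len(doc), 1)


def select_documents(
    pool: List[Document],
    annotate: Callable[[Document], Annotated],
    seed_size: int,
    budget: int,
) -> Tuple[NameScorer, List[int]]:
    """Annotate a seed, then repeatedly ask for the most uncertain document."""
    annotated: Annotated = []
    for doc in pool[:seed_size]:
        annotated.extend(annotate(doc))
    remaining = list(range(seed_size, len(pool)))
    chosen: List[int] = []
    model = NameScorer()
    for _ in range(min(budget, len(remaining))):
        model.fit(annotated)
        pick = max(remaining, key=lambda i: doc_uncertainty(model, pool[i]))
        remaining.remove(pick)
        chosen.append(pick)
        annotated.extend(annotate(pool[pick]))  # the human annotator's work
    model.fit(annotated)
    return model, chosen


# Stand-in for the human annotator: looks tokens up in a hidden gold list.
GOLD_NAMES = {"Olsson", "Berg", "Gothenburg"}


def annotate(doc: Document) -> Annotated:
    return [(token, token in GOLD_NAMES) for token in doc]


pool = [
    ["Olsson", "works", "here"],
    ["the", "thesis", "is", "new"],
    ["Berg", "visits", "Gothenburg"],
    ["it", "is", "laborious"],
]
model, chosen = select_documents(pool, annotate, seed_size=1, budget=2)
```

A random-selection baseline would replace the `max(...)` line with a random choice from `remaining`; comparing the two at the same annotation budget is the kind of evaluation the paragraph above describes.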

Title of the thesis: Bootstrapping Named Entity Annotation by Means of Active Machine Learning. A method for creating corpora.
The thesis will be publicly defended on Friday 19 December at 1.15 pm
Location: Lilla hörsalen, Humanisten, Renströmsgatan 6
For further information contact Fredrik Olsson, mobile: +46 (0)704-15 54 10, e-mail: fredriko@sics.se
Contact person: Barbro Ryder Liljegren, Faculty of Arts, University of Gothenburg, tel. +46 (0)31-786 48 65, e-mail: barbro.ryder@hum.gu.se

Eva Lundgren | idw
Further information:
http://www.vr.se
