Classic grammar model can be used for computerised parsing

One application generates the queries that a digital text can answer as soon as the text is opened; these queries can then be used to search the text for specific information.

Language researcher Kenneth Wilhelmsson has developed a new method which interprets the grammatical structure of a text, known as parsing, with the help of a computer program.

The method builds on Danish linguist Paul Diderichsen’s traditional sentence structure, which has been adopted for the description of all the Nordic languages and is found in most modern Swedish grammar books.

“The grammatical analysis in the program is performed mostly at the main clause level, which can be seen as a big advantage, as the task is then less complex but still gives usable results,” explains Wilhelmsson at the University of Gothenburg.

Instead of performing the entire analysis in one go, the approach consists of a series of steps, each of which can be performed with a high level of accuracy. At the main clause level, it is primarily the finite verb and other single-word sentence elements that are identified. This, in turn, paves the way for identifying the complex sentence elements (subject, object/predicative and adverbial), which can rely on exclusion and similar rules of thumb (heuristics) rather than on an explicit, complete grammatical description.
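The stepwise idea can be illustrated with a minimal Python sketch. It is not Wilhelmsson's implementation: the tag names (`VFIN`, `SADV`, `VINF`), the `Token` type and the `analyse_main_clause` function are assumptions made up for this example, and the input is assumed to be already tagged by some other tool. The sketch first locates the finite verb, then picks out single-word elements, and finally leaves the remaining word spans as candidates for the complex elements, found by exclusion.

```python
from dataclasses import dataclass

@dataclass
class Token:
    word: str
    tag: str  # hypothetical coarse tag, assumed to come from an external tagger

# Hypothetical tags for single-word elements and the field each one fills.
SINGLE_WORD_TAGS = {
    "SADV": "sentence_adverbial",  # e.g. "inte" (not)
    "VINF": "nonfinite_verb",      # e.g. "läst" (read, supine)
}

def analyse_main_clause(tokens):
    """Return a rough field analysis of one main clause, or None."""
    # Step 1: locate the finite verb of the main clause.
    finite_idx = next(
        (i for i, t in enumerate(tokens) if t.tag == "VFIN"), None
    )
    if finite_idx is None:
        return None

    analysis = {
        # Everything before the finite verb is treated as the foundation.
        "foundation": [t.word for t in tokens[:finite_idx]],
        "finite_verb": tokens[finite_idx].word,
        "single_word": [],
        "unassigned_spans": [],
    }

    # Step 2: pick out single-word elements after the finite verb.
    # Step 3: whatever lies between them is left, by exclusion, as
    # candidate spans for subject, object/predicative or adverbial.
    span = []
    for t in tokens[finite_idx + 1:]:
        if t.tag in SINGLE_WORD_TAGS:
            analysis["single_word"].append((SINGLE_WORD_TAGS[t.tag], t.word))
            if span:
                analysis["unassigned_spans"].append(span)
                span = []
        else:
            span.append(t.word)
    if span:
        analysis["unassigned_spans"].append(span)
    return analysis

# "Igår har Kalle inte läst boken" (Yesterday Kalle has not read the book)
example = [
    Token("Igår", "ADV"), Token("har", "VFIN"), Token("Kalle", "PM"),
    Token("inte", "SADV"), Token("läst", "VINF"), Token("boken", "NN"),
]
print(analyse_main_clause(example))
```

In this toy example the two leftover spans, "Kalle" and "boken", are never described by an explicit grammar rule; they surface only because everything around them has already been identified. Deciding which span is the subject and which is the object would take further heuristics that the sketch does not attempt.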

Kenneth Wilhelmsson’s newly developed method can also be used by language researchers to search for instances of different grammatical phenomena, which can be described in a more refined fashion than with word and string matching.

Wilhelmsson’s work on the thesis also included the creation of various prototype applications which build on this type of analysis. One of them is a unique system for automatic generation of queries from a Swedish text.

The program has access to the Swedish Wikipedia’s article database and can be used to generate queries when a text is opened. When the user begins to type a query, the text is completed automatically, and only queries that can actually be answered may be asked.
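A rough sketch of how completion restricted to answerable queries might work is given below. It assumes the query-answer pairs have already been generated from the parsed text; the `QueryCompleter` class and the sample pairs are hypothetical and written by hand for illustration, and nothing here reflects how the actual prototype stores or generates its queries.

```python
from bisect import bisect_left

class QueryCompleter:
    """Prefix completion over a fixed set of answerable queries."""

    def __init__(self, answerable):
        # Map each lower-cased query to its answer and keep the
        # queries sorted so that a prefix can be found by bisection.
        self._answers = {q.lower(): a for q, a in answerable.items()}
        self._queries = sorted(self._answers)

    def complete(self, prefix, limit=5):
        """Return up to `limit` stored queries starting with `prefix`."""
        prefix = prefix.lower()
        start = bisect_left(self._queries, prefix)
        matches = []
        # In a sorted list, all strings sharing a prefix are adjacent.
        for query in self._queries[start:]:
            if not query.startswith(prefix) or len(matches) >= limit:
                break
            matches.append(query)
        return matches

    def answer(self, query):
        """Return the stored answer, or None if the query is unknown."""
        return self._answers.get(query.lower())

# Hand-written pairs standing in for automatically generated ones.
completer = QueryCompleter({
    "Who developed the method?": "Kenneth Wilhelmsson",
    "Which linguist's sentence schema does the method build on?":
        "Paul Diderichsen",
})
print(completer.complete("wh"))
print(completer.answer("Who developed the method?"))
```

Because suggestions are drawn only from the stored set, a user who follows them cannot end up with a query the text is unable to answer.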

“This is intended as an alternative to most other modern query programs where the user cannot know whether a query can actually be answered by the knowledge base at all, and where variations in the formulation of the query may mean that information that is there is missed,” explains Wilhelmsson.

Title of thesis: Heuristic Analysis with Diderichsen's Sentence Schema – Applications for Swedish Text
Author: Kenneth Wilhelmsson, tel: +46 31 408 211
E-mail: kw@ling.gu.se
Link to thesis: http://hdl.handle.net/2077/22028

Media Contact

Helena Aaberg idw
