New tool for building thesauruses

Tesaurvai can extract, annotate and organize specialized terms taken from a collection of digitized texts. Tesaurvai complies with the ISO thesaurus building standard and was developed by the VAI in conjunction with the Spanish National Research Council’s Institute of Documentary Studies on Science and Technology (formerly CINDOC).

Euralex is Europe’s most influential lexicographical congress. The InfoLex research group, based at the Universidad Pompeu Fabra’s College of Applied Linguistics, is organizing the 2008 event, which will bring together professional lexicographers, publishers, researchers, specialists and anyone with an interest in dictionaries of any kind.

2 in 1

Tesaurvai’s key innovation is that it combines, in a single tool, a terminology extractor capable of ranking and selecting terms of one to ten words with ISO standard-compliant thesaurus building capabilities. The extractor identifies the terms in digital texts that are to be transferred to the thesaurus builder. The resulting thesaurus is a systematized list of terms representative of a domain.
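The extraction step described above can be illustrated with a minimal Java sketch. It is not Tesaurvai's code: the class name `CandidateTerms`, the frequency-based ranking and the tokenization rule are all assumptions made for illustration, and the article does not say how the real extractor ranks or filters terms. The sketch counts every sequence of one to ten words and keeps those that recur; for simplicity it lets sequences run across sentence boundaries.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration, not part of Tesaurvai.
class CandidateTerms {
    // Upper bound taken from the article: terms of one to ten words.
    static final int MAX_WORDS = 10;

    /** Counts every word sequence of 1..MAX_WORDS words found in the text. */
    static Map<String, Integer> countCandidates(String text) {
        // Assumed tokenization: lower-case, split on anything that is not a letter or digit.
        List<String> words = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        Map<String, Integer> counts = new HashMap<>();
        for (int start = 0; start < words.size(); start++) {
            StringBuilder candidate = new StringBuilder();
            for (int len = 1; len <= MAX_WORDS && start + len <= words.size(); len++) {
                if (len > 1) {
                    candidate.append(' ');
                }
                candidate.append(words.get(start + len - 1));
                counts.merge(candidate.toString(), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Keeps candidates seen at least minCount times, most frequent first, ties alphabetical. */
    static List<String> rank(Map<String, Integer> counts, int minCount) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= minCount) {
                selected.add(entry.getKey());
            }
        }
        selected.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return selected;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                countCandidates("Cultural heritage search. Cultural heritage documents.");
        for (String term : rank(counts, 2)) {
            System.out.println(term + " (" + counts.get(term) + ")");
        }
    }
}
```

A real extractor would normally add stop-word filtering and linguistic or statistical scoring before handing candidates to the thesaurus builder; raw frequency is used here only to keep the example short.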

Tesaurvai conforms to international thesaurus building and management standards and has several uses. First, the tool can build thesauruses from scratch, from information extraction through to term creation, editing and annotation. It makes it easy to establish relationships between terms and to run basic and advanced word searches. Second, Tesaurvai can import text thesauruses and export them to XML files. Finally, it can build alphabetical and systematized indices, which can be output for printing or exported as reports.
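As a rough picture of what establishing relationships between terms and building an alphabetical index involve, the following Java sketch stores terms with the relation labels conventionally used in thesauruses (BT for broader term, NT for narrower term, RT for related term). The class `MiniThesaurus` and its methods are hypothetical; the article does not describe Tesaurvai's data model or database schema.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration, not part of Tesaurvai.
class MiniThesaurus {
    // Conventional thesaurus relation labels: broader, narrower, related term.
    enum Relation { BT, NT, RT }

    // term -> relation -> related terms; sorted maps and sets keep the index alphabetical.
    private final Map<String, Map<Relation, Set<String>>> entries = new TreeMap<>();

    private void link(String from, Relation relation, String to) {
        entries.computeIfAbsent(from, k -> new TreeMap<>())
               .computeIfAbsent(relation, k -> new TreeSet<>())
               .add(to);
    }

    /** Records a hierarchical pair; the reciprocal relation is stored automatically. */
    void addHierarchy(String broader, String narrower) {
        link(narrower, Relation.BT, broader);
        link(broader, Relation.NT, narrower);
    }

    /** Records a symmetric associative relation between two terms. */
    void addAssociation(String a, String b) {
        link(a, Relation.RT, b);
        link(b, Relation.RT, a);
    }

    /** Returns the alphabetical index as printable lines, one term or relation per line. */
    List<String> alphabeticalIndex() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Map<Relation, Set<String>>> entry : entries.entrySet()) {
            lines.add(entry.getKey());
            for (Map.Entry<Relation, Set<String>> rel : entry.getValue().entrySet()) {
                for (String target : rel.getValue()) {
                    lines.add("  " + rel.getKey() + " " + target);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        MiniThesaurus thesaurus = new MiniThesaurus();
        thesaurus.addHierarchy("heritage", "cultural heritage");
        thesaurus.addAssociation("cultural heritage", "museums");
        thesaurus.alphabeticalIndex().forEach(System.out::println);
    }
}
```

Storing each reciprocal link at insertion time keeps the index consistent: a term listed as broader always shows the matching narrower entry.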

Available as of 2008

The tool was developed in Java and stores its data in a database. Tesaurvai is compatible with any database manager that supports Java Database Connectivity (JDBC).

It was developed as part of the “Cultural heritage document search based on multilingual technical resources” (Patrilex) project, supported by the Ministry of Education with the aim of generating a methodology and tools for building multilingual lexical resources.

Tesaurvai is now undergoing extensive testing. From July 2008 it will be available to any Internet user.

Media Contact

Eduardo Martínez alfa
