New tool for building thesauruses

Tesaurvai extracts, annotates and organizes specialized terms taken from a collection of digitized texts. It complies with the ISO thesaurus building standard and was developed by the VAI in conjunction with the Spanish National Research Council’s Institute of Documentary Studies on Science and Technology (formerly CINDOC).

Euralex is Europe’s most influential lexicographical congress. The InfoLex research group, based at the Universidad Pompeu Fabra’s College of Applied Linguistics, is organizing the 2008 event, which will bring together professional lexicographers, publishers, researchers, specialists and anyone with an interest in dictionaries of any kind.

2 in 1

Tesaurvai’s key innovation is that it combines two capabilities in a single tool: a terminology extractor that can rank and select terms of one to ten words, and an ISO-compliant thesaurus builder. The extractor identifies terms in digital texts and passes them to the thesaurus builder. The resulting thesaurus is a systematized list of terms that are representative of a domain.
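As an illustration of what extracting terms of one to ten words can involve, the sketch below counts every word sequence of a given length range in a text and ranks the candidates by frequency. It is not Tesaurvai's code: the class `TermCandidates`, its methods and the frequency-based ranking are assumptions made for this example, and a real extractor would also apply linguistic filters and respect sentence boundaries.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; names and ranking rule are assumptions, not Tesaurvai's API.
final class TermCandidates {

    /** Counts every sequence of minWords..maxWords consecutive words in the text. */
    static Map<String, Integer> countCandidates(String text, int minWords, int maxWords) {
        // Lowercase, then split on anything that is not a letter or a digit.
        String[] raw = text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        List<String> words = new ArrayList<>();
        for (String w : raw) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int start = 0; start < words.size(); start++) {
            for (int len = minWords; len <= maxWords && start + len <= words.size(); len++) {
                String candidate = String.join(" ", words.subList(start, start + len));
                counts.merge(candidate, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Keeps candidates seen at least minCount times, most frequent first, ties alphabetical. */
    static List<String> rank(Map<String, Integer> counts, int minCount) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.removeIf(e -> e.getValue() < minCount);
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> ranked = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            ranked.add(e.getKey());
        }
        return ranked;
    }

    public static void main(String[] args) {
        String text = "The thesaurus builder imports terms. The thesaurus builder exports terms.";
        Map<String, Integer> counts = countCandidates(text, 1, 10);
        for (String term : rank(counts, 2)) {
            System.out.println(counts.get(term) + "  " + term);
        }
    }
}
```

Selected candidates would then be handed to the thesaurus builder for annotation; raw frequency is only the simplest possible ranking criterion.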

Tesaurvai conforms to international thesaurus building and management standards and can be used in several ways. First, the tool can build thesauruses from scratch, from information extraction through to term creation, editing and annotation. It makes it easy to establish relationships between terms and to run basic and advanced word searches. Second, it can import text thesauruses and export them to XML files. Finally, it can build alphabetical and systematized indices, which can be printed or exported as reports.
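To make the ideas of term relationships and an alphabetical index concrete, here is a minimal sketch of a thesaurus held in memory. It uses the conventional relation tags BT (broader term), NT (narrower term) and RT (related term) found in thesaurus standards. The class `MiniThesaurus`, its methods and the sample terms are hypothetical and are not taken from Tesaurvai, whose data model the article does not describe.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch; not Tesaurvai's data model.
final class MiniThesaurus {

    /** Conventional thesaurus relation tags: broader, narrower, related. */
    enum Relation { BT, NT, RT }

    // term -> relation -> linked terms; both levels are kept in alphabetical order.
    private final Map<String, Map<Relation, Set<String>>> entries =
            new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    void addTerm(String term) {
        entries.computeIfAbsent(term, t -> new EnumMap<>(Relation.class));
    }

    private void link(String from, Relation relation, String to) {
        addTerm(from);
        entries.get(from)
               .computeIfAbsent(relation, r -> new TreeSet<>(String.CASE_INSENSITIVE_ORDER))
               .add(to);
    }

    /** Records a hierarchical link and its reciprocal. */
    void addBroader(String narrower, String broader) {
        link(narrower, Relation.BT, broader);
        link(broader, Relation.NT, narrower);
    }

    /** Records a symmetric associative link. */
    void addRelated(String a, String b) {
        link(a, Relation.RT, b);
        link(b, Relation.RT, a);
    }

    /** Renders every term alphabetically, followed by its tagged relations. */
    String alphabeticalIndex() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Map<Relation, Set<String>>> entry : entries.entrySet()) {
            out.append(entry.getKey()).append('\n');
            for (Map.Entry<Relation, Set<String>> rel : entry.getValue().entrySet()) {
                for (String target : rel.getValue()) {
                    out.append("  ").append(rel.getKey()).append(' ').append(target).append('\n');
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        MiniThesaurus thesaurus = new MiniThesaurus();
        thesaurus.addBroader("fresco", "painting");
        thesaurus.addRelated("fresco", "mural");
        System.out.print(thesaurus.alphabeticalIndex());
    }
}
```

Storing each link together with its reciprocal keeps the index consistent whichever term a reader looks up first.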

Available as of 2008

The tool was developed in Java and stores its data in a database. Tesaurvai is compatible with any database manager that supports Java Database Connectivity (JDBC).

It was developed as part of the “Cultural heritage document search based on multilingual technical resources” (Patrilex) project, supported by the Ministry of Education with the aim of generating a methodology and tools for building multilingual lexical resources.

Tesaurvai is now undergoing extensive testing. From July 2008 it will be available to any Internet user.

Media Contact

Eduardo Martínez alfa
