Forum for Science, Industry and Business

A rose is a rózsa is a 薔薇: Image-search tool speaks hundreds of languages

14.09.2007
From the fall of the Tower of Babel to the Esperanto global language movement, many humans have dreamed of sharing a common tongue. Despite the Internet's promise of global communication, language barriers remain. Even pictures on the Web get lost in translation.

"Images are universal, but image search is not," said Oren Etzioni, a professor of computer science and engineering at the University of Washington. "A person who types his or her search in English won't find images tagged in Chinese, and a Dutch person won't find images tagged in English. We've created a collaborative tool that solves this problem."

A new multilingual search tool developed at the UW's Turing Center makes the universal appeal of pictures available to all. PanImages, presented today at the Machine Translation Summit in Copenhagen, Denmark, allows people to search for images on the Web using hundreds of languages.

Search engines such as Google look for images by detecting the search term in captions and other nearby text. But because the process matches the literal string of characters that was typed, the results are limited to pages written in the language of the query.

The new tool is named PanImages, from the Greek prefix, "pan," meaning whole or all-inclusive. It automatically translates the search term into about 300 other languages, suggests a few that might work and then displays images from Google and the online photo database Flickr.
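The difference between plain string matching and a translated search can be shown with a minimal sketch. It is not PanImages' actual code: the captions, file names and the tiny translation table below are hypothetical, made up purely for illustration, and the real tool queries Google and Flickr rather than a local dictionary.

```python
# Hypothetical captions in three languages (illustrative data only).
CAPTIONS = {
    "img1.jpg": "A red rose in the garden",
    "img2.jpg": "Piros rózsa a kertben",
    "img3.jpg": "庭の赤い薔薇",
}

# Hypothetical translation table; the real tool covers about 300 languages.
TRANSLATIONS = {"rose": ["rózsa", "薔薇"]}


def search(term, captions):
    """Return images whose caption contains the literal search string."""
    needle = term.casefold()
    return sorted(name for name, text in captions.items()
                  if needle in text.casefold())


def expanded_search(term, captions, translations):
    """Search for the term and for each of its known translations."""
    hits = set()
    for candidate in [term] + translations.get(term, []):
        hits.update(search(candidate, captions))
    return sorted(hits)


plain = search("rose", CAPTIONS)
expanded = expanded_search("rose", CAPTIONS, TRANSLATIONS)
```

In this toy data the plain search finds only the English caption, while the expanded search also reaches the Hungarian and Japanese ones.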

PanImages promises to help people who speak languages that have a small Web presence. Imagine you are a Zulu speaker looking for a picture of a refrigerator, Etzioni said. You type the Zulu word for refrigerator ("ifriji") into an image search and get two results. The same search using PanImages generates 472,000 hits. In a test of so-called minor languages, PanImages was able to find 57 times more results, on average, than a Google image search.

"We want to serve the vast number of people who don't speak one of the major languages," Etzioni said. "As the Internet becomes more widely available outside of the major industrialized nations, it becomes increasingly important to serve people who don't speak English, French or Chinese."

Even people who speak these more common languages can benefit by switching electronic tongues. Words that have more than one meaning inevitably produce unwanted results. For instance, typing the word "spring" in an English-language image search generates diverse images: grassy meadows, metal coils and pictures from the town of Silver Spring, Md. If you want images of a metal spring, you might use PanImages and search for the more precise French word "ressort." If you want a picture of a rectangular bar and don't want businesses where patrons drink alcohol, you might try the Russian word "брусок." Experiments showed that, for common languages, PanImages nearly doubles the number of correct images on the first 15 pages of results.

PanImages' powerful brains were created by scanning more than 350 machine-readable online dictionaries. Some of these were "wiktionaries," online multilingual dictionaries written by volunteers. The PanImages software scans these dictionaries and uses an algorithm to check the accuracy of the results. It then assembles the results in a matrix that allows translation in combinations that may never have been attempted -- for instance, from Gujarati to Lithuanian.
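One way to picture how merged dictionaries can yield a language pair that no single dictionary covers is to chain entries through shared words. The sketch below is an assumption-laden illustration, not the Turing Center's algorithm: the entries are hypothetical toy data, the function names are invented here, and the accuracy check the article mentions is left out entirely.

```python
from collections import defaultdict

# Hypothetical dictionary entries: each pair links two (language, word) items.
# No entry connects Zulu ("zu") directly to Lithuanian ("lt").
ENTRIES = [
    (("zu", "ifriji"), ("en", "refrigerator")),
    (("en", "refrigerator"), ("fr", "réfrigérateur")),
    (("en", "refrigerator"), ("lt", "šaldytuvas")),
]


def build_graph(entries):
    """Merge all dictionary entries into one undirected translation graph."""
    graph = defaultdict(set)
    for left, right in entries:
        graph[left].add(right)
        graph[right].add(left)
    return graph


def translate(graph, source, target_lang, max_hops=3):
    """Breadth-first search for words in target_lang within max_hops links."""
    frontier, seen, found = [source], {source}, set()
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, ()):
                if neighbor in seen:
                    continue
                seen.add(neighbor)
                if neighbor[0] == target_lang:
                    found.add(neighbor[1])
                else:
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return found


graph = build_graph(ENTRIES)
result = translate(graph, ("zu", "ifriji"), "lt")
```

Chaining through a word with several meanings, such as "spring," can drift to the wrong sense, which is presumably why the real system checks the accuracy of its results; this sketch makes no such check.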

"It's an unprecedented lexical resource. The most distinguishing element is its ability to scale to such a broad set of languages," Etzioni said. "Our goal is to ultimately cover all the languages people are interested in."

Free online translation services used by Yahoo! and Google incorporate just one or two dozen common languages. In the United States, research on machine translation tends to focus on languages with military importance, such as Arabic and Chinese, Etzioni said. PanImages had 50 languages earlier this year and by June it incorporated 100 languages. It now includes some 300 languages, 2.5 million words and millions of individual translations.

PanImages also lets people instantly add new words or translations.

Future work on PanImages will scour more online dictionaries to expand the number of words and languages it can handle. Researchers also hope to translate the words used in tagging sites, such as del.icio.us, where visitors use single-word labels to describe the page's content.

"Our goal is to promote pan-lingual translation," said Etzioni. "With this first step, we've created a service we hope will be a handy tool."

Hannah Hickey | EurekAlert!
Further information:
http://www.washington.edu
