A rose is a rózsa is a 薔薇: Image-search tool speaks hundreds of languages

14.09.2007
From the fall of the Tower of Babel to the Esperanto global language movement, many humans have dreamed of sharing a common tongue. Despite the Internet's promise of global communication, language barriers remain. Even pictures on the Web get lost in translation.

"Images are universal, but image search is not," said Oren Etzioni, a professor of computer science and engineering at the University of Washington. "A person who types his or her search in English won't find images tagged in Chinese, and a Dutch person won't find images tagged in English. We've created a collaborative tool that solves this problem."

A new multilingual search tool developed at the UW's Turing Center makes the universal appeal of pictures available to all. PanImages, presented today at the Machine Translation Summit in Copenhagen, Denmark, allows people to search for images on the Web using hundreds of languages.

Search engines such as Google look for images by detecting the search term in captions and other nearby text. But because the process matches a literal string of characters, the results are limited to pages written in the language of the search term.

The new tool is named PanImages, from the Greek prefix, "pan," meaning whole or all-inclusive. It automatically translates the search term into about 300 other languages, suggests a few that might work and then displays images from Google and the online photo database Flickr.

PanImages promises to help people who speak languages that have a small Web presence. Imagine you are a Zulu speaker looking for a picture of a refrigerator, Etzioni said. You type the Zulu word for refrigerator ("ifriji") into an image search and get two results. The same search using PanImages generates 472,000 hits. In a test of so-called minor languages, PanImages was able to find 57 times more results, on average, than a Google image search.

"We want to serve the vast number of people who don't speak one of the major languages," Etzioni said. "As the Internet becomes more widely available outside of the major industrialized nations, it becomes increasingly important to serve people who don't speak English, French or Chinese."

Even people who speak these more common languages can benefit by switching electronic tongues. Words that have more than one meaning inevitably produce unwanted results. For instance, typing the word "spring" in an English-language image search generates diverse images: grassy meadows, metal coils and pictures from the town of Silver Spring, Md. If you want images of a metal spring, you might use PanImages and search for the more precise French word "ressort." If you want a picture of a rectangular bar and don't want businesses where patrons drink alcohol, you might try the Russian word "брусок." Experiments showed that, for common languages, PanImages nearly doubles the number of correct images on the first 15 pages of results.

PanImages' powerful brains were created by scanning more than 350 machine-readable online dictionaries. Some of these were "wiktionaries," online multilingual dictionaries written by volunteers. The PanImages software scans these dictionaries and uses an algorithm to check the accuracy of the results. It then assembles the results in a matrix that allows translation in combinations that may never have been attempted -- for instance, from Gujarati to Lithuanian.
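To make the idea of combining dictionaries concrete, here is a minimal Python sketch of how translations from separate bilingual dictionaries might be chained through shared words. It is an illustration under stated assumptions, not the PanImages implementation: the toy entries, the graph representation and the names `ENTRIES`, `build_graph` and `translate` are all hypothetical, and the accuracy-checking step mentioned above is left out.

```python
from collections import defaultdict, deque

# Toy entries standing in for scanned dictionaries (illustrative only).
# Each entry links two (language, word) pairs listed as translations.
ENTRIES = [
    (("en", "spring"), ("fr", "ressort")),
    (("en", "spring"), ("fr", "printemps")),
    (("fr", "ressort"), ("ru", "пружина")),
    (("en", "refrigerator"), ("zu", "ifriji")),
    (("en", "refrigerator"), ("fr", "réfrigérateur")),
]


def build_graph(entries):
    """Build an undirected graph whose nodes are (language, word) pairs."""
    graph = defaultdict(set)
    for a, b in entries:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def translate(graph, source, target_lang, max_hops=2):
    """Collect words in target_lang reachable from source within max_hops.

    A breadth-first search lets a Zulu word reach French through an
    English entry even though no Zulu-French dictionary was scanned.
    """
    seen = {source}
    queue = deque([(source, 0)])
    found = set()
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor in seen:
                continue
            seen.add(neighbor)
            if neighbor[0] == target_lang:
                found.add(neighbor[1])
            queue.append((neighbor, hops + 1))
    return found


graph = build_graph(ENTRIES)
print(sorted(translate(graph, ("zu", "ifriji"), "fr")))   # ['réfrigérateur']
print(sorted(translate(graph, ("ru", "пружина"), "fr", max_hops=3)))
# ['printemps', 'ressort']
```

The second call shows why an accuracy check matters: allowing a third hop routes the Russian word for a metal spring through the ambiguous English "spring" and picks up the French word for the season as well.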

"It's an unprecedented lexical resource. The most distinguishing element is its ability to scale to such a broad set of languages," Etzioni said. "Our goal is to ultimately cover all the languages people are interested in."

Free online translation services used by Yahoo! and Google incorporate just one or two dozen common languages. In the United States, research on machine translation tends to focus on languages with military importance, such as Arabic and Chinese, Etzioni said. PanImages had 50 languages earlier this year and by June it incorporated 100 languages. It now includes some 300 languages, 2.5 million words and millions of individual translations.

PanImages also lets people instantly add new words or translations.

Future work on PanImages will scour more online dictionaries to expand the number of words and languages it can handle. Researchers also hope to translate the words used in tagging sites, such as del.icio.us, where visitors use single-word labels to describe the page's content.

"Our goal is to promote pan-lingual translation," said Etzioni. "With this first step, we've created a service we hope will be a handy tool."

Hannah Hickey | EurekAlert!
Further information:
http://www.washington.edu
