
Researchers Teach Medical Search Engines to Learn Slang

18.11.2010

Medical websites like WebMD give consumers more access than ever before to comprehensive health and medical information, but the sites become less useful when users describe conditions in unclear or unorthodox language in a site search.

 

However, a group of Georgia Tech researchers has created a machine-learning model that enables the sites to “learn” dialect and other medical vernacular, thereby improving their performance for users who search in such language.


Called “diaTM” (short for “dialect topic modeling”), the system learns by comparing multiple medical documents written in different levels of technical language. By comparing enough of these documents, diaTM eventually learns which medical conditions, symptoms and procedures are associated with certain dialectal words or phrases, thus shrinking the “language gap” between consumers with health questions and the medical databases they turn to for answers.

“The language gap problem seems to be the most acute in the medical domain,” said Hongyuan Zha, professor in the School of Computational Science & Engineering and a paper co-author. “Providing a solution for this domain will have a high impact on maintaining and improving people’s health.”

To educate diaTM in various modes of medical language, lead author Steven Crain and his fellow researchers pulled publicly available documents not only from WebMD but also from Yahoo! Answers, PubMed Central, the Centers for Disease Control & Prevention website, and other sources. After processing enough documents, Crain said, diaTM can learn that the word “gunk,” for example, is often a vernacular term for “discharge,” and it can then appropriately process user searches that include the word “gunk.”

In this initial study using small-scale experiments, the researchers found that diaTM can achieve a 25 percent improvement in nDCG (“normalized discounted cumulative gain”), a standard measure of how well a search ranks relevant results near the top of its list. Zha, whose research focuses on Internet search engines and their related algorithms, said a 5 percent improvement in nDCG is “very significant.”
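For readers unfamiliar with the metric, the following is a minimal sketch of how nDCG is commonly computed for a single query. The relevance grades and the two example rankings are hypothetical and are not taken from the study; the sketch uses the plain-gain form of the formula, and other variants exist.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain for relevance grades listed in ranked order."""
    # The result at rank r contributes its grade divided by log2(r + 1),
    # so relevant results count for more when they appear near the top.
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical relevance grades (0 = irrelevant, 3 = highly relevant)
# for the top five results of the same query under two rankings.
baseline = [0, 1, 3, 0, 2]
improved = [3, 1, 2, 0, 0]

print(f"baseline nDCG: {ndcg(baseline):.3f}")
print(f"improved nDCG: {ndcg(improved):.3f}")
```

A ranking that moves the most relevant results toward the top scores closer to 1.0, which is the kind of change the reported improvement reflects.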

“DiaTM figures out enough language relationships that over time it does quite well,” said Steven Crain, Ph.D. student in computer science and lead author of the paper that describes diaTM. “Another benefit is we’re not doing word-for-word equivalencies, so ‘gunk’ doesn’t necessarily have to be connected to ‘discharge,’ as long as it’s recognized that ‘gunk’ is related to infections.”
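The idea that “gunk” can be related to infections without a word-for-word mapping can be illustrated with a toy example. The sketch below is not diaTM, which is a topic model; it swaps in a much simpler co-occurrence comparison, and the five documents are invented for illustration. Two words count as related when they tend to appear alongside the same other words, even if they never appear together.

```python
import math
from collections import Counter, defaultdict

# Hypothetical mini-corpus mixing consumer and clinical phrasing.
docs = [
    "yellow gunk coming from my eye is it an infection",
    "eye infection symptoms include discharge and redness",
    "gunk in my ear after swimming could this be an infection",
    "ear infection may cause discharge and pain",
    "knee pain after running",
]

def cooccurrence_vectors(documents):
    """Map each word to counts of the other words sharing a document with it."""
    vectors = defaultdict(Counter)
    for doc in documents:
        words = set(doc.split())
        for word in words:
            for other in words:
                if other != word:
                    vectors[word][other] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

vectors = cooccurrence_vectors(docs)
related = cosine(vectors["gunk"], vectors["discharge"])
unrelated = cosine(vectors["gunk"], vectors["running"])
print(f"gunk vs. discharge: {related:.2f}")
print(f"gunk vs. running:   {unrelated:.2f}")
```

In this toy corpus “gunk” and “discharge” never share a document, yet they score as more similar to each other than “gunk” and “running” do, because both keep company with “infection,” “eye” and “ear.”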

Also, diaTM is not limited to medical search; it is a machine-learning technique that would work equally well in any topic-related search. In addition to approaching websites about incorporating diaTM into their search engines, Crain said one next step is to develop the model so that it can learn dialects by looking at patterns that do not make sense from a topical perspective. For example, using a similar algorithm he was able to automatically discover dialects including a text-speak dialect (e.g. “b4” as a substitute for “before”), but the dialects were mixed in with topically related groups of words.

“We’re trying to get to where you can isolate just the dialects,” Crain said.

“This feature will help common users of medical websites,” Zha said. “It will help enable consumers with a relatively low level of health literacy to access the critical medical information they need.”

DiaTM is described in the paper, “Dialect Topic Modeling for Improved Consumer Medical Search,” to be presented by Crain at the American Medical Informatics Association Annual Symposium, Nov. 17 in Washington, D.C. Crain’s coauthors include Hongyuan Zha, professor in the School of Computational Science & Engineering; Shuang-Hong Yang, a Ph.D. student in Computational Science and Engineering; and Yu Jiao, research scientist at Oak Ridge National Laboratory (ORNL). The research was conducted with partial funding from ORNL, Microsoft and Hewlett-Packard.

About the Georgia Tech College of Computing
The Georgia Tech College of Computing is a national leader in the creation of real-world computing breakthroughs that drive social and scientific progress. With its graduate program ranked 10th nationally by U.S. News and World Report, the College’s unconventional approach to education is defining the new face of computing by expanding the horizons of traditional computer science students through interdisciplinary collaboration and a focus on human-centered solutions. For more information about the Georgia Tech College of Computing, its academic divisions and research centers, please visit http://www.cc.gatech.edu.

Michael Terrazas | Source: Newswise Science News
Further information: www.gatech.edu
