However, a group of Georgia Tech researchers has created a machine-learning model that enables the sites to “learn” dialect and other medical vernacular, thereby improving their performance for users who use such language themselves.
Called “diaTM” (short for “dialect topic modeling”), the system learns by comparing multiple medical documents written at different levels of technical language. By comparing enough of these documents, diaTM eventually learns which medical conditions, symptoms and procedures are associated with certain dialectal words or phrases, thus shrinking the “language gap” between consumers with health questions and the medical databases they turn to for answers.
“The language gap problem seems to be the most acute in the medical domain,” said Hongyuan Zha, professor in the School of Computational Science & Engineering and a paper co-author. “Providing a solution for this domain will have a high impact on maintaining and improving people’s health.”
To educate diaTM in various modes of medical language, Crain and his fellow researchers pulled publicly available documents not only from WebMD but also from Yahoo! Answers, PubMed Central, the Centers for Disease Control & Prevention website, and other sources. After processing enough documents, he said, diaTM can learn that the word “gunk,” for example, is often a vernacular term for “discharge,” and it can then appropriately process user searches that include the word “gunk.”
In this initial study using small-scale experiments, the researchers found that diaTM can achieve a 25 percent improvement in nDCG (“normalized discounted cumulative gain”), a standard information-retrieval metric that scores how well a ranked list of search results places the most relevant items near the top. Zha, whose research focuses on Internet search engines and their related algorithms, said a 5 percent improvement in nDCG is “very significant.”
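For readers unfamiliar with the metric, the following is a minimal sketch of how nDCG is commonly computed. It is not taken from the paper: the function names and the relevance grades are illustrative assumptions, and it uses the linear-gain form of DCG (other variants, such as an exponential gain, are also in use).

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: each result's relevance grade is
    divided by log2(rank + 1), so hits lower in the list count less."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """Normalize DCG by the DCG of the ideal (best possible) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical relevance grades (0 = irrelevant, 3 = highly relevant)
# for the top four results returned by two imaginary search systems.
baseline_ranking = [0, 2, 1, 3]   # best document buried at rank 4
improved_ranking = [3, 2, 1, 0]   # best document ranked first

print(f"baseline nDCG: {ndcg(baseline_ranking):.3f}")
print(f"improved nDCG: {ndcg(improved_ranking):.3f}")
```

A ranking that places the most relevant documents first scores 1.0; pushing them further down the list lowers the score, which is why gains in nDCG reflect better-ordered results.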
“DiaTM figures out enough language relationships that over time it does quite well,” said Steven Crain, Ph.D. student in computer science and lead author of the paper that describes diaTM. “Another benefit is we’re not doing word-for-word equivalencies, so ‘gunk’ doesn’t necessarily have to be connected to ‘discharge,’ as long as it’s recognized that ‘gunk’ is related to infections.”
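The idea that “gunk” need only be tied to a topic, rather than to an exact synonym, can be illustrated with a toy co-occurrence count. This is only a sketch under stated assumptions: diaTM itself is a topic model, whereas the code below is a much simpler stand-in, and the documents, stopword list and function name are all hypothetical.

```python
from collections import Counter

# Hypothetical toy corpus mixing consumer and technical phrasing.
DOCS = [
    "my eye has yellow gunk and it itches, is it an infection",
    "gunk coming out of my ear after an ear infection",
    "purulent discharge is a common sign of bacterial infection",
    "ocular discharge may indicate conjunctival infection",
    "my knee hurts when i run",
]

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"my", "has", "and", "it", "is", "an", "of", "a",
             "may", "out", "after", "when", "i"}

def cooccurring_words(docs, term):
    """Count the non-stopword terms that appear in the same document as `term`."""
    counts = Counter()
    for text in docs:
        words = set(text.replace(",", " ").split()) - STOPWORDS
        if term in words:
            counts.update(words - {term})
    return counts

gunk_context = cooccurring_words(DOCS, "gunk")
discharge_context = cooccurring_words(DOCS, "discharge")

# Words that both terms co-occur with hint at a shared topic, even though
# "gunk" and "discharge" never appear together in any document.
shared_context = set(gunk_context) & set(discharge_context)
print(shared_context)
```

In this toy corpus the vernacular and technical terms never share a document, yet both co-occur with “infection,” which is the kind of topical link, rather than a word-for-word mapping, that Crain describes.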
Also, diaTM is not limited to medical search; it is a machine-learning technique that would work equally well in any topic-related search. In addition to approaching websites about incorporating diaTM into their search engines, Crain said one next step is to develop the model so that it can learn dialects by looking at patterns that do not make sense from a topical perspective. For example, using a similar algorithm he was able to automatically discover dialects, including a text-speak dialect (e.g., “b4” as a substitute for “before”), but the dialects were mixed in with topically related groups of words.
“We’re trying to get to where you can isolate just the dialects,” Crain said.
“This feature will help common users of medical websites,” Zha said. “It will help enable consumers with a relatively low level of health literacy to access the critical medical information they need.”
DiaTM is described in the paper, “Dialect Topic Modeling for Improved Consumer Medical Search,” to be presented by Crain at the American Medical Informatics Association Annual Symposium, Nov. 17 in Washington, D.C. Crain’s coauthors include Hongyuan Zha, professor in the School of Computational Science & Engineering; Shuang-Hong Yang, a Ph.D. student in Computational Science and Engineering; and Yu Jiao, research scientist at Oak Ridge National Laboratory (ORNL). The research was conducted with partial funding from ORNL, Microsoft and Hewlett-Packard.
Michael Terrazas | Newswise Science News