Data Mining Is Shaping Our World. Are We Ready for It?

In a July cover article titled “The Web Means the End of Forgetting,” the New York Times Magazine explored how the Internet is changing privacy. Recently, the Wall Street Journal reported on Google’s “soul-searching” over the question of how far to go in “profiting from its crown jewels — the vast trove of data it possesses about people’s activities.”

Rohini Srihari, who teaches classes on Web search and mining and is founder and CEO of Janya Inc., a text analysis/text mining firm, understands the potential of data mining — and the complicated concerns it raises.

Rohini K. Srihari
Associate Professor of Computer Science and Engineering
University at Buffalo School of Engineering and Applied Sciences

Why is Google’s data trove so valuable?

Srihari: The sheer volume of data that they have is powerful: there’s so much of it, and it’s so diverse. It reflects the voice of consumers, the voices of citizens, the voices of people across countries. One way they can exploit this information is through usage mining, which is tracking how people are using the Internet. They know what people are querying. Google has access to all sorts of information that marketers would love to get their hands on. When people query a brand, for instance, what are they querying for? Google was able to spot outbreaks of flu-like illnesses before government agencies could, because government agencies rely on traditional reporting, waiting for hospitals to send in statistics, whereas Google relies on queries. They know what people are querying for and where those queries are coming from.

What other companies or organizations are investing in data mining on the Web, and why?

Srihari: Practically everyone. The telecoms, credit card agencies, major retailers, airlines, e-commerce providers like Amazon — all of these entities are engaged in data mining. One emerging technology is socially targeted advertising. Companies that provide this service analyze the browsing patterns of brand loyalists, identify Internet users with similar browsing patterns, and use that information to target advertising. The success stories of companies attracting new customers through socially targeted advertising are amazing.
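As a rough illustration of the idea Srihari describes, the sketch below flags users whose browsing overlaps with that of known brand loyalists. It is a minimal example under stated assumptions, not a description of how any advertising company actually works: the Jaccard overlap measure, the 0.5 threshold, the function names and the example.com-style site names are all hypothetical choices made for illustration.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Fraction of sites two users share, from 0.0 (none) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def lookalikes(loyalists: dict[str, set[str]],
               candidates: dict[str, set[str]],
               threshold: float = 0.5) -> list[str]:
    """Return candidate IDs whose browsing overlaps any loyalist's by at least threshold."""
    return sorted(
        user
        for user, sites in candidates.items()
        if any(jaccard(sites, seed) >= threshold for seed in loyalists.values())
    )


# Hypothetical data: sets of sites each user has visited.
loyalists = {"loyal_1": {"runnersworld.example", "trailmaps.example", "shoes.example"}}
candidates = {
    "user_a": {"runnersworld.example", "trailmaps.example", "news.example"},
    "user_b": {"cooking.example", "news.example"},
}

targets = lookalikes(loyalists, candidates)
print(targets)
```

Here user_a shares two of four distinct sites with the loyalist and is selected for the ad, while user_b shares none and is not.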

What are some interesting challenges that researchers and companies face when mining data on the Web?

Srihari: The No. 1 challenge is balancing privacy with data mining. We’ve come to a stage where we do less than we can for fear of spooking the public. How do you gain enough information to help a retailer without creating a backlash? You don’t want people to feel like you’re invading their privacy. There are technical challenges, like making sense of text with multiple languages or spelling mistakes, but it’s achieving that balance between data mining and privacy that is the No. 1 challenge.

What are some potential public benefits that could come from data mining?

Srihari: Data mining has the potential for making a serious impact on societal problems. Trends emerge quickly on the Web, and that can be used in an advantageous way. Google’s ability to spot outbreaks of flu-like activity is one example. Law enforcement is another. We’ve heard that gang members often post on their Facebook pages what they did, so law enforcement agents frequently go and look at Facebook to glean additional information. In local communities, if the volume of communication or chatter about some topic increases to a certain level (maybe roads need fixing or there’s a dangerous traffic light), public officials might take notice.

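The "chatter reaches a certain level" idea can be sketched as a simple count against a threshold. This is only an illustration: it assumes each post has already been labeled with a topic, and the function name, the sample posts and the threshold of three mentions are hypothetical.

```python
from collections import Counter


def topics_needing_attention(post_topics: list[str], min_mentions: int) -> list[str]:
    """Return topics mentioned at least min_mentions times, most mentioned first."""
    counts = Counter(post_topics)
    return [topic for topic, n in counts.most_common() if n >= min_mentions]


# Hypothetical topic labels, one per community post.
posts = ["pothole", "traffic light", "pothole", "pothole", "traffic light", "parade"]

flagged = topics_needing_attention(posts, min_mentions=3)
print(flagged)
```

With these sample posts, only the pothole topic reaches three mentions and is flagged. A real system would also have to extract topics from free text, which is the harder problem.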
How might data mining affect the average Internet user?

Srihari: We’re going to see more of this socially targeted advertising, and it might start making people wonder, “How did they know that I was interested in traveling to Peru, or that I was looking to buy this thing?” It’s one thing when you’re doing a Google search and you see some advertising appear on the side. It’s quite another thing when you’re reading the newspaper online, and you suddenly see an ad that’s targeted specifically at you that’s unrelated to the content on the page. As people become more aware of how much their Internet activities reveal, they may become more wary about the way they communicate. We’re going to see more debate about privacy.

The University at Buffalo is a premier research-intensive public university, a flagship institution in the State University of New York system and its largest and most comprehensive campus. UB’s more than 28,000 students pursue their academic interests through more than 300 undergraduate, graduate and professional degree programs. Founded in 1846, the University at Buffalo is a member of the Association of American Universities.

Media Contact

Charlotte Hsu Newswise Science News

More Information:

http://www.buffalo.edu
