Search engine experts look forwards to completely digital lives and backwards to Washington’s letters
A conference at the University of Sheffield is set to celebrate ten years since the first web search engines. It will also reveal some of the capabilities of the search engines of the future, and how our use of computers will lead to new ways of archiving and retrieving information. Presentations at the conference will cover ways to store and search through every personal document we have ever received, and another paper will use George Washington’s letters to demonstrate a new search system that can analyse and sift through handwritten pages, including historical texts, which have notoriously ornate handwriting.
Mark Sanderson, of the Department of Computer Science at the University of Sheffield, is hosting the conference. He explains, “It is particularly apt that we should be celebrating ten years of search engines here, as we believe that the first web search engine was British. It was called Jumpstation and was created at the University of Stirling and released in early 1994.”
The keynote speaker for the conference is Gordon Bell, from the Microsoft Bay Area Research Centre. His particular area of interest is a project called MyLifeBits, which looks at how our information needs will change as computer hard discs allow us to store our whole lives on our PC.
Dr Bell explains, “Within five years PCs will be large enough to store everything we read, write, hear and see including video, images and emails.”
“MyLifeBits looks at how we can digitally store everything – from financial records to the books we read and the songs we listen to, photos, telephone calls and even the web pages we have visited.
“This has great implications for allowing us to move to a truly paperless society, but also means that trawling through swathes of our own personal data will become very difficult without an effective system in place.
“MyLifeBits works on the premise that you don’t think about personal information in the same way as you do other people’s data, so a traditional search engine approach would be unhelpful for personal information. For example, you may think of a personal document in relation to the time you received it (maybe around the same time as an important life-changing event), rather than in keywords. MyLifeBits will hopefully provide the flexibility to search in more creative ways.”
Researchers from the Center for Intelligent Information Retrieval at the University of Massachusetts will be presenting a paper entitled, “A Search Engine for Historical Manuscript Pages.”
Raghavan Manmatha will be demonstrating a new search tool that allows historical, handwritten texts to be searched without expensive and time-consuming manual annotation.
He explains, “At the moment historical texts can only be accessed digitally as image files, and searching them involves somebody inputting key search terms in a way that the computer can understand. This is a time-consuming and expensive process, and it limits the number of historical documents we can access electronically.
“We have developed a system whereby the computer ‘learns’ the handwriting after you transcribe a small proportion of the pages.
“We have used George Washington’s letters to develop the system, and after transcribing 100 pages of his writing, the computer was able to search 987 pages of text with a much lower error rate than a traditional handwriting recogniser.
“This work not only has relevance to historical works but could also eventually come into common usage so that individuals can scan and search their entire record of handwritten information, for example letters, lecture notes or brainstorming notes.”
Lorna Branton | alfa