Making video easier to search and find
Imagine a computer system that can automatically search through videos of football matches and pull out all the shots on goal or all the fouls.
Creating the elements that make such a system possible is a key result from the IST BUSMAN project. The current generation of computer systems is excellent at searching for and manipulating text, as the spectacular success of Google has shown. However, computers are now routinely used to store and process more than just text: videos of football matches, for example. Handling multimedia content such as video footage is far harder than handling text, where particular words and phrases can be searched for.
The fast growth of such ‘unstructured’ digital content and the lack of efficient tools to handle it motivated several European research institutions and key industrial players in the multimedia field, including Motorola and BTexact Technologies, to set up the BUSMAN project. Coordinated by Professor Ebroul Izquierdo of the EE Multimedia and Vision Lab at Queen Mary University of London, it finished in December 2004. In its final review the project was rated as among the very best research and development projects funded by the European Union’s Fifth Framework Programme.
The BUSMAN partners created a content management system and a series of tools, all based on the MPEG-7 standard, to enable search, retrieval and delivery of video content from both PCs and mobile devices. In all, the project achieved 28 important results in crucial areas of advanced media processing.
Driven by user needs
Simon Waddington of Motorola Labs in the UK emphasises the lengths to which the project partners went to ensure that they started from user needs. “We started at the very beginning by talking to end-users to find out what they really wanted. The two applications we eventually chose to demonstrate the technology were football and tourism. So we talked to football fans and asked them what they remembered about specific matches. Then we talked to the content providers and tried to understand their work flow for processing content, and the tools they’d need.”
Rather than attempt to service the major broadcasting organisations, “BUSMAN is ideally suited for smaller content providers, who have limited resources for managing and annotating their content,” he says. “BUSMAN also allows the ordinary user equipped with a standard PC or mobile phone easy access to video content.”
Potential applications for the technology are manifold, he believes. The content can be sports matches, music videos, city guides, history simulations for architectural sites or even instruction videos on how to repair a piece of equipment. Whatever the application, BUSMAN’s ability to manage large collections of videos and pick out specific video segments makes it ideal for the purpose.
Variety of annotation and search methods
What is unique about the system, claims Waddington, is the variety of categorisation and search methods that can be used to label and find video content. “The content can be labelled and searched semantically using keywords or free text. But the user can also perform query-by-example, by providing a query image and searching for similar images in the database,” he says. The researchers developed a method called ‘relevance feedback’, which allows the user to provide feedback on the relevance of the retrieved images to their original search.
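As a rough illustration of query-by-example with relevance feedback, the sketch below ranks images by cosine similarity to a query feature vector, then moves the query towards images the user marks as relevant and away from those marked irrelevant. It uses the classic Rocchio update, which is an assumption made for illustration; the article does not say which method BUSMAN used. The three-number feature vectors, the image names and the weights `alpha`, `beta` and `gamma` are all hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, database):
    """Return image names ordered from most to least similar to the query."""
    return sorted(database, key=lambda name: cosine(query, database[name]),
                  reverse=True)


def centroid(vectors, dim):
    """Component-wise mean of a list of vectors (zeros if the list is empty)."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio(query, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio update: pull the query towards relevant examples and push it
    away from irrelevant ones. The weights are illustrative defaults."""
    dim = len(query)
    pos = centroid(relevant, dim)
    neg = centroid(irrelevant, dim)
    return [alpha * q + beta * p - gamma * n
            for q, p, n in zip(query, pos, neg)]


# Hypothetical image features, e.g. proportions of three dominant colours.
database = {
    "grass_pitch": [0.9, 0.1, 0.0],
    "crowd":       [0.2, 0.7, 0.1],
    "goal_mouth":  [0.6, 0.1, 0.3],
    "studio":      [0.0, 0.2, 0.8],
}

query = [0.3, 0.6, 0.1]        # features of the user's example image
first = rank(query, database)  # initial results shown to the user

# The user marks "goal_mouth" as relevant and "crowd" as irrelevant.
refined = rocchio(query,
                  relevant=[database["goal_mouth"]],
                  irrelevant=[database["crowd"]])
second = rank(refined, database)
```

With these toy values the crowd image heads the first result list, while after one round of feedback the goal-mouth image moves to the top and the crowd image drops down the ranking.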
“Given a ninety-minute football match,” he says, “we developed a tool that could annotate the most interesting shots in the video. What you find is that, when a goal is scored, it is accompanied by rapid changes in camera shots and a rise in background noise from the spectators. Our tool can use this information to select highlights from the video, which saves the annotator much time and effort. It can even predict what is actually happening in a particular football sequence such as a foul.”
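The two cues Waddington mentions, rapid changes of camera shot and a rise in crowd noise, can be combined in a very simple detector. The sketch below is a minimal illustration under stated assumptions, not the project's tool: it assumes shot-cut timestamps and a per-window audio energy value are already available, and it flags a window as a candidate highlight only when both signals are unusually high for that match. The function names, the 10-second window and the threshold are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardise values so each is measured in standard deviations
    from the mean of the whole match."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def cuts_per_window(cut_times, n_windows, window):
    """Count how many shot cuts fall inside each time window."""
    counts = [0] * n_windows
    for t in cut_times:
        index = int(t // window)
        if 0 <= index < n_windows:
            counts[index] += 1
    return counts


def find_highlights(cut_times, audio_energy, window=10.0, threshold=1.0):
    """Return start times (in seconds) of windows where both the shot-cut
    rate and the audio energy are more than `threshold` standard
    deviations above their match-wide averages."""
    cuts = cuts_per_window(cut_times, len(audio_energy), window)
    z_cuts = zscores(cuts)
    z_audio = zscores(audio_energy)
    return [i * window
            for i, (c, a) in enumerate(zip(z_cuts, z_audio))
            if c > threshold and a > threshold]


# Toy data: eight 10-second windows of crowd-noise energy and a list of
# shot-cut timestamps in seconds.
audio_energy = [0.2, 0.2, 0.3, 0.9, 0.3, 0.2, 0.2, 0.8]
cut_times = [5, 25, 31, 33, 35, 37, 39, 55, 72]

highlights = find_highlights(cut_times, audio_energy)
```

In this toy data the window starting at 30 seconds has both a burst of cuts and loud crowd noise, so it is flagged. The window starting at 70 seconds is loud but has only one cut, so it is not. Requiring both cues is a design choice that trades missed events for fewer false alarms.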
Take the case of a typical football supporter, Mike. He spends Saturday afternoon watching his team, Manchester United, play Liverpool on TV. Manchester United win 3-2, with a particularly good goal by Roy Keane from an overhead kick. That evening he goes out with friends, and a discussion about the day’s match ensues. Using the BUSMAN application, he enters the team names and the date of play on his mobile phone and chooses the ‘goal highlights’ video for Manchester United. The group watches the build-up, the goal and the celebration for each of the goals. A dispute arises about a disallowed goal. A slow-motion video clip of the crucial moment is passed around the group, enabling each person to decide whether the referee’s decision was the right one.
The BUSMAN system was also designed to help with the management of intellectual property rights, in the area of usage monitoring in particular. “The creator can insert labels into the video content that are almost imperceptible using advanced watermarking techniques developed during the project,” Waddington says. “We used a binary form of ‘Digital Item Identifier’. When the user views the video, the system can extract the Digital Item Identifier and thus provide a link to a rights management page. It’s a very good way of associating metadata with content.”
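The idea of hiding a binary identifier in the picture and reading it back to find a rights page can be shown with a toy example. The sketch below writes the bits of an identifier into the least significant bit of successive pixel values, so no pixel changes by more than one grey level. This is only an assumption made for illustration: least-significant-bit embedding is far more fragile than the watermarking techniques developed in the project, which the article does not describe. The identifier value, the 32-bit length and the rights-page URL are hypothetical.

```python
def embed_identifier(pixels, identifier, n_bits=32):
    """Hide an n_bits-wide identifier in the least significant bits of the
    first n_bits pixel values (8-bit greyscale assumed)."""
    if len(pixels) < n_bits:
        raise ValueError("frame too small to hold the identifier")
    if not 0 <= identifier < (1 << n_bits):
        raise ValueError("identifier does not fit in n_bits")
    marked = list(pixels)
    for i in range(n_bits):
        bit = (identifier >> (n_bits - 1 - i)) & 1   # most significant first
        marked[i] = (marked[i] & ~1) | bit           # overwrite the lowest bit
    return marked


def extract_identifier(pixels, n_bits=32):
    """Read the identifier back from the lowest bit of each pixel."""
    value = 0
    for p in pixels[:n_bits]:
        value = (value << 1) | (p & 1)
    return value


# Hypothetical lookup from identifier to rights management page.
RIGHTS_PAGES = {
    0xB05A2004: "https://rights.example.org/items/b05a2004",
}

frame = [(37 * i) % 256 for i in range(64)]   # stand-in for one video frame
marked = embed_identifier(frame, 0xB05A2004)

recovered = extract_identifier(marked)
rights_url = RIGHTS_PAGES[recovered]
largest_change = max(abs(a - b) for a, b in zip(frame, marked))
```

Here `recovered` equals the embedded identifier and `largest_change` is at most 1, which is why the mark is hard to see. A practical scheme would also need to survive compression and editing, which this sketch does not attempt.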
By way of example, Gita, creative director of a production company, wants to do a commercial that includes a tiger jumping into shallow water. The client is not going to fund a live video shoot, so she uses the BUSMAN system to search for existing video material and its likely cost. The query ‘tiger jumping into water’ returns several shots of tigers, some of which look promising. Full screen previews let her assess what the shots look like. While none are perfect, one is close. Gita selects this clip and asks the system to look for ‘similar’ shots. By repeated searches using the best current match, two candidate shots are finally selected.
Gita then enters a detailed description of the intended use, and instructs BUSMAN to search for the rights. Contact is established with the rights holders via email to discuss licensing and fees. After the contracts are signed, BUSMAN transfers master-quality copies to a download server, while download authorisation data is sent on a secure channel. Once the download is complete, notification that the shots have been downloaded is stored in the system and sent to the rights holders.
Raising interest among potential users
Waddington believes that thanks to the amount of user research and feedback that has taken place in the project, the resulting system could be put into use quite easily. Motorola demonstrated the system at the IBC exhibition in 2004, as well as at a number of other international exhibitions.
BUSMAN technology has now been integrated into the educational programmes of project partner Queen Mary University of London. The project results have also led the BUSMAN partners to make a series of recommendations on European standards, which will help make knowledge of the project’s advances available to the wider IT community.
On the topic of commercial products, however, the partners are playing their cards close to their chests: they are not saying at present whether they plan to commercialise the technology. Still, as Waddington admits, “the first thing we have to do is to generate interest amongst our customers. And we have had some very positive feedback.”
In all, the BUSMAN project was an outstanding success, and as a result we can look forward to being able to routinely and easily search through archives of videos for particular events of interest just as we can already search for Web pages about a given topic.
Tara Morris | alfa