Search engine technology is in a state of flux as it digs ever deeper for new meaning. Europe is poised to reap the benefits of the new age of semantic search thanks to the work of European researchers.
‘Search’ is the gateway to the web: it keeps internet traffic moving and provides the maps and the shortcuts through the enormous tangle of the World Wide Web.
But while there is a phenomenal amount of content, most of it is not that easy to find. Sure, text content can be skimmed or glanced at, but audiovisual content has to be viewed in linear time. We cannot easily search inside a film or audio recording for relevant information.
That is changing, and one European project has created the first integrated platform for semantic search that can return results based on the content and context of film and audio files, as well as text.
Not the end for keywords
This is not the end of keyword search – the standard technology that we use every day – but it could well be the beginning of the end.
For instance, try to compose a meaningful query, such as “effects of military action in civil population”. Traditional search engines will return results that match the individual keywords entered. A semantic engine, like MESH, will analyse the query first and then return results relevant to the actual meaning of the query.
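To make the contrast concrete, here is a minimal sketch in Python. It is a toy illustration only, not the MESH approach: the three sample headlines, the hand-made `CONCEPTS` table and both scoring functions are assumptions invented for this example. The keyword scorer counts shared words, while the concept scorer first maps words to broader concepts and then counts shared concepts.

```python
# Toy contrast between keyword matching and concept-based matching.
# The documents, the CONCEPTS table and both scoring functions are
# illustrative assumptions; none of them come from the MESH platform.

DOCS = {
    "a": "Army shelling leaves thousands of residents homeless",
    "b": "Stock market action: population of traders in civil dispute",
    "c": "Military parade draws crowds",
}

# Hypothetical hand-made mapping from surface words to broader concepts.
CONCEPTS = {
    "military": "armed_forces", "army": "armed_forces",
    "action": "attack", "shelling": "attack",
    "civil": "civilians", "population": "civilians", "residents": "civilians",
    "effects": "impact", "homeless": "impact",
}

STOPWORDS = {"of", "in", "the", "a"}


def tokens(text):
    """Lowercase the text, strip basic punctuation and drop stopwords."""
    words = (w.strip(".,:;").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def keyword_score(query, doc):
    """Count the words that the query and the document share."""
    return len(tokens(query) & tokens(doc))


def concepts(text):
    """Map each known word in the text to its concept."""
    return {CONCEPTS[w] for w in tokens(text) if w in CONCEPTS}


def concept_score(query, doc):
    """Count the concepts that the query and the document share."""
    return len(concepts(query) & concepts(doc))


def rank(query, score):
    """Return document ids ordered from best to worst match."""
    return sorted(DOCS, key=lambda d: score(query, DOCS[d]), reverse=True)


QUERY = "effects of military action in civil population"

if __name__ == "__main__":
    print("keyword ranking:", rank(QUERY, keyword_score))
    print("concept ranking:", rank(QUERY, concept_score))
```

In this sketch the headline about shelling shares no words with the query, so the keyword scorer ranks it last, while the concept scorer ranks it first. A real semantic engine would need far more than a lookup table: the toy still gives the stock market headline some credit because it maps words to concepts without considering context.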
The EU-funded MESH project sought to integrate state-of-the-art semantic search technologies, and all the tools needed to make them work together, in a single working platform. But while the team’s achievements are impressive, there is still some way to go before the technology is ready for universal search by everyday surfers like you and me.
Still, the platform proves the technology in two restricted news domains – natural disasters, and civil unrest and street violence – and it has led to many useful, working applications and potential commercialisation opportunities.
“We developed a manual annotation tool to create manageable annotations for all types of media, and it is a very strong program that is easy to use,” explains Pedro Concejero, coordinator of the MESH project. This tool could become a commercial product, he predicts.
The search for relevance
One partner of the project, Deutsche Welle, a German TV station, created a dossier-developing tool called Full Story. This remarkable program can help a video editor link to video, audio and text relevant to a particular topic.
The editor can then assemble these diverse elements into a dossier. For example, a dossier about flooding might assemble media outlining the mechanics of flooding, the impact of changing weather patterns, and the effect on lowland and populous areas.
TV stations produce this type of feature all the time, and it can typically take days of sorting through media archives to find useful material for a compelling dossier.
But with the Full Story program, an editor can perform the same task in hours. The editor is also much more likely to find compelling and visually interesting material, because most of the time is spent sorting through relevant results rather than searching a vast warehouse for relevant material.
“Deutsche Welle is currently evaluating the future prospects of Full Story with further extensive user testing, a comprehensive technology implementation plan and an outline concerning potential commercialisation,” notes Concejero.
Annotating user-generated content
User-generated content is another area that could benefit from the work of the MESH consortium in the short to medium term. User-generated content is a huge element of Web 2.0 applications – it is the material that makes sites like YouTube, Flickr, Facebook and Twitter so popular.
The MESH project’s automated annotation tool was central to the platform’s success, and it could be developed to work with user-generated content.
“Here at my company, Telefónica, we are very interested in developing semantic search and annotation for user-generated content on mobile phones, but more work would need to be done on the technology developed in MESH to make it ready for that sort of application,” reveals Concejero.
That may be the work of another project. The consortium has just put the final touches on MESH, but Concejero says that some of the partners may go forward with another project in the future. In the meantime, response from peers and industry to the work of the MESH project has been encouraging.
Above all, the MESH project demonstrates that semantic search for all media types is possible and automation is improving rapidly. It’s not quite there yet, but the search continues. You can already check out the MESH prototype.
The MESH integrated project received funding from the ICT strand of the EU’s Sixth Framework Programme for research.
This is the second of a two-part special feature on the MESH project.
Media note: This feature can be republished without charge provided ICT Results is acknowledged as the source at the top or the bottom of the story. You must request permission before you use any of the photographs on the site. If you do republish, we would be grateful if you could link back to the ICT Results site (http://cordis.europa.eu/ictresults). Let us know if you republish so as to help us provide you with a better service. If you want further contact information on any of the projects cited in this story please contact us.