This site has been archived on
The Community Research and Development Information Service - CORDIS
Project factsheets will no longer be updated. All information relevant to the project can be found on the CORDIS factsheet, which is updated regularly with public deliverables and related material.

TrendMiner - Large-scale, Cross-lingual Trend Mining and Summarisation of Real-time Media Streams

287863 - STREP

At a glance

FP7-ICT-2011-7 - Language technologies

  • Duration: 36 months
  • Start date: 1 November 2011
  • End date: 31 October 2014
  • Project officer: Susan Fraser

Challenge

The recent massive growth in online media and the rise of user-authored content (e.g. weblogs, Twitter, Facebook) have led to challenges in accessing and interpreting these strongly multilingual data in a timely, efficient, and affordable manner. Scientifically, streaming online media pose new challenges due to their shorter, noisier, and more colloquial nature. Moreover, they form a temporal stream strongly grounded in events and context. Consequently, existing language technologies fall short on accuracy, scalability, and portability.

Goal

The goal of this project is to deliver innovative, portable, open-source, real-time methods for cross-lingual mining and summarisation of large-scale stream media. TrendMiner will achieve this through an interdisciplinary approach, combining deep linguistic methods from text processing, knowledge-based reasoning from web science, machine learning, economics, and political science. No expensive human-annotated data will be required, because time-series data (e.g. financial markets, political polls) will be used as a proxy. A key novelty will be weakly supervised machine learning algorithms for the automatic discovery of new trends and correlations. Scalability and affordability will be addressed through a cloud-based infrastructure for real-time text mining from stream media.

Innovation

A main innovation is the development of novel multilingual ontology-based extraction methods capable of analysing the shorter, colloquial, noisy, and contextualised social media streams. Another innovative contribution will be the integration of opinion and trend elements in ontologies. This will be supported by semi-automatic lexical and terminological acquisition methods, applied to existing multilingual knowledge resources and unstructured documents. An important part of the knowledge modelling and ontology population process will be innovative merging algorithms for dealing with opinions, with a special focus on provenance and stream reasoning. This will be an ongoing process of building and maintaining a persistent knowledge base that is updated over time as new media arrive.

The result

The TrendMiner partners aim to deliver several types of results to the R&D community at large:

  • Models and Approaches: the outcome of the TrendMiner project will be novel models and approaches for combining multilingual text processing, extra-linguistic knowledge, and time-series machine learning models, in order to detect and track events, trends, and sentiment in stream media.
  • Methods and Technologies: the outcome of TrendMiner will be open-source algorithms for real-time analysis and summarisation of multilingual media streams.
  • Infrastructure, tools and applications: TrendMiner will deliver a cloud-based platform for real-time stream media collection, analysis and summarisation. Algorithms and technologies will be made available as scalable web services running on the TrendMiner platform.
  • Case studies: two case studies will provide demonstrated deployments in financial decision support and political science.

Impact

In TrendMiner, R&D work is guided and validated by two use cases:

  • Multilingual Trend Mining and Summarisation for Financial Decision Support, led by the partner Eurokleis
  • Multilingual Public Spheres: Political Trends and Summaries, led by the partner SORA

These use cases will help us to measure the impact of the project in the broader field of sentiment analysis in social media.

Co-ordinator

Contact Person:

Name: Dr. OLTHOFF Walter

Tel: +49 631205755000

E-mail: Walter_Gerhard.Olthoff@dfki.de

Organisation: DEUTSCHES FORSCHUNGSZENTRUM FUER KUENSTLICHE INTELLIGENZ GMBH

This page is maintained by: Susan Fraser (email removed)