Web and text mining through the Online Analyst
(Alessandro Zanasi, IBM)
While the amount of data available to us through the Web and intranets keeps increasing, our capacity to read and analyze this information remains constant. Search engines, instead of reducing the problem, aggravate it by making more and more documents quickly available to us. Web (and text) mining is a new research area that tries to solve the information overload problem by exploiting recent advances in different fields of technology: language technology and data mining technology. Documents and web pages are a source of knowledge in an unstructured data format that can be decoded, analyzed, and turned into actionable intelligence through online analysis by text mining.
This paper presents an example of a real, operational application to competitive intelligence (CI): the Online Analyst. The ultimate objective of this application is to give end users an intelligent agent that reads and quickly analyzes huge volumes of documents retrieved online, especially from the web (almost 5000 press sources and more than 70 web sites). Its typical users are intelligence analysts in the military, business, or political fields.
The most exciting conclusion is that the approach and the prototype, shown here in a CI application case, can already be readily used in any e-commerce application (many are currently under development worldwide) that needs to work with web and/or text data: e-commerce marketplaces, CRM, competitive intelligence.
- Presentation slides
- e-mail: zanasia@it.ibm.com, a_zanasi@yahoo.it
- URL: http://open.cineca.it/datamining