Mining for information

Databases, text archives, corporate networks and the Web contain endless information. However, filtering important data or even new findings out of these masses is difficult. Semantic techniques and visual analyses can help to maintain a clear overview in the data jungle. Fraunhofer researchers will be presenting several new techniques at CeBIT in Hanover from March 4 through 9.

You don’t have to know everything – you just need to know where to find it. So goes the popular saying. But that is becoming increasingly difficult in today’s age of digital information. Knowledge is now distributed worldwide, so how can you find the information you seek? Searching the Internet or a corporate network often provides little help. You either receive thousands of hits or none at all. The problem is that search programs can only understand individual terms, and cannot grasp the relationship between different words, let alone the meaning, i.e. the semantics, of whole sentences. If you enter the word ‘web’ into Google, Yahoo or MSN, it makes no difference whether you mean a spider’s web, the World Wide Web, or a woven fabric.

Wikinger – for faster and more efficient research

In the collaborative project ‘Wikinger’, which is being led by the Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS in Sankt Augustin, computer scientists, engineers and historians are working together to give search applications a better understanding of textual content. To this end, they are using techniques for obtaining knowledge through interrelationships within a document – such as the knowledge that the term ‘web’ can be associated with ‘new media’, ‘nature’ or ‘textiles’. The knowledge platform can then semi-automatically develop semantic networks of its own accord, making it easier for users to search for specific information. “This technology is suitable for searching text archives such as those belonging to newspaper publishers,” explains IAIS project manager Lars Bröcker. “It is also ideal for any tasks that require searching large databases or linking multimedia-based data in order to obtain new, additional information.”

ConWeaver – for better orientation in corporate databases

The ConWeaver search engine, which was developed by a working group led by Dr. Thomas Kamps at the Fraunhofer Institute for Computer Graphics Research IGD in Darmstadt, is tailored to the problems of large and medium-sized companies. In many businesses, staff waste valuable working time sifting through customer, supplier and specialist databases or text documents in search of specific information. The ConWeaver search engine (www.conweaver.de) can automatically link heterogeneous corporate know-how and make it available for use in business processes. A single entry is enough, and the software searches all the different data sources in a company. ConWeaver not only includes the term entered by the user but also its translation in other languages and any given thematic relationships in the search process. On the basis of company data, the engine automatically generates a semantic knowledge network which enables it to recognize that the word ‘customer’ in the sales database, for example, is synonymous with the German word ‘Kunde’ in the e-mail archive and ‘orderer’ in the project documents. “Unlike conventional search engines, ConWeaver establishes a connection between different data formats,” says Thomas Kamps. “This means that the software can efficiently search both structured and unstructured information sources.”

Visual analytics – reaping the fruits of knowledge

Being able to find the right information in large volumes of data is one thing. Presenting it in a user-friendly way is a challenge of quite a different order. The more extensive the information, the more difficult it is to maintain a clear overview. A team led by Dr. Jörn Kohlhammer at the IGD is combining automatic data analysis with novel methods of visualization. To this end, the researchers are using the various capabilities of computers and people. The computer’s task is to sequentially process large volumes of data and convert them into a visual form of representation that people can perceive and understand. The user can then focus on recognizing patterns, and evaluating and analyzing the observed data. “This process involves very close interaction between man and computer,” says Jörn Kohlhammer, “but man always has priority. It is always the user who decides, not the system.”

Visualizations of this type are of particular interest to financial service providers, for example. The data they have to deal with are usually so substantial and varied that it is impossible to carry out conclusive analyses in a short space of time. Visual representations help to make things clearer. If an evaluation of corporate shareholder structures, for example, is presented on the screen in an intuitively comprehensible form instead of in dry numerical tables, the analyst can quickly and accurately draw conclusions from it. Other visualization techniques make it possible to observe the stock quotations of numerous companies simultaneously, and to draw conclusions from previous developments. This often visually highlights correlations that would otherwise be lost in a maze of numbers.

The new data mining and visual analytics techniques will be presented by researchers at the Fraunhofer stand in Hall 9, Stand B36.
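
As a rough illustration of the term linking described for ConWeaver, the following Python sketch expands a single search entry with linked terms and then looks through several differently named data sources. Everything in it is an assumption made for this example: the hand-written `CONCEPTS` dictionary, the sample records in `SOURCES`, and the function names `expand` and `search`. The article does not describe ConWeaver's actual implementation, which is said to generate its semantic network automatically from company data.

```python
# Hypothetical illustration only; not ConWeaver's data model or API.

# A tiny hand-written "semantic network": one concept and the terms the
# article names as synonymous ('customer', 'Kunde', 'orderer').
CONCEPTS = {
    "customer": {"customer", "kunde", "orderer"},
}

# Made-up stand-ins for three heterogeneous company data sources.
SOURCES = {
    "sales_database": ["Customer 4711 placed a repeat order"],
    "email_archive": ["Der Kunde bittet um ein Angebot"],
    "project_documents": [
        "The orderer approved the specification",
        "Internal meeting notes",
    ],
}


def expand(term):
    """Return the term together with every term linked to it in CONCEPTS."""
    key = term.lower()
    for terms in CONCEPTS.values():
        if key in terms:
            return set(terms)
    return {key}  # unknown terms are searched as entered


def search(term):
    """Search all sources for the term or any term linked to it."""
    wanted = expand(term)
    hits = []
    for source, records in SOURCES.items():
        for record in records:
            # Compare on lower-cased words with punctuation stripped.
            words = {w.strip(".,;:!?").lower() for w in record.split()}
            if words & wanted:
                hits.append((source, record))
    return hits


if __name__ == "__main__":
    for source, record in search("customer"):
        print(source, "->", record)
```

In this sketch a single entry such as `customer` finds one record in each of the three sources, while a plain keyword match would find only the sales database record.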

Countries

Austria, Belgium, Bulgaria, Cyprus, Czechia, Germany, Denmark, Estonia, Greece, Spain, Finland, France, Hungary, Ireland, Italy, Lithuania, Luxembourg, Latvia, Malta, Netherlands, Poland, Portugal, Romania, Sweden, Slovenia, Slovakia, United Kingdom
