Technologies for Information Management

Please note that the project factsheet will no longer be updated. All information relevant to the project can be found on the CORDIS factsheet, which is updated on a regular basis.


KYOTO – Knowledge Yielding Ontologies for Transition-based Organization

KYOTO makes knowledge sharable between communities of people, cultures, languages and computers, by assigning meaning to text and giving text to meaning.

The globalization of markets and communication brings with it a concomitant globalization of world-wide problems and the need for new solutions. Timely examples are global warming, climate change and other environmental issues related to rapid growth and economic development. Environmental problems can be acute, requiring immediate support and action that rely on information available elsewhere. Knowledge sharing and transfer are also essential for sustainable growth and development in the longer term. In both cases, it is important that distributed information and experience can be re-used on a global scale. The globalization of problems and their solutions requires that information and communication be supported across a wide range of languages and cultures. Such a system should furthermore allow both experts and laymen to access this information in their own language, without recourse to cultural background knowledge.

The goal of KYOTO is a system that allows people in communities to define the meaning of their words and terms in a shared Wiki platform, so that this meaning becomes anchored across languages and cultures and a computer can use it to detect knowledge and facts in text. Whereas the current Wikipedia uses free text to share knowledge, KYOTO will represent this knowledge so that a computer can understand it. For example, the notion of environmental footprint will be defined in the same way in every language, and in such a way that the computer knows what information is necessary to calculate a footprint. With these definitions it will be possible to find information on footprints in documents, websites and reports, so that users can directly ask the computer for actual information about their environment.

In technical terms, KYOTO aims to develop a content-enabling system that provides semantic search and information access to large quantities of distributed multimedia data for both experts and the general public, and to apply this system to environmental information on a global scale. Information access is provided through a cross-lingual, user-friendly interface that allows for high-precision search and information dialogues over a variety of data from widespread sources in a range of different languages: English, Dutch, Italian, Spanish, Basque, Chinese and Japanese. It is like a Wikipedia in that it lets communities share and maintain their knowledge, but it differs in that it shares this knowledge across languages, makes it usable by computers and exploits it to find facts in free text.

Figure 1 below gives a schematic overview of the complete system. At the top of the diagram, a collection of source data in different media and languages is supplied by global environmental organizations. A capture module collects the information and produces a general XML representation. The textual information in the XML representation is then processed by a chain of linguistic and conceptual processors. Through wordnets in each of the languages, the words and expressions are matched to a shared universal ontology. This ontology guarantees a common level of semantic anchoring across languages and information sources.

The ontology consists of three layers: a generic top layer (based on existing top-level ontologies), a domain layer consisting of domain terms and concepts, and a middle layer (derived from the existing wordnets) that connects the domain layer to the top layer. The domain terms are extracted semi-automatically from the source documents but are also created manually through a Domain Wiki. The Domain Wiki lets experts in the field modify and extend the domain level of the ontology and extend the corresponding wordnets in each language. It enables community-based resource building, which will lead to better understanding and consensus in the field and at the same time formalize this knowledge so that it can be used by a system. Extensions to wordnets and the ontology are propagated to other wordnets and language resource builders through a sharing protocol.
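The layering described above can be illustrated with a small sketch. It is not KYOTO's actual data model: the concept names, the layer contents and the Dutch and Spanish entries are hypothetical toy data, chosen only to show how terms from different wordnets can share one anchor that is linked through a middle layer to a top-level concept.

```python
# Hypothetical toy ontology: each concept points to its parent concept.
# None of these names are taken from the real KYOTO ontology.
PARENT = {
    # domain layer -> middle layer
    "environmental_footprint": "measure",
    "greenhouse_gas": "substance",
    # middle layer -> top layer
    "measure": "Quantity",
    "substance": "PhysicalObject",
}
TOP_LAYER = {"Quantity", "PhysicalObject"}

# Toy wordnets: terms in each language map to the same shared concept.
WORDNETS = {
    "en": {"environmental footprint": "environmental_footprint"},
    "nl": {"ecologische voetafdruk": "environmental_footprint"},
    "es": {"huella ecológica": "environmental_footprint"},
}


def anchor(term, lang):
    """Return the concept chain from the term's concept up to the top layer."""
    chain = [WORDNETS[lang][term]]
    while chain[-1] not in TOP_LAYER:
        chain.append(PARENT[chain[-1]])
    return chain
```

In this sketch, calling `anchor` with the English, Dutch or Spanish term returns the same chain, which is the sense in which the shared ontology anchors meaning across languages.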

Once the ontological anchoring is established, it will be possible to build text mining software that can detect semantic relations and facts in text. These data miners, so-called Kybots (Knowledge Yielding roBots), can be defined using constraints between relations at a generic ontological level. These logical expressions need to be implemented in each language by mapping the conceptual constraints onto linguistic patterns. A collection of Kybots created this way can be used to extract the relevant knowledge from textual sources.
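As a rough illustration of this idea, the sketch below defines one language-neutral constraint and maps it onto a surface pattern per language. Everything in it is an assumption made for the example: the class names, the tiny lexicons, the regular expressions and the function `run_kybot` are hypothetical and do not reflect the real Kybot format.

```python
import re

# Language-neutral constraint (hypothetical): a Facility pollutes a WaterBody.
CONSTRAINT = {"relation": "pollutes", "agent": "Facility", "patient": "WaterBody"}

# Toy lexicons mapping words to ontology classes, one per language.
LEXICON = {
    "en": {"factories": "Facility", "rivers": "WaterBody"},
    "nl": {"fabrieken": "Facility", "rivieren": "WaterBody"},
}

# Language-specific linguistic patterns that realise the same constraint.
PATTERNS = {
    "en": re.compile(r"(?P<agent>\w+) pollute (?P<patient>\w+)"),
    "nl": re.compile(r"(?P<agent>\w+) vervuilen (?P<patient>\w+)"),
}


def run_kybot(text, lang):
    """Return concept-level facts for matches that satisfy the constraint."""
    facts = []
    for match in PATTERNS[lang].finditer(text.lower()):
        agent_class = LEXICON[lang].get(match.group("agent"))
        patient_class = LEXICON[lang].get(match.group("patient"))
        if (agent_class, patient_class) == (CONSTRAINT["agent"], CONSTRAINT["patient"]):
            facts.append((CONSTRAINT["relation"], agent_class, patient_class))
    return facts
```

Because the extracted fact is expressed in ontology classes rather than words, the English sentence "Factories pollute rivers." and the Dutch sentence "Fabrieken vervuilen rivieren." yield the same result in this sketch, while a sentence whose words are not anchored in the lexicon yields nothing.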

The extracted knowledge and information will be indexed by a corporate search system that can handle fast semantic search across languages. The search system uses so-called contextual conceptual indexes. As a result, the search system can give different results for 'polluting substance' than for 'polluted substance', because these involve different concepts and semantic relations.
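The difference between the two queries can be sketched with a toy index whose keys combine a concept with the role it plays, instead of plain words. This is only an assumed simplification for illustration: the role table, the key format and the functions below are hypothetical and say nothing about how the actual search system is built.

```python
from collections import defaultdict

# Hypothetical analysis: the modifier decides the role the substance plays.
ROLE_BY_MODIFIER = {
    "polluting": ("pollute", "agent"),
    "polluted": ("pollute", "patient"),
}


def concept_keys(text):
    """Extract (relation, role, concept) keys from adjacent word pairs."""
    words = [w.strip(".,") for w in text.lower().split()]
    keys = set()
    for modifier, head in zip(words, words[1:]):
        if modifier in ROLE_BY_MODIFIER and head == "substance":
            relation, role = ROLE_BY_MODIFIER[modifier]
            keys.add((relation, role, "substance"))
    return keys


def build_index(docs):
    """Map each concept key to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for key in concept_keys(text):
            index[key].add(doc_id)
    return index


def search(index, query):
    """Return the documents that share at least one concept key with the query."""
    results = set()
    for key in concept_keys(query):
        results |= index.get(key, set())
    return results


DOCS = {
    "d1": "Mercury is a polluting substance in many rivers.",
    "d2": "The polluted substance was removed from the soil.",
}
INDEX = build_index(DOCS)
```

A plain keyword index would treat both queries as near-identical; here `search(INDEX, "polluting substance")` and `search(INDEX, "polluted substance")` retrieve different documents because the substance fills a different role in each.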

Events in connection with KYOTO
Here is a list of selected KYOTO events and news items. The full list can be found on the project website.

Workshops and demos:

- KYOTO is organizing the SemEval-2010 task on domain-specific Word Sense Disambiguation (WSD), part of SemEval-2010: 5th International Workshop on Semantic Evaluations.

- Environmental Knowledge Transition and Exchange - The first KYOTO Workshop (Knowledge Yielding Ontologies for Transition-based Organization), in Amsterdam, The Netherlands on 2-3 February 2009.

- First Demo launched on Cross-Lingual Search (1 March 2008).

- KYOTO participates in the Global WordNet Grid of the Global WordNet Association.

- Project Forum: Opening of a technology forum and an environment forum to open up discussion to a wider public.


Invited talks:

- Prof. Dr. P. Vossen: invited keynote speaker on KYOTO at the Lustrum of NL TERM, 25 October 2008, Amsterdam, The Netherlands (Presentation NL Term).

- Prof. Dr. P. Vossen: invited keynote speaker on KYOTO at DART 2008: 2nd Workshop on Distributed Agent-based Retrieval Tools, 10 September 2008, Cagliari, Italy (Presentation DART).

Publications:

- Vossen, P., E. Agirre, N. Calzolari, C. Fellbaum, S. Hsieh, C. Huang, H. Isahara, K. Kanzaki, A. Marchetti, M. Monachini, F. Neri, R. Raffaelli, G. Rigau, M. Tescon (2008). "KYOTO: A System for Mining, Structuring and Distributing Knowledge Across Languages and Cultures", in: Proceedings of LREC 2008, Marrakech, Morocco, May 28-30, 2008.

- Vossen, P., E. Agirre, N. Calzolari, C. Fellbaum, S. Hsieh, C. Huang, H. Isahara, K. Kanzaki, A. Marchetti, M. Monachini, F. Neri, R. Raffaelli, G. Rigau, M. Tescon, J. van Gent (2008). "KYOTO: A System for Mining, Structuring, and Distributing Knowledge Across Languages and Cultures", in: Proceedings of the Fourth International Global WordNet Conference - GWC 2008, Szeged, Hungary, January 22-25, 2008.

Project coordinator
Prof. P. Vossen, Vrije Universiteit Amsterdam, The Netherlands
Participants
Vrije Universiteit Amsterdam, The Netherlands (coordinator)
Berlin-Brandenburg Academy of Sciences and Humanities, Germany
Consiglio Nazionale delle Ricerche, Italy
Synthema S.R.L., Italy
National Institute of Information and Communications Technology, Japan
Euskal Herriko Unibertsitatea, Spain
Academia Sinica, Taiwan
Irion Technologies, The Netherlands
European Centre for Nature Conservation, The Netherlands
Administrative details
KYOTO (ICT-211423) is a Specific Targeted Research Project (STREP) of the European Union's 7th Framework Programme: Information and Communication Technologies (ICT) – Call 1.
The project started on 1 March 2008, and will finish on 28 February 2011 (36 months).
There are 9 participants from 6 countries involved in the project, and the EC contribution is €2.20 million (total cost: €3.32 million).

