CORDIS Archive

This page has been archived. It will no longer be updated.
 

1 Introduction
2 CERIF History and Background
3 User requirements
4 Objectives of CERIF 2000, today and in the future

1 Introduction

1.1 The Setting
1.2 The Requirement
1.3 CERIF 1991
1.4 CERIF 2000

1.1 The Setting

Access to information on current research activities throughout Europe is an essential requirement for the success of EU innovation policy.
The key asset in European R&D consists of ideas, technical reports, publications, patents, prototypes, products and know-how - leading to technology transfer and wealth creation, and to the generation of new R&D ideas.
The key added value to be achieved is the pan-European approach to the generation and exploitation of R&D. Information needs to be made widely available to encourage both innovation and new, improved R&D.

1.2 The Requirement

The innovators in industry and services, the academics pushing the frontiers of R&D, the decision makers in governments and R&D funding agencies all require easy-to-use access to R&D information.

The raw data sources are the R&D information held by funding agencies and other information providers in the EU. These are held for the particular purposes of the agencies and the particular clients of the information providers. They are heterogeneous and unconnected. The potential for Europe-wide exchange is not being exploited.

There is a need for the end-user of research information to be able to access this data through a uniform survey-level interface and to be able to integrate and compare the information between data sources. This "common interface" must address not only the content (what must be exchanged) but also the format of such information (how it should be presented).

This information must be presented in a uniform way, at least at summary level. The WWW (World Wide Web) and database technology are the way to provide this.

The definition of this uniform information description platform requires:

  • the definition of a full CRIS data model which will cover the database structures of the majority of existing CRIS;
  • the definition of a set of data models which could provide examples for data exchange (since there is an infinity of possible exchange data models between CRIS);
  • the definition of a metadata data model to provide a uniform summary-level view over heterogeneous information sources.
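
As a rough illustration of the third point, a summary-level metadata view can be thought of as a mapping from each source's own record layout onto a small set of shared fields. The Python sketch below is only an assumption about how this might look: the shared fields (`source`, `title`, `start_year`), the two source layouts and all function names are hypothetical and are not taken from CERIF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SummaryRecord:
    """Hypothetical summary-level view shared by all sources."""
    source: str
    title: str
    start_year: int


def from_source_a(row: dict) -> SummaryRecord:
    # Source A (hypothetical layout): a title and an ISO-formatted start date.
    return SummaryRecord("A", row["project_title"], int(row["start_date"][:4]))


def from_source_b(row: dict) -> SummaryRecord:
    # Source B (hypothetical layout): a name and a numeric year.
    return SummaryRecord("B", row["name"], row["year"])


def uniform_view(rows_a, rows_b):
    """Merge both sources into one list of summary records, oldest first."""
    records = [from_source_a(r) for r in rows_a]
    records += [from_source_b(r) for r in rows_b]
    return sorted(records, key=lambda r: (r.start_year, r.title))
```

Each source keeps its own structure; only the small mapping function per source has to know about it, which is one way to obtain a uniform view without forcing the sources to change.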

Easy access to information must address not only the availability of information with a common definition and format but also how that information could be retrieved by the end-user.

The end-users need to be able to search, Europe-wide, for information on a particular research topic or theme. Subject indexing of the information is the obvious key in this respect. Classification should be consistent across all research information sources; otherwise users will not get consistent results when they retrieve information. Since end-users also use different languages, the controlled indexing terminology proposed should have the same meaning in all languages.
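
One possible way to picture a controlled terminology with the same meaning in all languages is a vocabulary in which each concept has a language-independent identifier and one label per language. The sketch below is a minimal Python illustration under that assumption; the identifiers (`T001`, `T002`), the labels and the record layout are invented for the example and do not come from any actual classification scheme.

```python
# Hypothetical controlled vocabulary: one identifier per concept,
# one label per language. All identifiers and labels are invented.
VOCABULARY = {
    "T001": {"en": "renewable energy", "fr": "énergie renouvelable",
             "de": "erneuerbare Energie"},
    "T002": {"en": "water quality", "fr": "qualité de l'eau",
             "de": "Wasserqualität"},
}


def concept_for(term: str, lang: str):
    """Return the concept identifier for a label in the given language, or None."""
    wanted = term.strip().lower()
    if not wanted:
        return None
    for concept_id, labels in VOCABULARY.items():
        if labels.get(lang, "").lower() == wanted:
            return concept_id
    return None


def search(records, term, lang):
    """Return the records indexed with the concept that the term denotes."""
    concept_id = concept_for(term, lang)
    if concept_id is None:
        return []
    return [r for r in records if concept_id in r["subjects"]]
```

Because records are indexed by concept identifier rather than by label, a query in one language retrieves the same records as the equivalent query in another.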

1.3 CERIF 1991

The first CERIF was published in 1991 with the aim of facilitating data exchange between CRIS. CERIF had not been revised since, but research information providers expressed a strong and urgent need to widen and update the CERIF format for the following reasons:

  1. the original CERIF covered only research projects. Users of CRIS want to extend this to cover data on persons, organisations and other entities;
  2. the "research subject classification scheme" recommended in CERIF 1991 has not been updated since 1988 and needs to be extended to cover the new data areas plus give enhanced coverage of existing ones;
  3. new technologies, in particular, the widespread use of the Internet and World Wide Web, have changed the nature of basic CRIS activities and opened new ways to serve various CRIS user groups.

1.4 CERIF 2000

In 1997, the CERIF revision work was entrusted to unit D2 of DG XIII of the EU's Commission dealing with dissemination of scientific and technological knowledge. The CERIF Revision Working Group, composed of experts from Member States and associated countries, was proposed by the Innovation Programme Management Committee (see list in Annex), and given the mandate for the CERIF revision activities.

The CERIF revision led to this recommendation for "CERIF 2000" which has the following objectives:

  1. to provide guidelines for data exchange of research information between CRIS, and hence facilitate access to the various sources throughout the EU, and indeed throughout the entire global research community;
  2. to provide an example of a model Research Information System for newcomers and for organisations that want to expand their existing research information systems.

2 CERIF History and Background

2.1 Early history of research information systems
2.2 More recent activities in European context
2.3 Scope of CRIS
2.4 The need to revise CERIF

2.1 Early history of research information systems (sixties and seventies)

CRIS and CERIF efforts are not new. As early as the 1970s, serious efforts at international cooperation were being made among research information systems. In 1970, a manual was published, "Surveying the national scientific and technological potential, including the collection and processing of data management, of the R&D system" (UNESCO, Paris 1970, 251 p.). This manual described, in operational terms, a methodology for conducting a survey of a country's scientific and technological potential, and discussed how to use this information in the formulation of a science policy at national level.

Early efforts towards a world-wide science information system were made more than 30 years ago. In 1971, UNISIST published a "Study report on the feasibility of a world science information system" (Paris, 1971, 161 p. - Synopses, 92 p.). This joint UNESCO-ICSU study recommended that UNISIST act as a catalyst to stimulate international cooperation among information systems, and as an initiator of projects designed to improve world information tools and resources.

In 1972, the first references to computer-based research information exchange appeared. Frank J. Kreysa published an article in the Journal of Chemical Documentation: "SSIE - an information center which stores foresight". He described the Smithsonian Science Information Exchange's improved system, which included on-line video terminals, high-capacity data cells, and full-text computer storage of project data.

The problems of subject indexing were tackled long before the computer age. In the late 1960s and early 1970s, both the Smithsonian Science Information Exchange and UNESCO attempted to arrive at an international standard nomenclature for fields of science and technology. See:

"Proposed international standard nomenclature for fields of science and technology". (Paris 1973, 22 p. UNESCO/NS/ROU/257 rev. 1).

A standard nomenclature is seen as an important tool in the management of scientific and technological affairs, especially in fields related to science policy and science statistics.

2.2 More recent activities in European context (eighties)

2.2.1 Initiatives on European Research Databases

2.2.1.1 Workshop on European Research Databases, 30 September - 2 October 1987 - Brussels

The aim of the Workshop was to give the Commission the opportunity to identify problems that had to be solved in order to establish an information network on scientific and technological research.

2.2.1.2 European Working Group on Research Databases (1987-1988), Development of the Common European Research Information Format, CERIF

The European Working Group on Research Databases was formed, with a mandate based on the conclusions of the Workshop on European Research Databases.

The Working Group recommended the use of the "CERIF-manual" (Common European Research Information Format), which was designed to provide a standard format for two major purposes:

  1. To permit the exchange of records containing information on research projects between the Member States of the European Community;
  2. To serve as a basis for the format for setting up the foreseen network between the research databases.

2.2.1.3 Recommendation to the Member States to use the CERIF format

The Member States were officially recommended to use the CERIF manual as a standard format for their information systems on research projects. This recommendation was published in the Official Journal of the European Communities, OJ L 189, 13 July 1991.

The main recommendations of the Commission to Member States were to take progressive actions towards:

  • Harmonisation of the existing national databases in the field of research and technological development;
  • Inventories on R&D-projects carried out in industry and in research institutions, which conform to the CERIF.

2.2.1.4 European Working Party on Research Databases (1992)

A follow-up meeting was held by the European Working Party on Research Databases in 1992, to exchange experiences with the implementation of CERIF. A report covering conclusions and recommendations for revision of CERIF was sent to CREST [1].

The main recommendation of the Working Party was to investigate how a technical project could be set up, facilitating access to CERIF-compatible research databases throughout the European Community.

2.2.1.5 Pilot project : European Research Gateways On-line (ERGO) - proof of the feasibility to apply CERIF into a practical implementation

The "ERGO Working Group" was mandated to investigate the feasibility of a project enabling users to access RTD databases in the Member States from a single access point. The ERGO Working Group proposed various alternative solutions. As a result, the Innovation Programme Committee decided to start the ERGO Pilot Project. In December 1998, a single gateway to national databases of research projects, via a central catalogue and a user-friendly search form, was launched. For more details, see Annex 4.

As such, the ERGO Pilot Project can be seen as a successful feasibility project regarding the practical application of CERIF 1991. Following CERIF guidelines, it has demonstrated:

  • The practical feasibility and added value of a facilitated access to national R&D information from a single entry point;

  • The suitability of a catalogue-based (metadata) concept both for data collection and easy searching.

Basic features of the pilot project were:

  • The central catalogue holds basic information on each project contained in participating databases;

  • The user, searching on the catalogue, receives such basic information, with further information being available from the relevant database directly;

  • Each database transmits basic information concerning all new or updated projects to the catalogue on a regular basis.
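
The three features above can be sketched roughly as follows. This is a minimal Python illustration of the catalogue idea only, not a description of how ERGO was actually built: the class name `Catalogue`, its methods and the record fields (`id`, `title`) are all hypothetical.

```python
class Catalogue:
    """Hypothetical central catalogue holding only basic project information."""

    def __init__(self):
        # Key: (database name, project id); value: basic catalogue entry.
        self._entries = {}

    def receive_update(self, database, projects):
        """Store new or updated basic entries sent by one participating database."""
        for project in projects:
            self._entries[(database, project["id"])] = {
                "database": database,  # tells the user where the full record lives
                "id": project["id"],
                "title": project["title"],
            }

    def search(self, word):
        """Return basic entries whose title contains the word (case-insensitive)."""
        word = word.lower()
        hits = [e for e in self._entries.values() if word in e["title"].lower()]
        return sorted(hits, key=lambda e: (e["database"], e["id"]))
```

A search returns only the basic entry together with the name of the originating database, so that the user can then obtain further information from that database directly; repeated calls to `receive_update` stand in for the regular transmissions.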

This is only one of the possible approaches to a practical implementation of CERIF as a single pan-European access point to research databases. Taking into account today's evolution in web and metadata technology, a range of scenarios can be worked out for implementation of the revised CERIF.

2.3 Scope of CRIS

2.3.1 Initial main user groups

Some interesting elements concerning the stimulus to create and maintain research information systems can be found in a 24-year-old document:

"Guidelines on the Conduct of a National Inventory of Current Research and Development Projects", from UNISIST (United Nations Educational, Scientific and Cultural Organisation, UNESCO).

(Ref. SC/75/WS/13, Paris, March 1975)

The preface of these guidelines mentions:

"Current Research Information (CRI) systems have two principal objectives:

  • To enhance communication among scientists concerning ongoing projects;
  • To provide an effective information base to managers of the national R&D program."

The introduction also reads:

"The registers of current research - effectively indexed - is not only an important tool with which to locate individual expertise in specific fields. It also provides important statistical data on the activities of the population it deals with, their institutional adherence, trends of investment in R&D, and other data needed for formulation of science policy. Such a register enables scientists and engineers to recognize work done by their peers, and thus to avoid unwitting duplication of effort.

Finally, the register bridges the gap in time - often about two to three years - between the initiation or completion of a research project or some part of it and its publication".

It can be concluded that, more than twenty years ago, the motivation was often to investigate an R&D policy instrument, for instance for statistical purposes. A second incentive was that research project information systems could be used as a communication system between researchers, and as a source to find experts.

However, the "research world" seemed to be interpreted as a strict group of people dealing with traditional research activities, mostly in universities and research centres.

2.3.2 Today's scope of research information systems

Traditionally, CRIS mainly covered research project registers. Today, however, CRIS is an umbrella term covering a broad range of information sources. The reasons for this are diverse:

  • Industrial policy often has the stimulation of innovation as an objective. Awareness is thus raised not only of traditional research, but also of other types of information, such as information on partners, results, expertise and equipment;
  • Technological developments nowadays allow easy, low-cost and user-friendly access to a wide range of publicly available information sources;
  • More and more, the intermediaries are developing services for SMEs to assist them in selecting their desired information.

2.3.3 Today's information society

In this ongoing innovation movement, research information is of the utmost importance: information on research results is not only used by researchers, but also on behalf of the industry, directly or by intermediaries.

The latest developments in computers and WWW technology make these types of information directly and easily accessible to the end-user.

The information brokerage specialists and a wide range of intermediaries have developed a new market, offering services to industries concentrated on searching and delivering a selection of appropriate and accurate information meeting their clients' needs. This information can be much broader than research information alone. Getting the right information can assist all stages of innovation processes.

Research information covered can deal with:

  • Research programmes
  • Ongoing and finished research projects
  • Research groups / laboratories
  • Research centres / institutions
  • Research funding sources
  • Research results (publications, references to patents, products, etc.)
  • Research facilities (usually large ones)
  • Calls for proposals
  • Events (conferences, etc.)
  • Technology transfer databases
  • Technical patent information
  • Expertise/consultancy

Additionally, related information can be given on:

  • Standards related to a potential new product
  • Procedures about intellectual property rights
  • Venture capitalists
  • Training organisations dealing with business plans

2.4 The need to revise CERIF

2.4.1 Scope and structure of CERIF 1991

The scope of CERIF 1991 was to provide a standard format for two major purposes:

  • To permit exchange of records on research projects between different Member States of the EU
  • To serve as the basis for the format in order to allow networking of databases

The CERIF 1991 structure is based upon a data model for describing "Research Projects", which contained:

  • A list of essential data elements
  • A list of optional data elements
  • Recommendations concerning subject indexing of research activities (Recommendation to use the "CERIF Common European Research Classification")
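
The distinction between essential and optional data elements can be illustrated with a simple record check. The Python sketch below is only an assumption about how such a check might be written; the element names are hypothetical placeholders and are not the actual CERIF 1991 lists.

```python
# Element names are hypothetical placeholders, NOT the actual CERIF 1991 lists.
ESSENTIAL = {"title", "organisation", "start_date"}
OPTIONAL = {"abstract", "end_date", "keywords"}


def check_record(record: dict) -> dict:
    """Report missing essential elements and elements outside both lists."""
    # An element counts as present only if it has a non-empty value.
    present = {key for key, value in record.items() if value not in (None, "")}
    return {
        "missing": sorted(ESSENTIAL - present),
        "unknown": sorted(set(record) - ESSENTIAL - OPTIONAL),
        "valid": ESSENTIAL <= present,
    }
```

A record is accepted for exchange when every essential element is filled in; optional elements may be absent, and elements outside both lists are reported rather than silently dropped.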

2.4.2 Need to revise CERIF 1991

CERIF 1991 was limited to information on research projects.

The following needs were recognised:

  • To widen the scope of CERIF to other types of research information
  • To review the recommendation concerning the classification approach
  • To revise the CERIF guidelines in view of today's information environment

Reference material on CERIF background

  • Guidelines on the Conduct of a National Inventory of Current Research and Development Projects/UNISIST (UNESCO)/SC/75/WS/13. - Paris, March 1975.

  • Information services on research in progress : a worldwide inventory/Edited by the Smithsonian Science Information Exchange [for the] General Information Programme and UNISIST (UNESCO) - Paris : UNESCO, 1982. - 220 p.

  • Programme implementation: The European Gateways On-line project was launched on 9 December 1998: news item on CORDIS RTD-NEWS, 1998-12-08 and on 1999-03-30.

  • Proposed international standard nomenclature for fields of science and technology/UNESCO/NS/ROU/257 rev. 1. - Paris, 1973. - 22 p.

  • "Towards harmonisation of databases on research in progress - Final report of the European Working Group on Research Databases", November 1988. Published by the Liaison Committee of Rectors' Conferences of Member States of the European Communities and Directorate General for Science, Research and Development of the Commission of the European Communities; financed by the Commission of the E C, contract PSS*0058/B, compiled by Dr. L. Van Woensel.

  • Recommendation to the Member States to use the CERIF format. In: Official Journal of the European Communities, OJ L 189, 13 July 1991.

  • SSIE - an information center which stores foresight/Frank J. Kreysa. In: Journal of Chemical Documentation, 1972.

  • Study report on the feasibility of a world science information system/UNISIST (UNESCO-ICSU). - Paris, 1971. - 161 p. (Synopses, 92 p.).

  • Surveying the national scientific and technological potential, including the collection and processing of data management, of the R&D system/UNESCO. - Paris, 1970. - 251 p.

3 User requirements

3.1 Introduction
3.2 User groups
3.3 Evolution of user needs
3.4 Common user requirements

3.1 Introduction

Defining CERIF 2000 guidelines implies a first reflection on CRIS user requirements. Who are the CRIS users? Are they a homogeneous body? What are their interests?
First, this chapter identifies the different CRIS user groups and, for each of them, their particular information needs. Second, in the light of technological change and the shift in public policy towards more innovation-related, as opposed to pure research, goals, an analysis of evolving user needs is carried out.

Finally, a set of common user requirements is proposed [1].

3.2 User groups

3.2.1 Information service providers (providers of the CRIS data set)
3.2.2 Information providers (providers for the CRIS contents)
3.2.3 Institutions and policy makers
3.2.4 R&D Community
3.2.5 Intermediary organisations
3.2.6 Enterprises
3.2.7 Non-profit sector
3.2.8 The Media

This section segments the population using CRIS into eight categories. For each category, users are defined and their interaction with CRIS and CERIF needs are identified.

  1. Information service providers (providers of the CRIS data set)
  2. Information providers (providers of the CRIS content)
  3. Institutions and policy makers
  4. R&D Community
  5. Intermediary organisations
  6. Enterprises
  7. Non-profit sector
  8. Media

However, research information users belonging to those categories can interact with CRIS in different and complementary ways.

  • User: retrieving information from CRIS to be used in a working context.
  • Contributor: offering information for publication on CRIS.
  • Multiplier: promoting the service to potential users and/or assisting them to use CRIS.
  • Adviser: suggesting ways to improve the content and features on offer.
  • Supporter: influencing ways to support and develop CRIS.

3.2.1 Information service providers (providers of the CRIS data set)

    Who are they?

Collectors and publishers of research information, for instance research funding organisations, local or national authorities.

    Why do they provide CRIS?

CRIS providers aim to increase the awareness and usage of, and interest in, CRIS as a tool to communicate R&D information. In some cases, CRIS are made available as a policy-supporting tool.

    How can CERIF help to meet their needs?

By implementing CERIF guidelines, existing CRIS can exchange data between distinct information systems. Further, new CRIS providers or organisations that want to expand their existing systems get guidelines for building up their research information system along the lines of a CRIS model.

A survey for a "Code of Good Practice for CRIS" concluded that CRIS providers [2] have a need for a recommended "CERIF 2000" which:

  • Covers a wider scope of research information, including relationships between networks;
  • Is adapted to today's information society environment;
  • And which allows implementation of emerging technologies.

However, CERIF 2000 should comply with a flexible approach, taking into account the various structures and database management systems used by current CRIS.

    How to improve CRIS visibility?

CRIS providers might benefit from monitoring other CRIS to be aware of their own positioning. Both technical and information coverage considerations might lead to improvement of the service provided. Moreover, from a marketing point of view, reciprocal links and comparability of contents might be beneficial for all CRIS.

3.2.2 Information providers (providers for the CRIS contents)

    Who are they?

Research centres in universities and research departments within companies that report on ongoing research activities to their management or directly to CRIS providers, together with the information managers in those structures who are responsible for gathering the information written by researchers.

The main subgroups are researchers and innovators, who can be located in:

  • Research institutes;
  • Industrial organisations;
  • Small and medium-sized enterprises (SMEs);
  • Virtually any organisation, large or small, public or private, that is in some way involved in implementing or participating in research or innovation development activities.

    Why do they use CRIS?

CRIS data provision is often mandatory for participants in publicly funded programmes. However, information providers also supply data on their own initiative. Providers use CRIS as a communication tool, even as a promotional instrument for their activities and achievements.

Information providers also might be end-users seeking to benefit from published information.

    How can CERIF help to meet their needs?

CERIF provides a practical common standard for information contents and for subject indexing. Further, controlled value lists ease the collection and exchange of data.

To provide easily searchable information implies adopting common rules. To this end, it is important to use standard controlled vocabularies. The use of standard controlled vocabularies and standardised data structures, as described further in the CERIF guidelines, should be as widespread as possible, in order to make both data providers and end-users familiar with common CRIS characteristics.

    How to improve CRIS visibility?

The function of data provider is the first step in the information flow from research activities to end-user. Therefore, CRIS should offer both clear information on the structure of the information covered and easy tools to deliver the information collected to CRIS.

To ease the information collection task, standard controlled vocabularies may be used for subject indexing purposes as well as controlled value lists (for instance to indicate the role of a person in a research project). It is necessary to provide additional value or benefits to both contributors to and users of the system to ensure the continued use of CRIS. This may be achieved by adhering to a quality plan that defines the accuracy, timeliness, data completeness, presentation of data to the end-user, and the functionality offered by the search software. Thus the data providers can be sure of the consistent treatment of the data which they furnish.

Further, CRIS should actively make more information providers aware that research information systems can be used to communicate their activities.

3.2.3 Institutions and policy makers

    Who are they?

People active in policy definition (institutional level) and decision-making, particularly in the context of innovation and R&D at different levels (national, regional or European). This group also includes research and/or scientific foundations, networks, special interest groups.

Policy makers might act as CRIS supporters and promoters. They can give advice, based upon their own needs.

    Why do they use CRIS?

They use CRIS for policy-making background information, research trends analysis, etc. They could use CRIS to set up priorities, objectives, policy definitions, budgets and programmes in the field of innovation and R&D. Policy makers are also likely to look to CRIS to evaluate research funding allocation, avoid duplication of research activity, analyse research trends and obtain concrete examples of research results for citizens.

Policy makers also commonly need statistical analysis of research activities. Some CRIS do contain research information in such a way that statistical reports can be generated. However, the CERIF activities do not treat statistical needs as a priority. CRIS is not supposed to overlap with specific statistical surveys, such as those carried out by the Member States for Eurostat and OECD; most research information systems are set up for other purposes [3].

Furthermore, policy makers can use CRIS to support information services in meeting government policy requirements, as a formal log of research in progress, or to assist project planning.

3.2.4 R&D Community

    Who are they?

People who are actively involved in R&D activities, either as researchers or project managers.

R&D Community includes:

  • Universities (research departments, individual researchers, higher education officials, fundraising officers);
  • Regional, national and international institutional research centres and research councils;
  • Research centres in industry.

    Why do they use CRIS?

They are users of CRIS as a source of research-related information. They may be seeking to use CRIS as a kind of interactive channel (for example, to seek project partners). In general, CRIS can be a key tool in supporting their R&D activities.

CRIS is a useful tool:

  • To avoid duplication of research;
  • To identify experts for exchange of ideas or for collaboration;
  • To identify information about ongoing or completed research results;
  • To locate equipment and services;
  • To find funding opportunities for research activities;
  • To make their research activities known;
  • To communicate their own research results to the CRIS user community.

They also may be providers of information for publication on CRIS.

    How can CERIF help to meet their needs?

The common format eases retrieval of information of interest to researchers. Further, the subject index tools facilitate the search for information in their native language thanks to the Ortelius Thesaurus [4].

3.2.5 Intermediary organisations

    Who are they?

Intermediary organisations are any bodies offering assistance and support in the field of innovation and/or research and technological development. The exploitation of key R&D information is not always easy for enterprises (particularly SMEs); thus the assistance of intermediary support is often a necessity.

Examples of intermediary organisations are:

  • EU-supported advisory services, and their coordinating agencies (Innovation Relay Centres, Business Innovation Centres, Euro Info Centres, Midas-Net centres, etc.);
  • National enterprise advisory centres, such as specialist technology advice centres, general business advice centres, enterprise creation centres;
  • Independent and semi-independent advisory services for enterprises, such as Chambers of Commerce and Industry, associative bodies, regional or local development agencies, University/Enterprise Interface or spin-off organisations, Private consultants - technology brokers, and more;
  • Enterprise accommodation organisations, such as science parks, technology incubators, business incubators, business parks, etc.;
  • Enterprise finance organisations, e.g. public and semi-public venture capital funds, commercial venture capital funds, Banks & financial societies, and also business angel networks/clubs;
  • Enterprise associations, covering industrial sector related syndicates and federations, Research associations, interest or lobby offices for national or European associations, national business associations, small enterprise federations and Trade Unions.

These organisations nearly always have a detailed knowledge of their local economic environment and the prevailing business context. They operate in close cooperation with the local business community and are well aware of the daily difficulties encountered by entrepreneurs and business managers.

    Why do they use CRIS?

Intermediaries are often frequent and pragmatic users of CRIS to assist their clients. They are often skilled in using CRIS and have an intimate knowledge of the possibilities.

Intermediaries might use CRIS to get overviews, with the purpose to produce synthesis reports for their clients. CRIS facilitates day-to-day contact with innovative enterprises and finding relevant practical information (on ongoing and completed research, exploitable research results, potential partners and experts).

    How can CERIF help to meet their needs?

These organisations are heavy users of tailoring tools, such as the possibility to select information by industrial sector through methodical use of industry-related classification schemes (NACE codes for sectors of economic activity, CPA codes for products).

CERIF 2000 enlarges the range of research information with added types of information which offer practical benefits for intermediaries. Further, when CERIF 2000 is used by most CRIS, the intermediaries will develop a certain familiarity with searching CRIS and retrieving comparable information from different sources.

    How to improve CRIS visibility?

The best way to attract and retain the attention of intermediary organisations is by offering practical advantages, such as pragmatic features that are immediately useful to their day-to-day work. One example is to offer them an information space in which to communicate information about exploitable results and to get in contact with other CRIS users.

3.2.6 Enterprises

    Who are they?

Enterprises are the vital practical end of research and innovation processes; they cover people and organisations that turn new ideas into successful business activities, thereby generating economic growth and employment. All potentially innovative enterprises - not just those active in R&D - are contained in this group.

A special group of enterprises with relevance to CRIS are publishers.

    Why do they use CRIS?

CRIS can help businesses:

  • To know what is going on in their sector and in the business world at large;
  • To promote and locate transferable technologies;
  • To identify funding sources;
  • To identify potential partners and experts all over Europe and to build the right relationships;
  • To turn research efforts into products, but also to encourage them to participate in and exploit research activities;
  • To compile and create publications;

By providing new information at minimal expense, CRIS is keeping costs down whilst offering a wide range of information concerning EU R&D activities and opportunities for fruitful new ventures.

    How can CERIF help to meet their needs?

Most firms have a relatively narrow field of interest, so they should be able to tailor information queries to focus only on their special interests. The application of simple-to-use classification systems for "filtering" information of interest is therefore vital. Further, CERIF facilitates information retrieval by offering searchable subject indexing tools (thesaurus for research subject, commonly used codes for indication of industrial sector or possible market applications, etc.).
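
The "filtering" idea can be illustrated with a minimal sketch, assuming each record carries a set of hierarchical sector codes. The `Record` structure, the function name and the code values are hypothetical placeholders chosen for illustration; they are not taken from CERIF or from a real code list.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A CRIS record carrying sector codes (hypothetical structure)."""
    title: str
    sector_codes: set = field(default_factory=set)


def filter_by_sector(records, wanted_prefixes):
    """Keep records having at least one code that starts with a wanted prefix.

    Prefix matching lets a short code select everything filed beneath it
    in a hierarchical classification scheme.
    """
    wanted = tuple(wanted_prefixes)
    return [
        r for r in records
        if any(code.startswith(wanted) for code in r.sector_codes)
    ]


# Illustrative placeholder codes, not taken from a real code list
records = [
    Record("Result A", {"24.4"}),
    Record("Result B", {"72.2"}),
    Record("Result C", {"24.1", "33.1"}),
]
matches = filter_by_sector(records, ["24"])
```

Here a firm interested only in the branch filed under "24" would receive "Result A" and "Result C" and nothing else.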

    How to improve CRIS visibility?

Europe contains some 15 to 20 million enterprises but this does not mean that all enterprises will have an equal interest in CRIS, nor an equal ability to exploit the service. It is thus necessary to target communication to those enterprises most likely to be both able and willing to make good use of the service. CRIS is essentially concerned with technological innovation. The involvement of a firm in technological/scientific R&D is a factor that will indicate potential high interest for CRIS.

This approach leads to the following consequences for CRIS:

  • Stronger emphasis on promotion of innovation, and exploitation of R&D Results;
  • Need for far-reaching measures to improve the user-friendly nature of CRIS, in particular for users with low levels of IT skills/equipment. To this end, the use of easy-to-use and known (standardised) schemes for subject indexing might be a great help for a first selection of information.

Moreover, CRIS must attract and retain the interest of enterprises by offering practical advantages:

  • CRIS must be an easy service to use, requiring no special training or aptitude, and leading the enterprise swiftly, reliably and directly to the information it needs;
  • CRIS must give enterprises greater business information content, and present all information from a business perspective;
  • CRIS must have a service mentality, and try to understand and meet users' needs;
  • CRIS must offer understandable up-to-date information in an exciting way that makes businesses feel concerned and involved.

It must be ensured that once a business-related user comes to CRIS, he/she will want to come back regularly.

3.2.7 Non-profit sector, including citizens

    Who are they?

The non-profit sector covers any organisation looking for usable research results or other research-related information that may be of benefit to society in general or to a specific social group.

Examples are:

  • Local, national and international public bodies (Local authorities, Public administrations, Police Forces, Civil Protection Services, Health care systems, hospitals);
  • Organisations and representatives of specific social categories, for instance NGOs or lobby-groups interested in the close follow up of innovative research for tackling key social or development problems. These could be for instance Consumer associations, People working to help the disabled and/or ageing population, Medical and medicinal associations, Environment oriented organisations, Libraries and publishers, Cultural/historical heritage groups;
  • Charities.

Such social groups appear as a key target with regard to the new themes highlighted by the Fifth Framework Programme for Research, Technological Development and Demonstration of the European Commission. They can apply research results, for example to improve the quality of life, in management of living resources, or in creating a user-friendly information society.

This group is far from homogeneous and is acting in various roles. Non-profit users are simultaneously CRIS users, supporters, and/or are intermediaries who can help to reach the other players in this group and elsewhere. This group also covers the "citizen" in general.

    Why might they use CRIS?

Non-profit users can use CRIS for different purposes:

  • CRIS can be a useful tool in their work to locate technologies or results related to their "societal" objectives;
  • CRIS is an easy to use tool to get relevant information and contacts with experts;
  • CRIS is a tool that can provide them with help and information on policy options and guidelines.

These users are particularly looking for research projects leading to specific social applications in a wide variety of domains such as health, education, culture, social services, needs of disabled, environment, transportation and leisure, etc.

    How can CERIF help to meet their needs?

CERIF 2000 covers details of added value information, which might be helpful for this type of user. Further, the application of CERIF will provide more user-friendly searching facilities.

    How to improve CRIS visibility?

Awareness of CRIS as a useful tool to locate and further exploit research results for "societal" objectives might be reinforced. Specific needs of this user category are:

  • Easy query formulation, possibly in multiple languages, with selection based upon classification per research area, on different levels;
  • Concise information, in multiple languages;
  • Tools for fetching tailored information on possible research applications;
  • Getting contact details where relevant.

3.2.8 Media

    Who are they?

People working in all internationally, nationally and locally printed, audio-visual, and virtual publications that are read or used (or indeed produced) by any of the other user groups.

For the daily and weekly general press, journalists can be specialised in science, economics, or new technologies. For the specialised press, specialisation can cover managerial and financial issues, science and research, new technologies, industrial and sectoral (especially press targeting SMEs) innovation and education.

    Why do they use CRIS?

The media are potential users of CRIS as their information source to find accurate and up-to-date information on ongoing research, to detect relevant experts for interviews etc. CRIS can also generate inspiration about the use of research information in their daily working process.

    How can CERIF help to meet their needs?

Journalists are traditionally used to working with indexes of various kinds to look up information. CERIF 2000 covers added details of information, which might be helpful for this type of user. Further, the searching facilities are more user-friendly.

    How to improve CRIS visibility?

In the promotion of CRIS, journalists can play a crucial role. If encouraged to use CRIS, they might spread the word or become trend-setters in the use of CRIS. To this end, it is crucial that CRIS are easily accessible and user-friendly. The introduction of well-thought-out classification schemes and indexes would give them a "familiar" tool to select information.

Specific requirements for CRIS regarding the media as a potential user group are:

  • To adopt a clear language, to avoid technical jargon, to provide glossaries;
  • To provide easy to find information;
  • To index information according to the type of journalist: general and specialised press, specialised domain, area of industrial sector.

CRIS have to promote their systems with a clear description of what can be found. Media need to be aware of the opportunities offered by CRIS to the R&D community and enterprises, as journalists often target these groups.

Besides the use of easy-to-handle controlled vocabularies, e-mail alerting tools would give a welcome added value to CRIS for this specific user group.

3.3 Evolution of user needs

A decade ago, on-line research information services were too complex for the average end-user. Specialized information brokers and intermediaries were therefore consulted. Today, basic information is easily accessible by the end-user.

New roles for intermediaries are to "translate" the problem or question of a user into a specific "consultation" or "query" of information systems, to assist their clients in selecting suitable information, to develop tailored information services, and to transmit good practices in using information systems.

Research and technological development policy has shifted to the broader concept of innovation policy. There is therefore a growing need for new research facilities, i.e. ones that go wider than information on ongoing research activities. The evolving need also covers information on exploitable R&D results, and other types of information, which can lead to innovation and technology transfer.

A CORDIS5 user needs survey pointed out that end-users of R&D information also expect easy access to CRIS all over Europe. It is also important for the end-user that CRIS should supply comparable information.

The value of certain types of research information is not always acknowledged. The "naming" of different types of research information often refers to traditional uses, and users are not always aware of possibilities other than the traditional ones. For instance, to identify experts in a certain field, it is not necessary to use "expert" databases exclusively: it is also possible to locate experts by searching different types of databases, such as project databases, results databases, etc. This means that CRIS should include awareness raising on the multiple purposes of certain research information fields. Different information services should be promoted in different ways to different user groups in order to enhance their full exploitation.

Regarding the information content, CRIS providers have to be aware of the growing group of non-expert users. To cater to the needs of this broader audience, CRIS should further adapt and introduce new techniques such as:

  • Multi-lingual search possibilities;
  • "Active links" (URLs) to complementary information related to CRIS records, which might be available in other sources;
  • A wider range of multimedia types of information;
  • Tools for selecting research results according to their sector of industry, or to potential industrial market application;
  • Value added services, such as automatic e-mail alerting features, based upon the users' individual profiles of interest;
  • Interactive input tools, e.g. for research results and technology offers;
  • Electronic communication forums for discussion groups.
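
The e-mail alerting feature listed above can be sketched as a simple match between newly added records and stored profiles of interest. This is a minimal sketch under stated assumptions: the record layout, the profile format and the addresses are hypothetical, and the sketch only computes who should be alerted about what; it does not send any mail.

```python
def alerts_for(new_records, profiles):
    """Map each user's address to the titles of new records matching their profile.

    A record matches when it shares at least one keyword with the profile.
    """
    alerts = {}
    for email, interests in profiles.items():
        hits = [
            rec["title"] for rec in new_records
            if interests & set(rec["keywords"])
        ]
        if hits:
            alerts[email] = hits
    return alerts


# Hypothetical profiles of interest, keyed by e-mail address
profiles = {
    "a@example.org": {"optics", "lasers"},
    "b@example.org": {"biotechnology"},
}
# Hypothetical records added since the last alerting run
new_records = [
    {"title": "Laser sensor prototype", "keywords": ["lasers", "sensors"]},
    {"title": "Wind turbine study", "keywords": ["energy"]},
]
alerts = alerts_for(new_records, profiles)
```

Drawing the keywords from a controlled vocabulary rather than free text would make such profiles comparable across several CRIS.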

3.4 Common user requirements

This section deals with basic user needs common to all user groups identified in the previous section.

The common user requirement was the main concern of a study6 carried out by EuroCRIS. The recommendations highlighted the following important features:

  • Comprehensive and accurate information;
  • Reliable information;
  • Quality information;
  • Easy to find information (user-friendly interface);
  • Comparability of data;
  • In own language or in English;
  • Comprehensive subject indexing to get to relevant information quickly;
  • On-line help/explanation of subject indexing;
  • Protection of privacy (by law);
  • No copyright problems in using data or clear guidelines on limitations for use of data;
  • Clear indication of property rights;
  • Accurate contact information of the identified person/organisation/document;
  • People/organisations with relevant skills;
  • Easy identification of exploitable results;
  • Clear identification of the type of collaboration sought;
  • Easy identification of consultancy offered;
  • Output selections;
  • Information on cost (when charged for).

In terms of the functionality required for a CRIS, the survey lists the following possible functions:

  • Order documents on-line;
  • Comprehensive searching facility;
  • Links to other databases;
  • Links between documents in full text and multimedia objects;
  • Import/export functions (ability to download in different formats);
  • Reporting facility (ad hoc and predefined: text and statistics);
  • On-line tutorial;
  • Electronic payment (if applicable);
  • Multilingual facilities.

The report recalls that the definition of content and field structure is the most important component leading to consistency of and interoperability between CRIS. In particular, it is necessary to define the basic data elements and the kernel (mandatory) fields used for information exchange that should be considered for all CRIS. Fields making up the kernel need to be considered very carefully to ensure that only the essential information is marked for information exchange. Further, it is important to consider how to mark or handle empty mandatory/priority fields if no information is available. A definition for the supplementary information fields is also required.
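
One possible way of handling empty mandatory fields is to mark them explicitly before exchange, so that the receiving system can distinguish "not available" from "not transmitted". The sketch below assumes a hypothetical kernel of three fields and a hypothetical marker value; neither is prescribed by the report.

```python
# Hypothetical kernel (mandatory) fields, for illustration only
KERNEL_FIELDS = ("title", "organisation", "contact")

# Hypothetical explicit marker for a mandatory field with no information
NOT_AVAILABLE = "N/A"


def prepare_for_exchange(record):
    """Return a copy of the record with every kernel field present.

    Kernel fields that are absent or blank are filled with NOT_AVAILABLE,
    and their names are reported so the provider can follow them up.
    """
    out = dict(record)
    missing = []
    for name in KERNEL_FIELDS:
        value = out.get(name)
        if value is None or str(value).strip() == "":
            out[name] = NOT_AVAILABLE
            missing.append(name)
    return out, missing


prepared, missing = prepare_for_exchange({"title": "Project X", "organisation": " "})
```

Supplementary fields are passed through untouched, which keeps the check limited to the information marked as essential for exchange.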

EuroCRIS recommended that the kernel should include the following types of research information:

  • Projects, including the title and associated information on the individual research project;
  • Programmes, containing the title and associated information of the overall research programme;
  • Organisations, with information on the institution or body awarded the research funding and hosting the project or programme;
  • Expertise, about the knowledge and experience of the individual researchers;
  • Results, with information about the outcome of the research project or programme (often closely linked to "Publications");
  • Publications: the documentation in any format of the research results. As stated in the EuroCRIS guidelines, "publications" may include articles in journals as well as the deliverables from the projects. However, the CERIF guidelines are more restrictive in this area.

The CERIF Revision Working Group discussed the limits on contents and structure guidelines for CRIS.

It was agreed that overlap has to be avoided with research documentation systems such as publication databases or patent information systems. References to publications and patents could be included if relevant to a specific project, researcher or result. These references should go no further than, for instance, the information needed to define what kind of publication it is and where to find it.

The issues of recommended versus kernel fields, together with the related attributes, are tackled in detail in the CERIF Data Model (Cf. chapter 5).

Reference material on user requirements

  • "User Needs for Research Information", paper presented by Lieve Van Woensel, CORDIS, DG XIII-D.4, European Commission, during the CRIS 98 Conference "CRIS – The way to innovation", 12-14 March 1998, Luxembourg. /cybercafe/src/vanwoensel.htm

  • "Code of Good Practice for Current Research Information Systems", a EuroCRIS report, January 1998, resulting from an initiative of the European Commission DG XIII (CORDIS) and the European Platform for Current Research Database Producers (EuroCRIS). Further information on the EuroCRIS platform can be found on the internet at http://www.fou.uib.no/cris/index.htm

  • "Towards harmonisation of databases on research in progress - Final report of the European Working Group on Research Databases", November 1988. Published by the Liaison Committee of Rectors' Conferences of Member States of the European Communities and Directorate General for Science, Research and Development of the Commission of the European Communities; financed by the Commission of the EC, contract PSS*0058/B, compiled by Dr. L. Van Woensel.

  • "The proof of the pudding is in the eating - Ways and means to secure the participation of researchers when documenting R&D at institutional level.", Jostein H. Hauge, University of Bergen Norway, March 1998, /cybercafe/src/hauge.htm

  • Internal documents on marketing strategy for CORDIS, the Community R&D Information Service (1999).

4 Objectives of CERIF 2000, today and in the future

4.1 Analysis by CERIF Revision Working Group
4.2 Scope of research information covered
4.3 Flexible approach of CERIF 2000 data model
4.4 Subject indexing schemes and automatic indexing technologies
4.5 Multi-linguality - cross-language information retrieval
4.6 CERIF implementation in metadata environment

4.1 Analysis by CERIF Revision Working Group

The CERIF Revision Working Group has experience related to broad exchange of information, know-how and CRIS.

The following issues were addressed:

  • User groups targeted and their needs;
  • Types of research information to be considered;
  • Compatibility with existing CRIS;
  • Relevant data content per type of research information;
  • Web & metadata technologies;
  • Indexing technologies and classification tools;
  • Technologies and tools for subject indexing.

On the basis of this analysis, the Working Group made a number of decisions, in particular regarding:

  • Scope of research information covered;
  • A flexible approach for the CERIF 2000 data model to cover both existing CRIS and new "Ideal" CRIS;
  • Automatic and traditional subject indexing;
  • A multi-lingual approach;
  • A CERIF implementation in metadata environment.

These subjects are dealt with in the following sections.

4.2 Scope of research information covered

CERIF 2000 will extend the original CERIF scope of "projects" to cover the following research information types:

  • Organisations;
  • Persons;
  • Products, patents and publications and other "results" of research projects;
  • Expertise;
  • Equipment and facilities.

For the selection of these main types of research information, the CERIF 2000 data model also defines more specifically the data entities and their attributes.

4.3 Flexible approach of CERIF 2000 data model

CERIF 2000 must not only define the "ideal" for new CERIF-conformant CRIS but must also accommodate existing CRIS. The major design objectives for the CERIF 2000 data model are therefore to provide:

  1. A full CRIS data model with flexibility to allow the majority of existing CRIS to accommodate their own database structures;
  2. A basic framework for data exchange.

The approach therefore in meeting the design objectives is threefold:

  1. To define a full CRIS data model which will cover the database structures of the majority of existing CRIS;
  2. To define a set of data models which could provide examples for data exchange (since there is an infinity of possible exchange data models between CRIS). These examples of data models also illustrate that it is not necessary to implement the full CRIS data model if the requirement is only for a particular subset;
  3. To define a metadata data model to provide a uniform summary-level view over heterogeneous information sources.

These data models are described in chapter 5 and an explanation of metadata is given below in section 4.6.
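
The relationship between the full model and an exchange model can be pictured as a projection onto a subset of fields. The sketch below is illustrative only: the field names and the two exchange models are hypothetical and do not reproduce the models described in chapter 5.

```python
# Hypothetical fields of a full project record
FULL_MODEL_FIELDS = {
    "project_id", "title", "abstract", "start_date",
    "end_date", "funding", "organisation", "contact",
}

# Hypothetical exchange models, each a subset of the full model
EXCHANGE_MODELS = {
    "project_summary": {"project_id", "title", "organisation"},
    "project_contacts": {"project_id", "contact"},
}


def to_exchange(record, model_name):
    """Project a full record onto the fields of one exchange model."""
    fields = EXCHANGE_MODELS[model_name]
    unknown = fields - FULL_MODEL_FIELDS
    if unknown:
        raise ValueError(f"exchange model uses unknown fields: {sorted(unknown)}")
    return {name: record[name] for name in sorted(fields) if name in record}


record = {
    "project_id": "P-1",
    "title": "Example project",
    "abstract": "Longer description held only by the source CRIS.",
    "organisation": "Example Institute",
    "contact": "info@example.org",
}
summary = to_exchange(record, "project_summary")
```

A CRIS that only needs to exchange summaries would implement just the fields in `project_summary`, not the full model.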

4.4 Subject indexing schemes and automatic indexing technologies

By automatic indexing techniques we refer to software programs which "automatically" provide subject access points. Traditional subject indexing, in contrast, uses a controlled vocabulary.

Conformity in subject indexing and systematic application of controlled vocabulary is always beneficial. The indexing of records using a predefined set of terms as found in a controlled vocabulary is helpful at the single-database level, but is essential in the case of access to multiple data sets from multiple sources in a multilingual environment.
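
The contrast between the two kinds of indexing can be sketched in a few lines. The sketch assumes a very simple form of automatic indexing (most frequent words after removing stop words) and a tiny hypothetical controlled vocabulary that maps variant words to one preferred term; real systems on either side are considerably more elaborate.

```python
import re
from collections import Counter

STOP_WORDS = {"the", "of", "and", "in", "a", "for", "on", "to"}

# Hypothetical controlled vocabulary: variant word -> preferred term
CONTROLLED = {
    "tumour": "neoplasms",
    "tumor": "neoplasms",
    "cancer": "neoplasms",
    "neoplasm": "neoplasms",
}


def tokens(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def automatic_index(text, top_n=3):
    """Automatic indexing (assumed here): the most frequent non-stop words."""
    counts = Counter(t for t in tokens(text) if t not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]


def controlled_index(text):
    """Controlled indexing: only preferred terms from the vocabulary."""
    return sorted({CONTROLLED[t] for t in tokens(text) if t in CONTROLLED})


auto_a = automatic_index("Tumour growth and tumour therapy", top_n=1)
ctrl_a = controlled_index("Tumour growth in mice")
ctrl_b = controlled_index("Cancer therapy")
```

The two titles share no word, so automatic indexing gives them different access points, whereas the controlled vocabulary assigns both the same preferred term. This is the property that matters when records come from several sources.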

The advantages of introducing guidelines for harmonised terminology are that:

  • They improve performance in retrieval;
  • Multi-lingual terminology is a welcome assistance for users in a multilingual environment;
  • The controlled terms from various terminologies can be used not only as "access points" in databases, but also in metadata related to contents of records, applied at various levels ("catalogue" type data set, web interface, …).

A characteristic of controlled vocabulary is that regular updating has to be ensured. In addition, human intervention is required and often leads to retrospective subject indexing on existing data. Automatic indexing then looks attractive but as indicated above is no substitute for traditional indexing and systematic application of uniform controlled vocabularies when multi-lingual, multiple data sets from multiple sources are introduced.

The CERIF 2000 guidelines concerning classification will therefore cover:

  • Major research related subject areas (R&D domain, potential market application, developed products);
  • Items which can be identified in the light of "controlled value lists".

These guidelines are defined in chapter 6 below.

A future alternative for the retrieval problem may be the utilisation of emerging technologies, such as data mining or knowledge-based systems. At the time this report was drafted, these were found not to be mature enough. The Working Group decided to follow the evolution of these techniques in the future.

A description of both Knowledge Based Systems and Data mining is given below in the following subsections.

4.4.1 Knowledge Based Systems

KBS (knowledge-based systems) technology and intelligent agents interacting with domain ontology are of particular interest. Domain ontology is like a thesaurus but includes semantic relationships between terms that are not present in traditional thesauri. Traditional thesauri restrict themselves basically to synonyms and hierarchical higher and lower terms; ontologies also can contain other relations, e.g. causal and temporal relations.

By using these technologies, a search can be broadened or narrowed automatically to retrieve relevant records according to some user-set threshold. Intelligent agents can assist at the user interface (both for retrieval and for input / update). Thus the addition of intelligent agents can foster domain ontology dynamics, therefore overcoming some of the problems faced with systems and thesaurus maintenance. There are 'frontier science' techniques for automated production of thesauri and ontologies but for the time being manual intervention and checking is required – a sort of 'editor' function – but it can be assisted by the agents to minimise the input and administrative tasks. More importantly, the domain ontology has more uses within the system than a simple thesaurus and so the return on investment may be greater.
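
Broadening or narrowing a search through an ontology can be sketched as a bounded walk along named relations, with the number of steps acting as the user-set threshold. The ontology content, the relation names and the function below are hypothetical and serve only to illustrate the mechanism.

```python
# Hypothetical domain ontology: term -> relation name -> related terms.
# "narrower"/"broader" are the classic thesaurus relations; "causes" is an
# example of an additional semantic relation.
ONTOLOGY = {
    "pollution": {
        "narrower": ["air pollution", "water pollution"],
        "causes": ["respiratory disease"],
    },
    "air pollution": {"narrower": ["smog"], "broader": ["pollution"]},
    "water pollution": {"broader": ["pollution"]},
    "smog": {"broader": ["air pollution"]},
    "respiratory disease": {},
}


def expand(term, relations, max_depth):
    """Collect terms reachable from `term` along `relations` within max_depth steps."""
    seen = {term}
    frontier = [term]
    for _ in range(max_depth):
        next_frontier = []
        for current in frontier:
            for relation in relations:
                for neighbour in ONTOLOGY.get(current, {}).get(relation, []):
                    if neighbour not in seen:
                        seen.add(neighbour)
                        next_frontier.append(neighbour)
        frontier = next_frontier
    return seen


narrowed = expand("pollution", ["narrower"], max_depth=2)
broadened = expand("smog", ["broader"], max_depth=2)
with_causes = expand("pollution", ["narrower", "causes"], max_depth=1)
```

Following the "causes" relation retrieves a term that a thesaurus limited to synonyms and hierarchy would not reach.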

This technology will be followed with a view to its eventual integration into the recommendation when it is mature and cost-effective for the purposes of CERIF.

4.4.2 Data Mining

A second alternative to be watched regarding future use in CERIF is data mining. Data mining is the analysis of data and the use of software techniques for finding patterns and regularities in sets of data. The computer builds up the patterns by identifying the underlying rules and features in the data. It is possible to 'strike gold', as the data mining software extracts patterns not previously discernible or so obvious that no one has noticed them before. Data mining as such might become a useful tool to develop automatically generated indexing (terminology, metadata) based upon a first period of gathering terms and notations from controlled vocabularies in CERIF-compatible data collections.

Data mining technology is the result of a long process of research and product development. Data mining gives access to prospective and proactive information delivery, i.e. oriented towards forecasting based upon collected information. However, data mining is not magic. Buying a neural net program and interfacing it with a terabyte data warehouse is not likely to produce any useful results. It will take a very long time to get an answer, and that answer will probably be worthless. A successful data mining application will combine data mining techniques with a thorough understanding of business problems and present the results in a way that is easy for the user to understand.
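
As a very small illustration of "finding patterns and regularities", the sketch below counts which index terms occur together across records and keeps the pairs that reach a minimum support. This is one elementary pattern-finding technique chosen for illustration, with hypothetical data; it is not a method proposed by the Working Group.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(records, min_support):
    """Return the term pairs that co-occur in at least min_support records."""
    counts = Counter()
    for terms in records:
        # Sort and de-duplicate so each pair is counted once per record
        for pair in combinations(sorted(set(terms)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical index terms gathered from three records
records = [
    ["laser", "optics"],
    ["laser", "optics", "sensor"],
    ["sensor", "laser"],
]
pairs = frequent_pairs(records, min_support=2)
```

Pairs found in this way could suggest related terms to an indexer, who would still decide whether the association is meaningful.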

4.5 Multi-linguality - cross-language information retrieval

Multilingual query support is a major issue in current and future European information systems. A strong need exists to find effective ways of handling cross-language information retrieval, that is, the ability to issue a query in one language and receive information in another. This distinguishes cross-language information retrieval from monolingual information retrieval.

Researchers in this field have investigated issues that are of vital importance to cross-language information retrieval systems, such as segmentation, morphology, stop-word lists, and user interface localisation. The European Commission has strongly supported, and still supports, several studies and projects (see list below).

Examples of projects in Europe:

  • CANAL/LS: a European Commission Telematics for Libraries project that uses query translation to allow searches of a multilingual library catalogue in English, French and German.

  • CRISTAL: a European Commission Language Engineering project working on an information retrieval system that searches a corpus of French documents using queries in English, French or Italian.

  • EMIR: the European Multilingual Information Retrieval Project designed to handle cross-linguistic queries in English, French and German. Now completed, EMIR was part of the European Commission's ESPRIT programme. EMIR's SPIRIT text retrieval system was extended to Russian.

  • MULINEX: a European Commission Language Engineering project that developed tools for cross-language text retrieval on the World Wide Web.

  • SPARKLE: a European Commission Language Engineering project evaluating the utility of parsing for cross-language information retrieval.

  • TRANSLIB: a European Commission Telematics for Libraries project working on Greek, Spanish and English access to OPAC (library Online Public Access Catalogue).

  • TwentyOne: a European Commission Information Engineering project working on English, German, Dutch and French access to multimedia documents.

  • University of Geneva: a paper on the use of deep natural language processing for cross-linguistic retrieval of medical texts by Anne-Marie Rassinoux, prepared for the HELIOS project of the European Commission.

Two main approaches exist to access the content of information systems.

The first approach allows the user to formulate queries by drawing search terms from a controlled vocabulary (pre-defined vocabulary, thesaurus or classification system).
The second approach allows the user to formulate queries in a free-text style, known as free-text search. In the latter the user is allowed to express queries in terms of his/her choice. The entire body of documents is consequently searched through for matching terms.

Although the use of controlled vocabulary assists the user in formulating more precise queries and therefore significantly reduces retrieval noise (e.g. spelling differences), this approach may cause restrictions in expression when the selected controlled vocabulary is not complete or precise enough. For this reason, most retrieval systems offer the possibility of free-text searching, often additional to the use of controlled vocabulary. Simultaneous use of both options is commonly found in commercial text retrieval systems.

In both approaches, a user in a multilingual environment needs to be able to express a query in one language and to retrieve documents indexed in other languages as well. In the controlled vocabulary approach, cross-language querying can be supported by the maintenance of look-up tables in which all terms, together with their translations, are kept.

Cross-language free-text systems need a translation of the submitted terms into all languages of interest. The user introduces the search terms in a given language, after which a multilingual dictionary is consulted for all other languages available, in order to translate the search terms into equivalent terms.
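
The look-up table mechanism can be sketched as follows: the query term is replaced by all of its language variants before matching. The table contents, document layout and function names are hypothetical; a free-text system would use a multilingual dictionary in the same position as the table.

```python
# Hypothetical look-up table: one row per term, with its translations
TERM_TABLE = [
    {"en": "water", "fr": "eau", "de": "Wasser"},
    {"en": "energy", "fr": "énergie", "de": "Energie"},
]


def translate_term(term, source_lang):
    """Return every language variant of the term, or the term alone if unknown."""
    for row in TERM_TABLE:
        if row.get(source_lang, "").lower() == term.lower():
            return set(row.values())
    return {term}


def search(query_term, source_lang, documents):
    """Return ids of documents indexed with any language variant of the query term."""
    variants = {v.lower() for v in translate_term(query_term, source_lang)}
    return [
        doc["id"] for doc in documents
        if variants & {t.lower() for t in doc["index_terms"]}
    ]


# Hypothetical documents indexed in different languages
documents = [
    {"id": 1, "index_terms": ["eau"]},
    {"id": 2, "index_terms": ["Energie"]},
    {"id": 3, "index_terms": ["Wasser", "Energie"]},
]
hits = search("water", "en", documents)
```

An English query for "water" thus retrieves the records indexed in French and German as well.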

For CERIF 2000, the data model makes provision for multi-linguality throughout its structure (see chapter 5).

4.6 CERIF implementation in metadata environment

Metadata is 'data about data'. As such, this concept is not new and has been in use whenever information objects need to be organised, retrieved or exchanged. A library manual catalogue index card system to organise and aid searching for books is thus a metadata system. Catalogue information stored to describe the available goods in a retail store, the available parts in an engineering warehouse, the contents of a museum or the available holidays offered by a travel company can all be considered as metadata.

There are three major kinds of metadata:

  1. schema metadata (which controls database structure),
  2. navigational metadata (which provides access to the data) and,
  3. associational (labelled) metadata (which provides additional and useful information about the data), subdivided into:
    1. descriptive (describing the data),
    2. restrictive (controlling access to the data) and
    3. conditional (terms under which the data may be used / accessed).

Metadata allows for intelligently-assisted querying, online help and intelligent interpretation of results. It helps in quality control of input data. Further, it allows systems to exchange information or participate in global queries over heterogeneous distributed systems. Metadata can advertise information.
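
The three kinds of metadata, and two of the uses mentioned above (quality control of input and controlled access), can be sketched with a small data structure. All names, the type table and the example values are hypothetical; the sketch only illustrates how the categories might be kept apart.

```python
from dataclasses import dataclass, field


@dataclass
class Associational:
    descriptive: dict = field(default_factory=dict)  # describes the data
    restrictive: set = field(default_factory=set)    # roles allowed to access it
    conditional: str = ""                            # terms of use


@dataclass
class Metadata:
    schema: dict                  # field name -> type name (controls structure)
    navigational: str             # where the data can be reached
    associational: Associational


# Hypothetical mapping from type names used in the schema to Python types
TYPES = {"str": str, "int": int}


def validate(record, meta):
    """Quality control of input: check a record against the schema metadata."""
    problems = []
    for name, type_name in meta.schema.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], TYPES[type_name]):
            problems.append(f"wrong type for {name}")
    return problems


def can_access(meta, role):
    """Restrictive metadata: an empty set of roles means unrestricted access."""
    allowed = meta.associational.restrictive
    return not allowed or role in allowed


meta = Metadata(
    schema={"title": "str", "year": "int"},
    navigational="http://example.org/cris/projects",
    associational=Associational(
        descriptive={"subject": "energy"},
        restrictive={"staff"},
        conditional="Hypothetical terms: non-commercial use only",
    ),
)
problems = validate({"title": "Project X", "year": "1999"}, meta)
```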

Data Dictionaries, leading to the IRDS7 standard, were an early attempt at a metadata standard by the database community. There are many metadata standards in use, commonly for exchange, in the scientific and engineering communities – e.g. the STEP data and EXPRESS language specifications.

The universal access provided by WWW has highlighted the need for metadata on the web. The richness of information sources on the web, together with its easy access, has given users a taste of what can be achieved. However, this has also led to frustration when users look at the results of generalised searches and realise they have further work to do to get to the core of what they want. The W3C consortium has developed a standard named RDF (Resource Description Framework) which utilises XML (Extensible Mark-up Language). These are generalised standards that are adapted and extended to cover particular domains. For instance, the Dublin Core and Warwick Extensions are attempts by the library community to define associational (labelled) descriptive metadata for content, analogous to a library catalogue card. However, no consensus has yet been reached and experiments are ongoing.
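
To give an impression of what catalogue-card style descriptive metadata looks like when expressed in RDF and XML, the sketch below builds a small description using Python's standard `xml.etree.ElementTree` module. It assumes the RDF and Dublin Core element namespaces as they are published today, which may differ from the drafts under discussion when this report was written; the resource URL and the values are hypothetical, and the choice of elements is not a CERIF recommendation.

```python
import xml.etree.ElementTree as ET

# Namespaces as currently published (an assumption relative to this report)
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)


def describe(resource_url, title, creator, subjects):
    """Build an RDF/XML description of one resource using Dublin Core elements."""
    root = ET.Element(f"{{{RDF}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": resource_url}
    )
    ET.SubElement(description, f"{{{DC}}}title").text = title
    ET.SubElement(description, f"{{{DC}}}creator").text = creator
    for subject in subjects:
        ET.SubElement(description, f"{{{DC}}}subject").text = subject
    return ET.tostring(root, encoding="unicode")


# Hypothetical project record
xml_text = describe(
    "http://example.org/cris/project/1",
    "Example project",
    "Example Institute",
    ["energy", "materials"],
)
```

Such a description carries only summary-level fields, which is why a generalised model of this kind cannot replace the richer structure of a CRIS record.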

In the research information domain, the original CERIF (1991) was used as an exchange data model but equally could have been considered to be a metadata model. The application of such models has led to a number of pilot projects (the Swedish SAFARI8 project and the European Commission ERGO project). Although these projects are ongoing, a number of conclusions are beginning to emerge as to how the metadata models should be refined and extended:

  1. The metadata models should be formally defined with precise definition of syntax and semantics. The relatively "informal" manner in which some of the early work was defined means that implementations are open to widely differing interpretations with consequent lack of homogeneity and reduced ability for interoperation.

  2. In systems such as CERIF, the metadata model and the data model of the CERIF data itself should not be regarded as mutually exclusive. CRIS have much greater structure with precise data models and semantics. Generalised metadata models developed for less structured entities such as web pages will lose much of the richness of the content of the original CRIS source.

This is why CERIF is now not only defining a metadata/catalogue model but also the "Ideal CRIS" from which it is derived.

Reference material on the Objectives of CERIF 2000, today and in the future

  • NSF/DARPA/NASA Research Projects on Digital Libraries - National Science Foundation (NSF), Defense Advanced Research Projects Agency (DARPA), National Aeronautics and Space Administration (NASA).

  • Post-conference Research Workshop on Cross-linguistic Information Retrieval, 22.8.1996, Zürich, Switzerland - Special Interest Group on Information Retrieval (SIGIR).

  • The 1997 Spring Symposium : report - American Association for Artificial Intelligence.

  • Metadata: "The Future of CRIS", Paper from Keith G Jeffery, CLRC Rutherford Appleton Laboratory, presented during the CRIS98 Conference, 12-14 March 1998, Luxembourg. /cybercafe/src/jeffery.htm
