The major design objectives for the CERIF 2000 data model were:
These objectives have been met by:
Subject indexing is essential in the case of access to multiple data sets from multiple sources and in multiple languages. This is the case for the CERIF 2000 environment. Specific thesauri and classifications have been recommended for the areas of research subject (Ortelius), economic activity (NACE) and products (CPA). Other indexing guidelines have been given for controlled value lists for specific attributes.

7.2.1 Ortelius Thesaurus

The Ortelius Consortium made the following proposal to the Commission for the use of the Ortelius Thesaurus as the main R&D subject indexing scheme for CERIF 2000. "The Ortelius Consortium agrees to undertake the following actions:
Contact person: Fiora Imberciadori
7.3 Recommendations for the use of CERIF 2000 guidelines

The following recommendations are made for the use of the CERIF 2000 findings and deliverables:
CERIF 2000 is intended to be "implementation independent". A discussion on implementation scenarios is included here to help the reader to visualise how it could be put into practical use. Two extreme scenarios can be considered:
7.4.2 Local CERIF implementation

Research information providers can implement CERIF 2000 flexibly to meet their own needs. They should use appropriate components from the full CRIS Data Model and should provide utility software to export schema and data instances, with structure and content at least as rich as the Exchange Data model.

7.4.3 Integrated implementation
To provide a single access point giving the CRIS user access to distributed research information sources, one could consider a web implementation of the Catalogue Metadata model. The metadata and the data may reside in many different kinds of system. For general, unstructured web resources, it is becoming increasingly common for metadata to be held in RDF/XML pointing to data in HTML. However, most of the data on the web is held in databases and the HTML is ephemeral. The metadata is also stored in databases, in the form of data elements that could be converted to XML if required; relational systems are particularly suitable for this. This means the metadata and data should be modelled correctly and then implemented in the most suitable systems environment.

The relative strengths and weaknesses of the different types of implementation environment (Relational, Object Oriented (OO), Information Retrieval (IR) and RDF/XML) are discussed in the following section. The practicalities of metadata extraction are then reviewed, followed by a discussion of how the model might be implemented in ERGO, in RDF/XML and in an IR environment.
From the above table it is clear that relational systems have advantages, whether handling data or metadata, except for large blocks of free text.

7.4.3.2 How to get the metadata?
Schema metadata has to be constructed by the database administrator, although in advanced heterogeneous systems new schemas can be generated by schema reconciliation. A similar situation applies to metadata for security, access or charging purposes. Content metadata is different: its instances are subsets of the real data, instance by instance, and together they form a catalogue. Content metadata can be generated by projection from the data instances, but this requires the content metadata to be a proper subset of the export or exchange schema (what the database system is willing to expose to the outside world). This is the basis for the three layer model of CERIF 2000.

ERGO has already implemented a metadata-like catalogue for several European research databases. It is recommended that ERGO should consider a project to provide a single access point for CRIS users to reach CERIF 2000 compliant research information systems. A scenario for implementing an "ERGO 2" architecture could be as follows:

(a) A web query form accessing a content metadata database (like the ERGO pilot). There is an advantage in using RDBMS technology to allow realised XML metadata pages to be generated;

(b) A multimedia web document (XML or HTML) with query answers, sent back to the end-user over the web;

(c) Connection between the metadata system and multiple heterogeneous systems with:
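The projection of content metadata from data instances, and its rendering as XML on request, can be illustrated with a minimal sketch. The attribute names, the `EXCHANGE_SCHEMA` and `CATALOGUE_SCHEMA` sets and both function names are assumptions made purely for illustration; they are not taken from the CERIF 2000 models.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute sets; the real CERIF 2000 models define their own.
EXCHANGE_SCHEMA = {"project_id", "title", "abstract",
                   "start_date", "end_date", "funding_programme"}
CATALOGUE_SCHEMA = {"project_id", "title", "start_date"}


def project_to_catalogue(record,
                         catalogue_schema=CATALOGUE_SCHEMA,
                         exchange_schema=EXCHANGE_SCHEMA):
    """Project one exchange record onto the catalogue attributes."""
    # "<" on sets tests for a proper subset, the condition stated in the text.
    if not catalogue_schema < exchange_schema:
        raise ValueError(
            "catalogue schema must be a proper subset of the exchange schema")
    return {name: record[name]
            for name in sorted(catalogue_schema) if name in record}


def to_xml(entry, tag="project"):
    """Render one catalogue entry as a flat XML element."""
    root = ET.Element(tag)
    for name, value in entry.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


record = {
    "project_id": "P-001",
    "title": "Example project",
    "abstract": "A long free-text description.",
    "start_date": "1999-01-01",
    "end_date": "2000-12-31",
    "funding_programme": "Example programme",
}
entry = project_to_catalogue(record)
xml_page = to_xml(entry)
```

In this sketch the catalogue entry is kept as a plain record and serialised to XML only when a page is needed, which mirrors the idea of storing metadata as database records and generating XML on demand.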
Exactly as referred under 7.3.3.3, but the metadata instances are generated as XML from metadata database records when seen by the end-user. For retrieval speed they are stored and accessed as database records. One could imagine authored RDF/XML metadata and authored HTML pages, one per project or other entity, but this would take a lot of time and effort.

7.4.3.5 Information Retrieval Systems for metadata and/or data

Information Retrieval (IR) systems can be used to handle either data or metadata, especially when free text retrieval over long attributes is important and advanced Boolean query capability is needed. However, IR systems cannot handle complexly structured data and are very inflexible to structural change.
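The kind of Boolean free-text retrieval referred to above can be sketched with a small inverted index. This is a generic illustration only, not a description of any particular IR product; the sample records and the `tokenize`, `build_index` and `search` names are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split free text into a set of lower-case terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(documents):
    """Map each term to the set of record identifiers containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, all_of=(), any_of=(), none_of=()):
    """Boolean query: AND over all_of, OR over any_of, NOT over none_of."""
    result = set().union(*index.values()) if index else set()
    for term in all_of:
        result &= index.get(term.lower(), set())
    if any_of:
        result &= set().union(
            *(index.get(term.lower(), set()) for term in any_of))
    for term in none_of:
        result -= index.get(term.lower(), set())
    return result


# Hypothetical free-text attributes, keyed by record identifier.
documents = {
    1: "Thesaurus for multilingual research subject indexing",
    2: "Relational database for research projects",
    3: "Multilingual database of economic activity",
}
index = build_index(documents)
hits = search(index, all_of=["multilingual"], none_of=["database"])
```

The index knows only which terms occur in which record, so it answers free-text queries quickly but carries none of the structure between entities, which is consistent with the limitation noted above.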