CORDIS - EU research results
Content archived on 2024-04-19

Provision of EMBL data library services

Objective

The core objective of this project was the provision of the EMBL Nucleotide Sequence Database in collaboration with the appropriate international partners. Related tasks included:
- the provision of protein sequences via the SWISS-PROT database in a way which allowed it to be viewed through the same interface as the nucleotide sequences.
- support (with collaborators) of specialist databases to be used with the Nucleotide Sequence Database.
- collaboration with the EMBnet project.

All of the main objectives were met, during a phase complicated by the move of the entire Data Library operation from EMBL's Heidelberg headquarters to its Outstation on the Wellcome Trust Genome Campus at Hinxton in the UK. In addition to these logistical changes, advances in genome sequencing caused a huge surge in the scale of the task. The original proposal predicted 350 megabases at the end of the project. The total, as can be seen from figure 1, was nearer 700 megabases.
The EMBL Data Library responded to these changes with major technical developments in software and database methodology, and with hardware enhancements. In particular, following the transition to the UK, the Data Library remodelled its team and developed its services to exploit modern network access methods, such as the World Wide Web, both for data acquisition and for data distribution and query.
The Nucleotide Sequence Database has been delivered on schedule. The only disappointments have been problems with the SWISS-PROT pipeline, which caused late delivery of releases, and the postponement of a planned user survey because of changes around the transition to the UK. Although the results of this survey were largely positive, the response rate was so low that it is hard to give much credence to its findings. Aside from the overall growth of the nucleotide collection, it is worth noting that the fraction of the nucleotide data that is human rose from 20% to over 40% (by base pairs) in the course of the project.
The scaling up of operations at the Sanger Centre, the EBI's neighbour on the Genome Campus, has made it the biggest submitter of data in the world, and close cooperation with the Centre resulted in the development of the 'Syncron' system for the automatic inclusion of data from genome sequencing projects into the database.
The EBI continued to work in close cooperation with its global partners in the USA and Japan, exchanging all data and updates via computer networks. Collaborative technical developments have included:
- extension of the accession number format
- an experimental scheme for representing very long sequences
- introduction of cross-references to external databases at the level of sequence features, allowing, among other things, more robust links to SWISS-PROT
- implementation of a common taxonomy developed by NCBI
- simplified procedures for processing and exchanging data from the patent literature.

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.

Programme(s)

Multi-annual funding programmes that define the EU’s priorities for research and innovation.

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Data not available

Call for proposal

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

Data not available

Funding Scheme

Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.

CSC - Cost-sharing contracts

Coordinator

EUROPEAN MOLECULAR BIOLOGY LABORATORY
EU contribution
No data
Address
Wellcome Trust Genome Campus, Hinxton Hall
CB10 1SD SAFFRON WALDEN
United Kingdom

Total cost

The total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.

No data