Development of tools to support the rapid distribution of protein sequence data within Europe.
New methods for data exchange and distribution of protein sequences have been explored. The concept of data distribution over wide area networks has been investigated. A new method for protein family classification has been applied. An advanced sequence data description language is under development.
The Protein Sequence Databank at the Max-Planck-Institute for Biochemistry is responsible for the collection and distribution of protein sequence data within Europe. Approximately 10000 new sequences are collected by PIR-International annually. The goal of the closely cooperating databases in Europe, the US and Japan is to provide high-quality, up-to-date data to a user community estimated at more than 50000 scientists in the life sciences, bioindustry and, more recently, the patent business.
Distribution through hard media such as magnetic tapes or CD-ROMs is too slow to satisfy the need for access to the latest data. Online access to molecular sequences and sequence data analysis software is more appropriate, but a single node in Europe is not sufficient to provide satisfactory services.
Based on the European Bioinformatics Network, a distribution scheme is being developed that forwards transactions to national nodes, each of which applies them directly to an indexed database. Implementing this scheme will allow frequent updates, including corrections, and will improve the European infrastructure in bioinformatics.
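
The forwarding of transactions to national nodes can be illustrated with a minimal Python sketch. It is not the project's implementation: the transaction format, the class and function names (Transaction, NationalNode, forward), the serial-number bookkeeping and the example entries are all assumptions made for illustration. Each node keeps its own index keyed by accession and applies additions, corrections and deletions in serial order, so that re-sending a batch does no harm.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transaction:
    """One update to the sequence database (hypothetical format)."""
    serial: int          # position in the central transaction log
    action: str          # "add", "correct" or "delete"
    accession: str       # entry identifier, used as the index key
    sequence: str = ""   # residue string; left empty for deletions


@dataclass
class NationalNode:
    """A national node holding its own indexed copy of the database."""
    name: str
    index: dict = field(default_factory=dict)   # accession -> sequence
    last_serial: int = 0                        # last transaction applied

    def apply(self, tx: Transaction) -> bool:
        """Apply one transaction directly to the local index.

        Returns False if the transaction was already applied, so a
        batch can safely be sent more than once.
        """
        if tx.serial <= self.last_serial:
            return False
        if tx.action in ("add", "correct"):
            self.index[tx.accession] = tx.sequence
        elif tx.action == "delete":
            self.index.pop(tx.accession, None)
        else:
            raise ValueError(f"unknown action: {tx.action}")
        self.last_serial = tx.serial
        return True


def forward(log, nodes):
    """Forward the logged transactions, in serial order, to every node.

    Returns the number of transactions newly applied at each node.
    """
    ordered = sorted(log, key=lambda tx: tx.serial)
    return {node.name: sum(node.apply(tx) for tx in ordered) for node in nodes}


# Invented example data: two additions, one correction, one deletion.
example_log = [
    Transaction(1, "add", "A00001", "MKTAYIAK"),
    Transaction(2, "add", "A00002", "GSHMLEDP"),
    Transaction(3, "correct", "A00001", "MKTAYIAKQR"),
    Transaction(4, "delete", "A00002"),
]
example_nodes = [NationalNode("node-a"), NationalNode("node-b")]
print(forward(example_log, example_nodes))
```

Because only the transactions travel over the network, a correction reaches every node as a small update rather than as a new release of the whole database. The serial check is one simple way to make repeated delivery safe; a real scheme would also have to deal with lost or reordered messages.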