Genome research has generated unprecedented volumes of data, but the characterisation of DNA and protein sequences has not kept pace with the rate of data acquisition. For anyone trying to learn more about a given sequence, the worldwide collection of abstracts and papers remains the ultimate information source. The goal of the BioMinT project is to develop a generic text mining tool that:
1) interprets diverse types of query;
2) retrieves relevant documents from the biological literature;
3) extracts the required information; and
4) outputs the result as a database slot filler or as a structured report.
The BioMinT tool will thus operate in two modes. As a curator's assistant, it will be validated on SWISS-PROT and PRINTS; as a researcher's assistant, its reports will be submitted to the scrutiny of biologists in academia and industry. The project will be conducted by an interdisciplinary team from biology, computational linguistics, and data/text mining.
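The four steps and the two output modes can be pictured as a simple pipeline. The sketch below is illustrative only and is not the BioMinT implementation: every name in it (`Query`, `interpret`, `retrieve`, `extract`, `render`), the toy query syntax "<field> of <entity>", and the keyword matching used in place of real retrieval and extraction are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Query:
    entity: str  # what is being asked about, e.g. a protein name
    field: str   # the kind of information sought, e.g. "function"


def interpret(raw: str) -> Query:
    """Step 1: parse a query. Assumed toy syntax: '<field> of <entity>'."""
    field, sep, entity = raw.partition(" of ")
    if not sep or not entity.strip():
        raise ValueError("expected a query of the form '<field> of <entity>'")
    return Query(entity=entity.strip(), field=field.strip().lower())


def retrieve(query: Query, corpus: list[str]) -> list[str]:
    """Step 2: keep documents that mention the entity (plain keyword match)."""
    return [doc for doc in corpus if query.entity.lower() in doc.lower()]


def extract(query: Query, docs: list[str]) -> list[str]:
    """Step 3: keep sentences that mention both the entity and the field."""
    hits = []
    for doc in docs:
        for sentence in doc.split(". "):
            lowered = sentence.lower()
            if query.entity.lower() in lowered and query.field in lowered:
                hits.append(sentence.strip().rstrip("."))
    return hits


def render(query: Query, facts: list[str], mode: str):
    """Step 4: emit a database slot filler or a structured report."""
    if mode == "curator":
        return {"entity": query.entity, "slot": query.field, "values": facts}
    if mode == "researcher":
        header = f"Report: {query.field} of {query.entity}"
        return "\n".join([header] + [f"- {fact}" for fact in facts])
    raise ValueError(f"unknown mode: {mode}")


def run(raw: str, corpus: list[str], mode: str):
    """Chain the four steps for one query."""
    query = interpret(raw)
    return render(query, extract(query, retrieve(query, corpus)), mode)
```

Calling `run("function of ABC1", corpus, "curator")` on a small list of abstracts returns a dictionary suitable for filling a database slot, while the `"researcher"` mode returns the same facts as a short text report. The protein name `ABC1` is a made-up placeholder.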
Funding Scheme: CSC - Cost-sharing contracts