Pan-genome Graph Algorithms and Data Integration

Project description

Graph-based representation of the genome sequence data

Modern sequencing technology produces genome sequence data on a massive scale, now reaching into the exabyte range. The urgent question is how such volumes of data can be arranged and analysed in a computationally efficient and biomedically meaningful way. This EU-funded project will explore graph-based representations of large genome datasets and determine their advantages over traditional sequence-based representations of pan-genomic data. Evolutionarily close genomes differ only slightly, so a graph-based pan-genomic representation makes it possible to remove redundancies while highlighting important differences. The research will demonstrate the advantages of this shift in data representation, focusing on the comparative analysis, compression, integration and exploitation of genome data.

Objective

Genomes are strings over the letters A, C, G and T, which represent nucleotides, the building blocks of DNA. Ever more numerous and rapidly advancing genome sequencing devices produce ultra-large amounts of sequence data; the accrued volume is now reaching the exabyte scale. The driving, urgent question is: how can we arrange and analyze these data masses in a formally rigorous, computationally efficient and biomedically rewarding manner?
Graph-based data structures have been pointed out to have disruptive benefits over traditional sequence-based structures when representing pan-genomes, that is, sufficiently large, evolutionarily coherent collections of genomes. This idea has its immediate justification in the laws of genetics: evolutionarily closely related genomes differ in relatively few letters while sharing the majority of their sequence content. Graph-based pan-genome representations, which make it possible to remove redundancies without discarding individual differences, therefore make eminent sense. In this project, we will put this paradigm shift, from sequence-based to graph-based representations of genomes, into full effect. As a result, we can expect a wealth of practically relevant advantages, the most fundamental of which concern the arrangement, analysis, compression, integration and exploitation of genome data. In addition, we will open up a significant source of inspiration for computer science itself.
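
The idea of storing shared sequence once while keeping individual differences can be illustrated with a toy example. The sketch below is not the project's method; it assumes a simple de Bruijn-style graph in which nodes are distinct k-mers and each genome is kept as a path through the graph. The function names, the value of k and the three example genomes are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_kmer_graph(genomes, k=4):
    """Build a toy de Bruijn-style graph from named sequences.

    Nodes are distinct k-mers, edges link k-mers that are consecutive
    in at least one genome, and each genome is kept as a path of k-mers.
    """
    nodes = set()
    edges = defaultdict(set)  # k-mer -> set of successor k-mers
    paths = {}                # genome name -> ordered list of k-mers
    for name, seq in genomes.items():
        kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
        paths[name] = kmers
        nodes.update(kmers)
        for a, b in zip(kmers, kmers[1:]):
            edges[a].add(b)
    return nodes, edges, paths


def spell(kmers):
    """Reconstruct a sequence from its path of overlapping k-mers."""
    return kmers[0] + "".join(km[-1] for km in kmers[1:])


# Hypothetical, closely related genomes: g2 differs from g1 in one letter.
genomes = {
    "g1": "ACGTTGCAAGCT",
    "g2": "ACGTTGCTAGCT",
    "g3": "ACGTTGCAAGCT",
}

nodes, edges, paths = build_kmer_graph(genomes, k=4)
total_kmers = sum(len(p) for p in paths.values())
distinct_kmers = len(nodes)
print(f"k-mers across all genomes: {total_kmers}, distinct nodes: {distinct_kmers}")

# Every genome is still recoverable from its path, so no difference is lost.
for name, seq in genomes.items():
    assert spell(paths[name]) == seq
```

In this toy setting the shared k-mers appear only once as nodes, while the single-letter difference in g2 shows up as a short alternative branch. Practical pan-genome graphs use more elaborate models and indexes.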

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.

Keywords

Project’s keywords as indicated by the project coordinator. Not to be confused with the EuroSciVoc taxonomy (Fields of science)

Coordinator

UNIVERSITA' DEGLI STUDI DI MILANO-BICOCCA

Net EU contribution

€ 290 076,00

Address

PIAZZA DELL'ATENEO NUOVO 1
20126 Milano
Italy

Region

Nord-Ovest Lombardia Milano

Activity type

Higher or Secondary Education Establishments

Total cost

€ 290 076,00

Participants (8)

STICHTING NEDERLANDSE WETENSCHAPPELIJK ONDERZOEK INSTITUTEN

Netherlands

Net EU contribution

€ 55 982,00

UNIVERSITAET BIELEFELD

Germany

Net EU contribution

€ 250 378,00

UNIVERZITA KOMENSKEHO V BRATISLAVE

Slovakia

Net EU contribution

€ 148 120,00

GENETON S.R.O.

Slovakia

Net EU contribution

€ 251 160,00

Address

ILKOVICOVA 8
841 04 Bratislava

SME

Yes

Region

Slovensko Bratislavský kraj Bratislavský kraj

Activity type

Private for-profit entities (excluding Higher or Secondary Education Establishments)

Total cost

€ 251 160,00

INSTITUT PASTEUR

France

Net EU contribution

€ 101 660,00

ILLUMINA CAMBRIDGE LIMITED

United Kingdom

Net EU contribution

€ 0,00

UNIVERSITA DI PISA

Italy

Net EU contribution

€ 43 424,00

Masarykova univerzita

Czechia

Net EU contribution

€ 0,00

Partners (6)

Partner

CORNELL UNIVERSITY

United States

Net EU contribution

€ 0,00

Partner

NATIONAL UNIVERSITY CORPORATION THE UNIVERSITY OF TOKYO

Japan

Net EU contribution

€ 0,00

Partner

Simon Fraser University

Canada

Net EU contribution

€ 0,00

Partner

THE PENNSYLVANIA STATE UNIVERSITY

United States

Net EU contribution

€ 0,00

Partner

THE REGENTS OF THE UNIVERSITY OF CALIFORNIA

United States

Net EU contribution

€ 0,00

Partner

HUNAN UNIVERSITY

China

Net EU contribution

€ 0,00

Programme(s)

Multi-annual funding programmes that define the EU’s priorities for research and innovation.

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Funding Scheme

Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.

Call for proposal

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.
