The fields of interest in IGNITE are as manifold as the targeted study systems. Four main research areas, however, unite all subprojects:
(1) production of high-quality genome and transcriptome resources for various underrepresented non-model invertebrates
(2) testing and adjusting existing software, and implementing new software, to produce and analyse high-quality genome assemblies, including novel method development for the publication, harvesting, and re-use of biodiversity and genomic data
(3) establishing robust relationships of main animal lineages to provide a reliable backbone of the animal tree of life
(4) identification and exploitation of bioactive compounds with potential for biomedical application
Multiple draft genome assemblies have been generated from diverse taxa, including five sponges (Porifera), one mollusc (Mollusca), one acoel worm (Xenacoelomorpha), one acorn worm (Hemichordata), and one insect (Hemiptera). Additional genomes are currently being sequenced from sponges, molluscs, corals (Cnidaria), arrow worms (Chaetognatha), and wheel animals (Rotifera). In addition, multiple transcriptomes are being sequenced to increase taxon diversity.
To improve the quality of genomes, IGNITE adopted and further developed chromosome conformation capture (Hi-C) protocols for problematic invertebrate tissues. The aim was to provide a Hi-C protocol that works in a broad diversity of invertebrates. The establishment of new lab protocols, as well as the testing and adjusting of Hi-C scaffolding software, helped to enhance the genome assembly quality of very distantly related invertebrate lineages. Besides improving lab protocols and scaffolding, IGNITE is developing new software to fill gaps in scaffolded assemblies and to phase genomes to yield chromosome-level and gap-free nuclear genome assemblies.
To produce robust yet energy-efficient sequence-based phylogenetic analyses, IGNITE has been working on phylogenetic likelihood implementations, with the overall objective of developing efficient open-source bioinformatics tools for analysing large molecular data sets under complex evolutionary models. The specific aim was to implement new algorithms that are more energy-efficient than existing software solutions. Such software tools are routinely used in biological and medical research around the globe to analyse bacterial samples and viral outbreaks, and to study the evolutionary history of life.
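For readers unfamiliar with the term, the phylogenetic likelihood is the probability of the observed sequences given a tree and a substitution model. The following is a minimal illustrative sketch, not the IGNITE library: it assumes the simple Jukes-Cantor (JC69) model and a hypothetical nested-tuple tree encoding chosen only for this example, and it evaluates the likelihood with Felsenstein's pruning recursion. Production kernels use far more elaborate models and heavily optimised code.

```python
import math

BASES = "ACGT"

def jc69_prob(i, j, t):
    """JC69 probability of base i changing to base j over branch length t
    (t in expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e

def partial(node, site):
    """Conditional likelihood vector of the subtree rooted at `node` for one site.
    Hypothetical encoding: a leaf is its sequence (str); an inner node is a
    tuple (left, left_branch_length, right, right_branch_length)."""
    if isinstance(node, str):
        return [1.0 if b == node[site] else 0.0 for b in BASES]
    left, t_left, right, t_right = node
    p_left, p_right = partial(left, site), partial(right, site)
    out = []
    for a in BASES:
        s_left = sum(jc69_prob(a, b, t_left) * p_left[k] for k, b in enumerate(BASES))
        s_right = sum(jc69_prob(a, b, t_right) * p_right[k] for k, b in enumerate(BASES))
        out.append(s_left * s_right)
    return out

def log_likelihood(tree, n_sites):
    """Sum of per-site log-likelihoods, with uniform base frequencies at the root."""
    return sum(math.log(sum(0.25 * x for x in partial(tree, s)))
               for s in range(n_sites))

# Toy three-taxon alignment with made-up sequences and branch lengths.
tree = (("ACGT", 0.1, "ACGA", 0.1), 0.05, "ACTT", 0.2)
print(log_likelihood(tree, 4))
```

The recursion visits every inner node once per site, which is why this computation dominates the runtime, and hence the energy use, of likelihood-based tree inference on large data sets.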
During this reporting period, a production-level library has been implemented and made publicly available. The newly developed kernel version has now been integrated into several production-level tools for phylogenetic inference. The final software is now faster and also provides options for reducing and monitoring energy consumption. Furthermore, complex models that account for heterotachy, as well as so-called non-time-reversible models of nucleotide substitution, were implemented. An open-source tool to determine the root of a phylogenetic tree will be finished shortly.
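As background on the terminology: a substitution model with rate matrix Q and stationary base frequencies pi is time-reversible when pi_i * Q_ij = pi_j * Q_ji for every pair of bases; a non-time-reversible model violates this condition. The sketch below is purely illustrative and unrelated to the IGNITE code; the function names and the two toy rate matrices are assumptions made for this example.

```python
def make_rate_matrix(off_diag):
    """Copy the off-diagonal rates and set each diagonal so rows sum to zero."""
    n = len(off_diag)
    Q = [row[:] for row in off_diag]
    for i in range(n):
        Q[i][i] = -sum(Q[i][j] for j in range(n) if j != i)
    return Q

def is_stationary(Q, pi, tol=1e-9):
    """True if pi * Q = 0, i.e. pi is a stationary distribution of Q."""
    n = len(Q)
    return all(abs(sum(pi[i] * Q[i][j] for i in range(n))) < tol for j in range(n))

def is_time_reversible(Q, pi, tol=1e-9):
    """True if detailed balance pi_i * Q_ij == pi_j * Q_ji holds for all pairs."""
    n = len(Q)
    return all(abs(pi[i] * Q[i][j] - pi[j] * Q[j][i]) < tol
               for i in range(n) for j in range(i + 1, n))

# Reversible toy model: the rate into base j is proportional to pi_j (F81-style).
pi_f81 = [0.1, 0.2, 0.3, 0.4]
Q_f81 = make_rate_matrix([[0.0 if i == j else pi_f81[j] for j in range(4)]
                          for i in range(4)])

# Non-reversible toy model: substitutions only along the cycle A -> C -> G -> T -> A.
pi_uniform = [0.25] * 4
Q_cycle = make_rate_matrix([[0, 1, 0, 0],
                            [0, 0, 1, 0],
                            [0, 0, 0, 1],
                            [1, 0, 0, 0]])

print(is_time_reversible(Q_f81, pi_f81))        # True
print(is_time_reversible(Q_cycle, pi_uniform))  # False
```

Under a reversible model the likelihood does not depend on where the tree is rooted, whereas under a non-reversible model it does, so such models carry information about the root position.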
The genomic information gathered by sequencing highly under-sampled invertebrate groups promises to provide new insights into biochemical compounds produced either by the host or by its associated microbial organisms. In IGNITE, we have been completing and analysing sponge holobionts (host and associated symbionts) in particular, as they are known to harbour a rich diversity of metabolites with a high potential for applications in human health, including cancer therapies and novel classes of antibiotics. Our initial analyses have already yielded one candidate for a novel antibiotic compound. As more sequenced genomes become available in IGNITE, we aim to identify a larger number of metabolites.