This project has assembled a chromosome-level genome for the two-toed sloth Choloepus didactylus. The genome was produced with four sequencing technologies (PacBio long reads, 10x Genomics Chromium linked reads, Bionano optical maps and Arima Hi-C reads) and a state-of-the-art hierarchical genome assembly pipeline (VGP 1.6 Assembly Pipeline). The Experienced Researcher and Main Supervisor have joined the Vertebrate Genomes Project (VGP), an international initiative within the Genome 10K consortium that tests, develops and implements methods to assemble the final reference genomes of all vertebrates, starting with one representative of each of the ~260 vertebrate orders. The C. didactylus genome assembled by this project is the first representative of Pilosa and Xenarthra in VGP Phase I. It was assembled to a total of 3.6 Gb in 281 scaffolds with a scaffold N50 of 161 Mb, making it one of the best mammalian genomes assembled to date. The genome is in its final stages of manual curation, but the sex chromosomes (X and Y) have already been identified, and the Hi-C heat map suggests that the karyotype of this C. didactylus specimen is 2n=65. Several whole-chromosome syntenies were found between C. didactylus and human chromosomes, which allows the investigation of genome architecture evolution to begin. In addition, 46% of the C. didactylus genome is composed of repeats, with LINE1 being the most abundant element. RNA-Seq data were generated from six tissues (lung, blood, brain, heart, spleen and liver) to support a high-quality annotation of the C. didactylus genome. A genome annotation performed with MAKER2, followed by a comparative analysis of single-copy orthologs between C. didactylus and 10 other protein sets predicted from whole genomes representing all described mammalian lineages, indicates that Afrotheria is more basal than Xenarthra within eutherian mammals. The mitochondrial genome of C. didactylus was also assembled; it is 16,499 bp long and contains 13 protein-coding genes, 22 tRNAs and 2 rRNAs. Both C. didactylus genomes, nuclear and mitochondrial, are available at vgp.github.io.
Further, this project has sequenced Illumina short reads for the three-toed sloth Bradypus tridactylus and has assembled a draft genome for it with DISCOVAR. The k-mer composition of B. tridactylus indicated a genome size of 3.3 Gb. Homology analysis of C. didactylus chromosomes against B. tridactylus DISCOVAR contigs shows 70% shared homology between the two genomes. The genomes assembled in this work, which represent both clades of tree sloths (Bradypus and Choloepus), are smaller than previously estimated for sloths and other xenarthrans (>4 Gb). Comparative analyses of the unique genomic features of sloth genomes are being carried out and will be presented in open-access scientific manuscripts in the near future.
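For readers unfamiliar with the N50 statistic quoted above: it is the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. The following is a minimal Python sketch of that calculation, not the VGP pipeline's own code. The function name `n50` and the toy scaffold lengths are assumptions made for illustration; they are not taken from the C. didactylus assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Illustrative helper (hypothetical name): sort lengths from longest to
    shortest and return the length at which the running total first reaches
    half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy scaffold lengths (arbitrary units), invented for this example.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # prints 70: 80 + 70 = 150, half of the 300 total
```

A high N50 relative to the total assembly size, as reported here, means that most of the genome sits in a small number of very long scaffolds.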
This project has also made efforts to train the next generation of genome bioinformaticians: the Experienced Researcher (ER) and Main Supervisor started an initiative to train master's students to run the Vertebrate Genomes Project Assembly Pipeline. The ER has trained at least four students at the Berlin Center for Genomics and Biodiversity Research to integrate the four sequencing technologies and run the VGP Assembly Pipeline. Each of these students has produced at least one chromosome-level genome, receiving high-quality training while contributing substantially to the international scientific community studying genome evolution. The training initiative is growing and is being adopted by other senior researchers within the VGP, who are now integrating students into the consortium.