Periodic Reporting for period 1 - BBFGEN (Genetic analysis of rare and common variation in a large Brazilian bipolar family)
Reporting period: 2015-05-01 to 2017-04-30
About 330 DNA samples were quality controlled and organized in the BRC local genotyping facility. Of those, 321 high-quality samples were prepared for hybridization, and genotyping chips were run. I quality controlled the raw genotype data extensively, removing sample and SNP outliers with a published pipeline (Coleman et al., 2015). The common and rare variants were quality controlled separately, as they require different QC parameters. The algorithms in QuantiSNP and PennCNV were used to call copy number variants (CNVs). The CNV set was subjected to a rigorous quality control procedure, including the requirement of overlapping calls between algorithms. Raw genotype data were thus obtained and, after outlier removal and quality control, yielded datasets for common variants, rare variants and copy number variants.
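The requirement of overlapping calls between algorithms can be illustrated with a minimal sketch. It is not the pipeline used in the project: the `CnvCall` record, the function names and the 50% overlap threshold (`min_fraction`) are all assumptions made for illustration, and the report does not state which overlap criterion was applied.

```python
from typing import List, NamedTuple


class CnvCall(NamedTuple):
    """One CNV call; the field layout is hypothetical, not a QuantiSNP or PennCNV format."""
    sample: str
    chrom: str
    start: int  # first base of the call (inclusive)
    end: int    # last base of the call (inclusive)
    kind: str   # "del" or "dup"


def overlap_bp(a: CnvCall, b: CnvCall) -> int:
    """Number of shared bases; 0 if sample, chromosome or CNV type differ."""
    if (a.sample, a.chrom, a.kind) != (b.sample, b.chrom, b.kind):
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def consensus_calls(calls_a: List[CnvCall], calls_b: List[CnvCall],
                    min_fraction: float = 0.5) -> List[CnvCall]:
    """Keep calls from calls_a that a single call in calls_b covers by at least min_fraction.

    min_fraction = 0.5 is an assumed threshold, chosen only for this example.
    """
    kept = []
    for a in calls_a:
        length = a.end - a.start + 1
        if any(overlap_bp(a, b) / length >= min_fraction for b in calls_b):
            kept.append(a)
    return kept


# Toy data: the first call is supported by the second caller, the second is not
# (same position, but one caller reports a duplication and the other a deletion).
caller_a = [CnvCall("S1", "2", 1000, 2000, "del"), CnvCall("S1", "2", 5000, 6000, "dup")]
caller_b = [CnvCall("S1", "2", 1500, 2500, "del"), CnvCall("S1", "2", 5000, 6000, "del")]
supported = consensus_calls(caller_a, caller_b)
```

In this toy example only the deletion at 1000-2000 is retained, because the second caller covers just over half of it with a call of the same type.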
Work Package 2: Analysis of common variation
Linkage analysis was performed on Affymetrix 10K linkage panel data. I performed a standard case-control association analysis on the common variants from the quality-controlled sets in WP1 to identify variants more common in cases than in unaffected family members. Linkage analysis yielded several genome-wide significant and suggestive genomic regions linked to mood disorders. No common SNPs or CNVs showed, at a genome-wide level, a higher frequency in family members with mood disorders than in unaffected family members.
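For readers unfamiliar with case-control association testing, the following is a minimal sketch of an allelic chi-square test for a single variant. It is an illustration only: the report does not say which software or test statistic was used, the function name and the allele counts are invented, and this simple test ignores the relatedness between family members that a real analysis of pedigree data has to account for.

```python
import math


def allelic_chi2(case_alt: int, case_ref: int, ctrl_alt: int, ctrl_ref: int):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Assumes every row and column total is non-zero.
    Returns (chi-square statistic, p-value).
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented counts: the alternative allele is seen 30 times among 40 case alleles
# and 10 times among 40 alleles from unaffected relatives.
chi2, p_value = allelic_chi2(30, 10, 10, 30)
```

A genome-wide analysis repeats such a test for every variant and then applies a stringent multiple-testing threshold, which is why no single common variant reached significance in a sample of this size.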
Work Package 3 (1): Analysis of Polygenic Risk Scores
I have obtained the latest summary statistics from the Psychiatric Genomics Consortium through their secondary analysis proposal process. I have used these to generate polygenic risk scores for schizophrenia (SCZ), bipolar disorder (BP) and major depressive disorder (MDD) for the family members. We also formed new collaborations in Brazil to obtain a Brazilian control group. I have performed analyses using linear mixed models in ASReml-R, following a strategy I devised in collaboration with Dr. Hall and Dr. Thomson. In addition, I had the opportunity to apply this polygenic risk score method within a local brain-imaging project that also includes bipolar patients and their family members. Affected individuals in the family show a higher polygenic risk for SCZ, BP and MDD. The results suggest an influx of common genetic risk via affected individuals marrying into the family (an effect of assortative mating). This increases the common genetic risk for psychiatric disorders over generations, in parallel with the anticipatory pattern observed for symptoms and age of onset in the family.
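The idea behind a polygenic risk score can be sketched in a few lines: each individual's risk-allele dosages are weighted by the effect sizes from the summary statistics and summed. The sketch below is a simplified illustration, not the method used in the project; the function names, SNP identifiers, weights and dosages are all hypothetical, and steps such as LD clumping and p-value thresholding are omitted.

```python
import statistics
from typing import Dict, List


def polygenic_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Sum of effect size x risk-allele dosage over SNPs present in both inputs."""
    shared = [snp for snp in weights if snp in dosages]
    if not shared:
        raise ValueError("no SNPs shared between genotypes and summary statistics")
    return sum(weights[snp] * dosages[snp] for snp in shared)


def standardize(scores: List[float]) -> List[float]:
    """Convert raw scores to z-scores so they are comparable across disorders."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    return [(s - mean) / sd for s in scores]


# Hypothetical per-SNP effect sizes (for example, log odds ratios).
weights = {"snp_a": 0.10, "snp_b": -0.05}

# Hypothetical dosages (0, 1 or 2 copies of the effect allele) for three people.
family = {
    "person_1": {"snp_a": 2, "snp_b": 1, "snp_c": 0},
    "person_2": {"snp_a": 1, "snp_b": 1, "snp_c": 2},
    "person_3": {"snp_a": 0, "snp_b": 1, "snp_c": 1},
}

raw_scores = [polygenic_score(d, weights) for d in family.values()]
z_scores = standardize(raw_scores)
```

Standardized scores of this kind can then serve as predictors of affection status in a model that accounts for family structure, such as the linear mixed models mentioned above.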
Work Package 3 (2): Integrative analysis linkage & PRS
The combination of rare and common genetic risk within a family context is a new concept. I have developed and tested several strategies for identifying the family members who contribute most to the linkage analysis results and are therefore most likely to carry a deleterious rare variant. For this, I selected the most significant linkage peak from WP2, on chromosome 2p, and the polygenic risk score from WP3.1 that best discriminated between affected and unaffected family members, namely that for bipolar disorder. I then stratified the analysis by the “linkage status” of individuals. Preliminary results suggest the existence of a risk allele typed on the previous linkage panel and a protective haplotype identified in the new Illumina data. Stratification by this genotype status does not show an interaction with affection status and polygenic scores, but it does suggest that common risk becomes more important over generations.
Work Package 4: Analysis of rare variation
The linkage analysis described in WP2 identified several interesting genomic regions associated with mood disorders. I have used the new high-resolution Illumina psych chip data to fine-map these large regions. Beyond the findings of my earlier, relatively successful exome sequencing fine-mapping approach, I have not identified any additional candidates for causal rare risk variants, either in the linkage regions or genome-wide, in the rare variant dataset or the CNV dataset.