Analysis of natural variation for cold tolerance in the model plant species Arabidopsis thaliana

Final Report Summary - ANAVACO (Analysis of natural variation for cold tolerance in the model plant species Arabidopsis thaliana)

Project context and objectives

The ANAVACO project aims to uncover the genetic variants underlying natural variation for tolerance to sub-zero temperatures in the model plant species Arabidopsis thaliana through a genome-wide association mapping approach. In the first reporting period, more than 500 Arabidopsis accessions were scored for freezing tolerance. The phenotyping protocol, established in the first year of this project, consists of exposing three-week-old plants to a temperature of -8 °C for three hours overnight. Thereafter, the plants are allowed to grow for an additional ten days at 10 °C. At the end of the experiment, survival and the numbers of healthy and damaged leaves are recorded.

Project results

Considerable variation for freezing tolerance was observed in the sample used in this study. The accessions were genotyped at approximately 250 000 single nucleotide polymorphisms (SNPs), corresponding to an average of one marker every 500 base pairs (bp). A genome-wide association study (GWAS) was performed using three different single-SNP statistical methods: a non-parametric Wilcoxon rank-sum test, a parametric linear test, and a parametric mixed-model approach implemented in EMMA (efficient mixed-model association), which accounts for population structure. The profiles obtained for the different freezing tolerance phenotypes with the Wilcoxon rank-sum test showed multiple significant associations on all five chromosomes of the Arabidopsis genome, though the largest proportion was located on chromosome 5 for all phenotypes. At a genome-wide significance threshold of 0.05, Bonferroni-corrected for multiple testing, 1 280, 1 260 and 633 SNPs were identified as significantly associated with survival rates, freezing damage and healthy leaf ratios respectively. Treating SNPs less than 10 kb away from each other as belonging to the same peak, we identified 408, 369 and 221 peaks for survival rates, freezing damage and healthy leaf ratios respectively. Since the distribution of p-values obtained by Wilcoxon is highly skewed towards low and significant p-values, a high proportion of these SNPs are probably false positives due to confounding. In Arabidopsis, population structure is an important confounding factor. Therefore, we also applied a mixed-model approach, which accounts for the population structure. Confounding was efficiently reduced by the mixed model, as indicated by the uniform p-value distribution.
Although few or no SNPs were associated with freezing tolerance in EMMA-X at the genome-wide significance threshold of 0.05 obtained through permutations, the ranks and p-values of the significant SNPs identified by Wilcoxon showed a highly significant positive correlation (P < 0.001) with their respective ranks and p-values obtained by EMMA-X.
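The thresholding and peak-counting steps described above can be illustrated with a minimal sketch. It assumes that "less than 10 kb away from each other" means chaining neighbouring significant SNPs on the same chromosome; the report does not spell out the exact grouping rule, so this is one plausible reading rather than the procedure actually used. The record layout, the function names and the toy data are hypothetical.

```python
from typing import List, Tuple

# Hypothetical record layout: (chromosome, position in bp, p-value).
Snp = Tuple[int, int, float]


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-SNP p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def group_into_peaks(snps: List[Snp], max_gap_bp: int = 10_000) -> List[List[Snp]]:
    """Chain SNPs on the same chromosome into peaks.

    Assumption: a SNP joins the current peak when it lies less than
    max_gap_bp from the previous SNP of that peak; otherwise it opens
    a new peak.
    """
    peaks: List[List[Snp]] = []
    for snp in sorted(snps, key=lambda s: (s[0], s[1])):
        if peaks:
            last = peaks[-1][-1]
            if last[0] == snp[0] and snp[1] - last[1] < max_gap_bp:
                peaks[-1].append(snp)
                continue
        peaks.append([snp])
    return peaks


# About 250 000 SNPs were tested, so the corrected cutoff is 0.05 / 250 000.
threshold = bonferroni_threshold(0.05, 250_000)

# Toy data, invented for illustration only.
toy_snps: List[Snp] = [
    (5, 100_000, 1e-9),
    (5, 104_000, 3e-8),
    (5, 113_000, 1e-10),  # 9 kb from the previous SNP: same peak
    (5, 130_000, 5e-8),   # 17 kb from the previous SNP: new peak
    (4, 104_500, 1e-8),   # different chromosome: separate peak
    (5, 131_000, 0.01),   # does not pass the threshold
]

significant = [s for s in toy_snps if s[2] < threshold]
peaks = group_into_peaks(significant)
print(f"threshold={threshold:.1e}, significant={len(significant)}, peaks={len(peaks)}")
```

With chaining, a long run of closely spaced SNPs counts as a single peak even when its first and last SNP are more than 10 kb apart; a fixed-window rule would give different peak counts.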

In addition to GWAS, a quantitative trait locus (QTL) analysis of freezing tolerance was performed in the first reporting period on three different F2/F3 mapping populations. QTL for survival rates and freezing damage were identified in all three mapping populations, while QTL for healthy leaf ratios were only detected in a Col-0 x Edi-0 population. Unlike GWAS, a QTL analysis is not affected by population structure. A combined analysis of QTL and GWAS can consequently help in distinguishing true from spurious associations in a Wilcoxon analysis, as well as in identifying SNPs that would be considered false negatives in EMMA, based on their p-values. In the comparative analysis of QTL and GWAS of freezing tolerance, we included not only the QTL detected in this study but also QTL identified for freezing tolerance in a Cape Verde Islands/Landsberg erecta (Cvi x Ler) recombinant inbred line (RIL) population, previously published by Alonso-Blanco et al. (2005). The majority of QTL support intervals co-localise with one or more association mapping peaks, as shown for survival rates on chromosome 4. In each of the QTL intervals, the SNP with the most significant p-value was identified and its rank was determined. In Wilcoxon, the rank of the most significant SNPs within each of the QTL intervals ranged from 1 to 1 033, 1 to 1 343, and 1 to 1 290 for the phenotypes survival rates, freezing damage and healthy leaf ratios respectively. For each of these phenotypes, the rank of the most significant SNP was below 500 in 28 out of 33 QTL intervals (85 %). For survival rates and healthy leaf ratios, the most significant SNP showed a rank exceeding 1 000 for only two QTL, while for freezing damage this was the case for five QTL. The most significant SNPs were located as far as 8 megabases away from the QTL position.
The closest distance between the most significant SNP and the QTL position (38 583 bp) was identified for QTL Edi_dam_3 and Edi_sur_2, located at the top of chromosome 4, in the phenotype healthy leaf ratios. In the mixed-model approach, the maximum rank of the most significant SNPs within the QTL regions was 266, 262 and 176 for the phenotypes survival rates, freezing damage and healthy leaf ratios respectively. As these SNPs fall well short of the significance threshold, this comparative analysis clearly highlights the presence of false negatives in EMMA. In Wilcoxon, the most significant SNP within the QTL intervals Survival_SD_2 and Survival_LD_3, which co-localise with the C-repeat binding factor genes (Alonso-Blanco et al., 2005), did not belong to the top 1 000 in any of the three phenotypes, whereas in EMMA the most significant SNP within these QTL intervals ranked between 71 and 160, depending on the phenotype. In the Wilcoxon analysis, these SNPs were located at a distance of 619 kb from the QTL position, while in EMMA, the distance between the most significant SNPs and the QTL position was 361 kb. The C-repeat (CRT) binding factor (CBF) genes have been widely studied with regard to their role in freezing tolerance. While they did not figure among the highest ranking genes in this GWAS, we picked up several genes that were previously described as being either up- or down-regulated by the CBF regulon, or by cold temperatures (Carvallo et al., 2011). Considering a window of 10 kb upstream of the 5’ untranslated region (UTR) and downstream of the 3’ UTR of each gene, 28 CBF-regulated genes (10 down- and 18 up-regulated) were detected by EMMA-X, i.e. contained a SNP with a rank below 1 000, while 16 CBF-regulated genes (5 down- and 11 up-regulated) were picked up by Wilcoxon. Nine genes, two of which encode proteins of unknown function, were detected by both analysis methods.
Using a 10 kb window around the SNPs with a rank of 500 or better, we also identified 21 cold-induced and 13 cold-repressed genes, as indicated by transcriptomic analyses (Carvallo et al., 2011). In both Wilcoxon and EMMA-X, the SNP with rank 1 mapped to a gene that is regulated either by cold temperatures or by the CBF regulon. In Wilcoxon, this SNP mapped to the gene At5g22270, encoding an unknown protein that is a member of the CBF regulon, while in EMMA-X, the highest ranking SNP mapped to the gene At5g02760, encoding a protein phosphatase 2C involved in protein amino acid dephosphorylation.
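The bookkeeping behind the QTL and GWAS comparison above (genome-wide rank of the best SNP inside a QTL support interval, its distance to the QTL position, and the genes within 10 kb of a SNP) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code of the project: the record types, the function names, the gene identifiers and all coordinates are hypothetical, ties in p-value are broken arbitrarily by input order, and gene start and end are assumed to mark the outer ends of the 5’ and 3’ UTRs.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Snp(NamedTuple):
    chrom: int
    pos: int        # position in bp
    pvalue: float


class Qtl(NamedTuple):
    name: str
    chrom: int
    start: int      # start of the support interval (bp)
    end: int        # end of the support interval (bp)
    peak: int       # estimated QTL position (bp)


class Gene(NamedTuple):
    gene_id: str
    chrom: int
    start: int      # assumed outer end of one UTR (bp)
    end: int        # assumed outer end of the other UTR (bp)


def genome_wide_ranks(snps: List[Snp]) -> Dict[Snp, int]:
    """Rank 1 is the smallest p-value across the whole genome."""
    ordered = sorted(snps, key=lambda s: s.pvalue)
    return {snp: i + 1 for i, snp in enumerate(ordered)}


def best_snp_in_interval(
    qtl: Qtl, snps: List[Snp], ranks: Dict[Snp, int]
) -> Optional[Tuple[Snp, int, int]]:
    """Return (best SNP, its genome-wide rank, distance to the QTL position)."""
    inside = [s for s in snps
              if s.chrom == qtl.chrom and qtl.start <= s.pos <= qtl.end]
    if not inside:
        return None
    best = min(inside, key=lambda s: s.pvalue)
    return best, ranks[best], abs(best.pos - qtl.peak)


def genes_near(snp: Snp, genes: List[Gene], window_bp: int = 10_000) -> List[str]:
    """IDs of genes whose span, extended by window_bp on both sides, covers the SNP."""
    return [g.gene_id for g in genes
            if g.chrom == snp.chrom
            and g.start - window_bp <= snp.pos <= g.end + window_bp]


# Toy data, invented for illustration only.
snps = [
    Snp(4, 500_000, 1e-6),
    Snp(4, 900_000, 1e-4),
    Snp(5, 7_000_000, 1e-9),
    Snp(5, 7_200_000, 1e-3),
    Snp(1, 100_000, 1e-7),
]
genes = [
    Gene("geneA", 4, 505_000, 507_000),   # 5 kb from the chromosome 4 SNP
    Gene("geneB", 4, 520_000, 522_000),   # 20 kb away: outside the window
    Gene("geneC", 5, 495_000, 505_000),   # same coordinates, other chromosome
]
qtl = Qtl("toy_qtl", 4, 400_000, 1_000_000, 620_000)

ranks = genome_wide_ranks(snps)
hit = best_snp_in_interval(qtl, snps, ranks)
if hit is not None:
    best, rank, distance = hit
    nearby = genes_near(best, genes)
    print(f"{qtl.name}: rank {rank}, {distance} bp from the QTL position, genes {nearby}")
```

Ranking against all SNPs genome-wide, rather than only those inside the interval, is what makes the rank comparable between methods such as Wilcoxon and EMMA-X.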

While genome-wide association mapping and QTL mapping were successful in detecting genes or genomic regions underlying freezing tolerance, validation of their role in tolerance to sub-zero temperatures can only be provided by functional analyses. Validation of genes and QTL identified during the first reporting period of the project was initiated in the second reporting period at the host organisation. Among the QTL detected during the first phase of this fellowship, four (Edi-rat-1, Edi-sur-5, Lov-sur-1 and Var-sur-3) were selected for validation and fine-mapping using a heterogeneous inbred family (HIF) approach. The validation of these QTL could, however, not be completed before the end of the fellowship. In addition to the validation and fine-mapping of selected QTL, the validation of genes identified in GWAS was initiated using knock-out mutant analysis. Based on the 100 highest ranking SNPs identified through GWAS using the Wilcoxon rank-based test, Arabidopsis lines (Columbia ecotype) carrying a non-functional copy of the gene in which the SNP is located, or of the two genes flanking the SNP, were ordered from seed stock centres (NASC, GABI). After selecting knock-out mutants that were homozygous for the insertion, 37 genes were screened with the aim of detecting a significant decrease in freezing tolerance in the mutant lines compared to the highly frost-tolerant Columbia accession. One gene, ANAC090, was validated in this experiment, as the insertion line carrying a non-functional copy of this gene was significantly less freezing tolerant than the Columbia wild type. Differences in phenotyping conditions between the GWAS and the validation experiments might have precluded the validation of all or some of the other 36 genes tested.
Other factors, however, could also explain the low number of validated genes in this experiment: the genes may not have a large effect on freezing tolerance when considered one by one, they may have redundant functions, or they may be false positives. No additional screens on knock-out mutants could be carried out before the end of this fellowship due to time constraints.

To our knowledge, this project presents the most extensive sample of A. thaliana accessions studied for freezing tolerance through GWA mapping. This study shows that GWAS can be successfully applied to address the genetic basis of adaptive traits, such as freezing tolerance in A. thaliana, as each of the methods identified clear peaks under which one or more candidate genes for freezing tolerance are potentially located. A comparative analysis of peaks identified through GWA and QTL mapping revealed the co-localisation of the majority of peaks detected, and demonstrated that both approaches are highly complementary. Although this study may currently lack firm conclusions on the role of the newly discovered genes in freezing tolerance, it has the potential to open multiple new avenues for fundamental research. Publication of the work conducted within the context of this fellowship (Willems et al., in preparation) will provide the research community with a list of candidate genes for freezing tolerance, which could be readily used by other researchers.

Freezing tolerance has also attracted a great deal of interest from plant breeding companies, as they need to respond to the constantly growing world population by drastically increasing food production in all crop species. As arable land becomes limited, new areas will have to be exploited where increased freezing tolerance might be crucial. For some crop species that are sown in spring, earlier sowing generally results in higher production and yield. Being able to develop crops with increased freezing tolerance is therefore more important than ever for plant breeding.