CORDIS - EU research results

Smart tools for Prediction and Improvement of Crop Yield

Final Report Summary - SPICY (Smart tools for Prediction and Improvement of Crop Yield)

Executive summary:

The EU-SPICY project aimed to develop a suite of tools for use in molecular breeding of crop plants for sustainable and competitive agriculture. The major aim was the construction of a predictive model for a complex trait like yield that is valid over a wide range of genotypes and environments. Complex traits typically show strong genotype by environment interaction (GEI), which makes prediction difficult. Decomposing a complex trait into a small number of component traits without GEI is therefore an attractive option. A requirement for this decomposition approach to work is that the components can be transformed back into the complex trait by an appropriate model.

In EU-SPICY, pepper was chosen as a model crop and yield as an example of a complex trait. A source-sink based crop growth model was chosen as the predictive model for yield, with leaf area index rate (LAI rate), light use efficiency (LUE) and partitioning to fruits (pt_fruits) as component traits. Two sets of values for the component traits were used in the predictive model. Firstly, component trait values were obtained as genotypic means from greenhouse experiments. Secondly, these genotypic means were replaced by fitted values from QTL models for the components, where the QTLs were mapped using the genotypic means of the components as response and molecular marker information as explanatory variables. (QTL = quantitative trait locus, i.e. a genomic region whose variation is associated with variation in the phenotypic trait.)

Four experiments (two in Spain and two in the Netherlands) were carried out to generate data for yield and its components. For each of the four experiments, it was confirmed that the crop growth model could predict yield from genotypic means for the components together with environmental input on temperature and radiation. Similarly, yield could be predicted from QTLs for the component traits, although the quality of these predictions was satisfactory for the Spanish environments only. A validation experiment with new genotypes in a new environment showed that the approach is viable as a tool for predicting a complex trait like yield from DNA profiles for component traits. With this validation experiment, a major objective of EU-SPICY was achieved.

Although it is attractive to predict yield from simpler yield components, measuring yield components can still be costly. Therefore, EU-SPICY developed fluorescence and imaging phenotyping tools that were able to approximate manual measurements of yield components. Traits derived from these phenotyping tools showed QTLs at locations that coincided to a certain extent with those of QTLs for the yield components.

Finally, EU-SPICY developed molecular genetic tools to identify candidate genes for yield and yield components using known genes of relevant categories in related species and a gene expression study.

Results were presented to academic as well as commercial audiences and policy makers via a web site, lectures, papers, and various symposia and workshops.

Project context and objectives:

The plant breeding industry has contributed considerably to the increased quality and yield of plant products over the last decades. Initially, this was achieved by systematic comparison of crossings in an experimental set-up. Over the last decade, molecular markers have been added as a breeding tool, which has increased insight into the genetics behind genotypic differences. For complex traits like yield, current molecular breeding still has some severe limitations. In order to select and breed the best genotypes for a large range of diverse conditions, the breeder should ideally test all crossings under all these conditions. Especially for complex physiological traits, this would require many large and expensive field trials.

The 'traditional' approach to linking genetic markers to a trait that results from multiple interacting genes is so-called quantitative trait loci (QTL) analysis: statistical techniques based on crossing-over are used to build a genetic map of the genome, and the regions of the map that correlate with the trait at hand give the positions of the QTL. This analysis is generally conducted for phenotypes observed in a single environment, which is often not sufficient for complex traits that exhibit considerable genotype x environment interaction. Recently, advances have been made by considering QTL in combination with different environments, a so-called QTL x E analysis, and new methods are still being developed in this area. Another approach to predicting the phenotypic response is the use of crop growth models. Crop growth models have proven to be an excellent tool for predicting crop yield under different environmental conditions, but only for one or a few genotypes and not for a large set of genotypes simultaneously. A crop growth model disentangles the complex trait yield under different conditions into a number of model parameters, some specific to the crop and based on known physiological principles like photosynthesis, and others describing the environment, like light and temperature.
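The marker-trait association idea behind QTL analysis can be illustrated with a simplified single-marker scan: the trait is correlated with each marker in turn and the squared correlation is reported per marker. This is only a sketch and not the QTL methodology used in SPICY; the function names, the 0/1 allele coding and the toy data are assumptions made for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic marker or a constant trait carries no signal
    return sxy / sqrt(sxx * syy)

def single_marker_scan(genotypes, trait):
    """Return the squared correlation (r^2) with the trait for each marker.

    genotypes: dict marker name -> list of 0/1 allele codes, one per inbred line
    trait: list of genotypic means, in the same line order
    """
    return {m: pearson_r(codes, trait) ** 2 for m, codes in genotypes.items()}

# Toy data (hypothetical): 6 inbred lines scored at 3 markers.
genotypes = {
    "m1": [0, 0, 0, 1, 1, 1],
    "m2": [0, 1, 0, 1, 0, 1],
    "m3": [1, 1, 1, 1, 1, 1],  # monomorphic, so uninformative
}
trait = [1.0, 1.2, 0.9, 2.1, 1.9, 2.0]

scan = single_marker_scan(genotypes, trait)
best = max(scan, key=scan.get)
print("strongest association:", best, round(scan[best], 3))
```

In practice, QTL mapping also uses the genetic map positions of the markers and significance thresholds, which this sketch leaves out.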

In this research, we have integrated the two approaches of QTL and crop growth modelling. We used explanatory models to disentangle the sink and source components of growth, thereby assuming that crop growth model parameters are more directly linked to genetic information than direct plant measurements (e.g. length, fruit size, leaf area) as the latter are the final result of complex interactions between sink and source. Hence QTL regions for these model parameters were expected to be more specific and stable.

Molecular breeding will not overcome the need for large scale phenotyping. Hence, new tools allowing large-scale automated phenotyping are also important in the suite of tools a breeder should possess to be able to face the challenges at hand. Therefore we have also developed automated and fast high-throughput tools for large scale phenotyping, thereby reducing the amount of manual labour necessary in phenotyping experiments.

Aim

The aim of this study is to develop a suite of tools based on molecular breeding to help breeders predict the phenotypic response of genotypes for complex traits like yield across a range of environmental conditions, and hence to improve the selection efficiency of genotypes for these traits: Smart tools for Prediction and Improvement of Crop Yield (SPICY).

In general the objectives of the SPICY project are:

1. to develop an accurate prediction tool for phenotypic performance of a genotype by means of an integrated gene-to-phenotype model, thereby reducing the effort of phenotyping new genotypes;
2. to find genes within the QTL;
3. to develop fast and automatic large scale phenotyping tools, i.e. an imaging tool and a fluorescence tool;
4. dissemination to European plant breeding industry.

Objectives

The general objectives are divided in objectives per work package (WP).

WP 0: Coordination

- To ensure proper coordination, integration and execution of project activities on all levels regarding scientific matters, financial and legal matters, communication and dissemination.
- To control and evaluate the progress (control of timing, quantity and quality of the work, especially with respect to deliverables)
- To inform all beneficiaries in the project and the European Union (EU) Research Programme Officer fully on the project status, including progress reports, and to ratify and provide minutes of meetings of the PMT and consortium meetings.
- To commit the members of the industrial advisory board to the project, keep them informed and interact with them on aims, objectives and results.
- To ensure interaction between beneficiaries and WPs.
- To stimulate the communication and exchange of results between beneficiaries.

WP 1 Genomic tools

- Providing a fully genotyped population of inbred lines for this project.
- Construction of a manageable list of candidate genes for direct assessment of causative association between the candidate genes and regions identified by the QTL mapping for variation in growth model parameters in pepper.
- Construction of a list of genes by studying the differential gene expression between contrasting QTL-genotypes, using the cDNA-AFLP transcript profiling technology.
- Construction of a list of genes by studying the differential gene expression between contrasting QTL-genotypes, using a RT-PCR approach involving the expression of identified candidate genes.
- Characterising candidate genes that control for major variations for plant and fruit development traits to provide highly informative markers

WP 2.1 Development of a fluorescence tool

- Development of a fluorescence tool specialized for plant phenotyping.
- Definition of a fluorescence database of plant phenotypes with decision making (phenotype selection) software.

WP 2.2 Development of an image analysis tool for large scale phenotyping

- Develop a tool for the automated measurement of phenotypic features over a large range of genotypes in a practical environment. The features are related to crop growth and development and are to be used as parameters in the crop growth models.

WP 2.3 Large scale phenotyping experiments

- Characterisation of a large number of genotypes by large-scale phenotyping experiments at different sites and under different environmental conditions.
- Use and testing of the applicability of newly developed imaging and fluorescence tools.

WP 3.1 Genotype specific crop growth model

- Develop a crop growth model for prediction of crop yield performance while accounting for environmental differences.
- A set of genotype specific model parameters, which can serve as features for a QTL-analysis.

WP 3.2 New QTL analysis tools

- New data analysis tools to estimate parameters of crop growth model, taking into account the correlation structure of the parameters.
- QTL map for complex traits, directly and through crop model parameters.

WP 4 Validation

- Compare results of different approaches to predict and select genotypes, based on genetic markers, as obtained from the development of (new) genomic tools:
- standard QTL techniques for the direct trait of yield;
- QTL of crop growth model;
- candidate gene markers.
- Validate the phenotyping tools for image analysis and fluorescence.

WP 5 Dissemination

- Dissemination of the research results through an international course on molecular breeding tools, international workshops, the internet and publication in peer reviewed international scientific journals.

Project results:

Genomic tools

Introduction and objectives

Within the SPICY project, the aims of this WP were to provide plant material for all experimental, phenotyping and genomic work, and to provide the molecular markers to be used for phenotype prediction through the gene-to-phenotype model. To optimise this prediction, the ambition was to use as markers the sequences of genes that are expected to be causal for the genetic variation in plant growth and yield, i.e. candidate genes. The second ambition was to develop such genomic tools in a 'non-model' plant species, i.e. pepper (Capsicum spp.), for which relatively few genomic resources are available.

To this end, two strategies were developed:

- The a priori candidate gene strategy, which consists of collecting from the literature a library of genes previously demonstrated to control plant growth and yield components in model plant species, and then identifying the corresponding (orthologous) genes in the plant of interest, under the hypothesis that these genes control the same traits in these distinct plant species.
- The genetical genomics strategy, which consists of analysing the variation in expression of the plant genes, relating it to the variation in plant phenotype and inferring these differentially expressed genes as new candidate genes.

To assess causal relationships between the candidate genes and the phenotype variations, a comparative mapping approach over the pepper genome is used. In this approach, if the phenotype variation (growth responses, yield components) can be related to variation in gene expression and to a polymorphism in the gene sequence, the candidate gene is considered a functional marker. Such functional markers are expected to improve the accuracy of phenotype prediction by the genotype-to-phenotype model and to be highly powerful in molecular breeding.

Results

As plant material, a progeny derived from a cross between two genetically distant pepper inbred cultivars (Capsicum annuum) differing in their plant and fruit phenotypes was chosen: Yolo Wonder as the female parent and CM334 as the male parent.

In order to explore whether genes already known to control growth and yield traits in other plant species could help in finding genes controlling such traits in pepper, we focused the a priori candidate gene approach on two classes of candidate genes. The a priori candidate gene strategy identified more than 70k new genes from pepper, corresponding to known genes in model species. However, only 10 of these candidate genes could be mapped and tested for their effect on the plant phenotype. Two of these genes revealed a putative effect on pepper growth traits: the CC gene CYCD3 colocates with pQTLs for plant leaf area and leaf dry weight on chromosome 4, and the KdsA gene is a candidate for the variation in stem growth traits (dry weight of stems, stem length and number of internodes).

In a second approach to identify genes underlying growth and yield variations in pepper, a genome-wide analysis of differential gene expression within the segregating progeny was performed without a priori assumptions about gene function. A new pepper microarray was produced using all available ESTs from existing libraries in order to assess the quantitative differences in gene expression between different plant genotypes. Two strategies were developed. For fruit traits, bulk QTL expression analysis was performed by comparing gene expression between sets of pepper lines with contrasting genotypes at chromosome regions known to control fruit shape and size (chromosome 4). Gene expression was explored in immature fruit tissue harvested during early fruit growth. For plant traits, the differential gene expression between 80 individual pepper lines was assessed and quantitative variations were assigned to significant chromosome locations (expression quantitative trait loci (eQTLs)). Gene expression was explored in immature internode tissue harvested during the early elongation process. This genome-wide analysis delivered thousands of differentially expressed genes. However, to increase the probability that they provide valuable candidate genes, these had to be filtered using hierarchical criteria. The first was the correlation of gene expression with phenotypic trait expression (fruit size and shape for genes expressed in fruits, colocation of eQTLs with plant growth QTLs for genes expressed in stems). The second was the expected colocation on pepper chromosomes of the eQTL with the structural gene, which had to be inferred from the tomato genome when these data were available. Finally, co-expression networks of all genes mapping to pQTL regions were explored to look for co-regulation between the selected candidate genes and to infer functions or metabolic pathways related to the phenotype. This resulted in the final selection of the 1139 most probable candidate genes for fruit and/or plant growth traits.

In order to locate the new candidate genes on the pepper chromosomes and assess their relationships with phenotypic traits (pQTLs), high-throughput technologies were necessary because of the large number of genes (1139). The chosen strategy was to sequence the expressed genome (transcriptome) of the parental line Yolo Wonder using Roche 454 sequencing and to compare it with the transcriptome sequences of the second parental line, CM334, obtained by Illumina technology. The comparison aimed at delivering sequence polymorphisms for the expressed genes (Single Nucleotide Polymorphisms or SNP markers), whose segregation would be analysed in the progeny of 297 recombinant lines to locate their position in the pepper genome. As a first result, the transcriptome sequencing of Yolo Wonder provided a gene library of 23748 unique pepper genes expressed in a variety of plant organs and tissues. This is the most complete repertoire of genes in pepper. Comparison with the gene sequences from the CM334 parental line yielded 11849 SNPs distributed over 5919 genes, indicating that approximately one quarter of expressed genes are polymorphic between these cultivars.

The parental sequences of the 1139 candidate genes were mined in this new pepper gene library, which delivered SNPs between the two parental lines for 337 of these candidate genes. Analysing the pepper progeny for the SNP markers identifying these new candidate genes and the previous a priori candidate genes made it possible to locate these genes on the pepper genome. This resulted in a new linkage map of pepper, including 230 additional functional markers (sequences of expressed genes). Among those markers, 126 were shown to colocate with differentially expressed genes and can be related to phenotypic differences for growth and yield traits. These functional markers are of high interest for yield prediction by the genotype-to-phenotype model, which is the central point of the SPICY project, and more widely for breeders in charge of genetic improvement in pepper.

Another way to assess the impact of genes on phenotype is to correlate the sequence polymorphism of genes with the diversity of plant phenotypes in large collections of plant cultivars. With this objective, a large collection of 1322 pepper genotypes was analysed using molecular markers and yield related traits, and a subsample was selected to form a core-collection of the 330 most representative individuals. Associations between the sequence polymorphism of candidate genes and yield related traits were then explored within this panel of 330 cultivars. Significant associations were detected for several candidate genes, including 10 candidate genes that had previously been located at the same chromosome positions as expression QTLs and phenotypic QTLs. Thus, candidate genes controlling variation between plants in leaf development, stem development, and fruit width and length were shown to be highly probable.

Conclusions

This WP of the SPICY project firstly provided genomic and genetic resources for the plant genetics international community, and particularly the pepper breeders, namely:

- A nearly complete repertoire of expressed genes (transcriptome library) from the pepper genome, which is of high interest since this plant genome has not been sequenced due to its large size and abundance of repeated non-coding elements.
- A large set of SNPs (11849) in expressed genes, which provide polymorphic markers for use in the cultivated pepper gene pool.
- An enlarged list of candidate genes for plant and fruit traits in pepper and solanaceous crops (tomato, potato, eggplant), amplifying the knowledge transfer from model species (A. thaliana) to plant species of agricultural importance.
- An improved knowledge of the genetic diversity available in the pepper germplasm worldwide, with calibrated core-collections for horticultural trait breeding.

Fluorescence sensor

Introduction and objectives

Phenotypic characterization of crops of different genotypes requires large data sets of diverse types for statistical reliability. Temporal monitoring of plant fluorescence can capture the dynamics of the photosynthesis process, summarised in a number of parameters for which the genotypic heritability can be calculated. Unlike imaging or light reflection measurements, fluorescence-type measurements give deep insight into a plant's physiology related to its photosynthetic activity and apparatus (Baker and Oxborough, 2004).

Materials and methods

The fluorescence work package has two objectives:

(1) development of a fluorescence tool specialized for plant phenotyping;
(2) definition of a fluorescence database of plant phenotypes with decision making (phenotype selection) software.

In this WP, a complete, flexible and easy-to-use sensor system was developed, capable of high-throughput production of baseline-corrected temporal fluorescence data of crop plants, with batch processing for online extraction of many feature points of phenotypic relevance.

These are obtained by integrating several (direct and modulated) measurement methods applied at different wavelengths. Fluorescence is induced by illuminating the sample with a set of light sources whose spectral radiation falls within the plant's absorption range. Different wavelengths excite different parts of the photosynthetic apparatus and therefore yield specific responses.

The fluorescence tool is a distributed system consisting of a central computerized unit and several (3 realized) intelligent fluorosensors (IFS). The unique features of the sensors are a compact, embedded signal-guiding fibre-optic system with integrated sample clip, instrument-standard variable detector and light source modules, and a network or wireless link for remote control and high-throughput data collection. The sensor head has 4 major parts:

(1) a sample clipping mechanism with a rubber sealed sample chamber of 24 mm diameter;
(2) the light source and detection unit together with the corresponding (analogue / digital) frontend electronics;
(3) a special fibre-optic core; and
(4) the head electronics.

The optical signals are guided to/from the sample by a special fibre-optic core of 13 fibre elements with a common end at the sample chamber. Three inner fibres of the common end and a separate image conduit couple the light sources efficiently to the sample chamber. The remaining 9 outer fibres (tilted to have a common field of view of approximately 1 cm diameter in the plane of the sample leaf) are symmetrically joined into 3 triplets at the 3 corresponding detectors (690 nm, 730 nm and PAM signals). The analogue frontend contains the frontend electronics and the different channel detectors with appropriate optical filters that fit in an easy-to-change tubular socket. The fluorescence signals are amplified and digitized at 12 bits (4096 increments). For precise actinic level adjustment, a laser diode driver with true linear light power control is used, digitally adjustable in 256 steps. The LED sources are controlled by constant-current drivers preset for optimum optical output power.

A central surveillance environment has also been developed and installed for sensor control and data manipulation. Manual sample-to-data documentation is replaced by automated registration of each sample / sensor / data set, using a battery-operated handheld scanner with 2D code reading and a wireless link to minimize the protocol overhead (and hence the measuring time). A prerequisite of this feature is that each sample be fitted with a unique ID. This feature allows a refined auto-start of a measurement by linking together the 'sample and sensor registered' and the 'clip-closed' conditions. Finally, database generator software has been developed to create uniform databases and allow pre-processing of large datasets.

Conclusions

The complete system has been tested in different greenhouse environments. The DC transient peaks Fm and Fm' in the dark- and light-acclimated phases, respectively, as well as the calculated direct F0' parameters showed high heritability (H² greater than 0.7) based on predicted genotypic means of the parameters obtained by mixed modelling. Hence, carefully selected fluorescence parameters can be treated as phenotypic traits and may provide additional inputs for advanced crop growth models or QTL analysis. A peak throughput of more than 100 samples per day, with a sampling protocol time of 8 minutes per sensor per sample, suggests that sensor parallelisation is a feasible concept for applying the plant fluorescence technique to high-throughput monitoring of crops.
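To make the heritability criterion concrete, the sketch below estimates broad-sense heritability of genotype means from replicated measurements. The report states that mixed modelling was used; as a simplification, this example swaps in the one-way ANOVA shortcut for balanced data, H² = 1 - MS_error / MS_genotype. The function name, genotype labels and Fm values are hypothetical.

```python
def heritability_balanced(data):
    """Broad-sense heritability of genotype means from balanced replicates.

    data: dict genotype -> list of replicate measurements (equal length r).
    Uses one-way ANOVA mean squares: H2 = 1 - MS_error / MS_genotype,
    truncated at 0 when the genotype mean square does not exceed the error.
    """
    groups = list(data.values())
    g = len(groups)
    r = len(groups[0]) if groups else 0
    if g < 2 or r < 2 or any(len(v) != r for v in groups):
        raise ValueError("need at least 2 genotypes with equal replication >= 2")
    grand = sum(sum(v) for v in groups) / (g * r)
    means = [sum(v) / r for v in groups]
    # Between-genotype and within-genotype (error) mean squares.
    ms_g = r * sum((m - grand) ** 2 for m in means) / (g - 1)
    ms_e = sum((x - m) ** 2 for v, m in zip(groups, means) for x in v) / (g * (r - 1))
    if ms_g <= 0:
        return 0.0
    return max(0.0, 1.0 - ms_e / ms_g)

# Hypothetical Fm readings: 4 lines, 3 replicate leaves each.
fm = {
    "RIL_01": [2.10, 2.14, 2.08],
    "RIL_02": [2.60, 2.55, 2.63],
    "RIL_03": [1.85, 1.90, 1.88],
    "RIL_04": [2.30, 2.35, 2.28],
}
h2 = heritability_balanced(fm)
print("H2 =", round(h2, 3), "-> passes 0.7 threshold:", h2 > 0.7)
```

For unbalanced designs or designs with additional blocking factors, a mixed model as used in the project is the more appropriate choice.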

Development of an image analysis tool

Introduction and objectives

Digital image analysis has the potential to automate the measurements taken from whole plants or plant parts, saving time and increasing precision. There are a number of high-throughput systems available for automated plant phenotyping, but these require plants to be individually transported to a controlled imaging environment. Transporting growing plants is problematic, because of the risk of damage to plants invalidating experiments. More critically, for many important greenhouse crops such as pepper and tomato, the plants can grow several meters tall and are simply too large to be transported.

The objectives of this WP were twofold:

1. develop a recording device to collect images of growing pepper plants in a greenhouse;
2. automatically extract phenotypic features over a range of genotypes, which correlate with either manual measurements or genotype.

Results

We designed the SPYSEE imaging robot, which can be moved around a greenhouse and 'observes' plants by collecting images that record their characteristics during growth. In effect, our methodology brings the imaging equipment to the plants, rather than moving the plants themselves.

SPYSEE has to move in the aisles between rows of pepper plants, so the distance from the camera to the plants is very short. Therefore, to photograph plants for their full height, high-resolution cameras with a very large field-of-view lens are used at four height levels. For each of the four height levels, the following cameras have been used:

1. RGB colour camera;
2. range cameras, based on the time of flight (ToF) principle;
3. near-infrared camera.

For illumination, flashlights with a pulse duration of 30 µs were used. This allows a short shutter time for the cameras, resulting in sharp images unaffected by ambient light. Images are collected every 5 cm as SPYSEE is manually pushed down each aisle in the greenhouse.

We have pursued two distinct approaches for analysing images to predict plant traits. The first approach is to use image analysis methods to combine range camera and stereo images to generate three-dimensional (3D) reconstructions of plants, from which plant parts can be identified and measured. The second approach extracts statistical features from the images without first trying to separate individual plant parts from the background. The criteria for evaluating the second approach are high heritability, which includes reproducible differences between genotypes, and strong genetic correlation with yield or its components, or with manual measurements.

Given a pair of adjacent colour images and a range image captured by SPYSEE, we explored methods that allow colour images and range images to be combined. The first task was to identify the best stereo vision algorithm from a number of possibilities, and we developed an algorithm to convert depths from a range image to complement the limitations of stereo vision. We tested our approach, and developed evaluation methods for both qualitative and quantitative results. We then developed an approach for combining edges in the colour and depth images, in order to automatically extract foreground leaves from a colour image and investigated surface smoothing for accurate 3D surface reconstruction. With our automatic leaf extraction and 3D reconstruction methods, we achieved a correlation of 0.98 between automatic and manual measurements.
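One simple way a range image can complement stereo vision is sketched here; the report does not describe the actual algorithm, so this is only an assumed illustration. For rectified stereo, disparity d and depth Z are related by d = f * B / Z, so a time-of-flight depth can be converted to a disparity and used where stereo matching failed. The function names, the focal length, the 5 cm baseline and the assumption that both maps are already registered pixel-to-pixel are all hypothetical.

```python
def depth_to_disparity(depth_m, focal_px, baseline_m):
    """Rectified-stereo relation d = f * B / Z, with d in pixels."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_m / depth_m

def fill_disparity(stereo_disp, tof_depth, focal_px, baseline_m):
    """Fill missing stereo disparities (None) from a registered ToF depth map.

    Both inputs are 2D lists of equal shape; ToF entries may also be None,
    in which case the pixel stays unknown.
    """
    fused = []
    for disp_row, depth_row in zip(stereo_disp, tof_depth):
        row = []
        for d, z in zip(disp_row, depth_row):
            if d is None and z is not None:
                d = depth_to_disparity(z, focal_px, baseline_m)
            row.append(d)
        fused.append(row)
    return fused

# Toy 2x2 maps: None marks pixels where a sensor returned no value.
stereo = [[40.0, None], [None, 50.0]]
tof = [[1.0, 0.8], [None, 0.8]]
fused = fill_disparity(stereo, tof, focal_px=800.0, baseline_m=0.05)
print(fused)
```

Valid stereo values are kept unchanged in this sketch; a real system would also have to handle calibration between the cameras and disagreement between the two sources.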

We applied our methods of 3D leaf reconstruction and statistical models to obtain features including leaf size, leaf angle, plant height and total leaf area from collected images of pepper plants. We then identified quantitative trait loci (QTL) from each feature for plant growth and genetic analysis. Two QTLs were found on linkage groups 2 and 4 by our image analysis tools, which were also identified by the hand measurements. An extra QTL on linkage group 1 was identified by the hand measurements. The three other image features were also used to identify QTLs. Our results are very promising, since the developed image analysis tools required less time for the measurement process and did not require destructive harvesting of the growing plants.

Conclusions

We have developed imaging equipment (the SPYSEE robot) that can be moved around a greenhouse, and have recorded images of growing pepper plants with colour, range and near-infrared cameras.

1. Methods based on images from colour and range cameras have been developed to reconstruct the plant canopy in 3D, and automatically identify individual leaves in images. We have also developed statistical methods to extract heritable traits from images.
2. Although our work used pepper plants as an example, our methodologies in principle can be applied to other plants grown in greenhouse or field.

Phenotyping experiments

In breeding the best genotypes for diverse conditions, ideally the breeder should test all his crossings under all these conditions. Especially with complex physiological traits like yield, which exhibit large variation, this would require many expensive and large field trials. The SPICY project therefore aims at the development of a suite of tools to help breeders in predicting phenotypic response of genotypes for complex traits under a range of environmental conditions, i.e. an integrated gene-to-phenotype model and fluorescence and imaging sensors. To generate data to develop and validate these models, a recombinant inbred line (RIL) population of Capsicum annuum 'Yolo Wonder' x 'Criollo de Morelos 334' was phenotyped at two sites (the Netherlands and Spain) and in two seasons (spring and autumn). Phenotyping was done both manually by measuring characteristics like fruit set and development rate, and by newly developed image analysis and fluorescence tools permitting high throughput phenotyping of dynamic trait expression. Data sets were made available to the other work packages in the SPICY project and the data were used there.

Genotype specific crop growth model

Introduction and objectives

Central in this project is the use of crop growth models to predict yield. A crop growth model disentangles the complex trait yield in a number of underlying traits (model parameters) specific for the crop. These model parameters are assumed to be more stable in different environments than yield. In crop physiology, crop growth models have proven to be an excellent tool to predict crop yield under different environmental conditions. However, these models are usually developed for only one genotype (cultivar), with model parameters specific for that one genotype. To be useful in plant breeding, a crop growth model should be able to predict yield for many genotypes in several contrasting environments, using genotype specific model parameters.

The objective of this work package is to develop a crop growth model for prediction of crop yield performance while accounting for environmental differences and to determine a set of genotype specific model parameters, which can serve as features for a QTL-analysis.

Results

The model used is based on LINTUL (Van Ittersum et al., 2003) and has three genotype-specific parameters: light use efficiency, leaf area index development rate and partitioning to the fruits (harvest index). Radiation and temperature are the main driving variables for crop growth and development. We deliberately chose a 'simple' model with only a few parameters, as only then can all parameters be determined for each genotype. In a simple model (almost) all parameters can be measured, but the parameters themselves are 'lumped entities'. For example, light use efficiency is a single parameter that integrates underlying processes like leaf photosynthesis and respiration. Each of these processes has a large number of parameters of its own, which cannot feasibly be measured in large experiments.
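To make the structure of such a model concrete, the following is a minimal sketch of a LINTUL-style daily time-step calculation with the three genotype-specific parameters named above. It is not the SPICY model itself: the functional form (degree-day driven leaf area growth, Beer's law light interception), the constants k, t_base and lai_max, and all input values are illustrative assumptions.

```python
import math

def simulate_fruit_yield(daily_par, daily_temp, lai_rate, lue, pt_fruits,
                         k=0.7, t_base=10.0, lai_max=4.0):
    """Return cumulative fruit dry matter (g m-2) for one genotype.

    daily_par  : incoming PAR per day (MJ m-2 d-1)
    daily_temp : mean air temperature per day (deg C)
    lai_rate   : leaf area index increase per degree-day (genotype-specific)
    lue        : light use efficiency, g dry matter per MJ intercepted PAR
    pt_fruits  : fraction of new dry matter partitioned to the fruits
    k, t_base, lai_max : assumed constants, not taken from the report
    """
    lai = 0.0
    fruit_dm = 0.0
    for par, temp in zip(daily_par, daily_temp):
        # Temperature drives canopy development through degree-days.
        lai = min(lai_max, lai + lai_rate * max(0.0, temp - t_base))
        # Beer's law: fraction of incoming light intercepted by the canopy.
        intercepted = par * (1.0 - math.exp(-k * lai))
        # Radiation drives growth; a fixed share goes to the fruits.
        fruit_dm += pt_fruits * lue * intercepted
    return fruit_dm

# Hypothetical 100-day season with constant weather.
days = 100
par = [8.0] * days
temp = [22.0] * days
yield_a = simulate_fruit_yield(par, temp, lai_rate=0.004, lue=3.0, pt_fruits=0.6)
yield_b = simulate_fruit_yield(par, temp, lai_rate=0.004, lue=3.3, pt_fruits=0.6)
no_canopy = simulate_fruit_yield(par, temp, lai_rate=0.0, lue=3.0, pt_fruits=0.6)
print(round(yield_a, 1), round(yield_b, 1), no_canopy)
```

In this sketch yield scales linearly with light use efficiency and with partitioning, while the leaf area rate acts through the time needed to close the canopy. This is why the three parameters can be treated as separate component traits.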

The model parameters were derived from the four phenotyping experiments. Yield was represented by total fruit dry matter production. Subsequently, QTL analyses were applied to yield and the model parameters. Only two QTL were found for yield, which explained 11 - 28 % of the observed variation in yield. QTL analysis conducted on the three model parameters leaf area index development rate, light use efficiency and partitioning into the fruits resulted in four to seven QTL per model parameter. The largest share of the observed variation was explained for leaf area index development rate (32 - 52 %), while for light use efficiency and partitioning into the fruits less variation could be explained by the QTL (23 - 40 % and 8 - 33 %, respectively). Following the QTL analyses, estimates of the model parameters based on the QTL were made, and these estimates were used in the simulation model to predict yield. Simulations using the QTL estimates of the model parameters could explain 17 - 37 % of the observed variation in yield; only for the spring experiment in the Netherlands was almost no variation explained. On average, the simulations using the QTL estimates of the model parameters explained as much or more variation than the QTL for yield itself. Results of the validation experiment in 2010 in Spain were used as an independent test. Here, QTL of yield could explain 43 % of the observed variation in 18 new genotypes (genotypes not used in the four big phenotyping experiments), while simulation using QTL estimates of the model parameters could explain 53 % of the variation.
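The step from QTL to a model parameter estimate can be illustrated as follows. This is a sketch only: the report does not give the form of the QTL models, so an additive model with alleles coded -1/+1 for the two parents is assumed here, and the QTL names, effect sizes and intercept are hypothetical.

```python
def qtl_fitted_value(intercept, effects, alleles):
    """Additive QTL model: intercept plus the sum of effect x allele code.

    effects : {qtl_name: additive effect}          (hypothetical values)
    alleles : {qtl_name: -1 or +1 parental allele} (assumed coding)
    """
    return intercept + sum(effects[q] * alleles[q] for q in effects)

# Hypothetical QTL effects for light use efficiency (g MJ-1).
lue_effects = {"qtl_1": 0.15, "qtl_2": -0.05}
# Marker profile of one genotype at those QTL.
genotype_alleles = {"qtl_1": 1, "qtl_2": -1}

lue_hat = qtl_fitted_value(3.0, lue_effects, genotype_alleles)
print(round(lue_hat, 2))
```

A fitted value of this kind replaces the measured genotypic mean as input to the crop growth model, so that yield is predicted from the marker profile alone.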

An initial theoretical modelling study showed that additional parameters were needed to create crossovers (a crossover means that the yield ranking of the genotypes depends on the environment). Crossovers are frequently observed in breeding experiments, and a realistic model should be able to reproduce them. However, these parameters could not be derived from the experiments. Further analysis of the model results showed that partitioning into the fruits was the weakest model parameter, having the lowest number of QTL which show large environment-specific effects. However, breaking down partitioning into the fruits into other components could not be done as the appropriate data were not available.
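The definition of a crossover given above can be checked mechanically on a table of yields. The sketch below is illustrative only; the genotype names, environment labels and yield values are hypothetical.

```python
from itertools import combinations

def find_crossovers(yields):
    """Return (genotype_1, genotype_2, env_1, env_2) tuples where the
    yield ranking of the two genotypes reverses between the environments.

    yields : {genotype: {environment: yield}}
    """
    crossovers = []
    for g1, g2 in combinations(sorted(yields), 2):
        shared_envs = sorted(set(yields[g1]) & set(yields[g2]))
        for e1, e2 in combinations(shared_envs, 2):
            diff_1 = yields[g1][e1] - yields[g2][e1]
            diff_2 = yields[g1][e2] - yields[g2][e2]
            # Opposite signs mean the ranking depends on the environment.
            if diff_1 * diff_2 < 0:
                crossovers.append((g1, g2, e1, e2))
    return crossovers

# Hypothetical fruit dry matter yields (g m-2).
example = {
    "G1": {"NL_spring": 310.0, "ES_autumn": 280.0},
    "G2": {"NL_spring": 290.0, "ES_autumn": 300.0},
    "G3": {"NL_spring": 250.0, "ES_autumn": 240.0},
}
crossings = find_crossovers(example)
print(crossings)
```

Here only G1 and G2 swap rank; G3 is lowest in both environments, so its differences with the other genotypes vary in size but not in sign.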

Conclusions

Our main conclusions are listed below:

- A crop growth model can help in yield prediction in different environments, based on underlying genotype-dependent yield components (traits, model parameters).
- QTL analysis performed on model parameters instead of on yield directly results in more QTL and improved prediction of yield.
- The model and QTL of the model parameters were also able to predict yield of an independent experiment.

QTL analysis tools

Introduction and objectives

An interesting strategy for improvement of a complex trait is to dissect it into a number of physiological component traits, which hopefully have a simpler genetic basis. The complex trait is then improved via improvement of its component traits. In plant breeding experiments phenotypic measurements on a large range of traits are collected simultaneously. These traits are often genetically correlated, and the genome-wide availability of genetic markers allows us to study whether these genetic correlations are caused by pleiotropic QTLs and/or closely linked QTLs. Similarly important is an understanding of the correlation of genotypic performance between multiple environments, as this affects transferability and predictability. The objectives of this work package were therefore to develop improved and novel statistical methods and strategies to adequately describe and analyse the SPICY data sets and to arrive at sound QTL results.

Results

Multiple traits multiple environment (MTME)
Mixed models offer a suitable framework for handling complex correlation structures describing the dependencies of:

1) instances of the same trait in multiple environments;
2) multiple traits in a single environment;
3) pairs of traits across environments (Boer et al., 2007; Malosetti et al., 2008; van Eeuwijk et al., 2010).

The most general statistical model for identifying QTL in the presence of the above-mentioned correlations is the MTME QTL model. This model helps to identify the genome regions responsible for genetic correlations between trait by environment combinations, whether caused by pleiotropy or genetic linkage, and can show how genetic correlations depend on the environmental conditions. Specifically for application in SPICY, we developed an MTME QTL model together with a protocol to fit such models for large sets of trait by environment combinations (60 in the case of SPICY).

The protocol contained the following steps:

1) Initial screening for QTLs was done on principal components of the genotype by trait x environment matrix of phenotypic means (BLUEs). This analysis gave a first idea of the places where QTLs could be expected from the MTME QTL analysis.
2) Next, a simple interval mapping QTL scan was run for all trait by environment combinations simultaneously.
3) QTLs identified in this scan were then subjected to a backward elimination procedure to arrive at a final QTL model.
4) As a post-analysis diagnostic, the final QTL locations were checked for trustworthiness with a graphical procedure based on sign changes of QTL test profiles for individual trait by environment combinations.
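The idea of a genome scan in step 2 can be sketched in a strongly simplified form. The code below is not simple interval mapping and not the MTME mixed model: it substitutes single-marker regression for one trait in one environment, with RIL marker genotypes coded -1/+1. The marker names, genotypes and phenotypes are hypothetical.

```python
def marker_r2(marker, phenotype):
    """Share of phenotypic variance explained by one marker (coded -1/+1)."""
    n = len(phenotype)
    mean_m = sum(marker) / n
    mean_y = sum(phenotype) / n
    sxy = sum((m - mean_m) * (y - mean_y) for m, y in zip(marker, phenotype))
    sxx = sum((m - mean_m) ** 2 for m in marker)
    syy = sum((y - mean_y) ** 2 for y in phenotype)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic marker or constant trait carries no signal
    return sxy * sxy / (sxx * syy)

def scan(markers, phenotype):
    """Return (marker name, r2) pairs sorted from strongest to weakest."""
    results = [(name, marker_r2(codes, phenotype))
               for name, codes in markers.items()]
    return sorted(results, key=lambda item: item[1], reverse=True)

# Hypothetical data: 8 recombinant inbred lines, 3 markers.
markers = {
    "m1": [-1, -1, -1, -1, 1, 1, 1, 1],
    "m2": [-1, 1, -1, 1, -1, 1, -1, 1],
    "m3": [1, 1, 1, 1, 1, 1, 1, 1],
}
phenotype = [10.0, 11.0, 9.0, 10.0, 14.0, 15.0, 13.0, 14.0]
ranking = scan(markers, phenotype)
print(ranking)
```

Markers at the top of such a ranking would be the candidates carried into a multi-QTL model and then pruned by backward elimination, as in step 3.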

The results from our MTME revealed a large number of QTLs influencing yield and related traits. Many QTLs were pleiotropic, with patterns consistent with genetic correlations between traits. Both consistent and environment-specific QTLs were identified. Three of the yield QTLs were consistently detected in the four environments with the other QTLs being environment-specific. Total explained trait variance by the QTLs varied between 42 % and 56 %. Comparing the QTL results from MTME with those from the initial STSE (single trait single environment) revealed up to a doubling of the number of QTLs identified and the proportion of variance explained by these QTL. These results confirm the power of the MTME QTL methodology developed by us to identify QTLs.

Bayesian mixture modelling of correlated traits

In the mixed model framework, we developed an MTME QTL model in which no organization of traits was imposed beyond that implicit in the variance-covariance (VCOV) structure. We also explored Bayesian hierarchical modelling (Ehsani et al., 2012), in which we explicitly defined pleiotropic marker effects (455 positions) to exist for subsets of traits, while correlations between environments were defined via latent variables. We considered 10 traits in the NL1 trial and defined three subsets (blocks) of traits. Block 1 comprised all traits, implying that a QTL would affect all traits (some with possibly large and some with small or negligible effects). Block 2 comprised the key traits DWF, LUE, NF, and the leaf-related traits, while block 3 comprised the key traits DWF, LUE, NF, and the non-leaf-related traits. It is important to note that the model thus includes the set of 455 markers three times, and that through regularization the majority of (pleiotropic) marker effects will be tiny, while markers whose pleiotropic effects associate well with the variance and covariance structures among traits will have substantial effects. Those markers with substantial effects on multiple traits may account for the genetic correlations that were observed in previous analyses. Therefore, this Bayesian mixture approach was an additional tool to study groups of traits in further detail.

The first results of this Bayesian mixture approach were encouraging as among the three blocks differences in explained variances per trait were apparent. For example the traits LAI and DWL had more variance explained in block 2 than in block 1. Block 1 seemed to be dominated by trait NLE (0.38 explained variance). The total explained variances for the ten traits ranged from 0.10 to 0.70 with an intermediate value (0.46) for total dry weight of fruits (DWF - representing yield). The correlations with DWF were all close to zero in block 1. These correlations became more divergent when smaller subsets of traits were considered (block 2 and 3). For example, the correlation between DWF and DWL increased up to 0.34 in block 2, while the correlation between DWF and DWS became more negative (-0.18). These divergences in correlations when modelling multiple blocks will further strengthen our knowledge on the genetic dependencies among traits.

Conclusions

- A powerful mixed model multi-trait multi-environment QTL mapping tool was developed that will be a valuable tool for plant breeders and physiologists.
- A Bayesian modelling strategy was developed to investigate hypotheses about pleiotropic effects for subsets of traits.
- The use of sophisticated mixed models for QTL analyses revealed many chromosomal regions with segregating QTL. The number of QTL and the percentage explained variance per trait increased when correlation structures between traits and between environments were better accounted for in the model.

Potential impact and dissemination of the EU SPICY project

Potential impact

The objective of Theme 2: Food, Agriculture and Fisheries, and Biotechnology is the construction of a European Knowledge Based Bio-Economy (KBBE), by bringing together science, industry and other stakeholders to exploit new and emerging research opportunities that address social, environmental and economic challenges. The aim of SPICY was to develop a suite of tools for molecular breeding of crop plants for sustainable and competitive agriculture. These tools can help the breeder in predicting the phenotypic response of genotypes for complex traits under a range of environmental conditions. In this project, research institutions and the breeding industry cooperated with the aim of giving the European breeding industry a competitive edge in the breeding of crop plants for sustainable and competitive agriculture. The project thereby contributed to the objectives of the Environmental Technology Action Plan (ETAP).

Plant breeding strategy and recent developments

Plant breeding aimed at improved growth and yield under specific environmental conditions has for most of history been based on the breeder's subjective evaluation (based on numerous years of experience), and in the more important crops on large-scale phenotyping and on quantitative genetic and statistical analysis. During the last decade, QTL analysis has been added as a tool to obtain markers linked to genes affecting these traits, allowing for indirect, marker-based selection.

However, so far success has been limited, due to several reasons:

(1) the relatively low heritability of these traits, which is caused by a relatively large contribution of environmental variation;
(2) the fact that several or many genes have an effect on these traits;
(3) the fact that the assessment of phenotypes is expensive and time-consuming; and
(4) the fact that the effect of a given gene is dependent on that of other genes and on the environmental conditions.

Due to this complex of factors, QTLs have generally been mapped with a low precision, which is insufficient for marker-aided selection. Also, many growth and yield-related QTLs have only been detected in a few experiments or only under specific conditions, which has limited their use in breeding.

In many respects, the situation has been advancing over the last decade. Genotyping is becoming more efficient and cheaper. Also advances are being made in high-throughput phenotyping technology. Both of these aspects are highly relevant for breeding for quantitative traits that are affected by environment, such as the growth and yield traits targeted in this project. However, breeding companies are still in search for efficient phenotyping tools, and efficient ways to link the (relevant) phenotypic traits to the genetic information.

Stakeholders

The SPICY project was primarily intended to generate new tools that can help breeding companies in predicting the phenotypic response of genotypes for complex traits under a range of environmental conditions. The breeding industry has been involved in this project through participation in its industrial advisory board. In the industrial advisory board, the international breeding companies Rijk Zwaan, Monsanto, Syngenta, Nunhems, ENZA, Vilmorin, Ramiro Arnedo and the company LemnaTec have been actively involved during the SPICY project. The meetings of the industrial advisory board, frequently combined with the consortium meeting to ensure active interaction between the companies and the researchers, have focused on the contents of the work and on the results of the SPICY project in relation to current developments in the European breeding industry.

Implications of the newly developed tools on current breeding industry

Genetic tools and candidate genes:

The availability of markers closely linked to genes of importance for growth and yield would considerably improve the efficiency of breeding for these traits, as they would replace expensive large-scale experiments by much cheaper genotyping for a limited number of markers. The more closely the markers are linked to the genes of interest, the better the selection response will be and the lower the amount of linked, unwanted genes that are introgressed along with the gene. The ultimate markers are those that are located in the targeted genes themselves, hence the importance of the candidate gene approach.

The work on the genetic tools and candidate genes provided some genomic and genetic resources for the plant genetics international community, and particularly the pepper breeders, namely:

- A nearly complete repertoire of expressed genes (transcriptome library) from the pepper genome, which is of high interest since this plant genome has not been sequenced due to its large size and abundance of repeated non-coding elements.
- A large set of SNPs (11849) in expressed genes, which provide polymorphic markers for use in the cultivated pepper gene pool.
- An enlarged list of candidate genes for plant and fruit traits in pepper and solanaceous crops (tomato, potato, eggplant), amplifying the knowledge transfer from model species (A. thaliana) to plant species of agricultural importance.
- An improved knowledge of the genetic diversity available in the pepper germplasm worldwide, with calibrated core-collections for horticultural trait breeding.

These resources were made available through scientific publications; sequences and SNP data are under deposition in an international public database.

Imaging tool:

Current characterisation of genotypes is laborious work and is largely based on the expert-knowledge of the breeder who visually determines whether or not a genotype looks promising enough to enter a following round of testing. An imaging tool with automated image analysis would be a powerful tool to speed up the process of testing the genotypes.

In the SPICY project, we have developed imaging equipment (the SPY-SEE robot) that can be moved around a greenhouse and records images of growing pepper plants with colour, range and near-infrared cameras. This system is currently (summer 2013, after the end of the SPICY project) running in the greenhouses of one of the breeding companies that was seriously interested in the possibilities of the SPY-SEE for automated, high-throughput phenotyping.

In our project, we have developed methods based on images from colour and range cameras to reconstruct the plant canopy in 3D and to automatically identify individual plant organs (i.e. fruits) in images. We have also developed statistical methods to extract heritable traits from images (leaf area, leaf angle, plant height, number of fruits). Results showed that with automated image analysis, two QTL were found which were also identified by the hand measurements. These results are very promising, since the developed image analysis tools required less time for the measurement process and did not require destructive harvesting of the growing plants. A number of breeding companies have expressed their interest in the sensor and the image analysis tools; with one of them we are currently involved in a project to further develop the SPY-SEE sensor into a tool they can use routinely in their greenhouses.

Fluorescence tool:

One of the main characteristics of a plant is its capacity for photosynthesis, i.e. the production of assimilates by the fixation of CO2, which is converted to sugars using energy from sunlight. This characteristic is currently hardly used in the selection of genotypes, since it requires long-term measurements to establish the photosynthesis rate of a plant. In the SPICY project, an intelligent fluorosensor (IFS) fluorescence tool was developed, which allows fast measurement of fluorescence kinetics as an indication of photosynthesis rate. Results of this system showed that the DC transient peaks Fm and Fm' in the dark and light acclimated phases, respectively, as well as the calculated direct F0' parameters show high heritability (H2 of 0.7 and above), based on predicted genotypic means of the parameters obtained by mixed modelling. This shows that a careful selection of a measuring protocol and fluorescence parameters can provide phenotypic traits that can be used for QTL analysis or in the crop growth model. Currently, the IFS tool is used in research, education, and in research projects with commercial companies.
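As an illustration of what a heritability of this kind expresses, the sketch below estimates broad-sense heritability of genotype means from balanced replicated data. It uses a one-way analysis of variance as a simpler stand-in for the mixed modelling used in the project, and the genotype names and fluorescence values are hypothetical.

```python
def entry_mean_heritability(data):
    """Broad-sense heritability of genotype means, H2 = Vg / (Vg + Ve / r).

    data : {genotype: [replicate values]}, the same number r of replicates
           per genotype (balanced design assumed).
    Variance components come from a one-way ANOVA, which is an assumed
    simplification of a mixed model analysis.
    """
    g = len(data)
    r = len(next(iter(data.values())))
    if any(len(values) != r for values in data.values()):
        raise ValueError("balanced data required")
    means = {name: sum(values) / r for name, values in data.items()}
    grand_mean = sum(means.values()) / g
    ms_between = r * sum((m - grand_mean) ** 2 for m in means.values()) / (g - 1)
    ms_within = sum((v - means[name]) ** 2
                    for name, values in data.items()
                    for v in values) / (g * (r - 1))
    var_g = max(0.0, (ms_between - ms_within) / r)  # genotypic variance
    var_e = ms_within                               # residual variance
    if var_g == 0.0 and var_e == 0.0:
        return 0.0
    return var_g / (var_g + var_e / r)

# Hypothetical Fm readings (arbitrary units), two replicates per genotype.
fm_readings = {"A": [1.0, 1.2], "B": [2.0, 2.2], "C": [3.0, 3.2]}
h2 = entry_mean_heritability(fm_readings)
print(round(h2, 2))
```

In this toy data set the differences between genotypes are large compared with the differences between replicates, which is what a value of H2 close to 1 indicates.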

Modelling and QTL analysis tool:

In current breeding, genotypes are evaluated visually and the expert knowledge of the breeder plays an important role in the selection procedure. Large-scale phenotyping experiments are necessary to generate the phenotypic data. However, crop growth models have proven to be excellent tools to predict crop yield under different environmental conditions. A crop growth model disentangles the complex trait yield under different conditions in a number of model parameters specific for the crop, based on known physiological principles like photosynthesis, and for the environment, like light and temperature. In SPICY, the idea was to use a crop growth model as a tool to predict the phenotypic response of a genotype under different environmental conditions and to use genetic markers in the QTL regions to estimate the genotype specific model parameters. Specific QTL-analysis methods were developed as a tool to find the corresponding QTL for the crop growth parameters.

Our work has led to the following results and conclusions:

- The crop growth model developed can predict yield in different environments, based on underlying genotype-dependent yield components (traits, model parameters).
- QTL analysis performed on model parameters instead of on yield directly resulted in more QTL and improved prediction of yield.
- Validation showed that the new crop growth model and the QTL of the model parameters were also able to predict yield of an independent experiment.
- A powerful mixed model multi-trait multi-environment QTL mapping tool was developed that will be a valuable tool for plant breeders and physiologists.
- A Bayesian modelling strategy was developed to investigate hypotheses about pleiotropic effects for subsets of traits.
- The use of sophisticated mixed models for QTL analyses revealed many chromosomal regions with segregating QTL. The number of QTL and the percentage explained variance per trait increased when correlation structures between traits and between environments were better accounted for in the model.

The results are currently being used in research, in education and in research projects with commercial (breeding) companies.

Conclusions

The EU-funded research project SPICY has realised the goals that were originally planned, i.e. a gene-to-phenotype model, a fluorescence sensor and an imaging sensor. All three attracted serious interest from breeding companies, and further progress is currently being made in all these fields in projects together with one or more breeding companies.

Dissemination

Aim and objectives

The aim of this work package (WP) was the dissemination of the research results through an international course on molecular breeding tools, international workshops, the internet and publication in peer-reviewed international scientific journals.

Results

International scientific symposia

The first international symposium on 'Improving yield prediction by combining statistics, genetics, physiology and phenotyping: the EU SPICY project in pepper' was held on 7 March 2012, in Wageningen, the Netherlands. This symposium gave an overview of the results of SPICY, in particular how recent development in genomics, phenotyping, physiological modelling, statistical genetics, and bio-informatics can be combined in an integrated approach of molecular breeding of pepper.

This first symposium was attended by approximately 75 participants from breeding companies (RijkZwaan, ENZA Zaden, Anthura, Monsanto, Nunhems, Vilmorin, HZPC, Gautier Semences, Bejo Zaden, Nickerson Zwaan, Syngenta), research organisations in Spain (CNB-CSIC), United Kingdom (Rothamsted Research), Belgium (ILVO) and various groups of Wageningen UR as well as a number of commercial companies (Keygene, Agro Adviesbureau, Genetwister Technologies, Phenospex, Floreac).

The second international scientific symposium 'Improving yield prediction by combining statistics, genetics, physiology and phenotyping' was held on 4 September 2012, preceding the Eucarpia conference 'Biometrics in Plant Breeding', held in Hohenheim, Germany. The symposium consisted of contributions from members of the EU-SPICY project as well as from key researchers in the disciplines of central importance to the EU-SPICY project.

This second symposium was attended by approximately 60 participants from breeding companies (Van der Have, Deutsche Saatveredelung, Euralis Semences, Limagrain, IT-Breeding, HZPC, Pioneer), universities from Sweden, Finland, the Netherlands, United States (US), United Kingdom, Italy and Germany, research organisations (INRA, BioSS, Institute of Crop Science and Resource Conservation, ITQB) as well as a number of commercial companies (BASF, Biogemma, Agri Information Partners, KWS, Genetwister, JatroSelect, Keygene, Biosearch Data Management).

SPICY website

The project website, http://www.spicyweb.eu/, has been online since November 2008. Updates and upgrades of the website are done on a regular basis.

International course on the use of molecular breeding tools

The international SPICY symposium held on 7 March 2012 in Wageningen was followed by a two-day workshop for breeders and molecular geneticists to get acquainted with the specialized techniques used in SPICY. There were four modules:

1) Bioinformatics and statistical genetics and genomics.
This workshop offered researchers a theoretical and practical introduction to plant comparative sequence analysis using PLAZA, a resource for plant comparative genomics (see http://bioinformatics.psb.ugent.be/plaza online), and to gene network approaches that can be used in the functional analysis of genes identified through large-scale experimental approaches. Both modules offered hands-on exercises on gene family analysis, GO enrichment analysis, detection of orthologous genes in other species such as pepper, and on the construction, functional annotation and visualization of co-expression networks, identification of subnetworks, and comparison and integration of different networks using CORNET (see http://bioinformatics.psb.ugent.be/cornet online) and Cytoscape (see http://www.cytoscape.org online).

Special attention was given to statistical methods for QTL analysis of complex traits and related component traits across multiple environments. A further point that was discussed concerned the use of genomic prediction strategies as compared to traditional QTL analyses in the context of the prediction of complex traits using models based on component traits. Statistical analyses with hands on exercises were done using Genstat v14.

2) Image analysis for automatic phenotyping.
This one-day workshop provided an overview of image analysis for automatic plant phenotyping. It is primarily intended for technical experts in breeding companies who use, or plan to use, image analysis. The course was a mixture of theory and hands-on training.

The morning session covered various aspects of image recording including camera, illumination, calibration, colour imaging and recording of 3D images. Participants had the opportunity to work with the different recording techniques and learn to understand the importance of proper image recording.

The afternoon started with an introduction to ImageJ, free open-source software for image analysis. Next, participants got an overview of the different aspects of image analysis and experimented with image enhancement techniques like contrast stretching and linear and non-linear filtering, including edge detection. Image segmentation and different binary operations (erosion, dilation, opening / closing, propagation, skeletonisation) were treated next. An important aspect in plant phenotyping is the automatic measurement of features describing aspects of plant size, shape, colour and texture. Participants experimented with these different types of features. Finally, pattern recognition techniques were used to select image features and build classifiers.
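Some of the operations named above can be sketched without any imaging library. The code below is an illustration only, not material from the workshop: it applies contrast stretching, a fixed threshold, and a binary opening (erosion followed by dilation with a 3 x 3 neighbourhood) to a tiny hypothetical grey-value image, then measures the area of the remaining object.

```python
def stretch_contrast(image, out_max=255):
    """Linearly rescale grey values to the range 0..out_max."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        return [[0 for _ in row] for row in image]
    return [[round((p - lo) * out_max / (hi - lo)) for p in row] for row in image]

def threshold(image, level):
    """Segment the image: 1 for pixels at or above level, else 0."""
    return [[1 if p >= level else 0 for p in row] for row in image]

def _neighbourhood(mask, y, x):
    """Values in the 3x3 window; pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    return [mask[j][i] if 0 <= j < h and 0 <= i < w else 0
            for j in range(y - 1, y + 2) for i in range(x - 1, x + 2)]

def erode(mask):
    """Keep a pixel only if its whole 3x3 window is foreground."""
    return [[1 if all(_neighbourhood(mask, y, x)) else 0
             for x in range(len(mask[0]))] for y in range(len(mask))]

def dilate(mask):
    """Set a pixel if any pixel in its 3x3 window is foreground."""
    return [[1 if any(_neighbourhood(mask, y, x)) else 0
             for x in range(len(mask[0]))] for y in range(len(mask))]

# Hypothetical image: a bright 3x3 object plus one bright noise pixel.
image = [
    [10, 10, 10, 10, 90],
    [10, 90, 90, 90, 10],
    [10, 90, 90, 90, 10],
    [10, 90, 90, 90, 10],
    [10, 10, 10, 10, 10],
]
stretched = stretch_contrast(image)
mask = threshold(stretched, 128)
opened = dilate(erode(mask))          # opening removes the noise pixel
area = sum(sum(row) for row in opened)
print(area)
```

The object area in pixels is the simplest example of a size feature; shape, colour and texture features are computed from the same segmented regions.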

3) Understanding the impact of crop characteristics on yield through crop growth modelling
What is a crop growth model and why is it relevant? Yield component analysis leading to a simple process-based simulation model for crop yield was introduced and the use of such a model in the SPICY project was explained. In a hands-on computer practical, participants built this simple model in Excel. A phenotypic dataset from the SPICY project was provided to calculate genotype-dependent model parameters. Participants ran the model for 10 genotypes in 2 environments and observed that model parameters with no genotype by environment interaction (GxE) can result in yield data with GxE.
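The last observation can be reproduced with a few lines of code. The sketch below is not the Excel model used in the practical: it assumes a very simple light interception model, two hypothetical genotypes whose parameters are the same in both environments, and two hypothetical environments that differ only in temperature. All constants and values are illustrative assumptions.

```python
import math

def fruit_yield(temps, par, lai_rate, lue, pt_fruits,
                k=0.7, t_base=10.0, lai_max=4.0):
    """Cumulative fruit dry matter for one genotype in one environment."""
    lai, total = 0.0, 0.0
    for temp in temps:
        # Leaf area grows with degree-days until the canopy is closed.
        lai = min(lai_max, lai + lai_rate * max(0.0, temp - t_base))
        # Growth = partitioning x light use efficiency x intercepted light.
        total += pt_fruits * lue * par * (1.0 - math.exp(-k * lai))
    return total

# Genotype parameters are identical in both environments (no GxE in inputs).
genotypes = {
    "fast_canopy": {"lai_rate": 0.006, "lue": 3.0, "pt_fruits": 0.6},
    "slow_canopy": {"lai_rate": 0.003, "lue": 3.0, "pt_fruits": 0.6},
}
# Two hypothetical 100-day environments that differ only in temperature.
environments = {"cool": [16.0] * 100, "warm": [26.0] * 100}
PAR = 8.0  # MJ m-2 d-1, assumed equal in both environments

yields = {
    (g, e): fruit_yield(temps, PAR, **params)
    for g, params in genotypes.items()
    for e, temps in environments.items()
}
# Yield advantage of the fast-canopy genotype in each environment.
advantage = {
    e: yields[("fast_canopy", e)] - yields[("slow_canopy", e)]
    for e in environments
}
# A non-zero difference between the advantages is GxE on the yield scale.
interaction = advantage["cool"] - advantage["warm"]
print({e: round(a) for e, a in advantage.items()}, round(interaction))
```

The fast-canopy genotype is ahead in both environments, but by a clearly larger margin in the cool one, where slow leaf area development limits light interception for longer. The interaction arises from the non-linear response of the model to the environment, not from any change in the genotype parameters.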

Participants performed a QTL analysis on yield, biomass and the components of yield and biomass. They interpreted the QTLs for components in relation to those for yield and biomass. Attention was given to QTL by environment interactions. Predictions from QTL models for yield and biomass components were used as input parameters for a crop growth model, and subsequently predictions for yield and biomass themselves were produced. These predictions were compared to those from a crop growth model using the component traits themselves instead of the QTL predictions for the component traits.

4) The use of fluorescence in phenotyping
Fluorescence measurements can give a rapid indication of the photosynthetic characteristics of a plant. In this course, participants got an overview of the fluorescence technique and its possible application in phenotyping.

In the SPICY project, a new fluorescence tool has been developed aimed at measuring large numbers of plants in a short period of time. This new tool is a flexible and easy-to-use system consisting of a host unit with multiple sensors that can easily be deployed in large-span greenhouse environments, with remote control and centralized, quasi real-time data collection. The unique feature of each sensor is a compact, all-in-one design and a measurement protocol integrating several methods. The system is capable of mass-producing baseline-corrected temporal fluorescence curves, with batch processing for prompt extraction of many physiologically relevant feature points. In this course, a practical (step-by-step) session on the use of the system was presented.

In the workshops, 11 - 42 people participated, depending on the topic of the workshop.

Press releases and publications

During the SPICY project, breeding companies were regularly informed about the progress via meetings of the industrial advisory board, usually combined with consortium meetings. Furthermore, they were informed informally via phone and e-mail. Over the course of the project, two articles were published in the International Innovation Magazine, which targets a broad audience, with approximately 30 000 copies distributed per published edition. During the project, approximately 16 scientific papers were written, 35 presentations were given at conferences or other occasions for an international audience, and a number of other dissemination activities were undertaken (professional journals, films).

Conclusion

The results obtained in SPICY have been disseminated to a large audience of scientists and policy makers, via scientific publications, press releases, posters and presentations.

Project website: http://www.spicyweb.eu