CORDIS - EU research results
Content archived on 2024-06-18

High-Density Peptide MicroArrays and Parallel On-line Detection of Peptide-Ligand Interactions

Final Report Summary - PEPCHIPOMICS (High-density peptide microarrays and parallel on-line detection of peptide-ligand interactions)

Executive summary:

Proteins are indispensable elements of life. Some proteins serve as structural or mechanical components of our cells, whereas others serve functional purposes including enzymatic reactions, metabolic processes, cellular signalling and receptor interactions; they are also key targets of immune recognition. Proteins are therefore of considerable scientific and practical interest and it is easy to understand why they are frequent targets of medical intervention.

Proteins are made up of amino acids, and shorter amino acid chains with a length of up to about 100 amino acids are conventionally defined as peptides. Peptides have emerged as indispensable tools in the quest to understand protein function and interaction, from the single protein level to the proteomic level. Chemical synthesis of peptides is a standard procedure; however, the cost of acquiring and handling the number of peptides needed for even a modest screening programme is often prohibitive. Thus, the pharmaceutical and biotechnology industries, as well as academia, encounter significant economic and logistical barriers when they attempt to apply peptide based screening programmes in the exploitation of '-omics' information.

This project has used novel photochemistry to synthesise large arrays of peptides on microarrays (chips) the size of a stamp. Far exceeding the original plans, we can now synthesise up to two million pre-addressable peptides of known amino acid sequence per chip at a cost of only a few cents per peptide. This is enough to express the entire human proteome (or pathogen proteomes) as a systematic set of overlapping peptides. To detect molecular interactions with individual peptides on these chips, we have used both optical detection of labelled molecules and more advanced label-free detection systems, which allow us to analyse kinetic parameters of several thousand peptide interactions in parallel. We have generated advanced bioinformatics tools to interpret these chip data and in the process made these tools available to the scientific community at large. Finally, we have tested the impact of this new technology in selected cases of considerable medical interest such as mapping the human proteome, defining protease cleavage sites and understanding how targets are selected by the immune system.

Thus, a wealth of data on peptide identities and on how each interacts with a given analyte can now be provided at a fraction of the cost and time of competing technologies. In this sense, peptide based screening will begin to match the information flow of genome-based technologies. This will allow acquisition of peptide and protein data at an unprecedented scale thereby supporting a rapid and rational development of new and safe medicines. A routine implementation of peptide microarray technologies should be of tremendous value in drug discovery, contributing to the elucidation of protein function and the general understanding of the multitude of ligand interactions in living tissues.

The small and medium sized enterprises (SMEs) involved in this project have harvested considerable benefits. Entirely new products and services have evolved and will be marketed at the conclusion of the project. Furthermore, a new spinout start-up company focussing on these new technologies has been created.

Project context and objectives:

The universe of proteins is extremely plastic and diverse. Proteins are frequent target structures in biology, generating a tremendous interest in identifying and understanding proteins involved in biological processes. In some cases, fragments of proteins, peptides, may correspond to native protein target structures and even when the native target structure is too complex to be represented by peptides, the plasticity and diversity of peptides is sufficient to mimic virtually any complex target structure. Thus, a large-scale peptide array technology should make it possible to identify or mimic virtually any biological target structure. One could object that the diversity of peptides, even at the oligopeptide level, is so large that it easily exceeds whatever advances in peptide microarray technology we can provide in the foreseeable future. To solve this conundrum, we have enabled an iterative approach, where the results from an initial wide-screen peptide microarray experiment are interpreted by bioinformatics and used to design narrower-screen next generation peptide microarray experiments. Thus, it is our scientific rationale that a high throughput peptide microarray technology can now represent, or mimic, any proteome derived structure and allow mapping of the fine specificity of virtually all protein (and most non-protein) interactions.

A scientific objective of a high density peptide microarray technology is to enable a faster, more efficient and complete scientific process. To exemplify this, we have used the peptide microarray technology to map several different specificities of significant scientific interest. One of our examples is that of antibody proteomics, which is used to examine the tissue distribution of all human proteins, a.k.a. the human protein atlas (HPA) initiative. Another is that of proteases, which are crucially involved in protein metabolism. Finally, we have shown that this new peptide microarray technology can be used to perform an extensive mapping of the peptide binding specificities of class I and II human leukocyte antigen (HLA) molecules of the human immune system. HLA molecules are specific peptide receptors responsible for selecting and presenting peptides to immune T cells, and identifying HLA specificities can be used in vaccine design (e.g. severe acute respiratory syndrome (SARS) or bird flu vaccine design). These are only examples; the utility of an affordable high-throughput peptide microarray technology will be pervasive.

The technological objectives have been challenging indeed. In brief, we have established a robust peptide microarray (peptide chip) technology exploiting recent advances in imaging, solid phase peptide chemistry and automated liquid handling. Then, we have established a sensitive and spatially discriminatory detection system allowing easy and accurate identification of molecular interactions with specific peptides; and in some cases even enabled real-time, label-free detection allowing the establishment of thermodynamic parameters from thousands of parallel interactions. Finally, we have developed the bioinformatics resources needed to analyse the resulting signals and identify any common target structure.

The results should be pervasive. Proteins constitute the functional and structural expression of the genome and pharmacological leads frequently target proteins. Scientists from academia, biotechnology and industry often use fragments of proteins (also known as peptides) to identify leads. Increasingly, they request large libraries of peptides in attempts to represent, or mimic, all possible peptide targets. By way of example, mapping of microbial antigens during development of vaccines and mapping of potentially allergenic or autoimmune epitopes both require large series of peptides. This use of peptides to address immunological questions is not surprising since peptides themselves are targets for immune recognition. Other current applications include profiling of kinases, phosphatases, proteases and other receptor/enzyme systems. Even with current technology, the market for peptide microarrays is booming with projected annual growth rates of 30 % (BCC research, RB-169 on protein chips). More applications are likely to follow as a high-throughput peptide microarray technology matures and becomes available at a reasonable price.

Another aspect of the project was to support the research and development of two SMEs within the biotechnology sector. As mentioned above, proteins constitute the functional and structural expression of the genome and pharmacological leads frequently target proteins. These SMEs aimed to exploit the fact that academia, biotechnology and industry often use fragments of proteins, or peptides, to identify lead candidates; and increasingly request large libraries of peptides in attempts to represent, or mimic, all possible peptide targets.

Project results:

Work package one (WP1): 'Development and Implementation of high-density peptide microarrays' (Leader P2, SCHAFER-N, SME)

Synthesis of peptide microarrays in chip-format is achieved using light-directed photochemistry. In this process, the incoming amino acids in each synthesis step are standard [(9-fluorenylmethyl)oxy]carbonyl (Fmoc-) or Boc-protected amino acids activated by any of the techniques employed in ordinary solid phase synthesis of peptides. At the onset of the synthesis, a monolayer of amino groups on the chip surface is derivatised by an activated photoprotection group. Then, ultraviolet (UV) light is directed onto the synthesis surface using digital micromirrors to create well defined illuminated patterns that correspond to the sites where cleavage of the photoprotection group is desired. The exposed amino groups are then coupled with an activated amino acid with ordinary chemical protection of its alpha-amino group. This process is repeated until all the different amino acids in the first layer are attached to the chip surface. At this point the layer is deprotected using standard chemical deprotection reagents to create a new monolayer of amino groups. The entire process can then be repeated a number of times corresponding to the length of the peptides in the arrays. All steps are handled by the synthesis device and the hands-on time for synthesis of a peptide array is therefore very low.

Light from a 365 nm source is collimated and projected onto a digital mirror device. Light reflected from selected mirrors in the digital mirror device is directed into an imaging system that generates an image of the digital mirror device on the synthesis surface. At the outset of each cycle of a solid-phase peptide synthesis strategy, the N-terminus of every growing peptide chain is protected by a photosensitive protection group (filled circles, step 1). The photosensitive protection groups are removed by UV irradiation of predefined array fields in which the peptide chains are subsequently extended with a predefined amino acid protected by a base-labile Fmoc group (filled triangles). With natural peptides, one elongation cycle involves one UV deprotection step for each of the 20 different amino acids (e.g. steps 2 to 4 illustrate extension by glutamic acid, steps 5 to 7 by alanine and steps 8 to 10 by valine etc.). At the completion of an elongation cycle, all peptides have been extended by the intended Fmoc amino acid and can be prepared for the next elongation cycle by removal of the Fmoc groups by piperidine treatment, liberating free amino terminals (open circles, step 11), followed by coupling of photosensitive groups to the exposed N-terminal amino groups (step 12).
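
The elongation cycle described above can be read as a scheduling problem: in each cycle, the array fields are grouped by the residue they are due to receive, and each group becomes one illumination pattern. The following Python sketch is illustrative only. The function names are hypothetical, it assumes (as in conventional solid phase synthesis) that chains grow from the C-terminal residue towards the N-terminus, and it is not a description of the project's control software.

```python
from collections import defaultdict


def exposure_masks(peptides, cycle):
    """Group field indices by the residue each field receives in this cycle.

    Assumption: synthesis runs from the C-terminus, so cycle 0 couples the
    last residue of every sequence, cycle 1 the second to last, and so on.
    """
    masks = defaultdict(list)
    for field, sequence in enumerate(peptides):
        if cycle < len(sequence):
            residue = sequence[-1 - cycle]
            masks[residue].append(field)
    return dict(masks)


def synthesis_schedule(peptides):
    """Return a list of (cycle, residue, fields) tuples, one per UV exposure."""
    schedule = []
    longest = max(len(sequence) for sequence in peptides)
    for cycle in range(longest):
        masks = exposure_masks(peptides, cycle)
        for residue in sorted(masks):
            schedule.append((cycle, residue, masks[residue]))
    return schedule


if __name__ == "__main__":
    for step in synthesis_schedule(["AEV", "AEA", "GE"]):
        print(step)
```

Each tuple corresponds to one UV deprotection followed by one coupling; with the 20 natural amino acids, a cycle therefore needs at most 20 exposures, in line with the description above.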

During the project, a new high-resolution peptide microarray synthesiser has evolved through multiple major revisions of all involved components (chemistry, optics, hardware, software etc.). This has resulted in a synthesiser that routinely achieves several hundred thousand peptide fields per array, and arrays with a number of peptide elements close to the two million limit defined by the digital mirror device (DMD) can now be made. Thus, the final synthesiser is capable of performing automated, unattended synthesis of custom designed peptide microarrays featuring up to two million different peptides per chip.

The instrument is highly versatile. It allows several different kinds of synthesis slides to be used, thus accommodating different needs. The standard slide is a 1 x 3 inch microscope slide. Some users will request maximum diversity of the synthesiser in order to ask highly complex questions (e.g. addressing whole proteomes), whereas others will request the ability to test multiple different analytes (i.e. different reagents of interest such as serum from different patients) against less complex questions (e.g. addressing one or a few proteins of interest). To allow the latter use, we have enabled an option allowing the synthesis and analysis of multiple separate sectors in a single array. Several devices have been developed that allow physical separation between probe-containing liquids applied to the surface of the synthesis substrate. Note that even after such sub-division the diversity of peptides that can be expressed per sector exceeds the diversity that can be expressed on entire chips by competing technologies.

In addition, the synthesiser can accommodate a so-called Kretschmann prism, which can be used to perform label-free surface plasmon resonance imaging (SPRi) detection (see WP3 below). The basic design of the flow cells holding the synthesis substrates permits rapid design of flow cells holding other formats.

A computer programme has been developed to control all the hardware of the synthesiser according to fully customisable designs. Upon completion of an array synthesis, all settings and reagent lists together with a complete time schedule are added to the encrypted layout file so that data records are maintained at good laboratory practice (GLP) level. The final data file together with a digital image of the array recorded after analysis constitutes the input for the analysis programme described in WP2. Few constraints are put on the nature of the image of the array, except that some kind of visual indication (fluorescence, enzyme staining, radioactivity, plasmon resonance etc.) must be present that reveals which peptide fields have been influenced by the analysis.

From the onset of the project, it was decided that the chemistry should be based on standard amino acid derivatives for peptide synthesis and that introduction of photosensitive protection groups should be done in situ, i.e. on the synthesis substrate during synthesis of the peptide arrays. This strategy eliminates the need for expensive prefabricated photoprotected amino acid derivatives and greatly expands the selection of amino acid derivatives that can be incorporated in the syntheses. During the project period, experiments have been made to determine optimum conditions for storage, coupling and UV induced cleavage of various photoprotecting groups. A set of conditions has been identified that makes it possible to run unattended syntheses for five to seven days using the same set of reagents, i.e. that allows for completely automatic synthesis of multiple peptide arrays. With the photoprotecting group used for most syntheses, optimum cleavage conditions have been identified and, although some side effects from generation of photoproducts must be expected, successful identification of peptide epitopes has been made in arrays with 20-mer peptides, which is close to the limits that would be expected from crude peptides in arrays made using traditional solid phase peptide chemistry.

WP2: 'Optical detection of peptide-receptor interactions' (Leader P2, SCHAFER-N, SME)

The above described software that controls the synthesis hardware is one of three programmes necessary for production and interpretation of high-density peptide arrays. The other two programmes are the array design programme and the array analysis programme.

The design-programme provides an interface for rapid definition of the general layout of the peptide arrays. With up to several hundred thousand peptide fields in each array, the user needs tools for rapid handling of large blocks of peptide information and for compact procedures for generation of systematic modifications of the peptides in the array.

The design programme provides a relatively simple graphical interface in which the user can copy and paste lists of individual peptide sequences, single protein sequences or groups of protein sequences in FASTA format. Sequences entered by the user are automatically proof-read by the design programme to ensure that the formal syntax is correct, that control characters are removed and that all amino acid residues in the sequences are either listed as standard one letter codes or, for unusual residues, are contained in a closed hierarchy of parentheses. If more than one set of sequences is entered, e.g. as in the case of multiple FASTA sequences, the design programme automatically assigns a group name to each set of sequences. If the sequence entered in a group contains more than a user defined number of residues, it will be synthesised as smaller overlapping peptide sequences. As an example, a protein sequence with several hundred amino acids cannot be synthesised in full length using contemporary synthesis techniques and it is therefore reasonable to assume that it should be synthesised as smaller peptides with a defined length and with defined overlaps.
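
The splitting of a long sequence into overlapping peptides of defined length and overlap can be sketched as follows. This is a minimal illustration, not the design programme itself: the function name `tile_peptides` and the decision to add one extra peptide so that the C-terminal end is always covered are assumptions made for the example.

```python
def tile_peptides(sequence, length=15, overlap=14):
    """Split a sequence into overlapping peptides.

    Consecutive peptides start (length - overlap) residues apart. If the
    last regular window does not reach the end of the sequence, one extra
    peptide covering the C-terminal residues is appended (an assumption).
    """
    if not 0 <= overlap < length:
        raise ValueError("overlap must satisfy 0 <= overlap < length")
    if len(sequence) <= length:
        return [sequence]
    step = length - overlap
    last_start = len(sequence) - length
    peptides = [sequence[i:i + length] for i in range(0, last_start + 1, step)]
    if last_start % step != 0:
        peptides.append(sequence[last_start:])
    return peptides


if __name__ == "__main__":
    print(tile_peptides("ACDEFGHIKLM", length=4, overlap=1))
```

With an overlap of one residue less than the peptide length, a sequence of N residues gives N - length + 1 peptides, so a 132-residue fragment scanned with 15-mers yields 118 peptides.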

In the example shown above, the entire amino acid sequence for bovine serum albumin (BSA) has been entered in the large memo-field and a group name has been assigned to this entry. The user has defined that the length of the peptides to be synthesised should be 15 residues and that the overlap should be 14 residues (i.e. that all 15-mer peptides in the BSA sequence should be synthesised). The user has further specified that 10 copies of each peptide should be made and that each peptide field in the array should be defined by 2 x 2 mirrors on the DMD. It is further stated that every 9th field should be a 'blank' field and what the sequence in the blank fields should be. The synthesis area of the entire array has been selected by the user to consist of 6 x 4 = 24 subsectors each spaced by a distance of 105 µm. The aqua coloured sectors B1, C1, B3, C4 etc. have been clicked upon by the user to indicate that the defined group of peptides from BSA are to be synthesised in these sectors (dark blue sectors do not contain peptides in the current group, but other groups are present in these sectors, whereas no groups have yet been assigned to the white sectors). It is also stated that the current group consists of a total of 5 940 peptides (copies included) and that 17 850 peptide fields can be allocated to each of the 24 sectors. Finally, a 'marker' sequence has been defined. This sequence will be synthesised in a characteristic pattern in the four corners of each subsector and is intended to serve as a directly or indirectly visible sector locator during analysis of the arrays. The array design programme contains many more advanced features for systematic generation of variants of the original peptides entered and the short description above is meant only as an illustration of how a large number of peptides in the arrays can be generated with only a few entries.

In typical applications the array design programme defaults to a scheme where the peptides are distributed randomly within the sectors chosen for them. The random distribution minimises recording errors due to fluctuations in the signal related to local imperfections in the synthesis substrate.
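
The combination of replicate copies, regularly spaced blank fields and random placement can be illustrated with a short sketch. The function name, the `BLANK` placeholder and the fixed random seed are assumptions for the example; how the actual design programme orders fields within a sector is not specified in this report.

```python
import random


def layout_sector(peptides, copies=10, blank_every=9, blank="BLANK", seed=0):
    """Return a linear field list with shuffled replicates and regular blanks.

    Every `blank_every`-th field (counting from 1) is a blank field; all
    other fields hold randomly ordered copies of the peptides.
    """
    if blank_every < 2:
        raise ValueError("blank_every must be at least 2")
    fields = [peptide for peptide in peptides for _ in range(copies)]
    random.Random(seed).shuffle(fields)  # seeded so the layout is reproducible
    layout = []
    for peptide in fields:
        if (len(layout) + 1) % blank_every == 0:
            layout.append(blank)
        layout.append(peptide)
    return layout


if __name__ == "__main__":
    print(layout_sector(["PEPTIDE1", "PEPTIDE2"], copies=8))
```

Because replicates of the same peptide end up scattered over the sector, a local defect in the substrate affects at most a few copies of any one peptide rather than all of them.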

The analysis part of the programme allows the user to interpret recordings of the peptide arrays, which are confined to a circa 1 x 2 cm rectangular area. The image can be recorded directly using digital cameras, laser scanners, matrix assisted laser desorption ionisation (MALDI) imagers etc. When quantitative data have been assigned to the fields during the image analysis, the results are stored as encrypted .txt files and can then be analysed. It is even possible to perform timed recordings allowing the same field to be followed over time, say during a surface plasmon resonance imaging (SPRi) analysis (WP3). These analysis files are constructed as extensions of the synthesis file, which in turn was constructed as an extension to the design file. The final file thus contains information about the layout of the array, the synthesis schedule, the reagents used during synthesis, the name of the image used for the quantitative analysis and finally the value recorded for each of the peptide fields during analysis of the array. The analysis programme provides a large collection of procedures for graphical display of the analysis results and for export of the analysis results as delimited .txt files suitable for further processing by other programs. Some examples of analyses made on the exported .txt files are shown below in other WPs.

WP3: 'SPRi detection of peptide-receptor interactions' (Leader P3, GenOptics, SME)

SPR is an optical detection process that can occur when polarised light hits a prism covered by a thin metal layer. Under certain conditions (wavelength, polarisation and incidence angle), free electrons at the surface of the biochip absorb incident light photons and convert them into surface plasmon waves. Perturbations at the surface of the biochip, such as the interaction between probes immobilised on the chip and analytes (targets), induce an easily measurable response. An imaging version of this technology developed by GenOptics allows sensitive, label-free detection of interaction over the entire biochip area thanks to a video charge coupled device (CCD) camera. This feature allows the collection of real time data from all the different spots.

Briefly, a broad beam of monochromatic polarised light (at a specific wavelength) illuminates the whole functionalised area of the SPRi-biochip, which is combined with a detection chamber. A CCD video camera gives access to the array format by image capture of all local changes at the surface of the SPRi-biochip. The signals obtained can be assigned to the pre-addressable peptides of the microarray and, by analysing the on and off rates of analyte binding, the mean affinity (or avidity) of each interaction can be calculated.
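
As a worked illustration of how on and off rates translate into an affinity, the sketch below assumes the simple 1:1 (Langmuir) binding model, which this report does not name explicitly. The function names and the example rate constants are illustrative and are not project values.

```python
import math


def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant K_D = k_off / k_on (in M)."""
    return k_off / k_on


def association_response(t, concentration, k_on, k_off, r_max):
    """Signal at time t during analyte injection under a 1:1 binding model.

    The response rises exponentially with the observed rate
    k_obs = k_on * C + k_off towards the equilibrium level
    r_eq = r_max * C / (C + K_D).
    """
    k_obs = k_on * concentration + k_off
    k_d = dissociation_constant(k_on, k_off)
    r_eq = r_max * concentration / (concentration + k_d)
    return r_eq * (1.0 - math.exp(-k_obs * t))


if __name__ == "__main__":
    # Illustrative values only: k_on in 1/(M s), k_off in 1/s.
    print(dissociation_constant(1e5, 1e-3))
    print(association_response(600.0, 1e-8, 1e5, 1e-3, 100.0))
```

A lower K_D corresponds to a higher affinity; when the analyte concentration equals K_D, the equilibrium response is half of the maximum response.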

For the purpose of this project, the GenOptics 'SPRi-Plex' platform has been modified to accommodate peptide microarrays. This entailed modifications to the flow cell, the coatings (ensuring bio-inertness and chemical resistance), hardware allowing real-time and multiplex investigation of several thousand interactions, software able to extract and analyse more than a thousand kinetic events, and validation studies in biological model systems. The high quality of the recording, with very distinct squares and clearly distinguishable lines of division (each 30 µm in width), should be noted.

The instruments and hardware have successfully been adapted and validated for the simultaneous recording of more than 3 000 high quality spots per chip, which is unprecedented. The data recording contains not only the position and intensity of the staining, but also the time of recording, allowing the on and off rates of any interaction to be determined. Uniquely adapted and dedicated software had to be written to handle this wealth of data. A user interface to manage several thousand spots was created. During each injection, there is a real time display of the surface as in the standard software (raw and difference image) in order to have a qualitative check during the injection. No kinetic interaction curve is displayed in real time, but all the images of the injection are saved as well as a text file containing the data necessary to extract kinetic curves from the images. After each injection, the programme loads the acquired images and calculates the interaction kinetic curves, which are displayed. This software can handle the many thousands of spots of the peptide microarray. The system has been validated and found capable of extracting kinetics data in several different model systems.

WP4: 'Optimal coatings of peptide synthesis matrices' (Leader P4, COM.IHC)

The peptide synthesis involves very harsh chemical conditions that are not readily adaptable for the SPRi detection system. This required the development of new strategies for peptide synthesis preferably on gold surfaces suitable for both peptide micro-array fabrication and SPRi. In particular, it was essential to develop a suitable chemistry for covalent attachment of peptide amino groups to the gold-coated surfaces compatible with peptide synthesis chemistry and SPRi-detection. Different chemistries have been tested and the density, spatial distribution and surface coverage of the first layer of immobilised peptides have been optimised in order to achieve a higher density of immobilised peptides. Whereas several approaches failed, one approach involving self-assembling monolayers (SAMs) of thiolated compounds proved successful. SAM-based coatings have subsequently been examined in detail including their optimal coating density.

Using the developed peptide microarray matrices, the magnitude of the change in SPRi reflectivity could be monitored simultaneously for all the peptide fields in real time. The kinetic constants and the binding affinity between the antibody and each of the fields can be derived from the data if different concentrations of the analyte are injected over the chip surface. With the latest generation of coatings, more than 3 600 fields could be analysed on an SPRi prism.
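
One common way of deriving kinetic constants from injections at different analyte concentrations is sketched below. It assumes a 1:1 binding model in which the observed association rate of each injection is k_obs = k_on * C + k_off, so that a straight-line fit of k_obs against concentration gives k_on as the slope and k_off as the intercept. The model choice, the function name and the example numbers are assumptions; the report does not state which fitting procedure the project software uses.

```python
def fit_rate_constants(concentrations, k_obs_values):
    """Least-squares line through (C, k_obs) points for one peptide field.

    Returns (k_on, k_off, K_D) with K_D = k_off / k_on.
    """
    if len(concentrations) != len(k_obs_values) or len(concentrations) < 2:
        raise ValueError("need at least two matching (C, k_obs) pairs")
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_k = sum(k_obs_values) / n
    s_cc = sum((c - mean_c) ** 2 for c in concentrations)
    s_ck = sum((c - mean_c) * (k - mean_k)
               for c, k in zip(concentrations, k_obs_values))
    k_on = s_ck / s_cc              # slope, in 1/(M s)
    k_off = mean_k - k_on * mean_c  # intercept, in 1/s
    return k_on, k_off, k_off / k_on


if __name__ == "__main__":
    # Synthetic, noise-free example values (not measured data).
    concentrations = [1e-8, 2e-8, 4e-8]
    k_obs_values = [2e-3, 3e-3, 5e-3]
    print(fit_rate_constants(concentrations, k_obs_values))
```

In a microarray setting the same fit would simply be repeated for every peptide field, which is why a single series of injections can yield thousands of affinity estimates.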

As a real application, linear 11-residue peptides, derived from a model protein more than 500 amino acids long and overlapping by 10 amino acids, were synthesised on an SPRi chip coated with the SAM optimised chemistry. The resulting high density microarray bearing 3 600 peptide fields was monitored for biomolecular binding of human monoclonal antibodies against the protein by the automated SPRi sensor. Monoclonal antibodies against the protein were screened for their signature by sequentially injecting them over the chip surface. On zooming in on the sequences that react with the antibody, and are therefore candidates to be part of the epitope, it can be observed that the antibody reacts with the surface in general, albeit with a lower intensity, and with some specific peptide sequences. These results were successfully benchmarked against other epitope mapping techniques (such as enzyme linked immunosorbent assay (ELISA), western blot and spotted peptide microarrays with fluorescence readout). This demonstrated the feasibility and extremely high potential of the combination of these technologies. The reusability of the microarray allows the fingerprinting of one antibody every eight minutes. We foresee vast potential in several applications, considering that multiple proteins can be synthesised on one chip and their linear epitopes could be identified within minutes. Thus, a spinout company has been created.

WP5: 'Interpretation and design of peptide microarrays' (Leader P5, DTU)

The objectives of this work package were to create a database/data mining system for peptide microarrays and to assist in peptide chip designs. To this end, WP5 has exploited a state of the art bioinformatics based method, NNAlign, which is generally applicable to any biological problem where quantitative peptide data is available. This method, which is based on previous work by the group, efficiently identifies underlying sequence patterns by simultaneously aligning peptide sequences and identifying motifs associated with quantitative readouts. The method is capable of handling large data sets and is therefore suitable for analysing high-throughput peptide array data. To demonstrate the utility of NNAlign, we have successfully applied this method to several different peptide microarray data sets including some sets containing more than 100 000 data points.

To make this method generally available to the scientific community, we have embedded it into a public online web interface that facilitates handling of input data, optimisation of essential training parameters and visual interpretation of the results. The resulting method may also be readily applied to generate predictions on user specified proteins and peptides, or even complete proteomes, thereby guiding the generation of other experiments. Through the server, the user can easily set up a cross-validation experiment to estimate the predictive performance of the trained method and automatically reduce redundancy in the data. The logo visualisation is also improved with an algorithm that aligns individual neural networks to maximise the information content of the combined alignment. This web-based extension of the NNAlign method empowers experimentalists of limited bioinformatics background with the ability to perform advanced bioinformatics-driven analysis of their own sets of large-scale data. The web-server interface of the NNAlign method is publicly available at 'http://www.cbs.dtu.dk/services/NNAlign'.

We also developed a new algorithm to aid the design of high throughput experiments and minimise the number of sequences required to cover a large set of proteins. The algorithm generates, from a number of protein sequences, a minimum set of peptides of a given length L that guarantees the presence of any possible stretch of E amino acids in the sequence data set. This reduction strategy greatly reduces the number of peptides to be tested on a peptide chip compared to, for example, a systematic scan with overlapping peptides on the query sequences. For example, generating 20 amino acid long peptides from the whole human proteome with an overlap of 19 amino acids (i.e. an offset of one) produces more than 12 million different peptides. With the reduction algorithm, we can for example ensure coverage of all possible stretches of eight amino acids (E=8) with fewer than a million peptides. By means of this reduction strategy and with the density currently achievable with the peptide array, it is theoretically possible to include the complete human proteome on a single peptide array slide.
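
The idea of covering every stretch of E residues with as few peptides of length L as possible can be illustrated with a simple greedy scan. This sketch is not the project's algorithm, whose details are not given here, and a greedy scan does not guarantee a true minimum; the function names are hypothetical. It does guarantee that every stretch of E residues occurring in the input is contained in at least one selected peptide.

```python
def stretches(sequence, size):
    """All contiguous substrings of the given size."""
    return {sequence[i:i + size] for i in range(len(sequence) - size + 1)}


def reduce_peptides(proteins, L=20, E=8):
    """Greedily pick peptides of length L covering every E-residue stretch."""
    if E > L:
        raise ValueError("E must not exceed L")
    covered = set()
    chosen = []
    for protein in proteins:
        for i in range(len(protein) - E + 1):
            if protein[i:i + E] in covered:
                continue  # this stretch is already present in a chosen peptide
            # Start the peptide at the uncovered stretch, shifted back if
            # needed so that it does not run past the end of the protein.
            start = min(i, max(len(protein) - L, 0))
            peptide = protein[start:start + L]
            chosen.append(peptide)
            covered |= stretches(peptide, E)
    return chosen


if __name__ == "__main__":
    print(reduce_peptides(["ACDEFGHIKL", "ACDEFGHIKL"], L=6, E=3))
```

In the toy example, two peptides cover all three-residue stretches of both input sequences, whereas a scan with an offset of one would produce five peptides per sequence. Savings arise both from spacing peptides further apart and from stretches shared between proteins.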

WP6: 'Mapping antibody specificities' (Leader P6, KTH)

This is one of three WPs aimed at validating and illustrating the utility of the peptide microarray technology. In this particular WP, the aim was to use peptide chips developed within the programme for high-throughput characterisation of mono-specific antibody epitopes and increase the value of the antibodies as proteomic tools. A set of 22 suitable proteins was used to generate mono-specific antibodies (msAbs). The required peptide length for an antibody to interact with the peptide array and yield a clear signal was evaluated. As an example, polyclonal rabbit antibodies against a 132-mer protein fragment from human EGFL6 were produced and a peptide array was synthesised by P2. This array contained all possible 15-mer peptide fragments from the original 132-mer fragment. The array was probed with the rabbit antibodies followed by a fluorochrome labelled secondary antibody. Six to nine regions of the peptide antigen that are bound by the polyclonal rabbit antibody were clearly revealed. Similar analyses of other peptide antigens supplied by P6 have revealed more than 70 different peptide epitope regions and such regions were found in more than 95 % of the antigens when analysed with their corresponding polyclonal antibodies.

Apart from being contained within the corresponding 15-mer peptides, little detailed information about the determinant amino acid residues within the 15-mers can be extracted from the bar chart. A feature in the design programme allows for layout of arrays in which each peptide is synthesised in multiple variants characterised by systematic single-residue substitutions of the original amino acids in the peptides with other amino acids. The analysis programme is able to make statistical analyses of the signals obtained from each variant of the peptides and, if possible, to deduce which amino acids in the original peptide contribute significantly to the binding of antibody. As an example, detailed analyses were made on two antibody binding peptides in the EGFL6 antigen. In both examples each of the amino acids in the original 15-mer peptides has been replaced, one by one, by each of the 20 common amino acids. The statistical analysis was performed as an ANOVA to determine if any of the average values for the columns differed significantly from one, i.e. if substitutions at the corresponding position influenced the binding of antibody to the peptide. If yes, then a further statistical analysis, Tukey's honestly significant difference (HSD) test, was performed to determine which columns differed significantly from one.
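
The systematic single-residue substitution design, and the per-position averages that feed the statistical analysis, can be sketched as follows. The function names and the dictionary layout are assumptions for illustration, and the ANOVA and Tukey HSD steps themselves are not implemented here.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 common amino acids


def substitution_variants(peptide):
    """Map (position, residue) to the peptide with that single substitution.

    The entry where the residue equals the original one is the unmodified
    peptide, so a 15-mer gives 15 x 20 = 300 entries.
    """
    variants = {}
    for position in range(len(peptide)):
        for residue in AMINO_ACIDS:
            variants[(position, residue)] = (
                peptide[:position] + residue + peptide[position + 1:]
            )
    return variants


def position_means(signals, reference_signal, peptide_length):
    """Mean signal per position, relative to the unmodified peptide.

    `signals` maps (position, residue) to a measured value. A mean close
    to one suggests the position tolerates substitution; a mean well below
    one suggests the original residue contributes to binding.
    """
    means = []
    for position in range(peptide_length):
        ratios = [signals[(position, residue)] / reference_signal
                  for residue in AMINO_ACIDS]
        means.append(sum(ratios) / len(ratios))
    return means


if __name__ == "__main__":
    variants = substitution_variants("ACDEFGHIKLMNPQR")
    print(len(variants))
```

Each list of ratios corresponds to one column in the analysis described above; a statistics package would then be used to test whether the column means differ significantly from one.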

For validation purposes, epitope mappings were done using two orthogonal techniques: bacterial surface display and Luminex peptide screening. Regarding the ability of the different methods to deliver reliable epitope data, it is so far clear that the PEPCHIPOMICS method worked well on linear epitopes. However, a more detailed comparison between the PEPCHIPOMICS method and the other methods revealed some minor differences, which were expected given the differences in the nature of the methods and in their ability to present structured surfaces for binding. The PEPCHIPOMICS method could generally determine the interaction more precisely and in more detail than the other methods, but missed some regions that might require a larger peptide or a peptide presented in a different context for efficient binding.

When using antibodies as a diagnostic tool in a clinical context, it is highly desirable to have an antibody which shows distinct, selective binding to the protein of interest and limited binding to other regions and proteins (off-target binding). We wanted to explore the possibility of using epitope information to generate a cleaner signal from a candidate antibody when used for immunohistochemical (IHC) detection of a cancer-associated protein (SATB2) in a paraffin-embedded human cancer tissue biopsy. Our setup was first to perform epitope mapping on the polyclonal antibody and then to use the epitopes for selective fractionation into epitope-specific antibodies by affinity chromatography. The performance of the antibodies was then evaluated using IHC, western blotting and subcellular localisation using the HPA workflow.

WP7: 'Mapping protease cleavage specificities' (Leader P7, JGUM.IFI)

The generation of peptides from intact proteins, which are detected by cytotoxic T-lymphocytes (CTL) after binding to major histocompatibility complex (MHC) molecules, requires the action of several proteases (most prominently the proteasome) located in the cytosol and endoplasmic reticulum of the cell in contact with CTL. These proteolytic processes, which act on all proteins exposed to the cytosolic protein degradation pathway, can also lead to the generation of peptides derived from tumour-associated or tumour-specific proteins. Detection of these peptides by CTL will lead to the specific elimination of tumour cells. Therefore, a detailed understanding of the involvement of the different proteases and their specificities will greatly support the understanding of the interaction between the immune system and tumours and will aid the development of new tumour-specific immunotherapies.

Currently, knowledge about the specificity of the different proteases is based on experimental data from in vitro digests of short peptides and a few model proteins. Therefore, we attempted to extend the current knowledge on the specificity of these proteases and their effect on the generation of tumour-specific and viral CTL epitopes using high-density peptide arrays. An in-depth analysis of the cleavage specificity of these proteases might help to identify, in particular, those CTL epitopes where the proteasome is responsible for generating only the C-terminus and not the N-terminus. We developed a new fluorescence-based approach to detect protease digestion on the peptide microarray platform.

This strategy successfully generated data on the digestion of more than 115 000 peptides by different model proteases such as trypsin and chymotrypsin. The data could be analysed by the NNAlign algorithm of P5 and showed intelligible patterns that could be used to generate efficient predictive models. Thus, these experiments demonstrate the utility of the platform in generating biologically meaningful data on protease recognition specificity.
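To illustrate the kind of pattern such digestion data are expected to reveal for the model proteases, the sketch below applies simplified textbook cleavage rules to a peptide. It is an assumption-laden illustration and not the NNAlign model or the project's analysis: the rule table, the function names `cleavage_sites` and `digest`, and the example peptide are hypothetical.

```python
# Simplified textbook rules (an assumption for illustration only):
# trypsin cleaves after K or R, chymotrypsin after F, W or Y,
# and neither cleaves when the next residue is P.
RULES = {"trypsin": "KR", "chymotrypsin": "FWY"}


def cleavage_sites(peptide, protease):
    """Return indices i where the bond after peptide[i] is predicted to be cut."""
    residues = RULES[protease]
    return [
        i for i in range(len(peptide) - 1)
        if peptide[i] in residues and peptide[i + 1] != "P"
    ]


def digest(peptide, protease):
    """Split a peptide into the fragments predicted by the simplified rules."""
    fragments, start = [], 0
    for i in cleavage_sites(peptide, protease):
        fragments.append(peptide[start:i + 1])
        start = i + 1
    fragments.append(peptide[start:])
    return fragments


example = "AGKTFRPLYDE"  # made-up peptide
print(digest(example, "trypsin"))       # ['AGK', 'TFRPLYDE']
print(digest(example, "chymotrypsin"))  # ['AGKTF', 'RPLY', 'DE']
```

A data-driven model trained on array measurements would replace the fixed rule table with position-specific preferences learned from the digested and undigested peptides.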

In addition, WP7 developed and validated an SPRi-based approach to protease activity detection on high-density peptide microarrays, enabling direct, label-free kinetic monitoring of protease activity.

This WP also had the very ambitious goal of addressing the specificity of the considerably more complicated multi-catalytic protease complex, the proteasome. Since the proteasome is a barrel-shaped structure with several inner proteolytic sites, we knew that it would be critical to design linkers that would allow model peptides still tethered to the peptide microarray to enter the proteasome and be digested. Unfortunately, we were unable to identify conditions that would allow this to happen.

WP8: 'Mapping cellular immune specificities' (Leader P1, UCPH)

Peptide recognition is at the heart of the cellular, or T cell, immune system. An efficient peptide microarray technology will therefore be immensely useful in the mapping of T cell specificities. All T cells are said to be MHC-restricted, i.e. their T cell receptors (TcR) recognise peptides presented in the context of MHC molecules. MHC molecules are broadly specific peptide receptors, which sample peptides from our protein metabolism. MHC class I molecules are involved in accessing intracellular protein metabolism and presenting peptides from our own proteins (or from viral proteins made by our own protein synthesis machinery) to cytotoxic T cells. MHC class I allows the immune system to examine the origin of any protein currently being synthesised and eradicate any target cell harbouring foreign genes encoding foreign proteins. MHC class II molecules are involved in accessing extracellular proteins, which are internalised by antigen-presenting cells and degraded in the endosomal system. MHC class II allows the immune system to examine the origin of proteins in the extracellular space and help other components of the immune system attack such threats. These complicated systems allow the immune system to monitor the integrity of our body and react to any foreign threat. A detailed description of how the immune system handles proteins and generates peptides should enable scientists and clinicians to analyse any protein of interest for the presence of potentially immunogenic CTL epitopes (e.g. vaccine candidates).

The application of the arrays to map cellular immune specificities is also an extremely ambitious goal of the project. This WP initially encountered major problems caused by suboptimal MHC preparations and suboptimal peptide microarray matrices. New methods had to be established to monitor and then improve peptide binding. Pilot experiments were conducted in surrogate peptide microarray systems to establish the optimal MHC preparations and binding conditions. Likewise, several different peptide microarray matrices were tested. Eventually a method was developed that gave MHC class II signal intensity and quality comparable to those seen for antibodies, with a resolution at the single-mirror (10 x 10 µm) level.

The application of the peptide microarray technology to MHC class I molecules is even more ambitious. The primary reason for this complication lies in the binding cleft of MHC class I, which is closed at both ends. In principle, this prevents the binding of peptides that are anchored to a support at either end. An elaborate branched and chemically controllable peptide synthesis support system was developed that would allow a peptide to be synthesised, then tethered through a predetermined internal amino acid residue to the matrix and finally released with free and intact amino and carboxy ends while still being attached to the chip. Pilot experiments demonstrated the feasibility of this strategy, which was then transferred to the peptide microarray synthesis. MHC class I binding to the peptide microarray was detected using the conformation-specific antibody W6/32. The results demonstrate specific and strong binding of relevant MHC class I molecules to appropriate peptides. The data obtained on the peptide array have been analysed using the NNAlign server of P5, and the resulting logo shows great similarity to the logo made from data from other assays.

From a structural point of view, it is inconceivable that the internally tethered residue would be acceptable in the crucial anchor positions of MHC binding. Internal tethering can be done at any position in the peptides. It is reassuring to find that systematic changes of the tethering position show that tethering is not allowed in the known anchor positions.

We conclude that peptide microarrays can make a significant contribution to the analysis of cellular immune responses.

Potential impact:

The very large number of different peptides achievable with this new technology makes it possible to represent all peptide fragments from all human proteins (the human proteome), or microbial proteomes, on a single peptide chip. Thus, it should now be possible to screen for targets at the proteomic level. This will support new high-throughput technologies for peptide- and protein-driven research and development. The ability to address peptides at a level that matches the current genomics revolution may have a significant impact on how scientific and/or clinical questions can be addressed in the future and on how biotechnology and pharmaceutical problems can be solved. We expect the peptide microarray platform to become one of the high-throughput tools used to interpret genome and proteome information. The technology will be applicable to a range of other biological projects aiming at the identification and characterisation of ligand interactions. Among these, the search for ligands influencing neural plasticity, tumour metastasis and receptor function will be prominent, but the possibilities are legion and many spin-off activities are envisaged. This should constitute a market that can be exploited by the SME partners of the project. In addition, the successful results of the project have led to the creation of a new spinout company.

The socioeconomic impact and the wider societal implications

A mature peptide microarray platform will support the general objective of overcoming bottlenecks in the investigation of protein functions in cells and contribute to the strategic objective of developing the knowledge, tools and resources needed to exploit the full potential of genome information and apply it to human health, and at the same time stimulate industrial and economic activity.

The successful conclusion of this project will contribute to the goals of strengthening the competitiveness of the European economy, generating a knowledge-based economy, solving major societal questions etc. More specifically, the immensely successful human genome project has generated a complete map and sequence representation of the entire ensemble of human genes. Now follows the even more complex task of studying and understanding all the proteins encoded by the genome, i.e. the human proteome. The success of proteomics will depend upon the development of new and improved technologies, including high-throughput assays allowing rapid and large-scale analysis of proteins. Even though the task ahead is tremendous, the rewards may be very substantial. The European Federation for Pharmaceutical Sciences (EUFEPS) has recently stated that 'the chance of a lead molecule becoming a commercial medicine is less than 2 %'. Although the genomic revolution has generated more leads, which could improve this success rate, the techniques currently available for rapidly selecting the most promising candidates are inadequate: 'an improvement of the entire research and development process is, thus, essential'. Hence, the European pharmaceutical industry recognises that the lack of efficient post-genomic technologies represents a substantial barrier to the rational development of future drugs. With this new peptide microarray technology we will be narrowing the industry gap by developing new high-throughput tools for diagnosis, prevention and therapy.

The two SMEs involved have generated new products and services that they will start marketing after the completion of the project. In addition, one spinout start-up company has been created as a result of this project: BioSyPher Ltd ('http://biosypher.com/index.html'), which provides services using this biointeraction kinetic screening platform for peptide biointeraction monitoring, epitope kinetic mapping and bioassay development, while at the same time developing its advanced biointerfaces for SPRi and biotechnological products. Gerardo Marchesini, a Joint Research Centre (JRC) postdoc for the PEPCHIPOMICS project, is the founder of BioSyPher Ltd. He has been negotiating access to the European Commission JRC facilities and a licence to existing intellectual property (IP) and to IP generated during the PEPCHIPOMICS project for its commercial valorisation.

List of websites:

The website of the project can be found at 'http://www.pepchipomics.ku.dk'.

Contact details:

Professor Søren Buus, MD, PhD, Laboratory of Experimental Immunology

Faculty of Health Sciences, University of Copenhagen

Panum Building 18.3.12 Blegdamsvej 3

DK-2200 Copenhagen N, Denmark

Telephone: +45-353-27885, Fax: +45-353-27696, Email: 'sbuus@sund.ku.dk'