High-resolution Mapping of Linear Antibody Epitopes Using Ultrahigh-density Peptide Microarrays
Antibodies empower numerous important scientific, clinical, diagnostic, and industrial applications. Ideally, the epitope(s) targeted by an antibody should be identified and characterized, thereby establishing antibody reactivity, highlighting possible cross-reactivities, and perhaps even warning against unwanted (e.g. autoimmune) reactivities. Antibodies target proteins as either conformational or linear epitopes. The latter are typically probed with peptides, but the cost of peptide screening programs tends to prohibit comprehensive specificity analysis. To perform high-throughput, high-resolution mapping of linear antibody epitopes, we have used ultrahigh-density peptide microarrays generating several hundred thousand different peptides per array. Using exhaustive length and substitution analysis, we have successfully examined the specificity of a panel of polyclonal antibodies raised against linear epitopes of the human proteome and obtained very detailed descriptions of the involved specificities. The epitopes identified ranged from 4 to 12 amino acids in size. In general, the antibodies were of exquisite specificity, frequently disallowing even single conservative substitutions. In several cases, multiple distinct epitopes could be identified for the same target protein, suggesting an efficient approach to the generation of paired antibodies. Two alternative epitope mapping approaches identified similar, although not necessarily identical, epitopes. These results show that ultrahigh-density peptide microarrays can be used for linear epitope mapping. With an upper theoretical limit of 2,000,000 individual peptides per array, these peptide microarrays may even be used for a systematic validation of antibodies at the proteomic level.
The immune system is endowed with a highly diverse repertoire of antibodies capable of targeting virtually any molecular structure. As specific affinity reagents, antibodies have become indispensable tools with a wide range of scientific and diagnostic applications (1, 2). Thus, antibodies are the main priority of several recent initiatives such as the Human Protein Atlas (3) and the ProteomeBinders consortium (4, 5) and of efforts to generate antibodies against cancer-related targets (6, 7), all of which aim to systematically generate affinity reagents, thereby facilitating the study of proteins and their role in biology and disease. As therapeutic agents, monoclonal antibodies have emerged as essential drugs with a wide range of clinical applications, making monoclonal antibodies one of the highest priorities of the pharmaceutical industry (8–11). The efficiency, accuracy, and safety of these antibody-mediated applications depend crucially on the selected antibodies being directed against the intended, and not against any unintended, target structure(s) (12). Specificity, the quintessential characteristic of an antibody, is therefore not only of scientific interest, but also of considerable practical importance.
For any antibody-based application, the establishment of specificity constitutes an important aspect of the validation process. Traditionally, the specificity of an antibody is examined in one or more in vitro assays (ELISA, Western blot, immunohistochemistry, flow cytometry, surface plasmon resonance, and many more (12–14)). Ideally, the entire epitope space should be examined; however, it is rarely possible to test more than a minor and ostensibly relevant part of the epitope space. What is relevant depends on the intended use; thus, the same antibody might exhibit sufficient and relevant specificity in one, but not in another, application (15). An important aspect of validating the specificity of an antibody is to determine the structure of the epitope that the antibody interacts with (12). Ideally, one would like to determine the three-dimensional structure of the binding complex using x-ray crystallography (16–18) or NMR; however, such efforts are laborious and tend to have a low success rate and throughput. Many other epitope mapping approaches, such as fragmentation (19) or deuterium exchange in the presence or absence of antibody (20), directed mutagenesis, recombinant expression (including arrayed in situ cell-free translation approaches (20, 21)) of protein and peptide arrays, etc., have been suggested (12). Despite this plethora of methods, exact epitope information is lacking for the vast majority of antibodies used in life science research, and there is a significant need for simple and rapid methods to map epitopes. The availability of such methods would also support the selection of paired antibodies that each bind to separate parts of an antigen, thereby allowing one antibody to validate the results of another (12, 22).
Proteins constitute important immune targets, and many of the methods used to address antibody specificity are tailored for protein antigens. Traditionally, protein epitopes have been divided into discontinuous/conformational epitopes, which require that the native protein structure be intact, or continuous/linear epitopes, which may be represented by consecutive overlapping synthetic peptides encompassing the complete primary structure of the target antigen (15). The mapping resolution of linear epitopes depends on the peptide length, the overlap chosen for the initial epitope location, and the scale of the subsequent fine specificity analysis (e.g. N- and C-terminal truncations; amino acid scans; random single, double, or triple substitutions; etc.). The number of peptides required can be substantial, making the cost of peptides and the logistics of handling large panels of peptides a serious impediment to the in-depth characterization of linear epitopes. Most standard peptide synthesis equipment can synthesize only up to a few hundred single peptides simultaneously, although lately up to 8000 peptides have been synthesized in parallel on a cellulose membrane (23–25) using the SPOT™ technique. In addition to performing assays directly on the membrane (26), such peptides can be released and transferred onto glass slides using additional robotics and printing techniques (25). As alternatives to synthetic peptides, phages (27), bacteria (28), and yeast cells (29) have been used to express libraries of fragmented antigens (27) or of combinatorial peptides (30). These methods can potentially generate millions of peptides covering entire protein antigens, and they may, at least in some cases, mimic conformational epitopes (15, 31–33). Major drawbacks of these methods include the lack of control of the exact peptide sequences expressed and the need for separate sequencing of positive clones. None of these drawbacks are encountered with peptide microarrays.
Here, we present the first report on the feasibility of using ultrahigh-density peptide microarrays to address antibody specificities in casu mapping the fine specificity of polyclonal antibodies raised against linear protein epitopes. This allowed a fast and exhaustive analysis of the length requirements and a detailed analysis of the fine specificity of these antibodies. We suggest that specificity analysis of linear epitopes using ultrahigh-density peptide microarrays addressing the entire human proteome is within reach.
Derivatization of Synthesis Slides
Microscope slides (Nexterion E; Schott AG, Jena, Germany) for synthesis of the arrays were derivatized via incubation with 1 g/l bovine serum albumin in 0.5 M N-methylmorpholine (NMM)/acetate pH 8.5 for 3 h at room temperature. The slides were washed in water, N-methylpyrrolidone (NMP), and dichloromethane (DCM) and stored dry until use. Synthesis of the microarrays was performed directly on the BSA-coated slides using the epsilon amino groups of sterically exposed lysines as the starting point.
Peptide arrays were synthesized by Schafer-N (Copenhagen, Denmark) using a maskless photolithographic technique (34) in which 365 nm light with an energy density of ca. 20 mW/cm² was projected onto 3′-nitrophenylpropyloxycarbonyl (NPPOC)-photoprotected (35, 36) amino groups on a glass surface in patterns corresponding to the synthesis fields. Details of the technique will be published elsewhere, but briefly, the patterns were generated using digital micromirrors and projected onto the synthesis surface using UV-imaging optics (supplemental Fig. S1A). In each layer of amino acids, the relevant amino acids were coupled successively to predefined fields after UV-induced removal (in 1 M diisopropylethylamine (DIEA) in NMP) of the photoprotection groups in those fields. The couplings were made using standard Fmoc-amino acids activated with O-benzotriazole-N,N,N′,N′-tetramethyl-uronium-hexafluoro-phosphate/DIEA in NMP. After coupling of the last Fmoc-amino acid in each layer, all Fmoc-groups were removed in 20% piperidine in NMP and replaced by NPPOC groups (37) coupled as the chloroformate in DCM with 0.1 M DIEA. The procedure was then repeated until all amino acids had been added to the growing peptide chains (supplemental Fig. S1B). Final cleavage of side protection groups was performed in TFA:1,2-ethanedithiol:water 94:2:4 v/v/v for 2 h at room temperature.
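The layer-by-layer logic described above can be illustrated with a minimal sketch. It is not the synthesis control software, whose details are not given here; the function name `plan_exposures`, the field identifiers, and the three example peptides are hypothetical. The sketch assumes that chains grow from the surface outward, so that the first layer carries the C-terminal residue, and it lists, for each layer, which fields would be illuminated before each amino acid is coupled.

```python
from collections import defaultdict

def plan_exposures(field_peptides):
    """Return one dict per layer: amino acid -> sorted field ids to illuminate.

    field_peptides maps a (hypothetical) field id to its peptide, written
    N- to C-terminal. Layer 0 holds the C-terminal residue because this
    sketch assumes chains are built from the surface outward.
    """
    n_layers = max(len(p) for p in field_peptides.values())
    layers = []
    for layer in range(n_layers):
        masks = defaultdict(list)
        for field, peptide in field_peptides.items():
            if layer < len(peptide):
                residue = peptide[-1 - layer]  # walk from C- to N-terminus
                masks[residue].append(field)
        layers.append({aa: sorted(ids) for aa, ids in sorted(masks.items())})
    return layers

# Three illustrative fields; the sequences are toy examples.
fields = {0: "EFKDPS", 1: "EFADPS", 2: "KDPS"}
plan = plan_exposures(fields)
for number, masks in enumerate(plan):
    print(number, masks)

# One illumination and coupling cycle is needed per (layer, amino acid) pair.
print(sum(len(masks) for masks in plan))  # 7
```

In this toy case only the fourth layer needs two separate exposures (one for A, one for K); every other layer is served by a single pattern.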
Epitope Mapping Using Peptide Arrays
Primary rabbit polyclonal antibodies were diluted to a concentration of around 100 ng/ml in PBS-Tween. Deprotected slides were blocked and hydrated overnight in a mixture of 1 g/l bovine serum albumin and 0.1% v/v detergent (Tween 20) in PBS and incubated for 1 h at room temperature with relevant polyclonal anti–protein epitope signature tag (PrEST) rabbit antibodies as primary reagents. After washing, the slides were incubated for 1 h at room temperature with Alexa Fluor 488-labeled goat-anti-rabbit IgG (Invitrogen, Carlsbad, CA) as a secondary reagent. Images of the stained arrays were recorded using an MVX10 fluorescence microscope equipped with an XM10 cooled digital camera (both from Olympus, Ballerup, Denmark) and analyzed using the analysis program PepArray (Schafer-N, Copenhagen, Denmark). See the supplementary information for a brief description of the PepArray program.
Epitope Mapping Using Cell-surface Display
Mapping using cell-surface display was performed as described elsewhere (28). Briefly, gene fragments encoding the different antigens were amplified separately via PCR (4.8 ml pooled), and the products were sonicated to generate random fragments. These were blunt-ended and phosphorylated before ligation into the cell-surface expression vector pSCEM2 and transformed into Staphylococcus carnosus. Cell aliquots of about 10-fold coverage of the library were incubated with about 1 ng antibody in reaction volumes of 70 μl PBS-P. Cells were washed and fluorescently labeled with Alexa 488 secondary goat-anti-rabbit antibodies (Invitrogen, Carlsbad, CA) and Alexa 647 labeled albumin for expression normalization and then washed again ahead of analysis via FACS. Single cells expressing antibody-binding peptides were sorted, sequenced, and aligned back to the target protein sequence.
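The final step, aligning sorted fragments back to the target sequence, can be sketched as a per-residue coverage count. This is an illustration only and not the procedure of reference 28: it assumes the sequenced inserts have already been translated to peptide sequences, that they match the target exactly, and the fragment list below is invented for the example.

```python
def coverage(target, fragments):
    """Count, for each residue of target, how many fragments cover it."""
    counts = [0] * len(target)
    for fragment in fragments:
        start = target.find(fragment)
        if start < 0:
            continue  # no exact match; such fragments are ignored in this sketch
        for i in range(start, start + len(fragment)):
            counts[i] += 1
    return counts

target = "RAEVTEEFKDPSDRV"                                  # 15-mer quoted in the text
fragments = ["AEVTEEFKDPS", "EEFKDPSDRV", "TEEFKDPSD"]      # hypothetical binders

counts = coverage(target, fragments)
print(counts)

# Residues covered by every matching fragment (contiguous in this toy case).
shared = "".join(aa for aa, c in zip(target, counts) if c == max(counts))
print(shared)  # EEFKDPS
```

The region shared by all binding fragments gives an upper bound on the epitope; it cannot be narrower than the overlap that the sorted fragments happen to provide.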
Ultrahigh-density Peptide Microarrays
Peptide arrays were generated by a combined maskless photolithographic (34) and solid phase peptide synthesis strategy using a digital mirror device (1080P DMD (Digital Light Projections, Digital Light Innovations, Austin, TX) with 1920 × 1080 = 2,073,600 individually addressable micromirrors) to project 365 nm light onto NPPOC-photoprotected (35, 36) amino groups on a glass surface in patterns corresponding to the fields where the next amino acid extension should occur (supplemental Fig. S1A). Successively removing photoprotection groups, extending the growing peptide chain with standard Fmoc-protected amino acids, and exchanging the Fmoc-groups with NPPOC-groups after all extensions in a given layer (37) allowed individually predefined peptides to be built in each synthesis field (supplemental Fig. S1B). After synthesis of the peptide backbones, all side chain protection groups were removed via TFA treatment, leaving the peptides attached to the matrix through their C-terminals. Typically, each synthesis field was defined by a square measuring 2 × 2 (as in Fig. 1) or 3 × 3 mirrors. However, because synthesis fields defined by as few as one mirror could be discerned (Fig. 1), the maximum number of different synthetic peptides that can be realized with the current DMD device appears to be around 2,000,000 on a surface area of ∼2 cm².
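The relationship between field size and array capacity follows directly from the mirror count. The short calculation below is a sketch that assumes square fields tiled edge to edge, with no guard spacing between fields and no mirrors reserved for controls; the function name `max_fields` is illustrative.

```python
MIRRORS_X, MIRRORS_Y = 1920, 1080  # 1080P DMD, as stated in the text

def max_fields(side):
    """Number of square fields of side x side mirrors that fit on the DMD."""
    return (MIRRORS_X // side) * (MIRRORS_Y // side)

for side in (1, 2, 3):
    print(f"{side} x {side} mirrors per field: {max_fields(side):,} fields")
# 1 x 1 mirrors per field: 2,073,600 fields
# 2 x 2 mirrors per field: 518,400 fields
# 3 x 3 mirrors per field: 230,400 fields
```

Under these assumptions, the typical 2 × 2 and 3 × 3 layouts accommodate several hundred thousand peptides, and only single-mirror fields approach the upper limit of about 2,000,000.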
Polyclonal Antibodies Specific for Linear PrEST Epitopes on Human Proteins
PrESTs are short (50 to 150 amino acids long) fragments of proteins that have been selected to be as sequence-dissimilar as possible to all other proteins in the corresponding proteome; that is, they aim to be unique and specific representatives of the proteins in question (38). As part of the Human Protein Atlas initiative (38), polyclonal rabbit antibodies were raised against PrESTs, which were expressed in E. coli and purified under denaturing conditions. Subsequently, the antibodies were affinity-purified using the same PrESTs as capture reagents. The specificities of the antibodies were validated with protein microarrays using immobilized PrESTs (see supplemental Fig. S2) and by Western blotting of lysates of human cells and tissues (data not shown). This immunization and purification strategy favors the generation of antibodies specific for linear epitopes. Theoretically, polyclonal antibodies could target multiple consecutive epitopes along the sequence of an extended protein in casu encompassing an entire PrEST; however, we have recently mapped a PrEST-specific polyclonal antibody to a few separate and distinct regions of its target protein, suggesting that large parts of a target sequence may be “epitope silent” (28, 39).
The Location and Length of Linear Epitopes
The ultrahigh-density peptide microarray technology was used to map the specificities of 22 polyclonal anti-PrEST antibodies (38). Initially, we addressed the location and length of the recognized epitopes by systematically scanning through the entire sequence of each PrEST using an overlapping peptide strategy with an offset of one amino acid (this is the smallest offset possible and thus allowed us to achieve the maximum resolution) and included all lengths from 2-mers to 20-mers. This experiment entailed the synthesis of more than 74,000 different peptides. To counter the possible influences of artificially introduced N-terminals or artificially tethered C-terminals, all peptides were extended with “padding sequences” (N-terminally with GAG and C-terminally with GAGADDD). Peptide microarrays were stained and recorded as detailed in Materials and Methods. Briefly, the slide was blocked, incubated for 1 h at room temperature with relevant polyclonal anti-PrEST antibodies as primary reagents, and stained for 1 h at room temperature with an Alexa Fluor 488-conjugated goat-anti-rabbit serum as a secondary reagent. Images of the stained arrays were captured with a fluorescence microscope and analyzed using a microarray analysis program.
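The scanning design can be made concrete with a short sketch that enumerates every window of 2 to 20 residues at an offset of one and adds the padding sequences. The function names `tile` and `n_windows` are illustrative and are not taken from the array design software; the 15-residue test sequence is one of the regions discussed later in the text.

```python
N_PAD, C_PAD = "GAG", "GAGADDD"  # padding sequences given in the text

def tile(sequence, min_len=2, max_len=20, offset=1):
    """Yield (start, core, padded) for every window of each allowed length."""
    for length in range(min_len, max_len + 1):
        for start in range(0, len(sequence) - length + 1, offset):
            core = sequence[start:start + length]
            yield start, core, N_PAD + core + C_PAD

def n_windows(n_residues, min_len=2, max_len=20):
    """Number of windows at offset 1 for a sequence of n_residues."""
    return sum(max(n_residues - k + 1, 0) for k in range(min_len, max_len + 1))

peptides = list(tile("RAEVTEEFKDPSDRV"))
print(peptides[0])         # (0, 'RA', 'GAGRAGAGADDD')
print(len(peptides))       # 105
print(n_windows(145))      # 2565
```

For a single 145-residue PrEST, this scheme gives 2,565 windows before any replicate or control fields are counted, which illustrates why conventional synthesis formats are quickly exhausted.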
All possible 15-mers from each of the 22 PrESTs were assayed with the corresponding polyclonal anti-PrEST antibodies, and the locations of one or more epitope regions were suggested for each anti-PrEST antibody. As a representative example, an epitope location and length scan of a polyclonal antiserum generated against a 145 aa long PrEST representing the 43 kDa human polypeptide 1 of the small nuclear RNA activating protein complex SNAPC1 is shown (Fig. 2). A bar graph illustrates the background-subtracted signals obtained from overlapping 15-mers with an offset of one amino acid. Several distinct peaks of reactivity were located (Fig. 2A). To give an overview of the relationship between peptide length and reactivity, data obtained with different peptide lengths were converted to color-coded strings of the PrEST sequence indicating strong, intermediary, and weak reactivity of the corresponding peptides (Fig. 2B). This readily revealed the shortest recognizable sequences of the most dominant reactivities (e.g. strongly interacting 6-mer EFKDPS and 7-mer KTNDGEE peptides; intermediary interacting 8-mer KLITSDVL, 10-mer VDKSKPDKAL, and 9-mer LDSSDSDSA peptides). The minimum length requirement thus varied considerably from epitope to epitope. For this particular polyclonal antibody preparation, no signals were obtained for peptides shorter than six amino acid residues (excluding paddings).
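One way to extract the shortest recognizable sequences from a length scan is to keep every reactive peptide that contains no shorter reactive peptide. The sketch below is an assumption about how such a reduction could be done, not a description of the PepArray program; the signal values and the threshold are invented for illustration, and only the two sequences EFKDPS and KTNDGEE come from the text.

```python
def minimal_reactive_cores(signals, threshold):
    """Return reactive cores that contain no shorter reactive core."""
    reactive = sorted(
        (pep for pep, value in signals.items() if value >= threshold),
        key=lambda pep: (len(pep), pep),  # shortest first, then alphabetical
    )
    minimal = []
    for pep in reactive:
        if not any(core in pep for core in minimal):
            minimal.append(pep)
    return minimal

# Hypothetical background-subtracted signals keyed by unpadded peptide.
signals = {
    "EFKDP": 40, "FKDPS": 35, "EFKDPS": 900, "EEFKDPS": 950, "EFKDPSD": 920,
    "TNDGEE": 30, "KTNDGE": 25, "KTNDGEE": 800, "EKTNDGEE": 780,
}

cores = minimal_reactive_cores(signals, threshold=500)
print(cores)  # ['EFKDPS', 'KTNDGEE']
```

Because peptides are processed from shortest to longest, any longer peptide that merely extends an already accepted core is discarded, leaving one minimal sequence per reactivity.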
Mapping the Fine Specificity of Polyclonal Antibodies Reacting with Linear Epitopes
To study the fine specificity of polyclonal anti-PrEST antibodies, complete single amino acid substitution analyses were performed on the most prominent epitope regions suggested by the overlapping peptide scans described above. In an attempt to encompass the putative epitope in its entirety, each region was represented by a 15-mer peptide centered at the position of peak reactivity. All 20 naturally occurring amino acids were systematically tested as single substitutions in all 15 positions. The signal obtained from each singly substituted peptide was divided by the signal obtained with the native 15-mer peptide, and the resulting relative signal (RS) values were used to generate position-specific scoring matrices (PSSMs). As representative examples of such a single substitution analysis (SSA), the previously described EFKDPS and KLITSDVL epitopes are illustrated (Figs. 3A and 3C, respectively). For each position, the mean and standard deviation of the 20 RS values were calculated. Positions with maximum selectivity (i.e. where only one amino acid is acceptable) would be represented by an RS value of 1 for the essential amino acid and of 0 for all the other amino acids, leading to an average RS value of 0.05 for this position. In contrast, positions with minimum selectivity (i.e. where any amino acid is acceptable) would be represented by RS values of 1 for all amino acids, leading to an average RS value of 1. A one-way analysis of variance (ANOVA) (40) was done for each PSSM to determine whether two or more of the mean values differed significantly from each other. If so, then Tukey’s least significant difference (LSD) was calculated to determine which of the mean values differed significantly (p < 0.01) from the null hypothesis of no selectivity (RS = 1). For the RAEVTEEFKDPSDRV region (Fig. 3A), the average RS values of the first six and last three positions did not deviate significantly from the null hypothesis (range: 0.83–1.06).
In contrast, the null hypothesis was rejected for positions 7–8 and 10–12, where the average RS values were significantly less than 1 (range: 0.14–0.56). Note that position 9 was only borderline selective, with an average RS value of 0.82, almost introducing a gap in the middle of this selectivity hot spot. Thus, the epitope contained within this 15-mer region could be defined as the 6-mer region containing the sequence EF-DPS (the most dominating residues are underlined, and any internal nonselective positions are indicated by a dash). Similarly, the epitope contained within the AVMKLITSDVLEEML region (Fig. 3C) could be defined as the 7-mer region containing the sequence KLITSDV (average RS values ranging from 0.15 to 0.60) surrounded by nonselective residues (average RS values ranging from 0.84 to 1.05). With a visual representation of the individual RS values employing a continuous color scale (green and red showing high and low binding, respectively), the fine specificities of these interactions were distinctly visible (see Figs. 3A and 3C). Note the strong discriminatory power of these polyclonal antibodies, where most of the selective positions excluded even single conservative substitutions. An alternative visual representation, a sequence LOGO, illustrated the demarcated borders of the epitopes and the presence of particularly selective positions (Figs. 3B and 3D).
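The construction of the substitution library and the two limiting values of the average RS (0.05 and 1) can be reproduced with a small sketch. The signal function below is a hypothetical all-or-none model in which the five residues E, F, D, P, and S of EF-DPS are each essential; it is not measured data. The fixed cutoff used to mark selective positions is a simple stand-in for the ANOVA-based test described in the text, and all function names are illustrative.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 naturally occurring amino acids

def substitution_library(native):
    """Map (position, residue) to the singly substituted peptide."""
    return {(i, aa): native[:i] + aa + native[i + 1:]
            for i in range(len(native)) for aa in AMINO_ACIDS}

def position_means(native, signal_of):
    """Mean relative signal per position; RS = signal / native signal."""
    ref = signal_of(native)
    lib = substitution_library(native)
    return [sum(signal_of(lib[(i, aa)]) / ref for aa in AMINO_ACIDS)
            / len(AMINO_ACIDS) for i in range(len(native))]

def epitope_string(native, means, cutoff):
    """Span from first to last selective position; '-' marks internal gaps."""
    hits = [i for i, m in enumerate(means) if m < cutoff]
    if not hits:
        return ""
    return "".join(native[i] if means[i] < cutoff else "-"
                   for i in range(hits[0], hits[-1] + 1))

NATIVE = "RAEVTEEFKDPSDRV"
SELECTIVE = (6, 7, 9, 10, 11)  # 0-based indices of E, F, D, P, S in EF-DPS

def toy_signal(peptide):
    """Hypothetical model: full signal only if all selective residues are native."""
    return 1000.0 if all(peptide[i] == NATIVE[i] for i in SELECTIVE) else 0.0

library = substitution_library(NATIVE)
print(len(library), len(set(library.values())))  # 300 286

means = position_means(NATIVE, toy_signal)
print([round(m, 2) for m in means])
print(epitope_string(NATIVE, means, cutoff=0.6))  # EF-DPS
```

The library contains 15 × 20 = 300 entries, of which 286 are distinct peptides because the native sequence recurs once per position. In the toy model the selective positions average 0.05 and all others average 1, matching the two extremes described above; real data fall between these limits.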
Complete fine specificity analyses were extended to 22 anti-PrEST antibodies involving 79 putative epitope regions. The majority of these analyses, 49 (62%; overview in Fig. 4), resulted in PSSMs showing a highly significant pattern of selective positions indicative of the presence of antibody epitopes (p < 0.00001, ANOVA); 95% of these epitopes appeared fully contained within the 15-mer stretch selected for the fine specificity analysis, whereas the remaining 5% extended to the N- or C-terminus of the selected 15-mer stretch and could potentially extend even further. Particularly noteworthy, the epitopes identified by the substitution analysis were in general well defined with sharply demarcated borders (Figs. 3 and 4). In general, the epitopes were from 5 to 10 amino acids long (range from 4 to 12 amino acids; Fig. 5). The epitopes contained highly dominant positions where only a few amino acid substitutions were acceptable, less dominant positions where several amino acid substitutions were acceptable, and nonselective positions where all amino acid substitutions were acceptable (as illustrated in Fig. 3). The SSA failed to identify epitopes in 30 of the 79 (38%) 15-mer regions selected for further analysis by the length scan. On a protein antigen basis, this analysis confirmed the presence of epitopes for 20 of the 22 examined antibodies (91%).
Articles from Molecular & Cellular Proteomics : MCP are provided here courtesy of American Society for Biochemistry and Molecular Biology