Proteome of the Escherichia coli envelope and technological challenges in membrane proteome analysis


Available online at www.sciencedirect.com

Biochimica et Biophysica Acta 1778 (2008) 1698 – 1713 www.elsevier.com/locate/bbamem

Review

Proteome of the Escherichia coli envelope and technological challenges in membrane proteome analysis

Joel H. Weiner a,b,⁎, Liang Li c

a Membrane Protein Research Group and The Institute for Biomolecular Design, University of Alberta, Canada
b Department of Biochemistry and Institute for Biomolecular Design, University of Alberta, Edmonton, Alberta, Canada T6G 2E7
c Department of Chemistry and The Institute for Biomolecular Design, University of Alberta, Edmonton, Alberta, Canada T6G 2E1

Received 18 April 2007; received in revised form 19 July 2007; accepted 23 July 2007 Available online 11 August 2007

Abstract

The envelope of Escherichia coli is a complex organelle composed of the outer membrane, periplasm-peptidoglycan layer and cytoplasmic membrane. Each compartment has a unique complement of proteins, the proteome. Determining the proteome of the envelope is essential for developing an in silico bacterial model, for determining cellular responses to environmental alterations, for determining the function of proteins encoded by genes of unknown function and for development and testing of new experimental technologies such as mass spectrometric methods for identifying and quantifying hydrophobic proteins. The availability of complete genomic information has led several groups to develop computer algorithms to predict the proteome of each part of the envelope by searching the genome for leader sequences, β-sheet motifs and stretches of α-helical hydrophobic amino acids. In addition, published experimental data have been mined directly and by machine learning approaches. In this review we examine the somewhat confusing available literature and relate published experimental data to the most recent gene annotation of E. coli to describe the predicted and experimental proteome of each compartment. The problem of characterizing integral versus membrane-associated proteins is discussed. The E. coli envelope proteome provides an excellent test bed for developing mass spectrometric techniques for identifying hydrophobic proteins that have generally been refractory to analysis. We describe the gel-based and solution-based proteome analysis approaches along with protein cleavage and proteolysis methods that investigators are taking to tackle this difficult problem.

© 2007 Elsevier B.V. All rights reserved.

Keywords: Cytoplasmic membrane; Periplasm; Outer membrane; Mass spectrometry; Proteomics; Polyacrylamide gel electrophoresis

Contents

1. Introduction
2. E. coli envelope composition
3. The outer membrane proteome
    3.1. Predicted outer membrane proteome
    3.2. Experimental outer membrane proteome
4. The periplasmic proteome
    4.1. Predicted periplasmic proteome
    4.2. Experimental periplasmic proteome

Abbreviations: 2MEGA, dimethylation after lysine guanidination; ABC, ATP binding cassette; CNBr, cyanogen bromide; ESI, electrospray ionization; ICAT, isotope-coded affinity tag; IEF, isoelectric focusing; IMP, inner membrane protein; LC, liquid chromatography; MAAH, microwave-assisted acid hydrolysis; MALDI-TOF, matrix assisted laser desorption/ionization-time of flight; MS, mass spectrometry; OMP, outer membrane protein; ORF, open reading frame; PAGE, polyacrylamide gel electrophoresis; PMF, peptide mass fingerprinting; SDS, sodium dodecyl sulfate; TFA, trifluoroacetic acid; TMM, transmembrane

⁎ Corresponding author. Department of Biochemistry and Institute for Biomolecular Design, University of Alberta, Edmonton, Alberta, Canada T6G 2H7. Tel.: +1 780 492 2761; fax: +1 780 492 0886.
E-mail addresses: [email protected] (J.H. Weiner), [email protected] (L. Li).

0005-2736/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.bbamem.2007.07.020

J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713

5. The cytoplasmic membrane proteome
    5.1. Predicted cytoplasmic membrane proteome
    5.2. Respiratory complexes
    5.3. Experimental cytoplasmic membrane proteome
6. Comparison with microarray
7. Methodology for membrane proteome analysis
    7.1. Gel-based proteome analysis platform
    7.2. Solution-based proteome analysis platform
8. Future perspectives
Acknowledgements
References

1. Introduction

The availability of complete genomic sequences for hundreds of microorganisms, as well as transcriptome measurements of mRNA levels under a variety of physiologic conditions, has spurred investigators to determine the complete protein composition, or proteome, of entire bacterial cells and how the proteome changes with physiologic shifts. Membrane proteins, which play essential roles in energetics, metabolism, signal transduction and transport, compose as much as 40% of the entire proteome of a bacterial cell. The hydrophobic nature of membrane proteins has hindered their investigation at the functional and structural level. Similarly, characterization of the membrane proteome has lagged behind studies of the soluble proteome. However, recent technical advances in separation by 2-dimensional electrophoretic techniques and 2-dimensional HPLC methodologies, coupled with improved proteolytic methodologies and advanced mass spectrometry instrumentation, are opening the membrane proteome to experimental evaluation. This review will summarize current data on the predicted and observed proteome of the Escherichia coli K-12 envelope along with advances in methodology that allow more detailed exploration of this proteome.

The complete proteome of E. coli is being examined for many reasons. The International E. coli Alliance [1] brings together biochemists, system biologists and computer scientists with the aim of developing a detailed understanding of the entire transcriptome, proteome, interactome, metabolome and physiome of E. coli in response to physiologic, genetic or environmental variation, and ultimately of creating an in silico E. coli in which changes can be predicted and experimentally verified. Whereas the transcriptome determined by microarray techniques gives a whole genome profile at the mRNA level, proteomics gives information on protein expression levels. Proteomics also provides information on post-translational modifications, which is not possible by microarray, and can be used to investigate subcellular localization, for example to the envelope.

The E. coli genome contains 4452 open reading frames (ORFs). Of these 2403 (54.1%) have an experimentally determined function and a further 1425 (32%) have a computationally determined function. The remaining 663 (14.9%) are of unknown function. Interestingly, 238 ORFs (5.3%) do not have homologues in other genomes [2]. About one third of the total


cellular proteome is associated with the envelope, and at least one half of these proteins are of unknown or putative function. It is quite possible that the functionally uncharacterized proteins catalyze unique metabolic transitions, and these may prove useful for the biotechnology industry. Thus, a full understanding of E. coli requires the development of novel technologies, including proteomic analysis, to determine the function of these proteins. Proteomic studies under varying growth, regulatory and stress conditions will help with the functional assignment of unknown proteins. In addition, because of the hydrophobicity of membrane proteins, there are significant technical difficulties in studying the membrane proteome, and E. coli provides a standard for evaluating and validating new technologies. E. coli offers an excellent model system to develop new extraction, solubilization, proteolytic, electrophoretic and mass spectrometry techniques, which can then be applied to the more complex proteomes of higher organisms.

2. E. coli envelope composition

The envelope of E. coli is a complex structure composed of the outer membrane, periplasm/peptidoglycan layer and the cytoplasmic membrane. The envelope is usually prepared by lysozyme-EDTA lysis [3], French press lysis [4] or Avestin EmulsiFlex [5] lysis of bacterial cultures. These processes form vesicles that can be isolated by differential centrifugation: an 8000×g spin sediments unbroken cells and debris, and a subsequent high-speed spin (150,000×g) sediments the inner and outer membrane vesicles [6]. Inner and outer membrane vesicles can then be separated from each other by density centrifugation on a sucrose gradient, with the outer membranes having the higher density. In most cases vesicle integrity is such that cytosolic proteins trapped in the vesicles can be removed by buffer washing. Loosely associated extrinsic proteins are often removed by washing with chaotropic agents such as sodium carbonate [7]. Fractions containing periplasmic proteins can be prepared by the cold osmotic shock method [8].

One of the first attempts to characterize the envelope proteome was carried out by Ames and Nikaido in the mid-1970s [9] using sodium dodecyl sulfate (SDS) solubilization coupled with O'Farrell two-dimensional polyacrylamide gels (isoelectric focusing (IEF) in the first dimension and SDS polyacrylamide gel electrophoresis (PAGE) in the second dimension). They


Table 1
Predicted outer membrane proteome

Prediction            β-barrel   Lipoprotein     Reference
Molloy-SwissProt38    39         23              [21]
BOMP                  103        N/A             [17]
TMB-HUNT              69         N/A             [22]
PSORTdb               91         Does not list   [23]
Riley et al.          23         101             [2]

noted the complexity of the sample and observed over 150 spots on the gel. It is not clear that this represented 150 different proteins, as proteolysis, post-translational modification and isoelectric changes (e.g. deamidation) could have resulted in multiple spots for the same polypeptide. Technology at the time did not allow them to identify the proteins. As will be described below, recent advances in analytical techniques such as the melding of two-dimensional liquid chromatography, improved two-dimensional gel methods, mass spectrometry, selective proteolysis and bioinformatics allow relatively facile identification of the individual proteins.

The entire genomic sequence of E. coli MG1655 has been available for many years [10], and multiple attempts have been made to assign a function and location to each open reading frame. Bioinformatic attempts to assign a function are often carried out in parallel with experimental approaches to define function and location and, by inference, those proteins associated with the envelope. The challenge of such efforts is to provide insightful information that is neither contradictory nor confusing. This is not a simple exercise.

3. The outer membrane proteome

3.1. Predicted outer membrane proteome

The outer membrane consists of phospholipids, membrane proteins, lipoproteins and lipopolysaccharide. It forms the outermost layer of the cell envelope and is covalently attached to the peptidoglycan by an almost continuous layer of small lipoprotein molecules (Lpp or Braun's lipoprotein) [11]. The peptidoglycan (murein) layer is a lattice formed by repeating disaccharides interconnected by peptides that make up the bacterial cell wall. It is a polymer of N-acetylglucosamine (gluNAc) alternating with molecules of N-acetylmuramic acid (murNAc). In the eubacteria these molecules are cross-linked by pentapeptides (L-alanine-D-glutamate-meso-diaminopimelic acid-D-alanine-D-alanine). These peptides are unusual in that they contain the rare D-enantiomers of the Ala and Glu residues [12].

The outer membrane of Gram-negative bacteria is the interface between the cell and the environment, and this distinguishes Gram-negative from Gram-positive organisms, which lack this layer. The double leaflet membrane is unlike a typical phospholipid bilayer as it has phospholipids on the internal leaflet and lipopolysaccharide on the outer leaflet. Integral membrane proteins of the inner and outer membrane differ in structure. Cytoplasmic membrane proteins are hydrophobic and are composed of α-helices 15–25 amino acids in

length that cross the membrane. Many algorithms are available to identify these, ranging from the original Kyte-Doolittle algorithm [13] to the TMHMM algorithm [14–17], which gives the most reliable prediction of the presence of transmembrane α-helices.

The outer membrane encompasses a limited number of proteins, and structural studies of several of them [18,19] have shown that they are composed of β-barrel motifs and lack the α-helices found in cytoplasmic membrane proteins. The β-barrel structures form mono-, di- and trimeric structures with 8–22 β-barrels. The β-barrel is composed of alternating polar and non-polar amino acids. The non-polar amino acids point into the lipid and protein interface, while the polar amino acids point into the interior of the barrel. Because of this complexity, facile prediction of the location of the transmembrane domains of outer membrane proteins remains elusive.

Table 2
Outer membrane proteome comparison

Riley:
acrE bcsZ blc borD btuB cirA csgG cusC ecnA ecnB emtA fadL fecA fepA fhuA fhuE fimD fiu flgH fliL hslJ kefA lamB lepB lolB mipA mltA mltB mltC mltD mulI nfrA nlpB nlpC nlpD nlpE nlpI nmpC nrfG ompA ompC ompF
ompG ompL ompN ompW ompX osmB osmE pal panE phoE pldA rcsF rlpA rzoD rzoR sfmD slp slyB smpA spr srlD tolC tsx uidC vacJ wza yaeF yaeT yafT yaiW yajG yajI ybaY ybbC ybdA ybeT ybfM ybfN ybfP ybgQ ybhC ybjP
ybjR ycaL ycbS yccZ ycdR ycdS yceB ycfL ycfM ycjN ydbA ydbJ ydcL yddW ydeK yeaY yecR yedD yegR yehB yfaZ yfcT yfeN yfeY yfgH yfgL yfhG yfiB yfiL yfiM yfiO ygdI ygdR ygeR yggG yghG ygiB yhcD yhdV yhfL yiaD yidQ
yidX yjaH yjbF yjcP yjeI yjfO ymcC ynbE yncD yoaF yohG ypdI yqhH yraJ yraP ysaB

Molloy:
btuB cirA fadL fepA fhuA fhuE flu lamB nlpB ompA ompC ompF ompP ompT ompW ompX pal slp slyB tolC tsx ybhC ybiL yaeT yeaF

Fountoulakis:
btuB cirA fecA fepA fhuA fimD flu lolB lamB mltA nfrA nlpB ompA ompC ompF ompT pal pldA tolC tsx yaeT yciD yeaF yncD

Lopez-Campistrous:
btuB cirA fadL fecA fhi fhuA hemM imp lamB mulI nlpB ompC ompF ompT ompX osmE pal rlpA slp slyB tolC tsx yajG ybhC ybjP yciD yeaF yedD yfgL yfiO yifL ynfB yraM yraP

Gevaert:
aceE fimD lolB nlpB ompA ompC ompF ompP ompX rlpA ycjI yedD yfeY yfiO ygdI yjeI yraP

The predicted outer membrane proteome from the Riley consortium annotation [2]. This list includes those proteins listed as outer membrane by “cell localization” as well as those proteins listed as outer membrane by “gene product description” that are underlined. A comparison of the outer membrane proteins determined by Molloy [21], Fountoulakis [25] and Lopez-Campistrous [28] using 2D gel electrophoresis and Gevaert [26] using LC-MS was undertaken by combining the Supplementary Data in their publications with the Riley et al. annotation. Proteins in underline italics are common to all three reports. The unique proteins identified by Gevaert [26] by LC-MS are in bold.

In addition to the outer membrane proteins with the β-barrel architecture, many lipoproteins [20] are found on the outer membrane, including the major outer membrane protein MulI (Lpp, Braun's lipoprotein) [11] that links the outer membrane to the peptidoglycan layer. It is estimated that each log phase E. coli cell contains 1.6 × 10⁵ copies of this protein, making it one of the most abundant in the cell. Other major outer membrane proteins include OmpA (∼10⁵ copies per cell), OmpC (∼2 × 10⁴ copies per cell) and OmpF (∼10⁴ copies per cell). These proteins form hydrophilic channels allowing free diffusion of small molecules across the outer membrane. The outer membrane also contains specific channels such as PhoE for phosphate, LamB for maltose and maltodextrins, and Tsx for nucleosides. These proteins also serve as attachment sites for colicins and bacteriophages (e.g. Tsx for phage T6, LamB for phage λ), from which their names derive. There are also high affinity receptors for ferric iron (FepA, FhuA), vitamin B12 (BtuB) and fatty acids (FadL).

β-barrel structures are difficult to predict due to their variable properties and the short stretches of hydrophobic amino acids involved. Molloy [21] carried out one of the earlier analyses of the outer membrane proteome using the SwissProt Release 37 database, which had 58 potential outer membrane proteins based on a combination of experiment, computer driven prediction and similarity to OMPs from other organisms. Thirty-seven β-barrel integral proteins were predicted to have a pI of 4–7, and two integral proteins had a pI > 7.
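The hydropathy-based detection of transmembrane α-helices mentioned earlier in this section (the Kyte-Doolittle approach [13]) can be illustrated with a minimal sketch. The hydropathy scale below is the published Kyte-Doolittle index; the 19-residue window and the 1.6 cut-off are conventional choices that are assumed here rather than taken from this review, and the test sequence is a hypothetical toy peptide, not an E. coli protein. This is not the TMHMM method, which uses a hidden Markov model.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Return the mean hydropathy of every full window along seq."""
    scores = [KD[aa] for aa in seq.upper()]
    if len(scores) < window:
        return []
    total = sum(scores[:window])
    profile = [total / window]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile


def candidate_helices(seq, window=19, threshold=1.6):
    """Return 0-based start positions of windows whose mean exceeds threshold.

    window=19 and threshold=1.6 are assumed, conventional values.
    """
    profile = hydropathy_profile(seq, window)
    return [i for i, mean in enumerate(profile) if mean > threshold]


# Hypothetical toy sequence: polar flank, 19 apolar residues, polar flank.
toy = "MKDSEQRNDK" + "LLAIVLLAIVLLAIVLLAI" + "KDESRNQDEK"
starts = candidate_helices(toy)
print("windows above threshold start at:", starts)
```

A sliding-window average of this kind flags long apolar stretches but, as the surrounding text notes, it is poorly suited to β-barrel proteins, whose membrane-spanning strands are short and alternate polar and non-polar residues.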
There were 10 lipoprotein OMPs with a pI of 4–7 and 9 proteins with a pI > 7 (Tables 1 and 2). No OMPs had a predicted pI < 4. The more recent release of the SwissProt database, Release 42, reveals 71 OMPs, but this list contains 12 proteins (FhiA, GspD, HlpA, HofQ, Imp, PgpB, YiaT, YifL, YnfB, YpjA, YqiG, YraM) that are probably not OMPs based on the Riley annotation [2].

BOMP [17] (Tables 1 and 2) was developed to predict integral β-barrel OMPs. BOMP uses a C-terminal pattern typical of integral β-barrel proteins and comparison to stretches of amino acids typically found in transmembranal β-strands of proteins with resolved crystal structures. These workers found ninety-one β-barrel proteins using their algorithm and an additional 12 proteins based on BLAST searches, for a total of 103 of the 4346 polypeptides in E. coli as possible integral OMPs. Sixty-seven of the proteins were previously annotated as OMPs within SwissProt, leaving thirty-six possible additional OMPs. Using this algorithm on various prokaryotic genomes gave a range of 1.8% to 3% OMPs.

The TMB-Hunt algorithm [22] (Tables 1 and 2) is designed to find β-barrel proteins based on the presence of a signal sequence and amino acid composition. Using the NCBI FTP site, various strains of E. coli (including pathogens) have a total of 5341 different proteins and, of these, 1032 polypeptides have a signal sequence to direct them to the periplasm or outer membrane. Eighty-seven proteins were identified as transmembranal β-barrel. In E. coli K12, as many as 782 proteins have a signal sequence and they predict that 69 are transmembranal β-barrel proteins.

Unlike transmembranal β-barrel proteins, outer membrane lipoproteins are easier to identify. These proteins can be identified by the presence of a leader with a common consensus sequence [20]. The leader is typically between 15 and 40 amino acid residues in length, and has at least one arginine or lysine in the first seven residues. The leader is cleaved by signal peptidase II on the amino terminal side of the cysteine residue, which is then enzymatically modified by the addition of N-acyl and S-diacyl glyceryl groups [20]. This lipid serves to anchor these proteins to the outer leaflet of the inner membrane or the inner leaflet of the outer membrane so that they can function in the aqueous periplasmic interface.

PSORTdb is a database that combines experimentation and computational prediction to determine the subcellular location of a protein [23] (Tables 1 and 2). Their analysis of the E. coli genome lists 91 outer membrane proteins, but does not distinguish between β-barrel proteins and lipoproteins.

Recently, a consortium of investigators has produced a new annotation of the ORFs of E. coli [2]. The E. coli consortium database relied more on experimental data for annotation and found that only 23 β-barrel proteins are identified along with 142 potential outer membrane lipoproteins. This listing comes from a combination of proteins annotated as “outer membrane” by “cell location” (125 proteins) as well as 17 proteins annotated with an outer membrane location by “gene product description” (underlined in Table 2).

3.2. Experimental outer membrane proteome

Analysis of the outer membrane is complicated by a series of technical problems, including the extraction and solubilization of outer membrane proteins prior to 2D PAGE, solubilization of intractable proteins, trypsin digestion, artifacts due to loading large amounts of protein onto 2D gels, protein microanalysis by mass spectrometry and the databases used for identification of peptides. These problems also hold for the inner membrane proteome.

Molloy [21] used a combination of 2D PAGE and matrix assisted laser desorption/ionization-time of flight (MALDI-TOF) MS to characterize the outer membrane proteome. E. coli cells were broken by French press lysis to prepare a total envelope fraction that was washed with 0.1 M sodium carbonate to remove peripheral proteins. The whole envelope was solubilized in 7 M urea, 2 M thiourea and 1% ASB14 surfactant (amidosulfobetaine with an alkyl tail containing 14–16 carbons) [24] (Table 2). Apparently, either this surfactant mixture did not solubilize proteins from the inner membrane or proteolytic digestion of integral peptides taken from gel plugs was not efficient. Molloy identified 40 unique proteins, 75% of which were outer membrane or membrane-associated. They identified 21 of the 37 predicted integral OMPs (78% of OMP not including hypothetical) plus one probable OMP, 5 lipoprotein OMPs, 2 cytoplasmic membrane associated polypeptides (AtpB, NuoC), 1 cytoplasmic membrane lipoprotein (AcrA), 4 periplasmic proteins,


Fig. 1. Venn diagram noting the overlapping and uniquely identified proteins in the experimentally identified outer membrane proteins by Molloy (M), Fountoulakis (F) and Lopez-Campistrous (L) data as taken from Table 2 and [21,25,28].
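The overlap summarized in Fig. 1 can be recomputed directly from the gene lists in Table 2. The following is a minimal sketch using Python sets; the gene names are copied from the Molloy, Fountoulakis and Lopez-Campistrous columns of Table 2, while the function name `venn3` and the region labels are illustrative choices rather than anything defined in the original study.

```python
# Gene lists copied from Table 2 (experimentally observed outer membrane proteins).
molloy = set("""
btuB cirA fadL fepA fhuA fhuE flu lamB nlpB ompA ompC ompF ompP ompT ompW
ompX pal slp slyB tolC tsx ybhC ybiL yaeT yeaF
""".split())

fountoulakis = set("""
btuB cirA fecA fepA fhuA fimD flu lolB lamB mltA nfrA nlpB ompA ompC ompF
ompT pal pldA tolC tsx yaeT yciD yeaF yncD
""".split())

lopez = set("""
btuB cirA fadL fecA fhi fhuA hemM imp lamB mulI nlpB ompC ompF ompT ompX
osmE pal rlpA slp slyB tolC tsx yajG ybhC ybjP yciD yeaF yedD yfgL yfiO
yifL ynfB yraM yraP
""".split())


def venn3(m, f, l):
    """Return the seven disjoint regions of a three-set Venn diagram."""
    return {
        "M only": m - f - l,
        "F only": f - m - l,
        "L only": l - m - f,
        "M+F only": (m & f) - l,
        "M+L only": (m & l) - f,
        "F+L only": (f & l) - m,
        "M+F+L": m & f & l,
    }


regions = venn3(molloy, fountoulakis, lopez)
for name, genes in regions.items():
    print(f"{name:9s} {len(genes):3d}  {' '.join(sorted(genes))}")
```

Because the seven regions are disjoint, their sizes sum to the number of distinct proteins seen across the three studies, which gives a quick consistency check on any hand-drawn diagram.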

flagellin, 2 cytoplasmic, 2 unknown, and 1 unknown lipoprotein. Re-analysis of their list with the Riley listing indicates 25 OMPs (Table 2).

Fountoulakis and Glasser [25] carried out a proteomic analysis of E. coli envelopes composed of both inner and outer membranes. They solubilized the membranes with mild (CHAPS) or strong detergents (SDS, sodium cholate or sodium deoxycholate), then carried out 2D PAGE, in-gel proteolysis and MALDI-TOF MS. They identified a total of 394 different gene products, of which 25 were annotated to the outer membrane (24 using the Riley annotation list in Table 2).

In one of the first gel-free proteomic analyses of the E. coli proteome, Gevaert et al. [26] identified more than 800 proteins. They analyzed only methionine-containing peptides in a total peptide mixture of an unfractionated lysate. They identified 11 outer membrane proteins using the MASCOT search engine for mass spectrometry [27] and a database of E. coli methionine-containing peptides, and 14 proteins using the PROCORR database at the 95% confidence limit (12 proteins at 98% confidence). Re-analysis of their data using the Riley et al. [2] database indicates that 17 OMPs were identified (Table 2).

Lopez-Campistrous et al. [28] carried out a large scale 2D gel analysis of the E. coli proteome in cells fractionated into cytoplasm, inner membrane, periplasm and outer membrane by the Yamato procedure [6]. They identified 60 proteins in the isolated outer membrane fraction. Thirty-one proteins are characterized in Swiss-Prot as outer membrane proteins (34 proteins using the Riley [2] annotation, Table 2). Fig. 1 is a Venn diagram depicting the degree of overlap and the uniquely identified proteins among the Molloy, Fountoulakis and Lopez-Campistrous databases. A large number of soluble

cytoplasmic proteins were found in the outer membrane fraction. Fountoulakis [25] also found a large number of cytoplasmic proteins associated with the membrane envelope. This highlights the difficulty of using fractionation methodologies, as proteins from the cytoplasm can adventitiously stick to other fractions, giving erroneous localization. Inner membrane proteins can be found in the outer membrane fraction if there are zones of adhesion between the membranes [29].

Lai et al. [30] extended the proteomic characterization of the E. coli membrane to one of the first functional analyses. Using 2D PAGE, image analysis techniques and in-gel trypsin digestion with MALDI mass spectrometry, they compared the membrane proteome of minicells, which are enriched in polar proteins, to rod cell membranes. One hundred seventy-three spots were analyzed and 54 proteins identified. Thirty-six spots were enriched in minicells and 15 spots in rod cells, showing that there is polar distribution of envelope proteins. Examination of their results indicated that they identified 14 integral outer membrane proteins, 6 lipoproteins and 4 associated cytoplasmic membrane proteins. Only 1 integral membrane protein, YiaF, was identified.

In a recent study Marini et al. [31] examined new candidate outer membrane proteins that had been identified by bioinformatics prediction. The proteins were cloned and overexpressed, and finally localized by cell fractionation to the outer membrane. They confirmed the outer membrane localization for five proteins (YftM, YaiO, YfaZ, CsgF and YliI) and also provided preliminary data supporting an outer membrane location for a sixth, YfaL. Thus, unlike the predicted outer membrane proteome, experiments to date have only identified a limited number of proteins in the outer membrane fraction.

4. The periplasmic proteome

4.1. Predicted periplasmic proteome

The periplasmic space is a gel-like layer composed of soluble proteins located between the outer and inner membranes. It is highly enriched in proteases and nucleases that would be detrimental if they were located in the cytoplasm. The periplasm contains many substrate binding proteins to capture nutrients, which are often present at very low concentrations. Proteins localized in the periplasmic space can be identified by searching the genome for leader sequences associated with Sec [32] or Tat motifs [33], which address these proteins to their respective translocons. Several algorithms and databases are available for this analysis. PSORTdb predicts 141 periplasmic proteins [23]. Nielsen et al. [34] used an approach based on neural networks trained on separate sets of prokaryotic sequences and identified 224 proteins. The Project Cybercell (www.projectcybercell.ca) database lists 200 periplasmic proteins. The recent consortium database [2] lists 367 periplasmic proteins (Table 3) using a combination of “cell localization” and “gene product description” and is the most complete annotation of E. coli. This listing may be an over-estimate as it includes some proteins that are associated

Table 3
Predicted and experimental periplasmic proteome

Riley:
ackA agp alsB amiA amiB amiC ampC ampH ansB aphA appA araF argT artI artJ aslA asr bax bglH bglJ bglX btuB btuF carB ccmB ccmE ccmH chbB chiA citE coaE cpdB cpxP creA csgA csgB csgC csgE cueO cusF cybC cysP dacA dacB dacC dacD dcrB ddpA degP degQ degS dmsA dppA dsbA dsbC dsbG eco ecpD endA erfK eutP fabF fdoG fecA
fecB fepB fhuD fhuE fimC fimF fimI fkpA flgA flgD flgI flhE fliY frdA frwB frwD galF gatB ggt glcG glnH glpQ gltF gltI gntX gpsA gspD hcaD
hdeB hisJ hlpA hofQ htrE hybA iap imp ivy kdsC ligB livJ livK lolA lsrB malE malM malS marB mdh mdoD mdoG mepA metQ metR mglB miaA modA mpl mppA mqo mviM napA napB napC napD napF napH nfo nfrA nikA nrfA nrfB nrfC ompA ompG ompT oppA osmY paaH pbl pbpG phnD phnP phoA potD potF ppiA prc prkB proX pspE pstS pta
pth ptr puuE rbsB recG rffD rna rseB rsxG sapA sbp sfmA sfmC sfmF sfmH slt sodC spy ssuA sufI surA tauA tbpA tolB tolC torA torT torZ
treA trxB tynA ugpB uidC ushA wcaM xylF yaaI yacC yacH yadM yadN yaeT yagW yagX yagY yagZ yahB yahJ yahO yaiT ybaE ybaV ybcL ybcS ybfC ybgD ybgF ybgO ybgP ybgQ ybgS ycbB ycbF ycbQ ycbR ycbV yccT ycdB ycdK ycdO ycdS yceI ycfR ycgH ycgJ ycgK ychP yciM ycjN ydaS ydbD ydcA ydcS yddB yddL ydeI ydeN ydeR ydfD ydgD ydgH ydhO
ydhS ydiY ydjG yebF yecG yecT yedS yedX yedY yeeJ yeeZ yegJ yehA yehC yehE yehZ yejA yejO yfaL yfaP yfaQ yfaS yfaT yfcO yfcQ yfcR yfcS
yfcU yfdX yfeK yfeW yffQ yfgI yfhD yfiR yfjT ygcG ygcI ygeQ yggE yggM yggN yghX ygiL ygiS ygiW ygjG ygjJ ygjK yhbN yhcA yhcD yhcE yhcF yhcN yhdW yheM yhhA yhjJ yiaO yiaT yibG yieL yifB yigE yiiQ yiiX yijF yjbG yjcO yjcS yjfJ yjfN yjfY yjhA yjhT yjjA ykfB ykfF ykgI ykgJ yliB yliI ymcA ymcB ymgD yncD yncE yncI yncJ ynfB
ynfD ynfE ynfF ynhG ynjB ynjE ynjH yobA yodA yoeA ypdH ypeC yphF yqhG yqiH yqiI yqjC yraH yraI yraJ yraK yrbC yrfA ytfJ ytfM ytfQ znuA zraP

Lopez-Campistrous:
agp ansB araF artI bglX cpdB dppA dsbA dsbC eco fkpA fliY glnH glpQ hisJ livJ livK lolA malE mdh mdoG mglB modA mppA nikA oppA osmY potD potF proX ptr pstS rbsB slt sodC sufI surA thrC tolB treA ugpB ushA ybgF ycdO ydcS ydeN yehZ yggN ygiW yhbN yncE ynjE yrbC ytfQ

Gevaert:
ackA alsB ansB aphA araF artI bglX btuB cpxP dacA dacC dcrB degP dppA dsbA dsbG erfK fabF fdoG fhuE fimC fliY galF glpQ gltI gltL gltX glyA glyQ glyS gnd gnsB gntR gor gph gpmB greA greB groL grpE grxB gshA gshB gsk gst guaA guaB guaD gudD gudX gyrA gyrB hdeB hisJ ivy lolA malE malS mdh mdoG mglB miaA mqo napA
oppA osmY potD prc pspE pta pth ptrA rbsB rseB slt sufI surA tolB tolC torT trxB ugpB ushA yaeT yahO yciN ydcS ydeN ydgH yeeZ yhhA yliB ynfF ynjE ytfQ znuA

The theoretical periplasmic proteome is taken from the Riley annotation [2]. The list includes those proteins listed as periplasmic by “cell localization” as well as those proteins listed as periplasmic by “gene product description” shown in bold. The underlined proteins are annotated as periplasmic, but this is not the case for DmsA, FrdA, OmpA, OmpG, OmpT, YnfE and YnfF. A listing of the periplasmic proteins determined by Lopez-Campistrous [28] using 2D PAGE methodology and Gevaert [26] using LC-MS was undertaken using the Supplementary Data in their publications with the Riley et al. annotation [2].
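As a rough illustration of the leader-sequence searches described in Section 4.1, the sketch below scans the N-terminal region of a protein for a twin-arginine motif of the kind found in Tat signal peptides, whose consensus is usually written S/T-R-R-x-F-L-K. The relaxed regular expression, the 35-residue search region, the function name and both test sequences are assumptions made for illustration only; this is not the algorithm used by PSORTdb, by Nielsen et al. or by the consortium annotation, and real predictors also weigh the hydrophobic core and the cleavage site.

```python
import re

# Illustrative relaxed twin-arginine pattern (an assumption, not a published tool):
# two consecutive arginines, any residue, then two hydrophobic residues.
TWIN_ARG = re.compile(r"RR.[FLIVMA][LIVMF]")


def has_twin_arginine_leader(seq, n_region=35):
    """Return True if the pattern occurs within the first n_region residues."""
    return TWIN_ARG.search(seq.upper()[:n_region]) is not None


# Hypothetical toy sequences, not real E. coli proteins.
tat_like = "MNKSRRDFLKAAGVAGLAATASSALA" + "DEKTTPSGE"
sec_like = "MKKTAIAIAVALAGFATVAQA" + "DEKTTPSGE"

for name, seq in [("tat_like", tat_like), ("sec_like", sec_like)]:
    print(name, has_twin_arginine_leader(seq))
```

A match only says that the leader resembles a Tat signal; as the examples in the text show (DmsA, YnfE, YnfF), proteins with Tat leaders can end up associated with the cytoplasmic membrane rather than free in the periplasm.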

with the cytoplasmic membrane, but have Tat leaders (e.g. DmsA, YnfE, YnfF), proteins that are associated with the outer membrane (e.g. OmpA, OmpG, OmpT) and proteins that are clearly on the cytoplasmic side of the membrane (e.g. FrdA). Nonetheless, the consortium database provides the best estimate of the number of periplasmic proteins.

4.2. Experimental periplasmic proteome

E. coli is a facultative anaerobe that can grow in diverse environments by inducing the synthesis of appropriate transporters and enzymes. The periplasmic protein composition can respond to changes in the physiologic or environmental state and we can expect that only a subset of the proteins in the predicted periplasmic proteome will be seen in any one condition. Most of the studies reported are carried out with cells grown in Neidhardt's glucose minimal medium. Several studies of the soluble protein proteome of E. coli have been carried out [35–37]. These studies did not distinguish


Fig. 2. Venn diagram noting the overlapping and uniquely identified proteins in the experimentally identified periplasmic proteins by Lopez-Campistrous (L) and Gevaert (G) data as taken from Table 3 and [26,28].

between cytoplasmic and periplasmic proteins or their localization. In the Gevaert analysis (see above) [26] of 800 E. coli proteins, 39 periplasmic proteins were found using MASCOT and 42 using PROCORR. We have re-analyzed their supplementary data and find 96 periplasmic proteins in their total database (Table 3). This assumes that their peptide assignments were correct. In the large-scale 2D PAGE study of Lopez-Campistrous et al. [28], 107 different proteins were identified on the gel of the periplasmic fraction isolated by cold osmotic shock, of which at least 54 were periplasmic proteins based on the Riley et al. database [2]. Fig. 2 is a Venn diagram depicting the overlapping and uniquely identified proteins in the Gevaert and Lopez-Campistrous datasets. Hervey et al. (ASM Meeting 2006 Abstract K-125) used differential isotope labeling with 14N and 15N minimal medium coupled with LC-MS-MS to identify periplasmic proteins released by cold osmotic shock. They analyzed 667 proteins and found 103 with a high periplasmic:whole cell mass ratio. Of these, 39 were annotated as periplasmic in Swiss-Prot, an additional 43 contained a signal sequence based on various prediction algorithms and 21 of the 103 were membrane proteins. Twenty-one proteins were cytoplasmic and presumably arose from cell lysis. Twelve known periplasmic proteins had low periplasmic:whole cell mass ratios, either because they were associated with the membrane or because they had unusual turnover properties.

5. The cytoplasmic membrane proteome

5.1. Predicted cytoplasmic membrane proteome

The cytoplasmic membrane (inner membrane or plasma membrane) retains the cytoplasm and separates it from the surrounding environment. This membrane also serves as a selectively permeable barrier: it allows particular ions and molecules to pass, either into or out of the cell, while preventing the movement of others. The bacterial cytoplasmic

membrane is the location of a variety of crucial metabolic processes, such as respiration and the synthesis of lipids and cell wall constituents. Finally, the membrane contains special receptor molecules that help bacteria detect and respond to chemicals in their surroundings. Based on protein diversity, the cytoplasmic membrane is by far the largest and most complex compartment of the envelope. As for the other compartments, the protein composition of the membrane will change with physiologic and environmental conditions. It is useful to define all the open reading frames on the genome that can be membrane-bound, and several groups have developed algorithms to search the genomic sequence for potential integral membrane proteins. Defining what is a membrane-associated protein has been the subject of much debate [38]. The simple situation involves those proteins that traverse the membrane as an α-helical bundle or contain a lipid anchor. Far more difficult is defining peripheral or extrinsic proteins that are associated with the membrane. There will be proteins that migrate from cytoplasm to membrane due to posttranslational modifications or physiologic changes. An example is proline dehydrogenase, PutA [39,40], which migrates to the membrane upon reduction. An early study by Corbin et al. [36] attempted to define a database of known and predicted membrane proteins by combining data from the E. coli entry point (http://coli.berkeley.edu/cgi-bin/ecoli/coli_entry.pl) (156 proteins), proteins listed as membrane proteins in the GenProtEC database (http://genprotec.mbl.edu) (634 proteins), and proteins that contained at least two transmembrane (TM) helices by the PHD algorithm [41] (821 proteins). This created a database of 1017 proteins that occurred in at least one of the three sources. The GenProtEC database includes integral membrane proteins as well as some extrinsic proteins that are part of multi-subunit respiratory complexes, but it is not comprehensive. 
For example, FrdA and FrdB, extrinsic subunits of the fumarate reductase complex [42], are not included, but DmsA and DmsB, extrinsic subunits of the DMSO reductase complex [43], are included. The extrinsic components of the ATP Binding Cassette (ABC) transporter class of membrane proteins [44] are generally not included. Furthermore, some proteins known to be membrane-associated through hydrophobic interactions, such as Dld, D-lactate dehydrogenase [45], or GlpD, glycerol-3-phosphate dehydrogenase [46], are defined as cytoplasmic in all of the databases. Further complicating the creation of a database, many of the proteins identified in the Corbin et al. database [36] may not be associated with the membrane. This is because single transmembrane-spanning proteins can often be confused with proteins that will be exported through the Sec [47] or Tat [48] translocons to the periplasm. These proteins have an amino terminal leader that can mimic a transmembrane helix [49]. Proteins with two or more helices are more readily identified. Wallin and von Heijne [50] carried out a statistical analysis of helix bundle proteins in many organisms including E. coli. They found that 20–30% of all the open reading frames in an organism code for α-helical membrane proteins. In E. coli the value ranges from 24% (∼1028 proteins) using two “certain” transmembrane helices for the analysis to 40% (∼1714


proteins) for proteins with two “putative” or “certain” transmembrane helices. The Riley et al. [2] database identified 757 integral membrane proteins, 144 membrane anchored proteins, 12 inner membrane lipoproteins and 11 membrane associated lipoproteins that could be on the inner membrane.

5.2. Respiratory complexes

While multi-spanning integral membrane proteins and lipoproteins can be identified by machine searching techniques, the proteome also contains many extrinsic polypeptides that are hydrophilic subunits of respiratory chain complexes. This problem was noted above in consideration of the GenProtEC database. The subunits of respiratory complexes should be considered part of the membrane proteome, as they will be found in experimental proteomic analysis. The structures of the succinate dehydrogenase (SdhCDAB) complex [51], the fumarate reductase (FrdABCD) complex [42,52], the formate dehydrogenase complex [53], and the nitrate reductase complex [5] have all been determined by X-ray crystallographic techniques and each is composed of hydrophilic subunits bound to hydrophobic, membrane-integral, anchoring subunit(s). These hydrophilic subunits are exposed to the cytoplasmic or periplasmic side of the membrane and are clearly part of the membrane proteome. Unfortunately, they are often annotated as cytoplasmic or periplasmic, depending on the side of the membrane to which they are bound, as in the Riley et al. database [2]. This analysis can still be open to error. For example, FrdA, which has been shown in many publications to be on the cytoplasmic side of the membrane [42,52], is listed as periplasmic [2]. Similar problems arise with subunits of ATP Binding Cassette transporters. A further complication of determining the membrane proteome relates to soluble cytoplasmic proteins that will interact with membrane proteins through various types of functional association or metabolons. 
These peripheral membrane proteins will only be identified by experimental analysis of the proteome or by tagging and immunoprecipitation studies [54]. These functional associations can be very important. For example, in the red blood cell the soluble enzyme catalase is associated with the Band III chloride/bicarbonate exchanger [55] to form a metabolon. Current efforts often utilize a sodium carbonate wash to remove peripheral proteins [7]. Our view is that these associated proteins must be considered part of the proteome and, as methodologies improve, it will be possible to analyze these peripheral proteins in detail.

5.3. Experimental cytoplasmic membrane proteome

Until recently, analysis of the membrane proteome relied on 2D PAGE and this method identified primarily hydrophilic polypeptides associated with the membrane. It failed to identify many integral proteins. This is the result of difficulty in solubilizing these proteins in a detergent suitable for isoelectric focusing in the first dimension and problems with in situ proteolysis of the proteins to obtain peptides for MS analysis.


In the Gevaert analysis [26] of 800 E. coli proteins, 56 IMPs were found using MASCOT and 68 IMPs using PROCORR (13% of the proteins in their database; 69/525). In the large-scale analysis of the E. coli proteome carried out by Lopez-Campistrous et al. [28], 479 spots on the cytoplasmic membrane gel could be identified by MALDI-TOF MS. This corresponded to 164 unique proteins. When these proteins were characterized as to subcellular location using the Riley annotation, we find 94 cytoplasmic, 16 periplasmic, 12 outer membrane, 2 outer membrane lipoproteins, 14 inner membrane, 16 membrane associated, 6 membrane lipoprotein and 3 unknown. Although only 30 proteins (18%) were identified as inner membrane or membrane-associated, twenty-five of the cytoplasmic proteins are extrinsic components of respiratory complexes, F0F1 ATPase or ABC transporters and six are membrane lipoproteins (together 19%). Thus, 37% of the proteins identified were clearly associated with the cytoplasmic membrane. Nonetheless, this study showed that relatively few of the α-helical membrane proteins were identified.

Corbin et al. [36] used SDS solubilization and an LC MS/MS approach to analyze the membrane protein profile as part of a global analysis of protein expression in E. coli. They reported a “long list” of 1147 proteins with at least one peptide used for identification. Of these, 287 were classified as membrane proteins using their database of 1017 membrane proteins. Yan et al. [56] used fluorescence difference 2D gel electrophoresis (2D-DIGE) with proteins differentially labeled with Cy3 (control) and Cy5 (benzoic acid treatment of cells to reduce the proton motive force) dyes. They analyzed the gels with DeCyder software and studied 197 spots by tryptic digestion and MALDI-TOF and QTOF mass spectrometry to identify the proteins. Although they indicate that a number of membrane proteins were identified, examination of their data indicates that these are outer membrane or peripheral proteins. 
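The percentages quoted above for the Lopez-Campistrous et al. cytoplasmic membrane gel follow directly from the category counts. The short sketch below only re-derives them; the variable names are illustrative and the counts are the ones given in the text.

```python
# Counts quoted in the text for the Lopez-Campistrous et al. cytoplasmic
# membrane gel, classified using the Riley annotation.
TOTAL_UNIQUE = 164           # unique proteins identified from 479 spots
INNER_MEMBRANE = 14
MEMBRANE_ASSOCIATED = 16
EXTRINSIC_SUBUNITS = 25      # annotated as cytoplasmic, but part of complexes
MEMBRANE_LIPOPROTEINS = 6


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Proteins annotated as inner membrane or membrane-associated.
annotated = INNER_MEMBRANE + MEMBRANE_ASSOCIATED
# Proteins that belong to the membrane although annotated otherwise.
reassigned = EXTRINSIC_SUBUNITS + MEMBRANE_LIPOPROTEINS

print(percent(annotated, TOTAL_UNIQUE))               # annotated share
print(percent(reassigned, TOTAL_UNIQUE))              # reassigned share
print(percent(annotated + reassigned, TOTAL_UNIQUE))  # combined share
```

The combined figure counts 61 of the 164 unique proteins as membrane-associated.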
In a recent study Ji et al. [57] used N-terminal dimethylation after lysine guanidination (2MEGA labeling) in concert with 2-dimensional LC MS/MS to analyze the membrane proteome and reduce the number of false positive identifications. Six hundred forty proteins were identified in the membrane fraction, which included 258 membrane and membrane-associated proteins. The labeling method resulted in 153 integral proteins being identified compared to only 77 proteins in the unlabeled sample. Molloy et al. [58] used organic solvent extraction with chloroform:methanol to try to enrich for integral membrane proteins prior to 2D gel electrophoresis. This did not result in the identification of any additional integral proteins. Zhang et al. [59] used a combination of SDS and methanol assisted protein solubilization with LC MS/MS to provide a further characterization of the membrane proteome. They identified 431 different proteins of which 217 (50%) were membrane intrinsic (168 with two or more transmembrane α-helices) and an additional 55 were lipoproteins or components of membrane-bound complexes. Additionally, 29 outer membrane proteins were identified in this analysis. Daley et al. [49] and Granseth et al. [60] have taken a different approach to study the E. coli proteome. Although about 1000


proteins have at least two transmembrane helices, they studied 737 inner membrane proteins that traverse the membrane at least twice and are greater than 100 amino acids in length. Each open reading frame was tagged at the C-terminus with alkaline phosphatase and green fluorescent protein. Over six hundred membrane proteins could be expressed and their topology determined. A tagging approach has great benefit for determining the subcellular localization of each protein and in many cases will provide functional information. For example, it has been possible to localize cell division proteins within the bacterial cell [61]. However, this approach has limitations as the chimeric proteins are normally expressed from an inducible promoter. In order to compare different metabolic, environmental or genetic variations, a detailed profile of proteins identifiable by MS techniques and/or 2-dimensional PAGE will be required.

6. Comparison with microarray

Both Corbin et al. [36] and Zhang et al. [59] were able to compare their protein profile with Affymetrix microarray data of the transcriptome obtained under identical growth conditions. In both studies there was a strong correlation between mRNA expression level and protein identification, indicating that the most highly expressed proteins were found. Additional purification and LC techniques will be required to identify low abundance proteins.

7. Methodology for membrane proteome analysis

Proteome analysis generally involves several steps including cell lysis, protein extraction, protein separation, protein digestion, mass spectrometric analysis of peptides, and data processing for peptide and protein identification. Each step is important and should be optimized to generate a comprehensive profile of the E. coli membrane proteome. 
While there are many sample handling protocols and analytical techniques available to detect hydrophilic and readily soluble proteins, analysis of hydrophobic membrane proteins is still a challenging task. Fortunately, in the past several years, a number of research groups have been actively involved in developing new and improved analytical tools to characterize the membrane proteome of cells, tissues, and other samples, and good progress has been made. Characterization of the membrane proteome of a relatively simple microorganism, E. coli, is an excellent way of judging the performance of a newly developed technique. Although the techniques described below may not directly involve the analysis of the E. coli membrane proteome, they should be applicable to E. coli samples with little or no modification. Two major analytical platforms are commonly used for analyzing membrane proteomes. One uses gel electrophoresis for proteome display followed by in-gel digestion of protein spots of interest and mass spectrometric analysis of the resultant peptides for peptide and protein identification (i.e., gel-based method) [7,21,38,62–65]. The other platform involves the digestion of a proteome with little or no separation at the protein level followed by liquid chromatography (LC) separation of the complex peptide mixture and mass spectrometric sequencing of individual peptides for identification with electrospray ionization (ESI) [66] or MALDI [67] (i.e., solution-based method). As discussed below, each platform has its pros and cons and they often provide complementary information on a proteome. While technical advances in both platforms have been made for membrane proteome analysis, the solution-based method has gained popularity in the past few years due to its impressive proteome coverage and rapid quantification power.

7.1. Gel-based proteome analysis platform

There are several ways of implementing the gel-based membrane proteome analysis platform [38]. Isoelectric focusing (IEF) of proteins can be carried out using a solution-based IEF system [68–71] or an immobilized IEF gel strip to provide the first dimension protein separation. Polyacrylamide gel electrophoresis (PAGE) can then be employed to separate the proteins further according to their molecular sizes. The combination of IEF and PAGE (i.e. 2D-PAGE) provides an efficient means of separating proteins from a complex proteome sample. Because of its high resolving power, particularly for separating proteins with different conformers or different degrees of modification, 2D-PAGE is an excellent tool to unravel subtle changes in the proteome. For example, it is possible to monitor changes in posttranslational modifications of proteins from cells of different states that are difficult or impossible to detect using the solution-based platform [56]. Applying 2D-PAGE to separate very hydrophobic proteins such as integral membrane proteins is challenging. Solubilization of these proteins requires the use of strong surfactants, such as SDS, which are not compatible with IEF. Several zwitterionic detergents have been shown to improve the performance of 2D-PAGE separation of integral membrane proteins [38,72,73]. 
However, hydrophobic proteins may precipitate at their isoelectric points during the IEF separation process, resulting in sample loss and difficulty for second dimension separation. 2D-PAGE analysis of the yeast, tobacco or Arabidopsis proteome revealed that many integral transmembrane proteins could not be identified [64]. To overcome the problem of IEF, Zahedi et al. reported an interesting technique based on the use of the cationic detergent benzyldimethyl-n-hexadecylammonium chloride for the first dimension protein separation followed by anionic detergent SDS-PAGE separation [74]. While separation of several integral membrane proteins was demonstrated, it remains to be seen whether this technique offers sufficient separation power for a complicated membrane proteome. It appears that 1D-SDS-PAGE is at present a better choice for handling very hydrophobic membrane proteins, albeit with reduced resolving power compared to 2D-PAGE [75]. One of the major advantages of the gel-based method for proteome analysis is that it can provide quantitative information on proteins directly, offering a convenient means of comparing proteomes of different samples. If protein isoforms and modified proteins are resolved in the gel, they can be profiled and relative distributions of these proteins can potentially be correlated with a certain phenotype of the samples to study their


functions. To generate quantitative information, gels are often stained with Coomassie blue or silver. Fluorescence dyes are being increasingly used for protein display. Differential display and analysis of relative abundance changes of two proteomes is possible by labeling individual proteomes with different dyes and mixing the labeled proteins, followed by gel separation and fluorescence detection of the labeled proteins at different wavelengths in a gel [56]. These staining methods are compatible with mass spectrometric analysis, provided that high quality reagents are used (i.e., mass spectrometry compatible reagents available from leading gel electrophoresis suppliers). Radiolabeling or Western blotting of proteins of interest can be more sensitive than silver or fluorescence staining methods for detecting proteins in a gel, but the sensitivity of mass spectrometric techniques may not be sufficient to identify these proteins. To identify the proteins detected in a gel by mass spectrometric techniques, all gel spots (e.g., in qualitative proteome profiling) or a selected few spots of interest (e.g. only the proteins showing significant changes in relative quantification of proteomes) are excised and subjected to in-gel digestion with an enzyme, such as trypsin, or a chemical reagent, such as cyanogen bromide (CNBr). Extracting or blotting proteins from a gel band to a solution for digestion is less efficient than in-gel digestion, where proteins are degraded into peptides which can be more readily extracted into a solution than the intact proteins. As expected, in-gel digestion of membrane proteins and subsequent sample preparation for mass spectrometric analysis are much more difficult than in the case of hydrophilic proteins. The standard trypsin digestion protocol can be applied to membrane proteins where trypsin digestion can take place in the gel and the resulting peptides can be effectively extracted for MS analysis. 
However, the method would fail if digestion is inefficient due to lack of cleavage sites or lack of accessibility of the protease to the cleavage sites, such as those within the membrane-spanning segments of a transmembrane protein. The method would also fail if the resulting peptides are too large or too hydrophobic to be extracted or to be analyzed by MS and MS/MS. New digestion and sample handling protocols are being developed to tackle these problems. For example, van Montfort et al. reported a protocol involving sequential in-gel trypsin and CNBr digestion [76] in which larger fragments generated from in-gel trypsin digestion were further digested by CNBr to yield smaller peptides which could be more efficiently extracted for MS analysis. Quach et al. have developed an improved in-gel digestion protocol that involves the use of CNBr digestion, followed by further trypsin digestion [77]. The chemical cleaving reagent CNBr is expected to interact readily with a membrane protein, compared to a protease which is bulky and must retain its optimal conformer for activity. CNBr selectively cleaves at methionine residues, but due to the low number of methionine residues in proteins, CNBr cleavage produces a small number of large peptide fragments with MW typically > 2000 Da that are also difficult to extract from gel pieces with high efficiency. In addition, these large peptides cannot be readily fragmented to produce MS/MS spectra for protein identification. To generate a


larger number of small peptides than can be obtained using CNBr alone, trypsin can be used to digest further the CNBr-cleaved fragments. The protocol has been applied for the analysis of bacteriorhodopsin, nitrate reductase 1 gamma chain (NarI of E. coli), and a complex protein mixture extracted from the endoplasmic reticulum membrane of mouse liver [77]. In these instances it was demonstrated that the sensitivity of membrane protein identification is in the low picomole regime that is compatible with Coomassie staining of gel spots. Mass spectrometric analysis of the extracted peptides from gel spots can be done on a variety of instruments. When 2D-PAGE is used for protein separation, an individual spot may contain a protein with relatively high abundance plus a few other minor components. As long as one protein is a dominant species in a spot, peptide mass mapping or fingerprinting (PMF) can be applied to identify the protein. In PMF, the peptide extract is analyzed directly by MALDI or ESI MS. The peptide masses determined can be searched against a proteome database comprised of protein names, protein sequences and their predicted peptide fragments after digestion by an enzyme with known specificity. In the case of trypsin digestion, expected sequences and masses of tryptic peptides from cleavage C-terminal to lysine or arginine residues can be readily generated a priori for any given protein in a proteome whose sequence is known. By comparing the number and quality of peptide mass matches between the experimental data and the predicted peptides in the database, one can often identify a protein with high confidence. Several search engines are available to perform PMF. For example, the web-based, free search engine MASCOT can be used to perform PMF quickly and to generate a report of protein candidates along with confidence levels of the matches [78]. 
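The PMF idea described above can be illustrated with a minimal sketch: digest each database sequence in silico, compute monoisotopic peptide masses, and count how many observed masses fall within a tolerance of a predicted mass. The function names, the two toy sequences and the 0.2 Da tolerance are assumptions made for illustration. The rule that trypsin does not cleave before proline is a common convention rather than something stated in this review, and missed cleavages and modifications are ignored. Real search engines such as MASCOT use probability-based scoring rather than a simple match count.

```python
# Monoisotopic amino acid residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(sequence):
    """Cleave C-terminal to K or R, except before P (assumed convention)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        at_end = i + 1 == len(sequence)
        if aa in "KR" and (at_end or sequence[i + 1] != "P"):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed, sequence, tol_da=0.2):
    """Number of observed masses explained by the tryptic peptides."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(any(abs(m - t) <= tol_da for t in theoretical) for m in observed)


def rank_candidates(observed, database, tol_da=0.2):
    """Rank database entries by number of matched peptide masses."""
    scores = {name: count_matches(observed, seq, tol_da)
              for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical two-entry database and a spectrum derived from "protA".
toy_db = {"protA": "MKAAR", "protB": "GGKLLR"}
observed_masses = [peptide_mass("MK"), peptide_mass("AAR")]
ranking = rank_candidates(observed_masses, toy_db)
print(ranking)
```

A match count like this degrades in the situations the text goes on to describe: when only one or two peptides are detected, or when a spot contains several proteins whose peptides are interleaved.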
PMF works well if many peptides are detected from a protein and the mass measurement accuracy for determining the peptide mass is moderately high. It is a rapid method and does not require the use of sophisticated mass spectrometric instrumentation. However, if only one or two peptides are detected from a gel spot or the gel spot contains more than one dominant protein, PMF would fail to arrive at a unique match. In this case, a more powerful technique, tandem mass spectrometry or MS/MS, is required for protein identification. There are a number of different types of tandem mass spectrometric instruments being used to produce MS/MS spectra. In general, they start with the generation of a mass spectrum of the peptides (MS mode), followed by selecting a peptide ion and subjecting it to collision-induced dissociation using a collision gas introduced to the mass spectrometer (MS/MS mode). The intensities and mass-to-charge values (m/z's) of the fragment ions are recorded to produce an MS/MS spectrum. Peptide ion fragmentation takes place mainly from the breakage of the amide bonds between amino acids. Although an MS/MS spectrum usually does not contain complete amino acid sequence information for de novo sequencing of a peptide, it contains partial information on peptide sequence. For protein identification, an un-interpreted MS/MS spectrum can be entered into a database search engine where the fragment ion masses (some search engines also consider intensities) are searched against the predicted fragment ion spectra of individual peptides of protein digests for possible matches.
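As a sketch of what a "predicted fragment ion spectrum" looks like, the code below computes singly charged fragment m/z values for breakage at each amide bond, using the standard b-ion (N-terminal fragment) and y-ion (C-terminal fragment) nomenclature, which this review does not itself introduce. The function names and the 0.5 Da tolerance are illustrative assumptions, and the matched-fraction measure is a simplification of how real search engines score spectra.

```python
# Monoisotopic amino acid residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_ions(peptide):
    """Singly charged b- and y-ion m/z values for each amide bond."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        # b ion: first i residues plus a proton.
        b_ions.append(sum(masses[:i]) + PROTON)
        # y ion: last i residues plus water plus a proton.
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)
    return b_ions, y_ions


def matched_fraction(spectrum_mz, peptide, tol_da=0.5):
    """Fraction of predicted fragment ions found in the spectrum."""
    b_ions, y_ions = fragment_ions(peptide)
    predicted = b_ions + y_ions
    if not predicted:
        return 0.0
    hits = sum(any(abs(p - mz) <= tol_da for mz in spectrum_mz)
               for p in predicted)
    return hits / len(predicted)


b, y = fragment_ions("GG")
print(b, y)
```

Because a measured spectrum rarely contains every predicted fragment, a partial match is the normal case, which is why search engines attach a score and a significance threshold to each assignment.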


From the automated MS/MS database search, a report is generated including matched peptide sequences from a given MS/MS spectrum and their corresponding proteins along with matching scores. A probability-based matching algorithm has been introduced recently in several search engines to provide a statistical evaluation of the fragment ion and sequence matches. For example, in the MASCOT search engine, for each peptide sequence matched to an MS/MS spectrum, a matching score as well as a threshold score is listed [78]. The threshold score defines the minimum score required to call a positive identification within a confidence level (e.g., >95% certainty). For a peptide with a matching score slightly above the threshold (e.g., a confidence level between 95% and 99%), manual comparison of the MS/MS spectrum with the predicted fragmentation pattern of the peptide ion may be carried out to confirm or discard the peptide match from the automated database search step. MS/MS spectral acquisition can be carried out in conjunction with liquid chromatography (LC) separation of the extracted peptides from a gel spot. Either LC-ESI or LC-MALDI MS and MS/MS or both may be used for analyzing a sample [67]. This is particularly important for identifying proteins separated only by one-dimensional gel electrophoresis. A gel band from 1D-PAGE may contain many proteins. Thus, the rapid and simple PMF method would fail to generate a unique protein identification. The peptide sample from a 1D gel band is a complex mixture. Direct MS/MS analysis of the mixture may result in the detection of only a few readily detectable peptides (i.e. high abundance and more ionizable peptides) due to an ion suppression effect, i.e., a few peptides dominating the spectrum at the expense of other low abundance and less ionizable peptides. Peptide separation by LC can significantly reduce the ion suppression effect, resulting in the detection of a greater number of peptides from a complex mixture [79]. 
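The score-versus-threshold triage described earlier in this passage (accept clear matches, manually inspect those only slightly above the 95% threshold, discard the rest) can be written down as a small rule. This is a sketch only: the function name and the numeric thresholds in the usage lines are hypothetical, since the actual threshold scores are reported by the search engine for each search.

```python
def triage(score, threshold_95, threshold_99):
    """Classify a peptide match from its score and two threshold scores.

    threshold_95 and threshold_99 are the scores needed for 95% and 99%
    confidence; both are supplied by the caller (assumed inputs here).
    """
    if score < threshold_95:
        return "reject"
    if score < threshold_99:
        # Slightly above threshold: compare spectrum and predicted
        # fragmentation pattern by hand before accepting.
        return "manual review"
    return "accept"


# Hypothetical thresholds and scores, for illustration only.
for s in (30, 38, 50):
    print(s, triage(s, threshold_95=35, threshold_99=42))
```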
To summarize the application of the gel-based platform for E. coli membrane proteome analysis, it has been shown that, if the membrane proteins are amenable to 2D-PAGE separation, relative quantification of proteomes can be carried out at the protein level with the possibility of examining proteins with different conformers or modifications. Protein identification can be carried out by in-gel digestion, peptide extraction and PMF or MS/MS of the peptides. If only 1D-PAGE can be used to separate the membrane proteins, due to its limited separation power, individual gel bands would contain mixtures of proteins, making quantitative analysis of individual proteins difficult. Identification of proteins residing in a gel band can be done by using in-gel digestion, peptide extraction and MS/MS or LC MS/MS of the peptide mixture.

7.2. Solution-based proteome analysis platform

In the past several years, a solution-based or gel-free technical platform has been developed for membrane proteome analysis [79]. In the solution-based method, the entire protein mixture is first digested and the resulting peptides analyzed by LC MS/MS. In applying this method to the membrane proteome, protein solubilization and digestion must proceed with high efficiency. Due to the hydrophobic nature of membrane

proteins, dissolving all proteins in a solvent system that is compatible with enzymatic or chemical protein digestion can be challenging. An efficient means of degrading proteins into peptides of suitable length for LC separation and MS analysis is also vital to the success of the solution-based platform. Among the reported protein digestion methods, trypsin digestion is still the most commonly used due to trypsin's relatively high enzyme specificity. Trypsin generates peptides of near ideal size (< 30 residues containing basic amino acids Arg and/or Lys) for MS and MS/MS, and digestion can be carried out as long as the solvent system used to dissolve the membrane proteins does not totally denature the trypsin. Ammonium bicarbonate buffer is widely used to dissolve soluble proteins and may dissolve a membrane protein containing extensive hydrophilic moieties. To increase the solubility of membrane proteins, surfactants can be added to a solution. For example, 0.5% SDS has been used by Han et al. to solubilize a membrane-enriched microsomal fraction, followed by trypsin digestion in dilute SDS solution [80]. However, surfactants may adversely affect the digestion process. For example, 1% SDS, a strong ionic surfactant, can be used to dissolve many membrane proteins, but trypsin will be denatured at this high concentration of SDS, making it useless for digestion. However, by diluting the membrane protein solution to about 0.1% SDS, protein digestion can proceed with reasonably good efficiency [59]. A cleavable detergent, 3-[3-(1,1-bisalkyloxyethyl)pyridin-1-yl]propane-1-sulfonate (PPS), is compatible with trypsin digestion and has been useful to dissolve membrane proteins [81]. Wu et al. reported a method for comprehensive membrane protein analysis using non-specific Proteinase K for digestion and subsequent analysis by LC-ESI MS/MS [66]. Both organic acids (e.g. trifluoroacetic acid (TFA)) and organic solvents (e.g. 
methanol) have been reported to be effective in dissolving membrane proteins. Washburn et al. developed a surfactant-free method that used 90% formic acid to solubilize proteins in the presence of CNBr, with further enzymatic digestion of the CNBr-cleaved protein fragments by LysC and trypsin [82]. Recent studies suggest that trypsin is functional for digestion with a methanol concentration of up to about 65% [83–87]. Thus, a high concentration of methanol can be used to solubilize membrane proteins, followed by trypsin digestion and this technique has been reported to be useful for membrane proteome analysis [85–87]. The use of an organic solvent, such as 60% methanol, to solubilize membrane proteins has one major advantage compared to the use of a solvent system containing a strong surfactant, such as SDS. The organic solvent can be readily removed after protein digestion, while a strong surfactant, such as SDS, is difficult to remove. The advantage of SDS over organic solvents is that it can dissolve a wider range of proteins, including misfolded and precipitated proteins. As SDS can degrade the performance of reversed-phase separations and MS analyses of peptides, the remaining SDS in the digested peptide sample must be removed by using an ion-exchange column. In a solution-based method, where two-dimensional (2D) peptide separation is used, the first dimension of separation is generally based on ion-exchange chromatography. Thus, SDS removal from the digested peptide sample is integrated into the first-dimensional peptide separation.

J.H. Weiner, L. Li / Biochimica et Biophysica Acta 1778 (2008) 1698–1713

Zhang et al. compared the ability of 60% methanol and 1% SDS to dissolve the inner membrane fraction of an E. coli K12 cell lysate [59]. Using trypsin digestion and 2D-LC ESI MS/MS, they identified 358 proteins (1417 unique peptides) from the methanol-solubilized protein mixture and 299 proteins (892 peptides) from the SDS-solubilized sample. The methanol method detected more hydrophobic peptides than the SDS method, resulting in a greater number of proteins identified. Of these, 159 of the 358 proteins (44%) detected by methanol solubilization and 120 of the 299 proteins (40%) detected by SDS solubilization were integral membrane proteins. Of a total of 190 integral membrane proteins, 70 were identified exclusively in the methanol-solubilized sample, 89 were identified by both methods, and only 31 were identified exclusively by the SDS method. In addition, it was determined that the protein solubilization potential of SDS or methanol was not the crucial parameter determining the overall performance of each method; rather, the compatibility of these two reagents with protein digestion, downstream peptide separation and ESI-MS analysis was critical for their analytical performance. The better compatibility of the methanol method with 2D-LC-MS/MS resulted in a higher number of total peptide and protein identifications, higher reproducibility and a pronounced bias of the method towards hydrophobic peptides and integral membrane proteins. Using the combined datasets obtained from the two methods, it was shown that there was no bias in the experimental proteome compared to the predicted membrane proteome [59].

To achieve maximum digestion efficiency, Ji et al. applied two consecutive digestion steps for the analysis of the E. coli membrane proteome [57]. Membrane proteins were first dissolved in 60% methanol and digested with trypsin for 5 h. In a second step, undissolved protein was pelleted, resuspended in 0.05% SDS with trypsin and digested overnight. Both digests were pooled and subjected to LC-ESI MS/MS analysis.

When enzymes or chemical reagents are used to degrade membrane proteins, an important requisite is that the proteins to be degraded must be dissolved in a suitable solution. Unfortunately, during the protein sample workup, many proteins may become highly denatured and are not readily soluble in any solvent, including those containing a strong surfactant. As a consequence, these proteins are not detected by the solution-based proteome analysis method. Zhong et al. have recently described a method that does not require a solvent to solubilize proteins: microwave-assisted acid hydrolysis (MAAH) [88] in 25% aqueous TFA is used to degrade proteins into peptides for MS characterization [89]. Compared to enzymatic digestion, MAAH is fast and detergent-free. It involves a simple sample handling process and introduces no background peptides, such as those arising from protease autolysis in enzyme digestion. MAAH is particularly useful in dealing with membrane proteome samples of cultured cells and tissue samples. The application of this method, in combination with LC-MALDI MS and MS/MS, was illustrated in the analysis of membrane proteins isolated from a human breast cancer cell line [89] and a heart tissue sample [Mulu et al., submitted].
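The integral membrane protein counts reported for the methanol/SDS comparison by Zhang et al. [59] can be cross-checked with simple set arithmetic. The sketch below uses only numbers quoted in the text (70 methanol-only, 89 shared, 31 SDS-only; 358 and 299 total proteins); the function and key names are illustrative.

```python
def summarize_overlap(only_a: int, shared: int, only_b: int,
                      total_a: int, total_b: int) -> dict:
    """Derive per-method and combined counts from a two-set overlap."""
    in_a = only_a + shared   # integral membrane proteins seen with method A
    in_b = only_b + shared   # integral membrane proteins seen with method B
    return {
        "imp_in_a": in_a,
        "imp_in_b": in_b,
        "imp_union": only_a + shared + only_b,
        "pct_imp_a": round(100 * in_a / total_a),
        "pct_imp_b": round(100 * in_b / total_b),
    }


# A = methanol-solubilized sample, B = SDS-solubilized sample.
zhang = summarize_overlap(only_a=70, shared=89, only_b=31,
                          total_a=358, total_b=299)
print(zhang)
```

Adding the shared proteins to each exclusive set gives 159 integral membrane proteins for the methanol sample (44% of 358) and 120 for the SDS sample (40% of 299), and the three disjoint groups sum to the reported total of 190.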

Considering the complexity of the membrane proteome, it appears that no single method of solubilization and digestion can be universally applied to handle all proteins. A combination of several complementary methods may result in greater proteome coverage. More recently, Wang et al. reported a sequential protein solubilization and digestion protocol for zebrafish liver proteome analysis [90]. In their work, it was found that, after dissolving the protein pellet from the liver tissue extracts in a basic buffer and subjecting it to trypsin digestion, a non-soluble residue remained in the vials. The residue was subjected to additional rounds of digestion, first by methanol-assisted trypsin digestion, followed by SDS-assisted trypsin digestion and finally by the MAAH method. The peptide mixtures were pooled and subjected to strong cation exchange chromatographic separation, followed by a reversed-phase column and LC-ESI MS/MS analysis. Proteome analysis using the combined solubilization/digestion methods led to the identification of 1204 unique proteins. Among the 1204 proteins identified, 224 (19%) were found in all three samples, while 113 (9%), 420 (35%), and 214 (18%) proteins or related protein groups were uniquely observed in the buffer/methanol digest, SDS digest, and MAAH digest, respectively.

In the solution-based method, relative quantification of proteomes of different samples is commonly done using stable isotopic labeling of proteins or peptides [80]. One sample is labeled with a light isotope and another one with a heavy isotope. Isotope incorporation can be achieved at the protein level during cell growth using either an isotope-enriched minimal medium or conventional medium plus isotope-labeled amino acids [91,92]. This is followed by enzyme or chemical digestion of labeled proteins to generate isotope-tagged peptides.
Note that this labeling strategy works only for cells that can grow in the special culture medium and cannot be readily used for other samples, such as tissues or body fluids. An alternative strategy is to use chemical derivatization to attach an isotope tag to the digested peptides of a proteome. In both strategies, the isotope labeled peptides from a mixture of light- and heavy-isotope-labeled samples produce pairs of peaks in a mass spectrum when the mixture is analyzed using the MS scan mode. The relative intensities of a pair of peaks can be used to determine the abundance change of the peptide and its corresponding protein. MS/MS spectra of the peptide pair can be generated for peptide and protein identification. There are many isotope labeling reactions reported for relative proteome quantification. Among them, the isotope-coded affinity tag (ICAT) approach pioneered by Aebersold and coworkers [80,93–95] has been extensively used. The main advantage of this method is that it enriches peptides containing the rare amino acid cysteine, thereby significantly reducing the complexity of the peptide mixture and increasing the dynamic range of MS analysis. On the other hand, the use of the ICAT reagents fails for quantification of cysteine-free proteins. In addition, the ICAT reagents are structurally complex and thus the cost of the reagents is high. As alternatives to ICAT, other chemical labeling protocols of peptides after protein digestion have been developed and recently reviewed [96].
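The peak-pair principle described above, in which the relative intensities of the light and heavy peaks report the abundance change of a peptide, can be sketched in a few lines. Everything in this example is an assumption made for illustration: the intensities and peptide names are invented, the 2-fold cut-off is an arbitrary choice, and summarizing a protein by the median of its peptide log ratios is one common convention rather than a method prescribed by the review.

```python
import math
from dataclasses import dataclass
from statistics import median


@dataclass
class PeakPair:
    peptide: str
    light: float  # intensity of the light-isotope peak
    heavy: float  # intensity of the heavy-isotope peak


def log2_ratio(pair: PeakPair) -> float:
    """log2(heavy / light); 0 means no change between the two samples."""
    return math.log2(pair.heavy / pair.light)


def changed_pairs(pairs, fold: float = 2.0):
    """Return pairs whose intensity ratio exceeds `fold` in either direction."""
    threshold = math.log2(fold)
    return [p for p in pairs if abs(log2_ratio(p)) > threshold]


def protein_ratio(pairs) -> float:
    """Summarize one protein as the median of its peptide ratios (assumed convention)."""
    return 2 ** median(log2_ratio(p) for p in pairs)


# Hypothetical peak pairs, all assigned to the same protein.
pairs = [
    PeakPair("PEPTIDEA", light=1000.0, heavy=1100.0),
    PeakPair("PEPTIDEB", light=1000.0, heavy=4000.0),
    PeakPair("PEPTIDEC", light=3000.0, heavy=1000.0),
]
print([p.peptide for p in changed_pairs(pairs)])
print(protein_ratio(pairs))
```

Working in log space treats a 2-fold increase and a 2-fold decrease symmetrically, which is why the threshold is applied to the absolute log ratio.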

Simple labeling chemistry and inexpensive reagents are important in applying the isotope labeling approach for relative quantification of proteomes. As an example, H₂¹⁸O can be used in protease digestion to introduce the ¹⁸O tag through the hydrolysis reaction, labeling all proteolytic peptides uniformly at the C-terminus [97–99]. Another example is differential dimethyl labeling of the N-termini and the ε-amino groups of lysine residues of tryptic peptides with d(0)- or d(2)-formaldehyde [100–102]. Dimethyl labeling combined with LC-MALDI MS and MS/MS has been applied to determine the proteins differentially expressed between an E-cadherin-deficient human carcinoma cell line (SCC9) and its transfectants expressing E-cadherin (SCC9-E) [103]. A total of 5480 peptide pairs were examined and 320 of them showed relative intensity changes of greater than 2-fold, which led to the identification of 49 differentially expressed proteins. More recently, Ji et al. reported a modified N-terminal dimethyl labeling strategy, in which the N-termini of tryptic peptides are differentially labeled with either d(0),¹²C-formaldehyde or d(2),¹³C-formaldehyde after lysine residues in peptides are blocked by guanidination [57]. Guanidination is known to be effective for selective labeling of lysine residues in peptides [104]. It has been demonstrated that N-terminal dimethylation (2ME) after lysine guanidination (GA), or 2MEGA, provides uniform 6 Da differential isotope tags on peptides that facilitate protein identification and quantification [57].

In summary, with the development of new protocols for solubilizing and digesting membrane proteins as well as improved isotope labeling chemistry, the solution-based method will play an increasingly important role in membrane proteomics. It is anticipated that this method, combined with multi-dimensional LC separation techniques, will allow us to examine the E. coli membrane proteome with unprecedented coverage and accuracy.
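The 6 Da tag quoted for 2MEGA can be reproduced from monoisotopic masses. The sketch below assumes that each labeled amine gains two methyl groups whose carbon and two of whose hydrogens come from formaldehyde, with the third hydrogen supplied by an unlabeled reducing agent; that reagent detail, the example peptide and all function names are assumptions for illustration, not details taken from the review.

```python
# Monoisotopic masses in Da.
H, D = 1.00782503, 2.01410178
C12, C13 = 12.0, 13.00335484


def dimethyl_added_mass(carbon: float, hydrogen: float) -> float:
    """Net mass added to one amine by dimethylation (two CX2 units)."""
    return 2 * (carbon + 2 * hydrogen)


LIGHT = dimethyl_added_mass(C12, H)          # d(0),12C-formaldehyde
HEAVY_D2 = dimethyl_added_mass(C12, D)       # d(2)-formaldehyde
HEAVY_D2_13C = dimethyl_added_mass(C13, D)   # d(2),13C-formaldehyde


def labeled_sites(peptide: str, lys_blocked: bool) -> int:
    """N-terminus plus lysine amino groups, unless lysines are blocked."""
    return 1 + (0 if lys_blocked else peptide.count("K"))


def pair_spacing(peptide: str, heavy: float, lys_blocked: bool,
                 charge: int = 1) -> float:
    """m/z spacing between the light and heavy forms of a peptide."""
    return labeled_sites(peptide, lys_blocked) * (heavy - LIGHT) / charge


peptide = "LLVRPTEGK"  # hypothetical tryptic peptide ending in Lys
print(round(pair_spacing(peptide, HEAVY_D2, lys_blocked=False), 3))
print(round(pair_spacing(peptide, HEAVY_D2_13C, lys_blocked=True), 3))
```

Under these assumptions, ordinary dimethyl labeling with d(2)-formaldehyde gives a spacing of about 4 Da per labeled amine, so the spacing depends on the lysine count, whereas blocking lysines and labeling only the N-terminus gives the same spacing of about 6 Da for every peptide.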
In this review we have taken the protein identifications as published by the investigators. In many studies confidence limits are not presented and the number of false positives is not listed. In the Gevaert study [26], the authors indicate that, of the 689 proteins identified using the PROCORR algorithm, approximately 42 false positives are obtained at the 98% confidence limit. This rises to 69 false positives at 95% confidence. The issues of variation in mass spectrometry technology and identification algorithms need to be addressed so that readers can have confidence in the data presented. As proteomics comes to rely increasingly on quantitative analysis, this information will become even more important.

8. Future perspectives

Membrane proteomic studies are progressing rapidly, with an ever greater number of proteins identified at increasing levels of confidence. Future studies will utilize newly devised quantitative methods to monitor the membrane proteome and how it changes as the physiologic or environmental conditions change [105,106]. The turnover of membrane proteins will be investigated, providing the ability to correlate mRNA turnover with protein turnover. It will also be possible to monitor posttranslational modifications of membrane proteins. More complex will be the identification of metabolons, where historically “soluble” cytoplasmic proteins will be shown to be part of functional complexes on the membrane. New nanotechniques along with rapid proteomic approaches will allow the characterization of bacterial infections without the need to wait for cultures to grow in the clinical laboratory. Proteomics is already being used to identify increased expression of proteins in the clinical situation and the formation of biofilms [107]. The techniques pioneered with the model organism E. coli will be applied to the plasma membranes of eukaryotic cells and the membranes of organelles.

Acknowledgements

Research in the authors' laboratories is supported by the Canadian Institutes of Health Research and the Natural Sciences and Engineering Research Council of Canada. JHW is a Canada Research Chair in Membrane Biochemistry; LL is a Canada Research Chair in Analytical Chemistry. JHW would like to thank the Alberta Heritage Foundation for Medical Research for support during the writing of this review and Dr. Fraser Armstrong and the Laboratory of Inorganic Chemistry at Oxford University. We thank Philip Winter for preparation of the Venn diagrams.

References

[1] C. Holden, Cell biology. Alliance launched to model E. coli, Science 297 (2002) 1459–1460.
[2] M. Riley, T. Abe, M.B. Arnaud, M.K. Berlyn, F.R. Blattner, R.R. Chaudhuri, J.D. Glasner, T. Horiuchi, I.M. Keseler, T. Kosuge, H. Mori, N.T. Perna, G. Plunkett III, K.E. Rudd, M.H. Serres, G.H. Thomas, N.R. Thomson, D. Wishart, B.L. Wanner, Escherichia coli K-12: a cooperatively developed annotation snapshot-2005, Nucleic Acids Res. 34 (2006) 1–9.
[3] P. Owen, H.R. Kaback, Antigenic architecture of membrane vesicles from Escherichia coli, Biochemistry 18 (1979) 1422–1426.
[4] P. Dickie, J.H. Weiner, Purification and characterization of membrane-bound fumarate reductase from anaerobically grown Escherichia coli, Can. J. Biochem. 57 (1979) 813–821.
[5] M.G. Bertero, R.A. Rothery, M. Palak, C. Hou, D. Lim, F. Blasco, J.H. Weiner, N.C. Strynadka, Insights into the respiratory electron transfer pathway from the structure of nitrate reductase A, Nat. Struct. Biol. 10 (2003) 681–687.
[6] I. Yamato, M. Futai, Y. Anraku, Y. Nonomura, Cytoplasmic membrane vesicles of Escherichia coli. II. Orientation of the vesicles studied by localization of enzymes, J. Biochem. (Tokyo) 83 (1978) 117–128.
[7] M.P. Molloy, N.D. Phadke, J.R. Maddock, P.C. Andrews, Two-dimensional electrophoresis and peptide mass fingerprinting of bacterial outer membrane proteins, Electrophoresis 22 (2001) 1686–1696.
[8] H.C. Neu, L.A. Heppel, The release of enzymes from Escherichia coli by osmotic shock and during the formation of spheroplasts, J. Biol. Chem. 240 (1965) 3685–3692.
[9] G.F. Ames, K. Nikaido, Two-dimensional gel electrophoresis of membrane proteins, Biochemistry 15 (1976) 616–623.
[10] F.R. Blattner, G. Plunkett III, C.A. Bloch, N.T. Perna, V. Burland, M. Riley, J. Collado-Vides, J.D. Glasner, C.K. Rode, G.F. Mayhew, J. Gregor, N.W. Davis, H.A. Kirkpatrick, M.A. Goeden, D.J. Rose, B. Mau, Y. Shao, The complete genome sequence of Escherichia coli K-12, Science 277 (1997) 1453–1474.
[11] V. Braun, Covalent lipoprotein from the outer membrane of Escherichia coli, Biochim. Biophys. Acta 415 (1975) 335–377.
[12] W. Vollmer, J.V. Holtje, Morphogenesis of Escherichia coli, Curr. Opin. Microbiol. 4 (2001) 625–633.

[13] J. Kyte, R.F. Doolittle, A simple method for displaying the hydropathic character of a protein, J. Mol. Biol. 157 (1982) 105–132.
[14] C.P. Chen, A. Kernytsky, B. Rost, Transmembrane helix predictions revisited, Protein Sci. 11 (2002) 2774–2791.
[15] E.L. Sonnhammer, G. von Heijne, A. Krogh, A hidden Markov model for predicting transmembrane helices in protein sequences, Proc. Int. Conf. Intell. Syst. Mol. Biol. 6 (1998) 175–182.
[16] A. Krogh, B. Larsson, G. von Heijne, E.L. Sonnhammer, Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes, J. Mol. Biol. 305 (2001) 567–580.
[17] F.S. Berven, K. Flikka, H.B. Jensen, I. Eidhammer, BOMP: a program to predict integral beta-barrel outer membrane proteins encoded within genomes of Gram-negative bacteria, Nucleic Acids Res. 32 (2004) W394–W399.
[18] A. Pautsch, G.E. Schulz, High-resolution structure of the OmpA membrane domain, J. Mol. Biol. 298 (2000) 273–282.
[19] S.K. Buchanan, B.S. Smith, L. Venkatramani, D. Xia, L. Esser, M. Palnitkar, R. Chakraborty, D. van der Helm, J. Deisenhofer, Crystal structure of the outer membrane active transporter FepA from Escherichia coli, Nat. Struct. Biol. 6 (1999) 56–63.
[20] H.C. Wu, Biosynthesis of lipoproteins, American Society for Microbiology, 1996.
[21] M.P. Molloy, B.R. Herbert, M.B. Slade, T. Rabilloud, A.S. Nouwens, K.L. Williams, A.A. Gooley, Proteomic analysis of the Escherichia coli outer membrane, Eur. J. Biochem. 267 (2000) 2871–2881.
[22] A.G. Garrow, A. Agnew, D.R. Westhead, TMB-Hunt: an amino acid composition based method to screen proteomes for beta-barrel transmembrane proteins, BMC Bioinformatics 6 (2005) 56.
[23] S. Rey, M. Acab, J.L. Gardy, M.R. Laird, K. deFays, C. Lambert, F.S. Brinkman, PSORTdb: a protein subcellular localization database for bacteria, Nucleic Acids Res. 33 (2005) D164–D168.
[24] M. Chevallet, V. Santoni, A. Poinas, D. Rouquie, A. Fuchs, S. Kieffer, M. Rossignol, J. Lunardi, J. Garin, T. Rabilloud, New zwitterionic detergents improve the analysis of membrane proteins by two-dimensional electrophoresis, Electrophoresis 19 (1998) 1901–1909.
[25] M. Fountoulakis, R. Gasser, Proteomic analysis of the cell envelope fraction of Escherichia coli, Amino Acids 24 (2003) 19–41.
[26] K. Gevaert, J. Van Damme, M. Goethals, G.R. Thomas, B. Hoorelbeke, H. Demol, L. Martens, M. Puype, A. Staes, J. Vandekerckhove, Chromatographic isolation of methionine-containing peptides for gel-free proteome analysis: identification of more than 800 Escherichia coli proteins, Mol. Cell. Proteomics 1 (2002) 896–903.
[27] M. Hirosawa, M. Hoshida, M. Ishikawa, T. Toya, MASCOT: multiple alignment system for protein sequences based on three-way dynamic programming, Comput. Appl. Biosci. 9 (1993) 161–167.
[28] A. Lopez-Campistrous, P. Semchuk, L. Burke, T. Palmer-Stone, S.J. Brokx, G. Broderick, D. Bottorff, S. Bolch, J.H. Weiner, M.J. Ellison, Localization, annotation, and comparison of the Escherichia coli K-12 proteome under two states of growth, Mol. Cell. Proteomics 4 (2005) 1205–1209.
[29] M.E. Bayer, Zones of membrane adhesion in the cryofixed envelope of Escherichia coli, J. Struct. Biol. 107 (1991) 268–280.
[30] E.M. Lai, U. Nair, N.D. Phadke, J.R. Maddock, Proteomic screening and identification of differentially distributed membrane proteins in Escherichia coli, Mol. Microbiol. 52 (2004) 1029–1044.
[31] P. Marani, S. Wagner, L. Baars, P. Genevaux, J.W. de Gier, I. Nilsson, R. Casadio, G.G. von Heijne, New Escherichia coli outer membrane proteins identified through prediction and experimental verification, Protein Sci. 15 (2006) 884–889.
[32] J.D. Bendtsen, H. Nielsen, G. von Heijne, S. Brunak, Improved prediction of signal peptides: SignalP 3.0, J. Mol. Biol. 340 (2004) 783–795.
[33] J.D. Bendtsen, H. Nielsen, D. Widdick, T. Palmer, S. Brunak, Prediction of twin-arginine signal peptides, BMC Bioinformatics 6 (2005) 167.
[34] H. Nielsen, J. Engelbrecht, S. Brunak, G. von Heijne, Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites, Protein Eng. 10 (1997) 1–6.
[35] A.J. Link, K. Robison, G.M. Church, Comparing the predicted and observed properties of proteins encoded in the genome of Escherichia coli K-12, Electrophoresis 18 (1997) 1259–1313.
[36] R.W. Corbin, O. Paliy, F. Yang, J. Shabanowitz, M. Platt, C.E. Lyons Jr., K. Root, J. McAuliffe, M.I. Jordan, S. Kustu, E. Soupene, D.F. Hunt, Toward a protein profile of Escherichia coli: comparison to its transcription profile, Proc. Natl. Acad. Sci. U. S. A. 100 (2003) 9232–9237.
[37] R.A. VanBogelen, K.Z. Abshire, B. Moldover, E.R. Olson, F.C. Neidhardt, Escherichia coli proteome analysis using the gene-protein database, Electrophoresis 18 (1997) 1243–1251.
[38] V. Santoni, M. Molloy, T. Rabilloud, Membrane proteins and proteomics: un amour impossible? Electrophoresis 21 (2000) 1054–1070.
[39] J.M. Wood, Membrane association of proline dehydrogenase in Escherichia coli is redox dependent, Proc. Natl. Acad. Sci. U. S. A. 84 (1987) 373–377.
[40] E.D. Brown, J.M. Wood, Conformational change and membrane association of the PutA protein are coincident with reduction of its FAD cofactor by proline, J. Biol. Chem. 268 (1993) 8972–8979.
[41] B. Futcher, G.I. Latter, P. Monardo, C.S. McLaughlin, J.I. Garrels, A sampling of the yeast proteome, Mol. Cell. Biol. 19 (1999) 7357–7368.
[42] B.D. Lemire, J.J. Robinson, R.D. Bradley, D.G. Scraba, J.H. Weiner, Structure of fumarate reductase on the cytoplasmic membrane of Escherichia coli, J. Bacteriol. 155 (1983) 391–397.
[43] P.T. Bilous, S.T. Cole, W.F. Anderson, J.H. Weiner, Nucleotide sequence of the dmsABC operon encoding the anaerobic dimethylsulphoxide reductase of Escherichia coli, Mol. Microbiol. 2 (1988) 785–795.
[44] K.P. Locher, Structure and mechanism of ABC transporters, Curr. Opin. Struct. Biol. 14 (2004) 426–431.
[45] E.A. Pratt, J.A. Jones, P.F. Cottam, S.R. Dowd, C. Ho, A biochemical study of the reconstitution of D-lactate dehydrogenase-deficient membrane vesicles using fluorine-labeled components, Biochim. Biophys. Acta 729 (1983) 167–175.
[46] A. Schryvers, E. Lohmeier, J.H. Weiner, Chemical and functional properties of the native and reconstituted forms of the membrane-bound, aerobic glycerol-3-phosphate dehydrogenase of Escherichia coli, J. Biol. Chem. 253 (1978) 783–788.
[47] A.J. Driessen, P. Fekkes, J.P. van der Wolk, The Sec system, Curr. Opin. Microbiol. 1 (1998) 216–222.
[48] B.C. Berks, T. Palmer, F. Sargent, The Tat protein translocation pathway and its role in microbial physiology, Adv. Microb. Physiol. 47 (2003) 187–254.
[49] D.O. Daley, M. Rapp, E. Granseth, K. Melen, D. Drew, G. von Heijne, Global topology analysis of the Escherichia coli inner membrane proteome, Science 308 (2005) 1321–1323.
[50] E. Wallin, G. von Heijne, Genome-wide analysis of integral membrane proteins from eubacterial, archaean, and eukaryotic organisms, Protein Sci. 7 (1998) 1029–1038.
[51] V. Yankovskaya, R. Horsefield, S. Tornroth, C. Luna-Chavez, H. Miyoshi, C. Leger, B. Byrne, G. Cecchini, S. Iwata, Architecture of succinate dehydrogenase and reactive oxygen species generation, Science 299 (2003) 700–704.
[52] T.M. Iverson, C. Luna-Chavez, L.R. Croal, G. Cecchini, D.C. Rees, Crystallographic studies of the Escherichia coli quinol-fumarate reductase with inhibitors bound to the quinol-binding site, J. Biol. Chem. 277 (2002) 16124–16130.
[53] M. Jormakka, S. Tornroth, J. Abramson, B. Byrne, S. Iwata, Purification and crystallization of the respiratory complex formate dehydrogenase-N from Escherichia coli, Acta Crystallogr., D Biol. Crystallogr. 58 (2002) 160–162.
[54] G. Butland, J.M. Peregrin-Alvarez, J. Li, W. Yang, X. Yang, V. Canadien, A. Starostine, D. Richards, B. Beattie, N. Krogan, M. Davey, J. Parkinson, J. Greenblatt, A. Emili, Interaction network containing conserved and essential protein complexes in Escherichia coli, Nature 433 (2005) 531–537.
[55] B.V. Alvarez, G.L. Vilas, J.R. Casey, Metabolon disruption: a mechanism that regulates bicarbonate transport, EMBO J. 24 (2005) 2499–2511.
[56] J.X. Yan, A.T. Devenish, R. Wait, T. Stone, S. Lewis, S. Fowler, Fluorescence two-dimensional difference gel electrophoresis and mass spectrometry based proteomic analysis of Escherichia coli, Proteomics 2 (2002) 1682–1698.
[57] C. Ji, A. Lo, S. Marcus, L. Li, Effect of 2MEGA labeling on membrane proteome analysis using LC-ESI QTOF MS, J. Proteome Res. 5 (2006) 2567–2576.

[58] M.P. Molloy, B.R. Herbert, K.L. Williams, A.A. Gooley, Extraction of Escherichia coli proteins with organic solvents prior to two-dimensional electrophoresis, Electrophoresis 20 (1999) 701–704.
[59] N. Zhang, R. Chen, N. Young, D. Wishart, P. Winter, J.H. Weiner, L. Li, Comparison of SDS- and methanol-assisted protein solubilization and digestion methods for Escherichia coli membrane proteome analysis by 2-D LC-MS/MS, Proteomics 7 (2007) 484–493.
[60] E. Granseth, D.O. Daley, M. Rapp, K. Melen, G. von Heijne, Experimentally constrained topology models for 51,208 bacterial inner membrane proteins, J. Mol. Biol. 352 (2005) 489–494.
[61] C.A. Hale, P.A. de Boer, Recruitment of ZipA to the septal ring of Escherichia coli is dependent on FtsZ and independent of FtsA, J. Bacteriol. 181 (1999) 167–176.
[62] B. Herbert, Advances in protein solubilization for 2-dimensional electrophoresis, Electrophoresis 20 (1999) 660–663.
[63] M.P. Molloy, Two-dimensional electrophoresis of membrane proteins using immobilized pH gradients, Anal. Biochem. 280 (2000) 1–10.
[64] V. Santoni, S. Kieffer, D. Desclaux, F. Masson, T. Rabilloud, Membrane proteomics: use of additive main effects with multiplicative interaction model to classify plasma membrane proteins according to their solubility and electrophoretic properties, Electrophoresis 21 (2000) 3329–3344.
[65] V. Santoni, P. Doumas, D. Rouquie, M. Mansion, T. Rabilloud, M. Rossignol, Large scale characterization of plant plasma membrane proteins, Biochimie 81 (1999) 655–661.
[66] C.C. Wu, M.J. MacCoss, K.E. Howell, J.R. Yates, A method for the comprehensive proteomic analysis of membrane proteins, Nat. Biotechnol. 21 (2003) 532–538.
[67] N. Zhang, N. Li, L. Li, Liquid chromatography MALDI MS/MS for membrane proteome analysis, J. Proteome Res. 3 (2004) 719–727.
[68] T.Q. Shang, J.M. Ginter, M.V. Johnston, B.S. Larsen, C.N. McEwen, Carrier ampholyte-free solution isoelectric focusing as a prefractionation method for the proteomic analysis of complex protein mixtures, Electrophoresis 24 (2003) 2359–2368.
[69] G. Weber, M. Islinger, P. Weber, C. Eckerskorn, A. Voelkl, Efficient separation and analysis of peroxisomal membrane proteins using free-flow isoelectric focusing, Electrophoresis 25 (2004) 1735–1747.
[70] T. McDonald, S. Sheng, B. Stanley, D. Chen, Y. Ko, R.N. Cole, P. Pedersen, J.E. Van Eyk, Expanding the subproteome of the inner mitochondria using protein separation technologies. One- and two-dimensional liquid chromatography and two-dimensional gel electrophoresis, Mol. Cell. Proteomics 5 (2006) 2392–2411.
[71] X. Zuo, K.-B. Lee, D.W. Speicher, Fractionation of complex proteomes by microscale solution isoelectrofocusing using ZOOM IEF fractionators to improve protein profiling, Proteomics Protocols Handbook, 2005, pp. 97–117.
[72] M. Aivaliotis, W. Haase, M. Karas, G. Tsiotis, Proteomic analysis of chlorosome-depleted membranes of the green sulfur bacterium Chlorobium tepidum, Proteomics 6 (2006) 217–232.
[73] R. Henningsen, B.L. Gale, K.M. Straub, D.C. DeNagel, Application of zwitterionic detergents to the solubilization of integral membrane proteins for two-dimensional gel electrophoresis and mass spectrometry, Proteomics 2 (2002) 1479–1488.
[74] R.-P. Zahedi, C. Meisinger, A. Sickmann, Proteomics 5 (2005) 3581–3588.
[75] R.J. Simpson, L.M. Connolly, J.S. Eddes, J.J. Pereira, R.L. Moritz, G.E. Reid, Electrophoresis 21 (2000) 1707–1732.
[76] B.A. van Montfort, M.K. Doeven, B. Canas, L.M. Veenhoff, B. Poolman, G.T. Robillard, Combined in-gel tryptic digestion and CNBr cleavage for the generation of peptide maps of an integral membrane protein with MALDI-TOF mass spectrometry, Biochim. Biophys. Acta, Bioenerg. 1555 (2002) 111–115.
[77] T.T.T. Quach, N. Li, D.P. Richards, J. Zheng, B.O. Keller, L. Li, Development and applications of in-gel CNBr/tryptic digestion combined with mass spectrometry for the analysis of membrane proteins, J. Proteome Res. 2 (2003) 543–552.
[78] D.N. Perkins, D.J.C. Pappin, D.M. Creasy, J.S. Cottrell, Probability-based protein identification by searching sequence databases using mass spectrometry data, Electrophoresis 20 (1999) 3551–3567.
[79] C.C. Wu, J.R. Yates, The application of mass spectrometry to membrane proteomics, Nat. Biotechnol. 21 (2003) 262–267.
[80] D.K. Han, J. Eng, H. Zhou, R. Aebersold, Quantitative profiling of differentiation-induced microsomal proteins using isotope-coded affinity tags and mass spectrometry, Nat. Biotechnol. 19 (2001) 946–951.
[81] J.L. Norris, N.A. Porter, R.M. Caprioli, Mass spectrometry of intracellular and membrane proteins using cleavable detergents, Anal. Chem. 75 (2003) 6642–6647.
[82] M.P. Washburn, D. Wolters, J.R. Yates III, Large-scale analysis of the yeast proteome by multidimensional protein identification technology, Nat. Biotechnol. 19 (2001) 242–247.
[83] L.M. Simon, M. Kotorman, G. Garab, I. Laczko, Structure and activity of α-chymotrypsin and trypsin in aqueous organic media, Biochem. Biophys. Res. Commun. 280 (2001) 1367–1371.
[84] W.K. Russell, Z.Y. Park, D.H. Russell, Proteolysis in mixed organic-aqueous solvent systems: applications for peptide mass mapping using mass spectrometry, Anal. Chem. 73 (2001) 2682–2685.
[85] J. Blonder, M.L. Hale, D.A. Lucas, C.F. Schaefer, L.-R. Yu, T.P. Conrads, H.J. Issaq, B.G. Stiles, T.D. Veenstra, Proteomic analysis of detergent-resistant membrane rafts, Electrophoresis 25 (2004) 1307–1318.
[86] J. Blonder, M.B. Goshe, R.J. Moore, L. Pasa-Tolic, C.D. Masselon, M.S. Lipton, R.D. Smith, Enrichment of integral membrane proteins for proteomic analysis using liquid chromatography-tandem mass spectrometry, J. Proteome Res. 1 (2002) 351–360.
[87] J. Blonder, T.P. Conrads, L.-R. Yu, A. Terunuma, G.M. Janini, H.J. Issaq, J.C. Vogel, T.D. Veenstra, A detergent- and cyanogen bromide-free method for integral membrane proteomics: application to Halobacterium purple membranes and the human epidermal membrane proteome, Proteomics 4 (2004) 31–45.
[88] H. Zhong, Y. Zhang, Z. Wen, L. Li, Protein sequencing by mass analysis of polypeptide ladders after controlled protein hydrolysis, Nat. Biotechnol. 22 (2004) 1291–1296.
[89] H. Zhong, S.L. Marcus, L. Li, Microwave-assisted acid hydrolysis of proteins combined with liquid chromatography MALDI MS/MS for protein identification, J. Am. Soc. Mass Spectrom. 16 (2005) 471–481.
[90] N. Wang, L. MacKenzie, A.G. De Souza, H. Zhong, G. Goss, L. Li, Proteome profile of cytosolic component of zebrafish liver generated by LC-ESI MS/MS combined with trypsin digestion and microwave-assisted acid hydrolysis, J. Proteome Res. 6 (2007) 263–272.
[91] X. Chen, L. Sun, Y. Yu, Y. Xue, P. Yang, Amino acid-coded tagging approaches in quantitative proteomics, Expert Rev. Proteomics 4 (2007) 25–37.
[92] M. Mann, Functional and quantitative proteomics using SILAC, Nat. Rev., Mol. Cell Biol. 7 (2006) 952–958.
[93] T.J. Griffin, D.K.M. Han, S.P. Gygi, B. Rist, H. Lee, R. Aebersold, K.C. Parker, Toward a high-throughput approach to quantitative proteomic analysis: expression-dependent protein identification by mass spectrometry, J. Am. Soc. Mass Spectrom. 12 (2001) 1238–1246.
[94] S.P. Gygi, B. Rist, T.J. Griffin, J. Eng, R. Aebersold, Proteome analysis of low-abundance proteins using multidimensional chromatography and isotope-coded affinity tags, J. Proteome Res. 1 (2002) 47–54.
[95] H. Zhou, J.A. Ranish, J.D. Watts, R. Aebersold, Quantitative proteome analysis by solid-phase isotope tagging and mass spectrometry, Nat. Biotechnol. 20 (2002) 512–515.
[96] A. Leitner, W. Linder, Chemistry meets proteomics: the use of chemical tagging reactions for MS-based proteomics, Proteomics 6 (2006) 5418–5434.
[97] W.-J. Qian, M.E. Monroe, T. Liu, J.M. Jacobs, G.A. Anderson, Y. Shen, R.J. Moore, D.J. Anderson, R. Zhang, S.E. Calvano, S.F. Lowry, W. Xiao, L.L. Moldawer, R.W. Davis, R.G. Tompkins, D.G. Camp, R.D. Smith, Quantitative proteome analysis of human plasma following in vivo lipopolysaccharide administration using 16O/18O labeling and the accurate mass and time tag approach, Mol. Cell. Proteomics 4 (2005) 700–709.
[98] X. Yao, A. Freas, J. Ramirez, P.A. Demirev, C. Fenselau, Proteolytic 18O labeling for comparative proteomics: model studies with two serotypes of adenovirus. [Erratum to document cited in CA135:089410], Anal. Chem. 76 (2004) 2675.

[99] K.L. Johnson, D.C. Muddiman, A method for calculating 16O/18O peptide ion ratios for the relative quantification of proteomes, J. Am. Soc. Mass Spectrom. 15 (2004) 437–445.
[100] J.-L. Hsu, S.-Y. Huang, N.-H. Chow, S.-H. Chen, Stable-isotope dimethyl labeling for quantitative proteomics, Anal. Chem. 75 (2003) 6843–6852.
[101] C. Ji, L. Li, Quantitative proteome analysis using differential stable isotopic labeling and microbore LC-MALDI MS and MS/MS, J. Proteome Res. 4 (2005) 734–742.
[102] J.E. Melanson, S.L. Avery, D.M. Pinto, High-coverage quantitative proteomics using amine-specific isotopic labeling, Proteomics 6 (2006) 4466–4474.
[103] C. Ji, L. Li, M. Gebre, M. Pasdar, L. Li, Identification and quantification of differentially expressed proteins in E-cadherin deficient SCC9 cells and SCC9 transfectants expressing E-cadherin by dimethyl isotope labeling, LC-MALDI MS and MS/MS. [Erratum to document cited in CA143:169101], J. Proteome Res. 4 (2005) 1872.
[104] J.R. Kimmel, Guanidination of proteins, Methods Enzymol. 11 (1967) 584–589.
[105] A. Gilchrist, C.E. Au, J. Hiding, A.W. Bell, J. Fernandez-Rodriguez, S. Lesimple, H. Nagaya, L. Roy, S.J. Gosline, M. Hallett, J. Paiement, R.E. Kearney, T. Nilsson, J.J. Bergeron, Quantitative proteomics analysis of the secretory pathway, Cell 127 (2006) 1265–1281.
[106] J.J. Bergeron, M. Hallett, Peptides you can count, Nat. Biotechnol. 25 (2007) 61–62.
[107] R. Orme, C.W. Douglas, S. Rimmer, M. Webb, Proteomic analysis of Escherichia coli biofilms reveals the overexpression of the outer membrane protein OmpA, Proteomics 6 (2006) 4269–4277.