Specific isotopic labeling of hemes provides a unique opportunity to characterize the structure and function of heme-proteins. Unfortunately, current methods do not allow efficient labeling in high yields of multiheme cytochromes c, which are of great biotechnological interest. Here, a method for production of recombinant multiheme cytochromes c in Escherichia coli with isotopically labeled hemes is reported. A small tetraheme cytochrome of 12 kDa from Shewanella oneidensis MR-1 was used to demonstrate the method, achieving a production of 4 mg pure protein per liter. This method achieves, in a single step, efficient expression and incorporation of hemes isotopically labeled in specific atom positions adequate for spectroscopic characterization of these complex heme proteins. It is, furthermore, of general application to heme proteins, opening new possibilities for the characterization of this important class of proteins.
Heme proteins hold a special place in the development of modern biochemistry. Hemoglobin is aptly considered an honorary enzyme despite its physiological role as a diatomic gas transporter (1). Indeed, heme proteins perform ubiquitous cell functions such as electron transfer (cytochromes) (2-4), energy transduction (cytochrome c oxidase) (5, 6), catalysis (P450 and peroxidases) (7-10), or molecular sensing (chemotactic proteins) (11). One class of heme proteins has recently gathered considerable attention, being the focus of a Biogeochemistry Grand Challenge of the U.S. Department of Energy. These are multiheme cytochromes c, which mediate electron transfer at the microbe-mineral interface in geological settings and at the microbe-electrode interface in bioelectrochemical devices (12, 13). These cytochromes have been shown to play major roles in cellular respiration and to exist in almost all major groups of Bacteria and Archaea (14). In some of these organisms, such as representatives of Geobacter, Shewanella, Anaeromyxobacter, or Desulfovibrio genera, the number of multiheme cytochromes is so elevated that it corresponds to a high percentage of their proteome, having elicited the creation of the term “cytochromome” (15, 16). For Geobacter and Shewanella, many of these are known to be essential for the extracellular respiration that is at the core of electricity production in microbial fuel cells (17).
Given their biological importance, considerable effort has gone into the development of efficient methods for their recombinant production to facilitate their molecular characterization and/or their use in medical or biotechnological applications (12, 16-18). In comparison to other types of cytochromes, multiheme cytochromes c are more difficult to express correctly on two accounts: the various hemes must be covalently attached, through thioether linkages, to the polypeptide chain at the CXX(XX)CH binding site; and the correct distal axial ligand must be coordinated to the iron in the nascent protein. Both are essential to obtain the native fold of the protein. For this to occur properly in Gram-negative bacteria, specific molecular assembly helper proteins, collectively known as the cytochrome c maturation proteins CcmA-H (ccm cluster), are needed (19, 20). These proteins are responsible for the correct ligation of the heme to the apoprotein while it is translocated to the periplasmic space. This cytochrome c biogenesis system is denominated system I and is the most complex of the presently known cytochrome c biogenesis systems, allowing the maturation of a variety of c-type cytochromes under different conditions (20-22).
Among the many systems available for recombinant protein expression, the bacterium Escherichia coli is one of the most attractive hosts because of its fast growth to high density in inexpensive media, its well-characterized genetics, and the availability of a large number of cloning vectors. With the insertion of a plasmid containing the ccm cluster (23, 24), E. coli becomes capable of expressing c-type cytochromes with correctly inserted hemes under aerobic conditions. Also, under these conditions, E. coli has the advantage of expressing only the recombinant c-type cytochromes. This considerably simplifies the protein purification procedure in comparison to other expression hosts described in the literature (25-29), which also produce their own native c-type cytochromes under the expression conditions used.
Multiheme cytochromes c are also more difficult to analyze with respect to their detailed functional properties, due to the multiple combinations of electron distribution that can occur among the various hemes. Since the heme cofactors are the functional components of multiheme proteins, their specific isotopic labeling is an attractive strategy to analyze the structure and function of these proteins. Toward this end, several approaches have been developed involving the supplementation of the growth media with specifically labeled heme cofactors (30, 31) or with the isotopically labeled heme precursor δ-aminolevulinic acid (dALA) (31-33). Also, to guarantee that the uptake of these supplements is efficient, bacterial strains that are incapable of synthesizing heme cofactors were created by deleting genes that are responsible for the biosynthesis of dALA in the cell, such as the hemA gene (33, 34).
However, the methods presently published do not allow isotopic labeling of the hemes in multiheme c-type cytochromes (30-33), preventing the study of these more complex cytochromes. Here, a method that allows efficient expression of recombinant multiheme cytochromes c with specific isotopic labeling in the various hemes is reported. This method will bring an enormous advantage for the characterization of this important class of proteins by several spectroscopic techniques.
Nuclear magnetic resonance (NMR), resonance Raman, and Fourier transform infrared spectroscopy (FTIR), are among the spectroscopic methods capable of probing the structure and function of multiheme cytochromes c. All stand to benefit from the spectral simplification afforded by the isotopic labeling of specific heme carbons.
Material and methods
Chemical reagents
All chemical reagents were obtained from Sigma-Aldrich (St. Louis, MO, USA) with the exception of the BugBuster protein extraction reagent (Merck KGaA, Darmstadt, Germany) and the Complete protease inhibitor cocktail tablets (Roche, Basel, Switzerland).
Synthesis of labeled dALA
The method reported by Bunce et al. (35) was used to synthesize 1,2-13C-labeled dALA. This compound gives rise to hemes with 13C-labeled atoms in the methyl positions and also at the β- and carboxylate carbons of the propionate groups (36).
Construction of the hemA knockout in E. coli strain JM109(DE3)
The E. coli strain JM109(DE3) was purchased from Promega (Fitchburg, WI, USA). To generate a ΔhemA mutant in E. coli JM109(DE3), the method developed by Datsenko & Wanner to disrupt chromosomal genes was used (37). First, the hemA gene was replaced with a kanamycin resistance (kan) gene, which was subsequently excised. To do this, the plasmid pKD4 was used as a template and the region that contained the kan resistance gene flanked by Flp recombination target (FRT) sites was amplified. Primers homologous to the flanking sites of the kan gene were constructed. These also carried nucleotide extensions homologous to the chromosomal regions adjacent to the hemA gene (forward: 5′-CAGACTAACCCTATCAACGT-TGGTATTATTTCCCGCAGACGT-GTAGGCTGGAGCTGCTTC-3′ and reverse: 5′-GGCGTAAATGCACCCT-GTAAAAAAAGAAAATGATGTACT-GCATATGAATATCCTCCTTAG-3′). The PCR product consisted of the kan gene flanked by 40-bp sequences homologous to the regions adjacent to the hemA gene of E. coli. Before inserting the amplified region, the E. coli cells were transformed with the plasmid pKD46 in order to obtain better transformation and recombination rates. This plasmid contains the genes of the phage λ Red recombination system. Kanamycin-resistant clones were selected on Luria-Bertani (LB) medium supplemented with 50 mg/L dALA and 50 mg/L kanamycin. The kan gene was subsequently eliminated using the helper plasmid pCP20, which encodes the Flp recombinase and promotes recombination at the FRT sites. Because the antibiotic resistance gene is flanked by the FRT sites, this step excises it from the chromosome, leaving behind only a small residual scar. Successful disruption of the hemA gene was verified by PCR. This hemA knockout E. coli mutant was denominated LS542.
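The primer layout described above (a 40-nt chromosomal homology arm followed by a region that primes on pKD4) can be checked programmatically. The following is a minimal Python sketch and not part of the original protocol. The function names are hypothetical, and the sketch assumes that the hyphens in the printed primer are typesetting separators rather than part of the sequence.

```python
# Forward primer as printed in the text (5'->3'); hyphens are assumed to be
# typesetting separators and are stripped before analysis.
FORWARD = "CAGACTAACCCTATCAACGT-TGGTATTATTTCCCGCAGACGT-GTAGGCTGGAGCTGCTTC"

HOMOLOGY_LENGTH = 40  # homology arm length stated in the text


def split_primer(primer, homology_length=HOMOLOGY_LENGTH):
    """Return (homology_arm, priming_region) for a 5'->3' primer string."""
    seq = "".join(base for base in primer.upper() if base in "ACGT")
    if len(seq) <= homology_length:
        raise ValueError("primer is not longer than the homology arm")
    return seq[:homology_length], seq[homology_length:]


def reverse_complement(seq):
    """Reverse complement, useful for mapping a reverse primer onto the top strand."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


homology, priming = split_primer(FORWARD)
print("homology arm  :", homology, len(homology), "nt")
print("priming region:", priming, len(priming), "nt")
```

Running the same split on the reverse primer, and taking the reverse complement of its homology arm, gives the sequence expected on the top strand downstream of hemA.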
Construction of the E. coli strains used for protein expression
The plasmid pEC86 (23), containing the ccmABCDEFGH (cytochrome c maturation) genes, was introduced via electroporation into LS542 to create LS543.
The expression vectors pET21a-stC(D2N) (38) and pKP1 (39), containing the genes encoding the mutant STC(D2N) and native MtrA, respectively, were subsequently introduced into LS543. These strains were denominated LS543-STC(D2N) and LS544, respectively.
In order to compare expression levels between the hemA knockout E. coli mutant strain and the wild-type (wt) strain E. coli JM109(DE3), the plasmids pEC86 and pET21a-stC(D2N) were also introduced into wt. This strain was denominated ROL002.
With the exception of the wild-type, all strains were grown in media supplemented with 50 mg/L dALA. The supplementation of the media with dALA is essential for growth of the mutant strains.
Optimization of protein expression
To optimize the expression, the E. coli strains LS543-STC(D2N), LS544, and ROL002 were tested using several overexpression methods for cytochromes. In all cases, the temperature, dALA concentration, IPTG concentration, and induction period were varied in a controlled manner.
The overexpression of proteins in a rich medium with an induction step at a defined OD600nm was tested (24). This was done using LB medium (10 g/L tryptone, 5 g/L yeast extract, 10 g/L NaCl) and also Terrific Broth (TB) medium (12 g/L tryptone, 24 g/L yeast extract, 4 mL glycerol, 2.31 g/L KH2PO4, 12.54 g/L K2HPO4) supplemented with 35 mg/L chloramphenicol, 100 mg/L ampicillin, and 50 mg/L dALA. Cells were grown overnight at 30°C and 180 rpm, and 1% of this culture served as inoculum for the fresh medium. Batches of fresh medium were supplemented with different concentrations of dALA, ranging from 20 to 200 mg/L, in order to determine the best dALA concentration to overexpress the protein. The culture was then allowed to grow at 180 rpm, at a defined temperature (25°, 30°, or 37°C), until an OD600nm of approximately 0.8 was reached. Protein expression was induced by addition of different concentrations of IPTG, ranging from 0 to 1 mM.
The method developed by Fernandes et al. (40) for isotopic labeling of c-type multiheme cytochromes was also tested. The cells were first grown in 1 L LB medium supplemented with 50 mg/L dALA, 35 mg/L chloramphenicol, and 100 mg/L ampicillin at 30°C and 180 rpm. After reaching an OD600nm of approximately 1.5, the cells were harvested by centrifugation at 6400× g for 30 min. The cell pellet was washed twice with a salt solution containing 15 g/L KH2PO4, 38 g/L Na2HPO4·H2O, and 2.5 g/L NaCl, in order to remove nutrients left behind by the rich medium. The cells were subsequently resuspended in 250 mL minimal medium containing 5 g/L NH4Cl, 0.5 g/L MgSO4, 0.01 g/L CaCl2, 1 mg/L MnCl2·4H2O, 2.8 mg/L FeSO4·7H2O, 1× BME vitamins solution, and either 4 g/L glucose or 7 mL/L 50% glycerol stock solution as carbon source, supplemented with 35 mg/L chloramphenicol, 100 mg/L ampicillin, and 1 mM dALA. Cells were allowed to recover for a period of 1 h before inducing protein expression. Different concentrations of IPTG, ranging from 0 to 1 mM, and also different temperatures (25°, 30°, or 37°C) were used.
The auto-induction method developed by Studier (41) that had previously been used to produce heme proteins in E. coli (42) was also tested. The cells were first grown overnight in LB medium supplemented with 50 mg/L dALA, 35 mg/L chloramphenicol, and 100 mg/L ampicillin at 30°C and 180 rpm. One percent of this culture served as inoculum. The auto-induction medium was made by adding 50 mL stock solution 20× NPS [1 M Na2HPO4, 1 M KH2PO4, 500 mM (NH4)2SO4], 20 mL stock solution 50× 5052 (25% glycerol, 2.5% glucose, 10% α-lactose monohydrate) to 1 L LB or TB medium supplemented with 1 mM MgSO4, 0.1 mM FeSO4, 35 mg/L chloramphenicol, and 100 mg/L ampicillin. The optimal dALA concentration was determined, by testing concentrations varying from 20 to 200 mg/L. Different temperatures (25°, 30°, and 37°C) were also tested in order to optimize protein expression.
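The working concentrations that result from adding the 20× NPS and 50× 5052 stocks can be obtained from a simple dilution calculation. The sketch below is illustrative only: the function name is hypothetical, and whether the final volume is taken as a nominal 1 L or as the sum of all added volumes is an assumption exposed as a parameter, since the text does not specify it.

```python
def final_concentration(stock_conc, stock_volume_ml, total_volume_ml):
    """Dilution rule C_final = C_stock * V_stock / V_total (units follow stock_conc)."""
    return stock_conc * stock_volume_ml / total_volume_ml


# Stock compositions and volumes as given in the text.
NPS_STOCK_MM = {"Na2HPO4": 1000.0, "KH2PO4": 1000.0, "(NH4)2SO4": 500.0}
S5052_STOCK_PERCENT = {"glycerol": 25.0, "glucose": 2.5, "alpha-lactose": 10.0}
NPS_VOLUME_ML = 50.0
S5052_VOLUME_ML = 20.0

# Assumption: nominal 1 L final volume. Use 1070.0 instead if the stocks are
# added on top of 1 L of LB or TB and the volumes are treated as additive.
TOTAL_VOLUME_ML = 1000.0

for name, stock_mm in NPS_STOCK_MM.items():
    conc = final_concentration(stock_mm, NPS_VOLUME_ML, TOTAL_VOLUME_ML)
    print(f"{name}: {conc:.1f} mM")

for name, stock_pct in S5052_STOCK_PERCENT.items():
    conc = final_concentration(stock_pct, S5052_VOLUME_ML, TOTAL_VOLUME_ML)
    print(f"{name}: {conc:.3f} %")
```

With the nominal 1 L volume, the 50× 5052 stock dilutes to 0.5% glycerol, 0.05% glucose, and 0.2% α-lactose, which is where the "5052" name comes from.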
In order to test different induction periods, aliquots (1 mL) were taken each hour from each different culture condition and centrifuged to collect the cell pellet. The pellets were subsequently lysed using BugBuster protein extraction reagent. The supernatants were run on a SDS-PAGE, and the gels were stained using the heme staining procedure developed by Francis & Becker (43).
The gels were visualized and the stained protein bands were compared and analyzed for their intensity using the ImageJ v1.44 program (http://rsbweb.nih.gov/ij/index.html).
All growth conditions and analyses were performed in triplicate.
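Band intensities measured in triplicate can be summarized as a mean and standard deviation relative to a reference condition. The following Python sketch shows one way to do this. The function name, the condition labels, and all intensity values are hypothetical placeholders and are not data from this work.

```python
from statistics import mean, stdev


def summarize_relative_intensity(intensities, reference):
    """Normalize each replicate to the mean of the reference condition and
    return {condition: (mean, sample standard deviation)}."""
    reference_mean = mean(intensities[reference])
    summary = {}
    for condition, values in intensities.items():
        relative = [value / reference_mean for value in values]
        summary[condition] = (mean(relative), stdev(relative))
    return summary


# Hypothetical integrated band intensities (arbitrary units), three replicates each.
example = {
    "reference": [1000.0, 1100.0, 900.0],
    "condition A": [2100.0, 1900.0, 2000.0],
}

for condition, (avg, sd) in summarize_relative_intensity(example, "reference").items():
    print(f"{condition}: {avg:.2f} +/- {sd:.2f}")
```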
Protein production and purification
The auto-induction method previously described was used to overexpress the mutant STC(D2N) multiheme cytochrome in E. coli strain LS543-STC(D2N). The auto-induction medium was supplemented with 35 mg/L chloramphenicol, 100 mg/L ampicillin, 50 mg/L unlabeled dALA to produce unlabeled hemes, or 50 mg/L 1,2-13C-labeled dALA (35) to produce specific isotopically labeled hemes.
The cells were grown for 27 h at 30°C and 180 rpm in conical flasks filled with one-fifth the total volume. Expression of STC(D2N) was induced spontaneously by depletion of the glucose present in the medium. There was a clear change in the medium's color to orange. The bacterial cells were harvested by centrifugation at 4°C and at 10,000× g for 15 min. The cell pellet was disrupted by gentle stirring for 20 min at room temperature with 5 mL BugBuster protein extraction reagent per gram of wet cell paste supplemented with a protease inhibitor cocktail. Cell debris were removed by centrifugation at 16,000× g for 20 min at 4°C. The supernatant containing the soluble extract was loaded directly onto a Q-Sepharose column (GE Healthcare, Little Chalfont, UK), equilibrated previously with 10 mM Tris buffer, pH 7.6. A salt gradient from 0 to 1 M NaCl in 10 mM Tris, pH 7.6, was applied and the fraction containing STC(D2N) was eluted at 200 mM. The final purification step was performed on a hydroxylapatite (HTP) column (Bio-Rad Laboratories, Hercules, CA, USA), preequilibrated with 10 mM phosphate buffer, pH 7.6. The fraction containing STC(D2N) did not bind to the column and was eluted in the washout volume.
The chromatographic fractions were routinely analyzed by SDS-PAGE and UV-visible spectroscopy to select those containing STC(D2N). The purity of the protein was confirmed by a single band on SDS-PAGE. Pure samples present an absorption ratio A408nm/A280nm of approximately 4.5 in the UV-visible spectrum.
NMR sample preparation and experiments
Protein for NMR experiments was lyophilized twice using 2H2O (99.9 atom %). The protein was dissolved in approximately 500 µL 2H2O to a final concentration of approximately 0.5 mM in 10 mM phosphate buffer at pH 7.0. The pH value reported is a direct reading without correction for the isotope effect. NMR spectra obtained before and after the lyophilization were identical, showing that the protein structure was not affected by this procedure. The NMR experiments were performed with Bruker Avance and Avance II spectrometers operating at 500 MHz (Bruker BioSpin, Wissembourg, France). 2D-1H-13C heteronuclear multiple quantum correlation (HMQC) spectra were obtained with 2048 points covering a spectral width of 39.7 kHz in the 1H dimension and 256 increments with time-proportional phase incrementation (TPPI) to give a spectral width of 34 kHz in the 13C dimension, using a Δ delay fixed at 3.2 ms, with 16 scans, at a temperature of 25°C. 1D-1H spectra were always performed before and after each 2-D spectrum, in order to verify that no changes had occurred in the protein sample. Partially reduced samples were prepared by flushing out the oxygen from the NMR tube with nitrogen gas and subsequently adding controlled amounts of a freshly prepared 10 mM sodium dithionite solution using a gas-tight syringe. The 1H spectra were calibrated using the water signal as an internal reference. The Bruker TopSpin program (Bruker BioSpin, Wissembourg, France) was used to visualize and analyze the NMR spectra.
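Some of the quantities above are easier to interpret after simple unit conversions. The sketch below converts the stated spectral widths to ppm and estimates the protein mass in one NMR sample. It is an illustrative calculation only: the function names are hypothetical, the 13C/1H frequency ratio is an approximate literature value, and the nominal 500 MHz, 0.5 mM, 500 µL, and 12 kDa figures are used as exact numbers for simplicity.

```python
# Approximate ratio of 13C to 1H resonance frequencies (assumption of this sketch).
FREQ_RATIO_13C_TO_1H = 0.25145


def khz_to_ppm(spectral_width_khz, carrier_mhz):
    """Convert a spectral width in kHz to ppm at the given carrier frequency in MHz."""
    return spectral_width_khz * 1e3 / carrier_mhz


def protein_mass_mg(concentration_mm, volume_ul, molecular_mass_kda):
    """Mass of protein (mg) in a sample of given concentration, volume, and mass."""
    moles = concentration_mm * 1e-3 * volume_ul * 1e-6  # mol
    grams = moles * molecular_mass_kda * 1e3            # g
    return grams * 1e3                                  # mg


proton_mhz = 500.0
carbon_mhz = proton_mhz * FREQ_RATIO_13C_TO_1H

print(f"1H width : {khz_to_ppm(39.7, proton_mhz):.1f} ppm")
print(f"13C width: {khz_to_ppm(34.0, carbon_mhz):.0f} ppm")

mass = protein_mass_mg(0.5, 500.0, 12.0)
print(f"protein per NMR sample: {mass:.1f} mg")
# At the reported yield of about 4 mg pure protein per liter of culture:
print(f"culture needed: {mass / 4.0:.2f} L")
```

The wide windows in both dimensions are consistent with the large paramagnetic shifts expected for low-spin ferric hemes.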
Results and discussion
Isotopic labeling of hemes using labeled dALA was originally reported by Druyan et al. (44), who labeled the hemes of rat liver cytochromes. Wachenfeldt et al. (45) and Rivera et al. (32) used isotopically labeled dALA to label hemes of monoheme cytochromes from bacteria. This was achieved by expressing the protein in a minimal medium supplemented with isotopically labeled dALA. This proved the concept for this type of strategy, although expression in minimal media has a much lower yield relative to expression in complex media, such as LB. However, using this method in a complex medium would lead to expression of proteins containing hemes produced from unlabeled sources available in the medium, decreasing the efficiency of isotopic labeling. A method for isotopic labeling of multiheme cytochromes c using complex medium was proposed by Fernandes et al. (40) that uses two steps of growth. Cells are initially grown in rich medium, harvested, washed, and resuspended in minimal medium supplemented with dALA. However, this procedure is experimentally cumbersome and not very efficient, since the overexpression step also occurs in minimal medium, which greatly limits the expression yields.
Although Woodward et al. (30) and Bryson et al. (33) reported strategies to isotopically label hemes, using mutants of E. coli incapable of synthesizing either the heme or dALA, respectively, these methods are limited with respect to the type of cytochromes that can be expressed. In particular, c-type cytochromes could not be obtained. Nonetheless, these developments opened the possibility of using rich media, such as LB, supplemented with heme or dALA under aerobic conditions to overexpress cytochromes.
Taking advantage of this progress, an E. coli mutant unable to synthesize dALA was created. The commercially available strain JM109(DE3), which had already proven its value for correctly overexpressing large multiheme cytochromes c efficiently, was used as source organism (39). This modified strain was then transformed with the vector pEC86 containing the ccmABCDEFGH (cytochrome c maturation) genes, that endows E. coli with the ability to correctly incorporate hemes in multiheme cytochromes c, when growing aerobically (23). This increases the versatility of this novel ΔhemA E. coli strain with respect to the type of cytochromes it can express efficiently, versus previous methods (30, 32, 33).
Deletion of the hemA gene does not affect the capability of the E. coli strains to express multiheme cytochromes efficiently, provided that the expression medium is supplemented with dALA. This was verified by comparing the multiheme cytochrome expression in the wt E. coli JM109(DE3) strain containing pEC86 with the expression in the ΔhemA E. coli strain LS543 (Figure 1A). For this, an expression vector containing a gene for the 12-kDa small tetraheme cytochrome (STC) from Shewanella oneidensis MR-1, mutated in a surface aspartate to an asparagine (D2N) was inserted in both strains. This modification does not affect the structure of the mutant versus native form, as determined by the pattern of the paramagnetic shifts in the NMR spectra collected in oxidized state (see Figure 2, B and C) but is expected to affect the interaction with physiological partners due to changes in surface electrostatics (38).


Moreover, to confirm that the ΔhemA E. coli LS543 strain is capable of expressing larger and more complex multiheme cytochromes c, an expression vector containing the gene of the decaheme cytochrome MtrA from S. oneidensis MR-1 of approximately 37 kDa, was used. Figure 1B shows that the resulting ΔhemA E. coli LS544 strain is also capable of expressing this protein, opening the door to the detailed characterization of these larger multiheme c-type cytochromes.
The auto-induction method developed by Studier (41) showed the best expression yields compared with the other protein expression methods tested (Figure 1A). This approach has the advantage of allowing the induction to occur gradually. This gradual process is essential for correct incorporation of the hemes, and allows the cultures to reach higher cell densities, thereby increasing the protein yield. A yield of approximately 4 mg pure STC(D2N) per liter of cell culture was obtained. To the best of our knowledge, this is the highest yield obtained to date for isotopically labeled multiheme cytochromes (40) and is comparable to other strategies used for overexpressing nonlabeled multiheme cytochromes c (25, 26, 46).
Thus, this expression method allows the efficient production of specifically isotopically labeled hemes and also their correct incorporation into a multiheme cytochrome c (Figures 2 and 3).

Using 1,2-13C-labeled dALA causes the incorporation of 13C at the methyl groups at the periphery of the heme macrocycle and also at the β- and carboxylate carbons of the propionate groups (Figure 2A). Figure 2 shows that the level of adventitious unlabeled carbon in the methyl positions is below the detection limit of the NMR experiments, as confirmed by the lack of residual peaks at the center of each doublet in spectra obtained without 13C decoupling (Figure 2A). In low-spin paramagnetic cytochromes, the heme methyl signals are reasonably sharp and typically located in a clean spectral region in the 13C dimension (Figure 3). This allows a simple identification of these NMR signals and facilitates their assignment, since the remainder of the protein contains 13C only at natural abundance (approximately 1%).
This now opens the possibility to characterize in detail the structure and function of multiheme cytochromes containing a large number of hemes, thanks to the greater spectral dispersion obtained in the 13C frequency versus the 1H frequency (Figure 3). The position of the heme methyl signals in low-spin paramagnetic hemes can be used to determine the orientation of the axial ligands and the placement of the magnetic axes system associated with the unpaired electron (47). When a multiheme cytochrome is titrated, the position of the methyl signals changes in ways that can be related to the oxidized fraction, allowing the determination of the relative reduction potentials of the hemes (48). The specific 13C labeling enabled by the method reported here is further suitable for characterizing proteins of large size or containing paramagnetic centers (49), because it allows the use of direct heteronuclear detection experiments such as 13C-13C NOESY, which may be more suitable than 1H-based experiments in these cases.
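The relationship between methyl signal position and oxidized fraction can be illustrated with a simple linear interpolation. The sketch below assumes that the shift of a heme methyl signal varies linearly between its values in the fully reduced and fully oxidized protein. This is a simplifying assumption of this example and not a description of the analysis in reference 48. The function name and the chemical shift values are hypothetical.

```python
def oxidized_fraction(delta_observed, delta_reduced, delta_oxidized):
    """Oxidized fraction of a heme from the position of one of its methyl signals,
    assuming the shift scales linearly between the reduced and oxidized limits."""
    span = delta_oxidized - delta_reduced
    if span == 0:
        raise ValueError("reduced and oxidized shifts must differ")
    return (delta_observed - delta_reduced) / span


# Hypothetical 1H shifts (ppm) of one heme methyl along a redox titration.
delta_red, delta_ox = 3.5, 30.0
for observed in (3.5, 12.0, 21.0, 30.0):
    fraction = oxidized_fraction(observed, delta_red, delta_ox)
    print(f"{observed:5.1f} ppm -> oxidized fraction {fraction:.2f}")
```

Comparing the fractions obtained for different hemes at the same titration stage indicates which hemes oxidize first, and therefore their relative reduction potentials.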
Also, since dALA is a versatile labeling source for hemes, with different labeled carbons in dALA, different kinds of information can be obtained. For instance, considering NMR applications, using 5-13C-labeled dALA, the heme carbons attached to the meso protons can be labeled. Measurements of the residual dipolar coupling (RDC) of these signals provide information on the relative spatial orientation of the hemes (50).
A further general advantage of the method described here when applied to NMR spectroscopy is that the need for a highly concentrated sample, or even a pure sample, may be eliminated. Under aerobic conditions E. coli only expresses the cytochrome of interest, assuring the specific and efficient use of the labeled dALA in the biosynthesis of the hemes for this protein. Therefore, the 13C NMR spectrum is dominated by the signals of the labeled hemes. This advantage may facilitate the future characterization of multiheme c-type cytochromes that are difficult to express and purify, such as those associated with cell membranes, and may also allow the in-cell characterization of cytochromes.
In conclusion, a strategy to efficiently produce multiheme cytochromes labeled at selected carbons in the hemes was developed. The simplicity of the method and its ability to produce isotopically labeled multiheme c-type cytochromes with a yield comparable to that obtained from the expression of unlabeled proteins make this approach potentially applicable to many different heme proteins. This is true even for those cytochromes that are not of the c-type and therefore do not require covalent attachment of the heme to the polypeptide chain. The methodology will also enable the detailed structural and functional characterization of large multiheme cytochromes. A detailed characterization of these proteins, which mediate microbe-mineral or microbe-electrode contact, is essential to develop rationally designed bioelectrochemical devices and bioengineered systems for bioenergy production and bioremediation of environmental contaminants (13).
The plasmid pEC86 used in this work was a gift from Prof. L Thöny-Meyer. B.M.F. is the recipient of a PhD fellowship from Fundação para a Ciência e Tecnologia (FCT; SFRH/BD/41205/2007). L.S. was supported by the Subsurface Biogeochemical Research program/Office of Biological and Environmental Research, U.S. Department of Energy. Research in the authors' laboratories was supported by grants PTDC/BIA-PRO 098158/2008, MIT-Pt BS-BB/1014/2008 from FCT awarded to R.O.L. and a grant from the National Science Foundation (MCB-0818488) awarded to M.R. This work was also supported by FCT through grant PEst-OE/EQB/LA0004/2011.
The authors declare no competing interests.