Green Fluorescent Protein (PDB: 4EUL) from Aequorea victoria

Created by: Benjamin Campbell

Green fluorescent protein (PDB: 4EUL), often called enhanced green fluorescent protein, is a widely used variant of the wild-type green fluorescent protein (PDB: 1GFL). Wild-type green fluorescent protein, or wt-GFP, occurs naturally in the jellyfish Aequorea victoria. wt-GFP has a major excitation peak at 395 nm and a minor excitation peak at 475 nm. Excitation at either peak results in emission peaking at 505 nm. Osamu Shimomura first discovered wt-GFP in 1962 and identified it as a companion protein to the famous chemiluminescent protein aequorin. The chemiluminescence of aequorin is blue, with an emission peak near 470 nm, but the light observed from the jellyfish host was green. It was later determined that wt-GFP converts the blue emission of aequorin into the green fluorescence observed in the jellyfish (1). The molecular weight of wt-GFP is 26,854.26 Da, and its isoelectric point (pI) is 5.80 (2).
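
As a rough illustration of the wavelengths quoted above, the following Python sketch converts each peak to a photon energy using E = hc/λ and reports the energy lost between excitation and emission (the Stokes shift). The function names are illustrative only and do not come from any cited source; the wavelengths are the ones given in the text.

```python
# Photon energy from wavelength: E = h * c / wavelength.
PLANCK_EV_S = 4.135667696e-15   # Planck constant, eV*s
LIGHT_SPEED_M_S = 2.99792458e8  # speed of light, m/s


def photon_energy_ev(wavelength_nm: float) -> float:
    """Energy in eV of a single photon at the given wavelength in nm."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return PLANCK_EV_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9)


def stokes_shift_ev(excitation_nm: float, emission_nm: float) -> float:
    """Energy lost per photon between excitation and emission, in eV."""
    return photon_energy_ev(excitation_nm) - photon_energy_ev(emission_nm)


# Peak positions for wt-GFP as stated in the text.
EMISSION_NM = 505.0
for label, excitation_nm in (("major", 395.0), ("minor", 475.0)):
    shift = stokes_shift_ev(excitation_nm, EMISSION_NM)
    print(f"{label} excitation peak at {excitation_nm:.0f} nm: "
          f"{photon_energy_ev(excitation_nm):.2f} eV absorbed, "
          f"{shift:.2f} eV lost before emission at {EMISSION_NM:.0f} nm")
```

Because energy is inversely proportional to wavelength, the 395 nm peak sits much further in energy from the 505 nm emission than the 475 nm peak does.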

The major significance of wt-GFP lies in its wide range of applications in biochemistry and cell biology. wt-GFP has the unusual ability to fluoresce without any cofactors or substrates except molecular oxygen. Therefore, wt-GFP can easily be used to tag proteins within a cell and monitor their expression levels or localization patterns (3). If the gene encoding wt-GFP is joined in frame with the gene for a protein of interest, the result is a fusion protein with wt-GFP covalently linked to either the N- or C-terminus (1). This technique is popular in many cellular imaging applications.

wt-GFP exists as an 11-stranded β-barrel with a core 3₁₀ helix containing the chromophore. β-sheets account for 52 percent of the secondary structure, 3₁₀ helices for 9 percent, and α-helices for 2 percent. The remaining 37 percent of the protein exists as random coils, which link the strands of the β-barrel. The β-barrel structure helps to protect the chromophore from quenching through water dipoles, paramagnetic oxygen, or cis-trans isomerization (3). This gives wt-GFP the ability to fluoresce across a range of pH values and redox potentials and to resist photobleaching (1).

The chromophore of wt-GFP is a p-hydroxybenzylideneimidazolinone formed from residues Ser-65, Tyr-66, and Gly-67 (1). The chromophore is formed via a three-step autocatalytic process that occurs after GFP folds into its native conformation (Image 1). The first step is the nucleophilic attack of the amide nitrogen of Gly-67 on the carbonyl carbon of Ser-65. This results in the formation of an intrachain five-membered ring consisting of atoms from the original peptide chain. This ring is then oxidized by molecular oxygen to form a cyclic imine. Finally, the hydroxyl group that was formed during nucleophilic attack on the carbonyl carbon is eliminated as a water molecule, accompanied by a proton transfer (4). The chromophore can exist in two states: deprotonated (anionic), corresponding to the 475 nm excitation peak, or protonated (neutral), corresponding to the 395 nm excitation peak (1). A large number of polar residues and structured water molecules surround the mature chromophore and help to tune its absorbance and fluorescence properties through ionic and hydrogen-bonding interactions (1). These interactions allow for the possibility of GFP mutants with differing excitation and emission spectra.

Although wt-GFP is commonly used in many cell biology applications as a fusion tag, it still has a few significant shortcomings. As mentioned before, wt-GFP has two excitation peaks, a major one at 395 nm and a minor one at 475 nm. The 475 nm excitation peak is more advantageous for imaging applications, but it suffers from a relatively low amplitude (5). Also, the folding efficiency of the protein is quite low at physiologically relevant temperatures (3). Finally, the chromophore maturation half-life is close to 30 minutes, which limits the use of wt-GFP as a reporter for rapid gene induction (4, 5). To address the limitations of wt-GFP, molecular biologists created the wt-GFP variant called enhanced green fluorescent protein (PDB: 4EUL), or EGFP, which contains the mutations Phe-64-Leu (F64L) and Ser-65-Thr (S65T) (3). EGFP has a molecular weight of 26,897.43 Da and a pI of 5.58 (2). EGFP exists as a monomer and has the same secondary structures as wt-GFP because the mutations occur within the core 3₁₀ helix and chromophore and do not affect overall protein structure. F64L increases the folding efficiency of the protein at 37°C (3). S65T is responsible for the suppression of the major excitation peak at 395 nm and the augmentation of the preferred 475 nm peak (5).
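
To see why a maturation half-life near 30 minutes limits rapid reporting, the sketch below models chromophore maturation as a single first-order process. That kinetic model is an assumption made here for illustration (the text gives only the half-life), and the function names are hypothetical.

```python
import math


def fraction_mature(minutes: float, half_life_min: float = 30.0) -> float:
    """Fraction of folded protein with a mature chromophore after `minutes`.

    Assumes simple first-order kinetics: f(t) = 1 - 0.5 ** (t / t_half).
    """
    if minutes < 0 or half_life_min <= 0:
        raise ValueError("time must be non-negative and half-life positive")
    return 1.0 - 0.5 ** (minutes / half_life_min)


def minutes_to_reach(fraction: float, half_life_min: float = 30.0) -> float:
    """Time needed for the mature fraction to reach `fraction` (0 <= f < 1)."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return -half_life_min * math.log2(1.0 - fraction)


for t in (10, 30, 60, 120):
    print(f"after {t:>3} min: {fraction_mature(t):.0%} mature")
print(f"time to 90% mature: {minutes_to_reach(0.90):.0f} min")
```

Under this assumption, only half of the folded protein is fluorescent 30 minutes after folding, and reaching 90 percent takes well over an hour, so expression changes that occur within minutes are reported with a substantial delay.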

The crystal structure of EGFP was determined at a resolution of 1.35 Å. EGFP was isolated from E. coli lysate, purified using ion exchange chromatography, and precipitated using a 75 percent weight by volume (w/v) solution of ammonium sulfate. Precipitated EGFP was then resuspended in a buffer solution and crystallized using the sitting drop vapor diffusion method at 4°C. Crystals were obtained in 0.1 M MES/NaOH, 200 mM calcium acetate, and 20 percent (w/v) PEG 8000 supplemented with 13 percent (w/v) PEG 200. Diffraction data were collected on beamline I02 at the Diamond Light Source. The crystal structure of EGFP shows residues Lys-3 to Leu-231. Residues 1, 2, and 232-239 could not be resolved because of significantly increased mobility in the regions of the N- and C-termini (3).

The high-resolution crystal structure helps explain how the mutations change the folding and fluorescence characteristics of EGFP. F64L replaces the bulky aromatic side chain of phenylalanine with the smaller, yet still hydrophobic, side chain of leucine within the protein core. The decrease in side chain size allows numerous residues, such as Val-29, Leu-18, Trp-57, and Phe-27, to shift towards the chromophore. These shifts could increase the folding efficiency of EGFP by allowing better packing of the β-barrel and reducing exposure of hydrophobic residues to the surrounding solvent (3).

Analysis of the crystal structure surrounding S65T illustrates the influence that Glu-222 has on the fluorescence characteristics of the chromophore. Glu-222 is shown to exist in two distinct conformations. In one conformation, the side chain donates a hydrogen bond to the hydroxyl group of Thr-65. In the other, the side chain donates a hydrogen bond to the hydroxyl group of Ser-205. For both conformations, the carboxylic acid side chain of Glu-222 must be protonated, and therefore neutral, to serve as a hydrogen bond donor. This removes any chance of electrostatic repulsion between a negative charge on Glu-222 and the anionic form of the chromophore. Thus, the chromophore is free to exist entirely in the deprotonated state, leading to a single EGFP excitation peak at 475 nm (3).

The ligands present in the crystal structure of EGFP are calcium (Ca²⁺), sulfate (SO₄²⁻), and di(hydroxyethyl) ether (PEG). None of these ligands are discussed in the literature; therefore, they are thought to have no specific function other than aiding in the crystallization of EGFP. In fact, one of the most important features of EGFP and wt-GFP is that these proteins do not require ligands to fluoresce.

The success of wt-GFP and EGFP as reporters in the field of cell biology led researchers to search for more fluorescent proteins with different excitation and emission spectra. Of the fluorescent proteins found, DsRed (PDB: 1G7K) from the sea anemone Discosoma was especially valuable because its red-shifted fluorescence was visually distinct from that of GFP. mCherry (PDB: 2H5Q) is an optimized mutant of Discosoma DsRed that is monomeric and fluoresces brighter than its predecessor. mCherry also has a faster chromophore maturation rate, improved photostability, and improved ability to serve as an N-terminal fusion tag (6).

When compared to EGFP, mCherry is similar in many respects. mCherry is a small protein with a molecular weight of 26,722.19 Da and a pI of 5.62 (2). The ExPASy Bioinformatics Resource Portal was used to compute the theoretical molecular weight and pI of each protein from its FASTA sequence. PSI-BLAST assigned mCherry an E-value of 5 × 10⁻⁸⁸ in the search query for EGFP. This indicates that the primary structures of mCherry and EGFP are significantly similar. In fact, mCherry has greater than 25 percent sequence identity to EGFP (7). The E-value estimates how many alignments with an equal or better score would be expected by chance, and PSI-BLAST derives it from the total sequence similarity between the subject protein and the query. Gaps are assigned when amino acids exist in one sequence but not in the other. Sequence similarity decreases the E-value whereas gaps increase it; any E-value less than 0.5 is considered significant. Both mCherry and EGFP also exist as β-barrel structures with an internal chromophore-containing helix. The Dali Server confirmed these similarities in tertiary structure by assigning mCherry a Z-score of 30.3 when compared to EGFP (8). The Dali Server measures similarities in tertiary structure by comparing intramolecular distances. A Z-score greater than 2 indicates that two proteins have similar folds and overall tertiary structure.
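
The theoretical molecular weights quoted above come from ExPASy. The sketch below shows the general form of such a calculation, assuming the usual approach of summing average residue masses and adding one water for the free termini. It is not ExPASy's code, its rounding may differ slightly from ExPASy's output, and the tripeptides used are only the chromophore-forming residues 65-67 written as free peptides for illustration.

```python
# Average residue masses in Da (amino acid minus one water), standard values.
RESIDUE_MASS_DA = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_DA = 18.01528  # one water accounts for the free N- and C-termini


def molecular_weight(sequence: str) -> float:
    """Theoretical average molecular weight (Da) of an unmodified peptide."""
    sequence = sequence.strip().upper()
    if not sequence:
        raise ValueError("sequence is empty")
    unknown = set(sequence) - RESIDUE_MASS_DA.keys()
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(RESIDUE_MASS_DA[aa] for aa in sequence) + WATER_DA


# Residues 65-67 before (Ser-Tyr-Gly) and after (Thr-Tyr-Gly) the S65T mutation.
for peptide in ("SYG", "TYG"):
    print(f"{peptide}: {molecular_weight(peptide):.2f} Da")
```

Passing a full FASTA sequence (header line removed) to the same function gives a whole-protein estimate; the value applies to the unmodified chain and ignores the small mass change from chromophore maturation.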

mCherry has an excitation peak at 587 nm and an emission peak at 610 nm. Subtle differences in the maturation of the chromophore and the area surrounding the chromophore give mCherry different fluorescent qualities than EGFP. In addition to the previously discussed chromophore maturation reaction, the chromophore of mCherry undergoes one more oxidation step to an acylimine. Also, the chromophore environment of mCherry is substantially more hydrophobic than that of EGFP (9). The structural similarities and fluorescent distinctions between mCherry and EGFP illustrate an important characteristic of fluorescent proteins: they are easily tunable. Today, researchers have either discovered or created fluorescent proteins that cover almost the entire visible spectrum. A few amino acid substitutions can greatly alter the excitation and emission peaks, fluorescence intensity, and maturation rate. For this reason, fluorescent proteins such as EGFP and mCherry remain irreplaceable tools in cell biology and biochemistry.