Catabolite Activator Protein
Catabolite activator protein complexed with DNA and cAMP (1J59) from Escherichia coli 
Created by Emily Symes
     The catabolite activator protein (CAP) of Escherichia coli (PDB ID = 1J59) is a transcription activator protein that binds, in the presence of the allosteric effector cAMP, to specific DNA sites located within operons to either activate or suppress transcription from a nearby promoter (1).  Studying CAP helps explain how gene expression is positively regulated and how a single common signal can control the transcription of several functionally related genes (2).  These functions depend on the structural features of CAP and on CAP-DNA binding.  The structure of CAP allows the protein to interact with both its allosteric effector, cAMP, and the DNA molecule.  CAP-DNA binding also illustrates the role of DNA bending in the formation of specific hydrogen bonding interactions that confer stability to the complex.
     The catabolite activator protein is 209 residues long and has a molecular weight of 44603.02 Da (3).  Its isoelectric point is 8.56, indicating that the primary structure contains more basic than acidic amino acids and that the protein carries a net positive charge at physiological pH (4).  CAP consists of two symmetrical subunits, each composed of the same 209 amino acid sequence.  Each subunit contains large regions of positive potential that complex with the negatively charged phosphate backbone of DNA.  This causes the DNA to bend by as much as 150°, which is necessary for CAP to interact with the entire 28 base pair binding site found on DNA (1).  Without this conformational change in DNA, the increased distance between DNA bases and amino acid residues of CAP would not allow for the formation of stabilizing hydrogen bonds and van der Waals interactions.
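The link between isoelectric point and net charge can be made concrete with a short calculation. The following is a minimal sketch, assuming approximate textbook pKa values for the ionizable groups and a Henderson-Hasselbalch treatment of each group as independent; the pKa table, the function name net_charge, and the toy peptides are illustrative assumptions and are not taken from the CAP sequence or from reference (4).

```python
# Approximate side-chain pKa values (assumed textbook-style values;
# published pKa tables differ slightly from one another).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.3, "C": 8.3, "Y": 10.1}
N_TERMINUS_PKA = 9.0  # assumed
C_TERMINUS_PKA = 2.0  # assumed


def net_charge(sequence: str, ph: float) -> float:
    """Estimate the net charge of a peptide at a given pH.

    Each ionizable group is treated independently using the
    Henderson-Hasselbalch relation.
    """
    # Fraction of the N-terminal amino group that is protonated (+1).
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERMINUS_PKA))
    # Fraction of the C-terminal carboxyl group that is deprotonated (-1).
    charge -= 1.0 / (1.0 + 10 ** (C_TERMINUS_PKA - ph))
    for residue in sequence.upper():
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


if __name__ == "__main__":
    # Hypothetical toy peptides, not fragments of CAP.
    basic_peptide = "AKRKA"
    acidic_peptide = "ADEDA"
    print(net_charge(basic_peptide, 7.4))   # positive: more basic residues
    print(net_charge(acidic_peptide, 7.4))  # negative: more acidic residues
```

A sequence with more basic than acidic residues keeps a positive net charge at physiological pH, and its computed charge only crosses zero at a higher pH, which is the sense in which an isoelectric point above 7.4 implies a net positive charge.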
     The secondary structure of the two subunits of CAP consists of beta sheets, alpha helices, 3₁₀ helices, and random coils.  Regular hydrogen bonding between amino acid residues in alpha helices and beta sheets maintains the conformation of CAP, which is required for precise DNA interactions.  Though less frequent and in some cases irregular, hydrogen bonding interactions in 3₁₀ helices and random coils also contribute to the structural stability of CAP.  The secondary structure is additionally characterized by a helix-turn-helix DNA binding motif (5).  This motif is composed of two alpha helices joined by a short string of amino acids and is found in many transcriptional regulatory proteins (6).  One of the alpha helices is located at the C-terminus of the motif, while the other is located at the N-terminus.  The C-terminal alpha helix contributes to DNA recognition by binding to the major groove of DNA through hydrogen bonding and van der Waals interactions.  The N-terminal alpha helix stabilizes the DNA-protein interaction through the same intermolecular forces.  In CAP, two helix-turn-helix motifs exist in the DNA binding region.  One is located on the exterior portion of subunit A, while the other is symmetrically located in subunit B.  Each consists of two alpha helices connected by a region of random coil.  These structural characteristics allow for hydrogen bonding between amino acid residues that provides stability and specificity to the DNA-binding site of CAP.
     Specific hydrogen bonding interactions are important to achieve the sharp bend of DNA in the CAP-DNA binding complex.  In this complex, the DNA molecule wraps toward the sides of CAP.  One subunit of CAP interacts with half of the DNA binding site, while the other subunit interacts with the remaining half.  Arg-180, Glu-181, and Arg-185 of CAP form hydrogen bonds with DNA base edges (5).  The guanidinium side chain of Arg-180 in complex A forms hydrogen bonds with O6 and N7 atoms of G5.  In complex B, Arg-180 forms a symmetrical hydrogen bond with G18’.  The carboxylate side chain of Glu-181 hydrogen bonds with the nitrogen atom of C7’ in complex A and the nitrogen atom of C16 in complex B.  Finally, Arg-185 forms water-mediated hydrogen bonds with T15 in complex A and with G14 in complex B.  The formation of these hydrogen bonds is important for the stability of the CAP-DNA complex.
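The dependence of these hydrogen bonds on DNA bending comes down to distance: a donor and an acceptor atom must be close enough to interact. The sketch below illustrates the idea under stated assumptions. The 3.5 Å donor-acceptor cutoff is a common heuristic rather than a value from the references, angle criteria are ignored, and the coordinates are hypothetical placeholders, not atoms read from the 1J59 structure.

```python
import math

# Assumed heuristic: heavy-atom donor-acceptor distance of at most 3.5 angstroms.
HBOND_CUTOFF_ANGSTROM = 3.5


def within_hbond_distance(donor_xyz, acceptor_xyz, cutoff=HBOND_CUTOFF_ANGSTROM):
    """Return True if two atoms are close enough to form a hydrogen bond.

    Only the distance criterion is checked; donor-hydrogen-acceptor
    geometry is deliberately left out of this sketch.
    """
    return math.dist(donor_xyz, acceptor_xyz) <= cutoff


if __name__ == "__main__":
    # Hypothetical coordinates in angstroms (illustrative only).
    arg_guanidinium_n = (0.0, 0.0, 0.0)
    guanine_o6_bent_dna = (2.9, 0.0, 0.0)      # base brought close by bending
    guanine_o6_straight_dna = (6.0, 0.0, 0.0)  # base too far away without bending

    print(within_hbond_distance(arg_guanidinium_n, guanine_o6_bent_dna))
    print(within_hbond_distance(arg_guanidinium_n, guanine_o6_straight_dna))
```

With real data, the same check could be applied to atom coordinates parsed from a PDB file, with the cutoff adjusted to whatever convention the analysis follows.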
     Arg-180, Glu-181, and Arg-185 are functionally important residues due to their ability to recognize and bind to three of the four most highly conserved base pairs in the DNA binding site: G•C base pair 5, G•C base pair 7, and A•T base pair 8.  Removal of the side chains of Arg-180 and Glu-181 results in loss of binding specificity at base pair positions 5 and 7, respectively (5).  This suggests that the N-H groups of the guanidinium side chain of Arg-180, acting as hydrogen bond donors, and the carboxylate oxygens of Glu-181, acting as hydrogen bond acceptors, are required for stable DNA-protein interactions.  These highly conserved amino acids are therefore vitally important for forming the CAP-DNA complex through direct readout, that is, direct hydrogen bonding interactions between amino acids and DNA base pairs.  CAP also forms hydrogen bonds between amino acids and the phosphate backbone of DNA.  Specifically, these interactions involve Ser-179 and Thr-182.  In subunits A and B, the hydroxyl-containing side chains of Ser-179 and Thr-182 each form one hydrogen bond with the negatively charged phosphates of nucleotides 9’ and 14 (5).  The hydroxyl groups of these highly conserved amino acids are therefore required for stable CAP-DNA binding.
      CAP binds to the cofactor adenosine-3’,5’-cyclic-monophosphate (cAMP) in order to acquire the ability to bind to DNA and to regulate transcription.  cAMP binds CAP through both water-mediated and direct hydrogen bonds.  The backbone carbonyl of Glu-81 and the amide group of Gln-125 interact with a water molecule that is hydrogen bonded to the phosphate group of cAMP (7).  Side-chain hydroxyl groups of Thr-127 and Ser-83 interact with two water molecules that are hydrogen bonded to N1 and N7 of the adenine residue, respectively.  The 2’-OH of cAMP forms hydrogen bonds with the backbone of Gly-71, which has no side chain beyond a hydrogen atom, and the side chain of Glu-72 in a binding pocket formed by residues 70 to 73.  Leu-73 and the fused rings of the adenine residue participate in hydrophobic van der Waals interactions.  Together, these hydrogen bonds and van der Waals interactions confer stability and longevity to cAMP binding.  These characteristics are important because cAMP acts as an allosteric effector to mediate CAP-DNA binding.  The intracellular concentration of cAMP changes in response to different nutrient states of the extracellular environment.  CAP may then regulate expression of genes that encode catabolic enzymes required to metabolize various nutrients.  For instance, in the presence of glucose, the intracellular concentration of cAMP drops, so CAP no longer activates transcription of the lac operon (8).  This causes the cell to metabolize the available glucose rather than lactose as a carbon and energy source.
     Studies involving mutant CAP molecules have shown that DNA binding is dependent upon certain amino acid residues.  For instance, hydrogen bonding between oxygen and nitrogen atoms of DNA, the guanidinium side chain of Arg-180, and the carboxylate side chain of Glu-181 is necessary for stable DNA binding (9).  Mutations conferring amino acid substitutions at position 181 of CAP have demonstrated that hydrogen bonding between the carboxylate side chain of Glu-181 and the nitrogen atoms of C7’ and C16 is essential for direct DNA binding (10).  Also, studies of alternative amino acid sequences in the protein’s DNA binding site have shown that the sequence GNGA is important for specific binding of DNA (9).  Mutation in this sequence results in loss of the ability to express genes for catabolic enzymes.  As a result, the cell cannot metabolize certain nutrients when they are available in the extracellular environment.  This consequence demonstrates the importance of sequence-specific binding to the function of CAP and to the survival of the bacterium Escherichia coli.
     When the CAP sequence is compared to other protein primary structures through a protein BLAST search, the cAMP receptor protein (CRP) from the CRP/FNR family of the bacterium Mycobacterium tuberculosis (PDB ID = 3I54) has a significantly similar sequence (11).  Both structures associate with an adenosine-3’,5’-cyclic-monophosphate (cAMP) ligand and are responsible for activation and repression of transcription.  As with CAP, cAMP acts as an allosteric effector with CRP by changing the protein’s conformation to allow DNA binding.  Both proteins contain a helix-turn-helix motif involved in DNA binding.  Because the 249 amino acid CRP aligns with CAP with a low E value of 5e-21 (or 5 x 10^-21), the primary structure of CAP may be genetically conserved in M. tuberculosis.  The E value estimates the number of alignments with an equal or better score expected by chance, so a low E value indicates similar primary structure.  Using the Dali server to compare tertiary structures, the transcriptional regulator showed similarity with a Z score of 21.9 (12).  A Z score greater than 2 is significant.  One notable difference between CRP and CAP is that CRP consists of four identical subunits, each of which binds to one cAMP molecule, whereas CAP has only two identical subunits and can bind one molecule of cAMP.  The presence of additional subunits suggests a more extensive DNA binding site when the protein is allosterically activated by cAMP, meaning that the function of CRP may involve transcriptional regulation through binding of a longer sequence of DNA.
     The virulence factor regulator of Pseudomonas aeruginosa (PDB ID = 2OZ6) also showed significant similarity to CAP’s tertiary structure with a Z score of 28.4 (12).  Like CAP, the virulence factor regulator is a symmetric protein that binds DNA and acts as a global transcription regulator (13).  cAMP also acts as an allosteric effector to allow for DNA binding at two sites containing the helix-turn-helix motif, as it does in both CAP and the CRP of M. tuberculosis.  From the protein’s low E value, 4e-97 (4 x 10^-97), it can be determined that the 207 amino acid primary structure of the virulence factor regulator is also significantly similar to the 209 amino acid primary structure of CAP (11).  Unlike CAP and CRP, the virulence factor regulator consists of a single subunit and binds to two molecules of cAMP, as opposed to one molecule in the case of CAP and four molecules in the case of CRP.  The single subunit of the virulence factor regulator suggests a less extensive DNA binding site and, consequently, a shorter DNA binding sequence.  Therefore, transcriptional regulation by the virulence factor regulator may be mediated by recognition of a less extensive DNA sequence than transcriptional regulation by CAP.
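The two comparisons above can be summarized side by side. The following sketch collects the E values and Z scores quoted in the text and applies simple significance checks. The Z score threshold of 2 comes from the text; the E value cutoff of 1e-5 is an assumed, commonly used convention rather than a figure from the references, and the Z score of 21.9 is assumed here to refer to the M. tuberculosis CRP. The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One protein compared against CAP (values quoted from the text)."""
    name: str
    pdb_id: str
    e_value: float   # sequence comparison (BLAST)
    z_score: float   # structure comparison (Dali)


E_VALUE_CUTOFF = 1e-5  # assumption: a commonly used convention
Z_SCORE_CUTOFF = 2.0   # from the text: a Z score greater than 2 is significant

HITS = [
    Hit("cAMP receptor protein (M. tuberculosis)", "3I54", 5e-21, 21.9),
    Hit("virulence factor regulator (P. aeruginosa)", "2OZ6", 4e-97, 28.4),
]


def is_significant(hit: Hit) -> bool:
    """A hit counts as significant if it passes both the E value and Z score checks."""
    return hit.e_value < E_VALUE_CUTOFF and hit.z_score > Z_SCORE_CUTOFF


def rank_by_sequence_similarity(hits):
    """Order hits from most to least similar in sequence (lowest E value first)."""
    return sorted(hits, key=lambda hit: hit.e_value)


if __name__ == "__main__":
    for hit in rank_by_sequence_similarity(HITS):
        print(hit.pdb_id, hit.name, is_significant(hit))
```

Sorting by E value places the virulence factor regulator ahead of the CRP of M. tuberculosis, consistent with its much lower E value and higher Z score in the comparisons described above.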
     In conclusion, the catabolite activator protein is a transcriptional activator protein that binds to the allosteric effector, cAMP, in order to acquire the ability to bind DNA.  DNA binding is facilitated by the structure that results from the protein’s amino acid sequence.  The secondary structure of CAP confers structural stability through hydrogen bonding interactions and provides two helix-turn-helix binding sites for DNA.  Hydrogen bonding and van der Waals interactions facilitate stable binding of cAMP when it is present in the intracellular environment, allowing for conformational change and binding of DNA.  Specific hydrogen bonding interactions through amino acid side chains of CAP give stability and specificity to interactions with the highly conserved DNA binding sequence.  Similarities between CAP, the cAMP receptor protein, and the virulence factor regulator show that conservation of primary and tertiary structure can be indicative of similarities in protein function, as all of these proteins are involved in transcriptional regulation.  The specificity of CAP’s structure and the conservation of its amino acid sequence are both required for proper interactions between the protein, cAMP, and DNA and for transcriptional regulation, which indicates the essential link between protein structure and function.