DnaG
Created by Sophia Lim
DnaG (PDB ID = 3B39) is a bacterial primase from Escherichia coli with a molecular weight (MW) of 65564.63 Da and an isoelectric point (pI) of 5.68, as obtained from Expasy (2). Primases synthesize RNA oligonucleotides (primers) on single-stranded DNA (ssDNA) in both prokaryotic and eukaryotic organisms at the start of DNA replication. Primases are also involved in lagging strand synthesis and replication restart. They are essential for initiating these processes because DNA polymerases are incapable of de novo synthesis and can only elongate existing strands (4); as such, primases are fundamentally important for cell proliferation.
DnaG is a globular, physiological homodimer (6). Each identical subunit is composed of three domains: a 12 kDa N-terminal zinc binding domain, a 15 kDa C-terminal helicase binding domain, and a catalytic 36 kDa RNA polymerase domain situated between them (8). Dimerization is not necessary for in vitro function (13), since purified monomeric forms show no loss of activity (12). However, the dimer is a physiological requirement for primer synthesis at origins of replication because two zinc binding domains are required for stable interactions with the major groove of double-stranded DNA, or dsDNA (9).
The N-terminal zinc binding domain (ZBD) includes residues 1-110 (11) and binds stoichiometric amounts of zinc. The central RNA polymerase domain (RPD), spanning residues 111-433 (11), is the catalytic core of DnaG and is the main portion of the crystallized protein; it contains the active site of primer synthesis. The active site is not the only ssDNA binding site: a recently identified binding groove on the surface of DnaG also enables ssDNA tracking (4). The C-terminal helicase binding domain (CTD) includes residues 434-581 (13). This region enables interaction with DnaB, a bacterial replicative helicase, and prevents primase movement, thus restricting the enzyme to the vicinity of the replication fork where DnaB is unwinding dsDNA to ssDNA (4). Binding DnaB also increases the catalytic rate of DnaG 15-fold (6).
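The domain boundaries quoted above can be written down as a small lookup table. The following is a minimal Python sketch for illustration only: the names `DOMAINS` and `domain_of` are hypothetical and do not come from any library, and the residue ranges are simply those given in the text (11, 13).

```python
# Domain boundaries for E. coli DnaG as quoted in the text
# (1-indexed, inclusive). DOMAINS and domain_of are illustrative
# names chosen for this sketch, not part of any published tool.
DOMAINS = [
    ("ZBD", 1, 110),    # N-terminal zinc binding domain
    ("RPD", 111, 433),  # central RNA polymerase domain
    ("CTD", 434, 581),  # C-terminal helicase binding domain
]


def domain_of(residue: int) -> str:
    """Return the label of the domain containing a 1-indexed residue."""
    for name, start, end in DOMAINS:
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside the 1-581 range")
```

For example, `domain_of(50)` returns "ZBD" and `domain_of(500)` returns "CTD"; a residue number outside 1-581 raises a `ValueError`.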
The secondary structure of each RPD subunit contains 18 alpha helices (48%) and 12 beta strands (16%); additionally, there are several random coils as well as 3₁₀ helices (3). Furthermore, there is a pair of β-hairpins in the novel DNA binding groove that is not only functionally important for ligand association but also ensures the correct polarity of ssDNA around the active site (4).
The DnaG-ssDNA complex is made possible by two types of interaction. First, minimal hydrogen bonding exists between the novel binding groove and the phosphate backbone of ssDNA. Second, van der Waals interactions occur between residue side chains and ssDNA nucleotides. Trp-165 contributes to the van der Waals interactions via π stacking of its indole ring with Ade-3 in the co-crystallized ssDNA oligonucleotide. Hydrogen bonding between Arg-201/Tyr-197 and the phosphodiester bond between Ade-4 and Gua-5 also enables formation of the DnaG-ssDNA complex. In an alternative crystallized conformation, however, these hydrogen bonds are not seen; in their place are interactions between Asn-191 and Ade-3/Ade-4 and between Arg-199 and Gua-5/Cyt-6. Switching between these two crystallized conformations slides the ssDNA through the binding groove by roughly 6 Å, making ssDNA tracking and processing possible (4).
The absence of both ssDNA binding and primer synthesis in a W165A mutant enzyme shows that Trp-165 is essential for DnaG function. Arg-199 and Arg-201 are also critical residues, as established by similar mutagenesis experiments (R199A and R201A). When only one of Arg-199 or Arg-201 is substituted by alanine, ligand binding is still possible, although synthesis activity is eliminated; when both residues are mutated, no DnaG-ssDNA complex forms (4).
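The mutagenesis outcomes described above are easier to compare side by side. The sketch below tabulates them in Python; `Phenotype`, `MUTANTS`, and `binding_but_no_synthesis` are hypothetical names introduced for illustration. Two entries are assumptions rather than statements from the text: the wild-type row, and the absence of synthesis in the double mutant (inferred from its failure to bind ssDNA).

```python
from typing import Dict, List, NamedTuple


class Phenotype(NamedTuple):
    binds_ssdna: bool          # does a DnaG-ssDNA complex form?
    synthesizes_primer: bool   # is primer synthesis observed?


# Outcomes as summarized in the text (4). The wild-type row and the
# double mutant's synthesis value are inferred, not stated explicitly.
MUTANTS: Dict[str, Phenotype] = {
    "wild type": Phenotype(True, True),
    "W165A": Phenotype(False, False),
    "R199A": Phenotype(True, False),
    "R201A": Phenotype(True, False),
    "R199A/R201A": Phenotype(False, False),
}


def binding_but_no_synthesis() -> List[str]:
    """List variants that still bind ssDNA but make no primer."""
    return [name for name, p in MUTANTS.items()
            if p.binds_ssdna and not p.synthesizes_primer]
```

Calling `binding_but_no_synthesis()` picks out the two single arginine mutants, which separates the requirement for binding from the requirement for catalysis.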
The active site of the RPD does not bind ssDNA with high affinity (4). Its inability to accommodate more than 10 base pairs may explain the slow processing capability of DnaG (11). The low affinity association between DnaG and ssDNA may be a necessity considering the ubiquitous nature of primase and its continual role in DNA replication. Yet despite the nonspecific interaction between DnaG and its ssDNA ligand, there is a clear sequence preference. DnaG has a higher affinity for ssDNA containing a 5’-CTG-3’ trinucleotide, which facilitates the initiation of primer synthesis; in vitro, the sequence appears to be essential (5). This may be because binding to the CTG trinucleotide results in a tighter interaction with dsDNA, which drives the start of primer synthesis on nearby ssDNA, but the mechanism of this preference is not yet fully understood (9).
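The sequence preference described above suggests a simple way to scan a template for candidate initiation sites. The following Python sketch is only a string search for the 5’-CTG-3’ trinucleotide; the function name `find_initiation_sites` and the example sequence are hypothetical, and the search says nothing about binding affinity or whether priming would actually occur at a given site.

```python
from typing import List


def find_initiation_sites(template: str, motif: str = "CTG") -> List[int]:
    """Return 1-indexed start positions of `motif` in an ssDNA string.

    The template is assumed to be written 5' to 3'. Matching is
    case-insensitive and overlapping occurrences are all reported.
    """
    seq = template.upper()
    target = motif.upper()
    width = len(target)
    return [i + 1
            for i in range(len(seq) - width + 1)
            if seq[i:i + width] == target]


# Made-up example template, not a real genomic sequence.
sites = find_initiation_sites("AACTGTTCTGA")
print(sites)  # [3, 8]
```

Positions are reported 1-indexed to match the residue and nucleotide numbering convention used elsewhere in this article.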
Although the primary ligand of DnaG’s catalytic domain is ssDNA, two other ligands are required for DnaG activity: zinc, bound by the ZBD, and DnaB, bound by the CTD (12, 14). Interactions with ssDNA binding protein and DNA polymerase III holoenzyme are also observed (8). In the ZBD, N-terminal cysteine residues are involved in coordinating divalent metals. The ZBD is crucial for primase activity, as in vitro priming of phage DNA was not observed in a mutant DnaG with an N-terminal 27 residue deletion (12). The protein-protein interaction between DnaB and the CTD of DnaG not only localizes DnaG to the replication fork but also orients DnaG appropriately to allow primer synthesis. This complex forms through the association of a C-terminal helical hairpin on the DnaG CTD with the N-terminus of DnaB. The association restricts the conformational mobility of the C-terminal helix (13).
Primases are attractive drug targets due to their significance in bacterial and viral replication. Because primase inhibitors are inherently anti-replicative, and therefore anti-proliferative, they can also be utilized as anticancer agents. A common mechanism of inhibition is chain termination, which does not prevent formation of the DnaG-ssDNA complex. Inhibition is instead achieved indirectly, without directly blocking synthetic activity, by supplying nucleotide analogs that are incapable of forming phosphodiester bonds. One example of an effective class of inhibitor is the 2’,3’-dideoxynucleoside 5’-triphosphates (ddNTPs). Their incorporation into primers precludes chain elongation because they lack the 3’ hydroxyl group necessary for forming a phosphodiester bond with the next nucleotide. Other chain terminating inhibitors, such as triphosphate derivatives of cytarabine or vidarabine, are also nucleotide analogs (5).
Type II DNA topoisomerase (topo II, PDB ID = 2ZBK) was found to have both primary and tertiary structural homology to DnaG. Like DnaG, topo II plays a role in DNA replication; thus, this enzyme, and particularly its ATP binding site, is an attractive drug target with both antibacterial and anticancer applications (10). By cleaving and then religating dsDNA, topo II resolves the issue of DNA supercoiling while DnaB is unwinding dsDNA into ssDNA. Topo II is much larger than homodimeric DnaG: it is a heterotetramer of two A subunits and two B subunits (11). The two A subunits are identical to each other, as are the two B subunits.
Subunit A is the only portion of topo II with sequence homology to DnaG; its MW is 45055.02 Da, and its pI is 8.76 (2). According to Dali server results, subunit A from topo II of Sulfolobus shibatae (Z = 8.9) exhibits tertiary structure similarity with DnaG as well (7). The Z score indicates folding similarities between proteins; ergo, a higher Z score indicates greater tertiary homology. The lower threshold for significance is 2, so 8.9 is a notable result. Additionally, there is primary structure homology between topo II and DnaG, even though the former did not appear in PSI-BLAST (Position-Specific Iterated Basic Local Alignment Search Tool) searches of the latter. (The PSI-BLAST results were exclusively other primases with E values close to 0, indicating good overlap and the absence of gaps in subject and query sequences, but there was no enzymatic variation.) Approximately 100 residues within the catalytic core of DnaG are similar to select topoisomerases and to other, otherwise dissimilar proteins (11).
The tertiary structure similarity indicated by the Dali server most likely refers to the TOPRIM fold. Named for the proteins in which it was first discovered (topoisomerase and primase), this sub-domain is located in the catalytic RPD of primase, spanning residues 259-341, and is also present in the catalytic region of topo II. The TOPRIM sub-domain in DnaG has an α/β N-terminal domain, made up of a few α helices and an antiparallel β-sheet, on one side and a C-terminal helical sub-domain on the other. In contrast, topo II possesses a central, parallel β-sheet surrounded by three α-helices. Only about 10% of the TOPRIM sequence from topoisomerase is conserved in primases, but five residues are present in all TOPRIM sequences. In DnaG, they are Glu-265, Asp-309, Asp-311, and two glycines that are of structural importance (11).