DHX36

DHX36 with an N-terminal truncation bound to ADP-AlF4 (PDB ID: 5VHD) was obtained from Bos taurus, commonly known as cattle. In guanine-rich nucleic acid sequences, both DNA and RNA, four guanines can form a cyclic network of hydrogen bonds and spontaneously assemble into a structure called a G-quartet. G-quartets are planar and can therefore stack on top of each other to form a four-stranded helical structure called a G-quadruplex. G-quadruplexes are often found in telomeres as well as in gene promoters. At telomeres, these structures can protect the nucleic acid by folding it so that nucleases cannot attack the sequence (1). The same folding can hinder replication and transcription of DNA and translation of RNA. Most polymerases cannot unfold a G-quadruplex because the structure is too stable for the force that most polymerases can exert. DEAH-box polypeptide 36 (DHX36), a helicase that belongs to the DEAH/RHA family, can bind to both DNA and RNA G-quadruplexes and unfold them. The DEAH/RHA family consists of ATP-dependent proteins that unwind nucleic acids and contain an Asp-Glu-Ala-His motif. DHX36 is often found bound to G-quadruplexes in eukaryotic cells. In mice, it is also essential for heart development, the production of blood cells and platelets, and embryonic development (2).
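
As a rough illustration of what a guanine-rich, potentially G-quadruplex-forming sequence looks like, the sketch below scans a sequence for four runs of at least three guanines separated by short loops. The pattern (loops of 1 to 7 nucleotides) is a commonly used search heuristic and is an assumption here, not a criterion taken from the cited references; the function name and example sequences are hypothetical.

```python
import re

# Heuristic (assumption): four G-runs of >= 3 guanines, separated by loops of
# 1-7 nucleotides. A match only marks a candidate; it does not prove folding.
G4_PATTERN = re.compile(
    r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}"
)


def find_putative_g4(sequence):
    """Return (start_index, matched_text) for each non-overlapping candidate motif."""
    seq = sequence.upper()
    return [(m.start(), m.group()) for m in G4_PATTERN.finditer(seq)]


# Four copies of the telomeric repeat TTAGGG contain four GGG runs.
telomere_like = "TTAGGGTTAGGGTTAGGGTTAGGG"
# A sequence without any run of three guanines.
no_motif = "ATATATCGCGATATAT"

print(find_putative_g4(telomere_like))
print(find_putative_g4(no_motif))
```

A sequence that matches is only a candidate; whether it actually folds, and into which topology, depends on conditions the pattern does not capture.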

The N-terminally truncated DHX36 bound to ADP-AlF4 has a molecular weight of 100068.31 daltons and an isoelectric point of 7.61 according to ExPASy (3). It has 870 residues in a single unique protein chain. It is 42% helical and 15% beta sheet. The protein was crystallized by hanging-drop vapor diffusion, and its structure was determined by X-ray diffraction with phasing by molecular replacement and single-wavelength anomalous dispersion (SAD) (4). Three domains are present in the crystal structure: two RecA-like domains and a C-terminal domain. These domains form the core of the protein. Aside from the core, the full-length protein has glycine-rich elements and a DHX36-specific motif (DSM) that is essential for binding to the G-quadruplex (2). The glycine-rich elements and the DSM are not present in this crystal structure.

Several variants of DHX36 were crystallized, including DHX36 bound to a 24-nucleotide DNA sequence and DHX36 with a mutation at Tyr-69, in order to understand the unfolding mechanism and the roles of specific residues. In DHX36 lacking the glycine-rich element but bound to a 24-nucleotide DNA sequence, the three domains are arranged in a trefoil, and the N-terminus folds into two alpha helices that contain the DSM. The DSM helix extends away from the core of the protein to reach the 5' face of the G-quadruplex. Single-stranded DNA can pass through a positively charged groove between the RecA2 and C-terminal domains; double-stranded DNA, however, can neither pass through the groove nor form a stable association with DHX36 (2). The energy released by ATP hydrolysis changes the conformation of the RecA-like domains and allows the DNA or RNA to translocate through the groove (5).

Binding of DHX36 aids unfolding by destabilizing the G-quadruplex. When DHX36 binds to the DNA sequence, the G-quadruplex no longer consists of three canonical G-quartets stacked on top of each other. Instead, the top G-quartet is replaced by a non-canonical A-T-G-G quartet because the G-17 residue, which is the 3'-most guanine of the bottom G-quartet, is pulled away by the helicase into the 3' single-stranded region. This shifts the DNA sequence by one residue and makes the G-quadruplex less stable. As mentioned previously, the DSM allows DHX36 to bind to the G-quadruplex. The alpha-helical DSM residues Ile-65, Trp-68, Tyr-69, and Ala-70 form a hydrophobic core and produce a flat non-polar surface that stacks on top of the upper quartet of the G-quadruplex. If Tyr-69 is mutated, this association is weakened and DHX36 can no longer bind to the G-quadruplex. Another important component of binding is the oligonucleotide/oligosaccharide-binding-fold-like (OB) subdomain of the C-terminal domain. The C-terminal side of the DSM helix and the first loop of the OB subdomain form hydrogen bonds with the G-quadruplex (2).

DHX36 binds to both DNA and RNA because it primarily contacts the phosphate groups of the nucleic acid backbone. The nucleotide bases themselves therefore have little effect on binding. In contrast, DHX36 is highly sensitive to strand polarity and prefers to bind to parallel G-quadruplexes (2). In parallel G-quadruplexes all guanines adopt the anti glycosidic conformation, whereas antiparallel G-quadruplexes contain both syn and anti guanines (6). Parallel and antiparallel G-quadruplexes can occur in both DNA and RNA (7). DHX36 is inactive on completely antiparallel G-quadruplexes and has reduced activity on G-quadruplexes that are partially parallel and partially antiparallel. In these topologies, the diagonal and lateral loops sterically interfere with DSM binding and with binding of the OB subdomain (2).
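
The syn/anti rule above can be sketched as a small classifier. This is a simplified illustration that assumes the glycosidic conformation of every quartet guanine is already known; the function name and labels are hypothetical, and real topology assignment also relies on strand directions and loop types.

```python
def classify_topology(glycosidic_conformations):
    """Classify a G-quadruplex from per-guanine 'syn'/'anti' labels.

    Simplified rule from the text: all 'anti' suggests a parallel fold,
    while a mix of 'syn' and 'anti' suggests an antiparallel or mixed fold.
    """
    confs = [c.lower() for c in glycosidic_conformations]
    if not confs or any(c not in ("syn", "anti") for c in confs):
        raise ValueError("expected a non-empty list of 'syn'/'anti' labels")
    if all(c == "anti" for c in confs):
        return "parallel"
    if all(c == "syn" for c in confs):
        # Not covered by the rule in the text, so it is left undecided here.
        return "unclassified"
    return "antiparallel or hybrid"


# Twelve guanines (three quartets), all anti.
print(classify_topology(["anti"] * 12))
# Alternating syn/anti guanines.
print(classify_topology(["syn", "anti"] * 6))
```

Under the preference described above, only the first case would be expected to be a good DHX36 substrate.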

Once DHX36 has bound to the DNA or RNA and destabilized the structure, it unfolds the G-quadruplex one residue at a time. The 5' β-hairpin (HP) is essential in this process. When DHX36 binds to a nucleotide sequence, its RecA2 domain rotates by 14° and its C-terminal domain rotates by 28°. These rotations create enough space for five stacked single-stranded DNA or RNA residues between the HP and the constriction formed by Arg-297, Gln-319, and Pro-699. The HP then acts as a fulcrum to unstack the top nucleotides (at the 5' end) while the 3' end of the nucleotide sequence is stabilized by a hydrogen bond with Thr-523. In other words, DHX36 unstacks one nucleotide at a time while holding a stack of five nucleotides (2). A mutation in the HP can cause a lethal failure of unfolding without affecting RNA or DNA binding. For example, in Prp43p (PDB ID: 3KX2), another protein in the DEAH/RHA family, a mutation in the 3' HP resulted in a cold-sensitive phenotype, whereas a mutation in the 5' HP resulted in severely reduced protein activity. A different mutation that also produced a cold-sensitive phenotype made the nucleotide sequence more stable, weakening the protein's ability to bind and unfold it. When both mutations were combined, protein activity stopped completely. Similarly, a mutation in the HP of DHX36 would be expected to reduce its DNA/RNA unfolding activity (8).

The function of DHX36 depends strongly on its structure, so a mutation that alters the structure can disrupt its function. Proteins with similar structures can have similar yet slightly different functions and capabilities. For example, the DExH-box NTPase HrpB from E. coli (PDB ID: 6EUD) has a structure very similar to that of DHX36 but differs in its capabilities. PSI-BLAST is a search tool used to find proteins with similar primary structure. Given a query protein sequence, PSI-BLAST returns proteins with similar sequences along with an E value for each. The E value is obtained by aligning the primary sequences and estimates how many matches of equal or better quality would be expected by chance, so a smaller E value indicates a more significant similarity. An E value of less than 0.05 is considered significant (9). The Dali Server compares the tertiary structures of proteins by calculating the differences in intramolecular distances and reports the result as a Z-score. A Z-score above 2 is considered significant (10). DHX36 and DExH-box NTPase HrpB have an E value of 2E-134 according to PSI-BLAST and a Z-score of 23.4 according to the Dali Server (9, 10). Because the E value is far below 0.05, their primary sequences are significantly similar (9), and because the Z-score is far above 2, their tertiary structures are significantly similar as well (10).
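
The two significance checks described above can be written out as a short sketch. The cutoffs (0.05 for the E value, 2 for the Z-score) and the DHX36/HrpB values are the ones quoted in the text; the function and constant names are hypothetical, and the code does not query PSI-BLAST or the Dali Server.

```python
E_VALUE_CUTOFF = 0.05  # E values below this are treated as significant
Z_SCORE_CUTOFF = 2.0   # Dali Z-scores above this are treated as significant


def is_sequence_similar(e_value):
    """True when the E value indicates significant primary-sequence similarity."""
    return e_value < E_VALUE_CUTOFF


def is_structure_similar(z_score):
    """True when the Z-score indicates significant tertiary-structure similarity."""
    return z_score > Z_SCORE_CUTOFF


# Values reported in the text for DHX36 versus DExH-box NTPase HrpB.
dhx36_vs_hrpb = {"e_value": 2e-134, "z_score": 23.4}

print(is_sequence_similar(dhx36_vs_hrpb["e_value"]))
print(is_structure_similar(dhx36_vs_hrpb["z_score"]))
```

Note that the two scores point in opposite directions: a smaller E value and a larger Z-score both mean stronger similarity.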

DExH-box NTPase HrpB has 812 residues, only 58 fewer than DHX36. DHX36 is 42% helical and 15% beta sheet, whereas DExH-box NTPase HrpB is 38% helical and 21% beta sheet. DExH-box NTPase HrpB therefore has less helical character and more beta-sheet character than DHX36. Another contrast is that, although both proteins depend on ATP to carry out their function, only DHX36 was crystallized with an ADP ligand. DExH-box NTPase HrpB has 1,2-ethanediol as a ligand, which was used for crystallization but had no effect on its function. DHX36 also had a tetrafluoroaluminate ion that was used only to induce crystallization (4). Because the two proteins are similar in structure, their functions are similar as well. Both are helicases that unwind RNA. However, DHX36 can unwind both DNA and RNA sequences, whereas DExH-box NTPase HrpB can only unwind RNA. Another difference is that DHX36 cannot bind properly to double-stranded nucleotide sequences, whereas DExH-box NTPase HrpB can unwind RNA duplexes (11).
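
The size and secondary-structure comparison above amounts to a few subtractions, shown here as a minimal sketch. The residue counts, percentages, and PDB IDs are taken from the text; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StructureSummary:
    name: str
    pdb_id: str
    residues: int
    helix_pct: int  # percent of residues in helices
    sheet_pct: int  # percent of residues in beta sheets


DHX36 = StructureSummary("DHX36", "5VHD", 870, 42, 15)
HRPB = StructureSummary("DExH-box NTPase HrpB", "6EUD", 812, 38, 21)


def compare(a, b):
    """Return the differences a minus b for size and secondary-structure content."""
    return {
        "residue_diff": a.residues - b.residues,
        "helix_pct_diff": a.helix_pct - b.helix_pct,
        "sheet_pct_diff": a.sheet_pct - b.sheet_pct,
    }


print(compare(DHX36, HRPB))
```

A positive helix difference and a negative sheet difference correspond to DHX36 being the more helical of the two proteins.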

DHX36 is essential for unwinding the G-quadruplexes that form in DNA and RNA so that the sequence can be processed for replication, transcription, and translation (2). Understanding its structure and function is important for developing anticancer drugs that target G-quadruplexes in the promoters of oncogenes and at telomeres (1). It binds to the G-quadruplex using the groove between the RecA2 and C-terminal domains and the alpha-helical DSM residues Ile-65, Trp-68, Tyr-69, and Ala-70. This binding destabilizes the G-quadruplex and makes it easier to unfold. Once the G-quadruplex has been destabilized, DHX36 unwinds the stack one nucleotide at a time, driven by ATP hydrolysis and using the HP as a fulcrum (2). Its structure is very similar to that of DExH-box NTPase HrpB, but the two proteins have slightly different functional capabilities due to minor differences in their structures.