Created by: David Mounts
DEAD-box proteins belong to the largest known family of helicases, known as superfamily II. These proteins are important for all aspects of RNA metabolism, with known biological roles in RNA transport, translation, decay, gene expression, and more (1,2). Interestingly, all of the known DEAD-box proteins contain a set of nine conserved motifs that occur at similar positions in the peptide sequence (2). The name of these proteins stems from the peptide sequence DEAD (Asp-Glu-Ala-Asp), which is a part of motif II. While these proteins have many functional roles, they are best known for their ability to hydrolyze ATP (Figure 1) in a process stimulated by RNA binding (3). One particular protein of interest, Drosophila Vasa (PDB ID: 2DB3), derived from Drosophila melanogaster, uses ATP hydrolysis to facilitate the binding and unwinding of RNA. This is biologically important for the developing embryo, as it aids in the translation of mRNAs important to cell differentiation (4).
The primary sequence of Drosophila Vasa is
434 residues in length and has an isoelectric point (pI) of 6.25 and a molecular
weight of 48,280.26 Da (5). The secondary structure is composed of 21
α-helices (38% by residue), 14 β-strands (14% by residue), 3/10-helices, and
random coils. The ligands associated with the structure are a
phosphoaminophosphonic acid-adenylate ester (AMPPNP) and a magnesium (II) ion. The
purpose of this magnesium ion is to facilitate binding of the triphosphate group of AMPPNP by interacting with the negatively charged oxygens on the group. AMPPNP
acts as a non-hydrolyzable analog of ATP, which facilitates RNA binding and
allows intermolecular interactions of the enzyme-substrate complex to be
analyzed without producing products. X-ray crystallography utilizing this
analog and a single-stranded RNA (ssRNA) reveals the core structure, consisting
of residues 202-621, at a resolution of 2.2 Å (4). The protein consists of a single subunit that is divided into four distinct regions. The C-terminal domain (CTD; residues 463-621) is responsible
for binding the 5’ end of the RNA strand, and the N-terminal domain (NTD; residues
233-454) binds the 3’ end. A linker region is located at residues
455-462 and links the CTD and the NTD. This region lies near the site of
ATPase activity and is not conserved among other DEAD-box proteins. Lastly, a flanking sequence
is present at residues 202-232, which interacts with ATP. Phe-225 is important
in this sequence, as it is able to interact with the adenine base through van
der Waals forces. Similar to the linker region, this flanking sequence is
specific to Vasa and is not conserved among the family of proteins (4).
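The region boundaries described above can be written down as a small lookup table. The following is a minimal sketch in Python, using only the residue ranges quoted in the text; the names `VASA_REGIONS` and `region_of` are illustrative and do not come from any structure-analysis library.

```python
# Region boundaries (inclusive residue numbers) as quoted in the text for
# the crystallized core of Drosophila Vasa (PDB ID: 2DB3, residues 202-621).
# The names used here are hypothetical, chosen only for illustration.
VASA_REGIONS = [
    ("flanking sequence", 202, 232),
    ("NTD", 233, 454),
    ("linker", 455, 462),
    ("CTD", 463, 621),
]


def region_of(residue: int) -> str:
    """Return the name of the region that contains the given residue number."""
    for name, start, end in VASA_REGIONS:
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} lies outside the core (202-621)")


def regions_are_contiguous() -> bool:
    """Check that each region starts right after the previous one ends."""
    return all(
        nxt[1] == prev[2] + 1
        for prev, nxt in zip(VASA_REGIONS, VASA_REGIONS[1:])
    )


# Phe-225 is the residue the text describes as contacting the adenine base.
print("Phe-225 ->", region_of(225))
print("contiguous:", regions_are_contiguous())
```

Storing the ranges as data rather than as a chain of comparisons makes it easy to confirm that the four regions tile residues 202-621 without gaps or overlaps.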
Many residues within the conserved motifs
of DEAD-box proteins are important for
effective binding and unwinding of RNA (Table
1). To identify important active site residues, a 7-base (U1-U7) ssRNA
molecule was bound to the protein (4). Motifs I and II are located in the N-terminal domain (NTD) and bind the triphosphate group with the assistance of
Mg2+. While the data are inconclusive due to the distance from the NTD,
motif VI, which is located in the CTD, is believed to facilitate triphosphate binding as well. Motifs Ib and
GG facilitate binding of RNA bases U6 and U7, inducing a bent conformation of
the RNA molecule between U5 and U6. This is important, as this bending action
induces separation of the 3’ and 5’ strands in the presence of double-stranded
RNA (dsRNA). After the strands are separated, ATP hydrolysis effectively
releases the split pair. Motif III is the only motif that does not interact with
either ATP or the RNA sequence. This motif is still
important as it interacts with motifs II and VI. Motifs Ia (NTD) and QxxR (CTD)
participate in inter-domain interactions with one another that facilitate
binding of the RNA phosphate groups at U5 and U6. Among the other CTD motifs, motif V interacts
by binding the phosphate group at U3. Lastly, motif IV interacts with the 2’ OH
group via Glu-497 and U2 phosphate via Lys-499. Mutational studies with eIF4A
(PDB ID: 2G9N) have revealed several important functional residues that are essential to the RNA binding and unwinding process. Asp-402 of motif II and Gly-552 are particularly
important for RNA unwinding, as substitution of these residues inhibits
unwinding activity (4). Similarly, mutation of Arg-579 and Arg-582 of motif VI
has been shown to decrease crosslinking between the protein and RNA strand. While
it is a non-conserved residue, Arg-403 is also important as it is the only residue to interact with the substrate through hydrogen bonding, which occurs at U3.
The bioinformatics search engines Dali and PSI-BLAST were used to find a comparison protein similar in structure. Although both engines use the same sequence corresponding to the protein of interest (POI) to find comparable structures, they differ slightly in the criteria they use to compare proteins. The purpose of PSI-BLAST is to find proteins with primary structures similar to that of the POI. Primary structure is the basic linear sequence of the polypeptide, which results from the covalent bonds formed during protein synthesis. The search is accomplished by aligning the peptide sequence against a database of proteins and scoring each alignment, including any gaps. Results are reported in the form of an E value, which estimates how many matches of equal or better quality would be expected by chance. The lower the E value, the more similar the two primary structures are. Values less than 0.5 indicate significant similarity, and a value of 0 indicates a match that is essentially never expected by chance, as between nearly identical sequences. In contrast to PSI-BLAST, Dali searches for proteins that are similar in tertiary structure to the POI. Tertiary structure is the three-dimensional protein structure formed from intramolecular interactions such as hydrogen bonds, ionic bonds, disulfide bridges, and hydrophobic interactions. The search is accomplished by comparing intramolecular distances via the sum-of-pairs method against a database of proteins. Results are reported in the form of a Z score, with a score of 2 or higher indicating significant similarity. Upon using both of these search engines, a Vasa helicase mutant (PDB ID: 4D25) from the organism Bombyx mori was chosen, yielding a Z score and E value of 65.8 and 0.00, respectively (6, 7).
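The two significance criteria quoted above can be expressed as a pair of simple checks. This is a minimal sketch, assuming the cutoffs stated in the text (E value below 0.5, Z score of 2 or higher); the `Hit` class and function names are hypothetical and are not part of the Dali or PSI-BLAST tools.

```python
from dataclasses import dataclass

# Cutoffs as quoted in the text; these are the article's working thresholds,
# not values read from either search engine.
E_VALUE_CUTOFF = 0.5
Z_SCORE_CUTOFF = 2.0


@dataclass
class Hit:
    """One candidate comparison protein with its two reported scores."""
    pdb_id: str
    e_value: float  # from the sequence search (lower is better)
    z_score: float  # from the structure search (higher is better)


def is_sequence_match(hit: Hit) -> bool:
    """Significant primary-structure similarity by the E value cutoff."""
    return hit.e_value < E_VALUE_CUTOFF


def is_structure_match(hit: Hit) -> bool:
    """Significant tertiary-structure similarity by the Z score cutoff."""
    return hit.z_score >= Z_SCORE_CUTOFF


# Scores reported in the text for the Bombyx mori Vasa mutant.
bombyx_vasa = Hit(pdb_id="4D25", e_value=0.00, z_score=65.8)
print(bombyx_vasa.pdb_id,
      "sequence match:", is_sequence_match(bombyx_vasa),
      "structure match:", is_structure_match(bombyx_vasa))
```

Note that the two scores run in opposite directions: a good sequence hit has a small E value, while a good structural hit has a large Z score.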
The comparison protein, Vasa helicase from Bombyx mori, is similar to Drosophila
Vasa in many aspects. Its function is similar in nature, namely
to facilitate the RNA unwinding process. While the primary sequence also
consists of 434 residues, it has a slightly higher pI and molecular weight with
values of 6.84 and 48,070.19 Da, respectively (5). The secondary structure is
composed of 23 α-helices (40% by residue), 17 β-strands (16% by residue), 3/10-helices,
random coils, and one β-bridge. The ligands associated with this protein are
the same as for Drosophila Vasa, with the addition of a glycerol molecule. The
protein also contains two mutations (E339Q, C342R). While the function of the
glycerol molecule is unclear, the primary focus is the glutamine
substitution at residue 339, which is located in motif II (DEAD → DQAD). While this substitution is reported to
eliminate ATPase activity among other proteins in the family, it has little
effect on substrate specificity when compared with the wild-type enzyme (8). This
suggests that the function of individual residues located in motifs is not
always conserved among DEAD-box proteins.
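Mutation labels such as E339Q follow a common convention: the wild-type residue, the position, then the replacement residue. The sketch below parses such a label and applies it to the motif II fragment to reproduce the DEAD to DQAD change named in the text. The start position of 338 for the motif is an inference from the text (glutamate is the second residue of DEAD and sits at position 339), and all function names are illustrative.

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    wild_type: str  # one-letter code of the original residue
    position: int   # residue number in the full-length protein
    mutant: str     # one-letter code of the replacement residue


def parse_mutation(label: str) -> Mutation:
    """Parse a label such as 'E339Q' into its three parts."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    return Mutation(match.group(1), int(match.group(2)), match.group(3))


def apply_mutation(fragment: str, first_residue: int, mutation: Mutation) -> str:
    """Apply a point mutation to a fragment whose first residue is numbered."""
    index = mutation.position - first_residue
    if not 0 <= index < len(fragment):
        raise ValueError(f"position {mutation.position} is outside the fragment")
    if fragment[index] != mutation.wild_type:
        raise ValueError(
            f"expected {mutation.wild_type} at {mutation.position}, "
            f"found {fragment[index]}"
        )
    return fragment[:index] + mutation.mutant + fragment[index + 1:]


MOTIF_II = "DEAD"
MOTIF_II_START = 338  # assumption: inferred from Glu being residue 339

print(apply_mutation(MOTIF_II, MOTIF_II_START, parse_mutation("E339Q")))
```

Checking the wild-type residue before substituting guards against numbering mismatches, which are a common source of error when comparing sequences from different organisms or constructs.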
In conclusion, DEAD-box proteins are
highly conserved proteins that are essential to all metabolic processes
involving RNA. The comparison between Drosophila Vasa and the Vasa mutant from Bombyx mori revealed many similarities in
structure and function. Comparison with other structures is suggested
in order to further analyze the functional similarities and differences
among the many conserved motifs.