Vasa

Vasa (PDB ID: 2DB3) from Drosophila melanogaster

Created by: David Mounts

DEAD-box proteins belong to the largest known family of helicases, superfamily II. These proteins are important for all aspects of RNA metabolism, with biological roles in RNA transport, translation, decay, gene expression, and more (1,2). All known DEAD-box proteins contain a set of nine conserved motifs that occur at similar positions in the peptide sequence (2). The family takes its name from the peptide sequence DEAD (Asp-Glu-Ala-Asp), which is part of motif II. While these proteins have many functional roles, they are best known for their ability to hydrolyze ATP (Figure 1) in a process stimulated by RNA binding (3). One protein of particular interest, Vasa (PDB ID: 2DB3) from Drosophila melanogaster, uses ATP hydrolysis to facilitate the binding and unwinding of RNA. This activity is biologically important for the developing embryo because it aids the translation of mRNAs involved in cell differentiation (4).

The primary sequence of Drosophila Vasa is 434 residues in length, with an isoelectric point (pI) of 6.25 and a molecular weight of 48,280.26 Da (5). The secondary structure is composed of 21 α-helices (38% by residue), 14 β-strands (14% by residue), 3₁₀-helices, and random coils. The ligands associated with the structure are a phosphoaminophosphonic acid-adenylate ester (AMPPNP) and a magnesium(II) ion. The magnesium ion facilitates binding of the triphosphate group of AMPPNP by interacting with the negatively charged oxygens on that group. AMPPNP is a non-hydrolyzable analog of ATP; it promotes RNA binding and allows the intermolecular interactions of the enzyme-substrate complex to be analyzed without the formation of products. X-ray crystallography using this analog and a single-stranded RNA (ssRNA) reveals the core structure, consisting of residues 202-621, at a resolution of 2.2 Å (4). The protein consists of a single subunit that is divided into four distinct regions. The C-terminal domain (CTD; residues 463-621) is responsible for binding the 5’ end of the RNA strand, and the N-terminal domain (NTD; residues 233-454) binds the 3’ end. A linker region at residues 455-462 connects the CTD and the NTD. This region lies near the site of ATPase activity and is not conserved among other DEAD-box proteins. Lastly, a flanking sequence at residues 202-232 interacts with ATP. Phe-225 is important in this sequence because it interacts with the adenine base through van der Waals forces. Like the linker region, this flanking sequence is specific to Vasa and is not conserved across the family (4).
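The region boundaries above can be written as a small lookup, which makes it easy to check which region a given residue falls in. This is a minimal sketch: the residue ranges are the ones quoted in the text, while the names VASA_REGIONS, region_of, and region_length are illustrative and do not come from any library.

```python
# Region boundaries of the crystallized core (residues 202-621), as quoted in the text.
# All names here are illustrative, not part of any existing package.
VASA_REGIONS = {
    "flanking sequence": (202, 232),
    "N-terminal domain": (233, 454),
    "linker": (455, 462),
    "C-terminal domain": (463, 621),
}


def region_of(residue_number):
    """Return the name of the region containing residue_number, or None."""
    for name, (start, end) in VASA_REGIONS.items():
        if start <= residue_number <= end:
            return name
    return None


def region_length(name):
    """Return the number of residues in the named region (bounds inclusive)."""
    start, end = VASA_REGIONS[name]
    return end - start + 1
```

For example, region_of(225) places Phe-225 in the flanking sequence, and the four region lengths sum to the 420 residues spanned by 202-621.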

Many residues within the conserved motifs of DEAD-box proteins are important for effective binding and unwinding of RNA (Table 1). In order to determine the important active site residues, a 7-base (U1-U7) ssRNA molecule was bound to the protein (4). Motifs I and II, both located in the N-terminal domain (NTD), bind the triphosphate group with the assistance of Mg²⁺. Motif VI, which is located in the CTD, is believed to facilitate triphosphate binding as well, although the data are inconclusive because of its distance from the NTD. Motifs Ib and GG facilitate binding of RNA bases U6 and U7, inducing a bent conformation of the RNA molecule between U5 and U6. This bend is important because it induces separation of the 3’ and 5’ strands in the presence of double-stranded RNA (dsRNA). After the strands are separated, ATP hydrolysis releases the split pair. Motif III is the only motif that does not interact with either ATP or the RNA sequence, but it is still important because it interacts with motifs II and VI. Motifs Ia (NTD) and QxxR (CTD) participate in inter-domain interactions with one another, which facilitate binding of the RNA phosphate groups at U5 and U6. Among the other CTD motifs, motif V binds the phosphate group at U3. Lastly, motif IV interacts with the 2’ OH group via Glu-497 and with the U2 phosphate via Lys-499. Mutational studies with eIF4A (PDB ID: 2G9N) have revealed several functional residues that are essential to the RNA binding and unwinding process. Asp-402 of motif II and Gly-552 are particularly important for RNA unwinding, as substitution of these residues inhibits unwinding activity (4). Similarly, mutation of Arg-579 and Arg-582 of motif VI has been shown to decrease crosslinking between the protein and the RNA strand. Although it is a non-conserved residue, Arg-403 is also important, as it is the only residue to interact with the substrate through hydrogen bonding, which occurs at U3.

The bioinformatics search engines Dali and PSI-BLAST were used to find a comparison protein similar in structure. Although both engines start from the same query, the protein of interest (POI), they differ in the criteria they use to compare proteins. PSI-BLAST finds proteins whose primary structure is similar to that of the POI. Primary structure is the linear sequence of the polypeptide, which results from the covalent bonds formed during protein synthesis. The search is accomplished by aligning the query against a database of protein sequences and accounting for gaps in the alignment. Results are reported as an E value, which indicates how closely related the sequences are. The lower the E value, the more similar the two primary structures are. Values less than 0.5 indicate significant similarity, and a value of 0 indicates no variation between the sequences. In contrast to PSI-BLAST, Dali searches for proteins that are similar to the POI in tertiary structure. Tertiary structure is the three-dimensional protein structure formed from intramolecular interactions such as hydrogen bonds, ionic bonds, disulfide bridges, and hydrophobic interactions. The search is accomplished by calculating intramolecular distances via the sum-of-pairs method and comparing them with a database of protein structures. Results are reported as a Z score, with a score of 2 or higher indicating significant similarity. Using both of these search engines, a Vasa helicase mutant (PDB ID: 4D25) from the organism Bombyx mori was chosen, yielding a Z score of 65.8 and an E value of 0.00 (6, 7).
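The two significance criteria described above can be combined into a single check. The following is a minimal sketch that uses the cutoffs quoted in the text as default values. The function name and parameters are hypothetical, and neither Dali nor PSI-BLAST exposes such a function.

```python
def is_significant_hit(e_value, z_score, e_cutoff=0.5, z_cutoff=2.0):
    """Return True when a hit passes both cutoffs quoted in the text.

    e_value: PSI-BLAST E value (lower means more similar primary structure).
    z_score: Dali Z score (higher means more similar tertiary structure).
    The default cutoffs are the ones stated in this report; they are
    parameters so that a stricter threshold can be substituted.
    """
    return e_value < e_cutoff and z_score >= z_cutoff
```

With the values reported for the Bombyx mori Vasa mutant (E value 0.00, Z score 65.8), is_significant_hit(0.00, 65.8) returns True.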

The comparison protein, Vasa helicase from Bombyx mori, is similar to Drosophila Vasa in many respects. Its function is likewise to facilitate the RNA unwinding process. While its primary sequence also consists of 434 residues, it has a slightly higher pI (6.84) and a slightly lower molecular weight (48,070.19 Da) (5). The secondary structure is composed of 23 α-helices (40% by residue), 17 β-strands (16% by residue), 3₁₀-helices, random coils, and one β-bridge. The ligands associated with this protein are the same as for Drosophila Vasa, with the addition of a glycerol molecule. The protein also contains two mutations (E339Q, C342R). While the function of the glycerol molecule is unclear, the primary focus is the glutamine substitution at residue 339, which is located in motif II (DEAD → DQAD). Although this substitution is reported to eliminate ATPase activity in other proteins of the family, it has little effect on substrate specificity when compared with the wild-type enzyme (8). This suggests that the function of individual residues located in the motifs is not always conserved among DEAD-box proteins.
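The substitution notation used above (wild-type residue, position, mutant residue) can be parsed and applied to a motif string. This is a minimal sketch with hypothetical helper names. It assumes that motif II begins at residue 338 in the Bombyx mori numbering, which is inferred here only from the fact that E339Q changes the second residue of DEAD.

```python
import re


def parse_substitution(code):
    """Split a substitution such as 'E339Q' into (wild_type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {code!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def mutate_motif(motif, motif_start, code):
    """Apply a substitution to a motif whose first residue is numbered motif_start."""
    wild_type, position, mutant = parse_substitution(code)
    index = position - motif_start
    if not 0 <= index < len(motif):
        raise ValueError("position lies outside the motif")
    if motif[index] != wild_type:
        raise ValueError("wild-type residue does not match the motif")
    return motif[:index] + mutant + motif[index + 1:]
```

Under that numbering assumption, mutate_motif("DEAD", 338, "E339Q") returns "DQAD", matching the motif II change described in the text.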

In conclusion, DEAD-box proteins are highly conserved and are essential to all metabolic processes involving RNA. The comparison between Drosophila Vasa and the Vasa mutant from Bombyx mori revealed many similarities in structure and function. Comparison with other structures is suggested in order to further analyze the functional similarities and differences among the many conserved motifs.