Argonaute

Argonaute: Slicer and Silencer in RNA Interference

Created by Joseph Muldoon

   Argonaute (3F73; Thermus thermophilus) is an essential protein in RNA interference (RNAi), a widely-conserved mechanism that regulates transcription, translation, and genome stability. RNAi is used as a defense against transposons and RNA viruses and is essential in the development of plants and animals. In the RNAi pathway, the RNase III-like endonuclease Dicer produces 21 to 27-nucleotide (nt) short interfering RNAs (siRNAs) with 2-nt 3’ overhangs from double-stranded RNA (dsRNA). Cells may acquire dsRNA through bidirectional, inverted repeat, or aberrant transcription (1), or through transfection, viral infection, or artificially-induced expression (2).

   Two multiprotein silencing effector complexes can use siRNA to silence gene expression. RNA-Induced Transcriptional Silencing (RITS) uses siRNA to target chromatin for repression (transcriptional gene silencing; TGS). There are two models for targeting of chromosome regions. Either siRNAs in RITS pair with DNA to form an RNA-DNA double helix, or they pair with noncoding RNAs at the target locus. Both models require recruitment of a histone methyltransferase (HKMT) to methylate and bind to histone 3 lysine 9 (H3K9) to propagate heterochromatin, rendering the region transcriptionally silent (1).

   Another complex called RNA-Induced Silencing Complex (RISC) uses siRNA to repress translation (post-transcriptional gene silencing; PTGS) (1). Dicer loads double-stranded siRNA onto RISC, and RISC unwinds siRNA in an ATP-dependent process. It uses the anti-sense strand as a guide to base-pair with complementary mRNA (the target), and the sense (passenger) strand is discarded (2). An Argonaute/PIWI family protein in RISC then cleaves the mRNA. In a separate pathway, Argonaute uses 21 to 24-nt microRNAs (miRNAs) to recognize and cleave mRNAs. In both RISC pathways, Argonaute targets mRNAs for degradation and prevents them from being translated (1).

   Three paralogous groups of Argonaute (Ago) are Argonaute-like, Piwi-like, and C. elegans-specific group 3. The first two are an ancient lineage and are present in archaea, bacteria, and eukaryotes. The number of Ago genes varies by species; for example, Schizosaccharomyces pombe has one, humans have eight, Caenorhabditis elegans has 26, and the parasites Trypanosoma cruzi and Leishmania major no longer have any. Other species employ Ago in different regulatory pathways. Neurospora crassa uses Argonautes to prevent genome rearrangements through quelling and meiotic silencing of unpaired DNA (MSUD), and Tetrahymena uses them in DNA elimination. Other uses include germline maintenance, transposon silencing, and chromosome segregation. Argonautes are members of the RNase H endonuclease family, which uses a DNA template to target RNA; however, Ago has evolved to use a single-stranded RNA template (3).

   In the eubacterium Thermus thermophilus, Ago is 685 amino acids long. Subunits form a homodimer, and each subunit binds a set of RNAs and ligands. The molecular weight of each monomer is 76,611 Da, and the isoelectric point (pI) is 9.79. Each subunit associates with Mg2+ and PO43-, as well as two RNA strands in vivo, but the protein was crystallized with an RNA strand and a DNA strand. Wang et al. crystallized the ternary complex of Ago bound to 21-nt, 5’-phosphorylated guide DNA (5’-phos-TGAGGTAGTA GGTTGTATAGT) and 20-nt target RNA at 3.0 Å resolution (Figure 1). Nucleotide mismatches at steps 10 and 11 maintained the stability of the complex and blocked Ago endonuclease activity. Diffraction data were collected using micro-focus beam line NE-CAT ID-24E, and the structure was solved with the program PHASER (4). Guide miRNA and DNA produce very similar crystal structures (5), and Ago from T. thermophilus has also been crystallized with 21-nt DNA and 19-nt RNA (3HVR), 21-nt DNA (3DLH), 10-nt DNA (3DLB), and in mutant forms (3HGE, 3HJF, 3HK2).

   Ago has an amino-terminal N and PAZ lobe and a carboxy-terminal Mid and PIWI lobe (6). In the guide-Ago complex, PAZ binds to the 3’ 2-nt overhang of the guide strand, and Mid binds to the 5’ phosphate. ‘Slicer’ (cleaving) activity requires base-pairing between the guide and target at the 10th and 11th nucleotides, so mismatches at these steps stabilize association with Ago (7). The PAZ domain is made of two subdomains, one with a five-stranded beta barrel and the other with a beta hairpin and alpha helix, separated by a cleft. The PIWI domain is part of the RNase H enzyme family and is physically separated from PAZ by a nucleic acid binding channel (5). RNase-H-like enzymes have a conserved Asp-Asp-Glu/Asp motif and two divalent cations at the catalytic site, but in Ago the third residue is degenerate: Asp/Glu/His/Lys. The PIWI domain is necessary for specific slicer activity because it contains the catalytic center; however, it is not sufficient, which suggests that additional small RNAs or cofactors are required for its activity (3). The cleaved products have a 3’ hydroxyl group and a 5’ phosphate.

   Ago was crystallized with magnesium ions because RISC is Mg2+-dependent. One Mg2+ binds to the catalytic triad Asp-478, Asp-546, and Asp-660 in the PIWI domain. The function of this Mg2+ may be to bind to an oxygen on the cleaved phosphate of the target. Another Mg2+ binds to Val-678 in the Mid domain and to the first and third phosphates from the 5’ end of the miRNA, anchoring the guide strand (3, 5).

   The primary structure of Ago in T. thermophilus is 685 amino acids. Of these residues, 51% are hydrophobic, 19% are polar and uncharged, 17% are basic, and 12% are acidic. Half of the polar and uncharged residues are Gly. The most common residues in the protein are Leu, Arg, and Ala, and the least common are Cys, Met, and Asn (4).
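
   A tally of this kind can be sketched in a few lines of Python. The grouping of residues into classes below is an assumption of this sketch (Gly is placed with the polar uncharged residues, as the text implies; the placement of Cys, Tyr, and His is a guess), and the example sequence is a made-up toy string, not the T. thermophilus sequence.

```python
from collections import Counter

# Assumed grouping of the 20 standard residues (one-letter codes).
# Gly is counted as polar uncharged, as the text implies; where Cys,
# Tyr, and His belong is an assumption of this sketch.
CLASSES = {
    "hydrophobic": set("AVLIPFWM"),
    "polar_uncharged": set("GSTCYNQ"),
    "basic": set("KRH"),
    "acidic": set("DE"),
}


def class_composition(seq):
    """Return the percentage of residues in each class for a non-empty sequence."""
    counts = Counter(seq.upper())
    total = sum(counts.values())
    return {
        name: 100.0 * sum(counts[r] for r in members) / total
        for name, members in CLASSES.items()
    }


toy = "MLRAGDELKG"  # hypothetical 10-residue example, not real Ago sequence
comp = class_composition(toy)
for name, percent in comp.items():
    print(f"{name}: {percent:.0f}%")
```

   Running the same function on the full 685-residue sequence would give percentages comparable to those quoted above only if the same class definitions were used.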

   The secondary structure of Ago contains alpha helices, beta sheets, 3/10 helices, and random coils, with random coils predominating. Of the 678 residues that crystallized in the A chain, alpha helices and beta sheets each make up about 30% of the residues, random coils are 37%, and 3/10 helices are less than 3%. Of the 670 residues that crystallized in the B chain, alpha helices are 27%, beta sheets are 28%, random coils are 43%, and 3/10 helices are 1% (4).

   The guide and target strands exhibit extensive hydrogen bonding. In miRNA, the two 5’ nucleotides H-bond to Val-678, His-445, His-441, and Arg-442. The two 3’ miRNA nucleotides H-bond to Trp-239, Glu-206, and Pro-208, and the phosphate between the 3’ nucleotides H-bonds to Arg-228. Phosphate groups in miRNA also form salt bridges with the nucleic acid binding channel. The ternary complex is formed when mRNA is threaded through the central channel in the Mid and PIWI lobes and H-bonds at nucleotides 2-8 with the guide strand (5).

   The Conserved Domain Database (CDD) identifies the following residues as 5’ guide anchoring sites: Arg-418, Lys-422, Gln-433, Ile-434, Leu-435, Asn-436, Leu-439, Arg-446, Asn-449, Gly-453, Lys-457, and Ala-644. In addition to the three aspartic acids, the CDD also identifies a fourth residue in the active site, Gly-480, but this site does not bind Mg2+. PDBSum identifies many interactions between the nucleic acids and protein. Most of the interactions are between guide strand and protein, and most of these contacts are non-bonded and involve the guide’s phosphate-sugar backbone and polar or basic amino acids (4).

   Tertiary structure in Ago allows a short nucleic acid duplex to fit into a channel with a spiral-shaped groove, and the motions of structural motifs mediate conformational changes. For example, in the PAZ domain, a loop is twisted by a hinge, and a sheet rotates and bends. These large-scale motions may widen the channel, allowing the guide strand to be captured by the 3’ end and oriented. Subdomains in PIWI and PAZ form a crescent-shaped base that acts as a gate to the channel. Next, the PAZ loop pivots, which places the guide in a position to be able to bind to the target. Formation of the ternary complex is accompanied by stretching of a helix near the loop and twisting of two hinges. These domain movements are responsible for conformational changes in the guide, but not the target (5).

   The “seed” sequence is the core recognition sequence near the 5’ end of the guide strand. Complete complementarity is usually required at this sequence but not at the other bases. The seed experiences greater fluctuation in the unbound state than when paired. As it base-pairs with the target, hydrogen bonds at base-pairs ('steps') 7 and 8 remain stable. However, the bonds adjacent to the seed sequence (at step 9) break, re-form, and then break again. Step 10 experiences even larger conformational changes, such that the phosphodiester bond approaches the PIWI catalytic site. Steps 9 and 10 each become hydrated by a water molecule, which stabilizes binding of step 10 to PIWI. The Mg2+ binding pocket is also hydrated; hydration maintains the catalytic center in a rigid structure, and one of the water molecules is incorporated in RNA hydrolysis (5).
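
   The seed requirement can be illustrated with a short Python sketch that checks base-pairing step by step. It uses the 21-nt guide DNA quoted earlier; the target sequences are hypothetical, built here as the exact complement purely for illustration, and the seed is taken to be guide positions 2-8 as described above for the ternary complex. Only strict Watson-Crick pairs are counted (G-U wobble pairs are ignored), and the 3’ end of the target is assumed to sit opposite guide position 1.

```python
# Watson-Crick partner in an RNA target for each base of a DNA guide.
PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def paired_steps(guide, target):
    """Return a list of booleans, one per step counted from the guide 5' end.

    Both strands are written 5'->3'. The duplex is antiparallel, so the
    target is read from its 3' end (assumed to lie opposite guide step 1).
    """
    opposite = target[::-1]
    return [PAIR[g] == t for g, t in zip(guide, opposite)]


def seed_is_paired(guide, target, first=2, last=8):
    """True if every step in the seed (first..last, 1-based) is paired."""
    steps = paired_steps(guide, target)
    return all(steps[first - 1:last])


def mismatch_at(target, steps):
    """Return a copy of the target with a non-pairing base opposite each step."""
    bases = list(target)
    for step in steps:
        i = len(bases) - step  # index of the base opposite this guide step
        bases[i] = "A" if bases[i] != "A" else "C"
    return "".join(bases)


guide = "TGAGGTAGTAGGTTGTATAGT"  # 21-nt guide DNA from the text
# Hypothetical fully complementary RNA target (not the crystallized 20-mer).
perfect = "".join(PAIR[b] for b in reversed(guide))
bulged = mismatch_at(perfect, [10, 11])   # mismatches outside the seed
seed_hit = mismatch_at(perfect, [5])      # mismatch inside the seed

print(seed_is_paired(guide, perfect))
print(seed_is_paired(guide, bulged))
print(seed_is_paired(guide, seed_hit))
```

   In this toy model, mismatches at steps 10 and 11 leave the seed intact, which mirrors the design of the crystallized complex, while a single mismatch at step 5 breaks it.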

   The quaternary structure of Ago varies with the protein complexes in which it is assembled. Wang et al. crystallized an Ago homodimer, and PDBSum identifies 13 non-bonded contacts at its interface: Pro-143 on chain A contacts Ser-317 on chain B; Gln-185 on A contacts Ala-630 on B, which contacts Gly-186 and Ala-184 on A; Ala-184 on A contacts Pro-627 and Glu-629 on B. There are no covalent or hydrogen bonds connecting the subunits (4).

   Because Argonaute arose in an ancient lineage, it is nearly ubiquitous across eukaryotes, bacteria, and archaea. The T. thermophilus sequence is highly similar to sequences from the Thermus, Synechococcus, and Thermosynechococcus genera (E < e-72 for a BLASTp search). Among orthologs with solved structures, it is most similar to those of Aquifex aeolicus and Pyrococcus furiosus.

   Aquifex aeolicus Argonaute (Aa-Ago) is 706 amino acids; ExPASy computes a molecular weight of 83,003 Da and an isoelectric point of 9.53. It shares 27% sequence identity with T. thermophilus Ago and, for a species in a different phylum, shows strong structural similarity. The Aa-Ago monomer was crystallized with Ca2+ (1YVU) and has a Z value of 21.4 and E value of 4e-13 (8, 9). The same protein crystallized as a homodimer with an external RNA duplex (2F8S) has a Z value of 25.9 and an E value of 5e-13 (8, 10). Z values greater than 2 indicate statistically significant tertiary structural similarity, and E values less than 0.05 indicate statistically significant sequence similarity. The secondary structure in Aa-Ago (2F8S) is 38% alpha helices, 28% beta sheets, 32% random coils, and 1% 3/10 helices. Compared to T. thermophilus, the A. aeolicus Argonaute has more alpha helices and fewer random coils.
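
   The two cutoffs stated above can be applied mechanically. The following Python sketch simply encodes them and checks the values quoted for the two Aa-Ago structures; the function and variable names are illustrative and do not correspond to any particular alignment tool.

```python
# Cutoffs as stated in the text.
Z_THRESHOLD = 2.0    # Z above this: significant structural similarity
E_THRESHOLD = 0.05   # E below this: significant sequence similarity


def assess(z_value, e_value):
    """Classify one comparison against T. thermophilus Ago."""
    return {
        "structure_similar": z_value > Z_THRESHOLD,
        "sequence_similar": e_value < E_THRESHOLD,
    }


# (Z value, E value) pairs quoted in the text, keyed by PDB ID.
COMPARISONS = {
    "1YVU": (21.4, 4e-13),
    "2F8S": (25.9, 5e-13),
}

results = {pdb_id: assess(z, e) for pdb_id, (z, e) in COMPARISONS.items()}
for pdb_id, verdict in results.items():
    print(pdb_id, verdict)
```

   Both structures clear each cutoff by a wide margin, so the conclusion does not depend on the exact threshold chosen.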

   Aa-Ago (2F8S) was crystallized with 22-mer and 26-mer complementary siRNAs that bound to the protein externally, with the 2-nt 3’ overhang on one side of the duplex inserted into the PAZ-containing lobe. In the crystal, the 3’ overhang on the other end of the duplex aligned with the Mid domain on another protein in the crystal lattice. The siRNA-Aa-Ago complex provides information on how duplex RNA is recognized by and loaded onto Argonaute. The RISC-loading complex (RLC) and dsRNA binding protein bind to each end of double-stranded siRNA. The siRNA might be transferred onto Ago externally in double-stranded form. No instances of externally-bound single-stranded siRNA have been detected, although this may be difficult to achieve for reasons pertaining to crystallographic techniques (10).

   Pyrococcus furiosus Argonaute is 771 amino acids and 90,462 Da, and its isoelectric point is 8.90. It has 24% sequence similarity to T. thermophilus Ago, which is slightly less than A. aeolicus has. A monomer was crystallized with a manganese ion (1Z25). Pyrococcus belongs to the domain Archaea, so its common ancestor with the two bacterial species is more distant than their common ancestor with each other. Its secondary structure includes 35% alpha helices, 26% beta sheets, 35% random coils, and 4% 3/10 helices, which is similar to the other two Argonautes (11).

   The CDD shows that active sites are mostly conserved between Thermus, Aquifex, and Pyrococcus Argonautes. The active site residues are Asp-Gly-Asp-Asp, Asp-Ser-Asp-Asp, and Asp-Ala-Asp-His, respectively. Anchoring sites for the 5’ RNA guide strand are less conserved than active sites. Where there are substitutions, most do not change the polar or hydrophobic character of the side chain. Residues are listed below in order of location in the sequence from N-terminus to C-terminus, and bold indicates residues that two or more proteins have in common (Figure 2).
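
   The active-site comparison above can also be tabulated programmatically. The Python sketch below applies the same criterion used for bold type (a residue shared by two or more proteins) to the three motifs quoted in the text. It assumes the four positions are aligned in the order listed; the function names are illustrative.

```python
from collections import Counter

# Active-site residues as listed in the text (three-letter codes),
# assumed to be aligned position by position.
ACTIVE_SITES = {
    "T. thermophilus": ["Asp", "Gly", "Asp", "Asp"],
    "A. aeolicus": ["Asp", "Ser", "Asp", "Asp"],
    "P. furiosus": ["Asp", "Ala", "Asp", "His"],
}


def shared_residues(sites):
    """For each position, return the residue found in two or more
    proteins, or None if every protein differs at that position."""
    result = []
    for column in zip(*sites.values()):
        residue, count = Counter(column).most_common(1)[0]
        result.append(residue if count >= 2 else None)
    return result


def fully_conserved(sites):
    """Return one boolean per position: True where all proteins agree."""
    return [len(set(column)) == 1 for column in zip(*sites.values())]


shared = shared_residues(ACTIVE_SITES)
conserved = fully_conserved(ACTIVE_SITES)
print(shared)
print(conserved)
```

   In this tabulation the first and third positions are identical in all three proteins, the fourth is shared by the two bacterial proteins only, and the second differs in each.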

   Over hundreds of millions of years of evolution, these three organisms have retained Argonaute in their genomes and have used it to slice nucleic acid targets in gene silencing pathways. There are moderate differences in secondary structure and binding sites, but the overall shape of the proteins has changed very little. The RNA duplex passes into a binding channel that occupies much of the surface of the protein, so side chain protrusions into the channel would likely prevent catalytic activity by blocking RNA entry. This selective pressure may account for much of the similarity across lineages.

   Another source of conservation may be sites involved in binding to other proteins in RITS and RISC. Changing critical residues on either Ago or its associated proteins so that Ago could not be recruited by the complex would be equivalent to an Ago null mutant, because regulation of slicing activity is RNA- and RNP-dependent.

   Argonaute can slice a single strand in an RNA-RNA, RNA-DNA, or DNA-DNA duplex, although RNA-RNA duplexes are the most likely to be encountered in vivo. Argonaute is important because it exerts wide-ranging effects on gene expression. It is essential for the maintenance of appropriate transcriptional activity through RITS, and it controls translational activity by post-transcriptional gene silencing through RISC.