TobaccoEtchVirusProtease
Crystal Structure of Tobacco Vein Mottling Virus Protease (PDB ID: 3MMG) from Potyvirus Tobacco Vein Mottling Virus
Created By: Sarah Tolman
Tobacco Vein Mottling Virus (TVMV) is a positive-strand RNA virus and a member of the Potyviridae family. Its genome encodes a long polyprotein that is processed by three viral proteases: protein 1 (P1), helper component protease (HC-Pro), and nuclear inclusion protease (NIa-Pro). NIa-Pro, referred to here as TVMV protease (PDB ID: 3MMG), is responsible for seven of the nine cleavages of the TVMV polyprotein and contributes to substrate recognition. TVMV protease is of interest to scientists because it is a homolog of Tobacco Etch Virus (TEV) protease (PDB ID: 1Q31), a commonly used reagent for the removal of affinity tags from recombinant proteins. Although distinct, TVMV protease is a close relative of TEV protease and serves a similar purpose. TVMV protease is being analyzed to examine the importance of sequence specificity in the production of a recombinant protein or domain, and to further understand the structural basis of the differing specificities of TEV and TVMV proteases (1,2).
X-ray diffraction was used to obtain structural data. Wild-type and mutant forms of TVMV protease were produced in Escherichia coli (E. coli) and purified before crystallization. The mutant form was designed by analogy to TEV protease. The active-site nucleophile, a cysteine, was replaced with alanine, and surface entropy reduction mutagenesis was used to replace two lysine residues with alanine (K65A/K67A/C151A). The mutant form was also truncated by removing six residues from the C-terminus, which removed the P1-P6 sites. Before the TVMV protease mutant was crystallized, it was mixed with the canonical peptide substrate (Ac-RETVRFQSD). The protein was crystallized in a solution of 0.2 M potassium formate and 20% PEG 3350 with a 1:3 ratio of protein to reservoir solution. The final mutant structure was solved by molecular replacement, using the crystal structure of TEV protease as the search model. The final mutant structure contains residues 1-217 of Chain A, 3-216 of Chain B, 2-8 of Chain C, and 2-9 of Chain D. Formic acid and water molecules were located during refinement.
Formic acid is bound only to Chain A of nuclear inclusion protein A, at positions 96, 98, 99, and 187 (1, 3).
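The mutation shorthand K65A/K67A/C151A reads as wild-type residue, position, replacement residue. The following is a minimal Python sketch of how such a label can be parsed and applied. The function names are hypothetical, and the sequence is a made-up placeholder (glycine everywhere except the three mutated positions), not the real TVMV protease sequence.

```python
import re

def parse_mutations(label):
    """Split a label such as 'K65A/K67A/C151A' into (wild_type, position, replacement) tuples."""
    mutations = []
    for token in label.split("/"):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token)
        if match is None:
            raise ValueError(f"unrecognized mutation: {token}")
        wild_type, position, replacement = match.groups()
        mutations.append((wild_type, int(position), replacement))
    return mutations

def apply_mutations(sequence, mutations):
    """Return a copy of sequence with each point mutation applied (positions are 1-based)."""
    residues = list(sequence)
    for wild_type, position, replacement in mutations:
        # Guard against applying a mutation to the wrong residue.
        if residues[position - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}")
        residues[position - 1] = replacement
    return "".join(residues)

# Placeholder sequence for illustration only (NOT the TVMV protease sequence).
toy = ["G"] * 160
toy[64], toy[66], toy[150] = "K", "K", "C"
toy_sequence = "".join(toy)

triple_mutant = apply_mutations(toy_sequence, parse_mutations("K65A/K67A/C151A"))
```

The wild-type check in apply_mutations is a design choice: it catches numbering mistakes, such as an off-by-one position, before they silently produce the wrong mutant.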
TVMV protease has an isoelectric point of 6.44 and a molecular weight of 55,987.62 Daltons (4). The overall structure of TVMV protease consists of two unique protein chains forming beta-barrel domains. The structure has 446 residues in one asymmetric unit, in which chain A and chain B are bound to the peptide substrates chain C and chain D. Chains A and B make up the nuclear inclusion protein A fragment and contain alpha helices and beta strands, with a sequence length of 241 amino acids (Fig. 1). Chains C and D make up the nuclear inclusion protein B fragment and contain only beta strands, with a sequence length of 8 amino acids (Fig. 2). In the crystallized, truncated TVMV protease, mutations were made only to nuclear inclusion protein A, at positions 65, 67 and 151 (3).
TVMV protease's secondary structure contains beta sheets, alpha helices and random coils, with beta sheets predominating. The loops between the beta-8 and beta-9 strands and between the beta-12 and beta-13 strands are integral to its function. The C-terminus of this region forms a 3₁₀-helix and a small loop. These features contrast with the secondary structure of TEV protease, whose C-terminal beta-strands form antiparallel beta-sheets rather than a short loop and helix. These differences change the substrate-binding pockets, causing varied substrate specificity. The TVMV structure also contains specific hydrogen bonding that is significant to its overall function. One important hydrogen bond interaction is seen between P6 Glu and P5 Thr in TVMV. The side chain of P6 Glu points into the solvent, resulting in just one interaction, with P5 Thr. The P6 Glu in TEV protease interacts with the beta-12 and beta-13 loop, creating more extensive interactions. This suggests that the P6 pocket is more important in determining substrate specificity for TEV protease than for TVMV protease (1). The tertiary structure of TVMV protease is also significant to its function. TVMV protease adopts a chymotrypsin-like fold and forms a dimer due to its crystal packing and intermolecular interactions (5).
The S4 pocket of TVMV protease is hydrophobic and shallow. The van der Waals cavity volume of the TVMV S4 pocket is 137 Å³, while that of the TEV S4 pocket is 241 Å³. The difference in pocket depth explains why the P4 position of the substrate is occupied by valine for TVMV protease and by leucine for TEV protease. The shallow pocket of TVMV protease cannot accommodate the longer side chains of leucine or isoleucine. This is significant because when a substrate's P4 valine is replaced with leucine or isoleucine, the substrate can no longer be cleaved by TVMV protease (1).
Arg49 and Glu30 are also important residues for the structure of TVMV protease. Arg49 and Glu30 are hydrogen bonded together and form the lid of the S1 pocket in TVMV protease. In TEV protease, Arg49 is instead bound to a threonine residue, which has a shorter side chain than glutamate. This restricts Arg49 from forming another hydrogen bond with Asp207, as seen in the TVMV protease. Therefore, TEV and TVMV protease have differing S1 pockets. Unlike TEV protease, TVMV protease has no S3 pocket, and the P3 Arg side chain is fully exposed to the solvent, allowing it to form two salt bridges with Asp148. This makes the S3 pocket another major distinction in specificity between the two proteases (1).
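The P and S labels used above follow the usual protease convention: substrate residues are numbered P1, P2, ... moving toward the N-terminus from the cleaved bond and P1', P2', ... moving toward the C-terminus, and each binds the matching enzyme pocket (S1, S2, ...). As a small sketch, the Python function below (a hypothetical helper, not part of any library) labels the canonical peptide RETVRFQSD. It assumes the cleaved bond lies between Gln and Ser, which is the placement consistent with the text's assignment of P6 Glu, P5 Thr, P4 Val and P3 Arg.

```python
def label_positions(peptide, n_nonprime):
    """Map substrate position labels (P1, P2, ..., P1', P2', ...) to residues.

    n_nonprime is the number of residues on the N-terminal side of the
    cleaved (scissile) bond.
    """
    labels = {}
    for i, residue in enumerate(peptide):
        if i < n_nonprime:
            # Non-prime side: numbering counts down toward the scissile bond.
            labels[f"P{n_nonprime - i}"] = residue
        else:
            # Prime side: numbering counts up away from the scissile bond.
            labels[f"P{i - n_nonprime + 1}'"] = residue
    return labels

# Assumption: cleavage between Q and S, so seven residues are on the non-prime side.
positions = label_positions("RETVRFQSD", n_nonprime=7)
for label, residue in positions.items():
    print(label, residue)
```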
The structural similarities between TVMV and TEV protease can be seen through PSI-BLAST and DALI. The purpose of PSI-BLAST is to find proteins whose primary structures are similar to that of the target protein, known as the query. The server reports an E value that reflects the sequence similarity between each subject and the query. Gaps in the alignment increase the E value. A gap arises where an amino acid is present in the subject's sequence but not in the query. Therefore, a smaller E value indicates greater similarity between the primary structures of the subject and the query. The chain A nuclear inclusion protein of Tobacco Etch Virus (PDB ID: 1Q31) has an E value of 1E-167 (6). The purpose of the DALI server is to find proteins of similar tertiary structure, using a sum-of-pairs method that compares intramolecular distances in the subject with those in the query. The significance of the tertiary-structure similarity between the subject and the query is denoted by an assigned Z-score. Z-scores above two are significant. The chain A nuclear inclusion protein of Tobacco Etch Virus has a Z-score of 30.7 (7). A total of 52% of TVMV protease's amino acid sequence is identical to TEV protease's amino acid sequence (1).
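To make the sequence measures above concrete, the following Python sketch shows one way a percent identity can be computed from a pairwise alignment, and how an E value shrinks as the alignment score grows. The function names and toy sequences are invented for illustration. The identity denominator (full alignment length, gaps included) is one of several conventions, and the E value relation is the simplified textbook form, without the length corrections that BLAST itself applies.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must have the same length, with '-' marking gaps.
    Assumption: the denominator is the full alignment length, gap columns included.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)

def expected_hits_from_bit_score(bit_score, query_length, database_length):
    """Simplified E value: E = m * n * 2**(-S'), where S' is the bit score."""
    return query_length * database_length * 2.0 ** (-bit_score)

# Toy aligned fragments, invented for illustration (not TVMV or TEV sequence).
query = "MKT-LVGA"
subject = "MKSALVG-"
identity = percent_identity(query, subject)

# A higher bit score gives a smaller E value for the same search space.
weak_hit = expected_hits_from_bit_score(10, 100, 1000)
strong_hit = expected_hits_from_bit_score(50, 100, 1000)
```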
Kinetic tests were conducted to determine the enzymatic behavior of TEV and TVMV proteases. Km and kcat values were used to compare the P6 and P1 specificity of the two proteases. The data showed that TVMV protease is much more specific and less tolerant of amino acid substitutions at the P1 position than TEV protease, while TEV protease is more specific and less tolerant of substitutions at the P6 position. The differing electrostatic effects seen in the P1 and P6 pockets of TEV and TVMV protease play a role in these varied substrate specificities (8). Properties of one pocket of TVMV protease also affect adjacent pockets. For example, the S3 and S4 pockets of TVMV protease function cooperatively and play a major role in the enzyme's substrate specificity.
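Km and kcat are usually combined into the specificity constant kcat/Km, which allows the cleavage of two substrates by the same enzyme to be compared. The Python sketch below illustrates that comparison with the Michaelis-Menten rate law. All numbers are hypothetical placeholders, not measurements from the cited study, and the function names are chosen for illustration.

```python
def michaelis_menten_rate(kcat, km, enzyme_conc, substrate_conc):
    """Initial velocity v = kcat * [E] * [S] / (Km + [S])."""
    return kcat * enzyme_conc * substrate_conc / (km + substrate_conc)

def catalytic_efficiency(kcat, km):
    """Specificity constant kcat/Km."""
    return kcat / km

# Hypothetical values for illustration only: kcat in 1/s, Km in mM.
canonical = {"kcat": 0.2, "km": 0.05}
substituted = {"kcat": 0.02, "km": 0.5}

# Ratio below 1 means the substituted substrate is processed less efficiently.
relative = catalytic_efficiency(**substituted) / catalytic_efficiency(**canonical)
```

A protease that is intolerant of substitutions at a given position would show a relative efficiency far below 1 for substrates changed at that position.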
Studying TVMV protease has helped scientists better understand how protein function depends on protein structure. Likewise, studying TVMV protease alongside TEV protease emphasizes the importance of structural similarities and shows how small structural differences can produce large changes in function and substrate specificity (1,9). This contributes to the larger body of science because altering a protein's structure may be a useful way to change its function and effects in a targeted manner. For example, mutating peptide sequences and cleaving certain sites in harmful proteins may aid in inhibiting their negative effects.