TATA_TBP

TATA-Box Binding Protein (TBP)/TATA-Box Complex (PDB ID: 1YTB) from Saccharomyces cerevisiae
Created by: Yanna Bian

The TATA-Binding Protein (TBP)/TATA-Box Complex (PDB ID: 1YTB) is found in Saccharomyces cerevisiae, commonly known as baker’s yeast (3,8). The crystal structure displayed here contains two copies of the TBP-DNA complex (8). Transcription in eukaryotes depends on the correct assembly of transcription factors at the promoter of the gene to be transcribed, and formation of the TBP/TATA-Box Complex is the main initiating step of this process. Understanding how the complex works therefore gives greater insight into the mechanisms of eukaryotic transcription, and that understanding begins with an analysis of its structure.

The TATA-Box Complex consists of TBP and one 29-nucleotide DNA sequence (a 12-bp duplex closed by a 5-nucleotide loop) (3,8). TBP is a monomer that consists of two pseudo-symmetrical domains (3,8). Each domain consists of 180 residues and corresponds to an almost direct repeat in the primary structure. Each domain is composed of a five-stranded antiparallel β-sheet and two α-helices that lie at opposite ends of the domain (3,8). A basic chain connects the two domains, giving the protein a horseshoe-like shape. In each domain, the C-terminal region is highly conserved across all eukaryotes, while the N-terminal region is species-dependent and highly variable in size (3,8). The C-terminal region is important to the function of TBP, while the role of the variable N-terminal region in TBP function is still unknown (3,8). The molecular weight of TBP is 40165.4 Da, and its isoelectric point (pI) is 9.89 (4). TBP has a Grand Average of Hydropathicity (GRAVY) value of 0.144; a positive GRAVY value indicates that the protein is, on average, hydrophobic (4). Where TBP interfaces with DNA, it contains an abundance of hydrophobic residues: only 34.5% of the residues in this area are hydrophilic (3). This sets TBP apart from other DNA-binding proteins such as the tryptophan (trp) repressor, a transcription factor that controls amino acid metabolism, which has approximately 57% hydrophilic residues at its DNA-protein interface (3). The TBP interface is hydrophobic because most of the favorable DNA-protein interactions that form there are van der Waals’ interactions (3).
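
The GRAVY value and the hydrophilic fraction quoted above are both simple averages over residues. The following is a minimal sketch of how such values could be computed, assuming the standard Kyte-Doolittle hydropathy scale and assuming that any residue with a negative hydropathy value counts as hydrophilic (reference 3 may classify residues differently). The function names and the short peptide are hypothetical; the peptide is not taken from the TBP sequence.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def percent_hydrophilic(sequence):
    """Percentage of residues with negative hydropathy (assumed cutoff)."""
    seq = sequence.upper()
    hydrophilic = sum(1 for aa in seq if KYTE_DOOLITTLE[aa] < 0)
    return 100.0 * hydrophilic / len(seq)


# Hypothetical peptide for illustration only (not from TBP).
example = "FVLPNT"
print(round(gravy(example), 3))
print(round(percent_hydrophilic(example), 1))
```

Applied to the full TBP sequence, or to only the residues at the DNA interface, the same two functions would give values comparable to those cited, provided the same scale and cutoff were used.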

The 8-bp TATA box is embedded in the 12-bp DNA duplex (8). In organisms that contain a TATA box, the consensus sequence of the coding strand is T1-A2-T3-A4-(A/T)5-A6-(A/T)7-N8, where N is any base and positions 5 and 7 can be either adenine (A) or thymine (T) (2,7). Similarly, in Saccharomyces cerevisiae, the TATA box consensus sequence of the coding strand is T1-A2-T3-A4-(A/T)5-A6-(A/T)7-(A/G)8 (2). This 8-bp section interacts directly with TBP. The structural requirements of TBP binding are thought to explain why the TATA box is so highly conserved. For example, the TATA box is rich in A-T rather than guanine-cytosine (G-C) base-pairs, and there are two structural reasons for this. First, A-T base-pairs are easier to deform than G-C base-pairs, because A-T base-pairs have only two hydrogen bonds while G-C base-pairs have three. Second, the absence of guanine’s protruding amino group allows closer contact between TBP and DNA. A-T base-pairs also permit a greater degree of minor groove deformation, the merits of which are explained below. Why the second half of the TATA box can be more variable than the first half is also explained below (7).
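
The two consensus sequences above translate directly into regular expressions. The sketch below, which assumes an uppercase or lowercase DNA string for the coding strand, scans a sequence for 8-bp windows that match either consensus. The function name and the test sequences are hypothetical and chosen only for illustration.

```python
import re

# Consensus patterns transcribed from the text (coding strand, 5' to 3').
GENERAL_TATA = "TATA[AT]A[AT][ACGT]"  # position 8 (N) is any base
YEAST_TATA = "TATA[AT]A[AT][AG]"      # position 8 is A or G


def find_tata_boxes(sequence, pattern=YEAST_TATA):
    """Return (start_index, 8-mer) for each match, including overlaps."""
    seq = sequence.upper()
    # A lookahead wrapping a capture group reports overlapping sites too.
    return [(m.start(), m.group(1))
            for m in re.finditer(f"(?=({pattern}))", seq)]


# Hypothetical sequences: the first ends its box in G, the second in C.
print(find_tata_boxes("GGTATAAAAGCC"))
print(find_tata_boxes("GGTATAAAACCC"))
print(find_tata_boxes("GGTATAAAACCC", pattern=GENERAL_TATA))
```

A match to the consensus only flags a candidate site. It does not by itself show that TBP binds there or that the site is functional.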

The TATA-Box Complex differs from other protein-DNA complexes in several ways: first, the interaction occurs at the minor groove rather than the major groove; second, the DNA fragment is drastically distorted; third, the interactions are largely non-polar and hydrophobic (3,7,8). When the TATA box binds TBP, it deviates greatly from B-DNA, the typical structure of DNA. B-DNA helices have a right-handed twist and contain 10 bp per turn of helix, with a mean rotation per base pair of around 36° (5). Formation of the TATA-Box Complex causes the 8-bp TATA box to bend toward the major groove, exposing the minor groove. The exposed minor groove interacts with the ten-stranded β-sheet (five β-strands from each half of TBP) that makes up the underside of the horseshoe-shaped TBP, which splits the 8 bp into two symmetrical 4-bp segments (7). The β-sheet is supported by a scaffolding of four α-helices. Because of this bending, and the need to maintain maximum contact with TBP, the TATA box unwinds significantly: the mean rotation per base pair decreases from 36° to about 18.5° (7). This unwinding of the helix is stabilized by an increase in positive supercoiling of the TATA box. Unwinding changes the shape of the helix in this region from B-DNA to A-DNA (7). A-DNA resembles a shortened and broadened B-DNA; its rise per base pair is 0.23 nm rather than 0.34 nm (5). The minor groove of A-DNA resembles what is seen in the TATA-Box Complex: it is broad but shallow, which allows TBP to interact with the DNA segment.
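
The helical parameters quoted above can be turned into a quick arithmetic check. This is a rough sketch that assumes an idealized, straight helix, which the sharply bent TATA box is not, so the lengths are only indicative. The function names are hypothetical.

```python
def bp_per_turn(twist_deg):
    """Base pairs needed for one full 360-degree turn at a given mean twist."""
    return 360.0 / twist_deg


def axial_length_nm(n_bp, rise_nm):
    """Approximate helix length as base pairs times rise per base pair."""
    return n_bp * rise_nm


B_TWIST, BOUND_TWIST = 36.0, 18.5  # degrees per base pair, from the text
B_RISE, A_RISE = 0.34, 0.23        # nm per base pair, from the text

print(bp_per_turn(B_TWIST))
print(round(bp_per_turn(BOUND_TWIST), 1))
print(round(axial_length_nm(8, B_RISE), 2),
      round(axial_length_nm(8, A_RISE), 2))
```

At 18.5° per base pair, a full turn would need roughly 19.5 bp, almost twice the 10 bp of B-DNA, which gives a sense of how strongly the bound TATA box is unwound.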

Residues that interact with the sugar-phosphate backbone are highlighted in purple. Residues that interact with the bases in the minor groove are highlighted in blue (8). Fifteen residues of TBP interact with DNA bases through van der Waals’ interactions. These interactions are symmetrical with respect to the pseudo-symmetry of TBP’s two domains (8). Several structural interactions between TBP and the TATA box are especially important. One pair of phenylalanine (F) residues in each domain of the TBP molecule accentuates the bending of the minor groove by inserting between the outermost two base-pairs of the TATA box. The side-chains of Phe190 and Phe207 open base-pairs 1 and 2, while those of Phe99 and Phe116 open base-pairs 7 and 8, producing sharp kinks in the helix and de-stacking and pulling apart the base-pairs. Any loss of stacking energy is compensated by the formation of van der Waals’ interactions between the bases and the phenylalanine residues (7,8). Six hydrogen bonds form between TBP side-chains and DNA bases. Two asparagines (Asn69 and Asn159) form four hydrogen bonds to base-pairs 4 and 5. Asn69 interacts with adjacent bases on the anticoding strand, while Asn159 interacts with adjacent bases on the coding strand. Two threonines (Thr124 and Thr215) form two hydrogen bonds, one with A4 and the other with A5 on the anticoding strand (7,8).

The last four bases of the TATA box are more variable than the first four (8), for a couple of structural reasons. Previous studies have measured how transcription activity varies with point mutations in the 8-base coding strand of the TATA box. The results show that substituting T1 with any other base drastically reduces transcription activity. This could be due to the presence of a proline residue (Pro191) (8). Because of the kink between base-pairs 1 and 2 created by Phe190 and Phe207, Pro191 must pack its side chain against A1 of the anticoding strand (the base paired with T1); otherwise, unfavorable steric hindrance occurs and favorable van der Waals’ interactions do not form. However, the residue symmetrical to Pro191 in TBP’s other domain is an alanine (Ala100), which does not impose the same steric restrictions on base selection (8). In addition, alternating T-A patterns are necessary because runs of consecutive adenines (called A-tracts) prevent the minor groove bending needed to form the TATA-Box Complex (8).

Another commonly studied relative of yeast TBP (yTBP) is Arabidopsis thaliana TBP (aTBP; PDB ID: 1VTL) (9). aTBP shares 81% primary structure similarity with yTBP, as shown by protein BLAST (E = 7e-105) (1). All the residues of yTBP that interact with minor groove bases are conserved in aTBP. The regions of yTBP and aTBP that do not align are composed of residues that interact with neither the sugar-phosphate backbone nor the bases of the TATA box. The residue differences that do exist are often conservative changes (e.g., Leu334 to Ile334) (8). Secondary and tertiary structural similarities between yTBP and aTBP, as shown by DALI (Z = 26.8), are consistent with the BLAST results (6). Like yTBP, each domain of aTBP consists of a five-stranded antiparallel β-sheet and two α-helices (6,8). These similarities at all three levels of protein organization reflect the uniformity of TBP function and the homogeneity of TATA box sequences.
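
To make the idea of sequence similarity and conservative changes concrete, the sketch below computes percent identity and percent similarity over a pair of sequences that have already been aligned. It is a simplification and not how BLAST works: BLAST scores similarity with a substitution matrix such as BLOSUM62, whereas this sketch uses an assumed grouping of chemically similar residues and skips gapped columns. The function names, the grouping, and the toy sequences are all hypothetical.

```python
# Assumed grouping of chemically similar residues (illustrative only).
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"),
                  set("DE"), set("NQ"), set("ST")]


def _similar(a, b):
    """True if two residues are identical or share an assumed group."""
    return a == b or any(a in g and b in g for g in SIMILAR_GROUPS)


def alignment_stats(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) over aligned columns.

    Inputs must have equal length; '-' marks a gap. Columns with a gap
    in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    identical = sum(1 for a, b in columns if a == b)
    similar = sum(1 for a, b in columns if _similar(a, b))
    n = len(columns)
    return 100.0 * identical / n, 100.0 * similar / n


# Toy alignment: the Leu-to-Ile change lowers identity but not similarity.
print(alignment_stats("ACDL", "ACDI"))
```

In the toy alignment, the Leu-to-Ile substitution counts against identity but not against similarity, which is the sense in which such a change is conservative.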

Eukaryotic transcription begins with the assembly of the preinitiation complex (PIC; PDB ID: 3J1N) at the proper promoter (10). The PIC is made up of TBP, transcription factors, coregulators, and RNA polymerase (2). Formation of the TBP/TATA-Box Complex plays a central role in creating the PIC at TATA-containing promoters (2). Not all promoters contain a TATA box, however, and these TATA-less promoters still require TBP for function. In Saccharomyces cerevisiae, only about 19% of genes have promoters that contain a TATA box (2). Data show that yeast genes regulated by a TATA box tend to need greater flexibility in expression and to be upregulated under conditions of environmental stress such as heat and starvation (2). In contrast, “housekeeping genes” that are inhibited by stress conditions tend to be TATA-less (2). While the presence of a TATA box does not automatically confer a stress-response function, it does point to a connection between the TBP/TATA-Box Complex and cellular responses that are tightly regulated and require rapid, high transcriptional activity.