Lecture Notes, Etc.

NOTE:

During the first laboratory period (Jan. 3: Introduction, Review and Overview), the following topics were addressed:


Review of PCR and RT-PCR


Nucleic Acid Chemistries and Structures


1. DNA Structure

Sources:
D. Freifelder, Molecular Biology, Jones and Bartlett Publishers, Inc., 2nd. Ed., 1987, pp. 60-68, 79-92.
T. McKee & J. R. McKee, Biochemistry: An Introduction, Wm. C. Brown Publishers, 1996, pp. 460-489.

In 1944, the theoretical physicist Erwin Schrödinger predicted that the hereditary material, whatever its nature, had to be complex (What is Life?, Cambridge University Press). This prediction was based on two considerations: (1) a small molecule would be constantly buffeted by neighboring molecules in its microscopic milieu, producing an inherently unstable condition (thermodynamic argument); and (2) physical laws are statistical and hold only within a probable relative error of 1/√n, where n is the number of molecules involved (statistical physics argument).
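The statistical argument can be made concrete with a short calculation. The sketch below simply evaluates 1/√n for a few molecule counts; the function name and the sample values of n are illustrative choices, not part of Schrödinger's text.

```python
import math


def relative_error(n: int) -> float:
    """Probable relative error, 1/sqrt(n), for a system of n molecules."""
    if n <= 0:
        raise ValueError("n must be a positive number of molecules")
    return 1.0 / math.sqrt(n)


# Illustrative molecule counts (assumed values, chosen only to show the trend)
for n in (100, 10_000, 1_000_000):
    print(f"n = {n:>9,d}   relative error = {relative_error(n):.4f}")
```

The error shrinks only as the square root of n, so a process that depends on a handful of molecules is far less predictable than one that depends on millions.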

Major determinants of DNA structure

DNA Figures

Variations in DNA structure:

The structural model first proposed by Watson and Crick corresponds to the B form of DNA, which is typically prepared as the sodium salt under conditions of high humidity. Under conditions of partial dehydration, however, DNA assumes an A form, in which the base pairs are no longer at right angles to the helical axis, but are tilted 20° from the horizontal. The A form is observed when DNA is extracted with solvents such as ethanol; the significance of the A form under physiological conditions is not currently apparent. The Z form, named for its "zigzag" conformation, represents the most radical departure from the B form. DNA segments composed of alternating purine and pyrimidine bases, especially CGCGCG, are most likely to adopt the Z configuration. In contrast to the A form, Z-DNA may occur under physiologically relevant conditions, although this is not firmly established. The table below lists the characteristics that differentiate the A, B and Z forms of DNA.

DNA Forms

Characteristic | A-DNA | B-DNA | Z-DNA
Diameter (D) | 2.6 nm | 2.0 nm | 1.8 nm
bp/turn | 11 | 10.4 | 12
Degrees rotation/bp | +32.7 | +34.6 | -30.0
Axial distance/turn | 2.8 nm | 3.4 nm | 4.5 nm
Axial distance between bp | 0.25 nm | 0.33 nm | 0.38 nm
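
The rows of this table are not independent: the rotation per base pair is 360° divided by bp/turn, and the axial distance per turn is bp/turn multiplied by the axial distance between base pairs. The sketch below recomputes both from the tabulated values. The dictionary layout and function names are illustrative, and the sign convention (positive for right-handed, negative for left-handed) is taken from the signs in the table.

```python
# bp/turn and axial distance between bp (nm), copied from the table above.
# handedness: +1 for the right-handed A and B forms, -1 for left-handed Z-DNA.
FORMS = {
    "A-DNA": {"bp_per_turn": 11.0, "rise_nm": 0.25, "handedness": +1},
    "B-DNA": {"bp_per_turn": 10.4, "rise_nm": 0.33, "handedness": -1 * -1},
    "Z-DNA": {"bp_per_turn": 12.0, "rise_nm": 0.38, "handedness": -1},
}


def rotation_per_bp(bp_per_turn: float, handedness: int) -> float:
    """Signed helical rotation per base pair, in degrees."""
    return handedness * 360.0 / bp_per_turn


def axial_distance_per_turn(bp_per_turn: float, rise_nm: float) -> float:
    """Axial length of one full helical turn, in nm."""
    return bp_per_turn * rise_nm


for name, f in FORMS.items():
    rot = rotation_per_bp(f["bp_per_turn"], f["handedness"])
    turn = axial_distance_per_turn(f["bp_per_turn"], f["rise_nm"])
    print(f"{name}: {rot:+.1f} degrees/bp, {turn:.2f} nm/turn")
```

The recomputed values agree with the table to within rounding of the tabulated entries.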

H-DNA: Under certain conditions, e.g., low pH, a DNA segment consisting of a long polypurine strand base-paired to a polypyrimidine strand can form a triple helix, as illustrated below. The formation of this triple helix, termed H-DNA, depends on the phenomenon of nonconventional base-pairing (Hoogsteen base-pairing), which occurs without disruption of Watson-Crick base pairs (figure below). H-DNA formation has been observed to occur under physiological conditions and has been postulated to have a role, not yet clarified, in gene expression.

Hoogsteen Base-pairing

Supercoiled DNA: A circular DNA molecule having the characteristics of B-DNA, i.e., 10.4 bp/turn, is termed "relaxed" (figure below, panel a). The properties of such closed helical circles can be described by several quantitative parameters. The number of times one strand crosses the other is called the linking number (L). In a relaxed circular DNA molecule, L is equal to the total number of base pairs divided by the number of base pairs per helical turn. The number of helical turns that occur in such a molecule is defined as the twist (T). For example, if there are 260 bp in a relaxed, circular DNA molecule, then L = 25 (i.e., 260/10.4) and T = 25. A DNA molecule's linking number can change only if a covalent bond in the backbone of one or both strands is broken and subsequently resealed.

Twist, however, can vary freely. If the relaxed circular DNA molecule is held and twisted a few times, it will adopt the shape shown in the figure below (panel b). When this twisted molecule is laid back on the flat surface, it will rotate so as to eliminate the twist. However, if the molecule is cut before it is twisted (e.g., by a topoisomerase) and its strands are then resealed, this "operation" changes both the linking number and another parameter, referred to as the writhing number (W). The writhing, or supercoiling, number is defined as the number of superhelical turns that occur in a molecule's three-dimensional conformation. As with T values, the W number of a molecule can vary. The relationship between these parameters is described by the following equation: L = T + W. When L changes from its relaxed value, the strain introduced into the molecule is partitioned between the T and W values. Therefore, molecules with the same linking number can have very different three-dimensional shapes.

The circular DNA molecule can be twisted either to the right or to the left before its strands are resealed. If it is twisted to the left, the L number decreases, and the molecule's conformation is said to be underwound, or negatively supercoiled. (Note that negative supercoiling, which can cause a decrease in a DNA molecule's T value, promotes the formation of Z-DNA.) If, instead, the molecule is twisted to the right, it becomes overwound (the L number is higher), or positively supercoiled. All naturally occurring superhelical DNA molecules are initially underwound, i.e., they form negative superhelices. Also, the degree of twisting (superhelical density) is about the same for all molecules, namely, one negative twist per 200 base pairs, or about 0.05 negative twist per helical turn (200 bp is about 20 helical turns, and 1 twist / 20 turns = 0.05 twist per turn).
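
The bookkeeping behind L = T + W can be sketched in a few lines. The example below assumes a hypothetical 5200-bp plasmid and uses the common definition of superhelical density as (L - L0)/L0, where L0 is the relaxed linking number; that definition and all function names are assumptions added for illustration, not taken from the sources above.

```python
BP_PER_TURN = 10.4  # relaxed B-DNA, as stated in the text


def relaxed_linking_number(n_bp: float, bp_per_turn: float = BP_PER_TURN) -> float:
    """Linking number of a relaxed circle: total bp / bp per helical turn."""
    return n_bp / bp_per_turn


def writhe(linking_number: float, twist: float) -> float:
    """Rearrangement of L = T + W."""
    return linking_number - twist


def superhelical_density(linking_number: float, relaxed_l: float) -> float:
    """Change in linking number per relaxed helical turn (assumed definition)."""
    return (linking_number - relaxed_l) / relaxed_l


# Hypothetical plasmid: 5200 bp, underwound by one turn per 200 bp
n_bp = 5200
l0 = relaxed_linking_number(n_bp)   # about 500 helical turns when relaxed
l = l0 - n_bp / 200                 # 26 turns removed before resealing

# The same deficit in L can appear entirely as writhe or entirely as twist
print("W if T stays at its relaxed value:", round(writhe(l, l0), 1))
print("T if the molecule stays flat (W = 0):", round(l - 0, 1))
print("superhelical density:", round(superhelical_density(l, l0), 3))
```

The two printed extremes correspond to a fully supercoiled molecule and a flat, partially unwound one; real molecules partition the strain between T and W.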

Linear, Circular and Supercoiled DNA

In addition to forming a supercoil, a strained circular negatively supertwisted DNA molecule could remain flat. If it does so, however, the T value of the molecule decreases. In this case, the strain introduced by the negative supercoiling process is relieved by a partial unwinding of the double helix (figure below). In cells, the formation of a supercoil requires less energy than the disruption of base pairs. However, cells apparently underwind DNA because it facilitates strand separation (e.g., during DNA replication and transcription), a process that is regulated by specific enzymes. DNA negative supercoiling also explains the propensity of certain DNA sequences to form cruciforms (vide infra) and H-DNA (vide supra).

Relaxed and Supertwisted DNA

Cruciforms are cross-like DNA structures which are likely to form when a DNA sequence contains a palindrome or, in DNA parlance, an inverted repeat (panel a). In one proposed mechanism, cruciform formation begins with a small "bubble", or protocruciform, and progresses as intrastrand base pairing occurs (panel b).
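
An inverted repeat can be located computationally by searching for a short segment that is followed, after an optional spacer, by its own reverse complement on the same strand. The sketch below is a minimal, brute-force illustration; the function names, the default arm length and the spacer limit are arbitrary assumptions, and the test sequence is made up.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, arm: int = 4, max_loop: int = 6):
    """Return (start, arm_sequence, loop_length) for every arm of length `arm`
    that is followed, within `max_loop` bases, by its reverse complement."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for loop in range(max_loop + 1):
            j = i + arm + loop          # start of the candidate right arm
            if j + arm > len(seq):
                break
            if seq[j:j + arm] == target:
                hits.append((i, left, loop))
    return hits


# Made-up sequence: two 5-bp arms separated by a 6-base spacer
print(find_inverted_repeats("TTGCATCCCCATGCAA", arm=5, max_loop=6))
# A perfect palindrome such as the EcoRI site has no spacer at all
print(find_inverted_repeats("GAATTC", arm=3, max_loop=0))
```

Each hit marks a segment whose two arms could pair with one another within a single strand, which is the prerequisite for the stem of a cruciform.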

Inverted Repeats, Cruciforms and Protocruciforms

Structural factors responsible for DNA stability

Structure/feature | Contribution to DNA stability
2'-deoxyribose sugar | Prevention of phosphodiester bond fragmentation by β-elimination at alkaline pH.
Double helix | Redundancy of information through base-pair complementation.
Stacked array of bases within a central, protected (hydrophobic) environment, surrounded by a highly charged, hydrophilic surface | Decrease of vulnerability to "attack on bases" by water-soluble compounds.
High chemical stability of bases (except for cytosine) | Maintenance of DNA replication/coding stability. The appearance of uracil (U) as a deamination product of cytosine (C) serves as a convenient "tag" for error correction.

Note: By comparison, RNA lacks many of the structural features that confer long-term stability.

Melting curves - proving ground for the structural features of DNA

The figure below illustrates a theoretical hyperchromic shift associated with DNA strand separation (denaturation) as a function of temperature. These simple, yet elegant, studies were instrumental in providing direct evidence for a number of key structural features of DNA.

  1. How are these experiments performed?
  2. What accounts for the hyperchromic shift?
  3. Effect of base composition?
  4. What would you predict to be the effect of electrostatic repulsion (DNA phosphate groups) and the concentration of counterions?
    a. inorganic monovalent, divalent?
    b. organic (polyamines)?
  5. What would you predict to be the effect of hydrogen bond "breakers", e.g., formamide, urea?
  6. What would your prediction be concerning the effects of reagents, e.g., ethanol, that decrease the hydrophobicity of the internal base-stacked core?
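
As a starting point for question 1, the sketch below shows one way a melting temperature (Tm) might be read from absorbance data: the absorbance rise at 260 nm is rescaled to a fraction denatured between 0 and 1, and Tm is taken as the temperature at which that fraction reaches 0.5. The data points are invented for illustration, and the function names and the linear interpolation are assumptions rather than a prescribed laboratory procedure.

```python
def fraction_denatured(a260: float, a_native: float, a_denatured: float) -> float:
    """Rescale an A260 reading to 0 (fully native) .. 1 (fully denatured)."""
    return (a260 - a_native) / (a_denatured - a_native)


def melting_temperature(temps_c, a260_values) -> float:
    """Temperature at which half of the hyperchromic shift has occurred.

    Assumes the first reading is fully native and the last fully denatured,
    and interpolates linearly between the two readings that bracket 0.5.
    """
    a_native, a_denatured = a260_values[0], a260_values[-1]
    theta = [fraction_denatured(a, a_native, a_denatured) for a in a260_values]
    for k in range(len(theta) - 1):
        f0, f1 = theta[k], theta[k + 1]
        if f0 <= 0.5 <= f1 and f1 > f0:
            t0, t1 = temps_c[k], temps_c[k + 1]
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("curve never crosses the half-denatured point")


# Invented readings (temperature in degrees C, relative absorbance at 260 nm)
temps = [60, 70, 80, 85, 90, 95, 100]
a260 = [1.00, 1.00, 1.02, 1.10, 1.28, 1.36, 1.37]
print(f"Tm is roughly {melting_temperature(temps, a260):.1f} degrees C")
```

Repeating the calculation for curves obtained at different base compositions, salt concentrations or denaturant concentrations is one way to explore questions 3 through 6.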

Melting Curve and Hyperchromic Shift

2. RNA Structure

Ribosomal RNA (rRNA), a structural molecule, is the most abundant form of cellular RNA, constituting approximately 80% of the total RNA. As illustrated below, the secondary, base-paired structure of rRNA is extraordinarily complex and is conserved, even though there are species-specific differences in primary sequence. rRNAs are only one component of ribosomes; the other is protein. For example, the large ribosomal subunit of E. coli consists of 5S rRNA, 23S rRNA and 34 proteins, while the small subunit is composed of 16S rRNA and 21 proteins. In contrast, a typical large eukaryotic ribosomal subunit contains 28S, 5.8S and 5S rRNAs and 49 ribosomal proteins. The small eukaryotic subunit consists of 18S rRNA and ca. 30 proteins.

Base-paired RNA folding patterns for 16S rRNAs of E. coli (panel a) and Saccharomyces cerevisiae (panel b)

Transfer RNAs (tRNAs), structural and informational molecules, are small, single-stranded nucleic acids ranging in size from 73 to 93 nucleotides. The base sequence of a yeast tRNA(Ala) was first determined in 1965 (Robert Holley, Cornell). It was noticed that several segments of this molecule could form double-stranded regions that would give the molecule a cloverleaf structure in which open loops are connected to one another by double-stranded stems. To date, several hundred different tRNA molecules from a variety of organisms have been sequenced and, for each sequence, base pairing can be arranged to produce the cloverleaf conformation. Careful comparisons of these sequences have demonstrated that certain features are common to almost all of the molecules. This has led to the idea of a "consensus" tRNA molecule consisting of 76 nucleotides arranged in a planar, cloverleaf form (figure below, left panel). By convention, the nucleotides are numbered 1 through 76 starting from the 5'-P terminus. This 76-base sequence is sufficiently fundamental that, in tRNA molecules having more than 76 bases, the additional ones can usually be recognized as additions to the standard sequence. The most common additions follow standard bases 17, 20 and 47; the "extra" bases are numbered 17:1, 17:2, 20:1, 20:2, 47:1, 47:2, etc.

Standard features of tRNAs (Comments refer to figure below)

  • Bases in certain positions are nearly invariant. These include 8, 11, 14, 15, 18, 19, 21, 24, 32, 33, 37, 48, 53, 54, 55, 57, 58, 60, 61, 74, 75, and 76.
  • The 5'-P terminus is always base-paired.
  • The 3'-OH terminus is always a four-base, single-stranded region (...NCCA-3'-OH), where N = any base. This is the CCA or acceptor end.
  • tRNA contains many modified bases, e.g., dihydrouridine (DHU), pseudouridine (ψ), inosine (I), etc., and some of the loops are named on this basis.
  • There are 3 large loops. The anticodon loop contains 7 bases with the general structure: 5'-Py-U-XYZ-Pu(modified)-N or 5'-C/U-U-XYZ-A/G(modified)-N, where XYZ is the anticodon. The DHU loop is composed of bases 14 to 21, but is not of constant size. The TψC loop (bases 54 to 60) usually contains the sequence TψC.
  • tRNA contains 4 double-stranded regions termed stems, or arms.
  • The loop of highly variable size is designated the "extra arm" (bases 44 to 48).

Cloverleaf and X-ray Models of tRNA

Although the tRNA molecule is conveniently depicted as a planar cloverleaf, x-ray crystallographic analysis shows that its three-dimensional structure is actually more complicated. The figure above (right panel) shows a model of yeast tRNA(Phe). The major features of the three-dimensional structure are the following:

  1. All base pairs proposed in the cloverleaf model exist.
  2. The molecule is folded and has two helical double-stranded branches, each about 60 angstroms long and mutually perpendicular. One branch consists of the acceptor stem and the TψC stem; the other consists of the stems of the DHU and anticodon loops. Thus, the molecule is L-shaped.
  3. The CCA acceptor terminus is at one end of the L and the anticodon loop is at the other end, about 80 angstroms away.
  4. The TψC loop, which is thought to interact with ribosomal RNA, is at the corner of the L.
  5. All bases except five (16, 17, 20, 47 and 76) are stacked, even though half of the bases are not in double-stranded helical regions. Even the bases of the single-stranded 3'-terminal segment (except for the terminal adenine) are stacked.
  6. There are nine base pairs that are not shown in the cloverleaf model. These are called tertiary base pairs because they are responsible for the three-dimensional folding. Only one of these, a G:C pair, is a standard Watson-Crick pair.
  7. In addition to the tertiary base pairs described in item 6, the tertiary structure is stabilized by hydrogen bonds between bases and different ribose units and between ribose units in separate nucleotides.

Messenger RNA (mRNA), a purely informational molecule, is "fixed of purpose" in all cells, i.e., it is translated into protein via the genetic code. This RNA class shares the potential for extensive secondary structure through base-pairing, as seen in tRNA and rRNA. Even so, there are important differences in the structures of prokaryotic and eukaryotic mRNAs:

Characteristic | Prokaryotic mRNA | Eukaryotic mRNA
Transcription unit | Polycistronic | Monocistronic
Elongation | ca. 40 nucleotides/sec | ca. 40 nucleotides/sec
5' end | Triphosphate | Methylated cap
3' end | Last base of mRNA | Poly(A)
Translation | ca. 45 nucleotides/sec | ca. 20 nucleotides/sec
Half life | 5 min | 4 hours
Ribosomes/mRNA | 20 | 10
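
The translation rates in the table can be turned into rough time estimates. The sketch below converts nucleotides per second into codons (amino acids) per second and estimates how long one ribosome would take to traverse a coding region. The 900-nucleotide coding region is a hypothetical example, the function names are illustrative, and the result ignores initiation and termination.

```python
# Approximate translation rates from the table above (nucleotides per second)
TRANSLATION_NT_PER_SEC = {"prokaryotic": 45, "eukaryotic": 20}


def codons_per_sec(kind: str) -> float:
    """Amino acids added per second, at three nucleotides per codon."""
    return TRANSLATION_NT_PER_SEC[kind] / 3


def translation_time_sec(coding_nt: int, kind: str) -> float:
    """Seconds for one ribosome to read a coding region of coding_nt nucleotides."""
    return coding_nt / TRANSLATION_NT_PER_SEC[kind]


coding_nt = 900  # hypothetical coding region: 300 codons
for kind in TRANSLATION_NT_PER_SEC:
    print(
        f"{kind}: {codons_per_sec(kind):.1f} codons/sec, "
        f"{translation_time_sec(coding_nt, kind):.0f} sec per polypeptide"
    )
```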
