
Genetic Code - Ingredient List for Making Protein

The letters of the DNA code are like beads on a string. There are four types of chemical beads, with chemical names adenine, guanine, cytosine and thymine, usually abbreviated A, G, C, T. Two such strings are almost always twisted together, with letters on one string paired to letters on the other string in a definite way. The A always pairs with the T, and the C always pairs with the G. That means that if one string reads from left to right, say, ...AAGTACCTGAAC..., the other must read ...TTCATGGACTTG... Either string has the recipe for copying the other. When a cell divides, each DNA strand can act as a template to build a new complementary strand, and the cell ends up with two DNA molecules where it started with only one.
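The pairing rule is simple enough to write down as a lookup table. Here is a minimal sketch in Python; the names `PAIR` and `complement` are made up for this illustration, and the code follows the simplified letter-by-letter picture used above rather than modeling the chemistry.

```python
# Pairing rule from the text: A pairs with T, and C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Return the letters that pair with `strand`, position by position."""
    return "".join(PAIR[letter] for letter in strand)

top = "AAGTACCTGAAC"
bottom = complement(top)
print(bottom)              # TTCATGGACTTG
print(complement(bottom))  # AAGTACCTGAAC, the original string again
```

Applying the rule twice gives back the starting string, which is the sense in which either string has the recipe for copying the other.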

These strings are extraordinarily long, billions of letters long in complex organisms. Even a simple virus will have about 10,000 letters in its DNA recipe book.

The strings are divided into sentences called genes, and there are two types of genes. One type acts like a switch. The other type is a list of ingredients for making a protein. The switches work by either letting a nearby gene be read, or by blocking it from being read. There are lots of different types of switch and lots of different control strategies for turning genes on or off, but the method of reading a list of ingredients and constructing a protein is universal. All living things use the same process.

It begins with a molecule called RNA which, like DNA, is composed of four types of chemical pieces like beads on a string. The RNA molecule is put together alongside the DNA: a copying enzyme reads the DNA letters one by one and builds the RNA molecule to match. There is a particular DNA letter subsequence that means STOP COPYING HERE. Then the completed RNA molecule drifts away from the DNA and enters the cytoplasm, the part of the cell where proteins are built and used. (There are other types of RNA. The kind we've just described is called messenger RNA.)

In the cytoplasm, the messenger RNA attaches itself by one end to a miniature factory called a ribosome. The job of the ribosome is to build a protein from the ingredient list brought to it by the messenger RNA.

A protein is also a long molecule like a string of beads. Unlike DNA and RNA, which are each built from four types of beads, protein strings are built from twenty different kinds of beads. These are called amino acids. Each of the many different kinds of protein has its own unique order of amino acids. There are always plenty of these amino acid "beads" floating around in the cytoplasm. The task of the ribosome is to snatch them up in the right order and build the string. The right order is determined by the letters making up the messenger RNA, but since that is just a copy of the DNA, it is ultimately the DNA that stores the list of ingredients of a protein.

The correspondence uses a three-letter code. That is, AAA specifies one amino acid, TAC specifies another amino acid, etc. With three positions and four possible letters in each position, there are 4 x 4 x 4 = sixty-four possible combinations. The ribosome needs only twenty combinations to specify that the next amino acid on the chain is one of the twenty types. There are combinations to spare, so more than one triplet can stand for the same amino acid.
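The counting argument can be checked by listing the combinations directly. This is a small sketch using Python's standard `itertools.product`; the variable names are chosen only for this example. It also shows why a two-letter code would not be enough.

```python
from itertools import product

LETTERS = "ACGT"

# Every possible three-letter word over the four-letter alphabet.
triplets = ["".join(word) for word in product(LETTERS, repeat=3)]
print(len(triplets))   # 64
print(triplets[:5])    # ['AAA', 'AAC', 'AAG', 'AAT', 'ACA']

# A two-letter code would give only 4 x 4 words,
# which is fewer than the twenty amino acids that need a name.
pairs = ["".join(word) for word in product(LETTERS, repeat=2)]
print(len(pairs))      # 16
```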

The assembly works like this. The ribosome starts with an attachment point which is an unusual kind of amino acid never found in completed protein molecules. It then reads the first three letters of the messenger RNA, finds the appropriate amino acid and attaches it. There's now a chain of length two. The ribosome then moves forward three letters along the messenger RNA and pushes the oldest end of the amino acid chain "out the back door". Now it's ready to pick up the amino acid specified by the next three letters of the RNA, attach it to the chain, push the oldest end of the growing amino acid chain out, and so on. In this way, a very long chain of amino acids is constructed using the ingredient list carried to the ribosome by the messenger RNA. While the long chain is dangling off the ribosome, the unusual amino acid at its oldest end gets removed, so it's never part of the final chain.

The process ends when the ribosome reads a three-letter STOP code, which marks the end of the ingredient list on the messenger RNA. Then the messenger RNA and the just completed amino acid string drift away into the cytoplasm. The RNA can be reused by other ribosomes to make more identical amino acid chains, over and over until it gets broken up.
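The read-three-letters, attach, move-forward loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CODON_TABLE` is a small subset of the standard genetic code (the full table has sixty-four entries), the codons are written with DNA letters as in the text (real messenger RNA has U wherever these show T), the function name `translate` is made up here, and the special starting attachment point is left out.

```python
# A small subset of the standard genetic code, in DNA letters as in the text.
# Note that two different triplets can stand for the same amino acid.
CODON_TABLE = {
    "AAA": "Lys", "AAG": "Lys",
    "TAC": "Tyr", "TAT": "Tyr",
    "GGC": "Gly", "GAA": "Glu",
    "TAA": "STOP", "TAG": "STOP", "TGA": "STOP",
}

def translate(message):
    """Read `message` three letters at a time and return the amino acid chain."""
    chain = []
    for i in range(0, len(message) - 2, 3):
        amino_acid = CODON_TABLE[message[i:i + 3]]
        if amino_acid == "STOP":
            break                   # the STOP code ends the chain
        chain.append(amino_acid)    # attach the next bead
    return chain

print(translate("AAATACGGCGAATAAGGC"))  # ['Lys', 'Tyr', 'Gly', 'Glu']
```

The letters after the STOP triplet are never read, so the trailing GGC in the example adds nothing to the chain.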

Now we've described the amino acid string as if it were a long thin molecule, but in actuality, it curls itself up in a complex way, depending on the exact sequence of amino acids along its length. Usually the string ends up roughly spherical, with bumps and crevices uniquely determined by the amino acid sequence.

The folding up is entirely automatic. Specify the sequence of amino acids along the string and the shape of the protein is determined, but in a complex way. Some of these amino acids are attracted to one another. Some are attracted to water, so they usually end up on the outside of the protein. Others are water repellent and usually end up on the inside. Often two or more of these chains assemble together. For example, hemoglobin is formed from two chains of one type and two chains of another type. This assembly is also automatic. Once the DNA has encoded the "parts list" of a protein, the rest of the design is automatic.

The function of the protein is determined by its shape. A few proteins are structural, like keratin, used to make hair and fingernails strong. A few, like insulin, are used as signaling molecules, otherwise known as hormones. But by far the most important function is to serve as enzymes.

For nearly every chemical transformation that happens in a living cell, there is an enzyme that has just the right shape to accelerate the reaction. An enzyme is a catalyst. It makes chemical reactions that could happen actually happen.

Consider a zipper. The two halves of a zipper, when separate, could come together and stick. But it requires the patience of a saint to make it happen by pushing and prodding. That's why zippers come with a sliding piece that has just the right shape to bring the two sides of the zipper together in just the right way that they stick. The slide of a zipper functions mechanically the way an enzyme functions chemically. Other enzymes can make a molecule break apart, just like the slide piece can make the joined zipper come apart when we slide it in the reverse direction. Still other enzymes can move a group of atoms from one molecule to another. Almost any chemical reaction that is possible can be helped to happen millions of times more rapidly if the right enzyme is around.

If the slider could run off the end of the zipper, it could be used to zip together another zipper, and another and another. We don't let the sliding piece of the zipper run all the way off the end of the zipper, but an enzyme, on the other hand, is free to float away and encourage other chemical reactions of the same kind.

To summarize all this, we have:

  1. DNA codes a list of protein ingredients in a three letter code.
  2. Messenger RNA copies the coded sequence and carries it to a ribosome.
  3. The ribosome assembles the twenty kinds of amino acids into a chain as specified by the messenger RNA.
  4. The protein curls up into a unique shape automatically.
  5. If the protein is an enzyme, its shape is just right for making certain kinds of molecules come together and react, or for making a certain kind of molecule split apart.

DNA tells ribosomes how to build proteins and the proteins tell the cell how to do all the chemistry it needs to live.

There's an exception to the rule that every living thing uses the same method to build its proteins. In living things more complex than bacteria and viruses, between the time that the messenger RNA gets copied off the gene and the time it gets used by the ribosome, it gets changed. Some of the letters get snipped out of the sentence, making it shorter. That makes it a little bit tricky to predict the amino acid sequence of a protein from the ACTG sequence of its gene. Why cells use this seemingly inefficient way of coding the ingredients list of a protein is not fully understood, although one known benefit is that a single gene can sometimes be snipped in different ways to yield different proteins.
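To see why the snipping makes prediction tricky, here is a minimal sketch in Python. The function name `splice`, the example string, and the snipped positions are all invented for illustration; in a real cell the positions are determined by signals in the sequence itself, which this sketch does not model.

```python
def splice(message, snipped):
    """Remove each (start, end) span in `snipped` from `message`."""
    kept = []
    position = 0
    for start, end in sorted(snipped):
        kept.append(message[position:start])  # keep letters before the span
        position = end                        # skip over the snipped letters
    kept.append(message[position:])           # keep whatever follows the last span
    return "".join(kept)

raw = "AAATACGTTTTTGGCGAA"
# Hypothetical snip: drop the six letters at positions 6 through 11 ("GTTTTT").
print(splice(raw, [(6, 12)]))  # AAATACGGCGAA
```

Read three letters at a time, the raw string has six triplets but the snipped one has only four, so a chain predicted from the raw letters would contain amino acids that never appear in the actual protein.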