Origins III

Genetic Methods of Analysis

The fossil and cultural methods used above are not the only methods of evaluating human origins; genetic means can also be used. These methods are built on population genetics, which is itself based on a few simple premises. The primary foundation of population genetics is the Infinite Alleles Model, which assumes that each new mutation is different from every other mutation. This is a reasonable assumption because the number of possible mutations in the human genome is very high while the number of mutations that actually occur is low. This leads to the idea of identity by descent: if two people in a limited population have the same mutation, they most likely share a common ancestor who had that mutation. By extending this reasoning, scientists can search for a common ancestor for all people of one family, one population, or even one species. This is done by comparing the mutations present in specific populations from specific areas and relating their rates. These comparisons can draw on several kinds of markers, including single nucleotide polymorphisms (SNPs), restriction fragment length polymorphisms (RFLPs), protein and antibody typings, microsatellite data, Variable Number Tandem Repeats (VNTRs), and Alu insertions.
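
The identity-by-descent reasoning above can be sketched in a few lines of Python. This is only an illustration under the infinite alleles assumption: the individuals and the mutation labels (m1, m2, ...) are hypothetical, and each label stands for one unique mutation event.

```python
def shared_mutations(person_a, person_b):
    """Mutations carried by both people. Under the infinite alleles
    assumption, each one is taken as inherited from a common ancestor."""
    return set(person_a) & set(person_b)


# Hypothetical individuals, each described by the mutations they carry.
alice = {"m1", "m2", "m5"}
bob = {"m1", "m2", "m7"}
carol = {"m1", "m9"}

print(sorted(shared_mutations(alice, bob)))    # ['m1', 'm2']
print(sorted(shared_mutations(alice, carol)))  # ['m1']
```

In this toy data set Alice and Bob share more mutations with each other than either shares with Carol, which would be read as a more recent common ancestor between the two of them.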

SNPs are instances where one base has been substituted for another, that is, point mutations. On their own they are not usually very helpful, as observing them in significant numbers requires large-scale sequencing of the genome. It is therefore not economical for general evolutionary studies to analyze SNPs directly, because the large number of people and traits necessary for consistent results becomes very costly. However, SNPs enable RFLPs to be used in evolutionary studies. RFLPs make use of restriction enzymes, which cut DNA at specific base sequences. These sequences, usually four to eight base pairs long, will only be cut if they exactly match the sequence recognized by the specific restriction enzyme being used. Because of this, a point mutation inside a recognition site will prevent the restriction enzyme from cutting there, and the resulting change in fragment lengths is easily observed through electrophoresis, usually using polyacrylamide gel.
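
The effect of a point mutation on a restriction digest can be sketched as follows. This is a simplified model, not a laboratory protocol: the DNA sequence is made up, the recognition site GAATTC is that of the enzyme EcoRI, and the cut is placed at the start of the site rather than at the enzyme's true cutting position.

```python
def digest(seq, site):
    """Return the fragment lengths left after cutting seq at every
    occurrence of the recognition site (cut placed at the site's start)."""
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        if pos > start:
            fragments.append(pos - start)
        start = pos
        pos = seq.find(site, pos + 1)
    fragments.append(len(seq) - start)
    return fragments


SITE = "GAATTC"  # EcoRI recognition sequence

# Hypothetical 24-base sequence with two recognition sites.
wild_type = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"
# Point mutation (A to C) inside the second site: GAATTC becomes GACTTC.
mutant = wild_type[:18] + "C" + wild_type[19:]

print(digest(wild_type, SITE))  # [4, 12, 8]
print(digest(mutant, SITE))     # [4, 20]
```

The single base change removes one cut, so the 12-base and 8-base fragments are replaced by one 20-base fragment. That difference in band pattern is what shows up on the gel.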

Protein and antibody typing are generally not very helpful for H. sapiens evolutionary analysis, but they are useful for both extremely long-term and short-term evolutionary studies. Proteins are generally highly conserved, as many possible changes could drastically alter protein function. Because mutations in protein-coding genes are often deleterious and tend to be removed by selection, the mutations these genes do carry are relatively stable. This makes it possible to see evolutionary changes over the long term. By sequencing, and thus learning the specific mutation, comparisons can be made between populations. This is generally most useful for analyzing short-term population history (in this case population history without significant migration) or for interspecies analysis.

Microsatellites are tandem arrays in which a few bases are repeated many times in a row. Different individuals have different numbers of repeats, and this can be analyzed as a means of distinguishing the relatedness of populations. Variable Number Tandem Repeats (VNTRs, 2.5 to 20 kilobases repeated) can be used towards the same end, and have the additional benefit that they can be analyzed through polyacrylamide gel electrophoresis. Each of these methods has proven to be fairly useful, because these elements are generally present in large quantities in the genome and usually are not very highly conserved, making for a moderate mutation rate. A potential mechanism for these mutations is also known, providing a level of security in any data resulting from experiments of these types (because knowledge of the mechanism precludes, in this case, a different mutation rate in the past). However, both microsatellites and VNTRs suffer from back mutations and parallel evolution (Mountain). This is because the mechanism operates during DNA replication and involves a failure of replication fidelity that leads to either more or fewer repeats. The mechanism is much the same in every person, and therefore it is likely that back mutations (mutations that undo a prior mutation; for example, going from 18 repeats to 21 and back to 18) have occurred and that different populations may arrive at the same range of repeat numbers through completely independent means. These limitations, though, can be partially compensated for by checking and comparing large numbers of microsatellites and VNTRs.
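
A small Python sketch can make the back mutation problem concrete. It assumes a simplified stepwise picture in which each mutation adds or removes whole repeats; the sequence, the repeat unit CA, and the three lineage histories are all hypothetical.

```python
def count_repeats(seq, unit):
    """Number of repeat units in the longest uninterrupted tandem run."""
    best = 0
    i = 0
    while i <= len(seq) - len(unit):
        run = 0
        while seq.startswith(unit, i + run * len(unit)):
            run += 1
        best = max(best, run)
        i += max(1, run * len(unit))
    return best


def apply_steps(start, steps):
    """Apply a series of gains or losses of repeats to a starting count."""
    count = start
    for step in steps:
        count += step
    return count


# Reading a repeat count from a made-up sequence with 18 CA repeats.
print(count_repeats("TTG" + "CA" * 18 + "GGT", "CA"))  # 18

lineage_a = apply_steps(18, [])            # never mutated
lineage_b = apply_steps(18, [+3, -3])      # 18 -> 21 -> 18, a back mutation
lineage_c = apply_steps(15, [+1, +1, +1])  # independent route to 18

print(lineage_a, lineage_b, lineage_c)  # 18 18 18
```

All three lineages end with 18 repeats even though their histories differ, so the repeat count at a single locus cannot tell them apart. This is why many loci have to be compared.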

Finally, Alu insertions are proving to be perhaps the most useful of all genetic methods. Alu inserts are members of the Alu family of retroposons. There are perhaps 500 000 of these in the human genome, all of which are apparently pseudogenes that integrated and replicated in the genome at distinct periods in the past, some of them very long ago. There have been about 500 new Alu inserts since the human/ape divergence approximately six to eight million years ago, all of which were strictly one-time events (unlike VNTR and microsatellite mutations), which makes them extremely useful for population studies. A simple agarose gel electrophoresis (simpler and less time-consuming to run than polyacrylamide) can determine whether an individual has one copy, two copies, or no copies of the specific Alu insert in question. This is because Alu repeats are much longer (and therefore easier to spot through electrophoresis) than VNTRs and microsatellites. Alu insertions appear to be the best method of analysis available at this time, as a simple present/absent test can rule out a specific population group as the source of an individual's ancestry. (http://www-bio.llnl.gov, Perna).
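
The present/absent test can be sketched as a small scoring routine. The assumptions here are not from the sources above: the locus is taken to give a short band when the Alu insert is absent and a longer band when it is present, and the band sizes (100 and 400 base pairs), the tolerance, and the gel lanes are hypothetical values chosen for illustration.

```python
def alu_copies(band_sizes_bp, empty_bp, filled_bp, tolerance_bp=20):
    """Infer how many copies (0, 1, or 2) of one Alu insert an individual
    carries from the band sizes seen in that individual's gel lane."""
    has_empty = any(abs(b - empty_bp) <= tolerance_bp for b in band_sizes_bp)
    has_filled = any(abs(b - filled_bp) <= tolerance_bp for b in band_sizes_bp)
    if has_filled and has_empty:
        return 1
    if has_filled:
        return 2
    if has_empty:
        return 0
    raise ValueError("no band near either expected size")


def insertion_frequency(copies_per_person):
    """Fraction of chromosomes in the sample that carry the insert."""
    return sum(copies_per_person) / (2 * len(copies_per_person))


EMPTY_BP, FILLED_BP = 100, 400  # hypothetical band sizes
gel_lanes = [[400], [100, 400], [105], [98, 402]]  # one lane per person

copies = [alu_copies(lane, EMPTY_BP, FILLED_BP) for lane in gel_lanes]
print(copies)                       # [2, 1, 0, 1]
print(insertion_frequency(copies))  # 0.5
```

Comparing the insertion frequency across populations, one insert at a time, is the kind of population comparison the paragraph above describes.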

The following sections, one each on mtDNA, the Y chromosome, and nuclear DNA, discuss the opportunities for analysis that these genomes offer with these methods, along with a brief summary of the results obtained thus far.

mtDNA

Mitochondria are the powerhouses of the cell. They are responsible for producing ATP for cellular use and are present in large numbers (hundreds or thousands) in each cell. They reside in the cytoplasm, and they appear to have originated as free-living bacteria. Now they are totally dependent on the cells in which they live, and those cells are dependent on them. Evidence for this is in the genetic code: mitochondria, while having DNA of their own, do not encode all of their own proteins. Some are encoded in the nucleus, indicating that mitochondria have permanently become a part of the cell.

One of the good things about mitochondria, at least in terms of genetics, is that they are maternally inherited. When an egg and a sperm cell are created, each, being highly specialized, is given only certain features. A sperm is given the means of motility it uses to reach the egg, a haploid set of genetic information, and the enzymes necessary to penetrate the egg's outer layers in order to deliver "his" genes. An egg is given all of the basic materials the future organism will need, including mitochondria. The sperm's mitochondria (like its other cytoplasmic features) make no lasting contribution to the new embryo. This is why cytoplasmic features such as mitochondria are maternally inherited. This helps genetic analysis because it reduces the working population size by half (only the females count). Also, the mitochondrial genome is much smaller than the nuclear genome, making it easier to work with.

This is not the only factor that enhances the attractiveness of mitochondrial research. Also very important is the hypervariable region of the mitochondrial genome. In this region the mitochondrial DNA mutates at a higher than usual rate, which allows a larger number of mutations to accumulate in a shorter time. Unfortunately, the high mutation rate also increases the likelihood of back mutations. However, the higher number of mutated loci in a sample provides additional opportunities to compare and crosscheck results against other loci, partially resolving that issue.

The other main feature of mtDNA is that it is largely non-recombinant. This is extremely important. Through recombination, mutations can be hidden, lost, or added to. A lineage may have centuries' worth of mutations and information stored up, and one recombination event can hide that record from researchers. This is not an exaggeration; it really does happen and it really is a problem.

Finally then, mtDNA is attractive to study because there is a large amount of it. There are possibly thousands of copies in each cell, so extracting mtDNA is not as difficult as extracting and purifying nuclear DNA. This enables scientists to evaluate ancient samples (as done with a Neanderthal specimen) for genetic studies.

Mitochondrial analysis has been carried out primarily by comparing genetic sequences and evaluating microsatellite data. Alu repeats are not present in the mitochondrial genome, and therefore are not used. Results gained from investigations into the mitochondria indicate fairly precise answers. Most indicate an African origin approximately 200 000 years ago. It should be noted that the existence of a Most Recent Common Ancestor (MRCA) 200 000 years ago does not indicate that any specific event occurred 200 000 years ago. Instead it means that, in this case, only one woman in the population at that time carried mitochondria from which all present-day mitochondria are derived. There are possible errors with this dating, but these will be discussed in the conclusion.
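
The point that an MRCA arises without any special event can be illustrated with a toy simulation. This is a sketch under strong assumptions that are not taken from the studies above: a constant population of women, non-overlapping generations, and each daughter drawing her mother at random. The population size and seed are arbitrary.

```python
import random


def generations_until_one_lineage(n_women, seed=0):
    """Follow maternal lineages forward in time until every woman carries
    mtDNA descended from a single founder. Returns (generations, founder)."""
    rng = random.Random(seed)
    # Each founding woman starts with her own mtDNA lineage label.
    lineages = list(range(n_women))
    generations = 0
    while len(set(lineages)) > 1:
        # Each woman in the next generation inherits a random mother's mtDNA.
        lineages = [rng.choice(lineages) for _ in range(n_women)]
        generations += 1
    return generations, lineages[0]


generations, founder = generations_until_one_lineage(50, seed=1)
print(f"All mtDNA traces to founder {founder} after {generations} generations")
```

The founder whose lineage survives is not special in any way. The other founding women simply had lines of descent that, at some point, passed through no daughters.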

Y chromosome

Originally, both of the sex chromosomes (X and Y) were vertebrate autosomes. However, due to a repeated pattern of additions and recombination on the X chromosome and deletions and rearrangements on the Y, the Y chromosome shrank to its current size of just 60 Mb, making it one of the smallest chromosomes. Within these 60 Mb there are at least 26 genes, 17 of which lie on the NRPY, or non-recombinant portion of the Y chromosome. This is important, as assaying population history is made much easier if the genes have been localized to one specific area for a long period of time (as discussed in the section on mtDNA). Therefore, the existence of the NRPY is very important to evolutionary studies.

Also important are the numbers of RFLPs and VNTRs found on the Y chromosome. There appear to be few RFLPs (relative to autosomes) in the NRPY. This is one failing of the chromosome, in that the infrequency of RFLPs limits their usefulness for assaying genetic relatedness. Once the number of RFLPs was found to be low, scientists turned to VNTRs. The Y chromosome appears to be much richer in these repeats, but they are not as helpful, as they generally undergo parallel evolution because a common process drives their development. Microsatellites can be used fairly well, and the NRPY has large numbers of them. The problem with microsatellites, however, is that they are easily subject to back mutations, which makes it difficult to assay population history. That leaves Alu repeats. There are five hundred thousand or so in the human genome, and it appears (somehow) that only one of them has inserted into the Y chromosome since the divergence between humans, chimpanzees, and gorillas. Unfortunately, this does not allow much opportunity to study human diversity and origins. However, it does offer the possibility of resolving the Homo-Pan-Gorilla trichotomy of relationships. (Hammer and Zegura)

The end result is that the Y chromosome, although it has some unique problems, is very useful to study. Some mtDNA has been found to undergo recombination, which could complicate future studies (and the results of past ones), whereas none of the NRPY has ever been observed recombining. Like mtDNA, the Y chromosome is uniparentally inherited, in this case through the father. Because it is through the father, an additional advantage is gained. Assuming 50% of the population is male and 50% is female, any study of mtDNA will have to include 50% of the population (all of the females). However, only 25% of the sex chromosomes in a population are Y chromosomes, so by using the Y chromosome instead of the mitochondria, you effectively reduce the size of the population study by another half, making it easier to work with. Also, the Y chromosome appears to be much less mutable than mtDNA, which allows more confidence in longer-ranging estimates regarding population history. The Y chromosome is also present in only one copy per cell, unlike mtDNA, which is present in many more copies and possibly in slightly different varieties. And finally, through analysis of the Y chromosome, it is possible to develop not just a population history but also a culture history, in terms of perhaps percent male, percent male in the breeding pool, migratory patterns of males, and so on. (Mitchell)

Overall then, there is much less data available regarding the Y chromosome as it relates to population studies, but that appears certain to change as more and more benefits emerge. It is even easier to work with than mtDNA and can be used for long-term studies with more confidence. Although until now the Y chromosome has often been passed over in favor of the mitochondria, research in this area is beginning to increase and interesting results are appearing. For instance, Hammer and Zegura reported MRCAs between 170 000 and 190 000 years ago, and other reports have generally centered on 200 000 years ago, much like the mtDNA MRCAs. Again like mtDNA, there can be complications (which will be discussed in the conclusion).

Nuclear DNA

Nuclear DNA can be (and has been) used to investigate human origins as well. However, it is not quite as easy to work with as the Y chromosome or the mitochondria. Although nuclear DNA is much more stable than mtDNA (which allows higher confidence in results and better long-term analysis), it does undergo recombination. This greatly hinders the ability to use the nuclear genome for evolutionary studies. In addition, the nuclear genome is much larger than the mitochondrial one, which again greatly increases the difficulty of study. Also, the effective population size for a nuclear DNA based study is twice that of mtDNA and four times that of the same study using the Y chromosome. This is again a major disadvantage. Because of these facts it will take a substantial amount of time for the genome to be thoroughly analyzed. But it will be analyzed eventually, because despite these large problems there are definite positives to the study of nuclear DNA as it relates to human origins.

The two primary methods for examining nuclear DNA are Alu repeats (of which there are many) and microsatellites. The sheer abundance of these makes nuclear DNA quite attractive to study. With approximately 500 Alu repeats occurring in single events following the human-ape divergence, a very large amount of information is definitely stored in the nuclear genome. Recombination hinders the interpretation of this data, but the data can be assembled into a coherent picture nonetheless. Furthermore, the large genome provides ample opportunity to crosscheck work, ensuring accurate results (Mountain).

This has been done in several studies, and the results are similar to those found through mtDNA, the Y chromosome, and the non-genetic means. While some nuclear DNA studies push the date of the MRCA back to 500 000 years ago, most are somewhat more recent, and the mean is in the area of 250 000 years ago. This is slightly older than the dates seen for any of the other methods, but is consistent with the stated hypothesis.
