Scott Langlands
MCB 299 Independent Study
J.P. Gogarten
24 April 2000
Evolution of DNA and Molecular Computers

In trying to understand the nature of DNA, Leonard Adleman has proposed a novel concept: the potential for DNA-driven molecular computers. The idea is still relatively new, and not all the answers have been fleshed out. In fact, there is much criticism that the molecular computer will be nothing but a toy with no real practical application. Adleman has admitted that he has proposed only a concept, not a practical application, but his intention was to open new doors and encourage scientists in microbiology, computing, and engineering to work together to develop applications for his concept. If someone had told the developers of electronic computers in the 1960s and 1970s that one day every college student would have a computer that fit on a desk, or that computers would not take up a whole room but would fit easily in a briefcase or in the palm of your hand, they would not have believed it. The potential for DNA computing is new and exciting, yet with present technology and understanding, applications are severely limited. Rozen, McGrew, and Ellington (7) share a clear summary:

 The concise algorithm by which protein functions is encoded in DNA sequences, and the flexible and adaptable operating systems of biological organisms, have been coveted by computer scientists. Similarly, the information storage capacity of computers and their extraordinary problem-solving speed have been the envy of biologists. The parallels between the two systems are many, and it would seem natural to try to combine them. Indeed, computer scientists have been quick to take advantage of many of the information-handling paradigms that have been tried and adopted by evolution. Neural nets, genetic algorithms and cellular automata all attempt to reproduce the elegance of biological systems in silicon. But despite the fact that the binary logic of machine code is superficially similar to the quaternary logic of base-pairing, reproducing the speed of computers in carbon(organic chemistry and biology) has recently been unattainable (7).

Although the research focuses on building a molecular computer, the work is also helping scientists to better understand the nature of DNA in biological systems and to develop practical applications of molecular computing and bio-regulation of enzymes. To accomplish this, computer scientists must work with molecular biologists to overcome the technology barrier. In short, Leonard Adleman has shown that DNA coding can be applied to executing mathematical algorithms, in the hope that such algorithms can be the foundation for constructing a biological computer.

In 1994 Adleman solved a Hamiltonian path problem, the problem of a traveling salesman who must visit a number of cities exactly once while traveling only on one-way streets. His approach encoded each node, or city, with a unique 20-mer oligonucleotide label. The labeled strands were mixed so that complementary oligonucleotides could randomly ligate the cities into paths, forcing the computation to proceed in parallel: all possible paths were tried at the same time. Only those products that completed a path through all the cities were amplified using the polymerase chain reaction. The product was run on an agarose gel, and the double-stranded DNA that entered all seven nodes was extracted with distilled water and amplified by PCR again. Adleman also noted that he used a magnetic bead system to help purify the correct answers. The results were recorded using graduated PCR. "Graduated PCR is a method for 'printing' results and is performed by running six different PCR reactions with (the initial vertex) as the right primer and (its complement) as the left primer in the (final) tube" (1). The computation of the possible answers takes minutes or hours, depending on the number of nodes and the amount of DNA required to read across them. Because the paths form in parallel, all possible answers are tried at once, something electronic computers, which work through candidates sequentially, cannot match. The drawback to Adleman's approach is that the PCR, magnetic bead, and agarose gel electrophoresis steps needed to read the results took him seven days. He was also concerned with the amount of DNA required for more complex problems, since the amount grows exponentially with the number of nodes, and he was unsure of the error rate as the process is scaled up. He tried to minimize errors by using only 20-mer oligonucleotides, which limits recombinant loops and ensures proper annealing. Adleman did not have all the answers, nor did he claim to; he was trying to open a dialogue by showing that DNA could be used to compute algorithms at a speed surpassing modern computers. He also pointed out that the ligation step was exothermic and that the storage capacity and energy efficiency of DNA are immense. His main concern was the limitation of technology: it took seven days to "read the data," and the number of operations was limited.
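
The logic of Adleman's experiment can be sketched in conventional code. The short Python fragment below is only an illustration of the underlying brute-force search, not of his laboratory procedure: the directed graph and its edge set are invented for the example, the exhaustive enumeration of vertex orderings stands in for the random ligation of oligonucleotides, and the final filter stands in for the PCR, bead, and gel steps that keep only paths visiting every node exactly once.

    from itertools import permutations

    # A small directed graph standing in for Adleman's seven-city example
    # (this particular edge set is illustrative, not his published graph).
    edges = {
        (0, 1), (0, 3), (1, 2), (1, 3), (2, 4),
        (3, 2), (3, 4), (4, 5), (2, 6), (4, 6), (5, 6),
    }
    vertices = range(7)
    start, end = 0, 6

    def hamiltonian_paths(vertices, edges, start, end):
        """Enumerate every ordering of the vertices (the 'ligate everything
        at once' step) and keep only orderings that begin at start, finish
        at end, and follow an existing directed edge at every step."""
        for order in permutations(vertices):
            if order[0] != start or order[-1] != end:
                continue
            if all((a, b) in edges for a, b in zip(order, order[1:])):
                yield order

    for path in hamiltonian_paths(vertices, edges, start, end):
        print(" -> ".join(str(v) for v in path))

Every candidate ordering is examined here one at a time, which is exactly the sequential bottleneck the test-tube version avoids by forming all candidate paths simultaneously.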

The same year Adleman drew support from David Gifford, of the department of electrical engineering and computer science at MIT. This was significant because it linked the biologists to the computer scientists and showed there was a possibility of making molecular computing work. Gifford states, "Such new computational models for biological systems could have implications for the mechanisms that underlie such important biological systems as evolution and the immune system. Hundreds of practical computational problems, including optimal shop scheduling, the longest path in a graph, and Boolean logic satisfaction, can be translated into a classic form known as the Hamiltonian path problem . . . Finding such paths for complex graphs is hard. Adleman has approached the Hamiltonian path problem in a biological context where each vertex and edge of the graph can be represented by a short oligonucleotide sequence . . . In essence, Adleman has used the enormous parallelism of solution-phase chemistry to solve a hard computational problem. The take home lesson for molecular biologists is that DNA ligation can effectively search a large space of potential solutions to a given problem" (2). Gifford's article was very optimistic; however, he understood the current limitations of the technology. He also compares the brute-force approach of conventional computers with Adleman's method. Computers solve complex mathematical problems, such as integration, by computing many approximations and then taking the average; in Adleman's case the DNA paths are ligated and the correct answers are extracted. Furthermore, because the ligations proceed "in parallel," the DNA finds the solutions faster than computer approximations. Gifford is also impressed by the storage capacity of DNA and its energy efficiency. He goes on to compare natural evolution with computing, suggesting that adapters could screen the solutions, that is, "adapter oligonucleotides could constrain a random process towards a solution" (2). He explains that adapters could inhibit the formation of wrong answers and save time and energy, and he wonders whether this occurs in nature: "Might similar processes be at work in biological systems that evolve? In such a scenario genetic plasticity would still be created by random events, but constraints might direct mutations to make desired outcomes more probable. If such constraints exist, understanding them would certainly be a major accomplishment" (2). In other words, if we understood the constraints at work in a biological system, it would not only further the development of molecular computing but also deepen our understanding of biological systems. Another comparison between nature and computing is the use of enzymes and gene regulation to aid in programming. Gifford concludes that a less labor-intensive method for reading the output is needed, along with further analysis of the sources of error, and he anticipates future development. This is an interesting approach in which biologists and computer engineers work together to find a suitable molecular computer. In the end, biologists might learn more about natural processes that function like computers, and the computer engineers will have a faster computer.

Five months later, in 1995, Adleman met criticism from Michal and Nathan Linial (3), a chemist and a computer scientist in Israel. They believe there will be many errors during the computations as the number of vertices in the Hamiltonian path increases, and that the amount of DNA required would become unmanageable both for the computations and for the cost, errors, and labor of reading the output. The Linials' view is pessimistic regarding the procedures used and the potential for development. Another research group, from Oxford (4), applauds Adleman's work but also focuses on the cost and resource demands of solving Hamiltonian problems. They note that DNA computing suits complex problems but is impractical for simple computations, where electronic computers are favored. Lastly, they point out that even if the ligation reactions are exothermic, the labor that goes into the PCR and electrophoresis makes the process very inefficient. Barry Bunow from Maryland (5) shares the criticism of the Linials and the Oxford group regarding Adleman's approach. He says that a complex Hamiltonian problem would require an exhaustive DNA library containing more molecules than there are particles in the universe, because of the specificity required in labeling the nodes. Furthermore, he could not envision a practical, automated molecular computer. Adleman responded to this criticism, stating that he did not intend to build a molecular computer but only to prove that one could be possible. He admitted that there are limitations and went on to name some of them. Still, it is enough that people around the world are thinking critically about the possibilities of molecular computing and looking for ways to make it possible. Admittedly, when electronic computers first came out, they were programmed by punching holes in stacks of cards to be fed into the machine, and the applications were limited. Adleman emphasized that it was too early in the development of molecular computing to be overly optimistic or pessimistic. Where would we be today if people had thought card punching was too labor intensive to develop the modern computer, with the world wide web, email, and amazing graphics potential?
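
The scaling worry behind this criticism is easy to make concrete. The rough Python calculation below is an illustration only: it assumes, for simplicity, that a brute-force library needs one strand per candidate vertex ordering of a complete graph (which grows as n!) and uses 10^80 as the commonly quoted rough estimate of the number of particles in the observable universe.

    from math import factorial

    # Illustrative scaling estimate: one strand per candidate ordering of n
    # vertices means roughly n! molecules, which quickly outgrows any
    # physically available amount of DNA.
    for n in (7, 20, 40, 60):
        print(n, "vertices:", f"{factorial(n):.2e}", "candidate orderings")

    # First n at which n! exceeds ~1e80, a common rough estimate of the
    # number of particles in the observable universe.
    print("library exceeds ~1e80 molecules near n =",
          next(n for n in range(1, 100) if factorial(n) > 10**80))

Under these simplifying assumptions, a graph of around sixty vertices already demands a library larger than any conceivable supply of DNA, which is exactly the point the critics press.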

1995 also brought some much needed support to Adleman's approach to molecular computing, first from Robert Pool (13), who observed that molecular computing had become a hot field when 200 scientists gathered at Princeton to discuss developments and applications. Pool believes that DNA computation is an important development and finds it remarkable that the computations can be done in a single flask. He believes that molecular computing is possible if scientists work together to find a way to harness the reaction in a practical format. The next achievement cited by Pool was Richard Lipton's approach. Lipton (6) tried to deal with the issue of "brute force," reasoning that "The speed of any computer, biological or not, is determined by two factors: (i) how many parallel processes it has and (ii) how many steps each can perform per unit of time." The first factor favors biological computers, where 10^22 molecules are present in 3 g of water, but the second favors electronic computers. Lipton believes that the "brute force" method is too inefficient: trying every possible solution wastes time and material. His approach was to solve a satisfaction problem over Boolean values, and the interesting point is that he used DNA complements to encode binary numbers. Binary numbers are a closer match to conventional electronic computers and would seem to offer an easier way to "read the output." Lipton uses a series of test tubes, extractions, and PCR in his version of the "computer." Again, 20-mer oligonucleotides are used to ensure annealing and prevent looping. He admits that he wants to study the sources of error further; however, the amount of DNA required grows linearly, not exponentially, with the size of the formula. Lipton's approach shows that DNA computing is not limited to Hamiltonian path problems but extends to other forms of logic. Furthermore, it incorporates a binary numbering system for the readout, which could be applied to conventional computing. The question still remains: can one build a practical molecular computer?
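
Lipton's use of DNA strands as binary assignments can likewise be sketched in ordinary code. The Python fragment below is a hypothetical illustration only: enumerating every bit string of a given length plays the role of the initial DNA pool that encodes every possible assignment, and the filter plays the role of his successive test-tube extraction steps. The example formula is an arbitrary choice, not one from his paper.

    from itertools import product

    def satisfying_assignments(formula, n_vars):
        """Generate every n-bit assignment (the role of the initial pool of
        strands) and keep those that satisfy the formula (the role of the
        successive extraction steps)."""
        for bits in product((0, 1), repeat=n_vars):
            if formula(*bits):
                yield bits

    # Arbitrary example formula: (x1 OR NOT x2) AND (x2 OR x3).
    def example(x1, x2, x3):
        return (x1 or not x2) and (x2 or x3)

    for assignment in satisfying_assignments(example, 3):
        print(assignment)

The number of assignments doubles with every added variable, which is why Lipton stresses the parallelism of the pool: in code the loop runs sequentially, while in the test tube every strand is processed at once.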

At the end of 1995 Willem Stemmer (9) published a short piece with his views on molecular computing. Stemmer reminds the scientific community that it should focus on the biology as much as on the math. The way around the problem of a large DNA library for computation is to study biological evolution. He states, "The explanation is that multiple, recursive cycles of selections from small pools can 'compute' complex answers that far exceed the capacity of any single library of sequences." It would be more efficient, and more natural, for the best partial answers to be selected and enriched rather than requiring every possible answer to be generated and sifted. Stemmer believes that this could be accomplished by natural mutation, such as recombination and point mutation, and that the way to do it in vitro is by sexual PCR. He continues by analyzing the human genome as a DNA library: given the number of proteins humans encode, the genome could never hold a library large enough to arrive at them by Adleman's exhaustive method; only with recombination, regulation, and restriction sites is it possible. Thus sequences can be moved through faster and more effectively. These insights bring the molecular computer closer to being realized. By introducing binary code, perhaps a molecular-electronic hybrid could be made to increase the speed of the readout, and the recombination approach helps to solve the problem of the large DNA library.
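
Stemmer's point about recursive selection from small pools can be illustrated with a toy evolutionary search. Nothing in the Python sketch below corresponds to his actual protocol; the bit strings, the scoring function, the pool size, and the single-point "recombination" are all assumptions chosen only to show how repeated rounds of select-and-recombine can approach a good answer without ever holding the full library of 2^16 possibilities.

    import random

    TARGET = [1] * 16          # an arbitrary 'best answer' to search for
    POOL_SIZE = 30             # far smaller than the 2**16 possible strings

    def score(candidate):
        """Fitness: number of positions that already match the target."""
        return sum(a == b for a, b in zip(candidate, TARGET))

    def recombine(a, b):
        """Single-point crossover, a stand-in for recombination / sexual PCR,
        with an occasional point mutation."""
        cut = random.randrange(1, len(a))
        child = a[:cut] + b[cut:]
        if random.random() < 0.1:
            i = random.randrange(len(child))
            child[i] ^= 1
        return child

    pool = [[random.randint(0, 1) for _ in TARGET] for _ in range(POOL_SIZE)]
    for generation in range(40):
        pool.sort(key=score, reverse=True)
        survivors = pool[:10]          # enrich the best partial answers
        pool = survivors + [recombine(random.choice(survivors),
                                      random.choice(survivors))
                            for _ in range(POOL_SIZE - 10)]

    print("best match after 40 rounds:", max(score(s) for s in pool),
          "of", len(TARGET), "positions")

Each round keeps only a small enriched pool, yet the search typically closes in on the target after a few hundred evaluations rather than 65,536, which is the spirit of Stemmer's "recursive cycles of selections from small pools."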

In an article in Current Biology on March 1, 1996, Daniel Rozen et al. (7) summarize the findings of the scientists previously mentioned in this essay. They are neither optimistic nor pessimistic about the molecular computer but try to sort through some of the remaining problems. They address the reliance on PCR to amplify results: the computational part is errorless, but the biology is not. PCR depends on the longevity of the polymerases and on temperature, pH, and volume, and these issues would render a practical computer inaccurate. However, they do support using a binary system to encode the output, as suggested by Lipton. Overall they feel that molecular computing is progressing; it might, for example, be used to crack military and bank codes by trying all possible solutions in parallel. Still, there is no application yet that makes the DNA computer practical and tangible. One of the goals is to develop a molecular Turing machine, "the simplest programmable general model of computation" (7), that can read, write, and be programmed. This achievement might bridge the gap between molecular and electronic computing. Speculations for a possible Turing machine using DNA have been made; however, as explained in Rozen's article, a modified ribosome would make a better Turing machine. There is also speculation about incorporating plasmids or vectors into the molecular computer to regulate the genes.
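
Because the Turing machine is the benchmark invoked here for a programmable molecular computer, a minimal conventional version may help fix the idea. The Python sketch below is a generic textbook construction, not anything proposed in Rozen's article: a tape, a head, and a transition table that reads a symbol, writes a symbol, moves, and changes state.

    def run_turing_machine(tape, rules, state="start", halt="halt", steps=100):
        """Minimal Turing machine: rules map (state, symbol) to
        (symbol_to_write, move, next_state), with move being -1 or +1."""
        cells = dict(enumerate(tape))   # sparse tape; blank cells read as '_'
        head = 0
        for _ in range(steps):
            if state == halt:
                break
            symbol = cells.get(head, "_")
            write, move, state = rules[(state, symbol)]
            cells[head] = write
            head += move
        return "".join(cells[i] for i in sorted(cells))

    # Example program: flip every bit until the first blank, then halt.
    rules = {
        ("start", "0"): ("1", +1, "start"),
        ("start", "1"): ("0", +1, "start"),
        ("start", "_"): ("_", -1, "halt"),
    }
    print(run_turing_machine("10110", rules))   # prints 01001_

A molecular version would presumably have to realize each of these ingredients chemically: a polymer as the tape, a ribosome-like complex as the head, and base-pairing rules as the transition table.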

In July 1996 another development was made by Frank Guarnieri et al. (8). In addition to the Hamiltonian path and Boolean formulas, they developed a method to make DNA add, not just search. This required a new model to generate the correct output. It also uses binary numbers: single-stranded DNA represents "non-negative two digit integers," and the function is carried out by operators, place holders, and primers. The two binary numbers represented by single-stranded DNA are added by an extension reaction of the complement. The resulting strands can be enriched through PCR and then run on a gel, where the elongated strands encode the sums. Overall, the procedure is carried out by a horizontal chain reaction, and the limitations include checking the accuracy and finding a way to read the output other than the gel. The DNA library required grows linearly rather than exponentially, which means the amount of DNA remains manageable as the number of additions increases, and the work takes two days to complete instead of several. The ability to use DNA to add demonstrates further development in the search for molecular computing and shows that more applications are possible. The problem still remains to find a suitable format for outputting the data and using the resources efficiently.
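
The arithmetic that the strands represent is ordinary binary addition, which can be shown directly. The Python fragment below is only the mathematical operation, with operands written least significant bit first; the strand design, extension reaction, and gel readout of the actual procedure have no counterpart here.

    def add_binary(a_bits, b_bits):
        """Ripple-carry addition of two bit lists (least significant bit
        first), the arithmetic that the DNA extension reactions stand for."""
        result, carry = [], 0
        for i in range(max(len(a_bits), len(b_bits))):
            a = a_bits[i] if i < len(a_bits) else 0
            b = b_bits[i] if i < len(b_bits) else 0
            total = a + b + carry
            result.append(total % 2)
            carry = total // 2
        if carry:
            result.append(carry)
        return result

    # 3 (binary 11) plus 2 (binary 10), both written least significant bit first:
    print(add_binary([1, 1], [0, 1]))   # [1, 0, 1], i.e. 5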

1997 brought more ideas and advances in molecular computing. In addition to more mathematical models, Qi Ouyang et al. (12) proposed using E. coli to read the results of the computation, and they tackled another NP-complete problem instead of the Hamiltonian path. Problems in the class P are those whose run time is bounded by a polynomial function. NP problems are those solvable in non-deterministic polynomial time; a proposed answer can be checked in polynomial time, even though finding one may not be possible in polynomial time. NP-complete problems are the hardest problems in NP: if any one of them could be solved in polynomial time, then every NP problem could be. The Hamiltonian path problem is a type of NP-complete problem (2). Ouyang and his colleagues solved the maximal clique problem using vertices, edges, and binary code, and the biological procedure used double-stranded DNA, PCR, and agarose gels. The procedure took a graph of N vertices and assigned a binary digit, 1 or 0, to each vertex according to whether it was in or out of the clique, so that each candidate subset of vertices is represented as a binary word. The complementary graph, made of all the missing edges, is then used to eliminate every candidate that connects two vertices not joined in the original graph, and the remainder of the solution set is sorted through for the maximal clique. This is done with DNA by assigning a binary code to two sections of double-stranded DNA: one section for the value and one for the position. This formed the data pool. Primers were added and PCR was run. The experiment was run randomly at first and then with restriction enzymes to eliminate incorrect pairings. Parallel overlap assembly was used in the PCR to construct all possible combinations, and the results were read on an electrophoresis gel. From this process they obtained the size of the maximal clique, but to find the sequence of the clique they inserted the DNA into M13 bacteriophage, which was then cloned in E. coli and extracted. Sources of error in this procedure include the inability of restriction enzymes to cut single-stranded DNA, so S1 nuclease was used instead to keep false positives from being reported. Incomplete cutting by the restriction enzymes was overcome by repeated digestion, but this makes the process longer and requires more material. This was a novel approach that used a living organism to help read out the computation, and the only sources of error are at the restriction sites. As mentioned before, perhaps there could be an electronic-molecular hybrid in which the DNA does the computing and electronic components, such as bio-chips from Affymetrix or Ciphergen, help decode the results.
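
The computational core of this procedure, encoding each candidate vertex subset as a binary word, discarding every word that joins two vertices with no edge between them, and keeping the largest survivor, is easy to mirror in software. The Python sketch below uses a small graph invented for illustration; none of the wet-lab steps (parallel overlap assembly, restriction digestion, gel readout) appear.

    from itertools import product

    # A small undirected graph, invented for illustration.
    n = 5
    edges = {(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)}

    # The complementary graph: every missing edge rules out any subset that
    # contains both of its endpoints.
    missing = {(i, j) for i in range(n) for j in range(i + 1, n)
               if (i, j) not in edges and (j, i) not in edges}

    best = ()
    # Each binary word marks which vertices are in (1) or out (0) of the
    # subset, the role played by the value bits in the DNA encoding.
    for word in product((0, 1), repeat=n):
        if any(word[i] and word[j] for i, j in missing):
            continue                     # 'digested away': joins a non-edge
        members = tuple(v for v, bit in enumerate(word) if bit)
        if len(members) > len(best):
            best = members

    print("maximal clique:", best, "size", len(best))

The loop checks each candidate word against the missing edges one at a time; in the tube, the restriction digests perform the same elimination on all strands at once.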

Biologists and chemists from the University of Texas (11) published an article detailing the possibilities for DNA computing in April 1999. According to Cox, "Hybridization logic is not the only possibility for DNA computation- for example, an alternative approach that we will examine below is the use of nucleic-acid enzymes as parts of nucleic-acid logic circuits; in this case, it would be catalytic transformation, rather than hybridization, that would drive computations." They then recount the theories of the Hamiltonian path problem, integer addition, decryption, and the maximal clique, and they note that the technique used for the maximal clique removed some of the cliques from the possible solution set without the aid of the DNA computation. Other potential uses of DNA computing are also recounted in the article, such as the "knight problem," in which DNA solves the problem of placing knights on a chessboard so that no knight can attack another; this was done via the satisfaction problem proposed by Lipton (6). The interesting points are that RNA can also be used to compute and that the computations can be subtractive as well as additive, using restriction enzymes and extraction. Critics maintain that DNA computers will not solve "real world" problems and will remain interesting gimmicks. This is true only as long as the technology lacks accurate interfaces and error correction and cannot resolve the capacity problem of the DNA library required. Cox and his team have also presented the idea of a DNA transistor using RNA ligases that would read and write in the circuit. Applications of this development would include integrated circuits that could regulate hormone levels; for example, diabetics could have an implant to control insulin levels.

This year, in January 2000, another development to speed up the DNA computation process was introduced in two articles in Nature. The first, by Ogihara and Ray (14), discussed representing the DNA strands with binary code while affixing the DNA to a gold surface. The single-stranded potential answers are bonded to the surface and then allowed to pair with their complements. Those sequences that do not represent correct answers remain unpaired and are then destroyed by enzymes, leaving only the correct answers; the surface is then heated to release them. PCR is also used to amplify the solutions before fixing them to the surface, to improve accuracy. This technique reduces the readout time but is still limited by the concern over the size of the DNA library. The authors note that by using a surface there is less chance for the DNA strands to interact with one another during the computation, and there is potential for an electronic interface to aid in the readout. They speculate that practical applications include implants that deliver a drug or signal to the body, as mentioned before for insulin regulation.

Another article, from the University of Wisconsin and also in Nature (10), discusses the potential of using surfaces to aid in computation. The authors solve the satisfaction problem on a surface and cycle the annealing and destroying steps to eliminate the incorrect answers; an enzyme from E. coli is used to digest the incorrect answers off the surface. Fluorescence is then used to examine the remaining correct answers, and the intensity is plotted on a graph. The fluorescence helps to discriminate among the correct solutions, but the drawback to the whole procedure is its sources of error, caused by poor annealing or by undestroyed wrong answers. The surface-based approach does appear promising, but the techniques need to be cleaned up to allow an accurate readout; with incomplete digestion and annealing it is difficult to enrich the correct answers with PCR. Nevertheless, scientists are closer to constructing a tangible molecular computer.
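
The control flow of this surface protocol is a repeated mark-and-destroy cycle, one cycle per clause of the formula, and it can be sketched independently of the chemistry. In the toy Python version below, assignments that satisfy the current clause are "marked" and everything unmarked is deleted, mirroring the hybridization and enzymatic digestion steps; the three-variable formula and its clauses are invented for illustration.

    from itertools import product

    # Formula in conjunctive normal form: each clause lists (index, wanted)
    # pairs, e.g. (0, 1) means variable x0 must equal 1 for that literal.
    clauses = [[(0, 1), (1, 0)],          # (x0 OR NOT x1)
               [(1, 1), (2, 1)],          # (x1 OR x2)
               [(0, 0), (2, 1)]]          # (NOT x0 OR x2)

    # The full pool of 'strands', one per possible assignment, on the surface.
    pool = set(product((0, 1), repeat=3))

    for clause in clauses:
        # Mark: an assignment survives this cycle if it satisfies any literal.
        marked = {s for s in pool if any(s[i] == v for i, v in clause)}
        # Destroy the unmarked assignments, then unmark and move on.
        pool = marked

    print("satisfying assignments left on the surface:", sorted(pool))

Each pass through the loop removes exactly the assignments that violate one clause, so after the last cycle only assignments satisfying the whole formula remain, which is what the fluorescent readout is meant to detect.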

Using DNA in molecular computing is a very new and exciting field. Mathematicians and biologists agree that there is potential for organic computing in both small and large applications. The common theme among the articles is that molecular computing is hindered by biological errors and by the state of the technology. The technology can always be improved, and the errors of biology can be turned to the scientists' advantage: some were concerned that mutations and recombination would ruin the results, yet recombination may be the key to reducing the size of the DNA library required to run a molecular computer. Horizontal chain reactions and sexual PCR have also propelled the development of molecular computers. In the near future it is certain that there will be a mixed electronic-molecular computer, and perhaps in time, with further developments, a completely organic computer could be synthesized. Furthermore, just as microchips have accelerated the potential of electronic computers by becoming smaller and more versatile, there could be a development of macromolecular computers that would allow the use of macromolecules other than DNA in computing, increasing the versatility of molecular computing still further. Effective biological computers could have practical applications in a variety of fields, not just medicine. The main focus should be to harness the resources available, perfect the techniques, and further develop a DNA interface to make the readout of the solutions more accessible.

References
  1. Adleman L. M. Molecular Computation of Solutions to Combinatorial Problems.
    Science 1994 Nov 11; 266:1021-1024
  2. Gifford D. K. On the Path to Computation with DNA.
    Science 1994 Nov 11; 266:993-994
  3. Linial M. et al. On the Potential of Molecular Computing.
    Science 1995 Apr 28; 268:48, 483-484
  4. Lo Y. M. et al. On the Potential of Molecular Computing.
    Science 1995 Apr 28; 268:48, 483-484
  5. Bunow B. On the Potential of Molecular Computing.
    Science 1995 Apr 28; 268:48, 483-484
  6. Lipton R. J. DNA Solutions of Hard Computational Problems.
    Science 1995 Apr 28; 268:542-545
  7. Rozen D. E. et al. Does DNA Compute? Molecular Computing.
    Current Biology 1996 Mar 1; 6:254-257
  8. Guarnieri F. et al. Making DNA Add.
    Science 1996 Jul 12; 273:220-223
  9. Stemmer W. P. The Evolution of Molecular Computation.
    Science 1995 Dec 1; 270:1510
  10. Liu Q. et al. DNA Computing on Surfaces.
    Nature 2000 Jan 13; 403:175-179
  11. Cox J. C. et al. The Complexities of DNA Computation.
    Trends in Biotechnology 1999 Apr; 17(4):151-154
  12. Ouyang Q. et al. DNA Solution of the Maximal Clique Problem.
    Science 1997 Oct 17; 278:446-449
  13. Pool R. A Boom in Plans for DNA Computing.
    Science 1995 Apr 28; 268:498-499
  14. Ogihara M. et al. DNA Computing on a Chip.
    Nature 2000 Jan 13; 403:143-144

