On June 26, 2000, U.S. President Bill Clinton and geneticists Francis Collins and Craig Venter announced the completion of the “first survey” of the entire human genome. The announcement was made in front of a large audience at the White House, which signified the importance of the milestone. In fact, the sequencing of the human genome was not yet complete. The presentation had been arranged as a compromise between two competing parties: on one side the government-funded Human Genome Project, and on the other a private company, Celera Genomics.
Collins was the head of the Human Genome Project, and Venter represented Celera Genomics. The truce had been arranged to end a race for a complete sequence of the human genome. Over a number of years both groups had made considerable progress and were getting close to the finish line. The joint statement would ensure that both sides would get credit upon completion; all the letters of human DNA would soon be spelled out. Francis Collins ended his talk with the following statement: “I am happy that today, the only race that we are talking about is the human race.”
The Human Genome Project was launched in 1990. The task at hand was enormous. The human DNA code consists of over 3 billion letters. It was estimated that sequencing it would take 50,000 person-years of labor, at a cost of $3 billion. Collins compared the project’s scale to going to the moon or splitting the atom. As it turned out, two competing groups would go after the genome. They would differ in both technique and purpose.
Celera Genomics was using a method called shotgun sequencing. Craig Venter believed he could speed up the process by ignoring large parts of the genome located between genes. Some of these sections regulate genes, acting as on and off switches, and others have no known function. Venter would essentially break up the genome, sequence the genes and then try to put the pieces back together.
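The "break up, sequence, and put back together" idea can be illustrated with a toy example. The sketch below is not Celera's actual software; it is a minimal greedy overlap merger, and the function names, the fragments and the minimum overlap of 3 letters are all assumptions made for illustration. It repeatedly joins the two fragments that share the longest overlap until one sequence remains.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that matches a prefix of b.

    Returns 0 if no overlap of at least min_len letters exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Toy assembler (hypothetical helper): merge the best-overlapping pair until done."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining fragments overlap; leave them as separate pieces
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Made-up fragments cut from the made-up sequence ATGGCGTACGTTAGC
pieces = ["ATGGCGTA", "GCGTACGTT", "ACGTTAGC"]
print(greedy_assemble(pieces))
```

With real data the puzzle is far harder: fragments number in the millions, they contain reading errors, and repeated stretches of code can make it ambiguous which pieces belong together.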
Perhaps another motivation for the shotgun approach was to map individual genes in the hope of patenting them. Venter informed Collins of his intention to seek patents for 300 genes that would serve as targets for drugs to treat diseases. In addition, the question of whether the whole genome could be patented was uncharted territory.
The Human Genome Project’s founding leader was James Watson, one of the co-discoverers of the structure of DNA (the double helix) in 1953. Watson had the credentials to get government funding for the project; however, he was outspoken, and that sometimes got him in trouble. Watson was replaced in 1993 by Francis Collins, who was more cautious and diplomatic, traits that would be needed to steer the project to completion. Collins’ group did not believe that individual genes or the genome should be up for patents. The genome belonged to everyone and should not be privatized for profit. There was also concern that Venter’s shotgun method would yield an incomplete genome, one that could not be put back together.
Scattered throughout the genome are DNA fingerprints. These are repeating patterns of code that are unique to each individual (except for identical twins), hence the term. The Human Genome Project used DNA fingerprints to break up the task of sequencing. The fingerprints stood out from the random code along the genome, providing natural break points at which the genome could be divided up and later put back together. The genome was divided into segments and sent to 16 labs around the world. Once each section was sequenced, the genome could be put back together.
The controversy and the race for the genome increased the pace of the sequencing. In the end, both sides would publish papers on a sequenced human genome. On February 15, 2001, the Human Genome Project published its results in the scientific journal Nature. The next day, Celera published in the journal Science.
The sequenced genome is a template of a normal genome, which could be used to find abnormal genes responsible for diseases. In theory, a patient's genes could be compared against the template to locate any mutant genes. This could lead to treating and curing diseases (at the genetic level) that have previously been incurable. A map of the human genome could also help prevent diseases; genes that predispose individuals to developing diseases in the future could be identified years in advance.
Out Comes the Genome
The science behind sequencing the human genome came from a century of discoveries, starting in the late 1800s. At first, genetics was an abstract concept describing hereditary information. Although it was known that hereditary information was passed through generations, the mechanisms were unknown. Once DNA and genes were discovered, the first step toward sequencing the human genome was to start with simple organisms, such as viruses, flies and worms. Then the more complicated human genome could be dealt with. Today, a complete instruction book to make a human being has been identified; however, a complete understanding of the book is still a long way off.
In a way, the human genome is simple in its design, yet incredibly complex in length of code and number of functions. The fundamental unit of the genome is DNA, coded information like letters of the alphabet. Certain sections make up genes; these are like words or sentences. The genes are strung together in chromosomes, which are comparable to chapters in a book. The genome is everything, the whole book. The function of genes is to encode the instructions for making proteins: genes encode messages (carried by a messenger molecule called RNA) that are used to build proteins. The proteins perform the actual tasks encoded by the genes.
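The path from gene to message to protein can be shown with a short sketch. The DNA string below is made up for illustration, the function names are hypothetical, and the codon table lists only a handful of the 64 real three-letter codes, so this is a simplified picture rather than a working biology tool. The first step copies the DNA into RNA (the letter T becomes U), and the second reads the RNA three letters at a time, looking up one amino acid (protein building block) per triplet.

```python
# Partial codon table: a few real entries out of 64, enough for this demo.
CODON_TABLE = {
    "AUG": "Met",   # also the usual "start" signal
    "UUU": "Phe",
    "GGC": "Gly",
    "AAA": "Lys",
    "GCU": "Ala",
    "UAA": "STOP",  # one of the "stop" signals
}


def transcribe(coding_dna):
    """DNA -> messenger RNA: same letters, with T replaced by U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Messenger RNA -> list of amino acids, read three letters at a time."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]  # KeyError if codon is not in the demo table
        if amino == "STOP":
            break
        protein.append(amino)
    return protein


gene = "ATGTTTGGCAAATAA"          # made-up example gene
message = transcribe(gene)         # "AUGUUUGGCAAAUAA"
print(message, translate(message))
```

A real gene is thousands of letters long and is edited before it is translated, but the three-letters-per-building-block reading scheme is the same.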
Here are some interesting features of the human genome:
- It contains over 3 billion letters of DNA code.
- The DNA code is written in a four-letter alphabet (A, G, C, T), named after the initials of the four basic chemical units of DNA. If it were in book form, it would take more than 1.5 million pages to write it.
- DNA is arranged in base pairs that connect two strands like the steps of a spiral staircase (the double helix).
- The total number of genes is estimated at 20,687.
- The genome is divided into 23 pairs of chromosomes, 46 in total.
- Human complexity arises from gene networks (more so than the number of individual genes). Genes can be turned on or off in specific situations, and work in different combinations to produce near-infinite functions.
- Genes make up a tiny portion of the genome (only 2%). Most of the DNA either regulates genes, has unknown functions or does nothing at all (junk DNA).
- Part of our evolutionary past is carried in the genome, fragments of DNA that no longer serve a purpose. They are relics of DNA from ancient organisms that have gone dormant over time. These fragments vastly outnumber genes.
- Human beings are 99.9% identical at the DNA level (a discrepancy of 1 letter in every 1,200 letters).
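A few of the figures above can be turned into quick back-of-the-envelope numbers. The sketch below uses only the round values quoted in the list (3 billion letters, 1.5 million pages, 2%, 1 in 1,200), so the results are rough estimates; the variable names and the "2 bits per letter" storage estimate are assumptions added for illustration.

```python
GENOME_LETTERS = 3_000_000_000   # "over 3 billion letters"
BOOK_PAGES = 1_500_000           # "more than 1.5 million pages"
GENE_PERCENT = 2                 # genes are "only 2%" of the genome
LETTERS_PER_DIFFERENCE = 1_200   # "1 letter in every 1,200"

# Letters that fit on each page of the hypothetical book
letters_per_page = GENOME_LETTERS // BOOK_PAGES

# Letters that belong to genes
gene_letters = GENOME_LETTERS * GENE_PERCENT // 100

# Letters that differ between two people
differing_letters = GENOME_LETTERS // LETTERS_PER_DIFFERENCE

# A 4-letter alphabet needs 2 bits per letter (assumption: no compression)
storage_megabytes = GENOME_LETTERS * 2 // 8 // 1_000_000

print(letters_per_page)    # 2000
print(gene_letters)        # 60000000
print(differing_letters)   # 2500000
print(storage_megabytes)   # 750
```

So even at 99.9% identity, any two people still differ at roughly 2.5 million positions, and the whole instruction book would fit in under a gigabyte of computer storage.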
References: Siddhartha Mukherjee, The Gene (New York: Simon & Schuster, 2016).
DNA – Episode 3 of 5 – The Human Race – PBS Documentary, published on Mar 21, 2013. https://www.youtube.com/watch?v=MJu9dL7a3ZI