Molecular considerations

Molecular considerations - DNA, RNA, genes, amino acids, proteins

Overview

Source (in another context): Michael Balter, "Schizophrenia's unyielding mysteries", Scientific American, May 2017, 48 at 52.

Living as we are in the twenty-first century, we are fortunate to have at our disposal many resources for exploring the principles underlying natural selection and the processes of heredity which were not available to Darwin, in particular our knowledge of DNA (deoxyribonucleic acid) and RNA (ribonucleic acid). DNA was first identified and isolated by Friedrich Miescher in 1869 at the University of Tübingen, and its double helix structure was discovered in 1953 by James Watson and Francis Crick at the University of Cambridge, using experimental data collected by Rosalind Franklin and Maurice Wilkins. The contribution of Franklin and Wilkins, and of Franklin in particular, was highly significant[1]. Here's how they went about it: https://www.youtube.com/watch?v=VegLVn_1oCE

The significance of DNA as a historical record [1.1]

The DNA record is an almost unbelievably rich gift to the historian. Its significance for historical purposes lies in the fact that, "as long as the chain of reproducing life is not broken, its coded information is copied to a new molecule before the old molecule is destroyed. In this form, DNA information far outlives its molecules. It is renewable - copied - and since the copies are literally perfect for most of its letters on any one occasion, it can potentially last an indefinitely long time. Large quantities of our ancestors' DNA information survives completely unchanged, some even from hundreds of millions of years ago, preserved in successive generations of living bodies".

"What historian could have dared hope for a world in which every single individual of every species carries, within its body, a long and detailed text: a written document handed down through time? Moreover, it has minor random changes, which occur seldom enough not to mess up the record yet often enough to furnish distinct labels…. It follows from the fact of Darwinian evolution that everything about an animal or plant, including its bodily form, its inherited behaviour and the chemistry of its cells, is a coded message about the worlds in which its ancestors survived: the food they sought; the predators they escaped; the climates they endured; the mates they beguiled. The message is ultimately scripted in the DNA that fell through the succession of sieves that is natural selection".

DNA - what is it? [2]

A DNA molecule is a long chain of building blocks - small molecules called nucleotides. Nucleotides are organic molecules that serve as the sub-units of nucleic acids such as DNA and RNA. Each nucleotide is in turn composed of a nitrogenous base, a five-carbon sugar (ribose in RNA, deoxyribose in DNA) and at least one phosphate group.

The molecule consists of two complementary nucleotide chains twisted together, somewhat like a twisted ladder, and colloquially referred to as the 'double helix' or 'the immortal coil'. The base pairs form the ladder’s rungs and the sugar and phosphate molecules form its vertical sidepieces. These DNA molecules, which are present in almost every cell, encode a detailed set of plans, rather like a blueprint, for building different pieces of the cell. They provide all the necessary information for a living organism to grow and function and tell the cell what role it is to play in your body.

The nucleotide building blocks come in only four different kinds, whose names may be shortened to A, T, C and G (Adenine, Thymine, Cytosine and Guanine). The order or sequence of these base pairs determines the information available for building and maintaining an organism, similar to the way in which letters of the alphabet appear in a certain order to form words and sentences.[3] A always pairs with T and C with G to form the base pairs, and the unique sequence of the A, C, T, and G in DNA forms codes which carry genetic information.
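The pairing rule just described (A with T, C with G) is simple enough to write down as a lookup table. The following is only a minimal sketch: the function name and the example sequence are made up for illustration and do not come from any of the sources cited here.

```python
# Pairing rule from the text: A always pairs with T, and C with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the partner strand, letter by letter (hypothetical helper)."""
    return "".join(PAIRS[base] for base in strand.upper())


example = "ATCGGA"  # invented sequence, for illustration only
print(example)              # one side of the ladder
print(complement(example))  # the other side: TAGCCT
```

Because each letter has exactly one partner, either chain is enough to reconstruct the other, which is what makes faithful copying possible.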

Specific sequences of bases are called genes. The genes tell the cell to make other molecules called proteins, and the arrangements of the letters constitute codes for the different kinds of protein. Joined together in long DNA strands, these genes form the chromosomes, each of a different length. The proteins also allow a cell to perform special functions, such as working with other groups of cells so that the body’s individual faculties, such as sight and hearing, can function. The nucleic acids store the instructions for making an organism or species (the genome) and the proteins use those instructions to construct each individual organism. Crudely speaking, nucleic acids handle reproduction and proteins handle metabolism, which is something like the distinction between software and hardware.

Source: U.S. Department of Energy http://news.nationalgeographic.com/news/2006/03/0307_060307_dna.html

Source: The Australasian Genetics Resource Book, 2007, p 4

If we were to hypothetically untwist the DNA strand and lay it flat, it would look just like a ladder. The two sides of the ladder are called the DNA's "backbone", and the steps inside the ladder, the "bases", A, C, G and T, thus:

These building blocks are the same in all animals and plants. What differs is the order in which they are strung together. A G building block from a human is identical to a G building block from a snail. But the sequence of building blocks in a human is not only different from that in a snail. It is also different, though less so, from the sequence in every other human (except in the special case of identical twins).

DNA dwells in our cells

Our DNA lives inside our bodies. It is not concentrated in a particular part of the body but is distributed among all of our cells. On the figure Dawkins uses, there are about a thousand million million (10^15) cells making up an average human body (more recent estimates are lower, in the tens of millions of millions), and, with some exceptions which we can ignore, every one of those cells contains a complete copy of that body's DNA.

This DNA can be regarded as a set of instructions for how to make a body, written in the A, T, C, G alphabet of the nucleotides. The English alphabet has 26 letters and the Greek 24; DNA has only 4. [3.1] Richard Dawkins uses a beguiling metaphor to illustrate the point [4]:

It is as though, in every room of a gigantic building, there is a book-case containing the architect's plans for the entire building.
The 'book-case' in a cell is called the nucleus.
The architect's plans run to 46 volumes in man; the number is different in other species.
The 'volumes' are called chromosomes. They are visible under a microscope as long threads, and the genes are strung out along them in order. It is not easy, indeed it may not even be meaningful, to decide where one gene starts and ends and the next one begins.
The ‘pages’ in the volumes are the genes, the individual instructions for making proteins, although the division between genes is less clear-cut than the division between the pages of a book.
On the pages, the four letters of the nucleic acid bases are arranged in three-letter words, which correspond to the twenty amino acids that are the sub-units of proteins, more about which later.

The chromosomes containing the genes are found in the nucleus of the cell. Genes are also found in very small compartments called mitochondria, which are scattered through the cytoplasm outside the nucleus. The mitochondria are the energy centres of the cell.

The composition of a human cell. Source: http://ghr.nlm.nih.gov/handbook/howgeneswork/makingprotein

The DNA instructions have been assembled by natural selection. DNA molecules do two important things. Firstly they replicate, that is to say they make copies of themselves. This has gone on non-stop ever since the beginning of life, and the DNA molecules are now very good at it indeed. As an adult, you consist of a thousand million million cells, but when you were first conceived you were just a single cell, endowed with one master copy of all the architect's plans. This cell divided into two, and each of the two cells received its own copy of the plans, so that each could divide in its turn. Successive divisions took the number of cells up to 4 (2^2), 8 (2^3), 16 (2^4), 32 (2^5), 64 (2^6), and so on into the millions of millions. At every division, the DNA plans were faithfully copied, with scarcely any mistakes [5].
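The doubling arithmetic can be checked with a few lines of code. This is an idealised sketch only: it assumes that every cell divides at every round and that no cell ever dies, which is not how a real body grows, and it takes the figure of a thousand million million cells quoted above as its target.

```python
import math

# Target taken from the text: "a thousand million million" cells.
TARGET_CELLS = 10**15


def cells_after(divisions: int) -> int:
    """Cells present after a number of rounds in which every cell divides."""
    return 2 ** divisions


# Smallest number of rounds of doubling that reaches the target.
divisions_needed = math.ceil(math.log2(TARGET_CELLS))

print(cells_after(6))                 # 64, matching 2^6 in the text
print(divisions_needed)               # 50
print(cells_after(divisions_needed))  # 1125899906842624
```

On these simplified assumptions, about fifty rounds of doubling are enough to go from one cell to a thousand million million.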

It is one thing to speak of the duplication of DNA. But if the DNA is really a set of plans for building a body, how are the plans put into practice? How are they translated into the fabric of the body? This is the second thing DNA does. DNA indirectly supervises the manufacture of the different kinds of molecules of proteins that make up the organs of the body and the building of all these organs with their functions. Thus as the cells divide and multiply, the different organs of the body appear and build up. It takes some 15-18 years to build a human capable of reproducing itself in its turn.

The relevant categories are now considered in sequence: amino acids, proteins, genes (the 'pages' in the 'volumes') and chromosomes (the 'volumes' themselves).

Amino Acids

Amino acids are the sub-units of proteins. Each protein is formed by a chain of amino acids linked together through peptide bonds. The chain of amino acids takes on different shapes to form different proteins. The various shapes allow proteins to take on different characteristics in cells. Each amino acid is composed of a constant part (the same in every amino acid), which includes an amine group and a carboxyl group, and a variable side chain that distinguishes one amino acid from another. There are 20 common amino acids that are responsible for forming proteins, and they can be classified into 4 main families based on their chemical characteristics: acidic, basic, uncharged polar, and nonpolar.

The relationship between the nucleotide sequences of genes and the amino acid sequences of proteins is determined by a three-letter genetic code formed from a sequence of three nucleotides (e.g. ACT, CAG, TTT) drawn from a 64 word dictionary, each word being called a codon. Where does this 64 word dictionary come from? When you take 3 letters at a time, it allows for 64 combinations (4 to the 3rd power) and that 64 word dictionary is the same wherever you look in the animal kingdom, with one or two exceptions too minor to undermine the general rule. Combinations of three nucleotides correspond to different amino acids. The dictionary maps the 64 code words onto 21 meanings - the 20 biological amino acids that are responsible for forming the proteins of all living organisms plus one "all-purpose punctuation mark", but note that most amino acids have more than one three letter code, so in this respect the code is technically 'degenerate'. [6]
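The figure of 64 can be confirmed by simply listing every three-letter word that can be spelled with four letters. A short sketch follows; the variable names are illustrative only.

```python
from itertools import product

BASES = "ACGT"

# Every ordered triple of bases: 4 * 4 * 4 possibilities.
codons = ["".join(triple) for triple in product(BASES, repeat=3)]

print(len(codons))   # 64
print(codons[:4])    # ['AAA', 'AAC', 'AAG', 'AAT']
```

Since 64 words have to cover only 21 meanings, several words must share a meaning, which is the sense in which the code is 'degenerate'.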

When a cell wants to manufacture a certain protein, it has to find the recipe for that protein stored in the DNA. Each group of three bases (adenine (A), guanine (G), cytosine (C) and thymine (T)) codes for a particular amino acid: CCT codes for proline and CGT codes for arginine. So, during protein synthesis, the DNA sequence serves as the instruction for making a protein. The 20 amino acids are strung into chains typically a few hundred long, each sequence forming a particular protein molecule. Whereas the number of letters in the DNA alphabet is limited to four and the number of codons to 64, there is no theoretical limit to the number of proteins that can be spelled out by different sequences of codons. [6.1]
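Reading a DNA message three letters at a time can be sketched as a dictionary lookup. The table below is deliberately partial: it holds only a handful of the 64 entries of the standard genetic code, enough for the illustration. The function name and the example sequence are invented for this sketch.

```python
# A deliberately partial codon table (DNA spelling). Only a few of the
# 64 entries are listed; any codon not listed is reported as "???".
CODON_TABLE = {
    "ATG": "Met", "CCT": "Pro", "CGT": "Arg", "TTT": "Phe",
    "CAG": "Gln", "ACT": "Thr", "TGG": "Trp", "AAA": "Lys",
    "TAA": "STOP", "TAG": "STOP", "TGA": "STOP",
}


def translate(dna: str) -> list[str]:
    """Read the sequence three letters at a time until a stop codon."""
    amino_acids = []
    for i in range(0, len(dna) - 2, 3):
        meaning = CODON_TABLE.get(dna[i:i + 3], "???")
        if meaning == "STOP":
            break
        amino_acids.append(meaning)
    return amino_acids


# Invented message: ATG CCT CGT TTT TAA
print(translate("ATGCCTCGTTTTTAA"))  # ['Met', 'Pro', 'Arg', 'Phe']
```

The sketch skips the intermediate RNA copy that a real cell makes, and a real protein would run to hundreds of amino acids rather than four.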

Proteins

Proteins are the main workhorses of the cells. They are the building blocks of all body processes, consisting of chains of amino acids, and these chains fold themselves in highly determined ways. To function properly, they must have the right structure, location and abundance in each cell. The human genome is estimated to code for about 19,000 proteins.

To make a protein, a cell must put a chain of amino acids together in the right order. First, it makes a copy of the relevant DNA instructions in the cell nucleus, and takes it into the cytoplasm (see here the graphic of a human cell above at http://ghr.nlm.nih.gov/handbook/howgeneswork) - a bit like taking a photocopy of the instruction manual from the manager's office out to the assembly lines in a car factory, according to one observer. Here, the cell decodes the instruction and makes many copies of the protein, which fold into shape as they are produced.

The coded message of the DNA, written in the four-letter nucleotide alphabet ATCG, is translated in a simple mechanical way into another alphabet. This is the alphabet of amino acids which spells out protein molecules. Making proteins may seem a far cry from making a body, but it is the first very small step in that direction. Proteins not only constitute much of the physical fabric of the body; they also exert sensitive control over all the chemical processes inside the cell, selectively turning them on and off at precise times and in precise places. Eventually this process leads to the development of a baby.

Proteins consist of strands of amino acids folded into a specific shape. The different protein structures can be classified by four levels of folding, each successive one being constructed from the preceding one[7]. A protein’s shape is important because many proteins are enzymes, and an enzyme is a catalyst. A catalyst is a chemical substance that speeds up, by as much as a billion or even a trillion times, a chemical reaction between other substances, while the catalyst itself emerges from the process unscathed and free to catalyse again. The right enzyme achieves its ‘rightness’ largely through its physical shape, and that’s important because the physical shape is determined by genes, and it is genes whose variations are ultimately favoured or disfavoured by natural selection [8].

Primary Structure - The basic strand of amino acids is called the primary structure.

Secondary Structure - The hydrogen-bond interaction among strands of amino acids gives rise to the first level of folding, alpha-helices and beta-sheets.

Tertiary Structure - Interactions between alpha-helices and beta-sheets make up the second level of folding, protein domains. These protein domains are then strung together through third level folding to form small globular proteins. The combination of second and third level folding yields tertiary structure.

Quaternary Structure - In order to achieve enhanced function, small globular proteins often come together to form protein aggregates. This fourth level of protein structure is called the quaternary structure. A famous example of quaternary structure is haemoglobin.

The Karyotype of Alzheimer's disease. The chromosome affected is number 19.

For a very good animation of the relevant principles involving DNA, genes and proteins (also including inheritance, and mitosis/meiosis which we will deal with later), see the link at: http://pratclif.com/biologie-moleculaire/dna/protein.swf

Genes

A gene consists of a sequence of DNA letters that spell out the amino acids needed to make a protein. Dawkins has his own definition centred upon the gene's function as replicator and that is: a gene is any portion of chromosomal material that potentially lasts for enough generations to serve as a unit of natural selection. In other words, the gene is a replicator with high copying-fidelity. Copying-fidelity is another way of saying longevity-in-the-form-of-copies which Dawkins abbreviated simply to "longevity".

The chance coming together of previously existing sub-units is the usual way for a new genetic unit to be formed [9]. This happens through crossing-over, the process whereby pieces of each paternal chromosome physically detach themselves and change places with exactly corresponding bits of maternal chromosome.

Genes are the instruction manuals for our bodies, and provide the directions for building all the proteins that make our bodies function. For example, blood contains lots of red blood cells that transport oxygen around our bodies. The cells use haemoglobin, a protein, to capture and carry this oxygen. Of our roughly 20,000 protein-coding genes, only a few contain the instructions for making haemoglobin proteins. The remaining genes contain the instructions for making other parts of our bodies.

Other genes contain the instructions for building other proteins. Specialised proteins work in the hair cells of our ears to help us hear. Other proteins put the colour in our eyes. Remember that genes are made of DNA, and one strand of our DNA contains many genes. All of these genes are needed to give instructions for how to make and operate all parts of our bodies.

The gene complex is just a long string of nucleotide letters, not divided into discrete pages in an obvious way at all. However, there are special symbols indicating the end of one protein chain message and the beginning of another written in the same four-letter alphabet as the protein messages themselves. In between these two punctuation marks are the coded instructions for making one protein. But crossing-over does not respect boundaries between genes (or cistrons as they are sometimes referred to in this context) and splits may occur within genes as well as between them. The genes are not separated from their neighbours by any delimiters apart from what can be read from their sequence. In this respect they resemble old-fashioned "TELEGRAMS THAT LACK PUNCTUATION MARKS COMMA AND HAVE TO SPELL THEM OUT AS WORDS COMMA ...." with no punctuation marks in between and only a final one spelled out at the end STOP [9.1].
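The idea that the "punctuation" is written in the same four letters as the message itself can be made concrete with a small scan for start and stop signals. This is a toy sketch, not a gene-finding method: it assumes the usual start codon ATG and the three stop codons TAA, TAG and TGA, reads only one strand, and ignores everything else that real cells take into account. The function name and the example sequence are invented.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_messages(dna: str) -> list[str]:
    """Return each stretch that runs from a start signal up to (not including) a stop."""
    messages = []
    i = 0
    while i <= len(dna) - 3:
        if dna[i:i + 3] != START:
            i += 1
            continue
        # From the start signal, read in steps of three until a stop signal.
        codons, end = [], None
        for j in range(i, len(dna) - 2, 3):
            codon = dna[j:j + 3]
            if codon in STOPS:
                end = j + 3
                break
            codons.append(codon)
        if end is None:
            i += 1          # no stop signal in this frame; keep scanning
        else:
            messages.append("".join(codons))
            i = end         # resume scanning after the stop signal
    return messages


# Invented string: filler, message, stop, filler, message, stop, filler.
dna = "GG" + "ATGCCTCGT" + "TAA" + "CC" + "ATGTTT" + "TGA" + "GG"
print(find_messages(dna))  # ['ATGCCTCGT', 'ATGTTT']
```

Nothing in the string marks the boundaries except the letters themselves, just as in the telegram.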

A body is the genes' way of preserving the genes unaltered, and characteristics acquired after birth are not inherited. Natural selection favours replicators that are good at building survival machines[10], genes that are skilled in the art of controlling embryonic development and their body parts, such as the muscle, the heart and the eye. A survival machine is a vehicle containing not just one gene but many thousands all working together in a cooperative venture. They are all interdependent. In terms of the analogy, any given page of the plans makes reference to many different parts of the building; and each page makes sense only in terms of cross-references to numerous other pages.

Sexual reproduction has the effect of mixing and shuffling genes, so one individual body is just a temporary vehicle for a short-lived combination of genes. The combination of genes in any one individual may be short-lived, but the genes themselves are potentially very long-lived. Their paths constantly cross and recross down the generations. A gene may be regarded as a unit that survives through a large number of successive individual bodies[11].

Mutations [12]

No copying process is perfect. It is fundamental to the idea of a replicator that when a mistake or mutation does occur, it is passed on to future copies. The mutation brings into existence a new kind of replicator which “breeds true” until there is a further mutation. Imagine an ink blemish on a sheet of paper copied on a photo copy machine which is then perpetuated in subsequent copies, perhaps compounded by additional blemishes stemming from unclean glass on the photocopier. In any chain of replicators errors are cumulative.[13] Dawkins uses two examples of this process. One kind of mutation is called point mutation. A point mutation is an error corresponding to a single misprinted letter in a book. It is rare, but clearly the longer a genetic unit is, the more likely it is to be altered by a mutation somewhere along its length.

Another rare kind of mistake or mutation which has important long-term consequences is called inversion. A piece of chromosome detaches itself at both ends, turns head over heels, and reattaches itself in the inverted position. In terms of the earlier analogy, this would necessitate some renumbering of pages. Sometimes portions of chromosomes do not simply invert, but become reattached in a completely different part of the chromosome, or even join up with a different chromosome altogether. This corresponds to the transfer of a wad of pages from one volume to another. Though usually "disastrous", this can occasionally lead to the close linkage of pieces of genetic material which happen to work well together. Perhaps two cistrons which have a beneficial effect only when they are both present - they complement or reinforce each other in some way - will be brought close to each other by means of inversion. Natural selection may then tend to favour the new 'genetic unit' so formed, and it will spread through the future population.
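Both kinds of mistake can be mimicked on a string of letters. The sketch below is illustrative only: the helper names and the sequence are invented, and the inversion simply flips a run of letters end to end, ignoring the fact that a real inverted segment also swaps strands.

```python
def point_mutation(seq: str, position: int, new_base: str) -> str:
    """Replace the single letter at `position` (a one-letter misprint)."""
    return seq[:position] + new_base + seq[position + 1:]


def inversion(seq: str, start: int, end: int) -> str:
    """Detach seq[start:end], turn it end to end, and reattach it in place."""
    return seq[:start] + seq[start:end][::-1] + seq[end:]


original = "ATGCCTCGT"                        # ATG CCT CGT
misprint = point_mutation(original, 4, "G")   # CCT (proline) becomes CGT (arginine)
flipped = inversion(original, 3, 6)           # the middle word CCT is reversed to TCC

print(misprint)  # ATGCGTCGT
print(flipped)   # ATGTCCCGT

# A copy of a mutated sequence carries the mutation: it "breeds true".
copy_of_misprint = str(misprint)
print(copy_of_misprint == misprint)  # True
```

Every later copy made from the altered string inherits the change, which is why errors in a chain of replicators accumulate.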

There are said to be between 20,000 and 40,000 deviations from the so-called "normal" human sequence of nucleotide letters in any one human body. Twenty per cent of the variants in a person's DNA code have the potential to alter protein function, and about 1,000 are exceedingly rare [14].

Chromosomes

Remember those 46 'volumes' (the chromosomes). In mammals, including ourselves, the DNA "strings" are arranged in chromosomes in a linear fashion, and each chromosome is a different length, probably running to many millions of "letters". The nucleus of every one of our cells, whatever the cell type, contains chromosomes, and chromosomes are responsible for storing our hereditary information. [15]

The DNA packaging in chromosomes occurs in several discrete steps, starting with the double helix of DNA. Then the DNA is wrapped around some proteins. These are packed tightly together until they form a chromosome. Chromosomes are therefore efficient storage units for DNA. Each human cell has 46 chromosomes. [16] These consist of 23 pairs, which means that filed away in the nucleus (bookcase) of every cell, there are two alternative sets of 23 volumes of plans: volumes 1a and 1b, volumes 2a and 2b and so on down to volumes 23a and 23b. We receive each chromosome intact from one of our parents in whose testis or ovary it was assembled. Appearing at right is a picture of all of the chromosomes in a cell, known as a karyotype.

Both males and females have 23 pairs of chromosomes, but one of these pairs constitutes the sex chromosomes, X and Y, that determine whether you are male or female. In males, the 23rd pair consists of an X chromosome and a Y chromosome, whereas females have two X chromosomes and no Y chromosome. The Y chromosome is special because it carries ancestral information regarding a male's paternal line.

No two individuals, except identical twins, have exactly the same genetic code and that is what makes everyone unique. However, all males who descend from a common paternal lineage will share the same or very similar genetic code in their Y chromosome. Unrelated males from a different family line will have a different Y chromosome code.

Each volume coming originally from the father can be regarded, page for page, as a direct alternative to one particular volume coming from the mother. Corresponding pages may affect, for example, eye colour. The pages may make contradictory recommendations, in which case the gene that is ignored is called recessive and the one that is followed is called dominant: for example, the gene for brown eyes is dominant to the gene for blue eyes. Genes, like the brown eye and blue eye gene, that are rivals for the same slot on a chromosome are called alleles of each other [17].

Imagine the volumes (chromosomes) of architects' plans being loose-leaf binders whose pages can be detached and interchanged. Every Volume 13 must have a Page 6, but there are several possible Page 6s which could go in the binder between Page 5 and Page 7. One version says 'blue eyes', another possible version says 'brown eyes'. There may be yet other versions in the population at large which spell out other colours like green. Perhaps there are half a dozen alternative alleles sitting in the Page 6 position on the 13th chromosomes scattered around the population as a whole. Any given person only has two Volume 13 chromosomes. Therefore he can have a maximum of two alleles in the Page 6 slot. He may, like a blue-eyed person, have two copies of the same allele, or he may have any two alleles chosen from the half dozen alternatives available in the population at large.
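The loose-leaf picture lends itself to a small count: if half a dozen alternative pages exist in the population and each person holds exactly two, how many different pairs are possible? The sketch below assumes six alleles; only blue, brown and green are named in the text, and the other three labels are placeholders invented for illustration.

```python
from itertools import combinations_with_replacement

# Six alternative versions of "Page 6". The last three names are placeholders.
alleles = ["blue", "brown", "green", "allele_4", "allele_5", "allele_6"]

# A person holds two copies; order does not matter and both may be the same.
pairs = list(combinations_with_replacement(alleles, 2))
same = [p for p in pairs if p[0] == p[1]]
different = [p for p in pairs if p[0] != p[1]]

print(len(pairs))      # 21 possible pairs in all
print(len(same))       # 6 pairs with two copies of the same allele
print(len(different))  # 15 pairs with two different alleles
```

So even a single slot with six alternatives allows 21 different combinations across a population, although any one person carries only one of them.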

Cell division – mitosis and meiosis [18]

Our genes are doled out to us at conception, and there is nothing we can do about this. In fact, something like the detaching and interchanging of pages and wads of pages from loose-leaf binders really does go on. The normal division of a cell into two new cells, each one receiving a complete copy of all 46 chromosomes, is called mitosis. Another kind of cell division, called meiosis, occurs only in the production of the sex cells: the sperms or eggs, each containing only 23 chromosomes, which of course is exactly half of 46. Meiosis is therefore a special kind of cell division, taking place only in testicles and ovaries, in which a cell with the full double set (diploid) of 46 chromosomes divides to form sex cells with the single set (haploid) of 23 (using the human numbers for illustration). In other words, meiosis halves the number of chromosomes and thereby gives rise to the sperm or egg cells.

Every sperm cell made by an individual is unique, even though all his sperms assembled their 23 chromosomes from bits of the same set of 46 chromosomes. Eggs are made in a similar way in ovaries, and they too are all unique. Even leaving crossing-over aside, the number of possible combinations of 23 chromosomes that can go into a single sperm or egg is 2^23, or more than eight million. So the new individual is also unique, because the probability of his being the same as another is practically nil.
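The size of that number is easy to work out. The sketch below assumes the simplest possible model: for each of the 23 pairs, a sex cell receives either the 'a' volume or the 'b' volume whole, with no crossing-over, so the true variety is greater still.

```python
PAIRS_OF_CHROMOSOMES = 23

# Two choices (volume 'a' or volume 'b') for each of the 23 pairs.
per_sex_cell = 2 ** PAIRS_OF_CHROMOSOMES

# One sperm and one egg combine, each drawn from that many possibilities.
per_couple = per_sex_cell ** 2

print(per_sex_cell)  # 8388608
print(per_couple)    # 70368744177664
```

Even on this simplified model, one couple could in principle produce some seventy million million genetically distinct children.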

Remember that in real life, during the manufacture of a sperm (or egg), bits of each paternal chromosome physically detach themselves and change places with exactly corresponding bits of maternal chromosome[19], and this crossing over means that any one chromosome in a sperm is a patchwork, a mosaic of maternal genes and paternal genes.

The role of RNA

Before moving on to mitochondrial DNA, there is another kind of biological molecule serving a critical function in the cell to be considered: RNA, or ribonucleic acid. Remember that the proteins are the cell’s workhorses, carrying out diverse catalytic and structural roles, while the nucleic acids DNA and RNA carry the genetic information that can be inherited from one generation to the next.

RNA is a polymeric molecule[20] made up of a chain of nucleotides. A strand of RNA can be thought of as a chain with a nucleotide at each chain link. Remember that each nucleotide is made up of a base (in RNA: A, C, G or U, uracil taking the place of thymine), a ribose sugar, and a phosphate. The structure of RNA nucleotides is very similar to that of DNA nucleotides, but despite their great structural similarities, DNA and RNA play very different roles from one another in modern cells.

RNA is the intermediary, a kind of information bridge, between DNA and a protein. (Here, see the graphic of a cell's composition, above at http://ghr.nlm.nih.gov/handbook/howgeneswork). mRNA (messenger RNA) is the message that carries genetic information from the DNA in the nucleus to the cytoplasm. tRNA (transfer RNA) is the adaptor that reads the mRNA and brings the amino acids to the ribosomes for protein synthesis. The flow of information from DNA to RNA to proteins is one of the fundamental principles of molecular biology. It is so important that it is sometimes called the “central dogma”.
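The first step of that flow, from DNA to messenger RNA, can be sketched as a change of alphabet: the message keeps its order, but every T is written as U. The sketch assumes we are handed the DNA strand that reads the same way as the message (often called the coding strand); the function name and the sequence are invented for illustration.

```python
def transcribe(coding_strand: str) -> str:
    """Write a DNA message in the RNA alphabet by replacing T with U."""
    return coding_strand.upper().replace("T", "U")


dna_message = "ATGCCTCGT"         # invented example
mrna = transcribe(dna_message)

print(mrna)  # AUGCCUCGU
```

The second step, from messenger RNA to protein, is then a matter of reading the message three letters at a time, with tRNA supplying the matching amino acids.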

A major difference between DNA and RNA is that DNA is usually found in a double-stranded form in cells, while RNA is typically found in a single-stranded form. The lack of a paired strand allows RNA to fold into complex, three-dimensional structures. RNA folding is typically mediated by the same type of base-base interactions that are found in DNA, with the difference being that bonds are formed within a single strand in the case of RNA, rather than between two strands, in the case of DNA[21].

Until recently, it was thought that most of the human genome, as much as 97%, served no useful function and had no impact on the makeup of the human individual: it was just “junk”. This DNA did not make proteins and was called non-coding DNA, so it was generally assumed that it was lacking in utility. However, Australian geneticist John Mattick felt that it was unlikely that useless material would survive hundreds of millions of years of evolution. He found that the non-coding sections of DNA did have a function, namely to produce RNA, forming a massive and previously unrecognised regulatory framework that controls human development. Many scientists now believe that this RNA is the basis of the brain’s plasticity and learning, and may hold the secret to understanding many complex diseases[22]. On a subsequent page, we consider the origins of the genetic code in DNA and RNA's possible role in how life got going in the first place.

The role of mitochondrial DNA (mtDNA)

The mitochondria in the cell’s cytoplasm also contain genes, although mtDNA is one long string of genes not arranged in chromosomes. (The genes in bacterial DNA are also arranged in a long string, giving rise to the theory that the mitochondria originated from bacteria that once invaded or were absorbed by an ancestral cell long ago in evolution, to provide symbiotic benefits). In fact, there are 37 genes in the human mitochondrial genome. They carry information the mitochondria use to perform their primary function, which is the generation of energy for cellular activities. They utilise oxygen in a way that liberates energy from foodstuffs. The mitochondria made complex life possible.

Mitochondria have been described as the powerhouses of eukaryotic cells (cells with a nucleus). Using the blueprints found in their own DNA, they are able to produce several key proteins, transfer RNA, and ribosomal RNA. The mitochondrial genome is the collection of genetic information carried in the mitochondria, and it can be important for understanding mitochondrial disorders passed on by the mother, since mtDNA is inherited solely through the mother. Measured over thousands of years, mutations in mtDNA occur at a regular rate, and each new mutation gives rise to a new lineage of mtDNA, like the branches on a family tree.

“Yet even after a billion years, mitochondria behave as if things might not work out between us", says Bill Bryson with his customary flourish. "They maintain their own DNA. They reproduce at a different time from their own host cell. They look like bacteria, behave like bacteria and sometimes respond to antibiotics the way bacteria do. In short, they keep their bags packed. They don’t even speak the same genetic language as the cell in which they live. It is like having a stranger in your house, but one who has been there for a billion years”.[23]

The significance for natural selection is that by analysing the mtDNA sequences of a large number of people, geneticists can build a "gene tree", working backwards in time and perhaps eventually converging on a common ancestor. Nuclear DNA (from the nucleus) is potentially a richer resource for reconstructing human history, but it is harder to work with. Unlike mtDNA or the Y chromosome, which are both passed down intact, the nuclear genome is chopped up and recombined into novel combinations every generation, which makes it very difficult to build gene family trees: one can't be sure whether sequence differences arose through shuffling or mutation, and for a long time this made it all but impossible to derive information on evolutionary history from nuclear DNA. However, with the passage of time, many of the difficulties involved have now been resolved[24].

See also "The role of DNA and mutations in fossil identification": /dna-and-mutations.html

[1] Source: https://en.wikipedia.org/wiki/DNA An account of the relative contributions of each of the four players may be found in Bill Bryson, A Short History of Nearly Everything, Broadway Books, 2003, 403-407; see also Bill Mesler and H James Cleaves II, A Brief History of Creation, Penguin, London, 2016, 203-4. Emphasis is laid in both accounts on the demeaning, sexist portrayal of Franklin by Watson in his account of what occurred in his book The Double Helix. Watson and Crick's discovery wasn't actually confirmed until the 1980s: Bryson, op cit, 407.
[1.1] As explained by Richard Dawkins and Yan Wong, The Ancestor's Tale - A Pilgrimage to the Dawn of Life, Weidenfeld & Nicolson, London, 2004, 2nd Edition, 2016, 22
[2] Dawkins, SG, 22-23. What follows is heavily reliant in summary and extract form on Richard Dawkins' The Selfish Gene (SG), 1976, 1989, especially Chapter 3 thereof, "Immortal coils", with other sources interspersed among the commentary as from time to time indicated. My page references are to the 1989 edition.
[3] The "sentence" for the Forkhead box protein P2 gene appears on the sub-page entitled "The significance for natural selection": /the-significance-for-natural-selection.html
[3.1] Dawkins and Wong, op cit, 20.
[4] SG 22.
[5] SG 23.
[5.1] Dawkins and Wong, op cit, 20.
[6] Dawkins, Greatest Show, 315; Dawkins and Wong, 20-21.
[7] The internet source for this material and the illustrations which follow are now unable to be traced or referenced.
[8] Sources for paragraph: Dawkins, Greatest Show, 236, 239.
[9] SG 28.
[9.1] Dawkins and Wong, op cit, 20-21.
[10] A term Dawkins uses to describe organisms, such as animal and human bodies, which carry and serve to transport the genes. It is explained more fully on the next page dealing with The Selfish Gene.
[11] SG 25
[12] SG 31.
[13] EP 85.
[14] Kevin A Strauss, "Genomics for the people", Scientific American, December 2015, 58 at 65.
[15] An exception is our red blood cells, which have no nucleus and so don’t have any chromosomes.
[16] Not all living things have 46 chromosomes like humans: mosquitos have 6; onions 16; carp 104.
[17] SG 25-6. The idea of dominant and recessive factors in inheritance was pioneered by Gregor Mendel, an Austrian monk, though he did not refer to them as genes. Previously, it was thought that a favourable trait in one parent would be diluted when blended with the traits of the other through mating, and diluted further through subsequent generations. This was regarded as a flaw in Darwin's argument: Bryson, op cit, 391-393.
[18] SG 26-7.
[19] Here Dawkins reminds us that we are talking about chromosomes that came originally from the parents of the individual making the sperm, i.e., from the paternal grandparents of the child who is eventually conceived by the sperm.
[20] A large molecule, or macromolecule, composed of many repeated sub-units.
[21] Source: http://exploringorigins.org/rna.html
[22] “Making something of junk earns geneticist top award”, SMH, 13 March 2012. Professor Mattick also suggests that the body’s vast amounts of non-coding RNA are not junk as previously thought, but essential to regulating the function of other genes and that, once activated, they guide the behaviour of neurons. A deficiency in this “long non-coding RNA”, known as Gomafu, may also be a factor in the development of schizophrenia: “Low levels of RNA linked to schizophrenia”, SMH, 1 May 2013.
[23] Bryson, op cit, 300.
[24] As explained in Dan Jones, “The Neanderthal within”, New Scientist, 03 March 2007.

So what came first, the metabolism or the software?

DNA may be the link, but how do we explain the origins of the genetic code in the DNA present in all living organisms today? Basically, this turns on the distinction between the nucleotides which store the instructions for making an organism (the genome), the software which handles replication, and the proteins that use those instructions to construct each individual organism, the hardware which handles the organism’s metabolism. So which evolved first: metabolism (the hardware of chemical activity) or replication (the software of the genetic code), or did they evolve together? [1]

In answering this question, it may be supposed that the organic chemicals of the early earth were already subject to the laws of evolution. Most chemicals vanished, but those well adapted to their environment were more likely to survive. Those that survived long enough had offspring, and all later generations were their descendants. If enough of a particular kind survived and catalysed further reactions, there may eventually have been a runaway reaction. If so, then wherever in the universe conditions allow large amounts of organic chemicals to form and interact, life may be a near certainty.

However, even theories that put metabolism (hardware) first need to explain what mechanisms of replication (software) existed in the early days, and DNA is a fantastically complex molecule. Every cell contains a complete set of DNA instructions, even though it uses only a tiny portion of this instruction manual. Which instructions are triggered depends on the particular environment in which the cell finds itself. For example, different parts of the manual are used by brain cells and bone cells.

So how did this “complex, elaborate and elegant mechanism”[2] get going in the first place? DNA is helpless on its own, so how could it have evolved without hardware? There are also problems with the view that metabolism (the hardware) developed first, in particular how rough and ready evolutionary processes lacking sufficient precision could generate a high enough level of complexity.

This has led to speculation that perhaps RNA, DNA's close cousin, may be the answer: it is “software” which can code information but is less “helpless” than DNA, and also less stable. It exists only in a single strand, which means that it can fold up like a protein and engage in metabolic activities. So it can play both of life's roles, resembling both DNA and proteins, the agents of replication and metabolism. It can reproduce itself and provide the set of instructions for reproducing. It can be both hardware and software: though it carries genetic information, some RNA must once have been capable of doing the job of a protein.

Perhaps the first molecules that reproduced accurately enough to have some form of “heredity” were made from RNA, leading to suggestions that RNA was the earliest form of life and could copy itself.[3] Experiments conducted in the 1970s even showed that, given the right conditions, a meaningful information-carrying molecule like RNA is capable of arising spontaneously[4]. The downside is that RNA copies itself less accurately than DNA, possibly leading to the transmission of errors to later generations.

Most biologists now seem to agree that life began with a soup of RNA: the prevailing dogma is that the earliest life forms arose from a loose mix of proteins and nucleic acids that used RNA as their genetic material. Then at some stage the vast majority of life switched its code and began storing genetic information in DNA. All the cellular life we know today, and most modern viruses as well, are DNA based. In other words, about 4 billion years ago, early life received a massive system upgrade[5].

An unusual hybrid

Perhaps life as we know it emerged through a symbiosis between two different types of organism, one good at metabolism and the other good at coding. Something like this exists today between bacteria and the many different virus-like entities that float between them. Bacteria often use free-floating bits of software similar to viruses for their own purposes, while entities like viruses exploit the metabolic powers of bacteria and other organisms to reproduce[6]. This suggestion has recently been given added impetus by the discovery of an unusual hybrid virus living in one of the harshest environments on earth: a hot acidic lake in California’s Lassen Volcanic National Park. Here, Ken Stedman of Portland State University in Oregon found a gene made of DNA which looked like the gene for a protein coat from an RNA virus. In other words, the gene had leapt from RNA to DNA. A full sequence of the strange virus’s genome revealed that it also contained a gene for DNA replication typical of a DNA virus.

Similar discoveries have been made in at least a couple of oceanic samples. So modern viruses can combine information coded in the two normally separate genetic molecules, lending support to the idea that it was viruses that performed the upgrade from RNA and effectively gave rise to DNA. This may have occurred when an RNA virus, a DNA virus and a retrovirus all infected a cell at the same time. Modern viruses’ life cycles are very different from those of their ancestors. However, the finding does show that a community of viruses can move information from RNA into DNA, and the graphic below shows how this may have occurred[7].

Of one thing we can be sure

There is not yet a complete theory of the origins of life. Some have been optimistic enough to suggest that the solution may be "just around the corner"[8]. Another has opined that “progress has been rapid in recent decades, and ongoing research holds out the promise that a more satisfactory story may appear within the next decade or two”[9]. But of one thing we can be sure: whether the search be long or short, science will never stop searching for an answer[10].

Drawing the threads together: the origin of life is Canterbury! [11]

Richard Dawkins and Yan Wong (hereafter Dawkins) have drawn all these disparate threads together in their work The Ancestor’s Tale (see Descent from a common ancestor). By way of literary analogy, the text draws a parallel between the path the pilgrims followed on their journey to Canterbury in The Canterbury Tales and Dawkins’ path from the present backwards in time to the origin of life itself. Each group of species (animals, plants, fungi, archaea) sets out from its own vantage point on its own separate pilgrimage, meeting its ancestors along the way, including the ones it shares with us.

The Canterbury Tales had its journey’s end, the city of the same name, and so does The Ancestor’s Tale in the origin of life itself: Dawkins’ “Canterbury”, the destination towards which this 4 billion year journey backwards in time has all the while been inexorably heading. The only problem is that, as we have seen, we don’t really know what the origin of life is or how it got started.

Theories abound. Dawkins ranges over the Miller-Urey experiment, considered above. It is now regarded as redundant because the early atmosphere is no longer conceived as it was when the experiment was conducted: then it was thought to be a mix of methane, ammonia, water vapour and hydrogen, whereas it is now thought to have been primarily nitrogen and carbon dioxide. The experiment at least showed, however, that complex organic molecules can form without living beings to build them[12], and we now also know that natural, abiotic reactions do create complex organic chemicals (including amino acids) because they are found in meteorites.

The riddle also remains of how the first self-replicating molecule arose, one problem being that DNA, the main self-replicating molecule we are aware of, can’t replicate “without a large supporting cast of molecules, including proteins that can only be made by DNA”, so RNA is a far better candidate for the original replicator than DNA, according to Dawkins.

Along our host's backwards journey we also encounter enzymes and catalysts, in particular abzase which, from a cell’s point of view, speeds up reaction rates by a factor varying between a million and a trillion, and the degradation caused by copying errors (mutations). Short chains of RNA and indeed DNA can spontaneously self-replicate without an enzyme, but the error rate per letter is far higher than when an enzyme is present. This could sound the death knell for the original gene, “the catch-22 of the origin of life”: a gene big enough to specify an enzyme would be too big to replicate accurately without the aid of an enzyme of the very kind it is trying to specify, so the system cannot get started. [13]
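
The catch-22 can be put in rough numbers. As a sketch under a simplifying assumption not made in the sources cited here (each letter is copied independently, with the same chance of error), the probability that a gene of L letters is copied without any error is (1 - u)^L, where u is the error rate per letter. The error rates and gene lengths below are hypothetical, chosen only to show the shape of the problem.

```python
def error_free_copy_probability(length, error_rate_per_letter):
    """Chance that all `length` letters are copied correctly,
    assuming each letter is copied independently (a simplification)."""
    if length < 0 or not 0.0 <= error_rate_per_letter <= 1.0:
        raise ValueError("length must be >= 0 and error rate within [0, 1]")
    return (1.0 - error_rate_per_letter) ** length


# Hypothetical figures, for illustration only.
ERROR_RATE_WITHOUT_ENZYME = 1e-2   # assumed: 1 error per 100 letters
ERROR_RATE_WITH_ENZYME = 1e-6      # assumed: 1 error per million letters

for gene_length in (50, 1000):     # assumed: short chain vs enzyme-sized gene
    for label, rate in (("without enzyme", ERROR_RATE_WITHOUT_ENZYME),
                        ("with enzyme", ERROR_RATE_WITH_ENZYME)):
        p = error_free_copy_probability(gene_length, rate)
        print(f"{gene_length:5d} letters, {label}: {p:.6f}")
```

On these assumed figures a short chain is copied intact more often than not even without an enzyme, while a gene of a thousand letters almost never is. The longer gene only becomes copyable once the enzyme it is supposed to specify already exists.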

And on top of all this, as recently noted on this page, perhaps life didn’t get started in a “warm little pond” after all, but from archaebacteria conceived in scalding temperatures deep underground and on the ocean floor. In 1977 it was discovered that volcanic vents on the floor of deep oceans support “a strange community of creatures, all conceived and surviving without the benefit of sunlight”, leading Dawkins to opine that “(a) persuasive case can indeed be made that when we dig down into the rocks we are digging backwards in time, and rediscovering something like the conditions of life’s scalding Canterbury”. [14]

The journey back to the present [15]

And on that note, Dawkins turns for home. Just as our host in The Canterbury Tales made the return journey to London when his pilgrimage had been accomplished, so Dawkins returns to the present, his metaphorical London, following his reverse pilgrimage to the origin of life itself. But he returns without his ancestor companions: “for to presume upon evolution’s following the same course twice would be to deny the rationale of our backward journey”. “Evolution was never aimed at any particular endpoint”, and it cannot be presumed that if it were to be rerun, it would necessarily follow the same course. [16]

But what would life be like if the ‘tape’ were to be rerun a statistically meaningful number of times? Dawkins points out that Australia, New Zealand, Madagascar and South America are land masses (countries in some cases, continents in others) whose prolonged geographical isolation provides us with approximate reruns of major episodes of evolution. They were isolated from each other and from the rest of the world for significant parts of the period after the dinosaurs disappeared, when the mammal group displayed most of its evolutionary creativity. In other words, evolution can unfold in similar fashion when it is allowed to rerun.

On his lonely journey back to the present, Dawkins also ruminates on some of the major watersheds of evolution experienced on his outward journey when accompanied by his many ancestors:

The period when a “world of RNA” serving as both replicator and enzyme gave over to a separation between DNA in the replicator role and proteins in the enzyme role.
The clubbing together of replication entities (‘genes’) in cells with walls, which prevented the gene products leaking away and kept them together with the products of other genes with which they could collaborate in cellular chemistry.
The birth of the eukaryotic cell by the co-mingling of several prokaryotic cells.
The origin of sexual reproduction, giving birth to the concept of the species itself, with its own gene pool and all that implies for future evolution. [17]

Our host concludes by elaborating upon the significance of two major themes. The first is intrinsic within us. "Through the millions - probably billions - of individual ancestors whose lives we could have touched along our Pilgrims' Way, one singular hero has emerged in the minor, like a Wagnerian leitmotiv". It remains the dominant factor that links us to the rest of the natural world.

His second pertinent remark on origins surveys the myriad forms of life across the full expanse of the evolutionary plain:

"The universe could so easily have remained lifeless and simple - just physics and chemistry, just scattered dust of the cosmic explosion that gave birth to time and space. The fact that it did not - the fact that life evolved out of nearly nothing, some 10 billion years after the universe evolved (by some physicists' accounts) out of literally nothing - is a fact so staggering that I would be mad to attempt words to do it justice. And even that is not the end of the matter. Not only did evolution happen: it eventually led to beings capable of comprehending the process, and even of comprehending the process by which they comprehend it”. [18] Staggering indeed!

[1] Source: Christian, Maps of Time, 93-9
[2] Ibid, 102.
[3] This hypothesis is explored in Mesler and Cleaves, A Brief History of Creation, WW Norton, New York, 2016, 240 ff, drawing on the work of Walter Gilbert and Jack Szostak in the 1980s. Crick himself expounded this hypothesis: ibid, 207.
[4] Ibid, 248.
[5] “First glimpse at the birth of DNA”, New Scientist, Bob Holmes (Atlanta), 21 April 2012, 10.
[6] Cf the concept of "horizontal gene transfer" discussed in Mesler and Cleaves, op cit, 234 ff and referred to at /descent-from-a-common-ancestor.html
[7] Holmes, op cit.
[8] Mesler and Cleaves, op cit, 251.
[9] Christian, 104.
[10] Mesler and Cleaves, op cit, 251.
[11] Dawkins and Wong, The Ancestor's Tale, 642-665.
[12] Ibid, 649-650.
[13] Ibid, 657.
[14] Ibid, 663-664.
[15] Ibid, The Host's Return, 666-700.
[16] Ibid, 666.
[17] Ibid, 696.
[18] Ibid, 699
