Discovery of the Gene Structure

The discovery of the structure of DNA in 1953 by James Watson and Francis Crick set the stage for the next fifty years of research into gene structure, function, and regulation.

Genes are functional units of DNA that contain the instructions for making proteins or RNA. Genes also act as units of heredity, transferring the same instructions from parent to offspring. The nature, structure, and regulation of genes have been central topics of scientific research for more than 100 years.

Genes were first defined as units of hereditary transmission. The name “gene” was coined by Wilhelm Johannsen in 1909, although the concept of a discrete unit governing inherited characteristics goes back at least to Gregor Mendel in 1865. The work of Thomas Hunt Morgan and his colleagues established that genes are located on chromosomes, and in the mid-1940s Oswald Avery demonstrated that genes are composed of DNA (deoxyribonucleic acid). Since that time, some types of viruses have been discovered that use ribonucleic acid (RNA) instead of DNA, but here we shall concentrate on DNA genes.

DNA is a linear molecule composed of subunits called nucleotides. Each nucleotide is made of a sugar, a phosphate group, and a chemical base, of which there are four types: adenine, thymine, guanine, and cytosine (A, T, G, C). Nucleotides are typically referred to by the name of their base. DNA exists as a pair of strands, wound around one another into a double helix, with the bases directed into the center. The structure and charges of the bases dictate that A on one strand can pair only with T on the other, and C only with G. This complementarity provides the basis for faithful replication of the entire DNA molecule.
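The pairing rule can be illustrated with a short Python sketch. It treats a strand as a plain string of the letters A, T, G, and C; the function name and the sample sequence are hypothetical, chosen only for illustration.

```python
# Base-pairing rule from the text: A pairs with T, and C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement_strand(strand):
    """Return the partner strand, base by base, for a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in strand.upper())

original = "ATGCCGTA"                  # made-up sample sequence
partner = complement_strand(original)  # the strand that pairs with it
copy = complement_strand(partner)      # pairing against the partner rebuilds the original

print(partner)
print(copy == original)
```

Because each base has exactly one partner, either strand carries enough information to rebuild the other, which is why complementarity supports faithful copying.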

While all genes are made of DNA, not all stretches of DNA act as genes. Indeed, in eukaryotic organisms, most of the DNA does not function as genes, meaning it is not the code for making proteins or RNA. Some DNA outside of genes has a structural role, some consists of remnants of old genes that are now functionless, and much of it appears to be “junk,” inserted and copied by viruslike sequences. Within a gene, usually only one side of the double helix actually codes for product; the other side is silent. Which side of the helix acts as code varies from gene to gene.

Almost all genes code for proteins. Proteins are strings of amino acids, and the sequence of nucleotides in the gene dictates the sequence of amino acids in the protein. Proteins perform almost all the functions in cells, and can be grouped into four major classes: they act as enzymes that control the rate of chemical reactions in the cell; they form structural components of organelles, membranes, and other cell components; they receive and transmit signals between and within cells; or they act as regulators of genes by latching onto DNA, thereby increasing or decreasing the rate at which the gene is used, or “expressed.”
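How a nucleotide sequence dictates an amino acid sequence can be sketched in a few lines of Python. The sketch assumes the standard genetic code, in which each group of three nucleotides (a codon) specifies one amino acid or a stop signal. Only a handful of the 64 codons are listed, and the function name and sample sequence are hypothetical.

```python
# Partial table of coding-strand DNA codons; the full standard code has 64 entries.
CODON_TABLE = {
    "ATG": "Met",  # also the usual start signal
    "TTT": "Phe",
    "GGC": "Gly",
    "GCT": "Ala",
    "AAA": "Lys",
    "TGG": "Trp",
    "TAA": None, "TAG": None, "TGA": None,  # stop codons
}

def translate(coding_strand):
    """Read the strand three bases at a time and return the amino acids in order."""
    protein = []
    for i in range(0, len(coding_strand) - 2, 3):
        codon = coding_strand[i:i + 3].upper()
        amino_acid = CODON_TABLE[codon]  # raises KeyError for codons missing from this partial table
        if amino_acid is None:           # a stop codon ends the protein
            break
        protein.append(amino_acid)
    return protein

print("-".join(translate("ATGTTTGGCAAATAA")))
```

Changing a single nucleotide in the input can change the corresponding amino acid, which is one way a difference in DNA sequence can alter a protein.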

Genes vary in length. The largest human gene is about 2.5 million base pairs long and codes for the muscle protein dystrophin, which is more than 3,500 amino acids long. Eukaryotic genes generally produce proteins of about 150 to 3,000 amino acids in length. Prokaryotic genes are relatively small and produce proteins of 50 to 300 amino acids. Most eukaryotic protein-coding genes are present in only two copies per genome, occurring in the same position on homologous chromosomes, one of which is received from each parent. If the two copies differ slightly, they are called alleles. Changes in nucleotide sequences are termed mutations or polymorphisms, depending on their effect.

Some genes code not for protein but for RNA molecules that have their own functions within the cell. These include the transfer RNAs, ribosomal RNAs, and a variety of other smaller RNAs with roles in the nucleus. RNA-coding genes are usually present in multiple copies per eukaryotic genome.
