Nuclear Genome
A genome is the complete set of chromosomes found in each nucleus of a given species, and it contains the entire genetic material. The nuclear genome is the largest in the plant cell, both in picograms of DNA and in the number of genes encoded (complexity). Eukaryotic nuclear genomes can be distinguished from organelle and prokaryotic genomes by size and complexity. A typical higher plant genome, for example, contains about 5 x 10^9 base pairs of DNA per haploid set of chromosomes. This is about 30,000 times as much DNA as in a single chloroplast genome and some 10,000 times as much as in a moderately sized plant mitochondrial genome. It is also about 1,000 times as much DNA as in the bacterium Escherichia coli.
A typical plant genome of 5 x 10^9 base pairs would be about 1.7 metres long if the DNA of one haploid set were laid out in a straight line (at 0.34 nm per base pair), or more than three metres for the diploid nucleus. Chromosomes are composed of two types of large organic molecules (macromolecules): proteins and nucleic acids. Nucleic acids are of two types: deoxyribonucleic acid (DNA) and ribonucleic acid (RNA).
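The size comparisons above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the nuclear genome size comes from the text, while the chloroplast, mitochondrial and E. coli sizes are assumed round figures, and the length estimate assumes B-form DNA with a rise of 0.34 nm per base pair.

```python
# Rough genome-size arithmetic. Only NUCLEAR_BP comes from the text;
# the other sizes are assumed round figures chosen for illustration.
NUCLEAR_BP = 5e9          # haploid plant nuclear genome (from the text)
CHLOROPLAST_BP = 1.5e5    # assumed: about 150 kb
MITOCHONDRIAL_BP = 5e5    # assumed: about 500 kb, "moderately sized"
ECOLI_BP = 4.6e6          # assumed: approximate E. coli genome size
RISE_PER_BP_NM = 0.34     # axial rise per base pair in B-form DNA


def fold_difference(reference_bp, other_bp):
    """How many times larger the reference genome is than the other."""
    return reference_bp / other_bp


def contour_length_m(base_pairs, rise_nm=RISE_PER_BP_NM):
    """Length in metres of a DNA molecule laid out in a straight line."""
    return base_pairs * rise_nm * 1e-9


for name, size in [("chloroplast", CHLOROPLAST_BP),
                   ("mitochondrion", MITOCHONDRIAL_BP),
                   ("E. coli", ECOLI_BP)]:
    print(f"nuclear / {name}: {fold_difference(NUCLEAR_BP, size):,.0f}x")

print(f"haploid length: {contour_length_m(NUCLEAR_BP):.2f} m")
print(f"diploid length: {contour_length_m(2 * NUCLEAR_BP):.2f} m")
```

Under these assumptions the ratios come out close to the rounded figures quoted for the chloroplast, mitochondrial and bacterial genomes.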
During the 1940s and early 1950s several elegant experiments were carried out which clearly established that genetic information resides in the nucleic acids and not in the proteins. More specifically, these experiments showed that genetic information resides in DNA. (In a few simple viruses, however, RNA carries the genetic information; these particular viruses contain no DNA.) Nuclear DNA is packaged into chromosomes along with histones and nonhistone proteins, all of which play important roles in gene expression.
These various components are held together to form chromatin by both hydrophobic and electrostatic forces. While the DNA encodes the genetic information, the proteins are involved in controlling packaging of DNA and in regulating its availability for transcription. Although the structure of eukaryotic chromatin has been fairly well characterized, the roles of various individual proteins in chromatin structure and gene regulation have yet to be elucidated.
Ribosomal RNA Gene Families
Probably the best known multigene families are those encoding ribosomal RNA. The "major" ribosomal RNA genes are organized in long tandem arrays containing both gene and spacer sequences. In most animal cells there are some 100-200 rRNA genes, and in plants the numbers may be much higher. It is not uncommon to find 5,000 rRNA genes per genome in plant species with DNA contents typical of crop plants, and much higher numbers have been reported. The genes exist in large tandem arrays at one or a few loci in the genome.
The repeating unit in these arrays consists of one large transcription unit containing genes for the 18S, 5.8S and 25S rRNAs as well as spacer sequences that are removed during processing of the large primary transcript. Also included in the basic repeat are spacer sequences that are not part of the primary transcript; these are often called nontranscribed spacer or intergenic spacer sequences. The major rRNA genes are transcribed by RNA polymerase I, a specialized form of RNA polymerase that transcribes only these genes. Active rRNA genes are found in the nucleolus, where their transcripts are processed and assembled with ribosomal proteins.
Genes for 5S ribosomal RNAs are also organized in tandem arrays although they are located elsewhere in the genome away from the major rRNA gene arrays. They are transcribed by RNA polymerase III rather than polymerase I. As in the case of the major rRNA genes, the arrays consist of alternating gene and spacer sequences. In wheat two main variants exist in which the repeating units are 410 and 500 bp long. Similar heterogeneity is seen in flax, in which the major length classes are 340 and 360 bp. Within these repeats the gene itself occupies only 118 bp.
Most spacer sequences are highly variable, even between the different length classes in a single genome, but there is a region of about 70 bp 5' to the start of transcription that is conserved. Studies of 5S genes in Xenopus, however, have shown that this region is not required for accurate transcriptional initiation. Interestingly, the promoter sequences required for proper initiation by polymerase III in extracts of Xenopus oocytes seem to be located well within the coding sequence of the gene.
The number of copies of both the major rRNA genes and the 5S rRNA genes can vary widely among closely related species of plants, and even among different races or varieties within a species. For example, different lines of flax have been reported to contain between about 50,000 and 120,000 5S rRNA genes. The number of major ribosomal RNA genes varied between about 1,400 and 2,700 in the same lines, although there was no correlation between the numbers of 5S genes and major rRNA genes.
Extensive variation in the number of major rRNA genes also occurs independently at each of the several loci that contain rRNA genes in hexaploid wheat. Such extensive variation between vigorously growing, closely related genotypes makes it very difficult to argue that the larger gene numbers are required for normal plant growth and development. Instead, it would appear that plants simply tolerate a great deal more variation than animals do. Even in plants with lower gene numbers there may be a substantial "excess" over the number of genes actually expressed in any given cell.
In addition to variation in rRNA gene number, many plants have substantial polymorphism in the length of their tandem repeat units. Both types of variation are thought to result from unequal crossover events. Why both copy number and length variation can result from unequal crossover in ribosomal RNA gene arrays can be understood by considering the structure of a typical repeat in such an array.
In each repeat there is the large transcription unit containing genes for the 18S, 5.8S and 25S rRNAs and the spacer sequences between them. There is also the "nontranscribed spacer". Within the nontranscribed spacer there are a number of short (about 130-180 bp) "subrepeat" sequences. These subrepeats provide another set of tandemly repeated sequences at which unequal crossovers might occur.
Unequal crossover in this region would generate spacer length variants with different numbers of subrepeat sequences. This model predicts that most of the rRNA gene length variants seen in nature should differ from each other by lengths corresponding to integral multiples of the subrepeat sequence, and this is exactly what is observed.
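The prediction of the unequal crossover model can be illustrated with a toy simulation. This is a minimal sketch, not a model taken from the literature cited here: it assumes that exchanges always occur at subrepeat boundaries, and the subrepeat length (150 bp), the flanking spacer length, the pool size and the number of events are all hypothetical values chosen for illustration.

```python
import random

SUBREPEAT_BP = 150   # assumed; the text gives a range of about 130-180 bp
FLANKING_BP = 1000   # hypothetical unique spacer sequence around the subrepeats


def unequal_crossover(n_a, n_b, rng):
    """Exchange between two misaligned subrepeat arrays.

    The break falls after i subrepeats in array A and after j subrepeats
    in array B. The reciprocal products carry i + (n_b - j) and
    j + (n_a - i) subrepeats, so the total number is conserved.
    """
    i = rng.randint(0, n_a)
    j = rng.randint(0, n_b)
    return i + (n_b - j), j + (n_a - i)


def spacer_length(n_subrepeats):
    """Spacer length in bp for a given number of subrepeats."""
    return FLANKING_BP + n_subrepeats * SUBREPEAT_BP


def simulate_counts(start_count=8, n_spacers=20, n_events=200, seed=0):
    """Apply repeated unequal crossovers to a pool of identical spacers."""
    rng = random.Random(seed)
    counts = [start_count] * n_spacers
    for _ in range(n_events):
        a, b = rng.sample(range(n_spacers), 2)
        counts[a], counts[b] = unequal_crossover(counts[a], counts[b], rng)
    return counts


counts = simulate_counts()
lengths = sorted({spacer_length(n) for n in counts})
print("spacer length classes (bp):", lengths)
```

Starting from identical spacers, the pool diversifies into several length classes, and any two classes differ by a whole number of subrepeat lengths because each product contains only complete subrepeats.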
The evolution of ribosomal RNA gene families is also characterized by a rapid accumulation of point mutations and other sequence changes in the nontranscribed spacer region. Clearly, the sequence of this region is not conserved in the same way as that in the coding region, which shows strong homology over very large evolutionary distances. In spite of the rapid evolution of spacer sequences, however, a degree of sequence homogeneity is somehow maintained within the arrays of a particular genome.
This paradox provided the first recognized example of "concerted evolution", in which repeated sequences of multicopy genes sometimes show a tendency to evolve together rather than diverge by accumulating different mutations. Concerted evolution is thought to reflect the operation of homogenizing processes such as gene conversion and unequal crossing over. Gene conversion is the direct conversion of one sequence to another while the sequences are paired during mitosis or meiosis. Unequal crossovers within large tandem arrays, such as those containing the ribosomal RNA genes, can lead to the spread of random sequence variants through the population of genes by a process that is analogous to genetic drift in a population of organisms.
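The drift analogy can be made concrete with a small simulation. The sketch below is a deliberately simplified abstraction and an assumption of this example, not a published model: each step copies one randomly chosen repeat over another, standing in for the net effect of gene conversion or of duplication and deletion by unequal crossing over, with the array size held constant. The array size and the random seed are arbitrary.

```python
import random


def homogenize(array, rng, max_steps=100_000):
    """Copy random repeats over one another until one variant remains.

    Returns the final array and the number of steps taken. Each step
    overwrites one repeat (dst) with a copy of another (src), so the
    array length never changes.
    """
    array = list(array)
    steps = 0
    while len(set(array)) > 1 and steps < max_steps:
        src = rng.randrange(len(array))
        dst = rng.randrange(len(array))
        array[dst] = array[src]
        steps += 1
    return array, steps


# Start with 20 repeats, each carrying a distinct spacer variant (0..19).
final, steps = homogenize(range(20), random.Random(42))
print(f"variant {final[0]} spread through all {len(final)} repeats "
      f"after {steps} copying events")
```

Which variant spreads through the array is a matter of chance, just as in genetic drift, but the array ends up homogeneous even though no variant is favoured. Run in two separate lineages, the same process would generally fix different variants, which is the pattern described as concerted evolution.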
Although it is often thought (and in many cases is undoubtedly true) that sequences that diverge very rapidly in evolution must be phenotypically neutral or unimportant, the nontranscribed spacer sequences in the major ribosomal RNA genes contain transcriptional promoters and enhancers which are essential to gene function. In contrast to control sequences of protein-coding genes, which can sometimes be recognized by their evolutionary conservatism, promoter and enhancer functions in ribosomal RNA genes show a very high degree of species specificity. This can be demonstrated by the failure of heterologous genes to be transcribed in extracts that faithfully transcribe homologous ribosomal RNA genes. In such a situation, genes from species A work in A extracts but not in B extracts, while genes from species B work in B extracts but not in A extracts. Thus a kind of molecular co-evolution must occur, with genes that encode trans-acting transcription factors evolving in parallel with the DNA sequences these factors recognize.
Chromosomal regions containing arrays of rRNA genes are potential sites of nucleolus formation and are thus referred to as "nucleolar organizers". Although there may be more than one nucleolar organizer region, the number is rarely more than two or three per genome. It can be shown by in situ hybridization that most of the DNA that hybridizes to ribosomal gene probes is located in these regions; however, not all the rRNA genes are contained within the nucleolus itself. Many of the genes exist in an apparently inactive form just outside the nucleolus.
Using genetic stocks of maize containing chromosomal translocations involving DNA near the nucleolus, R. Philips and his associates at the University of Minnesota were able to show that DNA from this region could organize a nucleolus when it was transferred to another part of the genome, even though it had not done so at the original location. Thus the inactive genes are capable of functioning when given the chance, and their inactivity must reflect the operation of a system for regulating the number of genes that can be active in any one nucleus.
A similar conclusion can be drawn from experiments in wheat. Since bread wheat is a hexaploid species it is possible to make various aneuploid derivatives in which the number of nucleolar organizers, and the total number of rRNA genes, varies considerably. R. Flavell and his colleagues at the Plant Breeding Institute in Cambridge, England have shown that the total nucleolar volume, which is an index of rRNA gene activity, remains relatively constant as the number of genes is varied over a wide range. This dosage compensation effect is evidence that the activity of rRNA genes is regulated by some mechanism independent of their copy number.
Additional experiments with aneuploid wheat lines and wheat lines that contain chromosomes from related species have shown that different nucleoli exhibit a "dominance hierarchy". For example, the nucleolus on chromosome 1B is always larger than that on chromosome 6B in the euploid cultivar "Chinese Spring". Since the nucleolar organizer on 6B has about twice as many rRNA genes as that on 1B, it is not possible to explain dominance simply on the basis of gene number. Instead, it seems more likely that something about the genes on 1B makes them more able to compete for some limiting factor or factors required for activity. In this connection it is striking that the dominant genes generally contain larger numbers of the spacer subrepeats mentioned previously in connection with size polymorphism.