Now that researchers have access to thousands of fully sequenced genomes, it has become easier to check whether they show any systematic evolutionary pattern. The first question that comes to everyone’s mind is whether more complex organisms have larger genomes. T. Ryan Gregory has been working on this topic for many years, and he organized a workshop in 2010 among like-minded people. The papers coming out of that workshop were published in two sets, and of those, the following one is a fairly good place to start on the topic. Salient points:
1. Based on data collected on ~5000 animal genomes so far, genome sizes were found to vary 7000-fold. That is an enormous range.
2. Animals with the largest genomes include lungfish (~80-120 Gbp), salamanders (similar order), sharks (~10-20 Gbp), grasshoppers, flatworms and crustaceans.
3. The only phenotypic traits linked to genome size are larger cell size and slower cell division in organisms with large genomes.
4. Ecological correlation - “An emerging trend from animal (and more specifically, crustacean) genome size studies is the positive relationship between genome size and latitude.”
5. Correlation with intron size - “Intron size and genome size are known to be positively correlated between species of Drosophila (Moriyama et al. 1998), within the class of mammals (Ogata et al. 1996), and across eukaryotes in general (Vinogradov 1999).”
6. No relationship between genome size and animal complexity has ever been found. Researchers have been looking into this for over four decades.
The study of genome size diversity is an ever-expanding field that is highly relevant in today’s world of rapid and efficient DNA sequencing. Animal genome sizes range from 0.02 to 132.83 pg but the majority of animal genomes are small, with the most of these genome sizes being less than 5 pg. Animals with large genomes (> 10 pg) are scattered within some invertebrates, including the Platyhelminthes, crustaceans, and orthopterans, and also the vertebrates including the Actinopterygii, Chondrichthyes, and some amphibians. In this paper, we explore the connections between organismal phenotype, physiology, and ecology to genome size. We also discuss some of the molecular mechanisms of genome shrinkage and expansion obtained through comparative studies of species with full genome sequences and how this may apply to species with large genomes. As most animal species sequenced to date have been in the small range for genome size (especially invertebrates) due to sequencing costs and to difficulties associated with large genome assemblies, an understanding of the structural composition of large genomes is still lacking. Studies using next-generation sequencing are being attempted for the first time in animals with larger genomes. Such analyses using low genome coverage are providing a glimpse of the composition of repetitive elements in animals with more complex genomes. These future studies will allow a better understanding of factors leading to genomic obesity in animals.
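The size range quoted in the abstract above (0.02 to 132.83 pg) can be checked against the "7000-fold" figure with a little arithmetic. The sketch below is only illustrative: the function names are made up for this post, and the conversion factor of roughly 0.978 Gbp per picogram of DNA is the commonly used approximation, not a number taken from the paper.

```python
# Assumption: 1 pg of DNA is roughly 0.978 Gbp (978 Mbp).
# This is the commonly used approximation, not a value from the paper.
GBP_PER_PG = 0.978


def pg_to_gbp(picograms: float) -> float:
    """Convert a genome size in picograms to gigabase pairs (approximate)."""
    return picograms * GBP_PER_PG


def fold_range(smallest: float, largest: float) -> float:
    """Return how many times larger the largest genome is than the smallest."""
    return largest / smallest


# Extremes of animal genome size as quoted in the abstract.
smallest_pg = 0.02
largest_pg = 132.83

fold = fold_range(smallest_pg, largest_pg)
largest_gbp = pg_to_gbp(largest_pg)

print(f"Fold range: about {fold:.0f}x")
print(f"Largest genome: about {largest_gbp:.0f} Gbp")
print(f"Smallest genome: about {pg_to_gbp(smallest_pg) * 1000:.0f} Mbp")
```

The ratio comes out a little under 7000, which matches the rounded figure in the summary, and the largest genome works out to roughly 130 Gbp.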
The paper above, and the others mentioned in Gregory’s “Genome size evolution: patterns, mechanisms, and methodological advances”, present a fairly good scientific analysis of the data available so far.
The relationship between genome size and animal complexity was also discussed in contemporary religious literature. Religious people follow a different approach in their analysis: they start by knowing the answer and then use various methods to arrive at it from the data. Also, successful cult leaders are very outspoken and convincing, even though what they are trying to convince you of could be complete BS. The really good ones go about ‘deconstructing the dogma’ of others.
Here are two examples. Please let me know if you can derive the conclusions of the first paper from the data in the reviews mentioned above.
There are two intriguing paradoxes in molecular biology–the inconsistent relationship between organismal complexity and (1) cellular DNA content and (2) the number of protein-coding genes–referred to as the C-value and G-value paradoxes, respectively. The C-value paradox may be largely explained by varying ploidy. The G-value paradox is more problematic, as the extent of protein coding sequence remains relatively static over a wide range of developmental complexity. We show by analysis of sequenced genomes that the relative amount of non-protein-coding sequence increases consistently with complexity. We also show that the distribution of introns in complex organisms is non-random. Genes composed of large amounts of intronic sequence are significantly overrepresented amongst genes that are highly expressed in the nervous system, and amongst genes downregulated in embryonic stem cells and cancers. We suggest that the informational paradox in complex organisms may be explained by the expansion of cis-acting regulatory elements and genes specifying trans-acting non-protein-coding RNAs.
Since the birth of molecular biology it has been generally assumed that most genetic information is transacted by proteins, and that RNA plays an intermediary role. This led to the subsidiary assumption that the vast tracts of noncoding sequences in the genomes of higher organisms are largely nonfunctional, despite the fact that they are transcribed. These assumptions have since become articles of faith, but they are not necessarily correct. I propose an alternative evolutionary history whereby developmental and cognitive complexity has arisen by constructing sophisticated RNA-based regulatory networks that interact with generic effector complexes to control gene expression patterns and the epigenetic trajectories of differentiation and development. Environmental information can also be conveyed into this regulatory system via RNA editing, especially in the brain. Moreover, the observations that RNA-directed epigenetic changes can be inherited raises the intriguing question: has evolution learnt how to learn?