The Most Challenging Problem in Bioinformatics
In chapter 6 of On the Origin of Species, Darwin wrote:
If it could be demonstrated that any complex organ existed, which could not possibly have been formed by numerous, successive, slight modifications, my theory would absolutely break down. But I can find out no such case.
Although Darwin mentioned complex organs, the statement would be equally valid for complex unicellular organisms, which were not widely known in his time. The genome sequencing projects of the last 15 years show one case that ‘absolutely breaks down’ the idea of numerous, successive, slight modifications leading to complexity: the origin of eukaryotic organisms from prokaryotes. Eugene Koonin describes the problem as ‘enigmatic’. Dan Graur calls it ‘one of the hardest and most interesting puzzles’. The difficulty is increasingly appreciated as the genomes of many different protozoa become available.
What is the challenge? It appears that a large number of eukaryotic genes came from ‘nowhere’ (see the 2002 paper The origin of the eukaryotic cell: A genomic investigation). Moreover, every scenario proposed to explain their gradual origin over evolutionary time hits one roadblock or another. It appears that eukaryotes have a number of novel genomic toolkits and that the entire package was already present together in the last eukaryotic common ancestor (‘LECA’).
In our blog on evolution, we are collecting a number of relevant papers with short explanations about various potential conjectures, but you can see that the question is far from answered.
Anyone interested in this topic should start with Bill Martin’s wonderful review paper. Other excellent ones are linked below.
Koonin: The Origin and Early Evolution of Eukaryotes in the Light of Phylogenomics
Why are Bacteria Different from Eukaryotes?
Why are Bacteria Different from Eukaryotes? (ii)
Origin of the cell nucleus, mitosis and sex: roles of intracellular coevolution
--------------------------------------------------
Removed based on a suggestion from Erich:
Can the bioinformaticians stick to developing tools and leave answering fundamental problems to others? When they do not, the result is a large amount of duplicate work, and a good example in this context is the CEGMA paper. The authors did not acknowledge in their paper that the same question had been addressed by evolutionary biologists for many years before their work, nor did they ask the fundamental questions answered in the prior papers. Two noteworthy examples of uncited prior work:
The origin of the eukaryotic cell: a genomic investigation
A comprehensive evolutionary classification of proteins encoded in complete eukaryotic genomes