Alternative Splicing - the New Snake Oil to Explain 'Human Complexity'
We have increasingly noticed bioinformaticians chasing alternative splicing in RNAseq data. By itself, that qualifies as harmless curiosity, but when one combines it with the additional observation that various ENCODE and GTEx clowns are using alternative splicing to ‘explain’ ‘human complexity’, the situation gets worrisome. On the latter point, look no further than a paper published by leading ENCODE clown Mike Snyder last year (covered in Human Transcriptome is Extremely Complex and Snyderome is the Most Complex of All).
The problem started in 2001, when the number of protein-coding genes in the human genome turned out to be only about 25,000, far short of the expected 100,000. That was comparable to the 20,000 genes in the measly worm. The observation deflated the egos of a few bad scientists, who have been looking for other explanations of human complexity ever since. Using alternative splicing to turn those 25,000 genes into 100,000 splice forms is one method; others include declaring the entire genome ‘biochemically functional’ and talking incessantly about human long non-coding RNA as something marvelous. The Sandwalk blog has a full list.
Here’s the latest list of the sorts of things that may salvage your ego if it has been deflated.
1. Alternative Splicing: We may not have many more genes than a fruit fly but our genes can be rearranged in many different ways and this accounts for why we are much more complex. We have only 25,000 genes but through the magic of alternative splicing we can make 100,000 different proteins. That makes us almost ten times more complex than a fruit fly. (Assuming they don’t do alternative splicing.)
2. Small RNAs: Scientists have miscalculated the number of genes by focusing only on protein encoding genes. Our genome actually contains tens of thousands of genes for small regulatory RNAs. These small RNA molecules combine in very complex ways to control the expression of the more traditional genes. This extra layer of complexity, not found in simple organisms, is what explains the Deflated Ego Problem.
3. Pseudogenes: The human genome contains thousands of apparently inactive genes called pseudogenes. Many of these genes are not extinct genes, as is commonly believed. Instead, they are genes-in-waiting. The complexity of humans is explained by invoking ways of tapping into this reserve to create new genes very quickly.
4. Transposons: The human genome is full of transposons but most scientists ignore them and don’t count them in the number of genes. However, transposons are constantly jumping around in the genome and when they land next to a gene they can change it or cause it to be expressed differently. This vast pool of transposons makes our genome much more complicated than that of the simple species. This genome complexity is what’s responsible for making humans more complex.
5. Regulatory Sequences: The human genome is huge compared to those of the simple species. All this extra DNA is due to increases in the number of regulatory sequences that control gene expression. We don’t have many more protein-encoding regions but we have a much more complex system of regulating the expression of proteins. Thus, the fact that we are more complex than a fruit fly is not due to more genes but to more complex systems of regulation.
6. The Unspecified Anti-Junk Argument: We don’t know exactly how to explain the Deflated Ego Problem but it must have something to do with so-called “junk” DNA. There’s more and more evidence that junk DNA has a function. It’s almost certain that there’s something hidden in the extra-genic DNA that will explain our complexity. We’ll find it eventually.
7. Post-translational Modification: Proteins can be extensively modified in various ways after they are synthesized. The modifications, such as phosphorylation, glycosylation, editing, etc., give rise to variants with different functions. In this way, the 25,000 primary protein products can actually be modified to make a set of enzymes with several hundred thousand different functions. That explains why we are so much more complicated than worms even though we have similar numbers of genes.
What, then, is the explanation for ‘human complexity’? There is none, because there is nothing to explain. Humans are no more complex than most other animals.
Getting back to the original point, I would be very cautious about reading too much into alternative splicing data from RNAseq unless it is backed by other biochemical confirmation.