Today's Rainbow Chasing Paper by ENCODE Leader-in-Exile Ewan Birney

A new paper on bioRxiv reminded us of a story shared by Ken Weiss.

Survivorship bias

Ellenberg begins his book with an illustration of how abstract logical thinking can solve important real-world problems in subtle ways. In WWII a mathematics research group was asked by the Army to help them locate armor plating on fighter aircraft. The planes were returning to base with scattered bullet holes from enemy fire and the idea was to put some protective plating where it would do the most good without adding cumbersome mileage-eating weight. The mathematician suggested to put the plating where the bullet holes weren’t. This seemed strange until he explained that this was because the bullet holes that were observed hadn’t done much damage: bullets hitting elsewhere had brought the plane down so it was never observed because the plane never returned to base. The engine compartment was the case in point: a shot to the engine was fatal to the aircraft, but to the wings and body, much less so.

If you think about the genome as the body of the plane and the variants as the bullets, survivorship bias would suggest that the places with variants are less important than the places with no change. This is nicely explained by Weiss in his commentary.

How can a gene be central to the development of the basis of a trait, and yet not be found in mapping to identify variation that causes failures of the trait? Indeed, the basic finding of GWAS and most other mapping approaches is that the tens or hundreds or thousands of genome ‘hits’ have individually trivial effects.

The answer may lie in survivorship bias. Like the lethality of bullets to the engine of a fighter, most variation in the main genes, those whose sequence is more highly conserved, is lethal to the embryo or manifest in pathology so clear that it never is the subject of case-control or other sorts of Big Data mapping. In other words, genome mapping may systematically be inevitably constrained to find small effects! That’s exactly the opposite of what’s been promised, and the reason is that the promises were, psychologically or strategically, based on extrapolation of the findings of strong, single-gene effects causing severe pediatric disease–a legacy of Mendel’s carefully chosen two-state traits.

To the extent this is a correct understanding, then genomewide mapping as it’s now being done is, from an evolutionary genomic perspective, necessarily rainbow-chasing. Indeed, a possibility is that most adaptive evolution is itself also due to the effects of minor variants, not major ones. Once the constraining interaction of the major genetic factors is in place, mostly what can nudge organisms in this direction or that, whether adaptively or in relation to complex, non-congenital disease, is based on assembled effects of individually very minor variants. In turn, that could be why slow, gradualism was so obviously the way evolution worked to Darwin, and why it generally still seems that way today.
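The survivorship argument above can be illustrated with a toy simulation. This is only a sketch under made-up assumptions: the exponential distribution of effect sizes, the `lethal_threshold` cutoff and the function name are all hypothetical and not taken from Weiss or from any real dataset. It shows that if every variant with an effect above some cutoff never makes it into a study, the variants left to be mapped necessarily have small effects.

```python
import random


def simulate_observed_effects(n_variants=10000, lethal_threshold=1.0, seed=0):
    """Toy model of survivorship bias in mapping studies.

    Assumptions (hypothetical, for illustration only):
    - effect sizes are drawn from an exponential distribution
    - any variant with effect >= lethal_threshold is never observed
      (the "plane that did not return to base")
    """
    rng = random.Random(seed)
    all_effects = [rng.expovariate(2.0) for _ in range(n_variants)]
    # Only variants below the cutoff survive to be sampled in a study.
    observed = [e for e in all_effects if e < lethal_threshold]
    return all_effects, observed


all_effects, observed = simulate_observed_effects()
mean_all = sum(all_effects) / len(all_effects)
mean_observed = sum(observed) / len(observed)

print(f"variants generated: {len(all_effects)}, observed: {len(observed)}")
print(f"largest effect generated: {max(all_effects):.2f}")
print(f"largest effect observed:  {max(observed):.2f}")
print(f"mean effect generated: {mean_all:.3f}, observed: {mean_observed:.3f}")
```

Whatever the true distribution looks like, the observed set is capped at the cutoff by construction, which is the point of the bullet-hole story.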

Someone needs to explain that to ENCODE clown Ewan Birney, who developed a new tool to remedy a ‘frustration’ that is due to lack of understanding of science and not due to lack of tools. The new paper is available here.

Genome wide association studies provide an unbiased discovery mechanism for numerous human diseases. However, a frustration in the analysis of GWAS is that the majority of variants discovered do not directly alter protein-coding genes. We have developed a simple analysis approach that detects the tissue-specific regulatory component of a set of GWAS SNPs by identifying enrichment of overlap with DNase I hotspots from diverse tissue samples. Functional element Overlap analysis of the Results of GWAS Experiments (FORGE) is available as a web tool and as standalone software and provides tabular and graphical summaries of the enrichments. Conducting FORGE analysis on SNP sets for 260 phenotypes available from the GWAS catalogue reveals numerous overlap enrichments with tissue-specific components reflecting the known aetiology of the phenotypes as well as revealing other unforeseen tissue involvements that may lead to mechanistic insights for disease.
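For readers unfamiliar with what "enrichment of overlap" means in the abstract, here is a rough sketch of the general idea. It is not FORGE's code: the abstract does not spell out the statistics, so a plain permutation-style test with uniformly random background positions stands in for whatever the tool actually does. The coordinates, the function names and the single-chromosome setup are invented for illustration.

```python
import bisect
import random


def count_overlaps(positions, hotspots):
    """Count positions that fall inside any hotspot.

    hotspots: sorted, non-overlapping half-open intervals (start, end).
    """
    starts = [start for start, _ in hotspots]
    hits = 0
    for pos in positions:
        # Find the last hotspot starting at or before this position.
        i = bisect.bisect_right(starts, pos) - 1
        if i >= 0 and pos < hotspots[i][1]:
            hits += 1
    return hits


def empirical_enrichment(snps, hotspots, genome_length,
                         n_background=1000, seed=0):
    """Compare the observed overlap count with random background sets.

    Assumption: background positions are uniform over the toy genome.
    A real analysis would need a more careful background.
    """
    rng = random.Random(seed)
    observed = count_overlaps(snps, hotspots)
    at_least = 0
    for _ in range(n_background):
        background = [rng.randrange(genome_length) for _ in snps]
        if count_overlaps(background, hotspots) >= observed:
            at_least += 1
    # Add-one correction so the empirical p-value is never exactly zero.
    p_value = (at_least + 1) / (n_background + 1)
    return observed, p_value


# Synthetic toy data, not real hotspots or real GWAS SNPs.
hotspots = [(100, 200), (500, 600), (900, 950)]
snps = [120, 150, 550, 910, 3000]

observed, p_value = empirical_enrichment(snps, hotspots, genome_length=10000)
print(f"SNPs overlapping hotspots: {observed} of {len(snps)}")
print(f"empirical p-value: {p_value:.4f}")
```

Note that a test like this only says the SNPs sit in hotspots more often than chance. It says nothing about the size of their effects.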

The abstract starts with a lie, which is unfortunate but typical of Birney. GWAS has not helped in the discovery of the mechanism of any complex human disease. In fact, it failed miserably at describing supposedly simple traits like height (check height of folly by Weiss), yet it is being actively promoted to find genetic causes of diabetes, obesity, intelligence and whatnot. For example, check this post by Dan Graur - GWAS Excrement Again: PNAS Paper Explains 0.02% of the Variation in an Ill-Defined Trait.

Written by M.