We came across the above tweet from Magdalena Skipper, the genetics editor of Nature magazine. Nature used to be a fine journal in some distant past, but lately it has become the mouthpiece of various over-rated clowns promoting their pet projects.
Today’s self-serving article comes from Broadstar Steven Hyman, who runs the Stanley Center for Psychiatric Research at the Broad Institute. Dr. Hyman proposes to collect genetic data from more than 100,000 people and run a giant GWAS study to search for ‘the depression gene’, which he is sure will be found through his study. Needless to say, he expects the government to pay for his pet project and the Broad Institute to benefit greatly from it.
Dr. Hyman made his case about bigger GWAS studies for mental diseases so poorly that it reminded us of several popular quotes about insanity. Let us start with one usually attributed to Einstein:
Insanity: doing the same thing over and over again and expecting different results.
The largest meta-analysis of genome-wide association studies of depression (approximately 9,500 cases) has yielded no significant findings. Similarly sized studies of almost all other conditions have convincingly implicated at least some genetic loci (see ‘Signal search’). So far, 108 independent loci have been found to demonstrate genome-wide significance for schizophrenia.
Nonetheless, I am convinced that genetic variants for depression can be found. [snip] More than 100,000 people with MDD will be needed to find enough loci to inform biology and therapeutics. Amassing a data set of this scale is difficult, but worthwhile and possible.
Geez. What if he fails with 100K people? We already know the answer. Dr. Hyman will propose to sequence all of humanity, including Bushmen, Peshmerga and Jarawas.
What are the metrics for failure versus success in such a large study? We know there are none, given that in the very next sentence Dr. Hyman throws in the canard of the ‘height GWAS’ study explaining one-fifth of heritability.
A meta-analysis of genome-wide association studies published earlier this year for adult height included more than 250,000 subjects and has found 697 common variants thus far, explaining nearly one-fifth of the heritability.
To understand why that statement is ‘at best misleading’, readers are encouraged to look at the following four blog posts by Professor Ken Weiss. By way of introduction, Professor Weiss wrote his textbook on genetic variation and human disease twenty years ago and has been arguing against bad science (such as the type promoted by Hyman) on his blog for a long time.
The most recent is an extensive study in (where else?) Nature Genetics, by a page-load of authors. In summary, the authors found, pooling all the data from many different independent studies in different populations, that at their most stringent statistical acceptance (‘significance’) level, 697 independent (uncorrelated) spots scattered across the genome each individually contributed in a significance-detectable way to stature. The individual contributions were generally very small, but together about 20% of the overall estimated genetic component of stature was accounted for. However, using other criteria and analysis, including lowering the acceptable significance level, and depending on the method, up to 60% of the heritability could be accounted for by around 9,500 different ‘genes’. Don’t gasp! This kind of complexity was anticipated by many previous studies, and the current work confirms that.
Many issues are swept under the rug here, however–that is, relegated to a sometimes obscure warren of tunneling of Supplemental information. The individuals were all of European descent so that genetic contributions in other populations are not included. The analysis was adjusted for sex and age. Subjects were all, presumably, ‘normal’ in stature (i.e., no victims of Marfan or dwarfism), and all healthy enough to participate as volunteers. The majority were older than 18, but the range was 14 to 103, and 45% were male, 55% female. The data were also adjusted for family relationships.
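To make the numbers in Professor Weiss's summary concrete, here is a back-of-the-envelope sketch in Python. It only divides the quoted totals (about 20% of heritability across 697 loci, and up to 60% across roughly 9,500) by the number of loci. The function name is ours, and the equal split is a simplifying assumption for illustration; real effect sizes are not uniform.

```python
def mean_share_per_locus(total_share: float, n_loci: int) -> float:
    """Average fraction of heritability per locus, assuming an equal split.

    The equal split is an illustrative assumption, not a claim about the
    actual distribution of effect sizes in the height study.
    """
    return total_share / n_loci


# Figures as quoted above: ~20% from 697 loci at the strict threshold,
# up to ~60% from ~9,500 loci under relaxed criteria.
strict = mean_share_per_locus(0.20, 697)
relaxed = mean_share_per_locus(0.60, 9500)

print(f"strict threshold:  {strict:.4%} of heritability per locus on average")
print(f"relaxed threshold: {relaxed:.4%} of heritability per locus on average")
```

On this crude average, each of the 697 ‘significant’ loci accounts for well under a tenth of one percent of the heritability of height, and the relaxed set for even less per locus.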
Lately we have seen a big spike in the number of GWAS-type studies to ‘understand’ various aspects of the human brain. Dan Graur reported on one particularly bad study with his humorous touch.
The title states: Common genetic variants associated with cognitive performance identified using the proxy-phenotype method.
In the following I present to the reader with an annotated abstract:
We identify common genetic variants associated with cognitive performance using a two-stage approach, which we call the proxy-phenotype method. [We performed a complicated GWAS meta-analysis on an ill defined trait that only the Illuminati know how to measure and whose reproducibility is nil.]
First, we conduct a genome-wide association study of educational attainment in a large sample (n = 106,736), which produces a set of 69 education-associated SNPs. Second, using independent samples (n = 24,189), we measure the association of these education-associated SNPs with cognitive performance. [We now brag about the amount of unoriginal data that we used]
Three SNPs (rs1487441, rs7923609, and rs2721173) are significantly associated with cognitive performance after correction for multiple hypothesis testing. [We found three SNPs that are correlated with whatever cognitive performance measures.]
In an independent sample of older Americans (n = 8,652), we also show that a polygenic score derived from the education-associated SNPs is associated with memory and absence of dementia. [We studied carefully Darrell Huff’s 1954 book How to Lie with Statistics and decided to follow his negative examples by combining apples and oranges]
Convergent evidence from a set of bioinformatics analyses implicates four specific genes (KNCMA1, NRXN1, POU2F3, and SCRT). All of these genes are associated with a particular neurotransmitter pathway involved in synaptic plasticity, the main cellular mechanism for learning and memory. [From other studies we picked 4 genes for which we could spin a Just So Story, and ignored the vast majority of data pointing to gene deserts.]
The penultimate sentence in the discussion states:
In future work, the magnitude of explained variance will increase as researchers gain access to datasets with even larger first-stage samples.
In other words, a sample of 106,736 genomes, which produced a set of 69 education-associated SNPs is not enough. With more data the magnitude of explained variation will increase. Unfortunately, the magnitude of explained variation in this study is nowhere to be found in the paper. One has to dig in the supplementary material, to find its value: 0.0002 to 0.0006.
I bettcha one can replace cognitive performance by favorite color and get similar results. Why do semi-respectable journals still publish such crap?
Picking up on his last sentence, it seems that semi-respectable journals (PNAS) publishing crap and formerly reputable journals (Nature) pimping for bigger-sized crap have become the new industry standard.
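For readers who want a feel for how small the 0.0002 to 0.0006 figure quoted by Graur is, here is a rough sketch. It assumes the figure is a proportion of variance explained (R²), which is our reading of "magnitude of explained variation", and converts it to the equivalent correlation and to the share of variance left unexplained. The function names are ours.

```python
import math


def correlation_from_r2(r2: float) -> float:
    """Magnitude of the correlation implied by a proportion of variance explained."""
    return math.sqrt(r2)


def unexplained_share(r2: float) -> float:
    """Fraction of trait variance left unexplained."""
    return 1.0 - r2


# Range quoted from the supplementary material, read here as R^2 (an assumption).
for r2 in (0.0002, 0.0006):
    r = correlation_from_r2(r2)
    rest = unexplained_share(r2)
    print(f"R^2 = {r2}: |r| = {r:.4f}, unexplained = {rest:.2%}")
```

Under that reading, the implied correlation is about 0.014 to 0.024, and more than 99.9% of the variation in the trait is left unaccounted for.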