Functions, Variants, Traits and Diseases - Three Interesting Commentaries (#GWAS)

Universal Gate in Electronic Circuit

What role does a single gene play in the bigger picture of defining traits? We will first present a conceptual idea from electrical engineering and then discuss three interesting commentaries.

In digital electronics, the part of electrical engineering that specializes in designing the logic circuits of everything from traffic lights to complicated computer chips, logic gates are the basic building blocks. Given two binary inputs, there are six standard logical operations - AND, OR, XOR, NAND, NOR and XNOR. If someone creates those six blocks using transistors, the blocks can be interconnected to build any complex computer circuit. In the electronic control circuit, they are the equivalent of ‘genes’.

There is more. It has been shown that each of those basic operations can itself be built from only one gate - NAND. Therefore, six ‘genes’ can be reduced to one ‘gene’ without any loss of complexity. In principle, that single gate type is enough to form very complex circuits such as modern microprocessors. NOR works in a similar manner.
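The reduction to NAND can be checked directly. The following is a minimal sketch in Python, with function names chosen here for illustration: every gate is defined only in terms of `nand`, and `truth_table` lists the outputs for the inputs (0,0), (0,1), (1,0), (1,1).

```python
def nand(a, b):
    """The only primitive gate: output is 0 only when both inputs are 1."""
    return 1 - (a & b)

# Every other gate below is wired purely from nand().
def not_(a):
    return nand(a, a)

def and_(a, b):
    return not_(nand(a, b))

def or_(a, b):
    return nand(not_(a), not_(b))

def nor(a, b):
    return not_(or_(a, b))

def xor(a, b):
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

def xnor(a, b):
    return not_(xor(a, b))

def truth_table(gate):
    """Outputs of a two-input gate for inputs (0,0), (0,1), (1,0), (1,1)."""
    return [gate(a, b) for a in (0, 1) for b in (0, 1)]

for gate in (and_, or_, xor, nand, nor, xnor):
    print(gate.__name__, truth_table(gate))
```

The six behaviours differ only in how copies of the same unit are interconnected, which is the point of the ‘gene’ analogy.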

Cellular logic does not always work through binary operations like the ones above, but the general point holds: more building units (genes) are not necessary for higher complexity when that complexity can be achieved through interconnections between the genes. Yet gene annotation exercises use the faulty logic of trying to reduce each gene to one function or one type of function. That logic gets extended into the attempts to find specific genes for diabetes or intelligence.

The taste of Heat

The Ecodevoevo blog presented an interesting example from a recent paper in which the search for a gene for a specific task fails.

A paper in this week’s Nature (“A gustatory receptor paralogue controls rapid warmth avoidance in Drosophila,” Ni et al.) reports that response to a steep temperature gradient by Drosophila may be through a taste receptor. Another legacy of the “gene for” approach to understanding genetics.

Most organisms prefer a small range of external temperatures, because it helps in body temperature maintenance, but the molecular mechanisms for thermal preference had not been understood. Now Ni et al. suggest one.

“Here we [show] that thermal preference is not a singular response, but involves multiple systems relevant in different contexts. We found previously that the transient receptor potential channel TRPA1 acts internally to control the slowly developing preference response of flies exposed to a shallow thermal gradient. We now find that the rapid response of flies exposed to a steep warmth gradient does not require TRPA1; rather, the gustatory receptor GR28B(D) drives this behaviour through peripheral thermosensors.”

That is, fruit flies respond to minor temperature fluctuation with one mechanism, the TRPA1, and to sharp fluctuations with another, a taste receptor.

Cystic Fibrosis, Genetic Variation and Gene Function

How about assigning variants to diseases? Another commentary from the same blog discusses how even the best-understood variant, the ΔF508 deletion associated with cystic fibrosis, looks less well understood once the dominant-recessive relations between the two copies of the gene are taken into account.

More than 2000 variants in the cystic fibrosis transmembrane conductance regulator gene (CFTR) have been linked to the disease. About 70% of Europeans with cystic fibrosis have 2 copies of a 3 base deletion, called ΔF508 because the deletion occurs at position 508 in the gene. This deletion causes a particular piece of the CFTR protein not to be made when the protein is being synthesized, and this leads to an abnormal protein, and disease. That there can be one predominant allele in this ‘recessive’ disease is consistent with population genetics theory and also with the early discovery of the gene: the fact that a high fraction of cases have two copies of the allele (or at least one copy, along with some other variant) made it findable with the techniques of a generation ago.

Why this deletion causes the particular symptoms of cystic fibrosis is well understood, but why most of the remaining 2000 variants cause disease has not been demonstrated.

Once the ΔF508 allele was found, it was shown to have a strong effect in carriers of two copies. This was consistent with the idea that CF is “recessive”, and in a kind of circular strategy this is what was found because the variant is so common.

But then other patients were found who had only one copy of that allele, plus a not-normal sequence in their other copy. Because the trait was assumed to be “recessive”, this other variant gets incorporated into the data base as if it’s causal. This is a kind of circular reasoning gone another step: if the disease is assumed to be recessive, then the variants in both the patient’s copies of the gene must be causal!

In some cases, the nature of both mutations in the gene, in terms of where in the protein structure the variant occurred, was able to confirm a likely causal effect. This is consistent with the current paper. However, it was obvious that the phenotypes were not all the same, and many if not most individuals had two different variants. That is not what “recessive” is classically supposed to mean, namely two copies of the ‘bad’ variant. Instead, we have a quantitative relationship between the diploid genotype (that is, both copies of the gene) and the severity of the trait.

If this assumption is wrong, it could be that one or even both variants the patient has are not themselves causal, but only causal in the context of some other site(s) in the genome that interact with the CFTR gene, or even that have similar effects on their own. So to continue to use the term we essentially get from Mendel and his peas (“recessive”) when what we really have is a quantitative relationship between genotype and phenotype, one that may not even always involve the same gene, is an example of a theory lasting beyond the evidence, because investigators ‘want’ the trait to fit the simple model. The persistence of the simple model shows our addiction to it and the historical legacy by which terms and concepts cling on when they should be modified or abandoned. This is a common issue that applies to many purportedly single-gene traits. It’s why we put the word in quotes in this post.

Of course when a variant is very rare, and is generally seen in patients who also carry one of the strong-effect alleles, we have almost no way to test whether that variant is causal or not. If it hits a known major part of the gene, some severity correlations can be identified, and this has been known since about 1990. The current paper provides some larger-scale documentation, but really is confirming what has long been well known. Naturally, it is no surprise that the authors could not attribute a specific mechanism to the almost 2000 variants in their study, many of which will be singletons, which is why they’ll be so hard to confirm.

We still don’t know why children resemble their parents

The Finch and Pea blog presents another commentary relevant in this context.

I came to biology via chemistry & physics (and music too), and did my PhD in a hard core biochemistry and biophysics department. I certainly knew that differences in our DNA made us different from each other, but for me the question of the relationship between genotype and phenotype was addressed by knocking out genes to see which ones were essential for a particular pathway.

When I came to Barak Cohen’s lab and encountered this work, I had my genetic variation epiphany: understanding the relationship between genotype and phenotype is not so much about knowing what genes are essential to a pathway (that’s molecular biology), it’s about knowing which genes vary in a population and cause variation in a phenotype.

I think geneticists underestimate how many biologists have never, ever thought about this question. It’s clear many haven’t thought of the issue, because a common question our lab used to get is: if you want to know what genes are involved in sporulation, why don’t you just do a knock-out screen? But the point of the project wasn’t to explain sporulation - it was to explain variation in sporulation.

This was largely before GWAS, so the situation has probably improved - more biologists are aware of questions about variation. But until you start thinking about it, and start reading the navel-gazing and head-scratching prompted by the less-than-exciting GWAS results, it’s easy to miss the fact that the question of why children resemble their parents is non-trivial, unanswered, and raises some profound issues that geneticists have actively struggled with since Darwin highlighted how little scientists knew about heredity.

GWAS

Contrast the above with the following highly cited ‘success story’ on GWAS published in Nature Genetics in 2010.

Common SNPs explain a large proportion of the heritability for human height

SNPs discovered by genome-wide association studies (GWASs) account for only a small fraction of the genetic variation of complex traits in human populations. Where is the remaining heritability? We estimated the proportion of variance for human height explained by 294,831 SNPs genotyped on 3,925 unrelated individuals using a linear model analysis, and validated the estimation method with simulations based on the observed genotype data. We show that 45% of variance can be explained by considering all SNPs simultaneously. Thus, most of the heritability is not missing but has not previously been detected because the individual effects are too small to pass stringent significance tests. We provide evidence that the remaining heritability is due to incomplete linkage disequilibrium between causal variants and genotyped SNPs, exacerbated by causal variants having lower minor allele frequency than the SNPs explored to date.

The authors took 300K SNPs (!!) genotyped on fewer than 4,000 individuals, developed some kind of fitting model and claimed that they could explain 45% of the variance in human height from those SNPs. It is possible to fit anything with 300K parameters, but how realistic is that model?
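The general worry about parameter counts can be illustrated with a toy example. The sketch below is not the estimation method of the paper. It uses made-up random data, and the names `solve_linear` and `max_residual` are hypothetical. It only shows that an unconstrained linear fit with as many free weights as individuals reproduces a phenotype exactly, even when the phenotype is pure noise unrelated to the predictors.

```python
import random

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def max_residual(n_individuals, seed=0):
    """Fit n random predictors to n random phenotypes; return the worst error."""
    rng = random.Random(seed)
    # Made-up data: one row per individual, one column per predictor.
    predictors = [[rng.gauss(0, 1) for _ in range(n_individuals)]
                  for _ in range(n_individuals)]
    # The phenotype is independent noise; it has no real relation to the predictors.
    phenotype = [rng.gauss(0, 1) for _ in range(n_individuals)]
    weights = solve_linear(predictors, phenotype)
    fitted = [sum(w * x for w, x in zip(weights, row)) for row in predictors]
    return max(abs(f - y) for f, y in zip(fitted, phenotype))

# The fit is exact up to floating-point rounding.
print(max_residual(8) < 1e-6)
```

A perfect in-sample fit of this kind says nothing about whether the model is realistic, which is why the question of how a model with so many SNPs was constrained and validated matters.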

Getting back to the unreal real world -

Boston hospitals to launch landmark genome study of newborns

Beginning in early 2014, the study, the first-ever randomized trial of the benefits and risks of such sequencing, will enroll 480 newborns and their parents, the hospitals said. The volunteers, healthy newborns from Brigham and Women’s Hospital and infants from Boston Children’s Hospital’s Neonatal Intensive Care Unit, will be divided into two groups. One group will receive conventional state-mandated newborn screening, the other will receive conventional screening and genome sequencing. Researchers will collect and analyze the genomic sequences, which may include information on potential causes of any birth defects, predispositions to future medical conditions and predictions about responses to certain drugs, and will return that information to parents and pediatricians to evaluate the medical, psychosocial and economic outcomes.
