My earlier commentary on triplicate experiments received a number of insightful comments from our readers. Instead of responding to each of them individually, I decided to write another commentary on the same topic to explain my thinking. I do not disagree with the statistical reasoning the readers presented about having more data, but I feel that we are systematically removing ‘cleverness’ from our methodologies. To explain, let me ask my original question in a slightly different manner.
Imagine absolutely nothing is known about the genes involved in human heart formation, and you are the first person trying to answer the question. You have enough money to do exactly three pairs of RNAseq experiments comparing heart and muscle. What will be most informative?
Researcher A collects one pair of RNA samples from one individual and sequences it three times. That is stupid, right?
Researcher B sequences heart and muscle RNA samples from three individuals (his mom, his dad and himself) and finds about 1000 genes up- or down-regulated in heart relative to muscle.
Instead, researcher C collects samples from one Asian man, one European woman and one African child and does his experiment.
Intuitively we know that even though both researchers B and C did triplicate measurements, C designed his experiment more intelligently.
If we were doing array measurements, those would be the only options. However, RNAseq opens up a host of other, potentially more informative possibilities. Why so? It is because RNAseq is both a sequencing study and an expression study.
Researcher D sequences heart and muscle samples from human, chimp and gorilla. That is another interesting possibility with RNAseq.
Researcher E does a Hail Mary pass and sequences heart and muscle samples from human, mouse and Drosophila. He finds about 30 genes over- or under-expressed in heart vs muscle in all three species.
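To make the "in all samples" criterion concrete, here is a minimal Python sketch of how one might keep only the genes that change in the same direction in every heart-vs-muscle pair. Everything in it is an assumption for illustration: the function name `consistent_genes`, the log2 fold-change cutoff of 1.0, the gene labels and the toy numbers are all made up, and for a cross-species comparison it assumes the genes have already been mapped to shared ortholog identifiers.

```python
def consistent_genes(log2fc_by_sample, threshold=1.0):
    """Return genes that pass the cutoff with the same sign in every sample.

    log2fc_by_sample maps a sample label to {gene: log2(heart / muscle)}.
    Genes missing from any sample are ignored.
    """
    samples = list(log2fc_by_sample.values())

    # Only genes measured in every sample can be compared.
    shared = set(samples[0])
    for sample in samples[1:]:
        shared &= set(sample)

    result = {}
    for gene in sorted(shared):
        values = [sample[gene] for sample in samples]
        if all(v >= threshold for v in values):
            result[gene] = "up"
        elif all(v <= -threshold for v in values):
            result[gene] = "down"
    return result


# Hypothetical toy values, not real measurements.
toy = {
    "human": {"geneA": 2.1, "geneB": -1.5, "geneC": 0.2, "geneD": 1.8},
    "mouse": {"geneA": 1.7, "geneB": -2.0, "geneC": 1.4, "geneD": -1.2},
    "fly": {"geneA": 1.2, "geneB": -1.1, "geneC": 0.1},
}

print(consistent_genes(toy))  # {'geneA': 'up', 'geneB': 'down'}
```

The same filter applies to researchers B, C and D. What changes between designs is how much the three samples differ from one another, and therefore how short and how interesting the surviving list is.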
Whose triplicate measurement is the most informative?