Suppose you can locate a long read in the tardigrade genome sequencing data that carries three genes from Bacteroides and three from tardigrade. Is that convincing evidence that microbial genes were integrated into the tardigrade genome through horizontal gene transfer?
The tardigrade debate brings attention to a different kind of artifact that appears primarily in the long-read world. In an informative blog post titled “Quick Look at the Two Manuscripts on Tardigrade LGT?”, Dr. Julie Dunning Hotopp mentions the possibility of chimeric reads, in which distant parts of the genome may get joined. If the original sample is contaminated, microbial and eukaryotic genomic segments may end up side by side on some reads. She writes:
PacBio sequencing was conducted to further support these LGTs. First, low coverage PacBio is not a great method for LGT validation since it has steps in library construction that makes it prone to chimeras. This is a known problem we have published on that is not yet widely appreciated. However, LGTs that are recovered in both the PacBio dataset and the Illumina dataset should be real as you wouldn't expect such random events to occur repeatedly across two platforms. One figure is shown in the manuscript that is used to demonstrate the congruity. Congruity is expected, whether or not these are real LGTs or not, since most of the sequence is from tardigrade. Furthermore the PacBio assembly is <60 Mbp compared to the >200 Mbp Illumina assembly. This means that only about a quarter of the data in the Illumina assembly is found in the PacBio assembly. Therefore, a lot of data is missing; quite possibly these LGTs. If that was examined more closely, I couldn't find where it was presented. However, it does not seem to support the hypothesis.
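The two quantitative points in that passage can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything either group actually ran: the junction identifiers are made up, and `shared_candidates` and `assembly_fraction` are hypothetical helper names. The first function keeps only candidate LGT junctions seen on both platforms; the second shows why "<60 Mbp versus >200 Mbp" works out to somewhere under 30 percent.

```python
def shared_candidates(pacbio, illumina):
    """Return the candidate LGT junctions reported on both platforms."""
    return set(pacbio) & set(illumina)


def assembly_fraction(small_mbp, large_mbp):
    """Return the size of the smaller assembly as a fraction of the larger."""
    return small_mbp / large_mbp


# Hypothetical junction identifiers, for illustration only.
pacbio_hits = {"junction_A", "junction_B", "junction_C"}
illumina_hits = {"junction_B", "junction_C", "junction_D"}

# A chimera created during library preparation on one platform is unlikely
# to recur at the same junction on the other, so the intersection is the
# more trustworthy set.
print(sorted(shared_candidates(pacbio_hits, illumina_hits)))

# 60 Mbp and 200 Mbp are the bounds quoted above, so this is an upper limit
# on the fraction of the Illumina assembly the PacBio assembly can cover.
print(f"{assembly_fraction(60, 200):.2f}")
```

A junction that appears only in the PacBio list is exactly the kind of case the chimera warning applies to, and a junction missing from the PacBio list may simply fall in the large part of the Illumina assembly that the PacBio assembly does not cover.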
Meanwhile, the controversy has reached the pages of the mainstream media, with ‘open science’ claiming victory. Sometimes you have to wonder how Newton, Maxwell or Darwin managed to get anything done in the pre-Twitter world.
Meanwhile, Sujai Kumar from the Edinburgh team says, “The entire process is a victory for open science.” His colleagues could never have done their analysis if their rivals hadn't willingly and promptly released their data. And even just a few years ago, they would have had nowhere to upload a manuscript detailing the conflicting results, which other scientists could check and discuss. It took just nine days for the second paper to follow the first.