We encourage our readers to take a look at the comparison of long-read assemblers by Ryan Wick and Kathryn Holt. The authors benchmarked five different assemblers, namely Canu, Flye, Ra, Unicycler and Wtdbg2.
This post is not about the assemblers but rather the mode of publication. It would greatly help the bioinformatics community if Wick and Holt did not publish their comparison in an “official journal”.
This is not because they did a sloppy job, but quite the opposite. They included all the code and data for the comparison, so others can quickly evaluate their results by rerunning it. This kind of rapid validation is unthinkable for most other published work. Moreover, they wrote up their results in the format of an academic paper and posted it as the README file of the GitHub project. This is high-quality work.
If you are like me, you are probably tired of reading benchmarking papers in bioinformatics journals. Often, by the time such papers go through the review and publication process, the underlying programs have been revised multiple times. Therefore, it is hard to evaluate whether the benchmarking results are still valid without doing another round of benchmarking.
It would be trend-changing for bioinformatics if benchmarking papers were published as comprehensive GitHub projects like the one Wick and Holt developed. That way, anyone interested in extending the benchmark can simply fork the project, add extra comparisons and republish the results on GitHub. For example, I noticed that the Peregrine assembler was not included in the current comparison. Adding it to an already published journal paper would be impossible, but it is rather easy with a GitHub publication.