Last year, we wrote extensively about the SPAdes paper (here, here, here, here and here). In addition to presenting an innovative algorithm, the original SPAdes paper could also be considered a mini-Assemblathon, because the authors ran all existing assembly programs on a small test set and designed their own assessment tools to convince themselves and others that the tests were appropriate. The QUAST paper linked here presents those quality assessment tools in more detail.
Summary: Limitations of genome sequencing techniques have led to dozens of assembly algorithms, none of which is perfect. A number of methods for comparing assemblers have been developed, but none is yet a recognized benchmark. Further, most existing methods for comparing assemblies are only applicable to new assemblies of finished genomes; the problem of evaluating assemblies of previously unsequenced species has not been adequately considered. Here we present QUAST, a QUality ASsessment Tool, to evaluate and compare genome assemblies. This tool improves on leading assembly comparison software with new ideas and quality metrics. QUAST can evaluate assemblies both with and without a reference genome. QUAST produces many reports, summary tables, and plots to help scientists in their research and in their publications. In this study, we used QUAST to compare several genome assemblers on three data sets. QUAST tables and plots for all of them are available in the Supplementary Material and interactive versions of these reports are on the QUAST website.
In case you did not realize it, the wonderful Rosalind educational platform comes from the same group. They are also planning to start a bioinformatics training course on Coursera.
Nikolay Vyahhi, an author of QUAST, emailed us another QUAST-related document that may help you get started.
We’d like to present QUAST, a convenient tool for assembly evaluation. It has already been mentioned on Homologus, when it was employed for SPAdes vs. Ray benchmarking, and more recently Rayan Chikhi recommended it in his guides. The QUAST paper was recently accepted by Bioinformatics and is now available in Advance Access.
QUAST computes a number of well-known metrics, including contig accuracy, number of genes discovered, N50, and others. It also introduces some new statistics, like NA50 (see paper). Each analysis produces summary tables in several formats, including human-readable plain text, easy-to-parse tab-separated tables, and LaTeX. Additionally, the tool generates colorful plots. All tables and plots are summarized in an interactive HTML report.
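For readers new to these metrics, here is a minimal sketch of how N50 is usually defined: the largest length L such that contigs of length at least L cover at least half of the total assembly length. This is our own illustration and not QUAST's code; the function name `n50` and its `fraction` parameter are made up for the example.

```python
def n50(contig_lengths, fraction=0.5):
    """Return the N50 of a list of contig lengths.

    Illustrative helper only (not part of QUAST). `fraction` is a
    hypothetical parameter: 0.5 gives N50, 0.75 would give N75.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")

    total = sum(contig_lengths)
    threshold = total * fraction

    # Walk the contigs from longest to shortest, accumulating length,
    # and stop at the first contig that pushes the running sum to the
    # threshold. That contig's length is the answer.
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= threshold:
            return length


# Total length is 280, so the threshold is 140.
# 100 alone is not enough; 100 + 80 = 180 reaches it.
print(n50([100, 80, 50, 30, 20]))  # prints 80
```

As we understand the paper, NA50 applies the same calculation to aligned blocks rather than raw contigs, after breaking contigs at misassembly points. That step needs alignments against a reference, so it is not shown here.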
Please continue here to read the rest.