Readers of our blog should already be familiar with parts of this project through the Minia assembler and the DSK k-mer counter. Minia uses the same idea as diginorm (Bloom filters), but builds an entire assembler around it. Now Rayan Chikhi, Guillaume Rizk, Dominique Lavenier and their collaborators have turned those programs into a full library of reusable modules.
Motivation: Efficient and fast NGS algorithms are essential to analyze the terabytes of data generated by the next generation sequencing machines. A serious bottleneck can be the design of such algorithms, as they require sophisticated data structures and advanced hardware implementation.
Results: We propose an open-source library dedicated to genome assembly and analysis to fasten the process of developing efficient software. The library is based on a recent optimized de-Bruijn graph implementation allowing complex genomes to be processed on desktop computers using fast algorithms with very low memory footprints.
Availability and Implementation: The GATB library is written in C++ and is available at the following web site http://gatb.inria.fr under the A-GPL license.
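To make the Bloom filter idea concrete, here is a minimal sketch in Python of how a de Bruijn graph can be represented implicitly: the k-mers of the reads go into a Bloom filter, and the graph is walked by asking the filter which of the four possible extensions of a k-mer are present. This is an illustration only, not GATB's or Minia's actual code or API. The names `BloomFilter`, `kmers` and `successors` are hypothetical, and the sketch ignores reverse complements and the removal of false positives, both of which a real assembler has to handle.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter (hypothetical helper, not part of GATB)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(),
                digest_size=8,
                salt=seed.to_bytes(8, "little"),
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May return True for an item never added (false positive),
        # but never returns False for an item that was added.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(seq, k):
    """Yield every substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def successors(bloom, kmer):
    """Return the out-neighbours of kmer that the filter reports as present."""
    suffix = kmer[1:]
    return [suffix + base for base in "ACGT" if suffix + base in bloom]


if __name__ == "__main__":
    bloom = BloomFilter(num_bits=1 << 16, num_hashes=3)
    for km in kmers("ACGTACGGT", 4):
        bloom.add(km)
    print(successors(bloom, "ACGT"))
```

The memory cost is a fixed number of bits per k-mer regardless of k, which is what makes it plausible to process large genomes on a desktop machine. The price is that some reported neighbours may be false positives.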