BARCODE: Fast Lossless Compression via Cascading Bloom Filters

This is a RECOMB poster of very nice work done by Roye Rozov.

Background / Purpose:

Reference-based compression is currently the most efficient means of compressing deep-sequencing reads. The most time-consuming step of such compression is aligning the reads to a reference genome.

Main conclusion:

We developed an algorithm that enables fast reference-based compression by hashing read sequences into Bloom filters. By avoiding alignment to the reference genome, it compresses nearly as well as alignment-based methods while running up to 9x faster, close to an order of magnitude.
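
To make the idea of hashing reads into a Bloom filter concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the poster's implementation: the `BloomFilter` class, the `recover_candidates` helper, the SHA-256 double-hashing scheme, the filter size and the toy sequences are all hypothetical. It stores only a single filter, so it does not reproduce the cascade named in the title. It assumes every read occurs exactly in the reference, and it simply reports any false positives among the candidates rather than resolving them.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter (hypothetical helper, not the poster's code)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive all bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item.encode("ascii")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def recover_candidates(reference, read_length, bloom):
    """Return every read-length window of the reference the filter accepts.

    The result always contains every stored read that occurs in the
    reference, and may also contain false positives.
    """
    windows = (
        reference[i:i + read_length]
        for i in range(len(reference) - read_length + 1)
    )
    return [window for window in windows if window in bloom]


# Toy data (assumed for illustration): each read is an exact substring
# of the reference, so no alignment step is needed to find it again.
reference = "ACGTACGTTAGCCGATAGGCTTAACG"
reads = ["ACGTT", "GCCGA", "TTAAC"]

bloom = BloomFilter(num_bits=1024, num_hashes=3)
for read in reads:
    bloom.add(read)

candidates = recover_candidates(reference, read_length=5, bloom=bloom)
print(candidates)
```

In this sketch the filter stands in for the reads themselves: decoding slides a read-length window along the reference and keeps whatever the filter accepts, so no read is ever aligned. A Bloom filter never yields false negatives, so every stored read is recovered. Any extra windows it accepts would have to be handled separately for the scheme to be lossless.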

Written by M.