
What is lost in de Bruijn Graphs

Readers may wonder why de Bruijn graphs were not widely used with Sanger sequencing. The reason is that a de Bruijn graph does not preserve long-range positional information. Suppose a de Bruijn graph is constructed from a single long Sanger read. If the read has repetitive structure, the de Bruijn graph for the read itself will contain loops, and more than one path through those loops may be consistent with the graph. That means one cannot, in general, go back from the de Bruijn graph to the read. However, the read itself is a true representation of part of the genome. By converting it into a de Bruijn graph, we lose what was already known about that part of the genome. Loss of positional information is the biggest drawback of going from sequence space to de Bruijn graph space. The longer the read, the more there is to lose.
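The loss can be seen in a small sketch. The Python code below is only an illustration, not any assembler's implementation: the helper names `de_bruijn_edges` and `repeated_nodes`, the two toy reads, and the choice of k = 3 are all assumptions made for this example. It builds the graph with (k-1)-mers as nodes and k-mers as edges, then shows that two different reads sharing the repeat "AT" collapse into exactly the same graph.

```python
from collections import Counter


def de_bruijn_edges(read, k):
    """Return the multiset of edges of the de Bruijn graph of one read.

    Nodes are (k-1)-mers; each k-mer contributes one edge from its
    (k-1)-mer prefix to its (k-1)-mer suffix.
    """
    return Counter(
        (read[i:i + k - 1], read[i + 1:i + k])
        for i in range(len(read) - k + 1)
    )


def repeated_nodes(read, k):
    """Return the (k-1)-mers that the read visits more than once.

    The read is a walk through its own graph, so any node visited
    twice closes a loop in that graph.
    """
    counts = Counter(read[i:i + k - 1] for i in range(len(read) - k + 2))
    return sorted(node for node, n in counts.items() if n > 1)


# Two hypothetical reads that differ only in the order of the
# segments between copies of the repeat "AT".
read_a = "GATCATGATT"
read_b = "GATGATCATT"
k = 3

print("reads differ:      ", read_a != read_b)
print("graphs identical:  ", de_bruijn_edges(read_a, k) == de_bruijn_edges(read_b, k))
print("nodes visited twice:", repeated_nodes(read_a, k))
```

Because both reads yield the same edges with the same multiplicities, nothing in the graph records which of the two was actually sequenced. The order of the segments between the repeats, which the read stated explicitly, is gone.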

