Optimal K for Genome Assembly

De Bruijn graph assemblers such as Velvet and Trinity use K-mers of odd length (21, 23, 25, …). Here is the rationale. If K is even, some K-mers are their own reverse complement (e.g. ATATATATATAT). Such a K-mer cannot be distinguished from its counterpart on the opposite strand, which creates ambiguity in the de Bruijn graph and makes the graph difficult to resolve. Palindromic K-mers can always be avoided by choosing an odd K: the center nucleotide of an odd-length K-mer would have to be its own complement, and no nucleotide is.
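The parity argument can be checked by brute force for small K. The following is a minimal sketch, not code from any assembler; the function names are hypothetical and chosen only for illustration. It enumerates every K-mer over A, C, G, T and counts those equal to their own reverse complement.

```python
from itertools import product

# Watson-Crick complement of each nucleotide.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(kmer):
    """Return the reverse complement of a nucleotide string."""
    return "".join(COMPLEMENT[base] for base in reversed(kmer))


def is_self_reverse_complement(kmer):
    """True if the K-mer reads the same on both strands."""
    return kmer == reverse_complement(kmer)


def count_self_reverse_complement(k):
    """Count K-mers of length k that equal their own reverse complement.

    Enumerates all 4**k K-mers, so this is only practical for small k.
    """
    return sum(
        is_self_reverse_complement("".join(bases))
        for bases in product("ACGT", repeat=k)
    )


if __name__ == "__main__":
    print(is_self_reverse_complement("ATATATATATAT"))
    for k in range(2, 8):
        print(k, count_self_reverse_complement(k))
```

For odd k the count should come out as zero, because the middle base would have to be its own complement. For even k the first half of the K-mer determines the second half, so there should be 4^(k/2) such K-mers.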

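In color space, the reverse complement of a sequence is simply the reverse of the sequence, so a color K-mer is self-complementary whenever it reads the same in both directions. That means some odd-sized K-mers can be self-complementary (e.g. 12321). Even-sized K-mers can be self-complementary too: not only homopolymer runs like 11111111, 22222222, 33333333 and 44444444, but also any mirrored sequence such as 1221. Choosing an odd K therefore does not rule out palindromic K-mers in color space.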
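The first statement can be verified with a short sketch. It assumes the usual two-bit color encoding (A=0, C=1, G=2, T=3, with each color being the XOR of two adjacent base codes), which is not spelled out in the text above; the color labels here therefore run from 0 to 3, and the function names are hypothetical.

```python
# Assumed two-bit codes for the color encoding (an assumption of this sketch).
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def to_colors(seq):
    """Encode a nucleotide string as colors, one per adjacent base pair."""
    codes = [BASE_CODE[base] for base in seq]
    return "".join(str(a ^ b) for a, b in zip(codes, codes[1:]))


def is_color_palindrome(colors):
    """True if the color string equals its own reverse."""
    return colors == colors[::-1]


if __name__ == "__main__":
    seq = "ACGGTCA"
    forward = to_colors(seq)
    backward = to_colors(reverse_complement(seq))
    # The colors of the reverse complement are the forward colors reversed.
    print(forward, backward, backward == forward[::-1])
    # A color string that reads the same in both directions.
    print(to_colors("ACGT"), is_color_palindrome(to_colors("ACGT")))
```

Complementing both bases of a pair leaves their XOR unchanged, and XOR does not depend on the order of the two bases, which is why only the reversal survives in color space.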

http://www.homolog.us/blogs/blog/2013/04/23/informed-and-automated-k-mer-size-selection-for-genome-assembly/

