Mutation Identification By Comparing Whole Genome NGS Data Using k-mers

Readers may want to take a look at an interesting bioinformatics paper that came out in Nature Biotechnology (NBT). (h/t: @OmicsOmicsBlog)

Mutation identification by direct comparison of whole-genome sequencing data from mutant and wild-type individuals using k-mers

Genes underlying mutant phenotypes can be isolated by combining marker discovery, genetic mapping and resequencing, but a more straightforward strategy for mapping mutations would be the direct comparison of mutant and wild-type genomes. Applying such an approach, however, is hampered by the need for reference sequences and by mutational loads that confound the unambiguous identification of causal mutations. Here we introduce NIKS (needle in the k-stack), a reference-free algorithm based on comparing k-mers in whole-genome sequencing data for precise discovery of homozygous mutations. We applied NIKS to eight mutants induced in nonreference rice cultivars and to two mutants of the nonmodel species Arabis alpina. In both species, comparing pooled F2 individuals selected for mutant phenotypes revealed small sets of mutations including the causal changes. Moreover, comparing M3 seedlings of two allelic mutants unambiguously identified the causal gene. Thus, for any species amenable to mutagenesis, NIKS enables forward genetics without requiring segregating populations, genetic maps and reference sequences.
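
To make the core idea concrete, here is a minimal sketch of reference-free k-mer comparison between two read sets. It is not the NIKS implementation, only an illustration of finding k-mers that occur in one sample and never in the other. The function names (`canonical`, `count_kmers`, `sample_specific_kmers`), the `min_count` threshold, and the toy reads are all assumptions made for this example; real whole-genome data would call for a much larger k and a dedicated k-mer counter.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement,
    so that both strands of the same sequence map to one key."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers across all reads, skipping k-mers with non-ACGT bases."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def sample_specific_kmers(sample_counts, other_counts, min_count=2):
    """K-mers seen at least min_count times in one sample and never in the other.
    min_count is a hypothetical filter meant to drop likely sequencing errors."""
    return {
        kmer
        for kmer, n in sample_counts.items()
        if n >= min_count and kmer not in other_counts
    }


# Toy reads (made up for illustration): the mutant carries a single C->T change.
wild_type_reads = ["ACGTTGCAAGGCT"] * 3
mutant_reads = ["ACGTTGTAAGGCT"] * 3

k = 5  # far too small for real genomes; chosen so the toy example stays readable
wild_type_counts = count_kmers(wild_type_reads, k)
mutant_counts = count_kmers(mutant_reads, k)

mutant_only = sample_specific_kmers(mutant_counts, wild_type_counts)
wild_type_only = sample_specific_kmers(wild_type_counts, mutant_counts)
print(sorted(mutant_only))
print(sorted(wild_type_only))
```

In this toy case every k-mer that overlaps the changed base shows up as mutant-specific, while k-mers from the unchanged flanks cancel out. That is why a single substitution leaves a small cluster of sample-specific k-mers rather than just one.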

Although the main paper is behind a paywall, the supplementary section is openly accessible and highly informative.

Written by M.