Tutorials

Note: These tutorials are incomplete. More complete versions are being made available for our members.

Introduction

In addition to genome assembly, next-generation sequencing is being used to address a number of other biological problems, such as RNA-seq, metagenome assembly, and resolution of polymorphic regions. Short-read assembly is often the first step of the analysis, especially for organisms with no reference genome. Even when a reference genome is available, an assembly step can reveal information that is not accessible through alignment-based methods.

In this section, we cover the de Bruijn graph structures that arise in those other assembly tasks and discuss the special challenges each one poses. Is a genome assembler appropriate for transcriptomic or metagenomic libraries? What kinds of errors are introduced when a metagenomic sample is assembled with a traditional genome assembly pipeline? Is a bioinformatician better off using a genome assembler on a transcriptome library than doing no assembly at all? Questions like these can only be answered with a proper understanding of the de Bruijn graph structures, and the discussions that follow aim to answer them.