Tutorials

Note: These tutorials are incomplete.

De Novo Assembly of RNAseq

Genome-wide measurement of gene expression has gone through three generations over the last twenty years. At first, sequencing of expressed sequence tags (ESTs) was in vogue, but Sanger sequencing was too expensive to reach the depth needed for quantitative analysis. That era was followed by microarray experiments, and then by RNA sequencing (RNAseq) using next-generation sequencing (NGS) technologies.

Unlike microarrays, NGS RNAseq measures the expressed sequences directly. However, the reads are much shorter than a typical transcript, so they must be assembled into transcripts before they can be used for further analysis. In contrast, ESTs usually covered anywhere from half to the entire length of a gene, so no elaborate assembly was needed.

Despite this inconvenience, the sequencing depth of NGS provides immense benefit in delineating alternative splice forms and quantifying the relative expression of genes. The same quality of quantitative assessment was never possible with microarrays. The second advantage of RNAseq is its ability to measure gene expression in organisms for which no reference genome is available. De novo assembly of RNAseq is especially important for the latter group. Here we will look into the differences between RNAseq assembly methods and the genome assembly methods discussed in previous sections.