
De Bruijn Graphs for Alternatively Spliced Genes

Alternative splicing is another source of complexity in de Bruijn graphs of transcriptomes. Let us first construct a graph for two alternatively spliced transcripts, A and B, of a gene. The regions shown in yellow and red are transcribed in both isoforms, whereas the green region is present only in A.

Figure

The de Bruijn graph is drawn in circle-and-arrow format, and the paths for the two transcripts are marked by dotted lines. We shall explain the graph construction qualitatively instead of going into nucleotide-level detail. For large parts of the yellow and red regions, the K-mers are common to both transcripts, so the two paths pass through the same sets of nodes there. The green region of A generates many new K-mers, and A follows a path similar to the upper blue branch of the graph. It is important to note that B also generates K-mers not present in A: the junction K-mers that span the boundary where the yellow region joins the red region directly. Hence, the path of B follows the lower blue branch.

From a cursory look at the figures, you may think that the de Bruijn graphs of alternatively spliced genes and of repetitive segments are identical. Are they? Pay close attention to the direction of the arrows and you will see the difference. De Bruijn graphs are directed graphs, and flipping an arrow can completely change the meaning of the graph. For alternative splicing, all arrows go from left to right. For the repetitive structure, the arrows connecting the blue circles in the figure go from right to left.

Another interesting observation: the first graph can be uniquely resolved into structures A and B, but the second graph cannot. For example, the following repetitive genomic segment has the same de Bruijn graph as the one considered earlier. Therefore, the graph shown here can resolve to many possible structures in nucleotide space. This multiplicity arises from the presence of a loop in the de Bruijn graph.
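The contrast between the two cases can be checked on a toy example. The sketch below is only an illustration under stated assumptions: the sequences for the yellow, green and red regions are made up, K is set to 4, and nodes are taken to be (K-1)-mers with each K-mer drawn as an arrow, which is one common convention and may differ from how the figures are drawn. The function names `kmers`, `de_bruijn` and `has_cycle` are hypothetical helpers, not part of any assembler.

```python
from collections import defaultdict

K = 4  # toy K-mer size (assumption; real assemblers use much larger K)

def kmers(seq, k=K):
    """Return the set of all K-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def de_bruijn(sequences, k=K):
    """Map each (K-1)-mer node to the set of nodes its K-mers point to."""
    graph = defaultdict(set)
    for seq in sequences:
        for kmer in kmers(seq, k):
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def has_cycle(graph):
    """Depth-first search for a directed cycle (a loop in the graph)."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)  # every node starts WHITE (unvisited)

    def visit(node):
        color[node] = GREY  # node is on the current search path
        for nxt in graph.get(node, ()):
            if color[nxt] == GREY:  # arrow points back into the path
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK  # fully explored, no cycle through here
        return False

    return any(color[n] == WHITE and visit(n) for n in list(graph))

# Made-up regions (assumptions, chosen so that no (K-1)-mer repeats by chance).
yellow, green, red = "ATGCCGTA", "GGATTC", "CAAGTGAC"

# Alternative splicing: isoform A keeps the green region, isoform B skips it.
isoform_a = yellow + green + red
isoform_b = yellow + red
splice_graph = de_bruijn([isoform_a, isoform_b])

# Junction K-mers that exist only in B, and the node where the paths split.
only_b = kmers(isoform_b) - kmers(isoform_a)
branch_nodes = sorted(n for n, succ in splice_graph.items() if len(succ) > 1)

# Repetitive segment: the green unit occurs two or three times in tandem.
two_copies = yellow + green * 2 + red
three_copies = yellow + green * 3 + red
repeat_graph = de_bruijn([two_copies])

print("junction K-mers unique to B:", sorted(only_b))
print("branch nodes in splice graph:", branch_nodes)
print("splice graph has a loop:", has_cycle(splice_graph))
print("repeat graph has a loop:", has_cycle(repeat_graph))
print("2 and 3 copies give same graph:", repeat_graph == de_bruijn([three_copies]))
```

In this toy setting the splicing graph has no loop, so its two branches can be read off as A and B, while the repeat graph contains a loop and is the same whether the unit occurs two or three times. That is the sense in which the loop prevents a unique resolution in nucleotide space.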

