Reversible Transform and Data Compression

We will start our discussion with the Burrows-Wheeler transform, which is behind many efficient short-read mapping programs.

What is a transform?

A transform is a set of rules for turning one thing into another. A recipe for making bread is a good example: you start with flour, yeast, water and salt, follow the rules, and take fresh bread out of the oven. Those who are not particularly hungry can try the ‘count of nucleotides’ transform, which takes a long sequence and reports the number of As, Cs, Gs and Ts.

Reversible transform: A reversible transform turns A into B following a set of rules, and a second set of rules can be derived to turn B back into A. Have you seen anyone get flour and yeast out of bread? No, and you are correct: making bread is not reversible. Neither can anyone reconstruct the original sequence in the above example from its nucleotide counts (3 As, 3 Cs, 4 Gs and 5 Ts), because many different sequences share the same counts. Converting nucleotide-space data to SOLiD color space is another example of an irreversible transform when only the colors are kept: without the known first base, the same colors are consistent with four different sequences.
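To make the irreversibility concrete, here is a minimal Python sketch of the count-of-nucleotides transform. The function name `count_nucleotides` and the two sequences are made up for illustration (they are not the example from the original tutorial); they were simply chosen to have 3 As, 3 Cs, 4 Gs and 5 Ts each.

```python
from collections import Counter


def count_nucleotides(seq):
    """Return how many times each of A, C, G and T occurs in seq."""
    counts = Counter(seq)
    return {base: counts.get(base, 0) for base in "ACGT"}


# Two different, hypothetical sequences with identical composition.
seq_a = "ACGTACGTACGTGTT"
seq_b = "AAACCCGGGGTTTTT"

print(count_nucleotides(seq_a))  # {'A': 3, 'C': 3, 'G': 4, 'T': 5}
print(count_nucleotides(seq_b))  # {'A': 3, 'C': 3, 'G': 4, 'T': 5}

# Same output from different inputs: the counts alone cannot tell us
# which sequence we started from, so no inverse rule can exist.
print(seq_a == seq_b)  # False
```

Since two distinct inputs map to the same output, any attempt to go backwards would have to guess, which is exactly what "not reversible" means here.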

Here are a couple of reversible transforms:

The first example above does not need any explanation, so we proceed to the second. Suppose you have a transform that takes a long sequence and records the positions at which each nucleotide occurs. Because every position in the sequence is assigned to exactly one nucleotide, the original sequence can always be reconstructed from the position data, so the transform is reversible.
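The position transform and its inverse can be sketched in a few lines of Python. The names `to_positions` and `from_positions`, the 0-based indexing, and the test sequence are assumptions made for this illustration, not part of the original tutorial.

```python
def to_positions(seq):
    """Forward transform: map each nucleotide to the list of (0-based)
    positions where it occurs in seq."""
    positions = {}
    for i, base in enumerate(seq):
        positions.setdefault(base, []).append(i)
    return positions


def from_positions(positions):
    """Inverse transform: rebuild the sequence from the position lists."""
    length = sum(len(indices) for indices in positions.values())
    out = [None] * length
    for base, indices in positions.items():
        for i in indices:
            out[i] = base  # each position is filled exactly once
    return "".join(out)


seq = "GATTACA"  # hypothetical example sequence
pos = to_positions(seq)
print(pos)                  # {'G': [0], 'A': [1, 4, 6], 'T': [2, 3], 'C': [5]}
print(from_positions(pos))  # GATTACA
```

The round trip returns the input unchanged, which is the defining property of a reversible transform. Note that the position data is not smaller than the original sequence; reversibility by itself does not imply compression.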