Digital and Analog Transcriptomics


In a previous commentary, we argued that the mathematical approach derived by Shannon (and by Nyquist and Hartley) to compare the transmission quality of different communication channels (e.g. air, transmission cables, waveguides, fiber optics) may be useful for objectively comparing different sequencing technologies. The details still need to be worked out, however.

While on the topic of signal communication, we would like to point out how the technology developed over the years. The earliest long-distance electrical communication technology was strictly digital. Scientists and technologists such as Gauss, Weber, Cooke, Wheatstone, Morse and Vail developed telegraph technology to transmit dots and dashes as electrical signals over long distances. By the latter half of the 19th century, telegraph technology had been widely deployed to build a global communication network.

The miles of American telegraph line grew from 40 in 1846 to 12,000 in 1850 and 23,000 in 1852. In Europe the figure increased from 2,000 in 1849 to 110,000 in 1869. The cost of sending 10 words was $1.55 in 1850, $1 in 1870, and 40 cents in 1890. Within 29 years of its first installation at Euston Station, the telegraph network crossed the oceans to every continent but Antarctica, making instant global communication possible for the first time. The telegraph’s greatest accomplishment was to expand information boundaries: far more often than before, data could reach its destination before its usefulness expired, particularly in trade.

Although telegraph technology was very useful for long-distance communication, the mode of communication was not a natural one for humans. People wanted to talk to each other over long distances, not exchange dots and dashes.

The next generation of communication technology, the telephone, allowed people to talk, but the mode of signal transmission was analog. An analog voice signal was carried through wires to let people hear each other’s voices over a long distance. Analog technology is also used in LP records to play music.

The biggest drawback of the analog mode of communication is the difficulty of removing noise and reconstructing the original signal. When the grooves of an LP record wear down due to friction, the original voice is lost forever. Similarly, a repeater placed between Frankfurt and San Francisco to boost a telephone signal will boost signal and noise equally, because it is nearly impossible to filter out noise that sounds like a human voice.

Finally, communication technologies incorporated the best of both the telegraph and the telephone to allow digital transmission of voice over long distances. In this mode, voice is converted into short sequences of 0s and 1s, sent over a long distance, and then reconstructed at the other end to be played for the listener.
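To make the contrast concrete, here is a minimal Python sketch of the two modes. It is a toy model under our own assumptions, not a description of any real telephone system: the "voice" is a sine wave, each hop between repeaters adds Gaussian noise, samples are quantized to 8 bits, and the names `analog_link`, `digital_link`, `encode` and `decode` are made up for illustration. The analog repeater passes along whatever it receives, noise included; the digital repeater decides whether each received value is a 0 or a 1 and sends a clean copy.

```python
import math
import random

BITS = 8                 # assumed word size per sample
LEVELS = 2 ** BITS

def encode(sample):
    """Quantize a sample in [-1, 1] to BITS bits, most significant bit first."""
    level = round((sample + 1.0) / 2.0 * (LEVELS - 1))
    level = max(0, min(LEVELS - 1, level))
    return [(level >> i) & 1 for i in reversed(range(BITS))]

def decode(bits):
    """Turn BITS bits back into a sample in [-1, 1]."""
    level = 0
    for b in bits:
        level = (level << 1) | b
    return level / (LEVELS - 1) * 2.0 - 1.0

def analog_link(samples, hops, noise_sd, rng):
    """Each hop adds noise; the repeater forwards signal and noise alike."""
    out = list(samples)
    for _ in range(hops):
        out = [x + rng.gauss(0.0, noise_sd) for x in out]
    return out

def digital_link(bits, hops, noise_sd, rng):
    """Each hop adds noise; the repeater thresholds and regenerates clean bits."""
    out = list(bits)
    for _ in range(hops):
        received = [b + rng.gauss(0.0, noise_sd) for b in out]
        out = [1 if r > 0.5 else 0 for r in received]
    return out

def rms_error(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

rng = random.Random(0)
voice = [math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
hops, noise_sd = 20, 0.05    # illustrative values only

analog_out = analog_link(voice, hops, noise_sd, rng)

sent_bits = [bit for s in voice for bit in encode(s)]
got_bits = digital_link(sent_bits, hops, noise_sd, rng)
digital_out = [decode(got_bits[i:i + BITS]) for i in range(0, len(got_bits), BITS)]

print("analog RMS error :", rms_error(voice, analog_out))
print("digital RMS error:", rms_error(voice, digital_out))
```

In this model the analog error grows with the number of hops, while the digital error stays at the level of the quantization step as long as the noise on any single hop is too small to flip a bit.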

At an abstract level, we see many similarities with the evolution of transcriptomics over the last two decades.

1. EST: At first, people used expressed sequence tags (ESTs) to identify expressed genes in samples. The method identified genes correctly, but it did not satisfy scientists’ interest in finding the relative expression patterns of many genes. EST sequencing is ‘digital’, because it gives the exact nucleotide information.

2. Microarray: The microarray technology that followed produced analog signals quantifying the relative expression of various genes. It was very useful, because scientists could find genes over-expressed in various tissues and build hypotheses about the genetic basis of various biological observations. Microarray data provided analog signals for expression, but it did not give ‘digital’ information on the sequences themselves.

3. Next-gen sequencing: Loss of sequence information was the biggest drawback of the microarray approach. Detecting expression on a probe was no guarantee that the gene it referred to was indeed expressed. Next-generation sequencing approaches such as RNA-seq combine the best features of (1) and (2) by giving both the expression levels and the sequences of the underlying genes. There are further similarities with digital data communication, because the genes are split into small pieces, sequenced and reconstructed, just as voice is reconstructed from 0s and 1s for the pleasure of the listener.
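The idea in (3) can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not a real RNA-seq pipeline: the gene names and sequences are invented, reads are error-free, and "mapping" is an exact substring search against known transcripts, where real analyses use dedicated aligners or assemblers. The point is only that each read carries exact sequence (the digital part), and counting reads per gene gives a relative expression level.

```python
from collections import Counter

# Hypothetical toy transcripts; names and sequences are made up for illustration.
transcripts = {
    "geneA": "ATGGCGTACGTTAGCCTAGGATCC",
    "geneB": "ATGTTTCCCGGGAAATTTCACTGA",
}

def fragment(seq, read_len, step):
    """Split one transcript into overlapping short reads."""
    return [seq[i:i + read_len] for i in range(0, len(seq) - read_len + 1, step)]

def sequence_sample(copies, read_len=8, step=4):
    """Simulate sequencing: every copy of a transcript contributes its reads."""
    reads = []
    for gene, n in copies.items():
        reads.extend(fragment(transcripts[gene], read_len, step) * n)
    return reads

def count_reads(reads):
    """Assign each read to the single transcript that contains it, then count."""
    counts = Counter()
    for read in reads:
        hits = [g for g, seq in transcripts.items() if read in seq]
        if len(hits) == 1:          # skip reads that are ambiguous or unmatched
            counts[hits[0]] += 1
    return counts

# Assume geneA is present at three times the abundance of geneB.
reads = sequence_sample({"geneA": 3, "geneB": 1})
counts = count_reads(reads)
print(counts)
```

Because both toy transcripts have the same length, the ratio of read counts reflects the ratio of copies directly; with transcripts of different lengths, counts would need to be normalized before comparison.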

The biggest difference between the two is, of course, that in digital communication we more or less know what the original sound was like, whereas in measuring a biological signal, the ‘original’ is the mystery being solved.



Written by M.