Entropy of Never Born Protein Sequences

Why do some proteins get born and some do not? Grzegorz Szoniec and Maciej J Ogorzalek attempt to find out.

Existing and known proteins are only a small subset of all possible sequences. Why were only some proteins selected during evolution? The reason is not known, but two possible explanations are considered: deterministic and random. To investigate theoretical sequences of amino acids, the term Never Born Protein was introduced (Chiarabelli et al. 2006). Since 2006 only a few papers about them have been published. The most significant research has shown that 20% of them fold (i.e. reach a stable and functional 3D structure) in laboratory conditions.

This is a theoretical paper: the authors generated Never Born Proteins with the Random Blast tool and compared them with natural proteins from UniProt by measuring block entropy in both sets.
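To make the measurement concrete, here is a minimal sketch of block entropy under its usual textbook definition: the Shannon entropy of the distribution of overlapping blocks (substrings) of a fixed length. The function name `block_entropy` and the choice of overlapping blocks are assumptions made for illustration; the paper's exact estimator may differ.

```python
import math
from collections import Counter


def block_entropy(sequence, block_length):
    """Shannon entropy (in bits) of overlapping blocks of a given length.

    Illustrative helper, not taken from the paper.
    """
    if block_length < 1 or block_length > len(sequence):
        raise ValueError("block_length must be between 1 and len(sequence)")

    # Slide a window of size block_length over the sequence.
    blocks = [
        sequence[i:i + block_length]
        for i in range(len(sequence) - block_length + 1)
    ]
    counts = Counter(blocks)
    total = len(blocks)

    # H = -sum over blocks of p * log2(p), with p the observed block frequency.
    return -sum(
        (count / total) * math.log2(count / total)
        for count in counts.values()
    )


if __name__ == "__main__":
    # Toy sequences, not real proteins.
    print(block_entropy("AAAAAAAA", 1))  # a single repeated residue carries no information
    print(block_entropy("ACDEFGHI", 1))  # eight distinct residues, uniformly distributed
```

A sequence in which every block is equally likely reaches the maximum entropy for that block length, so values close to the maximum point to a nearly random sequence.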

Findings and conclusions:


Both block and relative entropies are similar, which means that both kinds of protein contain strongly random sequences.

An artificially generated Never Born Protein sequence is about as random as a natural one.


The information theory approach suggests that protein selection during evolution was fairly random, that is, non-deterministic.

Natural proteins have no noticeable unique features in the information-theoretic sense.
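The relative entropy mentioned in the findings can be sketched in the same spirit. The example below assumes that "relative entropy" means the Kullback-Leibler divergence between the amino acid frequency distributions of two sequence sets; this summary does not confirm that this is the paper's definition. The names `residue_distribution` and `relative_entropy`, and the pseudocount used to avoid zero frequencies, are illustrative assumptions.

```python
import math
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_distribution(sequences, pseudocount=1.0):
    """Amino acid frequencies over a set of sequences.

    The pseudocount (an assumption of this sketch) keeps every
    frequency strictly positive so the logarithm below is defined.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(p[aa] * math.log2(p[aa] / q[aa]) for aa in AMINO_ACIDS)


if __name__ == "__main__":
    # Toy inputs, not data from the paper.
    natural = residue_distribution(["ACDEFGHIKL", "MNPQRSTVWY"])
    never_born = residue_distribution(["AACCDDEEFF", "GGHHIIKKLL"])
    print(relative_entropy(natural, never_born))
```

A value near zero means the two residue distributions are nearly indistinguishable, which is the kind of similarity the findings above describe.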

