Salmonberry Genomics by High-school Students

A Generation Lost in the Bazaar

I often download newly published bioinformatics programs or libraries from GitHub onto my Windows laptop and try to compile them within its Cygwin UNIX environment. Over the years, I have noticed that these C/C++ programs tend to fall into two distinct categories -

How does Multi-threaded Code Run in Assembly Language?

In the traditional model of computing, programmers write their code in C or other high-level (i.e. human-readable) languages. Then a compiler (e.g. gcc) converts that code into assembly and machine (byte) instructions. This is because the microprocessor can understand only 0s and 1s, whereas humans tend to go crazy trying to make sense of such code. Assembly language is a happy compromise between the two. It presents the machine or byte instructions in a human-readable format.

Bioinformatics Contest - 2018

It is that time of the year again. Our friends from Rosalind, Stepik and Bioinformatics Institute are hosting another bioinformatics contest, with the qualifying round starting on Feb. 3rd. Details below.

A New Nemesis for Nanopore

Investor warning: The following post is for entertainment purposes only, and should not be considered as financial advice of any sort. In Feb 2016, we made a forecast that Oxford Nanopore would go out of business by the end of 2017. That did not happen, and we do deserve to get an ‘F’ for that forecast. We would also like to take this opportunity to make our readers aware of a relevant (and highly controversial) investment research report that came out recently.

DIY Ancestry Analysis using the GPS Algorithm

For those interested in trying out cutting-edge tools in ancestry research on real data, I am open-sourcing my own genotype information in this GitHub project along with all analysis steps. You need to install two programs - plink and admixture. Then, by following the steps given in the README file, you should be able to find the geographic origin of the given sample (which is me).

The Diversity of REcent and Ancient huMan (DREAM)

Is '23 and Me' Misleading its Jewish Customers on Ancestry?

‘Fake news’ - step aside. Now we have allegations of ‘fake ancestry’.

Minimizer - An Introductory Tutorial

This is a condensed version of our longer tutorial on minimizer algorithms available here. Many bioinformatics algorithms use short substrings of a longer sequence, commonly known as k-mers, for indexing, search or assembly. Minimizers allow efficient binning of those k-mers so that some information about the sequence contiguity is preserved.
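As a rough illustration of the idea, here is a minimal Python sketch. It assumes the common (w, k) windowed scheme with plain lexicographic ordering of k-mers, which is only one of several possible orderings and not necessarily the one used in the longer tutorial. The function name `minimizers` and its parameters are hypothetical, chosen for this example only.

```python
def minimizers(seq, k, w):
    """Return a sorted list of (position, k-mer) minimizers of seq.

    Assumption: each window holds w consecutive k-mers, and its minimizer
    is the lexicographically smallest k-mer (leftmost one on ties).
    """
    # All k-mers of the sequence, indexed by their start position.
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = set()
    # Slide a window over w consecutive k-mers at a time.
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        # min() returns the first smallest index, so ties go to the left.
        offset = min(range(w), key=lambda j: window[j])
        picked.add((start + offset, window[offset]))
    return sorted(picked)


if __name__ == "__main__":
    for pos, kmer in minimizers("ACGTTGCA", k=3, w=3):
        print(pos, kmer)
```

Because neighboring windows overlap in all but one k-mer, they often select the same minimizer, so adjacent stretches of the sequence tend to land in the same bin. That is the sense in which contiguity information is preserved.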

Compact Universal Set of Minimizers

There have been a number of interesting recent developments on minimizers that are likely to make bioinformatics algorithms even more efficient. In this post, we would like to mention three papers by Y. Orenstein, G. Marçais and collaborators.

Time to Shrink the National Institutes of Health (NIH)?

In his recent budget, President Trump proposed to reduce taxes wasted in the NIH Money Pit sinkhole by twenty percent. Such a big cut will most likely not be approved by Congress, because the political stars are aligning against it. The economic stars, on the other hand, are aligned in favor of a drastic reduction of NIH funding in the coming years. We explained why in a post written four years ago. Shutting down parts of NIH, or even the entire agency, will not be an unmitigated disaster for science and, if anything, will be beneficial. We made an appeal to close NHGRI in “Let’s Discuss - Is it Time to Shut Down NHGRI?” and also wrote - “How Much Will the Americans Suffer, If NIH Shuts Down?”.

Is Google Tweaking Search Results to Block Our Posts Critical of NIH Director?

Every once in a while, we use Google search to find links to old posts in our blog. The method had worked without failure until today. Today we were looking for an earlier post critical of a paper by Francis Collins, and Google never gave us the link, no matter how hard we searched for it. It is noteworthy that even after typing the entire title and adding ‘homolog.us’ in the search box, we could not find the relevant post anywhere in the first several pages of the Google results. Is some organization paying Google to block our posts critical of NIH? We present the evidence from four search engines (google, duckduckgo, bing, yahoo). You explain what is going on.

Gullible's Troubles

Apologies to the readers for not being able to make this week’s scheduled posts. Instead, I am posting an entertaining essay on the birth of molecular biology. It is from an autobiographical book published in 1976. Anyone guessing the author will earn 99.99 homolog.us points :).

Another Tutorial - This Time on Pevzner's Videos

Grab them here on the left sidebar, in the bioinformatics courses section, at the link ‘Pevzner Course’. I am still in the process of annotating the sets, including cross-linking similar sections.

A Tutorial with Ben Langmead's Bioinformatics Videos

Genome assembly algorithms through jigsaw puzzles - III

This is the third installment of “genome assembly algorithms through jigsaw puzzles”. We usually post them here every Tuesday, although we are late this week. You can find all those pieces in one place (and some more) at this link. We are developing this tutorial to explain genome assembly algorithms in a simple manner. In fact, rather than explaining, we expect you to discover the answer by manually solving a jigsaw puzzle. Later we show you how your solutions are related to the commonly used algorithms and their variations.

Tuesday Review - SAVE your day for CRISPR, Nature Fake News and Other Stories

1. SAVE your day for CRISPR

Two biorxiv papers cover the important topic of making CRISPR analysis user-friendly. In this context, we also included references to several other available CRISPR analysis tools for the benefit of our readers.

Genome assembly algorithms through jigsaw puzzles - II

This is the second installment of “genome assembly algorithms through jigsaw puzzles”. We post them here every Tuesday. You can find all those pieces in one place (and some more) at this link. We are developing this tutorial to explain genome assembly algorithms in a simple manner. In fact, rather than explaining, we expect you to discover the answer by manually solving a jigsaw puzzle. Later we show you how your solutions are related to the commonly used algorithms and their variations.

Monday review - Myers' dBG Paper, Pacbio's Multiplexing and Bioinformaticians' Foray into Escapism

1. Correcting Long Noisy Reads Using de Bruijn Graphs

Great news - the algorithmic concepts for short read assembly developed over the last decade need not be unlearned. In the two papers presented below, Myers, Pevzner and their colleagues use de Bruijn graphs for assembly and error correction of long noisy reads.

KMC tools tutorial - II

Yesterday we looked into the newly released ‘kmc tools’. Today we will work out another simple problem so that you become familiar with it. We really love this powerful program because, as the authors have shown, it can reproduce the results of many previously published bioinformatics papers with only a few commands.
