Note: These tutorials are incomplete. More complete versions are being made available for our members.

Introduction

Romance novels and beginners' guides on computing languages are never in short supply. Why create another set of tutorials on popular programming languages when similar information is freely available from hundreds of websites?

This set of tutorials is unusual in its approach. The focus is less on any one language and more on actual problem solving. In section 2, we present a bioinformatics problem from a recent paper. In sections 3-6, we provide brief introductions to R, Perl, Python and C/C++. In section 7, we describe computer science concepts such as data structures, algorithms, functions and classes. We also explain the von Neumann architecture used in almost all commercially available computing machines. In section 8, we apply the knowledge gained from the previous sections to solve the biological problem of section 2. The information presented up to section 8 would have been enough for most bioinformaticians 2-3 years ago, but not any more. Code that processes next-gen libraries often has to be parallelizable and scalable so that large data sets can be analyzed in a timely manner. In section 9, we discuss a few ideas on parallelization and other possibilities.

Our presentation is motivated by the observation that new bioinformaticians often ask which programming language is best for bioinformatics. Such questions tend to distract from the fact that the biggest effort in solving a problem is spent on understanding the scientific question and designing appropriate algorithms and data structures. Those who do that well are usually not constrained by the specifics of any particular programming language.