Homolog.us

27 May 2026

Is Truth Absolutely Necessary for Science?

These days, corporate entities are increasingly dominating research in basic science. This essay looks into how that shift will change the most cherished convention of science.

introductory

21 May 2026

Biology is Messy, Because Physicists Failed

I came across an interesting blog post titled Biology is Messy. The main argument is that biology is not built on top of reductionist theories, and the current push to collect large amounts of data and use “AI” to “solve” biology is not likely to lead to reductionist theories either. Instead, we will get a bunch of overfitted models hyped up as “solutions” by corporate entities.

introductory

20 May 2026

Mathematical Side of AI and Its Applications in Biology

Right now, everyone is either enamored of AI or completely turned off by it. Speaking of being fed up, please check out the students at the University of Florida booing at this commencement event. Much of their reaction comes from the intense hype generated by the tech-bros of Silly Con valley to increase the stock prices of their companies.

AI

14 May 2026

Gene Regulatory Network Reconstruction with Single-cell Data

On the subject of reconstructing gene regulatory networks (GRNs) from RNAseq and scRNAseq data, I am working through a number of review papers and the techniques described therein.

Gene Regulatory Network Inference in the Era of Single-cell Multi-omics - review paper published in 2023.
Gene Regulatory Network Reconstruction: Harnessing the Power of Single-cell Multi-omic Data - published in 2023.
Gene Regulatory Network Inference: an Introductory Survey - published in 2019.

As I learn from them, I will post my notes here for the benefit of others. In this context, I will also discuss how we can use AI (ChatGPT) to speed up the learning of bioinformatics tools.

genomics

11 May 2026

Scientific Questions (and Maybe Answers) for 2021-2038

In an earlier post, I divided the modern era of genetics into 18-year periods (eras). The discoveries of each era opened new questions and provided fuel for the next era. In the most recent era (2003-2020), biologists moved from working on individual genes to whole genome experiments, from measuring gene expression in many cells in aggregate to performing single-cell experiments, and from “model organisms” to a wide variety of organisms, all thanks to inexpensive sequencing. The intellectual and commercial drives for these changes came from the experiences and questions posed by the previous era (1985-2002).

genomics

08 May 2026

Three Layers of AI with an Analogy

In my earlier post, I mentioned three distinct activities described under the broad term “AI”. They are - (i) using web-based text engines like ChatGPT, Claude or Gemini and their extensions as coding tools, (ii) downloading numerical models directly from Hugging Face and building applications on top of them, and (iii) developing and training mathematical models for new applications.

AI

08 May 2026

History of Genetics from 1949 to Today

I have decided to divide the last 70-80 years of genetics into different eras. Each era started with a set of burning questions, which were resolved by the end of the era. However, those answers created another set of burning questions to be resolved by the newcomers to the field. Please tell me whether you agree with my classification, and what you expect the current era to be like.

genomics

05 May 2026

Bioinformatics in the AI-era

In 2011, I wrote two articles (here and here) providing beginners’ guides to bioinformatics. Eight years later (2019), I posted an updated guide here. Now that AI has become a powerful tool, it is time to discuss how the work of bioinformatics and computational biology is changing.

AI

22 Apr 2026

Coding with AI - It Speeds You Up and Then Slows You Down

I have been coding with AI assistance for about six months now and more actively since late December 2025. If I had to compress the entire experience into a single sentence, it would be this: AI made me faster first and then it made me slower. Let me explain.

AI

01 May 2025

Training Approach in Evo and Evo2

In the earlier posts of this series (here, here, here and here), we covered the mathematical and biological aspects of evo and evo2. One important topic that we have not covered yet is how the models were trained.

AI

30 Apr 2025

Massively Parameterized Statistics

In this article, I will argue that Multi Parameter Statistics, or even better, Massively Parameterized Statistics (MPS) better describes the application of AI models in biology and medicine. Also, I will introduce you to a new preprint on DNA sequence modeling that claims to match evo.

AI

29 Apr 2025

Biological Aspects of Evo and Evo2 - Semantic Mining

In the last three posts of this series (here, here and here), we covered the mathematical aspects of evo and evo2. Let us now discuss the biological findings from these models. It will take multiple posts to go over these topics.

AI

23 Apr 2025

StripedHyena in Evo and Evo2

In the first two posts of this series (here and here), we covered the AI-related mathematical concepts applied to evo and evo2. Before moving on to the biological side, here is one last post on the model.

AI

22 Apr 2025

Evo and Evo2 - Math and Algorithm

In the first post of this series, we covered the basic technical terms of the evo and evo2 papers. We also mentioned the key technological innovation that made their work possible. That led to the question - if they were using the fast Fourier transform (FFT), were they using a convolutional neural network (CNN)? The answer is no. The computer science work done by the Stanford group is quite groundbreaking. Let me go over that in detail.

AI

20 Apr 2025

Discussing the Evo and Evo2 Papers

Two recent papers applying AI-related large language models to DNA sequences are gaining a lot of attention and a bit of controversy. The first paper, titled Sequence Modeling and Design from Molecular to Genome Scale with Evo, states -

Trained on 2.7M prokaryotic and phage genomes, Evo can generalize across the three fundamental modalities of the central dogma of molecular biology to perform zero-shot function prediction that is competitive with, or outperforms, leading domain-specific language models. Evo also excels at multi-element generation tasks, which we demonstrate by generating synthetic CRISPR-Cas molecular complexes and entire transposable systems for the first time. Using information learned over whole genomes, Evo can also predict gene essentiality at nucleotide resolution and can generate coding-rich sequences up to 650 kb in length, orders of magnitude longer than previous methods.

AI

05 Dec 2023

Rules of the Genomes

What are the rules of the genomes? What patterns do the genome sequences follow? What biochemical and evolutionary mechanisms are behind these patterns? Are newly published genomes and pangenomes displaying many exceptions to the rules, or do they all confirm the expected patterns?

genome

31 Dec 2021

The Best of 2021

Now that we are on the very last day of 2021, it is not too late to review the positives of the year. I picked four categories (humor, science, society, technology) and shortlisted a tiny subset from many deserving candidates.

review

17 Dec 2021

Devastating Impact of Climate Change Around the World

Climate Change is taking a devastating toll on the lives of people around the world. This week, two students from Anderson High School died in their sleep. The school district canceled the final exams to help the grieving community. Separately, in Europe, three soccer players left games this week because of heart conditions. Also, 33-year-old Argentine striker Sergio Aguero, playing for Barcelona, announced his retirement from soccer due to a heart condition. In Silicon Valley, 43-year-old Tyson Clark from Google Ventures and 44-year-old Ryan Popple also died in their sleep.

coronavirus

15 Dec 2021

Did They Fake Their Entire NGS Experiment?

In NGS experiments, when researchers encounter issues with genome assembly or analysis, they go back to the raw data composed of sequencing reads. In a recent preprint submitted to Zenodo, Steven C. Quay did exactly that for a seminal paper and concluded - “The alternative conclusion is that this sample was not a fecal specimen but was contrived. The data cannot, however, distinguish between a non-fecal specimen that came from true field work on the one hand and a specimen created de novo in the laboratory on the other hand.” This is no simple matter, because the entire world has been running around like a headless chicken for the last two years relying on the genome assembly submitted in the paper.

coronavirus

09 Dec 2021

Another Unusual Connection Between Covid and AIDS

In early 2020, Prashant Pradhan and collaborators posted a preprint titled “Uncanny similarity of unique inserts in the 2019-nCoV spike protein to HIV-1 gp120 and Gag” on bioRxiv. Based on the emails released by the NIH under FOIA, we now know that this article and its coverage in ZeroHedge upset Fauci so much that he immediately convened an urgent meeting of virologists and several health bureaucrats from the US, UK and Europe. All details of this meeting were redacted, but the virologists present in the meeting fast-tracked a Nature Medicine paper claiming the virus definitely came from animals even though they described it as lab-engineered in their private emails. This paper was then used for over a year to censor all counter-arguments. In particular, bioRxiv retracted the preprint due to intense pressure and thus destroyed its reputation as a preprint server for good.

coronavirus