DIY Ancestry Analysis using the GPS Algorithm

For those interested in trying out cutting-edge ancestry research tools on real data, I am open-sourcing my own genotype data in this GitHub project, along with all the analysis steps. You need to install two programs, PLINK and ADMIXTURE. Then, by following the steps in the README file, you should be able to find the geographic origin of the sample (which is me).

I got the genotyping done with GPS Origins. Their analysis tools were developed by Dr. Eran Elhaik, whose GPS algorithm has been featured on this blog many times.

My experience with the company has been excellent. I sent them my cheek swab, and they sent back a link to a password-protected website with all the analysis results. Their analysis tool (i.e., the GPS algorithm) matched my SNP profile against profiles from people around the world and, based on those matches, reported the most likely geographic locations of my origin. Here are the top hits with matches over 5%:

  1. Southeastern India 39.2%
  2. Southwestern India 9.3%
  3. Western Siberia 8.9%
  4. Northern India 8.5%
  5. Southern France 6.9%
  6. Fennoscandia 6.5%
  7. Tuva 5.9%
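To make the matching idea a bit more concrete, here is a minimal Python sketch. It is not the GPS Origins implementation or Dr. Elhaik's published code, only a simplified illustration of the general approach: compare a sample's admixture proportions with those of reference populations, rank the references by distance, and take a weighted average of the coordinates of the closest ones. The reference panel, the population names (`pop_A`, `pop_B`, `pop_C`), the coordinates, the function names, and the inverse-distance weighting are all assumptions made up for this example.

```python
import math

# Hypothetical reference panel: name -> (admixture proportions, (lat, lon)).
# Every number here is invented for illustration; it is not real data.
REFERENCE = {
    "pop_A": ([0.70, 0.20, 0.10], (10.0, 80.0)),
    "pop_B": ([0.10, 0.60, 0.30], (60.0, 70.0)),
    "pop_C": ([0.20, 0.20, 0.60], (45.0, 5.0)),
}


def genetic_distance(p, q):
    """Euclidean distance between two admixture proportion vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def rank_matches(sample, reference):
    """Return (name, distance) pairs, closest reference population first."""
    scored = [
        (name, genetic_distance(sample, props))
        for name, (props, _coords) in reference.items()
    ]
    return sorted(scored, key=lambda item: item[1])


def estimate_location(sample, reference, k=2, eps=1e-9):
    """Inverse-distance-weighted mean of the k closest populations' coordinates.

    Assumption: plain averaging of latitude and longitude, which is only
    reasonable for a toy example with nearby points.
    """
    nearest = rank_matches(sample, reference)[:k]
    weights = [1.0 / (dist + eps) for _name, dist in nearest]
    total = sum(weights)
    lat = sum(w * reference[name][1][0] for w, (name, _d) in zip(weights, nearest)) / total
    lon = sum(w * reference[name][1][1] for w, (name, _d) in zip(weights, nearest)) / total
    return lat, lon


if __name__ == "__main__":
    my_sample = [0.65, 0.25, 0.10]  # hypothetical admixture proportions
    print(rank_matches(my_sample, REFERENCE))
    print(estimate_location(my_sample, REFERENCE))
```

In a real run, the proportion vectors would come from the ADMIXTURE output rather than being typed in by hand, and the reference panel would contain many more populations.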

If you want to gain a better understanding of the algorithm, the company lets you download your raw data. With that, you can run the analysis yourself by following the steps outlined in the GitHub project mentioned above. Many thanks to Dr. Elhaik for helping me fix a number of mistakes I was making in the analysis. We are planning to post a detailed tutorial here explaining the conceptual steps in the analysis. Readers will also find the recently published DREAM paper helpful.

Written by M.