Lamprey Genome Shows When Humans Evolved Jaws, Matching Arms, Legs and More
The title parodies journalism practices in mainstream media, where the most extreme punchline is picked to sensationalize a highly nuanced article :)
A picture of a lamprey, to show what is being sequenced:
Important links to follow:
Inside Story of Genome Assembly [We recommend you start here before reading the manuscript]
Supplementary materials and methods
------------
In a nutshell -
Why is lamprey interesting?
Lampreys are a very ancient lineage of vertebrates.
Their genome can be used to track the genomic changes that occurred as vertebrates such as fish and humans evolved from invertebrate chordates.
Project strategy
With the above biological question in mind, the authors sequenced the lamprey genome and then checked the assembly for (i) gene duplication, (ii) changes in synteny structure, and (iii) several relevant gene families.
Genome Sequencing and Assembly Method
They used Sanger sequencing, with Arachne as the primary assembler. Jeramiah Smith, who has since moved to UK [1] from the Benaroya Institute in Seattle, did much of this work. As Titus Brown explained:
The genome was a gigantic pain in the butt. We (and by “we” I mean Jeramiah Smith, the first author) could only assemble 800 [Mbp], a maximum of 2/3 of the estimated complete genome (which is in the range of 1.2-1.6 Gbp, depending on which estimates you believe). This is partly because the genome has a bunch of really annoying GC-rich repeats that confounded much of our BAC sequencing and hence much of our scaffolding.
The other reason for the incompleteness of the genome is much less common and more problematic: we constructed our sequencing libraries from liver, which, in the lamprey means that we’re missing 20% or more of the genome. This is because the lamprey genome undergoes lineage-specific loss of genomic DNA. (At this point you should say “WHAT? WHY!?” and/or lament the cost of sequencing and analyzing a subset of the germ line genome :).
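As a quick sanity check on the "maximum of 2/3" remark, here is a back-of-the-envelope sketch. It assumes the assembled size is roughly 800 Mbp (the value consistent with the 1.2-1.6 Gbp estimates quoted above); the function and variable names are ours, purely for illustration.

```python
# Rough arithmetic only: what fraction of the estimated genome was assembled?
ASSEMBLED_MBP = 800            # assumed assembled size, in Mbp
ESTIMATES_GBP = (1.2, 1.6)     # estimated complete genome size range, in Gbp


def assembled_fraction(assembled_mbp, genome_gbp):
    """Return the assembled size as a fraction of the estimated genome size."""
    return assembled_mbp / (genome_gbp * 1000)  # convert Gbp to Mbp


for genome_gbp in ESTIMATES_GBP:
    fraction = assembled_fraction(ASSEMBLED_MBP, genome_gbp)
    print(f"{genome_gbp} Gbp estimate: {fraction:.0%} assembled")
```

Under these assumptions the assembly covers about two thirds of the genome at the low estimate and half at the high estimate, which is why 2/3 is described as a maximum.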
Gene annotation method
Genome + EST + RNAseq
From the Nature paper:
Annotations for the lamprey genome assembly were generated using the automated genome annotation pipeline MAKER, which aligns and filters EST and protein homology evidence, identifies repeats, produces ab initio gene predictions, infers 5′ and 3′ UTRs and integrates these data to produce final downstream gene models along with quality control statistics. Inputs for MAKER included the P. marinus genome assembly, P. marinus ESTs, a species-specific repeat library and protein databases containing all annotated proteins for 14 metazoans (Supplementary Note) combined with the Uniprot/Swiss-Prot (ref. 41) protein database and all sequences for Chondrichthyes (cartilaginous fishes) and Myxinidae (hagfishes) in the NCBI protein database (refs 42, 43). Ab initio gene predictions were produced inside of MAKER by the programs SNAP (ref. 44) and Augustus (ref. 45). MAKER was also passed P. marinus RNA-seq data processed by the programs tophat and cufflinks.
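To make the list of inputs more concrete, here is a small sketch of how they might map onto entries in MAKER's control file (maker_opts.ctl). This is not the authors' actual configuration: every file name and the Augustus species name are hypothetical placeholders, and passing the cufflinks output through est_gff is our assumption about how the RNA-seq data could be supplied.

```python
# Illustrative mapping from the inputs named in the paper to maker_opts.ctl keys.
# All values are hypothetical placeholders, not the files the authors used.
maker_inputs = {
    "genome": "pmarinus_assembly.fa",          # P. marinus genome assembly
    "est": "pmarinus_ests.fa",                 # P. marinus ESTs
    "est_gff": "pmarinus_cufflinks.gff3",      # RNA-seq via tophat/cufflinks (assumed route)
    "protein": "metazoa_swissprot_ncbi.fa",    # combined protein databases
    "rmlib": "pmarinus_repeats.fa",            # species-specific repeat library
    "snaphmm": "pmarinus_snap.hmm",            # SNAP model for ab initio prediction
    "augustus_species": "lamprey",             # Augustus species model (placeholder name)
}


def render_ctl(options):
    """Render the options as key=value lines, the layout the control file uses."""
    return "\n".join(f"{key}={value}" for key, value in options.items())


print(render_ctl(maker_inputs))
```

The point of the sketch is only that each kind of evidence in the quoted paragraph (genome, ESTs, proteins, repeats, ab initio predictors, RNA-seq) has its own slot in the pipeline's configuration.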
[1] UK = University of Kentucky