Three months back, we reported on an insanely polymorphic genome being assembled by BGI. The genome paper is finally published in Nature today (The oyster genome reveals stress adaptation and complexity of shell formation).
We have worked on oyster data for the last two years, and believe this genome can be a paradise for those designing interesting algorithms for genome assembly from NGS data. If you want to go really crazy, feel free to combine oyster and PacBio!!
From the Nature paper:
To understand polymorphism in the oyster genome, we analysed allelic variation in the assembled genome (inbred) and one re-sequenced wild oyster (wild) (Supplementary Text C1). The inbred genome contained 3.1 million single-nucleotide polymorphisms and 258,405 short insertion/deletion (indels, 1–40 base pairs (bp)) yielding a sequence polymorphism rate of 0.73%, whereas the wild genome had 3.8 million single-nucleotide polymorphisms and 238,182 indels, or a polymorphism rate of 1.3% (Supplementary Table 7), comparable to previous estimates [18]. This 44% reduction in polymorphism in the inbred genome is smaller than the 59.4% predicted from four generations of brother-sister mating, indicating that selection favouring heterozygotes had occurred [19]. The polymorphism combining inbred and wild (among four haplotypes) was 2.3%, higher than that in most studied animal genomes [20, 21] but comparable to that in known high-polymorphism species [7]. In inbred and wild, we found 3,094 short indels located in coding regions inferred to cause frameshift variants in 2,665 genes, providing an important source for recessive lethal mutations.
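For readers wondering where the 44% and 59.4% figures come from, here is a rough sketch that reproduces both. It assumes the 59.4% prediction is the inbreeding coefficient from the standard textbook recurrence for full-sib mating, F(t) = 1/4 * (1 + 2*F(t-1) + F(t-2)), starting from unrelated founders; the excerpt does not say how the authors derived it, so treat that as our guess. The function name `full_sib_inbreeding` is ours, not something from the paper.

```python
def full_sib_inbreeding(generations):
    """Inbreeding coefficient after repeated brother-sister mating.

    Uses the textbook recurrence F(t) = 1/4 * (1 + 2*F(t-1) + F(t-2)),
    assuming the founders are unrelated and non-inbred (F = 0).
    """
    f_two_back, f_one_back = 0.0, 0.0
    for _ in range(generations):
        f_now = 0.25 * (1.0 + 2.0 * f_one_back + f_two_back)
        f_two_back, f_one_back = f_one_back, f_now
    return f_one_back


# Polymorphism rates quoted in the excerpt (percent of sites).
wild_rate = 1.3
inbred_rate = 0.73

# Observed loss of polymorphism in the inbred line relative to the wild oyster.
observed_reduction = (wild_rate - inbred_rate) / wild_rate

# Expected loss of heterozygosity after four generations of full-sib mating.
expected_reduction = full_sib_inbreeding(4)

print(f"observed reduction: {observed_reduction:.1%}")
print(f"expected reduction: {expected_reduction:.1%}")
```

Under these assumptions the observed drop comes out at about 44% and the expected drop at about 59.4%, matching the numbers in the excerpt. The gap between the two is what the authors read as evidence of selection favouring heterozygotes.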