We found a blog post (hat tip: Rayan Chikhi) that provides excellent introduction to probabilistic data structures. The context of the blog post is the internet world, but many issues mentioned there are equally applicable to those working on NGS. If you do not immediately recognize the similarity, here is a clue. The typical questions asked for the ‘random integer count’ example in the above blog post are identical to k-mer counting problem in NGS.
The original Whang, Vander-Zaden, Taylor article that the above-mentioned blog summarizes can be found at this link.