Multiindex Bloom Filter (miBF) data structure manuscript published in PNAS

Our manuscript describing miBF, a probabilistic data structure we developed for alignment-free sequence classification, was published in PNAS. Alignment-free methods, including miBF, have applications ranging from transcript expression analysis and metagenome characterization to de novo assembly, to name a few. They are usually faster than alignment-based methods, but are often limited by lower sensitivity and high memory requirements. In the manuscript, we demonstrate that combining spaced seeds with probabilistic data structures improves the sensitivity of alignment-free classification methods, with memory requirements that are independent of the seed design and scalability that is linear in the size of the indexed target. miBF is freely available on GitHub.
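
As a rough illustration of why spaced seeds can improve sensitivity, the toy Python sketch below stores masked k-mers in a plain Bloom filter. It is not the miBF implementation or its API: the class, the function names, the seed pattern "11011011" and the example sequences are all hypothetical, and a real miBF additionally records which target each seed hit belongs to. The sketch only shows that a mismatch at a "don't care" seed position still produces a hit, and that the filter size is set by the number of bits rather than by the seed pattern.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter (illustrative only, not the miBF data structure)."""

    def __init__(self, num_bits=4096, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; the size does not
        # depend on the seed pattern used to mask the k-mers.
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


def apply_seed(kmer, seed):
    """Keep bases at '1' positions of the seed and mask the rest with '*'."""
    return "".join(base if s == "1" else "*" for base, s in zip(kmer, seed))


def index_sequence(bf, sequence, seed):
    """Insert every masked k-mer of the sequence into the filter."""
    k = len(seed)
    for i in range(len(sequence) - k + 1):
        bf.add(apply_seed(sequence[i:i + k], seed))


def count_hits(bf, query, seed):
    """Count the query windows whose masked k-mer is found in the filter."""
    k = len(seed)
    return sum(
        apply_seed(query[i:i + k], seed) in bf
        for i in range(len(query) - k + 1)
    )


SEED = "11011011"                 # hypothetical seed; '0' marks a don't-care base
target = "ACGTTGCAAGCTTAGGCTAA"   # hypothetical indexed sequence

bf = BloomFilter()
index_sequence(bf, target, SEED)

exact = "ACGTTGCA"                # first window of the target
wildcard_mismatch = "ACTTTGCA"    # G->T at a don't-care position
care_mismatch = "TCGTTGCA"        # A->T at a position the seed checks

print(apply_seed(exact, SEED) in bf)              # True
print(apply_seed(wildcard_mismatch, SEED) in bf)  # True: mismatch is masked out
print(apply_seed(care_mismatch, SEED) in bf)      # False, barring a rare false positive
```

An exact k-mer lookup would miss the second query, while the masked lookup tolerates the substitution. The trade-off is that Bloom filters never give false negatives but can give false positives, which is why the last result is only very likely, not guaranteed, to be a miss.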