Boosting the FM-index on the GPU: effective techniques to mitigate random memory access

The recent advent of high-throughput sequencing machines producing big amounts of short reads has boosted the interest in efficient string searching techniques. As of today, many mainstream sequence alignment software tools rely on a special data structure, called the FM-index, which allows for fast exact searches in large genomic references. However, such searches translate into a pseudo-random memory access pattern, thus making memory access the limiting factor of all computation-efficient implementations, both on CPUs and GPUs. Here we show that several strategies can be put in place to remove the memory bottleneck on the GPU: more compact indexes can be implemented by having more threads work cooperatively on larger memory blocks, and a k-step FM-index can be used to further reduce the number of memory accesses. The combination of those and other optimisations yields an implementation that is able to process about 2 Gbases of queries per second on our test platform, being about 8 faster than a comparable multi-core CPU version, and about 3 to 5 faster than the FM-index implementation on the GPU provided by the recently announced Nvidia NVBIO bioinformatics library.

Trim content

® The Pirbright Institute 2024 | A company limited by guarantee, registered in England no. 559784. The Institute is also a registered charity.