String Synchronizing Sets: Sublinear-Time BWT Construction and Optimal LCE Data Structure

16/05/2019 - 12:00

In this talk, I will present the first algorithm that breaks the O(n)-time barrier for BWT construction. Given a binary string of length n, it builds the Burrows–Wheeler transform in O(n / √log n) time and O(n / log n) space. Any further progress in the time complexity of BWT construction would yield faster algorithms for the problem of counting inversions: it would improve upon the state-of-the-art O(m √log m)-time solution by Chan and Pǎtraşcu (SODA 2010).


The new algorithm is based on a concept of string synchronizing sets, originally designed for answering Longest Common Extension (LCE) queries, which is of independent interest. I will also show that how this technique yields a data structure of the optimal size O(n / log n) that answers LCE queries in O(1) time and, furthermore, can be deterministically constructed in the optimal O(n / log n) time.