You don’t necessarily need several threads, and you don’t need standard time-domain convolution either. See JUCE’s implementation of uniformly partitioned convolution.
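To give a rough idea of what uniformly partitioned convolution does, here is a minimal single-threaded sketch using overlap-save. It is not JUCE's code: the class name `UniformPartitionedConvolver`, the `processBlock` signature and the small `fft` helper are made up for illustration, and it assumes a non-empty IR and a power-of-two block size.

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cpx = std::complex<double>;

// In-place iterative radix-2 FFT (illustrative helper; size must be a power of two).
void fft(std::vector<cpx>& a, bool inverse)
{
    const std::size_t n = a.size();
    for (std::size_t i = 1, j = 0; i < n; ++i) {          // bit-reversal permutation
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(a[i], a[j]);
    }
    const double pi = std::acos(-1.0);
    for (std::size_t len = 2; len <= n; len <<= 1) {      // butterfly stages
        const double ang = 2.0 * pi / double(len) * (inverse ? 1.0 : -1.0);
        const cpx wlen(std::cos(ang), std::sin(ang));
        for (std::size_t i = 0; i < n; i += len) {
            cpx w(1.0, 0.0);
            for (std::size_t k = 0; k < len / 2; ++k) {
                const cpx u = a[i + k];
                const cpx v = a[i + k + len / 2] * w;
                a[i + k] = u + v;
                a[i + k + len / 2] = u - v;
                w *= wlen;
            }
        }
    }
    if (inverse)
        for (auto& x : a) x /= double(n);
}

// Hypothetical class: splits the IR into equal partitions of B samples and
// convolves block by block in the frequency domain (overlap-save, FFT size 2B).
class UniformPartitionedConvolver
{
public:
    UniformPartitionedConvolver(const std::vector<double>& ir, std::size_t blockSize)
        : B(blockSize),
          numParts(std::max<std::size_t>(1, (ir.size() + B - 1) / B)),
          H(numParts, std::vector<cpx>(2 * B)),
          X(numParts, std::vector<cpx>(2 * B)),
          prev(B, 0.0)
    {
        for (std::size_t p = 0; p < numParts; ++p) {
            for (std::size_t i = 0; i < B && p * B + i < ir.size(); ++i)
                H[p][i] = ir[p * B + i];                  // zero-padded to 2B
            fft(H[p], false);
        }
    }

    // Takes exactly B input samples and returns B output samples.
    std::vector<double> processBlock(const std::vector<double>& in)
    {
        head = (head + numParts - 1) % numParts;          // slot for the newest spectrum
        auto& cur = X[head];
        for (std::size_t i = 0; i < B; ++i) {
            cur[i] = prev[i];                             // previous block
            cur[B + i] = in[i];                           // current block
        }
        prev = in;
        fft(cur, false);

        // Input spectrum delayed by p blocks is multiplied with IR partition p.
        std::vector<cpx> acc(2 * B, cpx(0.0, 0.0));
        for (std::size_t p = 0; p < numParts; ++p) {
            const auto& x = X[(head + p) % numParts];
            for (std::size_t k = 0; k < 2 * B; ++k)
                acc[k] += x[k] * H[p][k];
        }
        fft(acc, true);

        std::vector<double> out(B);
        for (std::size_t i = 0; i < B; ++i)
            out[i] = acc[B + i].real();                   // last B samples are valid
        return out;
    }

private:
    std::size_t B, numParts;
    std::vector<std::vector<cpx>> H, X;                   // IR spectra, input spectra
    std::vector<double> prev;
    std::size_t head = 0;
};
```

Per block this costs one forward FFT, one inverse FFT and one complex multiply-add per partition, all on the calling thread. A real implementation would use a real-to-complex FFT and avoid allocating inside `processBlock`.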
However, for such long IRs, a non-uniformly partitioned convolution is necessary. It uses several partition sizes (Gardner scheme) and several threads to become efficient.
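As a rough illustration of what different partition sizes can look like, the sketch below builds a simple doubling layout in which each size is used twice before it doubles. This is an assumed layout for illustration only, not necessarily the exact partitioning Gardner or any particular library uses; `Partition` and `doublingPartitions` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of one IR partition: where it starts and how long it is.
struct Partition
{
    std::size_t offset;
    std::size_t size;
};

// Builds an assumed doubling layout: headSize, headSize, 2*headSize, 2*headSize, ...
// until the whole IR is covered (the last partition may extend past the end
// of the IR and would be zero-padded). headSize must be greater than zero.
std::vector<Partition> doublingPartitions(std::size_t irLength, std::size_t headSize)
{
    std::vector<Partition> parts;
    std::size_t offset = 0;
    std::size_t size = headSize;
    int usesOfSize = 0;

    while (offset < irLength) {
        parts.push_back({offset, size});
        offset += size;
        if (++usesOfSize == 2) {   // each size is used twice, then doubled
            size *= 2;
            usesOfSize = 0;
        }
    }
    return parts;
}
```

In this layout every partition after the first starts at an offset at least as large as its own size, so its input block is complete before its output is due. For a hypothetical 480000-sample IR with a 64-sample head, this gives about two dozen partitions instead of 7500 uniform ones, and the large partitions are the ones typically handed to background threads.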
See also: Convolution Reverb - #2 by danielrudrich
