-o Switching between direct and normal I/O. Should use normal I/O if the input is small enough.
-o How does the speed of radix splitting decrease with increasing number of hash bits?
- Does it help to use more bits than we need, so that we sort less data in memory?
-o Add automatic joining to the custom presorter interface?
+o When quicksorting a large input (especially in the threaded case), invest more
+ time in picking a good pivot.