X-Git-Url: http://mj.ucw.cz/gitweb/?a=blobdiff_plain;ds=inline;f=lib%2Fsorter%2FTODO;h=f4a802cacbceb191459a33a6bcd864a95aab51e6;hb=556cfa14c25a4124673e2a7489308e82021cf58b;hp=30f02a8c452ba2c83144a79743fd3ffa684742c7;hpb=e95a189fcc53f9185fb8866b2b6388ed9b20cf50;p=libucw.git

diff --git a/lib/sorter/TODO b/lib/sorter/TODO
index 30f02a8c..f4a802ca 100644
--- a/lib/sorter/TODO
+++ b/lib/sorter/TODO
@@ -7,7 +7,9 @@ Improvements:
 o Use radix-sort for internal sorting.
 o Parallelization of internal sorting.
 o Clean up data types and make sure they cannot overflow. (size_t vs. u64 vs. sh_off_t vs. uns)
-o Buffer sizing in internal sorters.
 o Switching between direct and normal I/O.
-o When merging, choose the output file with less runs instead of always switching?
 o Deal with too rough range estimates in radix splitting.
+o How does the speed of radix splitting decrease with increasing number of hash bits?
+  Does it help to use more bits than we need, so that we sort less data in memory?
+o Log messages should show both original and new size of the data. The speed
+  should be probably calculated from the former.