From: Martin Mares
Date: Sat, 8 Sep 2007 20:40:29 +0000 (+0200)
Subject: Explain why we don't need the join hack for custom presorting.
X-Git-Tag: holmes-import~506^2~13^2~37
X-Git-Url: http://mj.ucw.cz/gitweb/?a=commitdiff_plain;h=6ba042cf64e2de2e21ecc8713e737392ba19a45a;p=libucw.git

Explain why we don't need the join hack for custom presorting.
---

diff --git a/lib/sorter/TODO b/lib/sorter/TODO
index 09e4986b..a86acf1e 100644
--- a/lib/sorter/TODO
+++ b/lib/sorter/TODO
@@ -14,7 +14,6 @@ Improvements:
 o Switching between direct and normal I/O. Should use normal I/O if the input is small enough.
 o How does the speed of radix splitting decrease with increasing number of hash bits?
   Does it help to use more bits than we need, so that we sort less data in memory?
-o Add automatic joining to the custom presorter interface?
 
 Users of lib/sorter/array.h which might use radix-sorting:
 indexer/chewer.c
diff --git a/lib/sorter/govern.c b/lib/sorter/govern.c
index 13fc7a61..826417d1 100644
--- a/lib/sorter/govern.c
+++ b/lib/sorter/govern.c
@@ -48,6 +48,11 @@ sorter_presort(struct sort_context *ctx, struct sort_bucket *in, struct sort_buc
   sorter_alloc_buf(ctx);
   if (in->flags & SBF_CUSTOM_PRESORT)
     {
+      /*
+       *  The trick with automatic joining, which we use for the normal presorter,
+       *  is not necessary with the custom presorter, because the custom presorter
+       *  is never called in the middle of the sorted data.
+       */
       struct fastbuf *f = sbuck_write(out);
       out->runs++;
       return ctx->custom_presort(f, ctx->big_buf, ctx->big_buf_size);