From: Martin Mares
Date: Sat, 8 Sep 2007 14:35:50 +0000 (+0200)
Subject: Moved the last few relevant NOTES to TODO.
X-Git-Tag: holmes-import~506^2~13^2~40
X-Git-Url: http://mj.ucw.cz/gitweb/?a=commitdiff_plain;h=573a9269e029f71af9fd1e4b3a32d4d692050da5;p=libucw.git

Moved the last few relevant NOTES to TODO.
---

diff --git a/debug/sorter/NOTES b/debug/sorter/NOTES
deleted file mode 100644
index c6a1e5d9..00000000
--- a/debug/sorter/NOTES
+++ /dev/null
@@ -1,78 +0,0 @@
-Users of lib/sorter.h:
-
-centrum/indexer/oook.c            fixed           oid
-centrum/indexer/patch-index.c     fixed           oid
-gather/daemon/expire.c            variable        complex
-gather/daemon/gc.c                variable        complex
-gather/daemon/gc.c                variable        hash
-gather/shepherd/shep-cleanup.c    5 bytes         oid                       4 min
-gather/shepherd/shep-cleanup.c    fixed           hash                      35 min
-gather/shepherd/shep-export.c     fixed           oid                       9 min
-gather/shepherd/shep-export.c     fixed           weight
-gather/shepherd/shep-plan.c       fixed           weight + oid              in mem
-gather/shepherd/shep-plan.c       fixed           site ptr + complex        in mem
-gather/shepherd/shep-record.c     fixed           reducible to int          in mem
-gather/shepherd/shep-record.c     fixed           oid                       in mem
-gather/shepherd/shep.c            fixed           oid
-gather/shepherd/shep.c            fixed           hash
-gather/shepherd/shep-cork.c       fixed           hash + int                20 min
-gather/shepherd/shep-feedback.c   fixed           hash
-gather/shepherd/shep-merge.c      fixed           hash + int
-gather/shepherd/shep-merge.c      fixed           complex (probably can create monotone hashes for that)  38 min
-gather/shepherd/shep-mirror.c     fixed           hash + more
-gather/shepherd/shep-mirror.c     fixed           hash + more
-gather/shepherd/shep-recover.c    fixed           hash + more
-gather/shepherd/shep-select.c     fixed           weight (non-triv) + hash  21 min
-indexer/attrsort.c                fixed large     oid
-indexer/backlinker.c              8 bytes         int
-indexer/black-gen.c               fixed           oid
-indexer/feedback-gath.c           fixed           hash + complex            7 min
-indexer/fpsort.c                  fixed           hash + oid                9 min
-indexer/imagesigs.c               5 bytes         xlat[oid] + oid
-indexer/keywords.c                fixed           complex                   multi-pass
-indexer/labelsort.c               variable        oid + complex             2:00
-indexer/mergeimages.c             variable        xlat[index]
-indexer/mergesigns.c              fixed           complex, but has hashes   1:33
-indexer/mergesums.c               fixed           hash
-indexer/mkgraph.c                 complex         complex (with melding)    34 min
-indexer/psort.c                   fixed           hash + oid                2 min
-indexer/reftexts.c                5 bytes + var   oid
-indexer/reftexts.c                fixed           fpos
-indexer/reftexts.c                fixed           oid
-indexer/reftexts.c                10 bytes + var  fpos
-indexer/ssort.c                   variable        hash (with melding)       1:30
-indexer/weights.c                 8 bytes + var   xlat[index]
-indexer/wsort.c                   variable        id (with melding)         7:20
-lang/lang-tables.c                fixed           complex
-lang/stem-dict-gen.c              variable        hash (with melding)
-
-Users of lib/arraysort.h on big arrays:
-
-indexer/chewer.c        fixed   hash + others
-indexer/chewer.c        u32     id
-indexer/imagesigs.c     fixed   s32
-indexer/lexfreq.c       ptr     indirect int
-indexer/lexorder.c      ptr     complex
-indexer/lexorder.c      ptr     complex
-indexer/lexsort.c       ptr     complex
-indexer/mergeimages.c   fixed   s32
-indexer/mkgraph.c       u32     indirect int
-indexer/mkgraph.c       2*u32   complex, but have hash
-indexer/mkgraph.c       2*u32   complex, but have hash
-indexer/reftexts.c      fixed   indirect int
-
-Interface:
-
-Key structure      fixed aligned / fixed unaligned / variable
-Data structure     none / given by data_len(key) function
-Key I/O            default (breadb/bwriteb or attempt direct I/O) / user-defined
-Data I/O           always built-in
-Input              file / fastbuf / generator function
-                   + delete input ASAP flag
-Output             file / fastbuf / consumer function
-Key comparison     comparison function
-                   monotone hash function
-                   preliminary comparison function (?)
-Key merging        none / guaranteed unique / user-defined merge function:
-                   multi-way merge in memory
-                   2-way merge on fastbufs
diff --git a/lib/sorter/TODO b/lib/sorter/TODO
index f36cc1ce..52fb6cf5 100644
--- a/lib/sorter/TODO
+++ b/lib/sorter/TODO
@@ -15,3 +15,18 @@ o Switching between direct and normal I/O. Should use normal I/O if the input i
 o How does the speed of radix splitting decrease with increasing number of hash bits?
   Does it help to use more bits than we need, so that we sort less data in memory?
 o Add automatic joining to the custom presorter interface?
+
+Users of lib/arraysort.h on big arrays (consider conversion to lib/sorter/array.h):
+
+indexer/chewer.c        fixed   hash + others
+indexer/chewer.c        u32     id
+indexer/imagesigs.c     fixed   s32
+indexer/lexfreq.c       ptr     indirect int
+indexer/lexorder.c      ptr     complex
+indexer/lexorder.c      ptr     complex
+indexer/lexsort.c       ptr     complex
+indexer/mergeimages.c   fixed   s32
+indexer/mkgraph.c       u32     indirect int
+indexer/mkgraph.c       2*u32   complex, but have hash
+indexer/mkgraph.c       2*u32   complex, but have hash
+indexer/reftexts.c      fixed   indirect int