4 A very common need is sorting data. Therefore libUCW contains few
5 routines to accomplish that task. They are much more universal than
6 qsort(), since they allow you to sort structures indexed by a macro,
7 sort data externally, if they do not fit into memory, merge data with
8 the same keys and sort data of variable length.
10 All routines described below are <<generic:,generic algorithms>>.
12 - <<array-simple,Simple array sorting>>
13 * <<mandatory-simple,Mandatory macros>>
14 * <<optional-simple,Optional macros>>
15 * <<example-simple,Example>>
16 - <<array,Huge array sorting>>
17 * <<mandatory-array,Mandatory macros>>
18 * <<optional-array,Optional macros>>
19 - <<external,External sorting>>
25 If you want to sort some data in memory and you aren't too picky about
26 setting how, you just use the routine defined in
27 `sorter/array-simple.h`. It is an optimised hybrid
28 quick-sort/insert-sort algorithm (quick-sort is used to split the
29 input into small parts, each is then sorted by insert-sort). It is
30 more than 2 times faster than stdlib's qsort(), mostly because of
33 You need to define few macros and include the header. You get a
34 sorting function in return. It will be called
35 <<fun__GENERIC_LINK_|ASORT_PREFIX|sort|,`ASORT_PREFIX(sort)`>>.
40 - `ASORT_PREFIX(name)` -- The identifier generating macro.
41 - `ASORT_KEY_TYPE` -- Data type of a single array entry key.
46 - `ASORT_ELT(i)` -- Indexing macro. Returns the key of the
47 corresponding entry. If not provided, usual array with sequential
49 - `ASORT_LT(x,y)` -- Comparing macro. If not provided, compares by the
51 - `ASORT_SWAP(i,j)` -- Swap elements with indices `i` and `j`. If not
52 provided, it assumes `ASORT_ELT` is l-value and it just swaps keys.
53 - `ASORT_TRESHOLD` -- Sequences of at most this amount of elements are
54 sorted by quick-sort, smaller are sorted by insert-sort. Defaults to
55 `8` (result of experimentation).
56 - `ASORT_EXTRA_ARGS` -- Pass some extra arguments to the function.
57 They are visible from all the macros. Must start with a comma.
59 !!ucw/sorter/array-simple.h ASORT_PREFIX
65 Let's sort an array of integers, in the usual way.
67 #define ASORT_PREFIX(X) intarr_##X
68 #define ASORT_TYPE int
69 #include <ucw/sorter/array-simple.h>
71 This generates an intarr_sort(int *array, uns array_size) function that
72 can be used the obvious way.
74 A more complicated example could be sorting a structure, where items
75 with odd indices are stored in one array, even in another. Each item
76 could be a structure containing a string and an integer. We would like
77 to sort them by the strings.
84 #include <string.h> // Because of strcmp
85 #define ASORT_PREFIX(X) complicated_##X
86 #define ASORT_TYPE struct elem
87 #define ASORT_ELT(i) ((i % 2 ? even_array : odd_array)[i / 2])
88 #define ASORT_LT(x, y) (strcmp((x).string, (y).string) < 0)
89 #define ASORT_EXTRA_ARGS , struct elem *odd_array, struct elem *even_array
90 #include <ucw/sorter/sorter/array-simple.h>
92 Now we got a complicated_sort(uns array_size, struct elem *odd_array,
93 struct *even_array) function to perform our sorting.
99 This one is very similar to the simple array sorter, but it is
100 optimised for huge arrays. It is used mostly by the
101 <<external,external sorter>> machinery described below, but you can
104 It differs in few details:
105 - It supports only continuous arrays, no indexing macro can be
107 - It is able to sort in parallel on SMP systems. It assumes all
108 callbacks you provide are thread-safe.
109 - If you provide a monotone hash function (if `hash(x) < hash(y)`, then
110 `x < y`, but `x` and `y` may differ when `hash(x) == hash(y)`), it
111 will use it to gain some more speed by radix-sort.
117 - `ASORT_PREFIX(x)` -- The identifier generating macro.
118 - `ASORT_KEY_TYPE` -- Type of elements in the array.
124 - `ASORT_LT(x,y)` -- Comparing macro. Uses the `<` operator if not
126 - `ASORT_HASH(x)` -- A monotone hash function (or macro). Should
128 - `ASORT_LONG_HASH(x)` -- Like `ASORT_HASH(x)`, but returns 64-bit
129 number instead of 32-bit.
130 - `ASORT_TRESHOLD` -- How small should a chunk of data be to be sorted
131 by insert-sort? Defaults to `8` elements.
132 - `ASORT_RADIX_BITS` -- How many bits of the hash function should be
133 used at once for radix-sort? The default is guessed from your
136 !!ucw/sorter/array.h ASORT_PREFIX