abstract.tex

   1 \input macros.tex
   2 \finalfalse  %%FIXME
   3
   4 \def\rawchapter#1{\vensure{0.5in}\bigbreak\bigbreak
   5 \leftline{\chapfont #1}
   6 }
   7
   8 \def\rawsection#1{\bigskip
   9 \leftline{\secfont #1}
  10 \nobreak
  11 \medskip
  12 \nobreak
  13 }
  14
  15 \chapter{Introduction}
  16 \bigskip
  17
  18 This thesis tells the story of two well-established problems of algorithmic
  19 graph theory: the minimum spanning trees and ranks of permutations. At distance,
  20 both problems seem to be simple, boring and already solved, because we have poly\-nom\-ial-time
  21 algorithms for them since ages. But when we come closer and seek algorithms that
  22 are really efficient, the problems twirl and twist and withstand many a~brave
  23 attempt at the optimum solution. They also reveal a~vast and diverse landscape
  24 of a~deep and beautiful theory. Still closer, this landscape turns out to be interwoven
  25 with the intricate details of various models of computation and even of arithmetics
  26 itself.
  27
  28 We have tried to cover all known important results on both problems and unite them
  29 in a~single coherent theory. At many places, we have attempted to contribute my own
  30 little stones to this mosaic: several new results, simplifications of existing
  31 ones, and last, but not least filling in important details where the original
  32 authors have missed some.
  33
  34 When compared with the earlier surveys on the minimum spanning trees, most
  35 notably Graham and Hell \cite{graham:msthistory} and Eisner \cite{eisner:tutorial},
  36 this work adds many of the recent advances, the dynamic algorithms and
  37 also the relationship with computational models. No previous work covering
  38 the ranking problems in their entirety is known.
  39
  40 We~have tried to stick to the usual notation except where it was too inconvenient.
  41 Most symbols are defined at the place where they are used for the first time.
  42 To avoid piling up too many symbols at places that speak about a~single fixed graph,
  43 this graph is always called~$G$, its set of vertices and edges are denoted by $V$
  44 and~$E$ respectively, and we~also use~$n$ for the number of its vertices and $m$~for
  45 the number of edges. At places where there could be a~danger of confusion, more explicit notation
  46 is used instead.
  47
  48 \chapter{Minimum Spanning Trees}
  49
  50 \section{The Problem}
  51
  52 The problem of finding a minimum spanning tree of a weighted graph is one of the
  53 best studied problems in the area of combinatorial optimization since its birth.
  54 Its colorful history (see \cite{graham:msthistory} and \cite{nesetril:history} for the full account)
  55 begins in~1926 with the pioneering work of Bor\o{u}vka
  56 \cite{boruvka:ojistem}\foot{See \cite{nesetril:boruvka} for an English translation with commentary.},
  57 who studied primarily an Euclidean version of the problem related to planning
  58 of electrical transmission lines (see \cite{boruvka:networks}), but gave an efficient
  59 algorithm for the general version of the problem. As it was well before the dawn of graph
  60 theory, the language of his paper was complicated, so we will better state the problem
  61 in contemporary terminology:
  62
  63 \proclaim{Problem}Given an undirected graph~$G$ with weights $w:E(G)\rightarrow {\bb R}$,
  64 find its minimum spanning tree, defined as follows:
  65
  66 \defn\id{mstdef}%
  67 For a given graph~$G$ with weights $w:E(G)\rightarrow {\bb R}$:
  68 \itemize\ibull
  69 \:A~subgraph $H\subseteq G$ is called a \df{spanning subgraph} if $V(H)=V(G)$.
  70 \:A~\df{spanning tree} of~$G$ is any spanning subgraph of~$G$ that is a tree.
  71 \:For any subgraph $H\subseteq G$ we define its \df{weight} $w(H):=\sum_{e\in E(H)} w(e)$.
  72 \:A~\df{minimum spanning tree (MST)} of~$G$ is a spanning tree~$T$ such that its weight $w(T)$
  73   is the smallest possible among all the spanning trees of~$G$.
  74 \:For a disconnected graph, a \df{(minimum) spanning forest (MSF)} is defined as
  75   a union of (minimum) spanning trees of its connected components.
  76 \endlist
  77
  78 Bor\o{u}vka's work was further extended by Jarn\'\i{}k \cite{jarnik:ojistem}, again in
  79 mostly geometric setting, and he has discovered another efficient algorithm.
  80 In the next 50 years, several significantly faster algorithms were published, ranging
  81 from the $\O(m\timesbeta(m,n))$ time algorithm by Fredman and Tarjan \cite{ft:fibonacci},
  82 over algorithms with inverse-Ackermann type complexity by Chazelle \cite{chazelle:ackermann}
  83 and Pettie \cite{pettie:ackermann}, to an~algorithm by Pettie \cite{pettie:optimal}
  84 whose time complexity is provably optimal.
  85
  86 Before we discuss the algorithms, let us review the basic properties of spanning trees.
  87 We will mostly follow the theory developed by Tarjan in~\cite{tarjan:dsna} and show
  88 that the weights on edges are not necessary for the definition of the MST.
  89
  90 \defnn{Heavy and light edges}\id{heavy}%
  91 Let~$G$ be a~connected graph with edge weights~$w$ and $T$ its spanning tree. Then:
  92 \itemize\ibull
  93 \:For vertices $x$ and $y$, let $T[x,y]$ denote the (unique) path in~$T$ joining $x$ with~$y$.
  94 \:For an edge $e=xy$ we will call $T[e]:=T[x,y]$ the \df{path covered by~$e$} and
  95   the edges of this path \df{edges covered by~$e$}.
  96 \:An edge~$e$ is called \df{light with respect to~$T$} (or just \df{$T$-light}) if it covers a~heavier edge, i.e., if there
  97   is an~edge $f\in T[e]$ such that $w(f) > w(e)$.
  98 \:An edge~$e$ is called \df{$T$-heavy} if it covers a~lighter edge.
  99 \endlist
 100
 101 \thm
 102 A~spanning tree~$T$ is minimum iff there is no $T$-light edge.
 103
 104 \thm
 105 If all edge weights are distinct, then the minimum spanning tree is unique.
 106
 107 \para
 108 When $G$ is a graph with distinct edge weights, we will use $\mst(G)$ to denote
 109 its unique minimum spanning tree.
 110 To simplify the description of MST algorithms, we will assume that the weights
 111 of all edges are distinct and that instead of numeric weights we are given a~\df{comparison oracle.}
 112 The oracle is a~function that answers questions of type ``Is $w(e)<w(f)$?'' in
 113 constant time. This will conveniently shield us from problems with representation
 114 of real numbers in algorithms and in the few cases where we need a more concrete
 115 input, we will explicitly state so. In case the weights are not distinct, the ties
 116 can be broken arbitrarily.
 117
 118 \section{Classical algorithms}
 119
 120 The characterization of MST's in terms of light edges makes it easy to develop
 121 the Tarjan's Red-Blue meta-algorithm, which is based on the following properties:
 122
 123 \lemman{Blue lemma, also known as the Cut rule}\id{bluelemma}%
 124 The lightest edge of every cut is contained in the MST.
 125
 126 \lemman{Red lemma, also known as the Cycle rule}\id{redlemma}%
 127 An~edge~$e$ is not contained in the MST iff it is the heaviest on some cycle.
 128
 129 \para
 130 The algorithm repeatedly colors lightest edges of cuts blue and heaviest
 131 edges of cycles red. We prove that no matter which order of the colorings
 132 we use, the algorithm always stops and the blue edges form the MST.
 133
 134 All three classical MST algorithms (Bor\o{u}vka's, Jarn\'\i{}k's and Kruskal's)
 135 can be then obtained as specializations of this procedure. We also calculate the
 136 time complexity of standard implementations of these algorithms.
 137
 138 \algn{Bor\o{u}vka \cite{boruvka:ojistem}, Choquet \cite{choquet:mst}, Sollin \cite{sollin:mst}, and others}
 139 \algo
 140 \algin A~graph~$G$ with an edge comparison oracle.
 141 \:$T\=$ a forest consisting of vertices of~$G$ and no edges.
 142 \:While $T$ is not connected:
 143 \::For each component $T_i$ of~$T$, choose the lightest edge $e_i$ from the cut
 144    separating $T_i$ from the rest of~$T$.
 145 \::Add all $e_i$'s to~$T$.
 146 \algout Minimum spanning tree~$T$.
 147 \endalgo
 148
 149 \thm
 150 The Bor\o{u}vka's algorithm finds the MST in time $\O(m\log n)$.
 151
 152 \algn{Jarn\'\i{}k \cite{jarnik:ojistem}, Prim \cite{prim:mst}, Dijkstra \cite{dijkstra:mst}}\id{jarnik}%
 153 \algo
 154 \algin A~graph~$G$ with an edge comparison oracle.
 155 \:$T\=$ a single-vertex tree containing an~arbitrary vertex of~$G$.
 156 \:While there are vertices outside $T$:
 157 \::Pick the lightest edge $uv$ such that $u\in V(T)$ and $v\not\in V(T)$.
 158 \::$T\=T+uv$.
 159 \algout Minimum spanning tree~$T$.
 160 \endalgo
 161
 162 \thm
 163 The Jarn\'\i{}k's algorithm computes the MST of a~given graph in time $\O(m\log n)$.
 164
 165 \algn{Kruskal \cite{kruskal:mst}}
 166 \algo
 167 \algin A~graph~$G$ with an edge comparison oracle.
 168 \:Sort edges of~$G$ by their increasing weights.
 169 \:$T\=\hbox{an empty spanning subgraph}$.
 170 \:For all edges $e$ in their sorted order:
 171 \::If $T+e$ is acyclic, add~$e$ to~$T$.
 172 \::Otherwise drop~$e$.
 173 \algout Minimum spanning tree~$T$.
 174 \endalgo
 175
 176 \thm
 177 The Kruskal's algorithm finds the MST of a given graph in time $\O(m\log n)$.
 178 If the edges are already sorted by their weights, the time drops to
 179 $\O(m\timesalpha(m,n))$.\foot{Here $\alpha(m,n)$ is the inverse Ackermann's
 180 function.}
 181
 182 \section{Contractive algorithms}
 183
 184 While the classical algorithms are based on growing suitable trees, they
 185 can be also reformulated in terms of edge contraction. Instead of keeping
 186 a~forest of trees, we can keep each tree contracted to a single vertex.
 187 This replaces the relatively complex tree-edge incidencies by simple
 188 vertex-edge incidencies, potentially speeding up the calculation at the
 189 expense of having to perform the contractions.
 190 A~contractive version of the Bor\o{u}vka's algorithm is easy to formulate
 191 and also to analyse:
 192
 193 \algn{Contractive version of Bor\o{u}vka's algorithm}
 194 \algo
 195 \algin A~graph~$G$ with an edge comparison oracle.
 196 \:$T\=\emptyset$.
 197 \:$\ell(e)\=e$ for all edges~$e$. \cmt{Initialize edge labels.}
 198 \:While $n(G)>1$:
 199 \::For each vertex $v_k$ of~$G$, let $e_k$ be the lightest edge incident to~$v_k$.
 200 \::$T\=T\cup \{ \ell(e_1),\ldots,\ell(e_n) \}$.\hfil\break\cmt{Remember labels of all selected edges.}
 201 \::Contract all edges $e_k$, inheriting labels and weights.\foot{In other words, we will ask the comparison oracle for the edge $\ell(e)$ instead of~$e$.}
 202 \::Flatten $G$ (remove parallel edges and loops).
 203 \algout Minimum spanning tree~$T$.
 204 \endalgo
 205
 206 \thm
 207 The Contractive Bor\o{u}vka's algorithm finds the MST of the input graph in
 208 time $\O(\min(n^2,m\log n))$.
 209
 210 We also show that this time bound is tight --- we construct an~explicit
 211 family of graphs on which the algorithm spends $\Theta(m\log n)$ steps.
 212 Given a~planar graph, the algorithm however runs much faster (we get a~linear-time
 213 algorithm much simpler than the one of Matsui \cite{matsui:planar}):
 214
 215 \thm
 216 When the input graph is planar, the Contractive Bor\o{u}vka's algorithm runs in
 217 time $\O(n)$.
 218
 219 Graph contractions are indeed a~very powerful tool and they can be used in other MST
 220 algorithms as well. The following lemma shows the gist:
 221
 222 \lemma
 223 Let $G$ be a weighted graph, $e$~an arbitrary edge of~$\mst(G)$, $G/e$ the multigraph
 224 produced by contracting~$e$ in~$G$, and $\pi$ the bijection between edges of~$G-e$ and
 225 their counterparts in~$G/e$. Then $\mst(G) = \pi^{-1}[\mst(G/e)] + e.$
 226
 227 \chapter{Fine Details of Computation}
 228
 229 \section{Models and machines}
 230
 231 Traditionally, computer scientists have been using a~variety of computational models
 232 as a~formalism in which their algorithms are stated. If we were studying
 233 NP-complete\-ness, we could safely assume that all these models are equivalent,
 234 possibly up to polynomial slowdown which is negligible. In our case, the
 235 differences between good and not-so-good algorithms are on a~much smaller
 236 scale, so we need to state our computation models carefully and develop
 237 a repertoire of basic data structures tailor-made for the fine details of the
 238 models.
 239
 240 In recent decades, most researchers in the area of combinatorial algorithms
 241 have been considering two computational models: the Random Access Machine and the Pointer
 242 Machine. The former is closer to the programmer's view of a~real computer,
 243 the latter is slightly more restricted and ``asymptotically safe.''
 244 We will follow this practice and study our algorithms in both models.
 245
 246 The \df{Random Access Machine (RAM)} is not a~single coherent model, but rather a~family
 247 of closely related machines (See Cook and Reckhow \cite{cook:ram} for one of the usual formal definitions
 248 and Hagerup \cite{hagerup:wordram} for a~thorough description of the differences
 249 between the RAM variants.) We will consider the variant usually called the \df{Word-RAM.}
 250 It allows the ``C-language operators'', i.e., arithmetics and bitwise logical operations,
 251 running in constant time on words of a~specified size.
 252
 253 The \df{Pointer Machine (PM)} also does not seem to have any well established definition.
 254 The various kinds of pointer machines are examined by Ben-Amram in~\cite{benamram:pm},
 255 but unlike the RAM's they turn out to be equivalent up to constant slowdown.
 256 Our formal definition is closely related to the \em{linking automaton} proposed
 257 by Knuth in~\cite{knuth:fundalg}.
 258
 259 \section{Pointer machine techniques}
 260
 261 In the Contractive Bor\o{u}vka's algorithm, we needed to contract a~given
 262 set of edges in the current graph and then flatten the graph, all this in time $\O(m)$.
 263 This can be easily handled on both the RAM and the PM by bucket sorting. We develop
 264 a~bunch of pointer-based sorted techniques which can be summarized by the following
 265 lemma:
 266
 267 \lemma
 268 Partitioning of a~collection of sequences $S_1,\ldots,S_n$, whose elements are
 269 arbitrary pointers and symbols from a~finite alphabet, to equality classes can
 270 be performed on the Pointer Machine in time $\O(n + \sum_i \vert S_i \vert)$.
 271
 272 \para
 273 A~direct consequence of this unification is a~linear-time algorithm for subtree
 274 isomorphism, significantly simpler than the standard one due to Zemlayachenko (see \cite{zemlay:treeiso}
 275 and also Dinitz et al.~\cite{dinitz:treeiso}). When we apply a~similar technique
 276 to general graphs, we get the framework of topological graph computation
 277 of Buchsbaum et al.~\cite{buchsbaum:verify}.
 278
 279 \defn
 280 A~\df{graph computation} is a~function that takes a~\df{labeled undirected graph} as its input. The labels of
 281 vertices and edges can be arbitrary symbols drawn from a~finite alphabet. The output
 282 of the computation is another labeling of the same graph. This time, the vertices and
 283 edges can be labeled with not only symbols of the alphabet, but also with pointers to the vertices
 284 and edges of the input graph, and possibly also with pointers to outside objects.
 285 A~graph computation is called \df{topological} if it produces isomorphic
 286 outputs for isomorphic inputs. The isomorphism of course has to preserve not only
 287 the structure of the graph, but also the labels in the obvious way.
 288
 289 \defn
 290 For a~collection~$\C$ of graphs, we define $\vert\C\vert$ as the number of graphs in
 291 the collection and $\Vert\C\Vert$ as their total size, i.e., $\Vert\C\Vert = \sum_{G\in\C} n(G) + m(G)$.
 292
 293 \thm
 294 Suppose that we have a~topological graph computation~$\cal T$ that can be performed in time
 295 $T(k)$ for graphs on $k$~vertices. Then we can run~$\cal T$ on a~collection~$\C$
 296 of labeled graphs on~$k$ vertices in time $\O(\Vert\C\Vert + (k+s)^{k(k+2)}\cdot (T(k)+k^2))$,
 297 where~$s$ is a~constant depending only on the number of symbols used as vertex/edge labels.
 298
 299 \section{Data structures on the RAM}
 300
 301 There is a~lot of data structures designed specifically for the RAM. These structures
 302 take advantage of both indexing and arithmetics and they often surpass the known
 303 lower bounds for the same problem on the~PM. In many cases, they achieve constant time
 304 per operation, at least when either the magnitude of the values or the size of
 305 the data structure is suitably bounded.
 306
 307 A~classical result of this type is the tree of van Emde Boas~\cite{boas:vebt}
 308 which represent a~subset of the integers $\{0,\ldots,U-1\}$. It allows insertion,
 309 deletion and order operations (minimum, maximum, successor etc.) in time $\O(\log\log U)$,
 310 regardless of the size of the subset. If we replace the heap used in the Jarn\'\i{}k's
 311 algorithm (\ref{jarnik}) by this structure, we immediately get an~algorithm
 312 for finding the MST in integer-weighted graphs in time $\O(m\log\log w_{max})$,
 313 where $w_{max}$ is the maximum weight.
 314
 315 A~real breakthrough has however been made by Fredman and Willard who introduced
 316 the Fusion trees~\cite{fw:fusion}. They again perform membership and predecessor
 317 operation on a~set of $n$~integers, but with time complexity $\O(\log_W n)$
 318 per operation on a~Word-RAM with $W$-bit words. This of course assumes that
 319 each element of the set fits in a~single word. As $W$ must at least~$\log n$,
 320 the operations take $\O(\log n/\log\log n)$ time and thus we are able to sort $n$~integers
 321 in time~$o(n\log n)$. This was a~beginning of a~long sequence of faster and
 322 faster sorting algorithms, culminating with the work of Thorup and Han.
 323 They have improved the time complexity of integer sorting to $\O(n\log\log n)$ deterministically~\cite{han:detsort}
 324 and expected $\O(n\sqrt{\log\log n})$ for randomized algorithms~\cite{hanthor:randsort},
 325 both in linear space.
 326
 327 The Fusion trees themselves have very limited use in graph algorithms, but the
 328 principles behind them are ubiquitous in many other data structures and these
 329 will serve us well and often. We are going to build the theory of Q-heaps,
 330 which will later lead to a~linear-time MST algorithm for arbitrary integer weights.
 331 Other such structures will help us in building linear-time RAM algorithms for computing the ranks
 332 of various combinatorial structures in Chapter~\ref{rankchap}.
 333
 334 Outside our area, important consequences of RAM data structures include the
 335 Thorup's $\O(m)$ algorithm for single-source shortest paths in undirected
 336 graphs with positive integer weights \cite{thorup:usssp} and his $\O(m\log\log
 337 n)$ algorithm for the same problem in directed graphs \cite{thorup:sssp}. Both
 338 algorithms have been then significantly simplified by Hagerup
 339 \cite{hagerup:sssp}.
 340
 341 Despite the progress in the recent years, the corner-stone of all RAM structures
 342 is still the representation of combinatorial objects by integers introduced by
 343 Fredman and Willard.
 344
 345 First of all, we observe that we can encode vectors in integers:
 346
 347 \notan{Bit strings}\id{bitnota}%
 348 We will work with binary representations of natural numbers by strings over the
 349 alphabet $\{\0,\1\}$: we will use $\(x)$ for the number~$x$ written in binary,
 350 $\(x)_b$ for the same padded to exactly $b$ bits by adding leading zeroes,
 351 and $x[k]$ for the value of the $k$-th bit of~$x$ (with a~numbering of bits such that $2^k[k]=1$).
 352 The usual conventions for operations on strings will be utilized: When $s$
 353 and~$t$ are strings, we write $st$ for their concatenation and
 354 $s^k$ for the string~$s$ repeated $k$~times.
 355 When the meaning is clear from the context,
 356 we will use $x$ and $\(x)$ interchangeably to avoid outbreak of symbols.
 357
 358 \defn
 359 The \df{bitwise encoding} of a~vector ${\bf x}=(x_0,\ldots,x_{d-1})$ of~$b$-bit numbers
 360 is an~integer~$x$ such that $\(x)=\(x_{d-1})_b\0\(x_{d-2})_b\0\ldots\0\(x_0)_b$. In other
 361 words, $x = \sum_i 2^{(b+1)i}\cdot x_i$. (We have interspersed the elements with \df{separator bits.})
 362
 363 \para
 364 If we want to fit the whole vector in a~single machine word, the parameters $b$ and~$d$ must satisfy
 365 the condition $(b+1)d\le W$ (where $W$~is the word size of the machine).
 366 By using multiple-precision arithmetics, we can encode all vectors satisfying $bd=\O(W)$.
 367 We describe how to translate simple vector manipulations to sequences of $\O(1)$ RAM operations
 368 on their codes. For example, we can handle element-wise comparison of vectors, insertion
 369 in a~sorted vector or shuffling elements of a~vector according to a~fixed permutation,
 370 all in $\O(1)$ time. This also implies that several functions on numbers can be performed
 371 in constant time, most notably binary logarithms.
 372 The vector operations then serve as building blocks for construction of the Q-heaps. We get:
 373
 374 \thm
 375 Let $W$ and~$k$ be positive integers such that $k=\O(W^{1/4})$. Let~$Q$
 376 be a~Q-heap of at most $k$-elements of $W$~bits each. Then we can perform
 377 Q-heap operations on~$Q$ (insertion, deletion, search for a~given value and search
 378 for the $i$-th smallest element) in constant time on a~Word-RAM with word size~$W$,
 379 after spending time $\O(2^{k^4})$ on the same RAM on precomputing of tables.
 380
 381 \cor
 382 For every positive integer~$r$ and $\delta>0$ there exists a~data structure
 383 capable of maintaining the minimum of a~set of at most~$r$ word-sized numbers
 384 under insertions and deletions. Each operation takes $\O(1)$ time on a~Word-RAM
 385 with word size $W=\Omega(r^{\delta})$, after spending time
 386 $\O(2^{r^\delta})$ on precomputing of tables.
 387
 388 \chapter{Ranking Combinatorial Structures}\id{rankchap}%
 389
 390 \chapter{Bibliography}
 391
 392 \dumpbib
 393
 394 \vfill\eject
 395 \ifodd\pageno\else\hbox{}\fi
 396
 397 \bye