From: Martin Mares Date: Sat, 13 Sep 2008 14:25:21 +0000 (+0200) Subject: Removed files that are not needed for the saga itself. X-Git-Url: http://mj.ucw.cz/gitweb/?a=commitdiff_plain;h=6e832b0a77ed3aaabf0046fe0081c305856379d4;p=saga.git Removed files that are not needed for the saga itself. --- diff --git a/Makefile b/Makefile index 89f0c42..cbd8ba1 100644 --- a/Makefile +++ b/Makefile @@ -1,4 +1,4 @@ -all: saga.ps abstract.ps abscover.ps pubs.ps +all: saga.ps CHAPTERS=cover pref mst ram adv opt dyn appl rank epilog notation @@ -8,12 +8,6 @@ CHAPTERS=cover pref mst ram adv opt dyn appl rank epilog notation tex $< && mv $*.toc $*.tok tex $< && mv $*.toc $*.tok -pubs.dvi: pubs.tex macros.tex fonts12.tex - tex $< - -abscover.dvi: abscover.tex - csplain $< - saga.dvi: $(addsuffix .tex,$(CHAPTERS)) %.ps: %.dvi @@ -28,10 +22,4 @@ mostlyclean: clean: mostlyclean rm -f *.ps *.pdf -countrefs: saga.dvi - grep -c bibitem saga.bbl - -upload: saga.pdf - scp -C saga.pdf jw:www/papers/saga/ - .SECONDARY: diff --git a/abscover.tex b/abscover.tex deleted file mode 100644 index 2915681..0000000 --- a/abscover.tex +++ /dev/null @@ -1,205 +0,0 @@ -% Cover of the abstract - -\input macros.tex -\input fonts12.tex - -\finaltrue -\hwobble=0mm -\advance\hsize by 1cm -\advance\vsize by 20pt - -\nopagenumbers -\parindent=0pt - -%%% Title page %%% - -{ - -\vglue 0.7in - -\font\ft=cmr17 -\font\xt=cmb17 at 24pt -\font\yt=cmti17 -\font\ct=cmsy17 at 18pt -\font\st=cmcsc17 at 20pt -\ft -\baselineskip=24pt - -\centerline{Charles University in Prague} -\centerline{Faculty of Mathematics and Physics} - -\vfil -\vfil - -\centerline{\epsfxsize=0.4\hsize\epsfbox{pic/mfflogo.eps}} - -\vfil -\vfil - -\centerline{\st Abstract of Doctoral Thesis} - -\vfil - -\centerline{\xt Graph Algorithms} - -\vfil -\vfil - -\centerline{\yt {\ct M}\kern-0.13em artin {\ct M}\kern-0.13em are\v{s}} - -\vfil - -\centerline{Department of Applied Mathematics} -\centerline{Malostransk\'e n\'am.~25} -\centerline{Prague, Czech 
Republic} - -\vfil - -\centerline{Supervisor: Prof.~RNDr.~Jaroslav Ne\v{s}et\v{r}il, DrSc.} -\centerline{Branch I4: Discrete Models and Algorithms} - -\vfil - -\centerline{2008} - -\vskip 0.5in -\eject - -\vglue 0pt -\eject -\vglue 0.7in - -\centerline{Univerzita Karlova v Praze} -\centerline{Matematicko-fyzik\'aln\'\i{} fakulta} - -\vfil -\vfil - -\centerline{\epsfxsize=0.4\hsize\epsfbox{pic/mfflogo.eps}} - -\vfil -\vfil - -\centerline{\st Autorefer\'at} - -\vfil - -\centerline{\xt Grafov\'e algoritmy} - -\vfil -\vfil - -\centerline{\yt {\ct M}\kern-0.13em artin {\ct M}\kern-0.13em are\v{s}} - -\vfil - -\centerline{Katedra aplikovan\'e matematiky} -\centerline{Malostransk\'e n\'am.~25} -\centerline{118 00~~Praha 1} - -\vfil - -\centerline{\v{S}kolitel: Prof.~RNDr.~Jaroslav Ne\v{s}et\v{r}il, DrSc.} -\centerline{Obor I4: Diskr\'etn\'\i{} modely a algoritmy} - -\vfil - -\centerline{2008} -\vskip 0.5in -\eject - -} - -%%% Subtitle page %%% - -{ -\vglue 0pt - -\language=\czech -\chyph -\font\ft=csr12 -\font\fb=csbx12 -\ft -\parskip=0pt - -Disertační práce byla vypracována v~rámci interního a navazujícího externího -doktorského studia na~Katedře aplikované matematiky Matematicko-fyzikální fakulty -Univerzity Karlovy v~Praze. - -\bigskip - -{\fb Uchazeč:} Mgr. Martin Mareš - -\bigskip - -{\fb Školitel:} Prof.~RNDr.~Jaroslav Nešetřil, DrSc. - -\bigskip - -{\fb Školící pracoviště:} - -\smallskip - -{\obeylines -Katedra aplikované matematiky -MFF UK -Malostranské nám. 25 -118 00 Praha 1 -} - -\bigskip - -{\fb Oponenti:} - -\smallskip - -{\obeylines - -Josep Díaz -Departament de Llenguatges i Sistemes Inform\`atics -Universitat Polit\`ecnica de Catalunya -Campus Nord -- Ed. Omega, 240 -Jordi Girona Salgado, 1--3 -E-08034 Barcelona, Spain - -\medskip - -Doc. RNDr. Václav Koubek, DrSc. -Katedra teoretické informatiky a matematické logiky -MFF UK -Malostranské nám. 
25 -118 00 Praha 1 - -\medskip - -Patrice Ossona de Mendez -École des Hautes Études en Sciences Sociales -CAMS -- UMR 8557 -54 Boulevard Raspail -75006 Paris, France - -} - -\bigskip - -{\fb Předseda oborové rady I4:} Prof. RNDr. Jaroslav Nešetřil, DrSc. - -\bigskip - -Autoreferát byl rozeslán dne 27. 6. 2008. - -\medskip - -Obhajoba disertační práce se koná dne 29. 7. 2008 od 14:30 před komisí obhajoby -doktorských disertačních prací v~oboru I4 v~budově MFF UK na~Malostranském náměstí 25, -118 00 Praha 1. - -\medskip - -S~disertační prací je možno se seznámit na~studijním oddělení doktorského studia MFF UK, -Ke~Karlovu~3, 121 16 Praha 2. - -\vfill\eject -} - -\bye diff --git a/abstract.tex b/abstract.tex deleted file mode 100644 index 7c67936..0000000 --- a/abstract.tex +++ /dev/null @@ -1,1408 +0,0 @@ -\input macros.tex -\input fonts10.tex - -\finaltrue -\hwobble=0mm -\advance\hsize by 1cm -\advance\vsize by 20pt - -\def\rawchapter#1{\vensure{0.5in}\bigskip\goodbreak -\leftline{\chapfont #1} -} - -\def\rawsection#1{\medskip\smallskip -\leftline{\secfont #1} -\nobreak -\smallskip -\nobreak -} - -\def\schapter#1{\chapter{#1}\medskip} - -\schapter{Introduction} - -This thesis tells the story of two well-established problems of algorithmic -graph theory: the minimum spanning trees and ranks of permutations. From a~distance, -both problems seem to be simple, boring and already solved, because poly\-nom\-ial-time -algorithms for them have been known for ages. But when we come closer and seek algorithms that -are really efficient, the problems twirl and twist and withstand many a~brave -attempt at the optimum solution. They also reveal a~vast and diverse landscape -of a~deep and beautiful theory. Still closer, this landscape turns out to be interwoven -with the intricate details of various models of computation and even of arithmetic -itself. - -We have tried to cover all known important results on both problems and unite them -in a~single coherent theory. 
At many places, we have attempted to contribute our own -little stones to this mosaic: several new results, simplifications of existing -ones, and, last but not least, filling in important details that the original -authors missed. - -When compared with the earlier surveys on the minimum spanning trees, most -notably Graham and Hell \cite{graham:msthistory} and Eisner \cite{eisner:tutorial}, -this work adds many of the recent advances, the dynamic algorithms and -also the relationship with computational models. No previous work covering -the ranking problems in their entirety is known. - -We~have tried to stick to the usual notation except where it was too inconvenient. -Most symbols are defined at the place where they are used for the first time. -To avoid piling up too many symbols at places that speak about a~single fixed graph, -this graph is always called~$G$, its vertex and edge sets are denoted by $V$ -and~$E$ respectively, and we~also use~$n$ for the number of its vertices and $m$~for -the number of edges. At places where there could be a~danger of confusion, more explicit notation -is used instead. - -\chapter{Minimum Spanning Trees} - -\section{The Problem} - -The problem of finding a minimum spanning tree of a weighted graph has been one of the -best studied problems in the area of combinatorial optimization since its birth. -Its colorful history (see \cite{graham:msthistory} and \cite{nesetril:history} for the full account) -begins in~1926 with the pioneering work of Bor\o{u}vka -\cite{boruvka:ojistem}\foot{See \cite{nesetril:boruvka} for an English translation with commentary.}, -who studied primarily a~Euclidean version of the problem related to planning -of electrical transmission lines (see \cite{boruvka:networks}), but gave an efficient -algorithm for the general version of the problem. 
As it was well before the dawn of graph -theory, the language of his paper was complicated, so we will state the problem -in contemporary terminology instead: - -\proclaim{Problem}Given an undirected graph~$G$ with weights $w:E(G)\rightarrow {\bb R}$, -find its minimum spanning tree, defined as follows: - -\defn\id{mstdef}% -For a given graph~$G$ with weights $w:E(G)\rightarrow {\bb R}$: -\itemize\ibull -\:A~subgraph $H\subseteq G$ is called a \df{spanning subgraph} if $V(H)=V(G)$. -\:A~\df{spanning tree} of~$G$ is any spanning subgraph of~$G$ that is a tree. -\:For any subgraph $H\subseteq G$ we define its \df{weight} $w(H):=\sum_{e\in E(H)} w(e)$. -\:A~\df{minimum spanning tree (MST)} of~$G$ is a spanning tree~$T$ such that its weight $w(T)$ - is the smallest possible among all the spanning trees of~$G$. -\:For a disconnected graph, a \df{(minimum) spanning forest (MSF)} is defined as - a union of (minimum) spanning trees of its connected components. -\endlist - -Bor\o{u}vka's work was further extended by Jarn\'\i{}k \cite{jarnik:ojistem}, again in -a~mostly geometric setting, and he discovered another efficient algorithm. -In the next 50 years, several significantly faster algorithms were published, ranging -from the $\O(m\times\beta(m,n))$ time algorithm by Fredman and Tarjan \cite{ft:fibonacci}, -through algorithms with inverse-Ackermann type complexity by Chazelle \cite{chazelle:ackermann} -and Pettie \cite{pettie:ackermann}, to an~algorithm by Pettie \cite{pettie:optimal} -whose time complexity is provably optimal. - -Before we discuss the algorithms, let us review the basic properties of spanning trees. -We will mostly follow the theory developed by Tarjan in~\cite{tarjan:dsna} and show -that the weights on edges are not necessary for the definition of the MST. - -\defnn{Heavy and light edges}\id{heavy}% -Let~$G$ be a~connected graph with edge weights~$w$ and $T$ its spanning tree. 
Then: -\itemize\ibull -\:For vertices $x$ and $y$, let $T[x,y]$ denote the (unique) path in~$T$ joining $x$ with~$y$. -\:For an edge $e=xy$ we will call $T[e]:=T[x,y]$ the \df{path covered by~$e$} and - the edges of this path \df{edges covered by~$e$}. -\:An edge~$e$ is called \df{light with respect to~$T$} (or just \df{$T$-light}) if it covers a~heavier edge, i.e., if there - is an~edge $f\in T[e]$ such that $w(f) > w(e)$. -\:An edge~$e$ is called \df{$T$-heavy} if it covers a~lighter edge. -\endlist - -\thm -A~spanning tree~$T$ is minimum iff there is no $T$-light edge. - -\thm -If all edge weights are distinct, then the minimum spanning tree is unique. - -\para -When $G$ is a graph with distinct edge weights, we will use $\mst(G)$ to denote -its unique minimum spanning tree. -To simplify the description of MST algorithms, we will assume that the weights -of all edges are distinct and that instead of numeric weights we are given a~\df{comparison oracle.} -The oracle is a~function that answers questions of the type ``Is $w(e)<w(f)$?'' in constant time. - -\algn{Contractive Bor\o{u}vka's Algorithm}\id{contbor}% -\algo -\algin A~graph~$G$ with an edge comparison oracle. -\:$T\=\emptyset$. -\:$\ell(e)\=e$ for all edges~$e$. -\:While $n(G)>1$: -\::For each vertex $v_k$ of~$G$, let $e_k$ be the lightest edge incident to~$v_k$. -\::$T\=T\cup \{ \ell(e_1),\ldots,\ell(e_n) \}$.\cmt{Remember labels of all selected edges.} -\::Contract all edges $e_k$, inheriting labels and weights.\foot{In other words, we will ask the comparison oracle for the edge $\ell(e)$ instead of~$e$.} -\::Flatten $G$ (remove parallel edges and loops). -\algout Minimum spanning tree~$T$. -\endalgo - -\thm -The Contractive Bor\o{u}vka's algorithm finds the MST of the graph given as -its input in time $\O(\min(n^2,m\log n))$. - -We also show that this time bound is tight --- we construct an~explicit -family of graphs on which the algorithm spends $\Theta(m\log n)$ steps. 
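To make the mechanics of a~Bor\o{u}vka step concrete, here is a~small illustrative sketch in Python (an addition for this text, not code from the thesis). The contractions and the flattening are simulated by component labels kept in a~union--find structure, and the helper name `boruvka_mst` is ours:

```python
# Hypothetical illustration (not the thesis's pseudocode): Boruvka's algorithm
# with contractions simulated by connected-component labels instead of an
# explicit multigraph. Edges are (weight, u, v) tuples with distinct weights.

def boruvka_mst(n, edges):
    """Return (total weight, edge list) of the MST of a connected graph on n vertices."""
    parent = list(range(n))

    def find(x):                      # union-find root with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    components = n
    while components > 1:
        # One Boruvka step: cheapest outgoing edge of every component.
        cheapest = {}
        for w, u, v in edges:
            ru, rv = find(u), find(v)
            if ru == rv:
                continue              # a loop after contraction -- flattened away
            for r in (ru, rv):
                if r not in cheapest or w < cheapest[r][0]:
                    cheapest[r] = (w, u, v)
        for w, u, v in cheapest.values():
            ru, rv = find(u), find(v)
            if ru != rv:              # contract the selected edge
                parent[ru] = rv
                mst.append((w, u, v))
                components -= 1
    return sum(w for w, u, v in mst), mst
```

Since the weights are assumed distinct, the result is the unique MST, in accordance with the uniqueness theorem above.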
-Given a~planar graph, however, the algorithm runs much faster (we get a~linear-time -algorithm much simpler than that of Matsui \cite{matsui:planar}): - -\thm -When the input graph is planar, the Contractive Bor\o{u}vka's algorithm runs in -time $\O(n)$. - -Graph contractions are indeed a~very powerful tool and they can be used in other MST -algorithms as well. The following lemma captures the gist: - -\lemman{Contraction lemma}\id{contlemma}% -Let $G$ be a weighted graph, $e$~an arbitrary edge of~$\mst(G)$, $G/e$ the multigraph -produced by contracting~$e$ in~$G$, and $\pi$ the bijection between edges of~$G-e$ and -their counterparts in~$G/e$. Then $\mst(G) = \pi^{-1}[\mst(G/e)] + e.$ - -\chapter{Fine Details of Computation} - -\section{Models and machines} - -Traditionally, computer scientists have been using a~variety of computational models -as a~formalism in which their algorithms are stated. If we were studying -NP-complete\-ness, we could safely assume that all these models are equivalent, -possibly up to a~polynomial slowdown, which is negligible. In our case, the -differences between good and not-so-good algorithms are on a~much smaller -scale, so we need to state our computation models carefully and develop -a repertoire of basic data structures tailor-made for the fine details of the -models. In recent decades, most researchers in the area of combinatorial algorithms -have been considering the following two computational models, and we will do likewise. - -The \df{Random Access Machine (RAM)} is not a~single coherent model, but rather a~family -of closely related machines (see Cook and Reckhow \cite{cook:ram} for one of the usual formal definitions -and Hagerup \cite{hagerup:wordram} for a~thorough description of the differences -between the RAM variants). 
We will consider the variant usually called the \df{Word-RAM.} -It allows the ``C-language operators'', i.e., arithmetic and bitwise logical operations, -running in constant time on words of a~specified size. - -The \df{Pointer Machine (PM)} also does not seem to have any well-established definition. -The various kinds of pointer machines are examined by Ben-Amram in~\cite{benamram:pm}, -but unlike the RAM's they turn out to be equivalent up to constant slowdown. -Our formal definition is closely related to the \em{linking automaton} proposed -by Knuth in~\cite{knuth:fundalg}. - -\section{Bucket sorting and related techniques}\id{bucketsort}% - -In the Contractive Bor\o{u}vka's algorithm, we needed to contract a~given -set of edges in the current graph and then flatten the graph, all this in time $\O(m)$. -This can be easily handled on both the RAM and the PM by bucket sorting. We develop -a~family of pointer-based sorting techniques which can be summarized by the following -lemma: - -\lemma -Partitioning of a~collection of sequences $S_1,\ldots,S_n$, whose elements are -arbitrary pointers and symbols from a~finite alphabet, into equality classes can -be performed on the Pointer Machine in time $\O(n + \sum_i \vert S_i \vert)$. - -\para -A~direct consequence of this unification is a~linear-time algorithm for subtree -isomorphism, significantly simpler than the standard one due to Zemlyachenko (see \cite{zemlay:treeiso} -and also Dinitz et al.~\cite{dinitz:treeiso}). When we apply a~similar technique -to general graphs, we get the framework of topological graph computation -of Buchsbaum et al.~\cite{buchsbaum:verify}. - -\defn -A~\df{graph computation} is a~function that takes a~\df{labeled undirected graph} as its input. The labels of -vertices and edges can be arbitrary symbols drawn from a~finite alphabet. The output -of the computation is another labeling of the same graph. 
This time, the vertices and -edges can be labeled not only with symbols of the alphabet, but also with pointers to the vertices -and edges of the input graph, and possibly also with pointers to outside objects. -A~graph computation is called \df{topological} if it produces isomorphic -outputs for isomorphic inputs. The isomorphism of course has to preserve not only -the structure of the graph, but also the labels in the obvious way. - -\defn -For a~collection~$\C$ of graphs, we define $\vert\C\vert$ as the number of graphs in -the collection and $\Vert\C\Vert$ as their total size, i.e., $\Vert\C\Vert = \sum_{G\in\C} (n(G) + m(G))$. - -\thm -Suppose that we have a~topological graph computation~$\cal T$ that can be performed in time -$T(k)$ for graphs on $k$~vertices. Then we can run~$\cal T$ on a~collection~$\C$ -of labeled graphs on~$k$ vertices in time $\O(\Vert\C\Vert + (k+s)^{k(k+2)}\cdot (T(k)+k^2))$, -where~$s$ is a~constant depending only on the number of symbols used as vertex/edge labels. - -\section{Data structures on the RAM}\id{ramds}% - -There are many data structures designed specifically for the RAM. These structures -take advantage of both indexing and arithmetic and they often surpass the known -lower bounds for the same problem on the~PM. In many cases, they achieve constant time -per operation, at least when either the magnitude of the values or the size of -the data structure is suitably bounded. - -A~classical result of this type is the tree of van Emde Boas~\cite{boas:vebt}, -which represents a~subset of the integers $\{0,\ldots,U-1\}$. It allows insertion, -deletion and order operations (minimum, maximum, successor etc.) in time $\O(\log\log U)$, -regardless of the size of the subset. If we replace the heap used in the Jarn\'\i{}k's -algorithm (\ref{jarnik}) by this structure, we immediately get an~algorithm -for finding the MST in integer-weighted graphs in time $\O(m\log\log w_{max})$, -where $w_{max}$ is the maximum weight. 
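To show concretely which heap is being replaced, here is a~hypothetical Python sketch (ours, not the thesis's code) of the Jarn\'\i{}k's algorithm with an ordinary binary heap; the $\O(m\log\log w_{max})$ bound quoted above would come from substituting a~van Emde Boas tree keyed by the integer weights, which `heapq` of course is not:

```python
# Hypothetical sketch (not from the thesis): Jarnik's (Prim's) algorithm with
# Python's binary heap. Swapping the heap for a van Emde Boas tree over integer
# weights is what yields the O(m log log w_max) bound discussed above.
import heapq

def jarnik_mst_weight(n, adj):
    """adj[u] = list of (weight, v); returns the MST weight of a connected graph."""
    in_tree = [False] * n
    heap = [(0, 0)]                  # (edge weight, vertex), grow from vertex 0
    total = 0
    while heap:
        w, u = heapq.heappop(heap)
        if in_tree[u]:
            continue                 # stale entry; vEB would support a real decrease
        in_tree[u] = True
        total += w
        for wv, v in adj[u]:
            if not in_tree[v]:
                heapq.heappush(heap, (wv, v))
    return total
```

The heap is consulted once per edge, which is where its per-operation cost enters the overall running time.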
- -A~real breakthrough was however made by Fredman and Willard, who introduced -the Fusion trees~\cite{fw:fusion}. They again perform membership and predecessor -operations on a~set of $n$~integers, but with time complexity $\O(\log_W n)$ -per operation on a~Word-RAM with $W$-bit words. This of course assumes that -each element of the set fits in a~single word. As $W$ must be at least~$\log n$, -the operations take $\O(\log n/\log\log n)$ time and thus we are able to sort $n$~integers -in time~$o(n\log n)$. This was further improved by Han and Thorup \cite{han:detsort,hanthor:randsort}. - -The Fusion trees themselves have very limited use in graph algorithms, but the -principles behind them are ubiquitous in many other data structures and these -will serve us well and often. We are going to build the theory of Q-heaps, -which will later lead to a~linear-time MST algorithm for arbitrary integer weights. -Other such structures will help us in building linear-time RAM algorithms for computing the ranks -of various combinatorial structures in Chapter~\ref{rankchap}. - -Outside our area, important consequences of RAM data structures include -Thorup's $\O(m)$ algorithm for single-source shortest paths in undirected -graphs with positive integer weights \cite{thorup:usssp} and his $\O(m\log\log -n)$ algorithm for the same problem in directed graphs \cite{thorup:sssp}. Both -algorithms were later significantly simplified by Hagerup -\cite{hagerup:sssp}. - -Despite the progress in recent years, the corner-stone of all RAM structures -is still the representation of combinatorial objects by integers introduced by -Fredman and Willard. 
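The flavour of these integer representations can be demonstrated by a~toy Python sketch (ours, not the thesis's actual construction): a~vector of $b$-bit numbers is packed into a~single integer with separator bits, and a~componentwise comparison of two such vectors is then answered by a~constant number of word operations:

```python
# Hypothetical demo (not from the thesis): packing a vector of b-bit numbers
# into one integer with separator bits, x = sum_i 2**((b+1)*i) * x_i, and
# testing x_i <= y_i for all components at once with O(1) word operations.

def encode(vec, b):
    """Pack b-bit fields into one integer, one field per (b+1)-bit slot."""
    return sum(x << ((b + 1) * i) for i, x in enumerate(vec))

def decode(word, d, b):
    """Unpack d fields of b bits each from the encoded integer."""
    mask = (1 << b) - 1
    return [(word >> ((b + 1) * i)) & mask for i in range(d)]

def le_mask(x, y, d, b):
    """Separator bit of field i is set in the result iff x_i <= y_i.

    Setting the separators of y and subtracting x makes each slot compute
    2**b + y_i - x_i independently, so the separator survives exactly when
    y_i >= x_i; no borrow crosses slot boundaries.
    """
    H = sum(1 << ((b + 1) * i + b) for i in range(d))   # separator bits
    return ((y | H) - x) & H
```

This is the pattern behind the word-parallel vector operations described next; a~real implementation would fix $b$ and $d$ so that $(b+1)d$ fits into the machine word.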
-First of all, we observe that we can encode vectors in integers: - -\notan{Bit strings}\id{bitnota}% -We will work with binary representations of natural numbers by strings over the -alphabet $\{\0,\1\}$: we will use $\(x)$ for the number~$x$ written in binary, -$\(x)_b$ for the same padded to exactly $b$ bits by adding leading zeroes, -and $x[k]$ for the value of the $k$-th bit of~$x$ (with a~numbering of bits such that $2^k[k]=1$). -The usual conventions for operations on strings will be utilized: When $s$ -and~$t$ are strings, we write $st$ for their concatenation and -$s^k$ for the string~$s$ repeated $k$~times. -When the meaning is clear from the context, -we will use $x$ and $\(x)$ interchangeably to avoid an~outbreak of symbols. - -\defn -The \df{bitwise encoding} of a~vector ${\bf x}=(x_0,\ldots,x_{d-1})$ of~$b$-bit numbers -is an~integer~$x$ such that $\(x)=\(x_{d-1})_b\0\(x_{d-2})_b\0\ldots\0\(x_0)_b$. In other -words, $x = \sum_i 2^{(b+1)i}\cdot x_i$. (We have interspersed the elements with \df{separator bits.}) - -\para -If we want to fit the whole vector in a~single machine word, the parameters $b$ and~$d$ must satisfy -the condition $(b+1)d\le W$ (where $W$~is the word size of the machine). -By using multiple-precision arithmetic, we can encode all vectors satisfying $bd=\O(W)$. -We describe how to translate simple vector manipulations to sequences of $\O(1)$ RAM operations -on their codes. For example, we can handle element-wise comparison of vectors, insertion -into a~sorted vector or shuffling elements of a~vector according to a~fixed permutation, -all in $\O(1)$ time. This also implies that several functions on numbers can be performed -in constant time, most notably binary logarithms. -The vector operations then serve as building blocks for the construction of the Q-heaps. We get: - -\thm -Let $W$ and~$k$ be positive integers such that $k=\O(W^{1/4})$. Let~$Q$ -be a~Q-heap of at most $k$~elements of $W$~bits each. 
Then we can perform -Q-heap operations on~$Q$ (insertion, deletion, search for a~given value and search -for the $i$-th smallest element) in constant time on a~Word-RAM with word size~$W$, -after spending time $\O(2^{k^4})$ on the same RAM on precomputing tables. - -\cor -For every positive integer~$r$ and $\delta>0$ there exists a~data structure -capable of maintaining the minimum of a~set of at most~$r$ word-sized numbers -under insertions and deletions. Each operation takes $\O(1)$ time on a~Word-RAM -with word size $W=\Omega(r^{\delta})$, after spending time -$\O(2^{r^\delta})$ on precomputing tables. - -\chapter{Advanced MST Algorithms} - -\section{Minor-closed graph classes}\id{minorclosed}% - -The contractive algorithm given in Section~\ref{contalg} performs -well on planar graphs, but in general its time complexity is not linear. -Can we find any broader class of graphs where the linear bound holds? -The right context turns out to be the minor-closed classes, which are -closed under contractions and have bounded density. - -\defn\id{minordef}% -A~graph~$H$ is a \df{minor} of a~graph~$G$ (written as $H\minorof G$) iff it can be obtained -from a~subgraph of~$G$ by a sequence of simple graph contractions. - -\defn -A~class~$\cal C$ of graphs is \df{minor-closed}, when for every $G\in\cal C$ and -every minor~$H$ of~$G$, the graph~$H$ lies in~$\cal C$ as well. A~class~$\cal C$ is called -\df{non-trivial} if at least one graph lies in~$\cal C$ and at least one lies outside~$\cal C$. - -\example -Non-trivial minor-closed classes include: -planar graphs, -graphs embeddable in any fixed surface (i.e., graphs of bounded genus), -graphs embeddable in~${\bb R}^3$ without knots or without interlocking cycles, -and graphs of bounded tree-width or path-width. 
- -\para -Many of the nice structural properties of planar graphs extend to -minor-closed classes, too (see Lov\'asz \cite{lovasz:minors} for a~survey -of this theory and Diestel \cite{diestel:gt} for some of the deeper results). -For the analysis of the contractive algorithm, we will make use of the bounded -density of minor-closed classes: - -\defn\id{density}% -Let $G$ be a~graph and $\cal C$ be a class of graphs. We define the \df{edge density} -$\varrho(G)$ of~$G$ as the average number of edges per vertex, i.e., $m(G)/n(G)$. The -edge density $\varrho(\cal C)$ of the class is then defined as the infimum of $\varrho(G)$ over all $G\in\cal C$. - -\thmn{Density of minor-closed classes, Mader~\cite{mader:dens}} -Every non-trivial minor-closed class of graphs has finite edge density. - -\thmn{MST on minor-closed classes, Mare\v{s} \cite{mm:mst}}\id{mstmcc}% -For any fixed non-trivial minor-closed class~$\cal C$ of graphs, the Contractive Bor\o{u}vka's -algorithm (\ref{contbor}) finds the MST of any graph of this class in time -$\O(n)$. (The constant hidden in the~$\O$ depends on the class.) - -\paran{Local contractions}\id{nobatch}% -The contractive algorithm uses ``batch processing'' to perform many contractions -in a single step. It is also possible to perform them one edge at a~time, -batching only the flattenings. A~contraction of an edge~$uv$ can be done in time~$\O(\deg(u))$, -so we have to make sure that there is a~steady supply of low-degree vertices. -Minor-closed classes indeed provide such a~supply: - -\lemman{Low-degree vertices}\id{lowdeg}% -Let $\cal C$ be a graph class with density~$\varrho$ and $G\in\cal C$ a~graph -with $n$~vertices. Then at least $n/2$ vertices of~$G$ have degree at most~$4\varrho$. - -This leads to the following algorithm: - -\algn{Local Bor\o{u}vka's Algorithm, Mare\v{s} \cite{mm:mst}}% -\algo -\algin A~graph~$G$ with an edge comparison oracle and a~parameter~$t\in{\bb N}$. -\:$T\=\emptyset$. -\:$\ell(e)\=e$ for all edges~$e$. 
-\:While $n(G)>1$: -\::While there exists a~vertex~$v$ such that $\deg(v)\le t$: -\:::Select the lightest edge~$e$ incident with~$v$. -\:::Contract~$e$. -\:::$T\=T + \ell(e)$. -\::Flatten $G$, removing parallel edges and loops. -\algout Minimum spanning tree~$T$. -\endalgo - -\thm -When $\cal C$ is a minor-closed class of graphs with density~$\varrho$, the -Local Bor\o{u}vka's Algorithm with the parameter~$t$ set to~$4\varrho$ -finds the MST of any graph from this class in time $\O(n)$. (The constant -in the~$\O$ depends on~the class.) - -\section{Iterated algorithms}\id{iteralg}% - -We have seen that the Jarn\'\i{}k's Algorithm \ref{jarnik} runs in $\Theta(m\log n)$ time. -Fredman and Tarjan \cite{ft:fibonacci} have shown a~faster implementation using their Fibonacci -heaps, which runs in time $\O(m+n\log n)$. This is $\O(m)$ whenever the density of the -input graph reaches $\Omega(\log n)$. This suggests that we could combine the algorithm with -another MST algorithm, which identifies a~subset of the MST edges and contracts -them to increase the density of the graph. For example, if we perform several Bor\o{u}vka -steps and then run the Jarn\'\i{}k's algorithm, we find the MST in time $\O(m\log\log n)$. - -Actually, there is a~much better choice of the algorithms to combine: use the -Jarn\'\i{}k's algorithm with a~Fibonacci heap multiple times, each time stopping it after a~while. -A~good choice of the stopping condition is to place a~limit on the size of the heap. -We start with an~arbitrary vertex, grow the tree as usual, and once the heap gets too large, -we conserve the current tree and start with a~different vertex and an~empty heap. When this -process runs out of vertices, it has identified a~sub-forest of the MST, so we can -contract the edges of~this forest and iterate. 
This improves the time complexity -significantly: - -\thm\id{itjarthm}% -The Iterated Jarn\'\i{}k's algorithm finds the MST of the input graph in time -$\O(m\times\beta(m,n))$, where $\beta(m,n):=\min\{ i \mid \log^{(i)}n \le m/n \}$. - -\cor -The Iterated Jarn\'\i{}k's algorithm runs in time $\O(m\log^* n)$. - -\paran{Integer weights}% -The algorithm spends most of its time in phases which have small heaps. Once the -heap grows to $\Omega(\log^{(k)} n)$ for any fixed~$k$, the graph gets dense enough -to guarantee that at most~$k$ phases remain. This means that if we are able to -construct a~heap of size $\Omega(\log^{(k)} n)$ with constant time per operation, -we can get a~linear-time algorithm for MST. This is the case when the weights are -integers (we can use the Q-heaps from Section~\ref{ramds}). - -\thmn{MST for integer weights, Fredman and Willard \cite{fw:transdich}}\id{intmst}% -MST of a~graph with integer edge weights can be found in time $\O(m)$ on the Word-RAM. - -\section{Verification of minimality}\id{verifysect}% - -Now we will turn our attention to a~slightly different problem: given a~spanning -tree, how to verify that it is minimum? We will show that this can be achieved -in linear time and it will serve as a~basis for a~randomized linear-time -MST algorithm in the next section. - -MST verification has been studied by Koml\'os \cite{komlos:verify}, who proved -that $\O(m)$ edge comparisons are sufficient, but his algorithm needed -super-linear time to find the edges to compare. Dixon, Rauch and Tarjan \cite{dixon:verify} -later showed that the overhead can be reduced -to linear time on the RAM using preprocessing and table lookup on small -subtrees. Later, King gave a~simpler algorithm in \cite{king:verifytwo}. - -To verify that a~spanning tree~$T$ is minimum, it is sufficient to check that all -edges outside~$T$ are $T$-heavy. 
For each edge $uv\in E\setminus T$, we will -find the heaviest edge of the tree path $T[u,v]$ (we will call it the \df{peak} -of the path) and compare its weight to $w(uv)$. We have therefore transformed -the MST verification into the problem of finding peaks for a~set of \df{query -paths} on a~given tree. By a~sequence of further transformations, we can even -assume that the given tree is \df{complete branching} (all leaves are on -the same level and internal vertices always have outdegree~2) and that the -query paths join a~vertex with one of its ancestors. - -Koml\'os gave a~simple algorithm that traverses the complete branching -tree recursively. At each moment, it maintains an~array of peaks of the restrictions -of the query paths to the subtree below the current vertex. If we account for the -comparisons performed by this algorithm carefully and express the bound in terms -of the size of the original problem (before all the transformations), we get: - -\thmn{Verification of the MST, Koml\'os \cite{komlos:verify}}\id{verify}% -For every weighted graph~$G$ and its spanning tree~$T$, it is sufficient to -perform $\O(m)$ comparisons of edge weights to determine whether~$T$ is minimum -and to find all $T$-light edges in~$G$. - -It remains to demonstrate that the overhead of the algorithm needed to find -the required comparisons and to infer the peaks from their results can be decreased, -so that it gets bounded by the number of comparisons and therefore also by $\O(m)$. -We will follow the idea of King from \cite{king:verifytwo}, but as we have the power -of the RAM data structures from Section~\ref{ramds} at our command, the low-level -details will be easier. 
Still, the construction is rather technical, so we omit -it from this abstract and state only the final theorem: - -\thmn{Verification of MST on the RAM}\id{ramverify}% -There is a~RAM algorithm which for every weighted graph~$G$ and its spanning tree~$T$ -determines whether~$T$ is minimum and finds all $T$-light edges in~$G$ in time $\O(m)$. - -\section{A randomized algorithm}\id{randmst}% - -When we analysed the Contractive Bor\o{u}vka's algorithm in Section~\ref{contalg}, -we observed that while the number of vertices decreases exponentially from phase to phase, -the number of edges generally does not, so we spend $\Theta(m)$ time on every phase. -Karger, Klein and Tarjan \cite{karger:randomized} have overcome this problem by -combining the Bor\o{u}vka's algorithm with filtering based on random sampling. -This leads to a~randomized algorithm which runs in linear expected time. - -The principle of the filtering is simple: Let us consider any spanning tree~$T$ -of the input graph~$G$. Each edge of~$G$ that is $T$-heavy is the heaviest edge -of some cycle, so by the Red lemma it cannot participate in -the MST of~$G$. We can therefore discard all $T$-heavy edges and continue with -finding the MST on the reduced graph. Of course, not all choices of~$T$ are equally -good, but it will soon turn out that when we take~$T$ as the MST of a~randomly selected -subgraph, only a~small expected number of edges remains: - -\lemman{Random sampling, Karger \cite{karger:sampling}}\id{samplemma}% -Let $H$~be a~subgraph of~$G$ obtained by including each edge independently -with probability~$p$. Let further $F$~be the minimum spanning forest of~$H$. Then the -expected number of $F$-nonheavy\foot{That is, $F$-light edges and also edges of~$F$ itself.} -edges in~$G$ is at most $n/p$. - -\para -We will formulate the algorithm as a~doubly-recursive procedure. It alternately -performs steps of the Bor\o{u}vka's algorithm and filtering based on the above lemma. 
-The first recursive call computes the MSF of the sampled subgraph, the second one -finds the MSF of the original graph, but without the heavy edges. - -\algn{MSF by random sampling --- the KKT algorithm}\id{kkt}% -\algo -\algin A~graph $G$ with an~edge comparison oracle. -\:Remove isolated vertices from~$G$. If no vertices remain, stop and return an~empty forest. -\:Perform two Bor\o{u}vka steps (iterations of Algorithm \ref{contbor}) on~$G$ and - remember the set~$B$ of the contracted edges. -\:Select a~subgraph~$H\subseteq G$ by including each edge independently with - probability $1/2$. -\:$F\=\msf(H)$ calculated recursively. -\:Construct $G'\subseteq G$ by removing all $F$-heavy edges of~$G$. -\:$R\=\msf(G')$ calculated recursively. -\:Return $R\cup B$. -\algout The minimum spanning forest of~$G$. -\endalgo - -\>A~careful analysis of this algorithm, based on properties of its recursion tree -and on the peak-finding algorithm of the previous section, yields the following time bounds: - -\thm -The KKT algorithm runs in time $\O(\min(n^2,m\log n))$ in the worst case on the RAM. -The expected time complexity is $\O(m)$. - -\chapter{Approaching Optimality}\id{optchap}% - -\section{Soft heaps}\id{shsect}% - -A~vast majority of MST algorithms that we have encountered so far is based on -the Tarjan's Blue rule (Lemma \ref{bluelemma}), the only exception being the -randomized KKT algorithm, which also used the Red rule (Lemma \ref{redlemma}). Recently, Chazelle -\cite{chazelle:ackermann} and Pettie \cite{pettie:ackermann} have presented new -deterministic algorithms for the MST which are also based on the combination of -both rules. They have reached worst-case time complexity -$\O(m\times\alpha(m,n))$ on the Pointer Machine. We will devote this chapter to -their results and especially to another algorithm by Pettie and Ramachandran -\cite{pettie:optimal} which is provably optimal. 
At the very heart of all these algorithms lies the \df{soft heap} discovered by
Chazelle \cite{chazelle:softheap}. It is a~meldable priority queue, roughly
similar to the Vuillemin's binomial heaps \cite{vuillemin:binheap} or Fredman's
and Tarjan's Fibonacci heaps \cite{ft:fibonacci}. The soft heaps run faster at
the expense of \df{corrupting} a~fraction of the inserted elements by raising
their values (the values are however never lowered). This allows for
a~trade-off between accuracy and speed, controlled by a~parameter~$\varepsilon$.

In the thesis, we describe the exact mechanics of the soft heaps and analyse their complexity.
The important properties are characterized by the following theorem:

\thmn{Performance of soft heaps, Chazelle \cite{chazelle:softheap}}\id{softheap}%
A~soft heap with error rate~$\varepsilon$ ($0<\varepsilon\le 1/2$) processes
a~sequence of operations starting with an~empty heap and containing $n$~\<Insert>s
in time $\O(n\log(1/\varepsilon))$ on the Pointer Machine. At every moment, the
heap contains at most $\varepsilon n$ corrupted items.

\section{Robust contractions}

Having the soft heaps at hand, we would like to use them in a~conventional MST
algorithm in place of a~normal heap. We can for example try implanting the soft heap
in the Jarn\'\i{}k's algorithm, preferably in the earlier
version without Fibonacci heaps as the soft heaps lack the \<Decrease> operation.
This brave, but somewhat simple-minded attempt is however doomed to
fail because of corruption of items inside the soft heap.
While the basic structural properties of MST's no longer hold in corrupted graphs,
there is a~weaker form of the Contraction lemma that takes the corrupted edges into account.
Before we prove this lemma, we expand our awareness of subgraphs which can be contracted.
\defn
A~subgraph $C\subseteq G$ is \df{contractible} iff for every pair of edges $e,f\in\delta(C)$\foot{That is,
of~$G$'s edges with exactly one endpoint in~$C$.} there exists a~path in~$C$ connecting the endpoints
of the edges $e,f$ such that all edges on this path are lighter than either $e$ or~$f$.

For example, when we stop the Jarn\'\i{}k's algorithm at some moment and
take a~subgraph~$C$ induced by the constructed tree, this subgraph is contractible.
We can now easily reformulate the Contraction lemma (\ref{contlemma}) in the language
of contractible subgraphs:

\lemman{Generalized contraction}
When~$C\subseteq G$ is a~contractible subgraph, then $\msf(G)=\msf(C) \cup \msf(G/C)$.

Let us bring corruption back to the game and state a~``robust'' version
of this lemma.

\nota\id{corrnota}%
When~$G$ is a~weighted graph and~$R$ a~subset of its edges, we will use $G\crpt R$
to denote an arbitrary graph obtained from~$G$ by increasing the weights of
some of the edges in~$R$.
Whenever~$C$ is a~subgraph of~$G$, we will use $R^C$ to refer to the edges of~$R$ with
exactly one endpoint in~$C$ (i.e., $R^C = R\cap \delta(C)$).

\lemman{Robust contraction, Chazelle \cite{chazelle:almostacker}}\id{robcont}%
Let $G$ be a~weighted graph and $C$~its subgraph contractible in~$G\crpt R$
for some set~$R$ of edges. Then $\msf(G) \subseteq \msf(C) \cup \msf((G/C) \setminus R^C) \cup R^C$.

\para
We will now mimic the Iterated Jarn\'\i{}k's algorithm. We partition the given graph into a~collection~$\C$
of non-overlapping contractible subgraphs called \df{clusters} and put aside all edges that got corrupted in the process.
We recursively compute the MSF of those subgraphs and of the contracted graph. Then we take the
union of these MSF's and add the corrupted edges. According to the previous lemma, this does not produce
the MSF of~$G$, but a~sparser graph containing it, on which we can continue.
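The contractibility condition can be checked directly on small examples by brute force. A~minimal sketch, reading ``lighter than either $e$ or~$f$'' as lighter than at least one of the two boundary edges (i.e., below $\max(w(e),w(f))$):

```python
from itertools import combinations
from collections import deque

def is_contractible(C, edges):
    """Brute-force test of the contractibility condition.
    C is a set of vertices, edges a list of (weight, u, v) triples."""
    boundary = [(w, u, v) for w, u, v in edges if (u in C) != (v in C)]
    inner = [(w, u, v) for w, u, v in edges if u in C and v in C]
    for (we, ue, ve), (wf, uf, vf) in combinations(boundary, 2):
        start = ue if ue in C else ve      # endpoint of e inside C
        goal = uf if uf in C else vf       # endpoint of f inside C
        limit = max(we, wf)
        # BFS inside C restricted to edges lighter than the limit
        seen, queue = {start}, deque([start])
        while queue:
            x = queue.popleft()
            for w, u, v in inner:
                if w < limit and x in (u, v):
                    y = v if x == u else u
                    if y not in seen:
                        seen.add(y)
                        queue.append(y)
        if goal not in seen:
            return False
    return True
```

For instance, a~two-vertex cluster joined by an edge of weight~1, with boundary edges of weights 2 and~3, passes the test; raising the inner edge to weight~5 makes the required path too heavy and the test fails.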
%%The following theorem describes the properties of this partition:

\thmn{Partitioning to contractible clusters, Chazelle \cite{chazelle:almostacker}}\id{partthm}%
Given a~weighted graph~$G$ and parameters $\varepsilon$ ($0<\varepsilon\le 1/2$)
and~$t$, we can construct a~collection $\C=\{C_1,\ldots,C_k\}$ of clusters and a~set~$R^\C$ of edges such that:

\numlist\ndotted
\:All the clusters and the set~$R^\C$ are mutually edge-disjoint.
\:Each cluster contains at most~$t$ vertices.
\:Each vertex of~$G$ is contained in at least one cluster.
\:The connected components of the union of all clusters have at least~$t$ vertices each,
  except perhaps for those which are equal to a~connected component of $G\setminus R^\C$.
\:$\vert R^\C\vert \le 2\varepsilon m$.
\:$\msf(G) \subseteq \bigcup_i \msf(C_i) \cup \msf\bigl((G / \bigcup_i C_i) \setminus R^\C\bigr) \cup R^\C$.
\:The construction takes $\O(n+m\log (1/\varepsilon))$ time.
\endlist

\section{Decision trees}\id{dtsect}%

The Pettie's and Ramachandran's algorithm combines the idea of robust partitioning with optimal decision
trees constructed by brute force for very small subgraphs.
%%Formally, the decision trees are defined as follows:
Let us define them first:

\defnn{Decision trees and their complexity}\id{decdef}%
An~\df{MSF decision tree} for a~graph~$G$ is a~binary tree. Its internal vertices
are labeled with pairs of $G$'s edges to be compared, and each of the two outgoing tree edges
corresponds to one possible result of the comparison.
Leaves of the tree are labeled with spanning trees of the graph~$G$.

A~\df{computation} of the decision tree on a~specific permutation of edge weights
in~$G$ is the path from the root to a~leaf such that the outcome of every comparison
agrees with the edge weights. The result of the computation is the spanning tree
assigned to its final leaf.
A~decision tree is \df{correct} iff for every permutation the corresponding
computation results in the real MSF of~$G$ with the particular weights.

The \df{time complexity} of a~decision tree is defined as its depth. It therefore
bounds the number of comparisons spent on every path. (It need not be equal since
some paths need not correspond to an~actual computation --- the sequence of outcomes
on the path could be unsatisfiable.)

A~decision tree is called \df{optimal} if it is correct and its depth is minimum possible
among the correct decision trees for the given graph.
We will denote an~arbitrary optimal decision tree for~$G$ by~${\cal D}(G)$ and its
complexity by~$D(G)$.

The \df{decision tree complexity} $D(m,n)$ of the MSF problem is the maximum of~$D(G)$
over all graphs~$G$ with $n$~vertices and~$m$ edges.

\obs
Decision trees are the most general deterministic comparison-based computation model possible.
The only operations that count in their time complexity are comparisons. All
other computation is free, including solving NP-complete problems or having
access to an~unlimited source of non-uniform constants. The decision tree
complexity is therefore an~obvious lower bound on the time complexity of the
problem in all other comparison-based models.

The downside is that we do not know any explicit construction of the optimal
decision trees, nor even a~non-constructive proof of their complexity.
On the other hand, the complexity of any existing comparison-based algorithm
can be used as an~upper bound on the decision tree complexity. Also, we can
construct an~optimal decision tree using brute force:

\lemma
An~optimal MST decision tree for a~graph~$G$ on~$n$ vertices can be constructed on
the Pointer Machine in time $\O(2^{2^{4n^2}})$.
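Although optimal decision trees are not known explicitly, simple lower bounds on $D(G)$ are easy to compute. Every spanning tree of a~connected~$G$ is the unique MSF under some permutation of edge weights (give its edges the smallest weights), so a~correct decision tree needs at least one leaf per spanning tree, and $D(G) \ge \lceil\log_2(\hbox{number of spanning trees})\rceil$. An~illustrative sketch, counting spanning trees by Kirchhoff's matrix-tree theorem:

```python
from fractions import Fraction
from math import ceil, log2

def spanning_tree_count(n, edges):
    """Number of spanning trees via the matrix-tree theorem: determinant of
    the Laplacian with one row and column deleted (exact Fraction arithmetic)."""
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    m = [row[1:] for row in lap[1:]]     # delete row and column 0
    det, size = Fraction(1), n - 1
    for i in range(size):                # Gaussian elimination
        pivot = next((r for r in range(i, size) if m[r][i] != 0), None)
        if pivot is None:
            return 0
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            det = -det
        det *= m[i][i]
        for r in range(i + 1, size):
            factor = m[r][i] / m[i][i]
            for c in range(i, size):
                m[r][c] -= factor * m[i][c]
    return int(det)

def depth_lower_bound(n, edges):
    """Leaf-counting lower bound on D(G); G is assumed connected."""
    return ceil(log2(spanning_tree_count(n, edges)))
```

For $K_4$ there are $4^{2}=16$ spanning trees, so any correct MSF decision tree for it has depth at least~4.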
\section{An optimal algorithm}\id{optalgsect}%

Once we have developed the soft heaps, partitioning and MST decision trees,
it is now simple to state the Pettie's and Ramachandran's MST algorithm
and prove that it is asymptotically optimal among all MST algorithms in
comparison-based models. Several standard MST algorithms from the previous
chapters will also play their roles.
We will describe the algorithm as a~recursive procedure:

\algn{Optimal MST algorithm, Pettie and Ramachandran \cite{pettie:optimal}}\id{optimal}%
\algo
\algin A~connected graph~$G$ with an~edge comparison oracle.
\:If $G$ has no edges, return an~empty tree.
\:$t\=\lfloor\log^{(3)} n\rfloor$. \cmt{the size of clusters}
\:Call the partitioning procedure (\ref{partthm}) on $G$ and $t$ with $\varepsilon=1/8$. It returns
  a~collection~$\C=\{C_1,\ldots,C_k\}$ of clusters and a~set~$R^\C$ of corrupted edges.
\:$F_i \= \mst(C_i)$ for all~$i$, obtained using optimal decision trees.
\:$G_A \= (G / \bigcup_i C_i) \setminus R^\C$. \cmt{the contracted graph}
\:$F_A \= \msf(G_A)$ calculated by the Iterated Jarn\'\i{}k's algorithm (see Section \ref{iteralg}).
\:$G_B \= \bigcup_i F_i \cup F_A \cup R^\C$. \cmt{combine subtrees with corrupted edges}
\:Run two Bor\o{u}vka steps (iterations of the Contractive Bor\o{u}vka's algorithm, \ref{contbor}) on~$G_B$,
  getting a~contracted graph~$G_C$ and a~set~$F_B$ of MST edges.
\:$F_C \= \mst(G_C)$ obtained by a~recursive call to this algorithm.
\:Return $F_B \cup F_C$.
\algout The minimum spanning tree of~$G$.
\endalgo

\>Correctness of this algorithm immediately follows from the Partitioning theorem (\ref{partthm})
and from the proofs of the respective algorithms used as subroutines. As for time complexity, we prove:

\thm
The time complexity of the Optimal algorithm is $\Theta(D(m,n))$.
\paran{Complexity of MST}%
As we have already noted, the exact decision tree complexity $D(m,n)$ of the MST problem
is still open, and so is the time complexity of the optimal algorithm. However,
every time we come up with another comparison-based algorithm, we can use its complexity
(or more specifically the number of comparisons it performs, which can be even lower)
as an~upper bound on the optimal algorithm.
The best explicit comparison-based algorithm known to date has been discovered by Chazelle
\cite{chazelle:ackermann} and independently by Pettie \cite{pettie:ackermann}. It achieves complexity $\O(m\timesalpha(m,n))$.
Using any of these results, we can prove an~Ackermannian upper bound on the
optimal algorithm:

\thm
The time complexity of the Optimal algorithm is $\O(m\timesalpha(m,n))$.

\chapter{Dynamic Spanning Trees}\id{dynchap}%

\section{Dynamic graph algorithms}

In many applications, we often need to solve a~certain graph problem for a~sequence of graphs that
differ only a~little, so recomputing the solution for every graph from scratch would be a~waste of
time. In such cases, we usually turn our attention to \df{dynamic graph algorithms.} A~dynamic
algorithm is in fact a~data structure that remembers a~graph. It offers operations that modify the
structure of the graph and also operations that query the result of the problem for the current
state of the graph. A~typical example of a~problem of this kind is dynamic maintenance of connected
components:

\problemn{Dynamic connectivity}
Maintain an~undirected graph under a~sequence of the following operations:
\itemize\ibull
\:$\<Init>(n)$ --- Create a~graph with $n$~isolated vertices $\{1,\ldots,n\}$.
(It is possible to modify the structure to support dynamic addition and removal of vertices, too.)
\:$\<Insert>(G,u,v)$ --- Insert an~edge $uv$ to~$G$ and return its unique
identifier. This assumes that the edge did not exist yet.
\:$\<Delete>(G,e)$ --- Delete an~edge specified by its identifier from~$G$.
\:$\<Connected>(G,u,v)$ --- Test if vertices $u$ and~$v$ are in the same connected component of~$G$.
\endlist

\>In this chapter, we will focus on the dynamic version of the minimum spanning forest.
This problem seems to be intimately related to the dynamic connectivity. Indeed, all known
algorithms for dynamic connectivity maintain some sort of a~spanning forest.
This suggests that a~dynamic MSF algorithm could be obtained by modifying the
mechanics of the data structure to keep the forest minimum.
We however have to answer one important question first: What should be the output of
our MSF data structure? Adding an~operation that returns the MSF of the current
graph would of course be possible, but somewhat impractical as this operation would have to
spend $\Omega(n)$ time on the mere writing of its output. A~better way seems to
be making the \<Insert> and \<Delete> operations report the list of modifications
of the MSF implied by the change in the graph. It is easy to prove that $\O(1)$
modifications always suffice, so we can formulate our problem as follows:

\problemn{Dynamic minimum spanning forest}
Maintain an~undirected graph with distinct weights on edges (drawn from a~totally ordered set)
and its minimum spanning forest under a~sequence of the following operations:
\itemize\ibull
\:$\<Init>(n)$ --- Create a~graph with $n$~isolated vertices $\{1,\ldots,n\}$.
\:$\<Insert>(G,u,v,w)$ --- Insert an~edge $uv$ of weight~$w$ to~$G$. Return its unique
  identifier and the list of additions and deletions of edges in $\msf(G)$.
\:$\<Delete>(G,e)$ --- Delete an~edge specified by its identifier from~$G$.
  Return the list of additions and deletions of edges in $\msf(G)$.
\endlist

\paran{Incremental MSF}%
In case only edge insertions are allowed, the problem reduces to finding the heaviest
edge (peak) on the tree path covered by the newly inserted edge and replacing the peak
if needed.
This can be handled quite efficiently by using the Link-Cut trees of Sleator
and Tarjan \cite{sleator:trees}. We obtain a~logarithmic time bound:

\thmn{Incremental MSF}
When only edge insertions are allowed, the dynamic MSF can be maintained in time $\O(\log n)$
amortized per operation.

\section{Dynamic connectivity}

The fully dynamic connectivity problem has a~long and rich history. In the 1980's, Frederickson \cite{frederickson:dynamic}
used his topological trees to construct a~dynamic connectivity algorithm of complexity $\O(\sqrt m)$ per update and
$\O(1)$ per query. Eppstein et al.~\cite{eppstein:sparsify} have introduced a~sparsification technique which can bring the
updates down to $\O(\sqrt n)$. Later, several different algorithms with complexity on the order of $n^\varepsilon$
were presented by Henzinger and King \cite{henzinger:mst} and also by Mare\v{s} \cite{mares:dga}.
A~polylogarithmic time bound was first reached by the randomized algorithm of Henzinger and King \cite{henzinger:randdyn}.
The best result known as of now is the $\O(\log^2 n)$ time deterministic algorithm by Holm,
de~Lichtenberg and Thorup \cite{holm:polylog}, which we will describe in this section.

The algorithm will maintain a~spanning forest~$F$ of the current graph~$G$, represented by an~ET-tree,
which will be used to answer connectivity queries. The edges of~$G\setminus F$ will be stored as~non-tree
edges in the ET-tree. Hence, an~insertion of an~edge to~$G$ either adds it to~$F$ or inserts it as non-tree.
Deletions of non-tree edges are also easy, but when a~tree edge is deleted, we have to search for its
replacement among the non-tree edges.

To govern the search in an~efficient way, we will associate each edge~$e$ with a~level $\ell(e) \le
L = \lfloor\log_2 n\rfloor$. For each level~$i$, we will use~$F_i$ to denote the subforest
of~$F$ containing edges of level at least~$i$. Therefore $F=F_0 \supseteq F_1 \supseteq \ldots \supseteq F_L$.
We will maintain the following \em{invariants:}

{\narrower
\def\iinv{{\bo I\the\itemcount~}}
\numlist\iinv
\:$F$~is the maximum spanning forest of~$G$ with respect to the levels. (In other words,
if $uv$ is a~non-tree edge, then $u$ and~$v$ are connected in~$F_{\ell(uv)}$.)
\:For each~$i$, the components of~$F_i$ have at most $\lfloor n/2^i \rfloor$ vertices each.
(This implies that it does not make sense to define~$F_i$ for $i>L$, because it would be empty
anyway.)
\endlist
}

At the beginning, the graph contains no edges, so both invariants are trivially
satisfied. Newly inserted edges enter level~0, which cannot break I1 nor~I2.

When we delete a~tree edge at level~$\ell$, we split a~tree~$T$ of~$F_\ell$ into two
trees $T_1$ and~$T_2$. Without loss of generality, let us assume that $T_1$ is the
smaller one. We will try to find the replacement edge of the highest possible
level that connects the spanning tree back. From I1, we know that such an~edge cannot belong to
a~level greater than~$\ell$, so we start looking for it at level~$\ell$. According
to~I2, the tree~$T$ had at most $\lfloor n/2^\ell\rfloor$ vertices, so $T_1$ has
at most $\lfloor n/2^{\ell+1} \rfloor$ of them. Thus we can move all level~$\ell$
edges of~$T_1$ to level~$\ell+1$ without violating either invariant.

We now start enumerating the non-tree edges incident with~$T_1$. Each such edge
either is local to~$T_1$ or joins $T_1$ with~$T_2$. We therefore check for each edge
whether its other endpoint lies in~$T_2$ and if it does, we have found the replacement
edge, so we insert it to~$F_\ell$ and stop. Otherwise we move the edge one level up. (This
will be the grist for the mill of our amortization argument: We can charge most of the work on level
increases and we know that the level of each edge can reach at most~$L$.)

If the non-tree edges at level~$\ell$ are exhausted, we try the same in the next
lower level and so on.
If there is no replacement edge at level~0, the tree~$T$
remains disconnected.

The implementation uses the Eulerian Tour trees of Henzinger and King \cite{henzinger:randdyn}
to represent the forests~$F_\ell$ together with the non-tree edges at each particular level.
A~simple amortized analysis using the levels yields the following result:

\thmn{Fully dynamic connectivity, Holm et al.~\cite{holm:polylog}}\id{dyncon}%
Dynamic connectivity can be maintained in time $\O(\log^2 n)$ amortized per
\<Insert> and \<Delete> and in time $\O(\log n/\log\log n)$ per \<Connected>
in the worst case.

\rem\id{dclower}%
An~$\Omega(\log n/\log\log n)$ lower bound for the amortized complexity of the dynamic connectivity
problem has been proven by Henzinger and Fredman \cite{henzinger:lowerbounds} in the cell
probe model with $\O(\log n)$-bit words. Thorup has answered with a~faster algorithm
\cite{thorup:nearopt} that achieves $\O(\log n\log^3\log n)$ time per update and
$\O(\log n/\log^{(3)} n)$ per query on a~RAM with $\O(\log n)$-bit words. (He claims
that the algorithm runs on a~Pointer Machine, but it uses arithmetic operations,
so it does not fit the definition of the PM we use. The algorithm merely does not
need direct indexing of arrays.) So far, it is not known how to extend this algorithm
to fit our needs, so we omit the details.

\section{Dynamic spanning forests}\id{dynmstsect}%

Let us turn our attention back to the dynamic MSF.
Most of the early algorithms for dynamic connectivity also imply $\O(n^\varepsilon)$
algorithms for dynamic maintenance of the MSF. Henzinger and King \cite{henzinger:twoec,henzinger:randdyn}
have generalized their randomized connectivity algorithm to maintain the MSF in $\O(\log^5 n)$ time per
operation, or $\O(k\log^3 n)$ if only $k$ different values of edge weights are allowed.
They have solved
the decremental version of the problem first (which starts with a~given graph and allows only
edge deletions) and then presented a~general reduction from the fully dynamic MSF to its decremental version.
We will describe the algorithm of Holm, de Lichtenberg and Thorup \cite{holm:polylog}, who have followed
the same path. They have modified their dynamic connectivity algorithm to solve the decremental MSF
in $\O(\log^2 n)$ and obtained the fully dynamic MSF working in $\O(\log^4 n)$ per operation.

\paran{Decremental MSF}%
Turning the algorithm from the previous section to the decremental MSF requires only two
changes: First, we have to start with the forest~$F$ equal to the MSF of the initial
graph. As we can afford to pay $\O(\log^2 n)$ for every insertion, we can use an~almost arbitrary
MSF algorithm to find~$F$. Second, when we search for a~replacement edge, we need to pick
the lightest possible choice. We will therefore use a~weighted version of the ET-trees.
We must ensure that the lower levels cannot contain a~lighter replacement edge,
but fortunately the light edges tend to ``bubble up'' in the hierarchy of
levels. This can be formalized in the form of the following invariant:

{\narrower
\def\iinv{{\bo I\the\itemcount~}}
\numlist\iinv
\itemcount=2
\:On every cycle, the heaviest edge has the smallest level.
\endlist
}

\>This immediately implies that we always select the right replacement edge:

\lemma\id{msfrepl}%
Let $F$~be the minimum spanning forest and $e$ any of its edges. Then among all replacement
edges for~$e$, the lightest one is at the maximum level.

A~brief analysis also shows that the invariant I3 is observed by all operations
on the structure.
We can conclude:

\thmn{Decremental MSF, Holm et al.~\cite{holm:polylog}}
When we start with a~graph on $n$~vertices with~$m$ edges and we perform a~sequence of
edge deletions, the MSF can be initialized in time $\O((m+n)\cdot\log^2 n)$ and then
updated in time $\O(\log^2 n)$ amortized per operation.

\paran{Fully dynamic MSF}%
The decremental MSF algorithm can be turned into a~fully dynamic one by a~blackbox
reduction of Holm et al.:

\thmn{MSF dynamization, Holm et al.~\cite{holm:polylog}}
Suppose that we have a~decremental MSF algorithm with the following properties:
\numlist\ndotted
\:For any $a$,~$b$, it can be initialized on a~graph with~$a$ vertices and~$b$ edges.
\:Then it executes an~arbitrary sequence of deletions in time $\O(b\cdot t(a,b))$, where~$t$ is a~non-decreasing function.
\endlist
\>Then there exists a~fully dynamic MSF algorithm for a~graph on $n$~vertices, starting
with no edges, that performs $m$~insertions and deletions in amortized time:
$$
\O\left( \log^3 n + \sum_{i=1}^{\log m} \sum_{j=1}^i \; t(\min(n,2^j), 2^j) \right) \hbox{\quad per operation.}
$$

\corn{Fully dynamic MSF}\id{dynmsfcorr}%
There is a~fully dynamic MSF algorithm that works in time $\O(\log^4 n)$ amortized
per operation for graphs on $n$~vertices.

\paran{Dynamic MSF with limited edge weights}%
If the set from which the edge weights are drawn is small, we can take a~different
approach. If only two values are allowed, we split the graph into subgraphs $G_1$ and~$G_2$
induced by the edges of the respective weights and we maintain separate connectivity
structures (together with a~spanning tree) for $G_1$ and $G_2 \cup T_1$ (where $T_1$
is a~spanning tree of~$G_1$). We can easily modify the structure for $G_2\cup
T_1$ to prefer the edges of~$T_1$. This ensures that the spanning tree of $G_2\cup T_1$
will be the MST of the whole~$G$.
If there are more possible values, we simply iterate this construction: the $i$-th
structure contains edges of weight~$i$ and the edges of the spanning tree from the
$(i-1)$-th structure. We get:

\thmn{MSF with limited edge weights}
There is a~fully dynamic MSF algorithm that works in time $\O(k\cdot\log^2 n)$ amortized
per operation for graphs on $n$~vertices with only $k$~distinct edge weights allowed.

\section{Almost minimum trees}\id{kbestsect}%

In some situations, finding the single minimum spanning tree is not enough and we are interested
in the $K$~lightest spanning trees, usually for some small value of~$K$. Katoh, Ibaraki
and Mine \cite{katoh:kmin} have given an~algorithm of time complexity $\O(m\log\beta(m,n) + Km)$,
building on the MST algorithm of Gabow et al.~\cite{gabow:mst}.
Subsequently, Eppstein \cite{eppstein:ksmallest} has discovered an~elegant preprocessing step which allows the
running time to be reduced to $\O(m\log\beta(m,n) + \min(K^2,Km))$ by eliminating edges
which are either present in all $K$ trees or in none of them.
We will show a~variant of their algorithm based on the MST verification
procedure of Section~\ref{verifysect}.

In this section, we will require the edge weights to be numeric, because
comparisons are certainly not sufficient to determine the second best spanning tree. We will
assume that our computation model is able to add, subtract and compare the edge weights
in constant time.
Let us focus on finding the second lightest spanning tree first.

\paran{Second lightest spanning tree}%
Suppose that we have a~weighted graph~$G$ and a~sequence $T_1,\ldots,T_z$ of all its spanning
trees. Also suppose that the weights of these spanning trees are distinct and that the sequence
is ordered by weight, i.e., $w(T_1) < \ldots < w(T_z)$ and $T_1 = \mst(G)$.
Let us observe
that each tree is similar to at least one of its predecessors:

\lemman{Difference lemma}\id{kbl}%
For each $i>1$ there exists $j<i$ such that $T_i$ and~$T_j$ differ in a~single edge exchange.

\>In particular, the second lightest tree~$T_2$ differs from~$T_1$ by a~single edge exchange:
for every non-tree edge~$f$, the cheapest exchange replaces the heaviest edge (peak) of the
tree path covered by~$f$ with~$f$ itself. The MST verification procedure of Section~\ref{verifysect}
finds all the peaks in time $\O(m)$, so the best exchange, and therefore~$T_2$, can be found
in time $\O(m)$ as well.

\paran{Third lightest spanning tree}%
Let~$e$ denote the edge of~$T_1$ which was exchanged when creating~$T_2$. The spanning trees
of~$G$ can be split into those which contain~$e$ and those which do not, so we construct two
auxiliary graphs: $G_1 = G/e$, whose spanning trees correspond to the spanning trees of~$G$
containing~$e$ (the lightest of them being~$T_1$), and $G_2 = G\setminus e$, whose spanning
trees are exactly the spanning trees of~$G$ avoiding~$e$ (the lightest of them being~$T_2$).

\obs\id{tbobs}%
Every spanning tree of~$G$ corresponds to a~spanning tree of exactly one of the graphs $G_1$
and~$G_2$.

\>Thus we can run the previous algorithm for finding the best edge exchange
on both~$G_1$ and~$G_2$ and find~$T_3$ again in time $\O(m)$.

\paran{Further spanning trees}%
The construction of auxiliary graphs can be iterated to obtain $T_1,\ldots,T_K$
for an~arbitrary~$K$. We will build a~\df{meta-tree} of auxiliary graphs. Each node of this meta-tree
carries a~graph and its minimum spanning tree. The root node contains~$(G,T_1)$,
its sons have $(G_1,T_1/e)$ and $(G_2,T_2)$. When $T_3$ is obtained by an~exchange
in one of these sons, we attach two new leaves to that son and we let them carry the two auxiliary
graphs derived by contracting or deleting the exchanged edge. Then we find the best
edge exchanges among all leaves of the new meta-tree and repeat the process. By Observation \ref{tbobs},
each spanning tree of~$G$ is generated exactly once. The Difference lemma guarantees that
the trees are enumerated in the increasing order. So we get:

\lemma\id{kbestl}%
Given~$G$ and~$T_1$, we can find $T_2,\ldots,T_K$ in time $\O(Km + K\log K)$.

\paran{Invariant edges}%
Our algorithm can be further improved for small values of~$K$ (which seems to be the common
case in most applications) by the reduction of Eppstein \cite{eppstein:ksmallest}.
He has proven that there are many edges of~$T_1$
which are guaranteed to be contained in $T_2,\ldots,T_K$ as well, and likewise there are
many edges of $G\setminus T_1$ which are excluded from all those spanning trees.
When we combine this with the previous construction, we get the following theorem:

\thmn{Finding $K$ lightest spanning trees}\id{kbestthm}%
For a~given graph~$G$ with real edge weights and a~positive integer~$K$, the $K$~best spanning trees can be found
in time $\O(m\timesalpha(m,n) + \min(K^2,Km + K\log K))$.
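The exchange-based search for the second lightest spanning tree can be sketched in Python. This is a~didactic stand-in: the peaks are found by a~naive search in the tree instead of the MST verification procedure, and the distinct edge weights double as edge identifiers.

```python
def mst(n, edges):
    """Kruskal's algorithm; edges are (weight, u, v) with distinct weights."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    tree = []
    for e in sorted(edges):
        ru, rv = find(e[1]), find(e[2])
        if ru != rv:
            parent[ru] = rv
            tree.append(e)
    return tree

def second_lightest(n, edges):
    """T_2 via the best single edge exchange: every non-tree edge f is swapped
    with the heaviest edge (peak) on the tree path it covers; the cheapest of
    these exchanges wins.  Assumes a connected G with at least one cycle."""
    t1 = mst(n, edges)
    adj = {}
    for w, u, v in t1:
        adj.setdefault(u, []).append((w, v))
        adj.setdefault(v, []).append((w, u))

    def peak(u, v):
        """Weight of the heaviest T_1 edge on the path u-v (DFS in the tree)."""
        stack = [(u, None, 0)]
        while stack:
            x, par, mx = stack.pop()
            if x == v:
                return mx
            for w, y in adj.get(x, ()):
                if y != par:
                    stack.append((y, x, max(mx, w)))

    best = None
    for f in edges:
        if f in t1:
            continue
        drop = peak(f[1], f[2])          # weight of the exchanged tree edge
        if best is None or f[0] - drop < best[0]:
            best = (f[0] - drop, drop, f)
    _, drop, f = best
    return [e for e in t1 if e[0] != drop] + [f]
```

On a~4-cycle with a~chord, the procedure correctly replaces the peak of the path covered by the cheapest non-tree edge.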
\chapter{Ranking Combinatorial Structures}\id{rankchap}%

\section{Ranking and unranking}\id{ranksect}%

The techniques for building efficient data structures on the RAM, which we have described
in Section~\ref{ramds}, can also be used for a~variety of problems related
to ranking of combinatorial structures. Generally, the problems are stated
in the following way:

\defn\id{rankdef}%
Let~$C$ be a~set of objects and~$\prec$ a~linear order on~$C$. The \df{rank}
$R_{C,\prec}(x)$ of an~element $x\in C$ is the number of elements $y\in C$ such that $y\prec x$.
We will call the function $R_{C,\prec}$ the \df{ranking function} for $C$ ordered by~$\prec$
and its inverse $R^{-1}_{C,\prec}$ the \df{unranking function} for $C$ and~$\prec$. When the set
and the order are clear from the context, we will use plain~$R(x)$ and $R^{-1}(x)$.
Also, when $\prec$ is defined on a~superset~$C'$ of~$C$, we naturally extend $R_C(x)$
to elements $x\in C'\setminus C$.

\example
Let us consider the set $C_k=\{\0,\1\}^k$ of all binary strings of length~$k$ ordered
lexicographically. Then $R^{-1}(i)$ is the $i$-th smallest element of this set, that
is the number~$i$ written in binary and padded to~$k$ digits (i.e., $\(i)_k$ in the
notation of Section~\ref{ramds}). Obviously, $R(x)$ is the integer whose binary
representation is the string~$x$.

%--------------------------------------------------------------------------------

\section{Ranking of permutations}
\id{pranksect}

One of the most common ranking problems is ranking of permutations on the set~$[n]=\{1,2,\ldots,n\}$.
This is frequently used to create arrays indexed by permutations: for example in Ruskey's algorithm
for finding Hamilton cycles in Cayley graphs (see~\cite{ruskey:ham} and \cite{ruskey:hce})
or when exploring state spaces of combinatorial puzzles like the Loyd's Fifteen \cite{ss:fifteen}.
Many other applications are surveyed by Critani et al.~\cite{critani:rau} and in
most cases, the time complexity of the whole algorithm is limited by the efficiency
of the (un)ranking functions.

The permutations are usually ranked according to their lexicographic order.
In fact, an~arbitrary order is often sufficient if the ranks are used solely
for indexing of arrays. The lexicographic order however has an~additional advantage
of a~nice structure, which allows various operations on permutations to be
performed directly on their ranks.

Na\"\i{}ve algorithms for lexicographic ranking require time $\Theta(n^2)$ in the
worst case \cite{reingold:catp} and even on average~\cite{liehe:raulow}.
This can be easily improved to $\O(n\log n)$ by using a~binary search
tree to calculate inversions, by a~divide-and-conquer technique, or by clever
use of modular arithmetic (all three algorithms are described in Knuth
\cite{knuth:sas}). Myrvold and Ruskey \cite{myrvold:rank} mention further
improvements to $\O(n\log n/\log \log n)$ by using the RAM data structures of Dietz
\cite{dietz:oal}.

Linear time complexity was reached by Myrvold and Ruskey \cite{myrvold:rank}
for a~non-lexicographic order, which is defined locally by the history of the
data structure.
However, they leave the problem of lexicographic ranking open.
We will describe a~general procedure which, when combined with suitable
RAM data structures, yields a~linear-time algorithm for lexicographic
(un)ranking.

\nota\id{brackets}%
We will view permutations on a~finite set $A\subseteq {\bb N}$ as ordered $\vert A\vert$-tuples
(in other words, arrays) containing every element of~$A$ exactly once. We will
use square brackets to index these tuples: $\pi=(\pi[1],\ldots,\pi[\vert A\vert])$,
and sub-tuples: $\pi[i\ldots j] = (\pi[i],\pi[i+1],\ldots,\pi[j])$.
The lexicographic ranking and unranking functions for the permutations on~$A$
will be denoted by~$L(\pi,A)$ and $L^{-1}(i,A)$ respectively.

\obs\id{permrec}%
Let us first observe that permutations have a~simple recursive structure.
If we fix the first element $\pi[1]$ of a~permutation~$\pi$ on the set~$[n]$, the
elements $\pi[2], \ldots, \pi[n]$ form a~permutation on $[n]-\{\pi[1]\} = \{1,\ldots,\pi[1]-1,\pi[1]+1,\ldots,n\}$.
The lexicographic order of two permutations $\pi$ and~$\pi'$ on the original set is then determined
by $\pi[1]$ and $\pi'[1]$ and only if these elements are equal, it is decided
by the lexicographic comparison of permutations $\pi[2\ldots n]$ and $\pi'[2\ldots n]$.
Moreover, when we fix $\pi[1]$, all permutations on the smaller set occur exactly
once, so the rank of $\pi$ is $(\pi[1]-1)\cdot (n-1)!$ plus the rank of
$\pi[2\ldots n]$.

This gives us a~reduction from (un)ranking of permutations on $[n]$ to (un)rank\-ing
of permutations on an~$(n-1)$-element set, which suggests a~straightforward
algorithm, but unfortunately this set is different from $[n-1]$ and it even
depends on the value of~$\pi[1]$. We could renumber the elements to get $[n-1]$,
but it would require linear time per iteration. To avoid this, we generalize the
problem to permutations on subsets of $[n]$. For a~permutation $\pi$ on a~set
$A\subseteq [n]$ of size~$m$, similar reasoning gives a~simple formula:
$$
L((\pi[1],\ldots,\pi[m]),A) = R_A(\pi[1]) \cdot (m-1)! +
L((\pi[2],\ldots,\pi[m]), A\setminus\{\pi[1]\}),
$$
which uses the ranking function~$R_A$ for~$A$. This recursive formula immediately
translates to the following recursive algorithms for both ranking and unranking
(described for example in \cite{knuth:sas}):

\alg $\<Rank>(\pi,i,n,A)$: Compute the rank of a~permutation $\pi[i\ldots n]$ on~$A$.
\id{rankalg}
\algo
\:If $i\ge n$, return~0.
\:$a\=R_A(\pi[i])$.
\:$b\=\<Rank>(\pi,i+1,n,A \setminus \{\pi[i]\})$.
\:Return $a\cdot(n-i)! + b$.
-\endalgo
-
-\>We can call $\<Rank>(\pi,1,n,[n])$ for ranking on~$[n]$, i.e., to calculate
-$L(\pi,[n])$.
-
-\alg $\<Unrank>(j,i,n,A)$: Return an~array~$\pi$ such that $\pi[i\ldots n]$ is the $j$-th permutation on~$A$.
-\id{unrankalg}
-\algo
-\:If $i>n$, return $(0,\ldots,0)$.
-\:$x\=R^{-1}_A(\lfloor j/(n-i)! \rfloor)$.
-\:$\pi\=\<Unrank>(j\bmod (n-i)!,i+1,n,A\setminus \{x\})$.
-\:$\pi[i]\=x$.
-\:Return~$\pi$.
-\endalgo
-
-\>We can call $\<Unrank>(j,1,n,[n])$ to calculate $L^{-1}(j,[n])$.
-
-\paran{Representation of sets}%
-The most time-consuming parts of the above algorithms are of course operations
-on the set~$A$. If we store~$A$ in a~data structure of a~known time complexity, the complexity
-of the whole algorithm is easy to calculate:
-
-\lemma\id{ranklemma}%
-Suppose that there is a~data structure maintaining a~subset of~$[n]$ under a~sequence
-of deletions, which supports ranking and unranking of elements, and that
-the time complexity of a~single operation is at most~$t(n)$.
-Then lexicographic ranking and unranking of permutations can be performed in time $\O(n\cdot t(n))$.
-
-If we store~$A$ in an~ordinary array, we have insertion and deletion in constant time,
-but ranking and unranking in~$\O(n)$, so $t(n)=\O(n)$ and the algorithm is quadratic.
-Binary search trees give $t(n)=\O(\log n)$. The data structure of Dietz \cite{dietz:oal}
-improves it to $t(n)=\O(\log n/\log \log n)$. In fact, all these variants are equivalent
-to the classical algorithms based on inversion vectors, because at the time of processing~$\pi[i]$,
-the value of $R_A(\pi[i])$ is exactly the number of elements forming inversions with~$\pi[i]$.
-
-To obtain linear time complexity, we will make use of the representation of
-vectors by integers on the RAM as developed in Section~\ref{ramds}. We observe
-that since the words of the RAM need to be able to hold integers as large as~$n!$,
-the word size must be at least $\log n! = \Theta(n\log n)$. 
Therefore the whole
-set~$A$ fits in~$\O(1)$ words and we get:
-
-\thmn{Lexicographic ranking of permutations}
-When we order the permutations on the set~$[n]$ lexicographically, both ranking
-and unranking can be performed on the RAM in time~$\O(n)$.
-
-\paran{The case of $k$-permutations}%
-Our algorithm can also be generalized to lexicographic ranking of
-\df{$k$-permutations,} that is, of ordered $k$-tuples of distinct elements drawn from the set~$[n]$.
-There are $n^{\underline k} = n\cdot(n-1)\cdot\ldots\cdot(n-k+1)$
-such $k$-permutations and they have a~recursive structure similar to that of
-the permutations.
-Unfortunately, the ranks of $k$-permutations can be much smaller, so we can no
-longer rely on the same data structure fitting in a constant number of word-sized integers.
-For example, if $k=1$, the ranks are $\O(\log n)$-bit numbers, but the data
-structure still requires $\Theta(n\log n)$ bits.
-
-We side-step this by remembering the complement of~$A$ instead, that is,
-the set of the at most~$k$ elements we have already seen. We will call this set~$H$
-(because it describes the ``holes'' in~$A$). Since $\Omega(k\log n)$ bits are needed
-to represent the rank, the vector representation of~$H$ certainly fits in a~constant
-number of words. When we translate the operations on~$A$ to operations on~$H$,
-again stored as a~vector, we get:
-
-\thmn{Lexicographic ranking of $k$-permutations}
-When we order the $k$-per\-mu\-ta\-tions on the set~$[n]$ lexicographically, both
-ranking and unranking can be performed on the RAM in time~$\O(k)$.
-
-\section{Restricted permutations}
-
-Another interesting class of combinatorial objects that can be counted and
-ranked are restricted permutations. Archetypal members of this class are
-permutations without a~fixed point, i.e., permutations~$\pi$ such that $\pi(i)\ne i$
-for all~$i$. 
These are also called \df{derangements} or \df{hatcheck permutations.}
-We will present a~general (un)ranking method for any class of restricted
-permutations and derive a~linear-time algorithm for the derangements from it.
-
-\defn\id{permnota}%
-We will fix a~non-negative integer~$n$ and use ${\cal P}$ for the set of
-all~permutations on~$[n]$.
-A~\df{restriction graph} is a~bipartite graph~$G$ whose parts are two copies
-of the set~$[n]$. A~permutation $\pi\in{\cal P}$ satisfies the restrictions
-if $(i,\pi(i))$ is an~edge of~$G$ for every~$i$.
-
-\paran{Equivalent formulations}%
-We will follow the path threaded by Kaplansky and Riordan
-\cite{kaplansky:rooks} and charted by Stanley in \cite{stanley:econe}.
-We will relate restricted permutations to placements of non-attacking
-rooks on a~hollow chessboard.
-
-\defn
-A~\df{board} is the grid $B=[n]\times [n]$. It consists of $n^2$ \df{squares.}
-A~\df{trace} of a~permutation $\pi\in{\cal P}$ is the set of squares \hbox{$T(\pi)=\{ (i,\pi(i)) ; i\in[n] \}$.}
-
-\obs\id{rooksobs}%
-The traces of permutations (and thus the permutations themselves) correspond
-exactly to placements of $n$ rooks on the board in such a~way that the rooks do
-not attack each other (i.e., there is at most one rook in every row and
-likewise in every column; as there are $n$~rooks, there must be exactly one of them in
-every row and column). When speaking about \df{rook placements,} we will always
-mean non-attacking placements.
-
-Restricted permutations then correspond to placements of rooks on a~board with
-some of the squares removed. The \df{holes} (missing squares) correspond to the
-non-edges of~$G$, so $\pi\in{\cal P}$ satisfies the restrictions iff
-$T(\pi)$ avoids the holes.
-
-Placements of~$n$ rooks (and therefore also restricted permutations) can
-also be equated with perfect matchings in the restriction graph~$G$. 
The edges
-of the matching correspond to the squares occupied by the rooks, the condition
-that no two rooks share a~row or a~column translates to the edges not touching
-each other, and the use of exactly~$n$ rooks is equivalent to the matching
-being perfect.
-
-There is also a~well-known correspondence between the perfect matchings
-in a~bipartite graph and non-zero summands in the formula for the permanent
-of the bipartite adjacency matrix~$M$ of the graph. This holds because the
-non-zero summands are in one-to-one correspondence with the placements
-of~$n$ rooks on the corresponding board. The number of restricted
-permutations is therefore equal to the permanent of the matrix~$M$.
-
-The diversity of the characterizations of restricted permutations brings
-both good and bad news. The good news is that we can use the
-plethora of known results on bipartite matchings. Most importantly, we can efficiently
-determine whether there exists at least one permutation satisfying a~given set of restrictions:
-
-\thm
-There is an~algorithm which decides in time $\O(n^{1/2}\cdot m)$ whether there exists
-a~permutation satisfying a~given restriction graph. Here $n$ and~$m$ denote the numbers
-of vertices and edges of the restriction graph.
-
-The bad news is that computing the permanent is known to be~$\#\rm P$-complete even
-for zero-one matrices (as proven by Valiant \cite{valiant:permanent}).
-Since a~ranking function for a~set of~matchings can be used to count all such
-matchings, we obtain the following theorem:
-
-\thm\id{pcomplete}%
-If there is a~polynomial-time algorithm for lexicographic ranking of permutations with
-a~set of restrictions which is a~part of the input, then $\rm P=\#P$.
-
-However, the hardness of computing the permanent is the only obstacle.
-We show that whenever we are given a~set of restrictions for which
-the counting problem is easy (and it is also easy for subgraphs obtained
-by deleting vertices), ranking is easy as well. 
The key will once again be
-a~recursive structure, similar to the one we have seen in the case of plain
-permutations in \ref{permrec}. We get:
-
-\thmn{Lexicographic ranking of restricted permutations}
-Suppose that we have a~family of matrices ${\cal M}=\{M_1,M_2,\ldots\}$ such that $M_n\in \{0,1\}^{n\times n}$
-and it is possible to calculate the permanent of~$M'$ in time $\O(t(n))$ for every matrix $M'$
-obtained by deletion of rows and columns from~$M_n$. Then there exist algorithms
-for ranking and unranking in ${\cal P}_{A,M_n}$ in time $\O(n^4 + n^2\cdot t(n))$
-if $M_n$ and an~$n$-element set~$A$ are given as a~part of the input.
-
-Our time bound for ranking of general restricted permutations is obviously very coarse.
-Its main purpose was to demonstrate that many special cases of the ranking problem can indeed be computed in polynomial time.
-For most families of restriction matrices, we can do much better. These speedups are hard to state formally
-in general (they depend on the structure of the matrices), but we demonstrate them on the
-specific case of derangements. We show that each matrix can be sufficiently characterized
-by two numbers: the order of the matrix and the number of zeroes in it. We find a~recurrence
-for the permanent, based on these parameters, which we use to precalculate all
-permanents in advance. When we plug it into the general algorithm, we get:
-
-\thmn{Ranking of derangements}%
-For every~$n$, the derangements on the set~$[n]$ can be ranked and unranked according to the
-lexicographic order in time~$\O(n)$ after spending $\O(n^2)$ on initialization of auxiliary tables.
-
-\schapter{Conclusions}
-
-We have seen the many facets of the minimum spanning tree problem. It has
-turned out that while the major question of the existence of a~linear-time
-MST algorithm is still open, backing off a~little bit in an~almost arbitrary
-direction leads to a~linear solution. 
This includes classes of graphs with edge
-density at least $\lambda_k(n)$ (the $k$-th row inverse of the Ackermann function) for an~arbitrary fixed~$k$,
-minor-closed classes, and graphs whose edge weights are
-integers. Using randomness also helps, as does having the edges pre-sorted.
-
-If we do not know anything about the structure of the graph and we are only allowed
-to compare the edge weights, we can use Pettie's MST algorithm.
-Its time complexity is guaranteed to be asymptotically optimal,
-but we do not know what it really is --- the best we have is
-an~$\O(m\cdot\alpha(m,n))$ upper bound and the trivial $\Omega(m)$ lower bound.
-
-One thing, however, we know for sure: the algorithm runs on the weakest of our
-computational models ---the Pointer Machine--- and its complexity is linear
-in the minimum number of comparisons needed to decide the problem. We therefore
-need not worry about the details of computational models, which have contributed
-so much to the linear-time algorithms for our special cases. Instead, it is sufficient
-to study the complexity of MST decision trees. However, not much is known about these trees so far.
-
-As for the dynamic algorithms, we have an~algorithm which maintains the minimum
-spanning forest within poly-logarithmic time per operation.
-The optimum complexity is once again undecided --- the known lower bounds are very far
-from the upper ones.
-The known algorithms run on the Pointer Machine and we do not know if using a~stronger
-model can help.
-
-For the ranking problems, the situation is completely different. We have shown
-linear-time algorithms for three important problems of this kind. The techniques
-we have used seem to be applicable to other ranking problems. On the other
-hand, ranking of general restricted permutations has turned out to balance on the
-verge of $\#{\rm P}$-completeness. 
All our algorithms run
-on the RAM model, which seems to be the only sensible choice for problems of
-an~inherently arithmetic nature. While the unit-cost assumption on arithmetic operations
-is not universally accepted, our results imply that the complexity of our algorithms
-is dominated by the necessary arithmetic.
-
-Aside from the concrete problems we have solved, we have also built several algorithmic
-techniques of general interest: the unification procedures using pointer-based
-bucket sorting and the vector computations on the RAM. We hope that they will
-be useful in many other algorithms.
-
-\schapter{Bibliography}
-
-\dumpbib
-
-\vfill\eject
-\ifodd\pageno\else\hbox{}\fi
-
-\bye
diff --git a/programs/n0.c b/programs/n0.c
deleted file mode 100644
index 4327bd6..0000000
--- a/programs/n0.c
+++ /dev/null
@@ -1,156 +0,0 @@
-#include <stdio.h>
-
-#define MAX 13
-
-// Factorial
-int f(int n)
-{
- static int ff[MAX] = { 1 };
- if (!ff[n])
- ff[n] = n*f(n-1);
- return ff[n];
-}
-
-// Binomial coefficient
-int c(int n, int k)
-{
- if (k > n/2)
- k = n-k;
- long long int r = 1;
- for (int i=1; i<=k; i++)
- {
- r *= n--;
- r /= i;
- }
- return r;
-}
-
-// Hatcheck lady (number of derangements)
-int s(int d)
-{
- int r = 0;
- int sg = 1;
- for (int k=0; k<=d; k++)
- {
- r += sg*f(d)/f(k);
- sg = -sg;
- }
- return r;
-}
-
-// Partial restrictions
-int n0(int z, int d)
-{
- static int nn[MAX][MAX];
- if (!nn[z][d])
- {
- if (!z)
- nn[z][d] = f(d);
- else if (z == d)
- nn[z][d] = s(d);
- else
- nn[z][d] = z*n0(z-1,d-1) + (d-z)*n0(z,d-1);
- }
- return nn[z][d];
-}
-
-// Formula from Stanley
-int s0(int z, int d)
-{
- int r = 0;
- int p = 1;
- for (int k=0; k<=z; k++)
- {
- r += p * f(d-k) * c(z,k);
- p = -p;
- }
- return r;
-}
-
-// Hatcheck lady's ratio (partial sums of 1/e)
-double alpha(int n)
-{
- double x = 1;
- int sg = -1;
- for (int i=1; i<=n; i++)
- {
- x += sg*(1. 
/ f(i));
- sg = -sg;
- }
- return x;
-}
-
-int main(void)
-{
- printf("Hatcheck lady both ways:\n");
- for (int i=1; i<MAX; i++)
- {
- for (int j=0; j<MAX; j++)
- {
- if (j >= i)
- printf("%d", n0(i, j));
- putchar('\t');
- }
- putchar('\n');
- }
- putchar('\n');
-
- printf("The same by Stanley's formula:\n");
- for (int i=0; i<MAX; i++)
- {
- for (int j=0; j<MAX; j++)
- {
- if (j >= i)
- printf("%d", s0(i, j));
- putchar('\t');
- }
- putchar('\n');
- }
- putchar('\n');
-
- printf("Differences:\n");
- for (int i=0; i<MAX; i++)
- {
- for (int j=0; j<MAX; j++)
- {
- if (j >= i && i > 0)
- printf("%d", n0(i-1,j)-n0(i,j));
- putchar('\t');
- }
- putchar('\n');
- }
-
- printf("Ratios:\n");
- for (int i=0; i<MAX; i++)
- {
- for (int j=0; j<MAX; j++)
- {
- if (j >= i && i > 0)
- {
- double d = (double)n0(i-1,j)/(n0(i-1,j)-n0(i,j));
- printf("%2.4f", d);
- }
- putchar('\t');
- }
- putchar('\n');
- }
-
- return 0;
-}
diff --git a/pubs.tex b/pubs.tex
deleted file mode 100644
index 90faf9a..0000000
--- a/pubs.tex
+++ /dev/null
@@ -1,113 +0,0 @@
-\input macros.tex
-\input fonts12.tex
-\nopagenumbers
-
-\finaltrue
-\hwobble=0mm
-\advance\hsize by 1cm
-\advance\vsize by 20pt
-
-\font\chapfont=csb14 at 16pt
-\def\rawchapter#1{\vensure{0.5in}\bigskip\goodbreak
-\leftline{\chapfont #1}
-}
-
-\def\rawsection#1{\medskip\smallskip
-\leftline{\secfont #1}
-\nobreak
-\smallskip
-\nobreak
-}
-
-\def\schapter#1{\chapter{#1}\medskip}
-
-\rawchapter{Publications of Martin Mare\v{s}}
-\bigskip
-
-{
-
-\def\bibitem[#1]#2#3\par{\:\eatspaces #3}
-\def\em{\it}
-\frenchspacing
-\newcount\citecount
-\def\newblock{\hskip .11em plus .33em minus .07em }%
-\def\citelist{\numlist\singlecit\rightskip=0pt}
-\def\singlecit{\global\advance\citecount by 1[\the\citecount]}
-\hfuzz=4pt
-
-{\>\bo Research articles in journals and conference proceedings:}
-\medskip
-
-\citelist
-
-\bibitem[Mar04]{mm:mst}
-M.~Mare\v{s}.
-\newblock {Two linear time algorithms for MST on minor closed graph classes}.
-\newblock {\em {Archivum Mathematicum}}, 40:315--320. Masaryk University, Brno,
- Czech Republic, 2004.
-
-\bibitem[MS07]{mm:rank}
-M.~Mare\v{s} and M.~Straka.
-\newblock Linear-time ranking of permutations. 
-\newblock In {\em Algorithms --- ESA 2007: 15th Annual European Symposium},
- volume 4698 of {\em {Lecture Notes in Computer Science}}, pages 187--193.
- Springer-Verlag, 2007.
-
-\bibitem[Mar07b]{mm:grading}
-{M. Mare\v{s}}.
-\newblock {Perspectives on Grading Systems}.
-\newblock {\em Olympiads in Informatics}, 1:124--130. Institute of
- Mathematics and Informatics, Vilnius, Lithuania, 2007.
-
-\endlist
-
-\>All three papers have already been published.
-
-\bigskip
-{\>\bo Textbooks for university courses:}
-\medskip
-
-\citelist
-
-\bibitem[Mar07]{mm:ga}
-M.~Mare\v{s}.
-\newblock {Krajinou grafov\'ych algoritm\accent23u (Through the Landscape of
- Graph Algorithms)}.
-\newblock ITI series 2007--330, Institut Teoretick\'e Informatiky, Praha, Czech
- Republic, 2007.
-\newblock ISBN 978-80-239-9049-2.
-\newblock In Czech.
-
-\endlist
-
-\bigskip
-
-\rawsection{Citations}
-
-\citelist
-
-%S. Tazari and M. Müller-Hannemann
-%Shortest Paths in Linear Time on Minor-Closed Graph Classes with an Application to Steiner Tree Approximation (abstract)
-%submitted for publication, 2007.
-
-\bibitem[HW07]{hochstein:maxflow}
-J.~M. Hochstein and K.~Weihe.
-\newblock {Maximum $s$-$t$-flow with $k$ crossings in $\O(k^3n \log n)$ time}.
-\newblock In {\em SODA 2007: Proceedings of the 18th Annual ACM-SIAM Symposium
- on Discrete Algorithms}, pages 843--847, 2007.
-\newblock Cites~[1].
-
-\bibitem[MHT07]{tazari:mcgc}
-M.~M\"uller-Hannemann and S.~Tazari.
-\newblock {Handling Proper Minor-Closed Graph Classes in Linear Time: Shortest
- Paths and 2-Approximate Steiner Trees}.
-\newblock Technical Report 2007/5, University of Halle-Wittenberg, Institute of
- Computer Science, 2007.
-\newblock Cites~[1]. 
-
-\endlist
-
-}
-
-\bye
diff --git a/slides/Makefile b/slides/Makefile
deleted file mode 100644
index 67ad505..0000000
--- a/slides/Makefile
+++ /dev/null
@@ -1,7 +0,0 @@
-all: slides.pdf
-
-slides.pdf: slides.tex
-	pdflatex slides.tex
-
-clean:
-	rm -f *~ *.{aux,log,nav,out,pdf,snm,toc}
diff --git a/slides/brum2.png b/slides/brum2.png
deleted file mode 100644
index 3b4fb98..0000000
Binary files a/slides/brum2.png and /dev/null differ
diff --git a/slides/slides.tex b/slides/slides.tex
deleted file mode 100644
index b0bacc8..0000000
--- a/slides/slides.tex
+++ /dev/null
@@ -1,364 +0,0 @@
-\documentclass{beamer}
-\usepackage[utf8]{inputenc}
-\usepackage{palatino}
-\usetheme{Warsaw}
-\title[Graph Algorithms]{Graph Algorithms\\Spanning Trees and Ranking}
-\author[Martin Mareš]{Martin Mareš\\\texttt{mares@kam.mff.cuni.cz}}
-\institute{Department of Applied Mathematics\\MFF UK Praha}
-\date{2008}
-\begin{document}
-\setbeamertemplate{navigation symbols}{}
-\setbeamerfont{title page}{family=\rmfamily}
-
-\def\[#1]{\hskip0.3em{\color{violet} [#1]}}
-\def\O{{\cal O}}
-
-\begin{frame}
-\titlepage
-\end{frame}
-
-\begin{frame}{The Minimum Spanning Tree Problem}
-
-{\bf 1. Minimum Spanning Tree Problem:}
-
-\begin{itemize}
-\item Given a~weighted undirected graph,\\
- what is its lightest spanning tree?
-\item In fact, a~linear order on edges is sufficient.
-\item Efficient solutions are very old \[Borůvka 1926]
-\item A~long progression of faster and faster algorithms.
-\item Currently very close to linear time, but still not there.
-\end{itemize}
-
-\end{frame}
-
-\begin{frame}{The Ranking Problems}
-
-{\bf 2. Ranking of Combinatorial Structures:}
-
-\begin{itemize}
-\item We are given a~set~$C$ of objects with a~linear order $\prec$.
-\item {\bf Ranking function $R_\prec(x)$:} how many objects precede~$x$?
-\item {\bf Unranking function $R^{-1}_\prec(i)$:} what is the $i$-th object? 
-\end{itemize}
-
-\pause
-
-\begin{example}[toy]
-$C=\{0,1\}^n$ with lexicographic order
-
-\pause
-$R$ = conversion from binary\\
-$R^{-1}$ = conversion to binary
-\end{example}
-
-\pause
-\begin{example}[a~real one]
-$C=$ set of all permutations on $\{1,\ldots,n\}$
-\end{example}
-
-\pause
-How to compute the (un)ranking function efficiently?
-
-For permutations, an $\O(n\log n)$ algorithm was known \[folklore].
-
-We will show how to do that in $\O(n)$.
-
-\end{frame}
-
-\begin{frame}{Models of computation: RAM}
-
-As we approach linear time, we must specify the model.
-
-~
-
-{\bf 1. The Random Access Machine (RAM):}
-
-\begin{itemize}
-\item Works with integers
-\item Memory: an array of integers indexed by integers
-\end{itemize}
-
-~
-\pause
-
-Many variants exist; we will use the {\bf Word-RAM:}
-
-\begin{itemize}
-\item Machine words of $W$ bits
-\item The ``C operations'': arithmetic, bitwise logical op's
-\item Unit cost
-\item We know that $W\ge\log_2 \vert \hbox{input} \vert$
-\end{itemize}
-
-\end{frame}
-
-\begin{frame}{Models of computation: PM}
-
-{\bf 2. The Pointer Machine (PM):}
-
-\begin{itemize}
-\item Memory cells accessed via pointers
-\item Each cell contains $\O(1)$ pointers and $\O(1)$ symbols
-\item Operates only on pointers and symbols
-\end{itemize}
-
-~
-\pause
-
-\begin{beamerboxesrounded}[upper=block title example,shadow=true]{Key differences}
-\begin{itemize}
-\item PM has no arrays; we can emulate them in $\O(\log n)$ time.
-\item PM has no arithmetic.
-\end{itemize}
-\end{beamerboxesrounded}
-
-~
-
-We can emulate PM on RAM with constant slowdown.
-
-Emulation of RAM on PM is more expensive.
-
-\end{frame}
-
-\begin{frame}{PM Techniques}
-
-{\bf Bucket Sorting} does not need arrays.
-
-~
-
-Interesting consequences:
-
-\begin{itemize}
-\item Flattening of multigraphs in $\O(m+n)$
-\item Unification of sequences in $\O(n+\sum_i\ell_i+\vert\Sigma\vert)$
-\item (Sub)tree isomorphism in $\O(n)$ simplified \[M. 
2008]
-\item Batched graph computations \[Buchsbaum et al.~1998]
-\end{itemize}
-
-\end{frame}
-
-\begin{frame}{RAM Techniques}
-
-We can use RAM as a vector machine:
-
-~
-
-\begin{example}[parallel search]
-\def\sep{\;{\color{brown}0}}
-\def\seq{\;{\color{brown}1}}
-\def\sez{\;{\color{red}0}}
-We can encode the vector $(1,5,3,0)$ with 3-bit fields as:
-\begin{center}
-\sep001\sep101\sep011\sep000
-\end{center}
-And then search for 3 by:
-
-\begin{center}
-\begin{tabular}{rcl}
- &\seq001\seq101\seq011\seq000 & $(1,5,3,0)$ \\
-{\sc xor} &\sep011\sep011\sep011\sep011 & $(3,3,3,3)$ \\
-\hline
- &\seq010\seq110\seq000\seq011 \\
-$-$ &\sep001\sep001\sep001\sep001 & $(1,1,1,1)$ \\
-\hline
- &\seq001\seq101\sez111\seq010 \\
-{\sc and} &\seq000\seq000\seq000\seq000 \\
-\hline
- &\seq000\seq000\sez000\seq000 \\
-\end{tabular}
-\end{center}
-\end{example}
-
-\end{frame}
-
-\begin{frame}{RAM Data Structures}
-
-We can translate vector operations to $\O(1)$ RAM instructions
-
-\smallskip
-
-\dots\ as long as the vector fits in $\O(1)$ words.
-
-~
-
-We can build ``small'' data structures operating in $\O(1)$ time:
-
-\begin{itemize}
-\item Sets
-\item Ordered sets with ranking
-\item ``Small'' heaps of ``large'' integers \[Fredman \& Willard 1990]
-\end{itemize}
-
-\end{frame}
-
-\begin{frame}{Minimum Spanning Trees}
-Algorithms for Minimum Spanning Trees:
-
-\begin{itemize}
-\item Classical algorithms \[Borůvka, Jarník-Prim, Kruskal]
-\item Contractive: $\O(m\log n)$ using flattening on the PM \\
- (lower bound \[M.])
-\item Iterated: $\O(m\,\beta(m,n))$ \[Fredman \& Tarjan~1987] \\
- where $\beta(m,n) = \min\{ k: \log_2^{(k)} n \le m/n \}$
-\item Even better: $\O(m\,\alpha(m,n))$ using {\it soft heaps}\hfil\break\[Chazelle 1998, Pettie 1999]
-\item MST verification: $\O(m)$ on RAM \[King 1997, M. 
2008] -\item Randomized: $\O(m)$ expected on RAM \[Karger et al.~1995] -\end{itemize} -\end{frame} - -\begin{frame}{MST -- Special cases} - -Cases for which we have an $\O(m)$ algorithm: - -~ - -Special graph structure: - -\begin{itemize} -\item Planar graphs \[Tarjan 1976, Matsui 1995, M. 2004] (PM) -\item Minor-closed classes \[Tarjan 1983, M. 2004] (PM) -\item Dense graphs (by many of the general PM algorithms) -\end{itemize} - -~ -\pause - -Or we can assume more about weights: - -\begin{itemize} -\item $\O(1)$ different weights \[folklore] (PM) -\item Integer weights \[Fredman \& Willard 1990] (RAM) -\item Sorted weights (RAM) -\end{itemize} - -\end{frame} - -\begin{frame}{MST -- Optimality} - -There is a~provably optimal comparison-based algorithm \\\[Pettie \& Ramachandran 2002] - -~ - -However, there is a catch\alt<2->{:}{ \dots} \pause Nobody knows its complexity. - -~ - -We know that it is $\O({\cal T}(m,n))$ where ${\cal T}(m,n)$ is the depth -of the optimum MST decision tree. Any other algorithm provides an upper bound. - -\pause -~ - -\begin{corollary} -It runs on the PM, so we know that if there is a~linear-time algorithm, -it does not need any special RAM data structures. (They can however -help us to find it.) -\end{corollary} - -\end{frame} - -\begin{frame}{MST -- Dynamic algorithms} - -Sometimes, we need to find the MST of a~changing graph. \\ -We insert/delete edges, the structure responds with $\O(1)$ -modifications of the MST. - -\begin{itemize} -\item Unweighted cases, similar to dynamic connectivity: - \begin{itemize} - \item Incremental: $\O(\alpha(n))$ \[Tarjan 1975] - \item Fully dynamic: $\O(\log^2 n)$ \[Holm et al.~2001] - \end{itemize} -\pause -\item Weighted cases are harder: - \begin{itemize} - \item Decremental: $\O(\log^2 n)$ \[Holm et al.~2001] - \item Fully dynamic: $\O(\log^4 n)$ \[Holm et al.~2001] - \item Only~$C$ weights: $\O(C\log^2 n)$ \[M. 
2008] - \end{itemize} -\pause -\item $K$ smallest spanning trees: - \begin{itemize} - \item Simple: $\O(T_{MST} + Km)$ \[Katoh et al.~1981, M.~2008] - \item Small~$K$: $\O(T_{MST} + \min(K^2, Km + K\log K))$ \[Eppst.~1992] - \item Faster: $\O(T_{MST} + \min(K^{3/2}, Km^{1/2}))$ \[Frederickson 1997] - \end{itemize} -\end{itemize} - -\end{frame} - -\begin{frame}{Back to Ranking} - -Ranking of permutations on the RAM: \[M. \& Straka 2007] - -\begin{itemize} -\item We need a DS for the subsets of $\{1,\ldots,n\}$ with ranking -\item The result can be $n!$ $\Rightarrow$ word size is $\Omega(n\log n)$ bits -\item We can represent the subsets as RAM vectors -\item This gives us an~$\O(n)$ time algorithm for (un)ranking -\end{itemize} - -~ - -Easily extendable to $k$-permutations, also in $\O(n)$ - -\end{frame} - -\begin{frame}{Restricted permutations} - -For restricted permutations (e.g., derangements): \[M. 2008] - -\begin{itemize} -\item Describe restrictions by a~bipartite graph -\item Existence of permutation reduces to network flows -\item The ranking function can be used to calculate permanents,\\ - so it is $\#\rm P$-complete -\item However, this is the only obstacle. Calculating $\O(n)$ - sub-permanents is sufficient. -\item For derangements, we have achieved $\O(n)$ time after $\O(n^2)$ time - preprocessing. 
-\end{itemize}
-
-\end{frame}
-
-\begin{frame}{Summary}
-
-Summary:\\
-
-\begin{itemize}
-\item Low-level algorithmic techniques on RAM and PM
-\item Generalized pointer-based sorting and RAM vectors
-\item Applied to a~variety of problems:
- \begin{itemize}
- \item A~short linear-time tree isomorphism algorithm
- \item A~linear-time algorithm for MST on minor-closed classes
- \item Corrected and simplified MST verification
- \item Dynamic MST with small weights
- \item {\it Ranking and unranking of permutations}
- \end{itemize}
-\item Also:
- \begin{itemize}
- \item A~lower bound for the contractive Borůvka algorithm
- \item Simplified soft-heaps
- \end{itemize}
-\end{itemize}
-
-\end{frame}
-
-\begin{frame}{Good Bye}
-
-\bigskip
-
-\centerline{\sc\huge The End}
-
-\bigskip
-
-\begin{figure}
-\pgfdeclareimage[width=0.3\hsize]{brum}{brum2.png}
-\pgfuseimage{brum}
-\end{figure}
-
-\end{frame}
-
-\end{document}