Cover of the abstract.

[saga.git] / mst.tex
diff --git a/mst.tex b/mst.tex

index 33852d199e77452c968ef53b4df5bb239ed21a59..5cccbb9810448adefd7980aa99b52adcebc11ac5 100644 (file)
--- a/mst.tex
+++ b/mst.tex
@@ -6,8 +6,816 @@
  
  \section{The Problem}
  
-\cite{boruvka:ojistem}
+The problem of finding a minimum spanning tree of a weighted graph is one of the
+best studied problems in the area of combinatorial optimization since its birth.
+Its colorful history (see \cite{graham:msthistory} and \cite{nesetril:history} for the full account)
+begins in~1926 with the pioneering work of Bor\o{u}vka
+\cite{boruvka:ojistem}\foot{See \cite{nesetril:boruvka} for an English translation with commentary.},
+who studied primarily an Euclidean version of the problem related to planning
+of electrical transmission lines (see \cite{boruvka:networks}), but gave an efficient
+algorithm for the general version of the problem. As it was well before the dawn of graph
+theory, the language of his paper was complicated, so we will better state the problem
+in contemporary terminology:
  
-% mention Steiner trees
+\proclaim{Problem}Given an undirected graph~$G$ with weights $w:E(G)\rightarrow {\bb R}$,
+find its minimum spanning tree, defined as follows:
+
+\defn\id{mstdef}%
+For a given graph~$G$ with weights $w:E(G)\rightarrow {\bb R}$:
+\itemize\ibull
+\:A~subgraph $H\subseteq G$ is called a \df{spanning subgraph} if $V(H)=V(G)$.
+\:A~\df{spanning tree} of~$G$ is any spanning subgraph of~$G$ that is a tree.
+\:For any subgraph $H\subseteq G$ we define its \df{weight} $w(H):=\sum_{e\in E(H)} w(e)$.
+  When comparing two weights, we will use the terms \df{lighter} and \df{heavier} in the
+  obvious sense.
+\:A~\df{minimum spanning tree (MST)} of~$G$ is a spanning tree~$T$ such that its weight $w(T)$
+  is the smallest possible among all the spanning trees of~$G$.
+\:For a disconnected graph, a \df{(minimum) spanning forest (MSF)} is defined as
+  a union of (minimum) spanning trees of its connected components.
+\endlist
+
+Bor\o{u}vka's work was further extended by Jarn\'\i{}k \cite{jarnik:ojistem}, again in
+mostly geometric setting. He has discovered another efficient algorithm. However, when
+computer science and graph theory started forming in the 1950's and the
+spanning tree problem was one of the central topics of the flourishing new
+disciplines, the previous work was not well known and the algorithms had to be
+rediscovered several times.
+
+In the next 50 years, several significantly faster algorithms were discovered, ranging
+from the $\O(m\timesbeta(m,n))$ time algorithm by Fredman and Tarjan \cite{ft:fibonacci},
+over algorithms with inverse-Ackermann type complexity by Chazelle \cite{chazelle:ackermann}
+and Pettie \cite{pettie:ackermann}, to an~algorithm by Pettie \cite{pettie:optimal}
+whose time complexity is provably optimal.
+
+In the upcoming chapters, we will explore this colorful universe of MST algorithms.
+We will meet the canonical works of the classics, the clever ideas of their successors,
+various approaches to the problem including randomization and solving of important
+special cases. At several places, we will try to contribute our little stones to this
+mosaic.
+
+%--------------------------------------------------------------------------------
+
+\section{Basic properties}\id{mstbasics}%
+
+In this section, we will examine the basic properties of spanning trees and prove
+several important theorems which will serve as a~foundation for our MST algorithms.
+We will mostly follow the theory developed by Tarjan in~\cite{tarjan:dsna}.
+
+For the whole section, we will fix a~connected graph~$G$ with edge weights~$w$ and all
+other graphs will be spanning subgraphs of~$G$. We will use the same notation
+for the subgraphs as for the corresponding sets of edges.
+
+First of all, let us show that the weights on edges are not necessary for the
+definition of the MST. We can formulate an equivalent characterization using
+an~ordering of edges instead.
+
+\defnn{Heavy and light edges}\id{heavy}%
+Let~$T$ be a~spanning tree. Then:
+\itemize\ibull
+\:For vertices $x$ and $y$, let $T[x,y]$ denote the (unique) path in~$T$ joining $x$ with~$y$.
+\:For an edge $e=xy$ we will call $T[e]:=T[x,y]$ the \df{path covered by~$e$} and
+  the edges of this path \df{edges covered by~$e$}.
+\:An edge~$e$ is called \df{light with respect to~$T$} (or just \df{$T$-light}) if it covers a~heavier edge, i.e., if there
+  is an~edge $f\in T[e]$ such that $w(f) > w(e)$.
+\:An edge~$e$ is called \df{$T$-heavy} if it covers a~lighter edge.
+\endlist
+
+\rem
+Edges of the tree~$T$ cover only themselves and thus they are neither heavy nor light.
+The same can happen if an~edge outside~$T$ covers only edges of the same weight,
+but this will be rare because all edge weights will be usually distinct.
+
+\lemman{Light edges}\id{lightlemma}%
+Let $T$ be a spanning tree. If there exists a $T$-light edge, then~$T$
+is not minimum.
+
+\proof
+If there is a $T$-light edge~$e$, then there exists an edge $e'\in T[e]$ such
+that $w(e')>w(e)$. Now $T-e'$ ($T$~with the edge~$e'$ removed) is a forest of two trees with endpoints of~$e$
+located in different components, so adding $e$ to this forest must restore
+connectivity and $T':=T-e'+e$ is another spanning tree with weight $w(T')
+= w(T)-w(e')+w(e) < w(T)$. Hence $T$ could not have been minimum.
+\qed
+
+\figure{mst2.eps}{278pt}{An edge exchange as in the proof of Lemma~\ref{lightlemma}}
+
+The converse of this lemma is also true and to prove it, we will once again use
+the technique of transforming trees by \df{exchanges of edges.} In the proof of the
+lemma, we have made use of the fact that whenever we exchange an edge~$e$ of
+a~spanning tree for another edge~$f$ covered by~$e$, the result is again
+a~spanning tree. In fact, it is possible to transform any spanning tree
+to any other spanning tree by a sequence of exchanges.
+
+\lemman{Exchange property for trees}\id{xchglemma}%
+Let $T$ and $T'$ be spanning trees of a common graph. Then there exists
+a sequence of edge exchanges that transforms $T$ to~$T'$. More formally,
+there exists a sequence of spanning trees $T=T_0,T_1,\ldots,T_k=T'$ such that
+$T_{i+1}=T_i - e_i + e_i^\prime$ where $e_i\in T_i$ and $e_i^\prime\in T'$.
+
+\proof
+By induction on $d(T,T'):=\vert T\symdiff T'\vert$. When $d(T,T')=0$,
+both trees are identical and no exchanges are needed. Otherwise, the trees are different,
+but as they have the same number of edges, there must exist an edge $e'\in T'\setminus T$.
+The cycle $T[e']+e'$ cannot be wholly contained in~$T'$, so there also must
+exist an edge $e\in T[e']\setminus T'$. Exchanging $e$ for~$e'$ yields a spanning
+tree $T^*:=T-e+e'$ such that $d(T^*,T')=d(T,T')-2$. Now we can apply the induction
+hypothesis to $T^*$ and $T'$ to get the rest of the exchange sequence.
+\qed
+
+\figure{mst1.eps}{295pt}{One step of the proof of Lemma~\ref{xchglemma}}
+
+\>In some cases, a~much stronger statement is true:
+
+\lemman{Monotone exchanges}\id{monoxchg}%
+Let $T$ be a spanning tree such that there are no $T$-light edges and $T'$
+be an arbitrary spanning tree. Then there exists a sequence of edge exchanges
+transforming $T$ to~$T'$ such that the weight of the tree does not decrease in any step.
+
+\proof
+We improve the argument from the previous proof, refining the induction step.
+When we exchange $e\in T$ for $e'\in T'\setminus T$ such that $e\in T[e']$,
+the weight never drops, since $e'$ is not a $T$-light edge and therefore
+$w(e') \ge w(e)$, so $w(T^*)=w(T)-w(e)+w(e')\ge w(T)$.
+
+To keep the induction going, we have to make sure that there are still no light
+edges with respect to~$T^*$. In fact, it is enough to avoid such edges in
+$T'\setminus T^*$, since these are the only edges considered by the induction
+steps. To accomplish that, we replace the so far arbitrary choice of $e'\in T'\setminus T$
+by picking the lightest such edge.
+
+Let us consider an edge $f\in T'\setminus T^*$. We want to show that $f$ is not
+$T^*$-light, i.e., that it is heavier than all edges on $T^*[f]$. The path $T^*[f]$ is
+either identical to the original path $T[f]$ (if $e\not\in T[f]$) or to $T[f] \symdiff C$,
+where $C$ is the cycle $T[e']+e'$. The former case is trivial, in the latter we have
+$w(f)\ge w(e')$ due to the choice of~$e'$ and all other edges on~$C$ are lighter
+than~$e'$ as $e'$ was not $T$-light.
+\qed
+
+This lemma immediately implies that Lemma \ref{lightlemma} works in both directions:
+
+\thmn{Minimality of spanning trees}\id{mstthm}%
+A~spanning tree~$T$ is minimum iff there is no $T$-light edge.
+
+\proof
+If~$T$ is minimum, then by Lemma~\ref{lightlemma} there are no $T$-light
+edges.
+Conversely, when $T$ is a spanning tree without $T$-light edges
+and $T_{min}$ is an arbitrary minimum spanning tree, then according to the Monotone
+exchange lemma (\ref{monoxchg}) there exists a non-decreasing sequence
+of exchanges transforming $T$ to $T_{min}$, so $w(T)\le w(T_{min})$
+and thus $T$~is also minimum.
+\qed
+
+In general, a single graph can have many minimum spanning trees (for example
+a complete graph on~$n$ vertices with unit edge weights has $n^{n-2}$
+minimum spanning trees according to the Cayley's formula \cite{cayley:trees}).
+However, as the following theorem shows, this is possible only if the weight
+function is not injective.
+
+\thmn{Uniqueness of MST}%
+If all edge weights are distinct, then the minimum spanning tree is unique.
+
+\proof
+Consider two minimum spanning trees $T_1$ and~$T_2$. According to the previous
+theorem, there are no light edges with respect to neither of them, so by the
+Monotone exchange lemma (\ref{monoxchg}) there exists a sequence of non-decreasing
+edge exchanges going from $T_1$ to $T_2$. As all edge weights all distinct,
+these edge exchanges must be in fact strictly increasing. On the other hand,
+we know that $w(T_1)=w(T_2)$, so the exchange sequence must be empty and indeed
+$T_1$ and $T_2$ must be identical.
+\looseness=1  %%HACK%%
+\qed
+
+\nota\id{mstnota}%
+When $G$ is a graph with distinct edge weights, we will use $\mst(G)$ to denote
+its unique minimum spanning tree.
+
+Also the following trivial lemma will be often invaluable:
+
+\lemman{Edge removal}
+Let~$G$ be a~graph with distinct edge weights and $e \in G\setminus\mst(G)$.
+Then $\mst(G-e) = \mst(G)$.
+
+\proof
+The tree $T=\mst(G)$ is also a~MST of~$G-e$, because every $T$-light
+edge in~$G-e$ is also $T$-light in~$G$. Then we apply the uniqueness of
+the MST of~$G-e$.
+\qed
+
+\paran{Comparison oracles}\id{edgeoracle}%
+To simplify the description of MST algorithms, we will assume that the weights
+of all edges are distinct and that instead of numeric weights we are given a~comparison oracle.
+The oracle is a~function that answers questions of type ``Is $w(e)<w(f)$?'' in
+constant time. This will conveniently shield us from problems with representation
+of real numbers in algorithms and in the few cases where we need a more concrete
+input, we will explicitly state so.
+
+In case the weights are not distinct, we can easily break ties by comparing some
+unique identifiers of edges. According to our characterization of minimum spanning
+trees, the unique MST of the new graph will still be a~MST of the original graph.
+Sometimes, we could be interested in finding all solutions, but as this is an~uncommon
+problem, we will postpone it until Section \ref{kbestsect}. For the time being,
+we will always assume distinct weights.
+
+\obs
+If all edge weights are distinct and $T$~is an~arbitrary spanning tree, then every edge of~$G$
+is either $T$-heavy, or $T$-light, or contained in~$T$.
+
+\paran{Monotone isomorphism}%
+Another useful consequence of the Minimality theorem is that whenever two graphs are isomorphic and the
+isomorphism preserves the relative order of weights, the isomorphism applies to their MST's as well:
+
+\defn
+A~\df{monotone isomorphism} between two weighted graphs $G_1=(V_1,E_1,w_1)$ and
+$G_2=(V_2,E_2,w_2)$ is a bijection $\pi: V_1\rightarrow V_2$ such that
+for each $u,v\in V_1: uv\in E_1 \Leftrightarrow \pi(u)\pi(v)\in E_2$ and
+for each $e,f\in E_1: w_1(e)<w_1(f) \Leftrightarrow w_2(\pi[e]) < w_2(\pi[f])$.
+
+\lemman{MST of isomorphic graphs}\id{mstiso}%
+Let~$G_1$ and $G_2$ be two weighted graphs with distinct edge weights and $\pi$
+a~monotone isomorphism between them. Then $\mst(G_2) = \pi[\mst(G_1)]$.
+
+\proof
+The isomorphism~$\pi$ maps spanning trees to spanning trees bijectively and it preserves
+the relation of covering. Since it is monotone, it preserves the property of
+being a light edge (an~edge $e\in E(G_1)$ is $T$-light $\Leftrightarrow$
+the edge $\pi[e]\in E(G_2)$ is~$f[T]$-light). Therefore by the Minimality Theorem
+(\ref{mstthm}), $T$ is the MST of~$G_1$ if and only if $\pi[T]$ is the MST of~$G_2$.
+\qed
+
+%--------------------------------------------------------------------------------
+
+\section{The Red-Blue meta-algorithm}
+
+Most MST algorithms can be described as special cases of the following procedure
+(again following Tarjan \cite{tarjan:dsna}):
+
+\algn{Red-Blue Meta-Algorithm}\id{rbma}%
+\algo
+\algin A~graph $G$ with an edge comparison oracle (see \ref{edgeoracle})
+\:At the beginning, all edges are colored black.
+\:Apply rules as long as possible:
+\::Either pick a cut~$C$ such that its lightest edge is not blue \hfil\break and color this edge blue, \cmt{Blue rule}
+\::or pick a cycle~$C$ such that its heaviest edge is not red \hfil\break and color this edge \rack{blue.}{red.\hfil} \cmt{Red rule}
+\algout Minimum spanning tree of~$G$ consisting of edges colored blue.
+\endalgo
+
+\para
+This procedure is not a proper algorithm, since it does not specify how to choose
+the rule to apply. We will however prove that no matter how the rules are applied,
+the procedure always stops and it gives the correct result. Also, it will turn out
+that each of the classical MST algorithms can be described as a specific way
+of choosing the rules in this procedure, which justifies the name meta-algorithm.
+
+\nota
+We will denote the unique minimum spanning tree of the input graph by~$T_{min}$.
+We intend to prove that this is also the output of the procedure.
+
+\paran{Correctness}%
+Let us prove that the meta-algorithm is correct. First we show that the edges colored
+blue in any step of the procedure always belong to~$T_{min}$ and that the edges colored
+red are guaranteed to be outside~$T_{min}$. Then we demonstrate that the procedure
+always stops. Some parts of the proof will turn out to be useful in the upcoming chapters,
+so we will state them in a~slightly more general way.
+
+\lemman{Blue lemma, also known as the Cut rule}\id{bluelemma}%
+The lightest edge of every cut is contained in the MST.
+
+\proof
+By contradiction. Let $e$ be the lightest edge of a cut~$C$.
+If $e\not\in T_{min}$, then there must exist an edge $e'\in T_{min}$ that is
+contained in~$C$ (take any pair of vertices separated by~$C$: the path
+in~$T_{min}$ joining these vertices must cross~$C$ at least once). Exchanging
+$e$ for $e'$ in $T_{min}$ yields an even lighter spanning tree since
+$w(e)<w(e')$.
+\qed
+
+\lemman{Red lemma, also known as the Cycle rule}\id{redlemma}%
+An~edge~$e$ is not contained in the MST iff it is the heaviest on some cycle.
+
+\proof
+The implication from the left to the right follows directly from the Minimality
+theorem: if~$e\not\in T_{min}$, then $e$~is $T_{min}$-heavy and so it is the heaviest
+edge on the cycle $T_{min}[e]+e$.
+
+We will prove the other implication again by contradiction. Suppose that $e$ is the heaviest edge of
+a cycle~$C$ and that $e\in T_{min}$. Removing $e$ causes $T_{min}$ to split
+to two components, let us call them $T_x$ and~$T_y$. Some vertices of~$C$ now lie in~$T_x$, the
+others in~$T_y$, so there must exist in edge $e'\ne e$ such that its endpoints lie in different
+components. Since $w(e')<w(e)$, exchanging $e$ for~$e'$ yields a~spanning tree lighter than
+$T_{min}$.
+\qed
+
+\figure{mst-rb.eps}{289pt}{Proof of the Blue (left) and Red (right) lemma}
+
+\lemman{Black lemma}%
+As long as there exists a black edge, at least one rule can be applied.
+
+\proof
+Assume that $e=xy$ is a black edge. Let us define~$M$ as the set of vertices
+reachable from~$x$ using only blue edges. If $y$~lies in~$M$, then $e$ together
+with some blue path between $x$ and $y$ forms a cycle and $e$~must be the heaviest
+edge on this cycle. This holds because all blue edges have been already proven
+to be in $T_{min}$ and there can be no $T_{min}$-light edges.
+In this case, we can apply the Red rule.
+
+On the other hand, if $y\not\in M$, then the cut formed by all edges between $M$
+and $V\setminus M$ contains no blue edges, therefore we can use the Blue rule.
+\qed
+
+\figure{mst-bez.eps}{295pt}{Configurations in the proof of the Black lemma}
+
+\nota\id{deltanota}%
+We will use $\delta(M)$ to denote the cut separating~$M$ from its complement.
+That is, $\delta(M) = E \cap (M \times (V\setminus M))$. We will also abbreviate
+$\delta(\{v\})$ as~$\delta(v)$.
+
+\thmn{Red-Blue correctness}%
+For any selection of rules, the Red-Blue procedure stops and the blue edges form
+the minimum spanning tree of the input graph.
+
+\proof
+To prove that the procedure stops, let us notice that no edge is ever recolored,
+so we must run out of black edges after at most~$m$ steps. Recoloring
+to the same color is avoided by the conditions built in the rules, recoloring to
+a different color would mean that the edge would be both inside and outside~$T_{min}$
+due to our Red and Blue lemmata.
+
+When no further rules can be applied, the Black lemma guarantees that all edges
+are colored, so by the Blue lemma all blue edges are in~$T_{min}$ and by the Red
+lemma all other (red) edges are outside~$T_{min}$. Thus the blue edges are exactly~$T_{min}$.
+\qed
+
+\rem
+The MST problem is a~special case of the problem of finding the minimum basis
+of a~weighted matroid. Surprisingly, when we modify the Red-Blue procedure to
+use the standard definitions of cycles and cuts in matroids, it will always
+find the minimum basis. Some of the other MST algorithms also easily generalize to
+matroids and in some sense matroids are exactly the objects where ``the greedy approach works''. We
+will however not pursue this direction in our work, referring the reader to the Oxley's monograph
+\cite{oxley:matroids} instead.
+
+%--------------------------------------------------------------------------------
+
+\section{Classical algorithms}\id{classalg}%
+
+The three classical MST algorithms (Bor\o{u}vka's, Jarn\'\i{}k's, and Kruskal's) can be easily
+stated in terms of the Red-Blue meta-algorithm. For each of them, we first show the general version
+of the algorithm, then we prove that it gives the correct result and finally we discuss the time
+complexity of various implementations.
+
+\paran{Bor\o{u}vka's algorithm}%
+The oldest MST algorithm is based on a~simple idea: grow a~forest in a~sequence of
+iterations until it becomes connected. We start with a~forest of isolated
+vertices. In each iteration, we let each tree of the forest select the lightest
+edge of those having exactly one endpoint in the tree (we will call such edges
+the \df{neighboring edges} of the tree). We add all such edges to the forest and
+proceed with the next iteration.
+
+\algn{Bor\o{u}vka \cite{boruvka:ojistem}, Choquet \cite{choquet:mst}, Sollin \cite{sollin:mst}, and others}
+\algo
+\algin A~graph~$G$ with an edge comparison oracle.
+\:$T\=$ a forest consisting of vertices of~$G$ and no edges.
+\:While $T$ is not connected:
+\::For each component $T_i$ of~$T$, choose the lightest edge $e_i$ from the cut
+   separating $T_i$ from the rest of~$T$.
+\::Add all $e_i$'s to~$T$.
+\algout Minimum spanning tree~$T$.
+\endalgo
+
+\lemma\id{boruvkadrop}%
+In each iteration of the algorithm, the number of trees in~$T$ decreases by at least
+a~factor of two.
+
+\proof
+Each tree gets merged with at least one of its neighbors, so each of the new trees
+contains two or more original trees.
+\qed
+
+\cor
+The algorithm stops in $\O(\log n)$ iterations.
+
+\lemma\id{borcorr}%
+The Bor\o{u}vka's algorithm outputs the MST of the input graph.
+
+\proof
+In every iteration of the algorithm, $T$ is a blue subgraph,
+because every addition of some edge~$e_i$ to~$T$ is a straightforward
+application of the Blue rule. We stop when the blue subgraph is connected, so
+we do not need the Red rule to explicitly exclude edges.
+
+It remains to show that adding the edges simultaneously does not
+produce a~cycle. Consider the first iteration of the algorithm where $T$ contains a~cycle~$C$. Without
+loss of generality we can assume that:
+$$C=T_1[u_1,v_1]\,v_1u_2\,T_2[u_2,v_2]\,v_2u_3\,T_3[u_3,v_3]\, \ldots \,T_k[u_k,v_k]\,v_ku_1.$$
+Each component $T_i$ has chosen its lightest incident edge~$e_i$ as either the edge $v_iu_{i+1}$
+or $v_{i-1}u_i$ (indexing cyclically). Suppose that $e_1=v_1u_2$ (otherwise we reverse the orientation
+of the cycle). Then $e_2=v_2u_3$ and $w(e_2)<w(e_1)$ and we can continue in the same way,
+getting $w(e_1)>w(e_2)>\ldots>w(e_k)>w(e_1)$, which is a~contradiction.
+(Note that distinctness of edge weights was crucial here.)
+\qed
+
+\lemma\id{boruvkaiter}%
+Each iteration can be carried out in time $\O(m)$.
+
+\proof
+We assign a label to each tree and we keep a mapping from vertices to the
+labels of the trees they belong to. We scan all edges, map their endpoints
+to the particular trees and for each tree we maintain the lightest incident edge
+so far encountered. Instead of merging the trees one by one (which would be too
+slow), we build an auxiliary graph whose vertices are the labels of the original
+trees and edges correspond to the chosen lightest inter-tree edges. We find the connected
+components of this graph, and these determine how are the original labels translated
+to the new labels.
+\qed
+
+\thm
+The Bor\o{u}vka's algorithm finds the MST in time $\O(m\log n)$.
+
+\proof
+Follows from the previous lemmata.
+\qed
+
+\paran{Jarn\'\i{}k's algorithm}%
+The next algorithm, discovered independently by Jarn\'\i{}k, Prim and Dijkstra, is similar
+to the Bor\o{u}vka's algorithm, but instead of the whole forest it concentrates on
+a~single tree. It starts with a~single vertex and it repeatedly extends the tree
+by the lightest neighboring edge until the tree spans the whole graph.
+
+\algn{Jarn\'\i{}k \cite{jarnik:ojistem}, Prim \cite{prim:mst}, Dijkstra \cite{dijkstra:mst}}\id{jarnik}%
+\algo
+\algin A~graph~$G$ with an edge comparison oracle.
+\:$T\=$ a single-vertex tree containing an~arbitrary vertex of~$G$.
+\:While there are vertices outside $T$:
+\::Pick the lightest edge $uv$ such that $u\in V(T)$ and $v\not\in V(T)$.
+\::$T\=T+uv$.
+\algout Minimum spanning tree~$T$.
+\endalgo
+
+\lemma
+The Jarn\'\i{}k's algorithm computes the MST of the input graph.
+
+\proof
+If~$G$ is connected, the algorithm always stops. In every step of
+the algorithm, $T$ is always a blue tree. because Step~4 corresponds to applying
+the Blue rule to the cut $\delta(T)$ separating~$T$ from the rest of the given graph. We need not care about
+the remaining edges, since for a connected graph the algorithm always stops with the right
+number of blue edges.
+\qed
+
+\impl\id{jarnimpl}%
+The most important part of the algorithm is finding the \em{neighboring edges.}
+In a~straightforward implementation, searching for the lightest neighboring
+edge takes $\Theta(m)$ time, so the whole algorithm runs in time $\Theta(mn)$.
+
+We can do much better by using a binary
+heap to hold all neighboring edges. In each iteration, we find and delete the
+minimum edge from the heap and once we expand the tree, we insert the newly discovered
+neighboring edges to the heap and delete the neighboring edges that became
+internal to the new tree. Since there are always at most~$m$ edges in the heap,
+each heap operation takes $\O(\log m)=\O(\log n)$ time. For every edge, we perform
+at most one insertion and at most one deletion, so we spend $\O(m\log n)$ time in total.
+From this, we can conclude:
+
+\thm
+The Jarn\'\i{}k's algorithm computes the MST of a~given graph in time $\O(m\log n)$.
+
+\rem
+We will show several faster implementations in Section \ref{iteralg}.
+
+\paran{Kruskal's algorithm}%
+The last of the three classical algorithms processes the edges of the
+graph~$G$ greedily. It starts with an~empty forest and it takes the edges of~$G$
+in order of their increasing weights. For every edge, it checks whether its
+addition to the forest produces a~cycle and if it does not, the edge is added.
+Otherwise, the edge is dropped and not considered again.
+
+\algn{Kruskal \cite{kruskal:mst}}
+\algo
+\algin A~graph~$G$ with an edge comparison oracle.
+\:Sort edges of~$G$ by their increasing weights.
+\:$T\=\hbox{an empty spanning subgraph}$.
+\:For all edges $e$ in their sorted order:
+\::If $T+e$ is acyclic, add~$e$ to~$T$.
+\::Otherwise drop~$e$.
+\algout Minimum spanning tree~$T$.
+\endalgo
+
+\lemma
+The Kruskal's algorithm returns the MST of the input graph.
+
+\proof
+In every step, $T$ is a forest of blue trees. Adding~$e$ to~$T$
+in step~4 applies the Blue rule on the cut separating some pair of components of~$T$ ($e$ is the lightest,
+because all other edges of the cut have not been considered yet). Dropping~$e$ in step~5 corresponds
+to the Red rule on the cycle found ($e$~must be the heaviest, since all other edges of the
+cycle have been already processed). At the end of the algorithm, all edges are colored,
+so~$T$ must be the~MST.
+\qed
+
+\impl
+Except for the initial sorting, which in general requires $\Theta(m\log m)$ time, the only
+other non-trivial operation is the detection of cycles. What we need is a~data structure
+for maintaining connected components, which supports queries and edge insertion.
+This is closely related to the well-known Disjoint Set Union problem:
+
+\problemn{Disjoint Set Union, DSU}
+Maintain an~equivalence relation on a~finite set under a~sequence of operations \<Union>
+and \<Find>. The \<Find> operation tests whether two elements are equivalent and \<Union>
+joins two different equivalence classes into one.
+
+\para
+We can maintain the connected components of our forest~$T$ as equivalence classes. When we want
+to add an~edge~$uv$, we first call $\<Find>(u,v)$ to check if both endpoints of the edge lie in
+the same component. If they do not, addition of this edge connects both components into one,
+so we perform $\<Union>(u,v)$ to merge the equivalence classes.
+
+Tarjan has shown that there is a~data structure for the DSU problem
+of surprising efficiency:
+
+\thmn{Disjoint Set Union, Tarjan \cite{tarjan:setunion}}\id{dfu}%
+Starting with a~trivial equivalence with single-element classes, a~sequence of operations
+comprising of $n$~\<Union>s intermixed with $m\ge n$~\<Find>s can be processed in time
+$\O(m\timesalpha(m,n))$, where $\alpha(m,n)$ is a~certain inverse of the Ackermann's function
+(see Definition \ref{ackerinv}).
+
+\proof
+See \cite{tarjan:setunion}.
+\qed
+
+This completes the following theorem:
+
+\thm\id{kruskal}%
+The Kruskal's algorithm finds the MST of a given graph in time $\O(m\log n)$.
+If the edges are already sorted by their weights, the time drops to
+$\O(m\timesalpha(m,n))$.
+
+\proof
+We spend $\O(m\log n)$ time on sorting, $\O(m\timesalpha(m,n))$ on processing the sequence
+of \<Union>s and \<Find>s, and $\O(m)$ on all other work.
+\qed
+
+\rem
+The cost of the \<Union> and \<Find> operations is of course dwarfed by the complexity
+of sorting, so a much simpler (at least in terms of its analysis) data
+structure would be sufficient, as long as it has $\O(\log n)$ amortized complexity
+per operation. For example, we can label vertices with identifiers of the
+corresponding components and always relabel the smaller of the two components.
+
+We will study dynamic maintenance of connected components in more detail in Chapter~\ref{dynchap}.
+
+%--------------------------------------------------------------------------------
+
+\section{Contractive algorithms}\id{contalg}%
+
+While the classical algorithms are based on growing suitable trees, they
+can be also reformulated in terms of edge contraction. Instead of keeping
+a~forest of trees, we can keep each tree contracted to a single vertex.
+This replaces the relatively complex tree-edge incidencies by simple
+vertex-edge incidencies, potentially speeding up the calculation at the
+expense of having to perform the contractions.
+
+We will show a contractive version of the Bor\o{u}vka's algorithm
+in which these costs are carefully balanced, leading for example to
+a linear-time algorithm for MST in planar graphs.
+
+There are two definitions of edge contraction that differ when an edge of
+a~triangle is contracted. Either we unify the other two edges to a single edge
+or we keep them as two parallel edges, leaving us with a~multigraph. We will
+use the multigraph version and we will show that we can easily reduce the multigraph
+to a~simple graph later. (See \ref{contract} for the exact definitions.)
+
+We only need to be able to map edges of the contracted graph to the original
+edges, so we let each edge carry a unique label $\ell(e)$ that will be preserved by
+contractions.
+
+\lemman{Flattening a multigraph}\id{flattening}%
+Let $G$ be a multigraph and $G'$ its subgraph obtaining by removing loops
+and replacing each bundle of parallel edges by its lightest edge.
+Then $G'$~has the same MST as~$G$.
+
+\proof
+Every spanning tree of~$G'$ is a spanning tree of~$G$. In the other direction:
+Loops can be never contained in a spanning tree. If there is a spanning tree~$T$
+containing a~removed edge~$e$ parallel to an edge~$e'\in G'$, exchanging $e'$
+for~$e$ makes~$T$ lighter. (This is indeed the multigraph version of the Red
+lemma applied to a~two-edge cycle, as we will see in \ref{multimst}.)
+\qed
+
+\algn{Contractive version of Bor\o{u}vka's algorithm}\id{contbor}
+\algo
+\algin A~graph~$G$ with an edge comparison oracle.
+\:$T\=\emptyset$.
+\:$\ell(e)\=e$ for all edges~$e$. \cmt{Initialize the labels.}
+\:While $n(G)>1$:
+\::For each vertex $v_k$ of~$G$, let $e_k$ be the lightest edge incident to~$v_k$.
+\::$T\=T\cup \{ \ell(e_1),\ldots,\ell(e_n) \}$.\hfil\break\cmt{Remember labels of all selected edges.}
+\::Contract all edges $e_k$, inheriting labels and weights.\foot{In other words, we will ask the comparison oracle for the edge $\ell(e)$ instead of~$e$.}
+\::Flatten $G$ (remove parallel edges and loops).
+\algout Minimum spanning tree~$T$.
+\endalgo
+
+\nota
+For the analysis of the algorithm, we will denote the graph considered by the algorithm
+at the beginning of the $i$-th iteration by $G_i$ (starting with $G_0=G$) and the number
+of vertices and edges of this graph by $n_i$ and $m_i$ respectively. A~single iteration
+of the algorithm will be called a~\df{Bor\o{u}vka step}.
+
+\lemma\id{contiter}%
+The $i$-th Bor\o{u}vka step can be carried out in time~$\O(m_i)$.
+
+\proof
+The only non-trivial parts are steps 6 and~7. Contractions can be handled similarly
+to the unions in the original Bor\o{u}vka's algorithm (see \ref{boruvkaiter}):
+We build an~auxiliary graph containing only the selected edges~$e_k$, find
+connected components of this graph and renumber vertices in each component to
+the identifier of the component. This takes $\O(m_i)$ time.
+
+Flattening is performed by first removing the loops and then bucket-sorting the edges
+(as ordered pairs of vertex identifiers) lexicographically, which brings parallel
+edges together. The bucket sort uses two passes with $n_i$~buckets, so it takes
+$\O(n_i+m_i)=\O(m_i)$.
+\qed
+
+\thm\id{contborthm}%
+The Contractive Bor\o{u}vka's algorithm finds the MST of the input graph in
+time $\O(\min(n^2,m\log n))$.
+
+\proof
+As in the original Bor\o{u}vka's algorithm, the number of iterations is $\O(\log n)$.
+When combined with the previous lemma, it gives an~$\O(m\log n)$ upper bound.
+
+To get the $\O(n^2)$ bound, we observe that the number of trees in the non-contracting
+version of the algorithm drops at least by a factor of two in each iteration (Lemma \ref{boruvkadrop})
+and the same must hold for the number of vertices in the contracting version.
+Therefore $n_i\le n/2^i$. While the number of edges need not decrease geometrically,
+we still have $m_i\le n_i^2$ as the graphs~$G_i$ are simple (we explicitly removed multiple
+edges and loops at the end of the previous iteration). Hence the total time spent
+in all iterations is $\O(\sum_i n_i^2) = \O(\sum_i n^2/4^i) = \O(n^2)$.
+\qed
+
+On planar graphs, the algorithm runs much faster:
+
+\thmn{Contractive Bor\o{u}vka on planar graphs}\id{planarbor}%
+When the input graph is planar, the Contractive Bor\o{u}vka's algorithm runs in
+time $\O(n)$.
+
+\proof
+Let us refine the previous proof. We already know that $n_i \le n/2^i$. We will
+prove that when~$G$ is planar, the $m_i$'s are decreasing geometrically. We know that every
+$G_i$ is planar, because the class of planar graphs is closed under edge deletion and
+contraction. Moreover, $G_i$~is also simple, so we can use the standard bound on
+the number of edges of planar simple graphs (see for example \cite{diestel:gt}) to get $m_i\le 3n_i \le 3n/2^i$.
+The total time complexity of the algorithm is therefore $\O(\sum_i m_i)=\O(\sum_i n/2^i)=\O(n)$.
+\qed
+
+\rem
+There are several other possibilities how to find the MST of a planar graph in linear time.
+For example, Matsui \cite{matsui:planar} has described an algorithm based on simultaneously
+working on the graph and its topological dual. The advantage of our approach is that we do not need
+to construct the planar embedding explicitly. We will show another simpler linear-time algorithm
+in section~\ref{minorclosed}.
+
+\rem
+To achieve the linear time complexity, the algorithm needs a very careful implementation,
+but we defer the technical details to section~\ref{bucketsort}.
+
+\paran{General contractions}%
+Graph contractions are indeed a~very powerful tool and they can be used in other MST
+algorithms as well. The following lemma shows the gist:
+
+\lemman{Contraction of MST edges}\id{contlemma}%
+Let $G$ be a weighted graph, $e$~an arbitrary edge of~$\mst(G)$, $G/e$ the multigraph
+produced by contracting~$e$ in~$G$, and $\pi$ the bijection between edges of~$G-e$ and
+their counterparts in~$G/e$. Then $\mst(G) = \pi^{-1}[\mst(G/e)] + e.$
+
+\proof
+% We seem not to need this lemma for multigraphs...
+%If there are any loops or parallel edges in~$G$, we can flatten the graph. According to the
+%Flattening lemma (\ref{flattening}), the MST stays the same and if we remove a parallel edge
+%or loop~$f$, then $\pi(f)$ would be removed when flattening~$G/e$, so $f$ never participates
+%in a MST.
+The right-hand side of the equality is a spanning tree of~$G$. Let us denote it by~$T$ and
+the MST of $G/e$ by~$T'$. If $T$ were not minimum, there would exist a $T$-light edge~$f$ in~$G$
+(by the Minimality Theorem, \ref{mstthm}). If the path $T[f]$ covered by~$f$ does not contain~$e$,
+then $\pi[T[f]]$ is a path covered by~$\pi(f)$ in~$T'$. Otherwise $\pi(T[f]-e)$ is such a path.
+In both cases, $f$ is $T'$-light, which contradicts the minimality of~$T'$. (We do not have
+a~multigraph version of the theorem, but the direction we need is a~straightforward edge exchange,
+which obviously works in multigraphs as well as in simple graphs.)
+\qed
+
+\rem
+In the Contractive Bor\o{u}vka's algorithm, the role of the mapping~$\pi^{-1}$ is of course played by the edge labels~$\ell$.
+
+\paran{A~lower bound}%
+Finally, we will show a family of graphs for which the $\O(m\log n)$ bound on time complexity
+is tight. The graphs do not have unique weights, but they are constructed in a way that
+the algorithm never compares two edges with the same weight. Therefore, when two such
+graphs are monotonically isomorphic (see~\ref{mstiso}), the algorithm processes them in the same way.
+
+\defn
+A~\df{distractor of order~$k$,} denoted by~$D_k$, is a path on $n=2^k$~vertices $v_1,\ldots,v_n$,
+where each edge $v_iv_{i+1}$ has its weight equal to the number of trailing zeroes in the binary
+representation of the number~$i$. The vertex $v_1$ is called a~\df{base} of the distractor.
+
+\figure{distractor.eps}{\epsfxsize}{A~distractor $D_3$ and its evolution (bold edges are contracted)}
+
+\rem
+Alternatively, we can use a recursive definition: $D_0$ is a single vertex, $D_{k+1}$ consists
+of two disjoint copies of~$D_k$ joined by an edge of weight~$k$.
+
+\lemma
+A~single iteration of the contractive algorithm reduces the distractor~$D_k$ to a~graph isomorphic with~$D_{k-1}$.
+
+\proof
+Each vertex~$v$ of~$D_k$ is incident with a single edge of weight~1. The algorithm therefore
+selects all weight~1 edges and contracts them. This produces a~graph that is
+equal to $D_{k-1}$ with all weights increased by~1, which does not change the relative order of edges.
+\qed
+
+\defn
+A~\df{hedgehog}~$H_{a,k}$ is a graph consisting of $a$~distractors $D_k^1,\ldots,D_k^a$ of order~$k$
+together with edges of a complete graph on the bases of these distractors. The additional edges
+have arbitrary weights that are heavier than the edges of all the distractors.
+
+\figure{hedgehog.eps}{\epsfxsize}{A~hedgehog $H_{5,2}$ (quills bent to fit in the picture)}
+
+\lemma
+A~single iteration of the contractive algorithm reduces~$H_{a,k}$ to a~graph isomorphic with $H_{a,k-1}$.
+
+\proof
+Each vertex is incident with an edge of some distractor, so the algorithm does not select
+any edge of the complete graph. Contraction therefore reduces each distractor to a smaller
+distractor (modulo an additive factor in weight) and it leaves the complete graph intact.
+The resulting graph is monotonely isomorphic to $H_{a,k-1}$.
+\qed
+
+When we set the parameters appropriately, we get the following lower bound:
+
+\thmn{Lower bound for Contractive Bor\o{u}vka}%
+For each $n$ there exists a graph on $\Theta(n)$ vertices and $\Theta(n)$ edges
+such that the Contractive Bor\o{u}vka's algorithm spends time $\Omega(n\log n)$ on it.
+
+\proof
+Consider the hedgehog $H_{a,k}$ for $a=\lceil\sqrt n\rceil$ and $k=\lceil\log_2 a\rceil$.
+It has $a\cdot 2^k = \Theta(n)$ vertices and ${a \choose 2} + a\cdot 2^k = \Theta(a^2) + \Theta(a^2) = \Theta(n)$ edges
+as we wanted.
+
+By the previous lemma, the algorithm proceeds through a sequence of hedgehogs $H_{a,k},
+H_{a,k-1}, \ldots, H_{a,0}$ (up to monotone isomorphism), so it needs a logarithmic number of iterations plus some more
+to finish on the remaining complete graph. Each iteration runs on a graph with $\Omega(n)$
+edges as every $H_{a,k}$ contains a complete graph on~$a$ vertices.
+\qed
+
+%--------------------------------------------------------------------------------
+
+\section{Lifting restrictions}
+
+In order to have a~simple and neat theory, we have introduced several restrictions
+on the graphs in which we search for the MST. As in some rare cases we are going to
+meet graphs that do not fit into this simplified world, let us quickly examine what
+happens when the restrictions are lifted.
+
+\paran{Disconnected graphs}\id{disconn}%
+The basic properties of minimum spanning trees and the algorithms presented in
+this chapter apply to minimum spanning forests of disconnected graphs, too.
+The proofs of our theorems and the steps of our algorithms are based on adjacency
+of vertices and existence of paths, so they are always local to a~single
+connected component. The Bor\o{u}vka's and Kruskal's algorithm need no changes,
+the Jarn\'\i{}k's algorithm has to be invoked separately for each component.
+
+We can also extend the notion of light and heavy edges with respect
+to a~tree to forests: When an~edge~$e$ connects two vertices lying in the same
+tree~$T$ of a~forest~$F$, it is $F$-heavy iff it is $T$-heavy (similarly
+for $F$-light). Edges connecting two different trees are always considered
+$F$-light. Again, a~spanning forest~$F$ is minimum iff there are no $F$-light
+edges.
+
+\paran{Multigraphs}\id{multimst}%
+All theorems and algorithms from this chapter work for multigraphs as well,
+only the notation sometimes gets crabbed, which we preferred to avoid. The Minimality
+theorem and the Blue rule stay unchanged. The Red rule is naturally extended to
+self-loops (which are never in the MST) and two-edge cycles (where the heavier
+edge can be dropped) as already suggested in the Flattening lemma (\ref{flattening}).
+
+\paran{Multiple edges of the same weight}\id{multiweight}%
+In case when the edge weights are not distinct, the characterization of minimum
+spanning trees using light edges is still correct, but the MST is no longer unique
+(as already mentioned, there can be as much as~$n^{n-2}$ MST's).
+
+In the Red-Blue procedure, we have to avoid being too zealous. The Blue lemma cannot
+guarantee that when a~cut contains multiple edges of the minimum weight, all of them
+are in the MST. It will however tell that if we pick one of these edges, an~arbitrary
+MST can be modified to another MST that contains this edge. Therefore the Blue rule
+will change to ``Pick a~cut~$C$ such that it does not contain any blue edge and color
+one of its lightest edges blue.'' The Red lemma and the Red rule can be handled
+in a~similar manner. The modified algorithm will be then guaranteed to find one of
+the possible MST's.
+
+The Kruskal's and Jarn\'\i{}k's algorithms keep working. This is however not the case of the
+Bor\o{u}vka's algorithm, whose proof of correctness in Lemma \ref{borcorr} explicitly referred to
+distinct weights and indeed, if they are not distinct, the algorithm will occasionally produce
+cycles. To avoid the cycles, the ties in edge weight comparisons have to be broken in a~systematic
+way. The same applies to the contractive version of this algorithm.
  
  \endpart