Intro on RAM data structures.

[saga.git] / mst.tex
diff --git a/mst.tex b/mst.tex

index 33852d199e77452c968ef53b4df5bb239ed21a59..7e5be9097813a4ab59f639eaa4c520afab9527d8 100644 (file)
--- a/mst.tex
+++ b/mst.tex
@@ -6,8 +6,666 @@
  
  \section{The Problem}
  
-\cite{boruvka:ojistem}
+The problem of finding a minimum spanning tree of a weighted graph is one of the
+best studied problems in the area of combinatorial optimization since its birth.
+Its colorful history (see \cite{graham:msthistory} and \cite{nesetril:history} for the full account)
+begins in~1926 with the pioneering work of Bor\o{u}vka
+\cite{boruvka:ojistem}\foot{See \cite{nesetril:boruvka} for an English translation with commentary.},
+who studied primarily an Euclidean version of the problem related to planning
+of electrical transmission lines (see \cite{boruvka:networks}), but gave an efficient
+algorithm for the general version of the problem. As it was well before the dawn of graph
+theory, the language of his paper was complicated, so we will better state the problem
+in contemporary terminology:
  
-% mention Steiner trees
+\proclaim{Problem}Given an undirected graph~$G$ with weights $w:E(G)\rightarrow {\bb R}$,
+find its minimum spanning tree, defined as follows:
+
+\defn\id{mstdef}%
+For a given graph~$G$ with weights $w:E(G)\rightarrow {\bb R}$:
+\itemize\ibull
+\:A~subgraph $H\subseteq G$ is called a \df{spanning subgraph} if $V(H)=V(G)$.
+\:A~\df{spanning tree} of $G$ is any its spanning subgraph which is a tree.
+\:For any subgraph $H\subseteq G$ we define its \df{weight} $w(H):=\sum_{e\in E(H)} w(e)$.
+  When comparing two weights, we will use the terms \df{lighter} and \df{heavier} in the
+  obvious sense.
+\:A~\df{minimum spanning tree (MST)} of~$G$ is a spanning tree~$T$ such that its weight $w(T)$
+  is the smallest possible of all the spanning trees of~$G$.
+\:For a disconnected graph, a \df{(minimum) spanning forest (MSF)} is defined as
+  a union of (minimum) spanning trees of its connected components.
+\endlist
+
+Bor\o{u}vka's work was further extended by Jarn\'\i{}k \cite{jarnik:ojistem}, again in
+mostly geometric setting, giving another efficient algorithm. However, when
+computer science and graph theory started forming in the 1950's and the
+spanning tree problem was one of the central topics of the flourishing new
+disciplines, the previous work was not well known and the algorithms had to be
+rediscovered several times.
+
+Recently, several significantly faster algorithms were discovered, most notably the
+$\O(m\timesbeta(m,n))$-time algorithm by Fredman and Tarjan \cite{ft:fibonacci} and
+algorithms with inverse-Ackermann type complexity by Chazelle \cite{chazelle:ackermann}
+and Pettie \cite{pettie:ackermann}.
+
+\FIXME{Write the rest of the history.}
+
+This chapter attempts to survery the important algorithms for finding the MST and it
+also presents several new ones.
+
+%--------------------------------------------------------------------------------
+
+\section{Basic properties}
+
+In this section, we will examine the basic properties of spanning trees and prove
+several important theorems to base the algorithms upon. We will follow the theory
+developed by Tarjan in~\cite{tarjan:dsna}.
+
+For the whole section, we will fix a graph~$G$ with edge weights~$w$ and all
+other graphs will be spanning subgraphs of~$G$. We will use the same notation
+for the subgraphs as for the corresponding sets of edges.
+
+First of all, let us show that the weights on edges are not necessary for the
+definition of the MST. We can formulate an equivalent characterization using
+an ordering of edges instead.
+
+\defnn{Heavy and light edges}\id{heavy}%
+Let~$T$ be a~spanning tree. Then:
+\itemize\ibull
+\:For vertices $x$ and $y$, let $T[x,y]$ denote the (unique) path in~$T$ joining $x$ and~$y$.
+\:For an edge $e=xy$ we will call $T[e]:=T[x,y]$ the \df{path covered by~$e$} and
+  the edges of this path \df{edges covered by~$e$}.
+\:An edge~$e$ is called \df{light with respect to~$T$} (or just \df{$T$-light}) if it covers a heavier edge, i.e., if there
+  is an edge $f\in T[e]$ such that $w(f) > w(e)$.
+\:An edge~$e$ is called \df{$T$-heavy} if it is not $T$-light.
+\endlist
+
+\rem
+Please note that the above properties also apply to tree edges
+which by definition cover only themselves and therefore they are always heavy.
+
+\lemman{Light edges}\id{lightlemma}%
+Let $T$ be a spanning tree. If there exists a $T$-light edge, then~$T$
+is not minimum.
+
+\proof
+If there is a $T$-light edge~$e$, then there exists an edge $e'\in T[e]$ such
+that $w(e')>w(e)$. Now $T-e'$ is a forest of two trees with endpoints of~$e$
+located in different components, so adding $e$ to this forest must restore
+connectivity and $T':=T-e'+e$ is another spanning tree with weight $w(T')
+= w(T)-w(e')+w(e) < w(T)$. Hence $T$ could not have been minimum.
+\qed
+
+\figure{mst2.eps}{278pt}{An edge exchange as in the proof of Lemma~\ref{lightlemma}}
+
+The converse of this lemma is also true and to prove it, we will once again use
+technique of transforming trees by \df{exchanges} of edges. In the proof of the
+lemma, we have made use of the fact that whenever we exchange an edge~$e$ of
+a spanning tree for another edge~$f$ covered by~$e$, the result is again
+a spanning tree. In fact, it is possible to transform any spanning tree
+to any other spanning tree by a sequence of exchanges.
+
+\lemman{Exchange property for trees}\id{xchglemma}%
+Let $T$ and $T'$ be spanning trees of a common graph. Then there exists
+a sequence of edge exchanges which transforms $T$ to~$T'$. More formally,
+there exists a sequence of spanning trees $T=T_0,T_1,\ldots,T_k=T'$ such that
+$T_{i+1}=T_i - e_i + e_i^\prime$ where $e_i\in T_i$ and $e_i^\prime\in T'$.
+
+\proof
+By induction on $d(T,T'):=\vert T\symdiff T'\vert$. When $d(T,T')=0$,
+both trees are identical and no exchanges are needed. Otherwise, the trees are different,
+but as they are of the same size, there must exist an edge $e'\in T'\setminus T$.
+The cycle $T[e']+e'$ cannot be wholly contained in~$T'$, so there also must
+exist an edge $e\in T[e']\setminus T'$. Exchanging $e$ for~$e'$ yields a spanning
+tree $T^*:=T-e+e'$ such that $d(T^*,T')=d(T,T')-2$ and we can apply the induction
+hypothesis to $T^*$ and $T'$ to get the rest of the exchange sequence.
+\qed
+
+\figure{mst1.eps}{295pt}{One step of the proof of Lemma~\ref{xchglemma}}
+
+\lemman{Monotone exchanges}\id{monoxchg}%
+Let $T$ be a spanning tree such that there are no $T$-light edges and $T'$
+be an arbitrary spanning tree. Then there exists a sequence of edge exchanges
+transforming $T$ to~$T'$ such that the weight does not increase in any step.
+
+\proof
+We improve the argument from the previous proof, refining the induction step.
+When we exchange $e\in T$ for $e'\in T'\setminus T$ such that $e\in T[e']$,
+the weight never drops, since $e'$ is not a $T$-light edge and therefore
+$w(e') \ge w(e)$, so $w(T^*)=w(T)-w(e)+w(e')\ge w(T)$.
+
+To keep the induction going, we have to make sure that there are still no light
+edges with respect to~$T^*$. In fact, it is enough to avoid such edges in
+$T'\setminus T^*$, since these are the only edges considered by the induction
+steps. To accomplish that, we replace the so far arbitrary choice of $e'\in T'\setminus T$
+by picking the lightest such edge.
+
+Now consider an edge $f\in T'\setminus T^*$. We want to show that $f$ is not
+$T^*$-light, i.e., that it is heavier than all edges on $T^*[f]$. The path $T^*[f]$ is
+either equal to the original path $T[f]$ (if $e\not\in T[f]$) or to $T[f] \symdiff C$,
+where $C$ is the cycle $T[e']+e'$. The former case is trivial, in the latter one
+$w(f)\ge w(e')$ due to the choice of $e'$ and all other edges on~$C$ are lighter
+than~$e'$ as $e'$ was not $T$-light.
+\qed
+
+\thmn{Minimality by order}\id{mstthm}%
+A~spanning tree~$T$ is minimum iff there is no $T$-light edge.
+
+\proof
+If~$T$ is minimum, then by Lemma~\ref{lightlemma} there are no $T$-light
+edges.
+Conversely, when $T$ is a spanning tree without $T$-light edges
+and $T_{min}$ is an arbitrary minimum spanning tree, then according to the Monotone
+exchange lemma (\ref{monoxchg}) there exists a non-decreasing sequence
+of exchanges transforming $T$ to $T_{min}$, so $w(T)\le w(T_{min})$
+and thus $T$~is also minimum.
+\qed
+
+In general, a single graph can have many minimum spanning trees (for example
+a complete graph on~$n$ vertices and unit edge weights has $n^{n-2}$
+minimum spanning trees according to the Cayley's formula \cite{cayley:trees}).
+However, as the following theorem shows, this is possible only if the weight
+function is not injective.
+
+\thmn{MST uniqueness}%
+If all edge weights are distinct, then the minimum spanning tree is unique.
+
+\proof
+Consider two minimum spanning trees $T_1$ and~$T_2$. According to the previous
+theorem, there are no light edges with respect to neither of them, so by the
+Monotone exchange lemma (\ref{monoxchg}) there exists a sequence of non-decreasing
+edge exchanges going from $T_1$ to $T_2$. As all edge weights all distinct,
+these edge exchanges must be in fact strictly increasing. On the other hand,
+we know that $w(T_1)=w(T_2)$, so the exchange sequence must be empty and indeed
+$T_1$ and $T_2$ must be identical.
+\qed
+
+\rem\id{edgeoracle}%
+To simplify the description of MST algorithms, we will expect that the weights
+of all edges are distinct and that instead of numeric weights (usually accompanied
+by problems with representation of real numbers in algorithms) we will be given
+a comparison oracle, that is a function which answers questions ``$w(e)<w(f)$?'' in
+constant time. In case the weights are not distinct, we can easily break ties by
+comparing some unique edge identifiers and according to our characterization of
+minimum spanning trees, the unique MST of the new graph will still be a MST of the
+original graph. In the few cases where we need a more concrete input, we will
+explicitly state so.
+
+\nota\id{mstnota}%
+When $G$ is a graph with distinct edge weights, we will use $\mst(G)$ to denote
+its unique minimum spanning tree.
+
+Another useful consequence is that whenever two graphs are isomorphic and the
+isomorphism preserves weight order, the isomorphism applies to their MST's
+as well:
+
+\defn
+A~\df{monotone isomorphism} of two weighted graphs $G_1=(V_1,E_1,w_1)$ and
+$G_2=(V_2,E_2,w_2)$ is a bijection $\pi: V_1\rightarrow V_2$ such that
+for each $u,v\in V_1: uv\in E_1 \Leftrightarrow \pi(u)\pi(v)\in E_2$ and
+for each $e,f\in E_1: w_1(e)<w_1(f) \Leftrightarrow w_2(\pi[e]) < w_2(\pi[f])$.
+
+\lemman{MST of isomorphic graphs}\id{mstiso}%
+Let~$G_1$ and $G_2$ be two weighted graphs with unique edge weights and $\pi$
+their monotone isomorphism. Then $\mst(G_2) = \pi[\mst(G_1)]$.
+
+\proof
+The isomorphism~$\pi$ maps spanning trees onto spanning trees and it preserves
+the relation of covering. Since it is monotone, it preserves the property of
+being a light edge (an~edge $e\in E(G_1)$ is $T$-light $\Leftrightarrow$
+the edge $\pi[e]\in E(G_2)$ is~$f[T]$-light). Therefore by Theorem~\ref{mstthm}, $T$
+is the MST of~$G_1$ if and only if $\pi[T]$ is the MST of~$G_2$.
+\qed
+
+%--------------------------------------------------------------------------------
+
+\section{The Red-Blue meta-algorithm}
+
+Most MST algorithms can be described as special cases of the following procedure
+(again following \cite{tarjan:dsna}):
+
+\algn{Red-Blue Meta-Algorithm}\id{rbma}%
+\algo
+\algin A~graph $G$ with an edge comparison oracle (see \ref{edgeoracle})
+\:In the beginning, all edges are colored black.
+\:Apply rules as long as possible:
+\::Either pick a cut~$C$ such that its lightest edge is not blue \hfil\break and color this edge blue, \cmt{Blue rule}
+\::Or pick a cycle~$C$ such that its heaviest edge is not red \hfil\break and color this edge \hphantas{red.}{blue.} \cmt{Red rule}
+\algout Minimum spanning tree of~$G$ consisting of edges colored blue.
+\endalgo
+
+\rem
+This procedure is not a proper algorithm, since it does not specify how to choose
+the rule to apply. We will however prove that no matter how the rules are applied,
+the procedure always stops and gives the correct result. Also, it will turn out
+that each of the classical MST algorithms can be described as a specific way
+of choosing the rules in this procedure, which justifies the name meta-algorithm.
+
+\nota
+We will denote the unique minimum spanning tree of the input graph by~$T_{min}$.
+We intend to prove that this is also the output of the procedure.
+
+\lemman{Blue lemma}%
+When an edge is colored blue in any step of the procedure, it is contained in the minimum spanning tree.
+
+\proof
+By contradiction. Let $e$ be an edge painted blue as the lightest edge of a cut~$C$.
+If $e\not\in T_{min}$, then there must exist an edge $e'\in T_{min}$ which is
+contained in~$C$ (take any pair of vertices separated by~$C$, the path
+in~$T_{min}$ joining these vertices must cross~$C$ at least once). Exchanging
+$e$ for $e'$ in $T_{min}$ yields an even lighter spanning tree since
+$w(e)<w(e')$. \qed
+
+\lemman{Red lemma}%
+When an edge is colored red in any step of the procedure, it is not contained in the minimum spanning tree.
+
+\proof
+Again by contradiction. Suppose that $e$ is an edge painted red as the heaviest edge
+of a cycle~$C$ and that $e\in T_{min}$. Removing $e$ causes $T_{min}$ to split to two
+components, let us call them $T_x$ and $T_y$. Some vertices of~$C$ now lie in $T_x$,
+the others in $T_y$, so there must exist in edge $e'\ne e$ such that its endpoints
+lie in different components. Since $w(e')<w(e)$, exchanging $e$ for~$e'$ yields
+a lighter spanning tree than $T_{min}$.
+\qed
+
+\figure{mst-rb.eps}{289pt}{Proof of the Blue (left) and Red (right) lemma}
+
+\lemman{Black lemma}%
+As long as there exists a black edge, at least one rule can be applied.
+
+\proof
+Assume that $e=xy$ be a black edge. Let us denote $M$ the set of vertices
+reachable from~$x$ using only blue edges. If $y$~lies in~$M$, then $e$ together
+with some blue path between $x$ and $y$ forms a cycle and it must be the heaviest
+edge on this cycle. This holds because all blue edges have been already proven
+to be in $T_{min}$ and there can be no $T_{min}$-light edges (see Theorem~\ref{mstthm}).
+In this case we can apply the red rule.
+
+On the other hand, if $y\not\in M$, then the cut formed by all edges between $M$
+and $V(G)\setminus M$ contains no blue edges, therefore we can use the blue rule.
+\qed
+
+\figure{mst-bez.eps}{295pt}{Configurations in the proof of the Black lemma}
+
+\thmn{Red-Blue correctness}%
+For any selection of rules, the Red-Blue procedure stops and the blue edges form
+the minimum spanning tree of the input graph.
+
+\proof
+To prove that the procedure stops, let us notice that no edge is ever recolored,
+so we must run out of black edges after at most~$m$ steps. Recoloring
+to the same color is avoided by the conditions built in the rules, recoloring to
+a different color would mean that the an edge would be both inside and outside~$T_{min}$
+due to our Red and Blue lemmata.
+
+When no further rules can be applied, the Black lemma guarantees that all edges
+are colored, so by the Blue lemma all blue edges are in~$T_{min}$ and by the Red
+lemma all other (red) edges are outside~$T_{min}$, so the blue edges are exactly~$T_{min}$.
+\qed
+
+%--------------------------------------------------------------------------------
+
+\section{Classical algorithms}
+
+The three classical MST algorithms can be easily stated in terms of the Red-Blue meta-algorithm.
+For each of them, we first show the general version of the algorithm, then we prove that
+it gives the correct result and finally we discuss the time complexity of various
+implementations.
+
+\algn{Bor\o{u}vka \cite{boruvka:ojistem}, Choquet \cite{choquet:mst}, Sollin \cite{sollin:mst} and others}
+\algo
+\algin A~graph~$G$ with an edge comparison oracle.
+\:$T\=$ a forest consisting of vertices of~$G$ and no edges.
+\:While $T$ is not connected:
+\::For each component $T_i$ of~$T$, choose the lightest edge $e_i$ from the cut
+   separating $T_i$ from the rest of~$T$.
+\::Add all $e_i$'s to~$T$.
+\algout Minimum spanning tree~$T$.
+\endalgo
+
+\lemma\id{boruvkadrop}%
+In each iteration of the algorithm, the number of trees in~$T$ drops at least twice.
+
+\proof
+Each tree gets merged with at least one of its neighbors, so each of the new trees
+contains two or more original trees.
+\qed
+
+\cor
+The algorithm stops in $\O(\log n)$ iterations.
+
+\lemma
+Bor\o{u}vka's algorithm outputs the MST of the input graph.
+
+\proof
+In every iteration of the algorithm, $T$ is a blue subgraph,
+because every addition of some edge~$e_i$ to~$T$ is a straightforward
+application of the Blue rule. We stop when the blue subgraph is connected, so
+we do not need the Red rule to explicitly exclude edges.
+
+It remains to show that adding the edges simultaneously does not
+produce a cycle. Consider the first iteration of the algorithm where $T$ contains a~cycle~$C$. Without
+loss of generality we can assume that $C=T_1[u_1v_1]\,v_1u_2\,T_2[u_2v_2]\,v_2u_3\,T_3[u_3v_3]\, \ldots \,T_k[u_kv_k]\,v_ku_1$.
+Each component $T_i$ has chosen its lightest incident edge~$e_i$ as either the edge $v_iu_{i+1}$
+or $v_{i-1}u_i$ (indexing cyclically). Suppose that $e_1=v_1u_2$ (otherwise we reverse the orientation
+of the cycle). Then $e_2=v_2u_3$ and $w(e_2)<w(e_1)$ and we can continue in the same way,
+getting $w(e_1)>w(e_2)>\ldots>w(e_k)>w(e_1)$, which is a contradiction.
+(Note that distinctness of edge weights was crucial here.)
+\qed
+
+\lemma\id{boruvkaiter}%
+Each iteration can be carried out in time $\O(m)$.
+
+\proof
+We assign a label to each tree and we keep a mapping from vertices to the
+labels of the trees they belong to. We scan all edges, map their endpoints
+to the particular trees and for each tree we maintain the lightest incident edge
+so far encountered. Instead of merging the trees one by one (which would be too
+slow), we build an auxilliary graph whose vertices are the labels of the original
+trees and edges correspond to the chosen lightest inter-tree edges. We find connected
+components of this graph, these determine how are the original labels translated
+to the new labels.
+\qed
+
+\thm
+Bor\o{u}vka's algorithm finds the MST in time $\O(m\log n)$.
+
+\proof
+Follows from the previous lemmata.
+\qed
+
+\algn{Jarn\'\i{}k \cite{jarnik:ojistem}, Prim \cite{prim:mst}, Dijkstra \cite{dijkstra:mst}}\id{jarnik}%
+\algo
+\algin A~graph~$G$ with an edge comparison oracle.
+\:$T\=$ a single-vertex tree containing an~arbitrary vertex of~$G$.
+\:While there are vertices outside $T$:
+\::Pick the lightest edge $uv$ such that $u\in V(T)$ and $v\not\in V(T)$.
+\::$T\=T+uv$.
+\algout Minimum spanning tree~$T$.
+\endalgo
+
+\lemma
+Jarn\'\i{}k's algorithm computers the MST of the input graph.
+
+\proof
+If~$G$ is connected, the algorithm always stops. Let us prove that in every step of
+the algorithm, $T$ is always a blue tree. Step~4 corresponds to applying
+the Blue rule to the cut $\delta(T)$ separating~$T$ from the rest of the given graph. We need not care about
+the remaining edges, since for a connected graph the algorithm always stops with the right
+number of blue edges.
+\qed
+
+\impl
+The most important part of the algorithm is finding \em{neighboring edges,} i.e., edges
+of the cut $\delta(T)$. In a~straightforward implementation,
+searching for the lightest neighboring edge takes $\Theta(m)$ time, so the whole
+algorithm runs in time $\Theta(mn)$.
+
+We can do much better by using a binary
+heap to hold all neighboring edges. In each iteration, we find and delete the
+minimum edge from the heap and once we expand the tree, we insert the newly discovered
+neighboring edges to the heap while deleting the neighboring edges which become
+internal to the new tree. Since there are always at most~$m$ edges in the heap,
+each heap operation takes $\O(\log m)=\O(\log n)$ time. For every edge, we perform
+at most one insertion and at most one deletion, so we spend $\O(m\log n)$ time in total.
+From this, we can conclude:
+
+\thm
+Jarn\'\i{}k's algorithm finds the MST of a~given graph in time $\O(m\log n)$.
+
+\rem
+We will show several faster implementations in section \ref{fibonacci}.
+
+\algn{Kruskal \cite{kruskal:mst}, the Greedy algorithm}
+\algo
+\algin A~graph~$G$ with an edge comparison oracle.
+\:Sort edges of~$G$ by their increasing weight.
+\:$T\=\emptyset$. \cmt{an empty spanning subgraph}
+\:For all edges $e$ in their sorted order:
+\::If $T+e$ is acyclic, add~$e$ to~$T$.
+\::Otherwise drop~$e$.
+\algout Minimum spanning tree~$T$.
+\endalgo
+
+\lemma
+Kruskal's algorithm returns the MST of the input graph.
+
+\proof
+In every step, $T$ is a forest of blue trees. Adding~$e$ to~$T$
+in step~4 applies the Blue rule on the cut separating some pair of components of~$T$ ($e$ is the lightest,
+because all other edges of the cut have not been considered yet). Dropping~$e$ in step~5 corresponds
+to the Red rule on the cycle found ($e$~must be the heaviest, since all other edges of the
+cycle have been already processed). At the end of the algorithm, all edges are colored,
+so~$T$ must be the~MST.
+\qed
+
+\impl
+Except for the initial sorting, which in general takes $\Theta(m\log m)$ time, the only
+other non-trivial operation is the detection of cycles. What we need is a data structure
+for maintaining connected components, which supports queries and edge insertion.
+(This is also known under the name Disjoint Set Union problem, i.e., maintenance
+of an~equivalence relation on a~set with queries on whether two elements are equivalent
+and the operation of joining two equivalence classes into one.)
+The following theorem shows that it can be done with surprising efficiency.
+
+\thmn{Incremental connectivity}%
+When only edge insertions and connectivity queries are allowed, connected components
+can be maintained in $\O(\alpha(n))$ time amortized per operation.
+
+\proof
+Proven by Tarjan and van Leeuwen in \cite{tarjan:setunion}.
+\qed
+
+\FIXME{Define Ackermann's function. Use $\alpha(m,n)$?}
+
+\rem
+The cost of the operations on components is of course dwarfed by the complexity
+of sorting, so a much simpler (at least in terms of its analysis) data
+structure would be sufficient, as long as it has $\O(\log n)$ amortized complexity
+per operation. For example, we can label vertices with identifiers of the
+corresponding components and always recolor the smaller of the two components.
+
+\thm
+Kruskal's algorithm finds the MST of a given graph in time $\O(m\log n)$
+or $\O(m\timesalpha(n))$ if the edges are already sorted by their weights.
+
+\proof
+Follows from the above analysis.
+\qed
+
+%--------------------------------------------------------------------------------
+
+\section{Contractive algorithms}\id{contalg}%
+
+While the classical algorithms are based on growing suitable trees, they
+can be also reformulated in terms of edge contraction. Instead of keeping
+a forest of trees, we can keep each tree contracted to a single vertex.
+This replaces the relatively complex tree-edge incidencies by simple
+vertex-edge incidencies, potentially speeding up the calculation at the
+expense of having to perform the contractions.
+
+We will show a contractive version of the Bor\o{u}vka's algorithm
+in which these costs are carefully balanced, leading for example to
+a linear-time algorithm for MST in planar graphs.
+
+There are two definitions of edge contraction which differ when an edge of a
+triangle is contracted. Either we unify the other two edges to a single edge
+or we keep them as two parallel edges, leaving us with a~multigraph. We will
+use the multigraph version and we will show that we can easily reduce the multigraph
+to a simple graph later. (See \ref{contract} for the exact definitions.)
+
+We only need to be able to map edges of the contracted graph to the original
+edges, so each edge will carry a unique label $\ell(e)$ that will be preserved by
+contractions.
+
+\lemman{Flattening a multigraph}\id{flattening}%
+Let $G$ be a multigraph and $G'$ its subgraph such that all loops have been
+removed and each bundle of parallel edges replaced by its lightest edge.
+Then $G'$~has the same MST as~$G$.
+
+\proof
+Every spanning tree of~$G'$ is a spanning tree of~$G$. In the other direction:
+Loops can be never contained in a spanning tree. If there is a spanning tree~$T$
+containing a removed edge~$e$ parallel to an edge~$e'\in G'$, exchaning $e'$
+for~$e$ makes~$T$ lighter. \qed
+
+\rem Removal of the heavier of a pair of parallel edges can be also viewed
+as an application of the Red rule on a two-edge cycle. And indeed it is, the
+Red-Blue procedure works on multigraphs as well as on simple graphs and all the
+classical algorithms also do. We would only have to be more careful in the
+formulations and proofs, which we preferred to avoid.
+
+\algn{Contractive version of Bor\o{u}vka's algorithm}\id{contbor}
+\algo
+\algin A~graph~$G$ with an edge comparison oracle.
+\:$T\=\emptyset$.
+\:$\ell(e)\=e$ for all edges~$e$. \cmt{Initialize the labels.}
+\:While $n(G)>1$:
+\::For each vertex $v_i$ of~$G$, let $e_i$ be the lightest edge incident to~$v_i$.
+\::$T\=T\cup \{ \ell(e_i) \}$. \cmt{Remember labels of all selected edges.}
+\::Contract $G$ along all edges $e_i$, inheriting labels and weights.\foot{In other words, we ask the comparison oracle for the edge $\ell(e)$ instead of~$e$.}
+\::Flatten $G$, removing parallel edges and loops.
+\algout Minimum spanning tree~$T$.
+\endalgo
+
+\lemma\id{contiter}%
+Each iteration of the algorithm can be carried out in time~$\O(m)$.
+
+\proof
+The only non-trivial parts are steps 6 and~7. Contractions can be handled similarly
+to the unions in the original Bor\o{u}vka's algorithm (see \ref{boruvkaiter}):
+We build an auxillary graph containing only the selected edges~$e_i$, find
+connected components of this graph and renumber vertices in each component to
+the identifier of the component. This takes $\O(m)$ time.
+
+Flattening is performed by first removing the loops and then bucket-sorting the edges
+(as ordered pairs of vertex identifiers) lexicographically, which brings parallel
+edges together. The bucket sort uses two passes with $n$~buckets, so it takes
+$\O(n+m)=\O(m)$.
+\qed
+
+\thm
+The Contractive Bor\o{u}vka's algorithm finds the MST in time $\O(m\log n)$.
+
+\proof
+As in the original Bor\o{u}vka's algorithm, the number of iterations is $\O(\log n)$.
+Then apply the previous lemma.
+\qed
+
+\thmn{\cite{mm:mst}}\id{planarbor}%
+When the input graph is planar, the Contractive Bor\o{u}vka's algorithm runs in
+time $\O(n)$.
+
+\proof
+Let us denote the graph considered by the algorithm at the beginning of the $i$-th
+iteration by $G_i$ (starting with $G_0=G$) and its number of vertices and edges
+by $n_i$ and $m_i$ respectively. As we already know from the previous lemma,
+the $i$-th iteration takes $\O(m_i)$ time. We are going to prove that the
+$m_i$'s are decreasing geometrically.
+
+The number of trees in the non-contracting version of the algorithm drops
+at least by a factor of two in each iteration (Lemma \ref{boruvkadrop}) and the
+same must hold for the number of vertices in the contracting version.
+Therefore $n_i\le n/2^i$.
+
+However, every $G_i$ is planar, because the class of planar graphs is closed
+under edge deletion and contraction. The~$G_i$ is also simple as we explicitly removed multiple edges and
+loops at the end of the previous iteration. Hence we can use the standard theorem on
+the number of edges of planar simple graphs (see for example \cite{diestel:gt}) to get $m_i\le 3n_i \le 3n/2^i$.
+From this we get that the total time complexity is $\O(\sum_i m_i)=\O(\sum_i n/2^i)=\O(n)$.
+\qed
+
+\rem
+There are several other possibilities how to find the MST of a planar graph in linear time.
+For example, Matsui \cite{matsui:planar} has described an algorithm based on simultaneously
+working on the graph and its topological dual. The advantage of our approach is that we do not need
+to construct the planar embedding explicitly. We will show one more linear algorithm
+in section~\ref{minorclosed}.
+
+\rem
+To achieve the linear time complexity, the algorithm needs a very careful implementation,
+but we defer the technical details to section~\ref{bucketsort}.
+
+\para
+Graph contractions are indeed a~very powerful tool and they can be used in other MST
+algorithms as well. The following lemma shows the gist:
+
+\lemman{Contraction of MST edges}\id{contlemma}%
+Let $G$ be a weighted graph, $e$~an arbitrary edge of~$\mst(G)$, $G/e$ the multigraph
+produced by contracting $G$ along~$e$, and $\pi$ the bijection between edges of~$G-e$ and
+their counterparts in~$G/e$. Then: $$\mst(G) = \pi^{-1}[\mst(G/e)] + e.$$
+
+\proof
+% We seem not to need this lemma for multigraphs...
+%If there are any loops or parallel edges in~$G$, we can flatten the graph. According to the
+%Flattening lemma (\ref{flattening}), the MST stays the same and if we remove a parallel edge
+%or loop~$f$, then $\pi(f)$ would be removed when flattening~$G/e$, so $f$ never participates
+%in a MST.
+The right-hand side of the equality is a spanning tree of~$G$, let us denote it by~$T$ and
+the MST of $G/e$ by~$T'$. If $T$ were not minimum, there would exist a $T$-light edge~$f$ in~$G$
+(by Theorem \ref{mstthm}). If the path $T[f]$ covered by~$f$ does not contain~$e$,
+then $\pi[T[f]]$ is a path covered by~$\pi(f)$ in~$T'$. Otherwise $\pi(T[f]-e)$ is such a path.
+In both cases, $f$ is $T'$-light, which contradicts the minimality of~$T'$. (We do not have
+a~multigraph version of the theorem, but the side we need is a~straightforward edge exchange,
+which obviously works in multigraphs as well.)
+\qed
+
+\rem
+In the previous algorithm, the role of the mapping~$\pi^{-1}$ is of course played by the edge labels~$\ell$.
+
+Finally, we will show a family of graphs where the $\O(m\log n)$ bound on time complexity
+is tight. The graphs do not have unique weights, but they are constructed in a way that
+the algorithm never compares two edges with the same weight. Therefore, when two such
+graphs are monotonely isomorphic (see~\ref{mstiso}), the algorithm processes them in the same way.
+
+\defn
+A~\df{distractor of order~$k$,} denoted by~$D_k$, is a path on $n=2^k$~vertices $v_1,\ldots,v_n$
+where each edge $v_iv_{i+1}$ has its weight equal to the number of trailing zeroes in the binary
+representation of the number~$i$. The vertex $v_1$ is called a~\df{base} of the distractor.
+
+\rem
+Alternatively, we can use a recursive definition: $D_0$ is a single vertex, $D_{k+1}$ consists
+of two disjoint copies of~$D_k$ joined by an edge of weight~$k$.
+
+\figure{distractor.eps}{\epsfxsize}{A~distractor $D_3$ and its evolution (bold edges are contracted)}
+
+\lemma
+A~single iteration of the contractive algorithm reduces~$D_k$ to a graph isomorphic with~$D_{k-1}$.
+
+\proof
+Each vertex~$v$ of~$D_k$ is incident with a single edge of weight~1. The algorithm therefore
+selects all weight~1 edges and contracts them. This produces a graph which is
+exactly $D_{k-1}$ with all weights increased by~1, which does not change the relative order of edges.
+\qed
+
+\defn
+A~\df{hedgehog}~$H_{a,k}$ is a graph consisting of $a$~distractors $D_k^1,\ldots,D_k^a$ of order~$k$
+together with edges of a complete graph on the bases of the distractors. These additional edges
+have arbitrary weights, but heavier than the edges of all distractors.
+
+\figure{hedgehog.eps}{\epsfxsize}{A~hedgehog $H_{5,2}$ (quills bent to fit in the picture)}
+
+\lemma
+A~single iteration of the contractive algorithm reduces~$H_{a,k}$ to a graph isomorphic with $H_{a,k-1}$.
+
+\proof
+Each vertex is incident with an edge of some distractor, so the algorithm does not select
+any edge of the complete graph. Contraction therefore reduces each distractor to a smaller
+distractor (modulo an additive factor in weight) and leaves the complete graph intact.
+This is monotonely isomorphic to $H_{a,k-1}$.
+\qed
+
+\thmn{Lower bound for Contractive Bor\o{u}vka}%
+For each $n$ there exists a graph on $\Theta(n)$ vertices and $\Theta(n)$ edges
+such that the Contractive Bor\o{u}vka's algorithm spends time $\Omega(n\log n)$ on it.
+
+\proof
+Consider the hedgehog $H_{a,k}$ for $a=\lceil\sqrt n\rceil$ and $k=\lceil\log_2 a\rceil$.
+It has $a\cdot 2^k = \Theta(n)$ vertices and ${a \choose 2} + a\cdot 2^k = \Theta(a^2) + \Theta(a^2) = \Theta(n)$ edges
+as we wanted.
+
+By the previous lemma, the algorithm proceeds through a sequence of hedgehogs $H_{a,k},
+H_{a,k-1}, \ldots, H_{a,0}$ (up to monotone isomorphism), so it needs a logarithmic number of iterations plus some more
+to finish on the remaining complete graph. Each iteration runs on a graph with $\Omega(n)$
+edges as every $H_{a,k}$ contains a complete graph on~$a$ vertices.
+\qed
  
  \endpart