X-Git-Url: http://mj.ucw.cz/gitweb/?a=blobdiff_plain;f=mst.tex;h=3f715954dd69cdaeb575751369442f8f562971f9;hb=2fcd9085adda5a37aa64731121c0206780586cca;hp=de4ffa939239270879bfc88716670b65d7e544a8;hpb=b4632abd2cb4f9ea27f2bf8d027a6c2cce37616f;p=saga.git diff --git a/mst.tex b/mst.tex index de4ffa9..3f71595 100644 --- a/mst.tex +++ b/mst.tex @@ -24,7 +24,7 @@ find its minimum spanning tree, defined as follows: For a given graph~$G$ with weights $w:E(G)\rightarrow {\bb R}$: \itemize\ibull \:A~subgraph $H\subseteq G$ is called a \df{spanning subgraph} if $V(H)=V(G)$. -\:A~\df{spanning tree} of $G$ is any its spanning subgraph which is a tree. +\:A~\df{spanning tree} of $G$ is any its spanning subgraph that is a tree. \:For any subgraph $H\subseteq G$ we define its \df{weight} $w(H):=\sum_{e\in E(H)} w(e)$. When comparing two weights, we will use the terms \df{lighter} and \df{heavier} in the obvious sense. @@ -41,25 +41,26 @@ spanning tree problem was one of the central topics of the flourishing new disciplines, the previous work was not well known and the algorithms had to be rediscovered several times. -Recently, several significantly faster algorithms were discovered, most notably the -$\O(m\timesbeta(m,n))$-time algorithm by Fredman and Tarjan \cite{ft:fibonacci} and -algorithms with inverse-Ackermann type complexity by Chazelle \cite{chazelle:ackermann} -and Pettie \cite{pettie:ackermann}. +In the next 50 years, several significantly faster algorithms were discovered, ranging +from the $\O(m\timesbeta(m,n))$ time algorithm by Fredman and Tarjan \cite{ft:fibonacci}, +over algorithms with inverse-Ackermann type complexity by Chazelle \cite{chazelle:ackermann} +and Pettie \cite{pettie:ackermann}, to another algorithm by Pettie \cite{pettie:optimal} +whose time complexity is provably optimal. -\FIXME{Write the rest of the history.} - -This chapter attempts to survey the important algorithms for finding the MST and it -also presents several new ones. +In the upcoming chapters, we will explore this colorful universe of MST algorithms. +We will meet the standard works of the classics, the clever ideas of their successors, +various approaches to the problem including randomization and solving of important +special cases. %-------------------------------------------------------------------------------- -\section{Basic properties} +\section{Basic properties}\id{mstbasics}% In this section, we will examine the basic properties of spanning trees and prove several important theorems to base the algorithms upon. We will follow the theory developed by Tarjan in~\cite{tarjan:dsna}. -For the whole section, we will fix a graph~$G$ with edge weights~$w$ and all +For the whole section, we will fix a~connected graph~$G$ with edge weights~$w$ and all other graphs will be spanning subgraphs of~$G$. We will use the same notation for the subgraphs as for the corresponding sets of edges. @@ -75,12 +76,13 @@ Let~$T$ be a~spanning tree. Then: the edges of this path \df{edges covered by~$e$}. \:An edge~$e$ is called \df{light with respect to~$T$} (or just \df{$T$-light}) if it covers a heavier edge, i.e., if there is an edge $f\in T[e]$ such that $w(f) > w(e)$. -\:An edge~$e$ is called \df{$T$-heavy} if it is not $T$-light. +\:An edge~$e$ is called \df{$T$-heavy} if it covers a~lighter edge. \endlist \rem -Please note that the above properties also apply to tree edges -which by definition cover only themselves and therefore they are always heavy. +Edges of the tree~$T$ itself are neither heavy nor light. We will sometimes +use the name \df{non-heavy} for edges which are either light or contained +in the tree. \lemman{Light edges}\id{lightlemma}% Let $T$ be a spanning tree. If there exists a $T$-light edge, then~$T$ @@ -105,7 +107,7 @@ to any other spanning tree by a sequence of exchanges. \lemman{Exchange property for trees}\id{xchglemma}% Let $T$ and $T'$ be spanning trees of a common graph. Then there exists -a sequence of edge exchanges which transforms $T$ to~$T'$. More formally, +a sequence of edge exchanges that transforms $T$ to~$T'$. More formally, there exists a sequence of spanning trees $T=T_0,T_1,\ldots,T_k=T'$ such that $T_{i+1}=T_i - e_i + e_i^\prime$ where $e_i\in T_i$ and $e_i^\prime\in T'$. @@ -243,18 +245,18 @@ of choosing the rules in this procedure, which justifies the name meta-algorithm We will denote the unique minimum spanning tree of the input graph by~$T_{min}$. We intend to prove that this is also the output of the procedure. -\lemman{Blue lemma}% +\lemman{Blue lemma}\id{bluelemma}% When an edge is colored blue in any step of the procedure, it is contained in the minimum spanning tree. \proof By contradiction. Let $e$ be an edge painted blue as the lightest edge of a cut~$C$. -If $e\not\in T_{min}$, then there must exist an edge $e'\in T_{min}$ which is +If $e\not\in T_{min}$, then there must exist an edge $e'\in T_{min}$ that is contained in~$C$ (take any pair of vertices separated by~$C$, the path in~$T_{min}$ joining these vertices must cross~$C$ at least once). Exchanging $e$ for $e'$ in $T_{min}$ yields an even lighter spanning tree since $w(e) +and \. The \ operation tests whether two elements are equivalent and \ +joins two different equivalence classes into one. + +\para +We can maintain the connected components of our forest~$T$ as equivalence classes. When we want +to add an~edge~$uv$, we first call $\(u,v)$ to check if both endpoints of the edge lie in +the same components. If they do not, addition of this edge connects both components into one, +so we perform $\(u,v)$ to merge the equivalence classes. -\thmn{Incremental connectivity}% -When only edge insertions and connectivity queries are allowed, connected components -can be maintained in $\O(\alpha(n))$ time amortized per operation. +Tarjan and van Leeuwen have shown that there is a~data structure for the DSU problem +with surprising efficiency: + +\thmn{Disjoint Set Union, Tarjan and van Leeuwen \cite{tarjan:setunion}}\id{dfu}% +Starting with a~trivial equivalence with single-element classes, a~sequence of operations +comprising of $n$~\s intermixed with $m\ge n$~\s can be processed in time +$\O(m\timesalpha(m,n))$, where $\alpha(m,n)$ is a~certain inverse of the Ackermann's function +(see Definition \ref{ackerinv}). \proof -Proven by Tarjan and van Leeuwen in \cite{tarjan:setunion}. +See \cite{tarjan:setunion}. \qed -\FIXME{Define Ackermann's function. Use $\alpha(m,n)$?} +This completes the following theorem: + +\thm\id{kruskal}% +Kruskal's algorithm finds the MST of a given graph in time $\O(m\log n)$. +If the edges are already sorted by their weights, the time drops to +$\O(m\timesalpha(m,n))$. + +\proof +We spend $\O(m\log n)$ on sorting, $\O(m\timesalpha(m,n))$ on processing the sequence +of \s and \s, and $\O(m)$ on all other work. +\qed \rem -The cost of the operations on components is of course dwarfed by the complexity +The cost of the \ and \ operations is of course dwarfed by the complexity of sorting, so a much simpler (at least in terms of its analysis) data structure would be sufficient, as long as it has $\O(\log n)$ amortized complexity per operation. For example, we can label vertices with identifiers of the -corresponding components and always recolor the smaller of the two components. - -\thm -Kruskal's algorithm finds the MST of a given graph in time $\O(m\log n)$ -or $\O(m\timesalpha(n))$ if the edges are already sorted by their weights. +corresponding components and always relabel the smaller of the two components. -\proof -Follows from the above analysis. -\qed +We will study dynamic maintenance of connected components in more detail in Chapter~\ref{dynchap}. %-------------------------------------------------------------------------------- @@ -486,7 +547,7 @@ We will show a contractive version of the Bor\o{u}vka's algorithm in which these costs are carefully balanced, leading for example to a linear-time algorithm for MST in planar graphs. -There are two definitions of edge contraction which differ when an edge of +There are two definitions of edge contraction that differ when an edge of a~triangle is contracted. Either we unify the other two edges to a single edge or we keep them as two parallel edges, leaving us with a~multigraph. We will use the multigraph version and we will show that we can easily reduce the multigraph @@ -519,35 +580,50 @@ formulations and proofs, which we preferred to avoid. \:$T\=\emptyset$. \:$\ell(e)\=e$ for all edges~$e$. \cmt{Initialize the labels.} \:While $n(G)>1$: -\::For each vertex $v_i$ of~$G$, let $e_i$ be the lightest edge incident to~$v_i$. -\::$T\=T\cup \{ \ell(e_i) \}$. \cmt{Remember labels of all selected edges.} -\::Contract $G$ along all edges $e_i$, inheriting labels and weights.\foot{In other words, we ask the comparison oracle for the edge $\ell(e)$ instead of~$e$.} +\::For each vertex $v_k$ of~$G$, let $e_k$ be the lightest edge incident to~$v_k$. +\::$T\=T\cup \{ \ell(e_k) \}$. \cmt{Remember labels of all selected edges.} +\::Contract $G$ along all edges $e_k$, inheriting labels and weights.\foot{In other words, we ask the comparison oracle for the edge $\ell(e)$ instead of~$e$.} \::Flatten $G$, removing parallel edges and loops. \algout Minimum spanning tree~$T$. \endalgo +\nota +For the analysis of the algorithm, we will denote the graph considered by the algorithm +at the beginning of the $i$-th iteration by $G_i$ (starting with $G_0=G$) and the number +of vertices and edges of this graph by $n_i$ and $m_i$ respectively. + \lemma\id{contiter}% -Each iteration of the algorithm can be carried out in time~$\O(m)$. +The $i$-th iteration of the algorithm (also called the \df{Bor\o{u}vka step}) can be carried +out in time~$\O(m_i)$. \proof The only non-trivial parts are steps 6 and~7. Contractions can be handled similarly to the unions in the original Bor\o{u}vka's algorithm (see \ref{boruvkaiter}): -We build an auxillary graph containing only the selected edges~$e_i$, find +We build an auxillary graph containing only the selected edges~$e_k$, find connected components of this graph and renumber vertices in each component to -the identifier of the component. This takes $\O(m)$ time. +the identifier of the component. This takes $\O(m_i)$ time. Flattening is performed by first removing the loops and then bucket-sorting the edges (as ordered pairs of vertex identifiers) lexicographically, which brings parallel -edges together. The bucket sort uses two passes with $n$~buckets, so it takes -$\O(n+m)=\O(m)$. +edges together. The bucket sort uses two passes with $n_i$~buckets, so it takes +$\O(n_i+m_i)=\O(m_i)$. \qed -\thm -The Contractive Bor\o{u}vka's algorithm finds the MST in time $\O(m\log n)$. +\thm\id{contborthm}% +The Contractive Bor\o{u}vka's algorithm finds the MST of the input graph in +time $\O(\min(n^2,m\log n))$. \proof As in the original Bor\o{u}vka's algorithm, the number of iterations is $\O(\log n)$. -Then apply the previous lemma. +When combined with the previous lemma, it gives an~$\O(m\log n)$ upper bound. + +To get the $\O(n^2)$ bound, we observe that the number of trees in the non-contracting +version of the algorithm drops at least by a factor of two in each iteration (Lemma \ref{boruvkadrop}) +and the same must hold for the number of vertices in the contracting version. +Therefore $n_i\le n/2^i$. While the number of edges need not decrease geometrically, +we still have $m_i\le n_i^2$ as the graphs~$G_i$ are simple (we explicitly removed multiple +edges and loops at the end of the previous iteration). Hence the total time spent +in all iterations is $\O(\sum_i n_i^2) = \O(\sum_i n^2/4^i) = \O(n^2)$. \qed \thmn{Contractive Bor\o{u}vka on planar graphs, \cite{mm:mst}}\id{planarbor}% @@ -555,22 +631,12 @@ When the input graph is planar, the Contractive Bor\o{u}vka's algorithm runs in time $\O(n)$. \proof -Let us denote the graph considered by the algorithm at the beginning of the $i$-th -iteration by $G_i$ (starting with $G_0=G$) and its number of vertices and edges -by $n_i$ and $m_i$ respectively. As we already know from the previous lemma, -the $i$-th iteration takes $\O(m_i)$ time. We are going to prove that the -$m_i$'s are decreasing geometrically. - -The number of trees in the non-contracting version of the algorithm drops -at least by a factor of two in each iteration (Lemma \ref{boruvkadrop}) and the -same must hold for the number of vertices in the contracting version. -Therefore $n_i\le n/2^i$. - -However, every $G_i$ is planar, because the class of planar graphs is closed -under edge deletion and contraction. The~$G_i$ is also simple as we explicitly removed multiple edges and -loops at the end of the previous iteration. Hence we can use the standard theorem on +Let us refine the previous proof. We already know that $n_i \le n/2^i$. We will +prove that when~$G$ is planar, the $m_i$'s are decreasing geometrically. We know that every +$G_i$ is planar, because the class of planar graphs is closed under edge deletion and +contraction. Moreover, $G_i$~is also simple, so we can use the standard theorem on the number of edges of planar simple graphs (see for example \cite{diestel:gt}) to get $m_i\le 3n_i \le 3n/2^i$. -From this we get that the total time complexity is $\O(\sum_i m_i)=\O(\sum_i n/2^i)=\O(n)$. +The total time complexity of the algorithm is therefore $\O(\sum_i m_i)=\O(\sum_i n/2^i)=\O(n)$. \qed \rem @@ -584,7 +650,7 @@ in section~\ref{minorclosed}. To achieve the linear time complexity, the algorithm needs a very careful implementation, but we defer the technical details to section~\ref{bucketsort}. -\para +\paran{General contractions}% Graph contractions are indeed a~very powerful tool and they can be used in other MST algorithms as well. The following lemma shows the gist: @@ -611,7 +677,7 @@ which obviously works in multigraphs as well.) \rem In the previous algorithm, the role of the mapping~$\pi^{-1}$ is of course played by the edge labels~$\ell$. -\para +\paran{A~lower bound}% Finally, we will show a family of graphs where the $\O(m\log n)$ bound on time complexity is tight. The graphs do not have unique weights, but they are constructed in a way that the algorithm never compares two edges with the same weight. Therefore, when two such @@ -669,4 +735,19 @@ to finish on the remaining complete graph. Each iteration runs on a graph with $ edges as every $H_{a,k}$ contains a complete graph on~$a$ vertices. \qed +\remn{Disconnected graphs}\id{disconn}% +The basic properties of minimum spanning trees and the algorithms presented in +this chapter apply to minimum spanning forests of disconnected graphs, too. +The proofs of our theorems and the steps of our algorithms are based on adjacency +of vertices and existence of paths, so they are always local to a~single +connected component. The Bor\o{u}vka's and Kruskal's algorithm need no changes, +the Jarn\'\i{}k's algorithm has to be invoked separately for each component. + +We can also extend the notion of light and heavy edges with respect +to a~tree to forests: When an~edge~$e$ connects two vertices lying in the same +tree~$T$ of a~forest~$F$, it is $F$-heavy iff it is $T$-heavy (similarly +for $F$-light). Edges connecting two different trees are always considered +$F$-light. Again, a~spanning forest~$F$ is minimum iff there are no $F$-light +edges. + \endpart