5 \chapter{Advanced MST Algorithms}
7 \section{Minor-closed graph classes}
9 The contractive algorithm given in section~\ref{contalg} has been found to perform
10 well on planar graphs, but in the general case its time complexity was not linear.
11 Can we find any broader class of graphs where the algorithm is still efficient?
12 The right context turns out to be the minor-closed graph classes, which are
13 closed under contractions and have bounded density.
16 A~graph~$H$ is a \df{minor} of a~graph~$G$ iff it can be obtained
17 from a subgraph of~$G$ by a sequence of simple graph contractions (see \ref{simpcont}).
A~class~$\cal C$ of graphs is \df{minor-closed} if for every $G\in\cal C$ and
every minor~$H$ of~$G$, the graph~$H$ lies in~$\cal C$ as well. A~class~$\cal C$ is called
22 \df{non-trivial} if at least one graph lies in~$\cal C$ and at least one lies outside~$\cal C$.
25 Non-trivial minor-closed classes include planar graphs and more generally graphs
26 embeddable in any fixed surface. Many nice properties of planar graphs extend
27 to these classes, too, most notably the linearity of the number of edges.
30 Let $\cal C$ be a class of graphs. We define its \df{edge density} $\varrho(\cal C)$
31 to be the infimum of all~$\varrho$'s such that $m(G) \le \varrho\cdot n(G)$
32 holds for every $G\in\cal C$.
34 \thmn{Density of minor-closed classes}
A~minor-closed class of graphs has finite edge density if and only if it is
non-trivial.
39 See Theorem 6.1 in \cite{nesetril:minors}, which also lists some other equivalent conditions.
42 \thmn{MST on minor-closed classes \cite{mm:mst}}\id{mstmcc}%
43 For any fixed non-trivial minor-closed class~$\cal C$ of graphs, Algorithm \ref{contbor} finds
44 the MST of any graph in this class in time $\O(n)$. (The constant hidden in the~$\O$
45 depends on the class.)
48 Following the proof for planar graphs (\ref{planarbor}), we denote the graph considered
49 by the algorithm at the beginning of the $i$-th iteration by~$G_i$ and its number of vertices
50 and edges by $n_i$ and $m_i$ respectively. Again the $i$-th phase runs in time $\O(m_i)$
51 and $n_i \le n/2^i$, so it remains to show a linear bound for the $m_i$'s.
53 Since each $G_i$ is produced from~$G_{i-1}$ by a sequence of edge contractions,
54 all $G_i$'s are minors of~$G$.\foot{Technically, these are multigraph contractions,
55 but followed by flattening, so they are equivalent to contractions on simple graphs.}
56 So they also belong to~$\cal C$ and by the previous theorem $m_i\le \varrho({\cal C})\cdot n_i$.
60 The contractive algorithm uses ``batch processing'' to perform many contractions
61 in a single step. It is also possible to perform contractions one edge at a~time,
62 batching only the flattenings. A~contraction of an edge~$uv$ can be done
63 in time~$\O(\deg(u))$ by removing all edges incident with~$u$ and inserting them back
64 with $u$ replaced by~$v$. Therefore we need to find a lot of vertices with small
degrees. The following lemma shows that this is always the case in minor-closed
classes.
68 \lemman{Low-degree vertices}\id{lowdeg}%
69 Let $\cal C$ be a graph class with density~$\varrho$ and $G\in\cal C$ a~graph
70 with $n$~vertices. Then at least $n/2$ vertices of~$G$ have degree at most~$4\varrho$.
Assume the contrary: let more than $n/2$ vertices have degree
greater than~$4\varrho$. Then $2m = \sum_v \deg(v) > n/2
\cdot 4\varrho = 2\varrho n$, which contradicts the number
of edges being at most $\varrho n$.
The proof can also be viewed
probabilistically: let $X$ be the degree of a vertex of~$G$ chosen uniformly at
random. Then ${\bb E}X \le 2\varrho$, hence by Markov's inequality
${\rm Pr}[X > 4\varrho] < 1/2$, so for at least $n/2$ vertices~$v$ we have
$\deg(v)\le 4\varrho$.
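
The counting argument is easy to check on a~concrete graph. The following Python sketch is only an illustration: the function name, the numbering of vertices from~0 and the representation of edges as pairs are our assumptions. It uses a~star on 9~vertices, which is a~tree, so $\varrho=1$ bounds its density.

```python
def low_degree_vertices(n, edges, rho):
    """Return the vertices of degree at most 4*rho (vertices are 0..n-1)."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return [v for v in range(n) if deg[v] <= 4 * rho]

# A star: the centre 0 has degree 8, every leaf has degree 1.
star = [(0, i) for i in range(1, 9)]
low = low_degree_vertices(9, star, rho=1)
```

Only the centre exceeds the threshold $4\varrho=4$, so the list contains the 8~leaves, comfortably more than the $n/2$ vertices promised by the lemma.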
86 \algn{Local Bor\o{u}vka's Algorithm \cite{mm:mst}}%
88 \algin A~graph~$G$ with an edge comparison oracle and a~parameter~$t\in{\bb N}$.
90 \:$\ell(e)\=e$ for all edges~$e$.
92 \::While there exists a~vertex~$v$ such that $\deg(v)\le t$:
93 \:::Select the lightest edge~$e$ incident with~$v$.
94 \:::Contract~$G$ along~$e$.
96 \::Flatten $G$, removing parallel edges and loops.
97 \algout Minimum spanning tree~$T$.
101 When $\cal C$ is a minor-closed class of graphs with density~$\varrho$, the
102 Local Bor\o{u}vka's Algorithm with the parameter~$t$ set to~$4\varrho$
103 finds the MST of any graph from this class in time $\O(n)$. (The constant
104 in the~$\O$ depends on~the class.)
107 Let us denote by $G_i$, $n_i$ and $m_i$ the graph considered by the
108 algorithm at the beginning of the $i$-th iteration of the outer loop,
109 and the number of its vertices and edges respectively. As in the proof
110 of the previous algorithm (\ref{mstmcc}), we observe that all the $G_i$'s
111 are minors of the graph~$G$ given as the input.
113 For the choice $t=4\varrho$, the Lemma on low-degree vertices (\ref{lowdeg})
114 guarantees that at least $n_i/2$ edges get selected in the $i$-th iteration.
Hence at least half of the vertices participate in contractions, so
116 $n_i\le 3/4\cdot n_{i-1}$. Therefore $n_i\le n\cdot (3/4)^i$ and the algorithm terminates
117 after $\O(\log n)$ iterations.
119 Each selected edge belongs to $\mst(G)$, because it is the lightest edge of
120 the trivial cut $\delta(v)$ (see the Blue Rule in \ref{rbma}).
121 The steps 6 and~7 therefore correspond to the operation
122 described by the Lemma on contraction of MST edges (\ref{contlemma}) and when
123 the algorithm stops, $T$~is indeed the minimum spanning tree.
125 It remains to analyse the time complexity of the algorithm. Since $G_i\in{\cal C}$, we have
$m_i\le \varrho n_i \le \varrho n\cdot (3/4)^i$.
127 We will show that the $i$-th iteration is carried out in time $\O(m_i)$.
128 Steps 5 and~6 run in time $\O(\deg(v))=\O(t)$ for each~$v$, so summed
129 over all $v$'s they take $\O(tn_i)$, which is linear for a fixed class~$\cal C$.
130 Flattening takes $\O(m_i)$, as already noted in the analysis of the Contracting
131 Bor\o{u}vka's Algorithm (see \ref{contiter}).
The whole algorithm therefore runs in time $\O(\sum_i m_i) = \O(\sum_i n\cdot (3/4)^i) = \O(n)$.
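
The structure of the algorithm can be sketched in Python as follows. This is a~minimal sketch under several assumptions of ours, not the implementation analysed above: the graph is connected, edge weights are distinct, edges are identified by their index in the input list, loops are dropped immediately during a~contraction instead of waiting for the flattening, and each outer iteration scans once over the vertices that had low degree at its start.

```python
def local_boruvka(n, edges, t):
    """edges: list of (u, v, weight). Returns the set of indices of MST edges."""
    ends = {i: (u, v) for i, (u, v, _) in enumerate(edges) if u != v}
    adj = {v: set() for v in range(n)}          # vertex -> ids of incident edges
    for i, (u, v) in ends.items():
        adj[u].add(i)
        adj[v].add(i)

    def weight(i):
        return edges[i][2]

    tree = set()
    while len(adj) > 1:
        progress = False
        for v in [x for x in adj if len(adj[x]) <= t]:
            if v not in adj or not 0 < len(adj[v]) <= t:
                continue          # v was merged away or its degree has grown
            e = min(adj[v], key=weight)         # lightest edge incident with v
            a, b = ends[e]
            x = b if a == v else a
            tree.add(e)
            # Contract e: move every edge of v over to x, dropping loops.
            for f in adj.pop(v):
                a, b = ends[f]
                y = b if a == v else a
                if y == x:
                    adj[x].discard(f)
                    del ends[f]
                else:
                    ends[f] = (x, y)
                    adj[x].add(f)
            progress = True
        if not progress:
            raise ValueError("no vertex of degree <= t; t is too small")
        # Flatten: keep only the lightest edge between each pair of vertices.
        best = {}
        for f, (a, b) in ends.items():
            key = (min(a, b), max(a, b))
            if key not in best or weight(f) < weight(best[key]):
                best[key] = f
        keep = set(best.values())
        for f in list(ends):
            if f not in keep:
                a, b = ends.pop(f)
                adj[a].discard(f)
                adj[b].discard(f)
    return tree

# A 4-cycle with a chord, a planar graph, so t = 12 is a valid choice.
edges = [(0, 1, 1), (1, 2, 2), (2, 3, 3), (3, 0, 4), (0, 2, 5)]
mst = local_boruvka(4, edges, t=12)
```

The contraction loop touches only the edges of~$v$, which mirrors the $\O(\deg(v))$ bound used in the proof; the sets and dictionaries are chosen for brevity, not for the best constants.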
137 For planar graphs, we can get a sharper version of the low-degree lemma,
138 showing that the algorithm works with $t=8$ as well (we had $t=12$ as
139 $\varrho=3$). While this does not change the asymptotic time complexity
140 of the algorithm, the constant-factor speedup can still delight the hearts of
143 \lemman{Low-degree vertices in planar graphs}%
Let $G$ be a planar graph with $n$~vertices. Then at least $n/2$ vertices of~$G$
145 have degree at most~8.
148 It suffices to show that the lemma holds for triangulations (if there
are any edges missing, the situation can only get better) with at
least 4 vertices (smaller graphs satisfy the lemma trivially).
Since $G$ is planar, $\sum_v \deg(v) < 6n$.
Every vertex of such a~triangulation has degree at least~3, so
the numbers $d(v):=\deg(v)-3$ are non-negative and $\sum_v d(v) < 3n$,
152 so by the same argument as in the proof of the general lemma, for at least $n/2$
153 vertices~$v$ it holds that $d(v) < 6$, hence $\deg(v) \le 8$.
157 The constant~8 in the previous lemma is the best we can have.
158 Consider a $k\times k$ triangular grid. It has $n=k^2$ vertices, $\O(k)$ of them
159 lie on the outer face and have degrees at most~6, the remaining $n-\O(k)$ interior
160 vertices have degree exactly~6. Therefore the number of faces~$f$ is $6/3\cdot n=2n$,
161 ignoring terms of order $\O(k)$. All interior triangles can be properly colored with
162 two colors, black and white. Now add a~new vertex inside each white face and connect
163 it to all three vertices on the boundary of that face. This adds $f/2 \approx n$
164 vertices of degree~3 and it increases the degrees of the original $\approx n$ interior
vertices to~9, so about half of the vertices of the new planar graph
have degree~9.
168 \figure{hexangle.eps}{\epsfxsize}{The construction from Remark~\ref{hexa}}
170 %--------------------------------------------------------------------------------
172 \section{Using Fibonacci heaps}
175 We have seen that the Jarn\'\i{}k's Algorithm \ref{jarnik} runs in $\O(m\log n)$ time
176 (and this bound can be easily shown to be tight). Fredman and Tarjan have shown a~faster
177 implementation in~\cite{ft:fibonacci} using their Fibonacci heaps. In this section,
178 we convey their results and we show several interesting consequences.
The previous implementation of the algorithm used a binary heap to store all
edges of the cut~$\delta(T)$. Instead of that, we will remember the vertices adjacent
182 to~$T$ and for each such vertex~$v$ we will keep the lightest edge~$uv$ such that $u$~lies
183 in~$T$. We will call these edges \df{active edges} and keep them in a~heap, ordered by weight.
185 When we want to extend~$T$ by the lightest edge of~$\delta(T)$, it is sufficient to
186 find the lightest active edge~$uv$ and add this edge to~$T$ together with a new vertex~$v$.
187 Then we have to update the active edges as follows. The edge~$uv$ has just ceased to
188 be active. We scan all neighbors~$w$ of the vertex~$v$. When $w$~is in~$T$, no action
189 is needed. If $w$~is outside~$T$ and it was not adjacent to~$T$ (there is no active edge
190 remembered for it so far), we set the edge~$vw$ as active. Otherwise we check the existing
191 active edge for~$w$ and replace it by~$vw$ if the new edge is lighter.
193 The following algorithm shows how these operations translate to insertions, decreases
194 and deletions on the heap.
196 \algn{Jarn\'\i{}k with active edges; Fredman and Tarjan \cite{ft:fibonacci}}\id{jarniktwo}%
198 \algin A~graph~$G$ with an edge comparison oracle.
199 \:$v_0\=$ an~arbitrary vertex of~$G$.
200 \:$T\=$ a tree containing just the vertex~$v_0$.
\:$H\=$ a~heap of active edges stored as pairs $(u,v)$ where $u\in T,v\not\in T$, ordered by the weights $w(uv)$, initially empty.
202 \:$A\=$ an~auxiliary array mapping vertices outside~$T$ to their active edges in the heap; initially all elements undefined.
203 \:\<Insert> all edges incident with~$v_0$ to~$H$ and update~$A$ accordingly.
204 \:While $H$ is not empty:
205 \::$(u,v)\=\<DeleteMin>(H)$.
207 \::For all edges $vw$ such that $w\not\in T$:
208 \:::If there exists an~active edge~$A(w)$:
209 \::::If $vw$ is lighter than~$A(w)$, \<Decrease> $A(w)$ to~$(v,w)$ in~$H$.
210 \:::If there is no such edge, then \<Insert> $(v,w)$ to~$H$ and set~$A(w)$.
211 \algout Minimum spanning tree~$T$.
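
The bookkeeping of active edges can be sketched in Python. Note the substitution: the standard \<heapq> module offers no \<Decrease>, so this sketch pushes a~new entry instead and skips the superseded ones when they surface. It therefore runs in $\O(m\log n)$ and does not reach the bound obtained with a~Fibonacci heap. The function name and the adjacency-list format are our assumptions.

```python
import heapq

def jarnik_active_edges(n, adj, v0=0):
    """adj[v] = list of (weight, neighbour). Returns MST edges as (u, v, weight)."""
    in_tree = [False] * n
    in_tree[v0] = True
    active = {}   # the array A: vertex outside T -> (weight, endpoint in T)
    heap = []     # entries (weight, u, v) with u in T

    def scan(u):
        # Update the active edges of all neighbours of the new tree vertex u.
        for w, x in adj[u]:
            if not in_tree[x] and (x not in active or w < active[x][0]):
                active[x] = (w, u)
                heapq.heappush(heap, (w, u, x))   # stands in for Insert/Decrease

    scan(v0)
    tree = []
    while heap:
        w, u, v = heapq.heappop(heap)
        if in_tree[v] or active[v] != (w, u):
            continue      # stale entry, superseded by a lighter active edge
        in_tree[v] = True
        del active[v]
        tree.append((u, v, w))
        scan(v)
    return tree

edges = [(0, 1, 1), (1, 2, 2), (2, 3, 3), (3, 0, 4), (0, 2, 5)]
adj = [[] for _ in range(4)]
for u, v, w in edges:
    adj[u].append((w, v))
    adj[v].append((w, u))
mst = jarnik_active_edges(4, adj)
```

On this 4-cycle with a~chord, the active edge of vertex~2 is first $(0,2)$ of weight~5 and is later replaced by $(1,2)$ of weight~2, which is exactly the situation handled by \<Decrease> in the algorithm.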
214 \thmn{Fibonacci heaps} The~Fibonacci heap performs the following operations
215 with the indicated amortized time complexities:
217 \:\<Insert> (insertion of a~new element) in $\O(1)$,
218 \:\<Decrease> (decreasing value of an~existing element) in $\O(1)$,
219 \:\<Merge> (merging of two heaps into one) in $\O(1)$,
220 \:\<DeleteMin> (deletion of the minimal element) in $\O(\log n)$,
221 \:\<Delete> (deletion of an~arbitrary element) in $\O(\log n)$,
\>where $n$ is the maximum number of elements present in the heap at the time of
the operation.
227 See Fredman and Tarjan \cite{ft:fibonacci} for both the description of the Fibonacci
228 heap and the proof of this theorem.
232 Algorithm~\ref{jarniktwo} with a~Fibonacci heap finds the MST of the input graph in time~$\O(m+n\log n)$.
235 The algorithm always stops, because every edge enters the heap~$H$ at most once.
236 As it selects exactly the same edges as the original Jarn\'\i{}k's algorithm,
237 it gives the correct answer.
239 The time complexity is $\O(m)$ plus the cost of the heap operations. The algorithm
240 performs at most one \<Insert> or \<Decrease> per edge and exactly one \<DeleteMin>
241 per vertex and there are at most $n$ elements in the heap at any given time,
242 so by the previous theorem the operations take $\O(m+n\log n)$ time in total.
246 For graphs with edge density at least $\log n$, this algorithm runs in linear time.
249 We can consider using other kinds of heaps which have the property that inserts
250 and decreases are faster than deletes. Of course, the Fibonacci heaps are asymptotically
251 optimal (by the standard $\Omega(n\log n)$ lower bound on sorting by comparisons, see
252 for example \cite{clrs}), so the other data structures can improve only
253 multiplicative constants or offer an~easier implementation.
255 A~nice example is a~\df{$d$-regular heap} --- a~variant of the usual binary heap
256 in the form of a~complete $d$-regular tree. \<Insert>, \<Decrease> and other operations
257 involving bubbling the values up spend $\O(1)$ time at a~single level, so they run
258 in~$\O(\log_d n)$ time. \<Delete> and \<DeleteMin> require bubbling down, which incurs
259 comparison with all~$d$ sons at every level, so they run in~$\O(d\log_d n)$.
260 With this structure, the time complexity of the whole algorithm
261 is $\O(nd\log_d n + m\log_d n)$, which suggests setting $d=m/n$, giving $\O(m\log_{m/n}n)$.
This is still linear for graphs with density at~least~$n^{\varepsilon}$ for a~fixed $\varepsilon>0$.
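
As a~quick check of the arithmetic behind this choice of~$d$ (assuming $m\ge 2n$, so that $d=m/n\ge 2$, and $m/n\ge n^\varepsilon$ for a~fixed $\varepsilon>0$):
$$
nd\log_d n + m\log_d n = 2m\log_{m/n} n = 2m\cdot{\log n\over \log(m/n)} \le 2m\cdot{\log n\over \varepsilon\log n} = {2m\over\varepsilon}.
$$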
264 Another possibility is to use the 2-3-heaps \cite{takaoka:twothree} or Trinomial
265 heaps \cite{takaoka:trinomial}. Both have the same asymptotic complexity as Fibonacci
266 heaps (the latter even in worst case, but it does not matter here) and their
267 authors claim implementation advantages.
269 \FIXME{Mention Thorup's Fibonacci-like heaps for integers?}
272 As we already noted, the improved Jarn\'\i{}k's algorithm runs in linear time
273 for sufficiently dense graphs. In some cases, it is useful to combine it with
274 another MST algorithm, which identifies a~part of the MST edges and contracts
275 the graph to increase its density. For example, we can perform several
276 iterations of the Contractive Bor\o{u}vka's algorithm and find the rest of the
277 MST by the above version of Jarn\'\i{}k's algorithm.
279 \algn{Mixed Bor\o{u}vka-Jarn\'\i{}k}
281 \algin A~graph~$G$ with an edge comparison oracle.
\:Run $\log\log n$ iterations of the Contractive Bor\o{u}vka's algorithm (\ref{contbor}),
getting a~set~$T_1$ of MST edges.
284 \:Run the Jarn\'\i{}k's algorithm with active edges (\ref{jarniktwo}) on the resulting
285 graph, getting a~MST~$T_2$.
286 \:Combine $T_1$ and~$T_2$ to~$T$ as in the Contraction lemma (\ref{contlemma}).
287 \algout Minimum spanning tree~$T$.
291 The Mixed Bor\o{u}vka-Jarn\'\i{}k algorithm finds the MST of the input graph in time $\O(m\log\log n)$.
294 Correctness follows from the Contraction lemma and from the proofs of correctness of the respective algorithms.
295 As~for time complexity: The first step takes $\O(m\log\log n)$ time
296 (by Lemma~\ref{contiter}) and it gradually contracts~$G$ to a~graph~$G'$ of size
297 $m'\le m$ and $n'\le n/\log n$. The second step then runs in time $\O(m'+n'\log n') = \O(m)$
298 and both trees can be combined in linear time, too.
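
The bounds on~$G'$ can be verified directly. Every iteration of the Contractive Bor\o{u}vka's algorithm at least halves the number of vertices, so after $\log\log n$ iterations (with binary logarithms) we get:
$$
n' \le {n\over 2^{\log\log n}} = {n\over\log n},
\qquad\hbox{hence}\qquad
m'+n'\log n' \le m + {n\over\log n}\cdot\log n = m+n.
$$
For a~connected graph we have $n\le m+1$, so the last expression is $\O(m)$.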
302 Actually, there is a~much better choice of the algorithms to combine: use the
303 improved Jarn\'\i{}k's algorithm multiple times, each time stopping after a~while.
304 The good choice of the stopping condition is to place a~limit on the size of the heap.
Start with an~arbitrary vertex, grow the tree as usual and once the heap gets too large,
306 conserve the current tree and start with a~different vertex and an~empty heap. When this
307 process runs out of vertices, it has identified a~sub-forest of the MST, so we can
308 contract the graph along the edges of~this forest and iterate.
310 \algn{Iterated Jarn\'\i{}k; Fredman and Tarjan \cite{ft:fibonacci}}
312 \algin A~graph~$G$ with an edge comparison oracle.
313 \:$T\=\emptyset$. \cmt{edges of the MST}
\:$\ell(e)\=e$ for all edges~$e$. \cmt{edge labels as usual}
316 \:While $n>1$: \cmt{We will call iterations of this loop \df{phases}.}
317 \::$F\=\emptyset$. \cmt{forest built in the current phase}
318 \::$t\=2^{2m_0/n}$. \cmt{the limit on heap size}
319 \::While there is a~vertex $v_0\not\in F$:
320 \:::Run the improved Jarn\'\i{}k's algorithm (\ref{jarniktwo}) from~$v_0$, stop when:
321 \::::all vertices have been processed, or
322 \::::a~vertex of~$F$ has been added to the tree, or
323 \::::the heap had more than~$t$ elements.
324 \:::Denote the resulting tree~$R$.
326 \::$T\=T\cup \ell[F]$. \cmt{Remember MST edges found in this phase.}
327 \::Contract~$G$ along all edges of~$F$ and flatten it.
328 \algout Minimum spanning tree~$T$.
332 For analysis of the algorithm, let us denote the graph entering the $i$-th
333 phase by~$G_i$ and likewise with the other parameters. The trees from which
334 $F_i$~has been constructed will be called $R_i^1, \ldots, R_i^{z_i}$. The
335 non-indexed $G$, $m$ and~$n$ will correspond to the graph given as~input.
Although the choice of the parameter~$t$ may seem mysterious, the following
339 lemma makes the reason clear:
342 The $i$-th phase of the Iterated Jarn\'\i{}k's algorithm runs in time~$\O(m)$.
345 During the phase, the heap always contains at most~$t_i$ elements, so it takes
346 time~$\O(\log t_i)=\O(m/n_i)$ to delete an~element from the heap. The trees~$R_i^j$
347 are disjoint, so there are at most~$n_i$ \<DeleteMin>'s over the course of the phase.
348 Each edge is considered at most twice (once per its endpoint), so the number
349 of the other heap operations is~$\O(m_i)$. Together, it equals $\O(m_i + n_i\log t_i) = \O(m_i+m) = \O(m)$.
353 Unless the $i$-th phase is final, the forest~$F_i$ consists of at most $2m_i/t_i$ trees.
356 As every edge of~$G_i$ is incident with at most two trees of~$F_i$, it is sufficient
to establish that there are at least~$t_i$ edges incident with the vertices of every
such tree~(*).
360 The forest~$F_i$ evolves by additions of the trees~$R_i^j$. Let us consider the possibilities
361 how the algorithm could have stopped growing the tree~$R_i^j$:
363 \:the heap had more than~$t_i$ elements (step~10): since the elements stored in the heap correspond
364 to some of the edges incident with vertices of~$R_i^j$, the condition~(*) is fulfilled;
365 \:the algorithm just added a~vertex of~$F_i$ to~$R_i^j$ (step~9): in this case, an~existing
366 tree of~$F_i$ is extended, so the number of edges incident with it cannot decrease;\foot{%
367 To make this true, we counted the edges incident with the \em{vertices} of the tree
368 instead of edges incident with the tree itself, because we needed the tree edges
369 to be counted as well.}
370 \:all vertices have been processed (step~8): this can happen only in the final phase.
375 The Iterated Jarn\'\i{}k's algorithm finds the MST of the input graph in time
$\O(m\timesbeta(m,n))$, where $\beta(m,n):=\min\{ i: \log^{(i)}n \le m/n \}$.
379 Phases are finite and in every phase at least one edge is contracted, so the outer
380 loop is eventually terminated. The resulting subgraph~$T$ is equal to $\mst(G)$, because each $F_i$ is
381 a~subgraph of~$\mst(G_i)$ and the $F_i$'s are glued together according to the Contraction
382 lemma (\ref{contlemma}).
384 Let us bound the sizes of the graphs processed in individual phases. As the vertices
385 of~$G_{i+1}$ correspond to the components of~$F_i$, by the previous lemma $n_{i+1}\le
386 2m_i/t_i$. Then $t_{i+1} = 2^{2m/n_{i+1}} \ge 2^{2m/(2m_i/t_i)} = 2^{(m/m_i)\cdot t_i} \ge 2^{t_i}$,
$$
\left. \vcenter{\hbox{$\displaystyle t_i \ge 2^{2^{\scriptstyle 2^{\scriptstyle\vdots^{\scriptstyle m/n}}}} $}}\;\right\}
\,\hbox{a~tower of~$i$ exponentials.}
$$
392 As soon as~$t_i\ge n$, the $i$-th phase must be final, because at that time
393 there is enough space in the heap to process the whole graph. So~there are
394 at most~$\beta(m,n)$ phases and we already know (Lemma~\ref{ijphase}) that each
395 phase runs in linear time.
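
To see how quickly the limits grow, consider a~sparse graph with $m=2n$ (the numbers serve only as an~illustration):
$$
t_1 = 2^{2m/n} = 2^4 = 16, \qquad
t_2 \ge 2^{t_1} = 2^{16} = 65536, \qquad
t_3 \ge 2^{t_2} = 2^{65536}.
$$
Any such graph with at most $65536$ vertices therefore satisfies $t_2\ge n$, so its second phase is already final.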
399 The Iterated Jarn\'\i{}k's algorithm runs in time $\O(m\log^* n)$.
402 $\beta(m,n) \le \beta(1,n) = \log^* n$.
406 When we use the Iterated Jarn\'\i{}k's algorithm on graphs with edge density
407 at least~$\log^{(k)} n$ for some $k\in{\bb N}^+$, it runs in time~$\O(km)$.
410 If $m/n \ge \log^{(k)} n$, then $\beta(m,n)\le k$.
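
The function~$\beta$ is easy to evaluate numerically. The sketch below is ours and makes two assumptions: logarithms are binary, and the stopping condition is read as $\log^{(i)} n \le m/n$, which is the reading under which the bound $\beta(m,n)\le k$ also holds when $m/n$ equals $\log^{(k)} n$ exactly. It requires $m\ge 1$.

```python
import math

def beta(m, n):
    """Smallest i such that the i-times iterated log2 of n is at most m/n."""
    x, i = float(n), 0
    while x > m / n:
        x = math.log2(x)   # x > m/n > 0 here, so the logarithm is defined
        i += 1
    return i

n = 65536
sparse = beta(n, n)        # density 1: the iterated logarithm must fall to 1
dense = beta(16 * n, n)    # density 16 = log2(n)
```

For $n=65536$ the iterated logarithms are $16$, $4$, $2$ and~$1$, so a~graph of density~1 gives the value~4, while density $16=\log n$ gives~1, in agreement with the case $k=1$ of the corollary.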
414 Gabow et al.~\cite{gabow:mst} have shown how to speed this algorithm up to~$\O(m\log\beta(m,n))$.
415 They split the adjacency lists of the vertices to small buckets, keep each bucket
416 sorted and consider only the lightest edge in each bucket until it is removed.
The mechanics of the algorithm is complex and there are a~lot of technical details
418 which need careful handling, so we omit the description of this algorithm.
420 \FIXME{Reference to Chazelle.}
422 \FIXME{Reference to Q-Heaps.}
424 %--------------------------------------------------------------------------------
426 \section{Verification of minimality}