7 \section{Special cases and related problems}
9 Towards the end of our story of the minimum spanning trees, we will now focus our attention
10 on various special cases of our problem and also to several related problems that
11 frequently arise in practice.
13 \paran{Graphs with sorted edges}
14 When the edges are already sorted by their weights, we can use the Kruskal's
15 algorithm to find the MST in time $\O(m\timesalpha(n))$ (Theorem \ref{kruskal}).
16 We however can do better: As the minimality of a~spanning tree depends only on the
17 order of weights and not on the actual values (The Minimality Theorem, \ref{mstthm}), we can
18 renumber the weights to $1, \ldots, m$ and find the MST using the Fredman-Willard
19 algorithm for integer weights. According to Theorem \ref{intmst} it runs in
20 time $\O(m)$ on the Word-RAM.
22 \paran{Graphs with a~small number of distinct weights}
23 When the weights of edges are drawn from a~set of a~fixed size~$U$, we can
24 sort them in linear time and so reduce the problem to the previous case.
25 A~more practical way is to use the Jarn\'\i{}k's algorithm (\ref{jarnimpl}),
26 but replace the heap by an~array of $U$~buckets. As the number of buckets
27 is constant, we can find the minimum in constant time and hence the whole
28 algorithm runs in time $\O(m)$, even on the Pointer Machine. For large
29 values of~$U,$ we can build a~binary search tree or the van Emde-Boas
30 tree (see Section \ref{ramdssect} and \cite{boas:vebt}) on the top of the buckets to bring the complexity
31 of finding the minimum down to $\O(\log U)$ or $\O(\log\log U)$ respectively.
33 \paran{Graphs with floating-point weights}
34 A~common case of non-integer weights are rational numbers in floating-point (FP)
35 representation. Even in this case we will be able to find the MST in linear time.
36 The most common representation of binary FP numbers specified by the IEEE
37 standard 754-1985 \cite{ieee:binfp} has a~useful property: When the
38 bit strings encoding non-negative FP numbers are read as ordinary integers,
39 the order of these integers is the same as of the original FP numbers. We can
40 therefore once again replace the edge weights by integers and use the linear-time
41 integer algorithm. While the other FP representations (see \cite{dgoldberg:fp} for
42 an~overview) need not have this property, the corresponding integers can be adjusted
43 in $\O(1)$ time to the format we need. (More advanced tricks of this type have been
44 employed by Thorup in \cite{thorup:floatint} to extend his linear-time algorithm
45 for single-source shortest paths to FP edge lengths.)
47 \paran{Graphs with bounded degrees}
48 For graphs with vertex degrees bounded by a~constant~$\Delta$, the problem is either
49 trivial (if $\Delta<3$) or as hard as for arbitrary graphs. There is a~simple linear-time
50 transform of arbitrary graphs to graphs with maximum degree~3 which preserves the MST:
52 \lemman{Degree reduction}\id{degred}%
53 For every graph~$G$ there exists a~graph~$G'$ with maximum degree at most~3 and
54 a~function $\pi: E(G)\rightarrow E(G')$ such that $\mst(G) = \pi^{-1}(\mst(G'))$.
55 The graph $G'$ and the embedding~$\pi$ can be constructed in time $\O(m)$.
57 \figure{french.eps}{\epsfxsize}{Degree reduction in Lemma~\ref{degred}}
60 We show how to eliminate a~single vertex~$v$ of degree $d>3$ and then apply
63 Assume that $v$~has neighbors $w_1,\ldots,w_d$. We replace~$v$ and the edges~$vw_i$
64 by $d$~new vertices $v_1,\ldots,v_d$, joined by a~path $v_1v_2\ldots v_d$, and
65 edges~$v_iw_i$. Each edge of the path will receive a~weight smaller than all
66 original weights, the other edges will inherit the weights of the edges $vw_i$
67 they replace. The edges of the path will therefore lie in the MST (this is
68 obvious from the Kruskal's algorithm) and as~$G$ can be obtained from the
69 new~$G'$ by contracting the path, the rest follows from the Contraction lemma
72 This step can be carried out in time $\O(d)$. As it replaces a high-degree
73 vertex by vertices of degree~3, the whole procedure stops in at most~$n$ such
74 steps, so it takes time $\O(\sum_{v\in V}\deg(v)) = \O(m)$ including the
75 time needed to find the high-degree vertices at the beginning.
79 The MST also has its counterparts in the realm of geometric algorithms. Suppose
80 that we have $n$~points $x_1,\ldots,x_n$ in the plane and we want to find the
81 shortest system of segments connecting these points. If we want the segments to
82 touch only in the given points, this is equivalent to finding the MST of the
83 complete graph on the vertices $V=\{x_1,\ldots,x_n\}$ with edge weights
84 defined as the Euclidean distances of the points. Since the graph is dense, many of the MST
85 algorithms discussed run in linear time with the size of the graph, hence
88 There is a~more efficient method based on the observation that the MST
89 is always a~subgraph of the Delaunay's tesselation for the given points
90 (this was first noted by Shamos and Hoey in~\cite{shamos:closest}). The
91 tesselation is a~planar graph, which guarantees that it has $\O(n)$ edges,
92 and it is a~dual graph of the Voronoi diagram of the given points, which can
93 be constructed in time $\O(n\log n)$ using for example the Fortune's
94 algorithm \cite{fortune:voronoi}. We can therefore reduce the problem
95 to finding the MST of the tesselation for which $\O(n\log n)$ time
96 is more than sufficient.
98 This approach fails for non-Euclidean metrics, but in some cases
99 (in particular for the rectilinear metric) the $\O(n\log n)$ time bound
100 is also achievable by the algorithm of Zhou et al.~\cite{zhou:nodel}
101 based on the sweep-line technique and the Red rule. For other
102 variations on the geometric MST, see Eppstein's survey paper
103 \cite{eppstein:spanning}.
105 \paran{Steiner trees}
106 The constraint that the segments in the previous example are allowed to touch
107 each other only in the given points looks artificial and it is indeed uncommon in
108 practical applications (including the problem of designing electrical transmission
109 lines originally studied by Bor\o{u}vka). If we lift this restriction, we get
110 the problem known by the name Steiner tree.\foot{It is named after the Swiss mathematician
111 Jacob Steiner who studied a~special case of this problem in the 19th century.}
112 We can also define it in terms of graphs:
114 \defn A~\df{Steiner tree} of a~weighted graph~$(G,w)$ with a~set~$M\subseteq V$
115 of \df{mandatory notes} is a~tree~$T\subseteq G$ that contains all the mandatory
116 vertices and its weight is minimum possible.
118 For $M=V$ the Steiner tree is identical to the MST, but if we allow an~arbitrary
119 choice of the mandatory vertices, it is NP-hard. This has been proven by Garey and Johnson
120 \cite{garey:steiner,garey:rectisteiner} for not only the graph version with
121 weights $\{1,2\}$, but also for the planar version with Euclidean or rectilinear
122 metric. There is a~polynomial approximation algorithm with ratio $5/3$ for
123 graphs due to Pr\"omel and Steger \cite{proemel:steiner} and a~polynomial-time
124 approximation scheme for the Euclidean Steiner tree in an~arbitrary dimension
125 by Arora \cite{arora:tspapx}.
127 \paran{Approximating the weight of the MST}
128 Sometimes we are not interested in the actual edges forming the MST and only
129 the weight matters. If we are willing to put up with a~randomized approximation,
130 we can even achieve sub-linear complexity. Chazelle et al.~\cite{chazelle:mstapprox}
131 have shown an~algorithm which, given $0 < \varepsilon < 1/2$, approximates
132 the weight of the MST of a~graph with average degree~$d$ and edge weights from the set
133 $\{1,\ldots,w\}$ in time $\O(dw\varepsilon^{-2}\cdot\log(dw/\varepsilon))$,
134 producing a~weight which has relative error at most~$\varepsilon$ with probability
135 at least $3/4$. They have also proven an~almost matching lower bound $\Omega(dw\varepsilon^{-2})$.
137 For the $d$-dimensional Euclidean case, there is a~randomized approximation
138 algorithm by Czumaj et al.~\cite{czumaj:euclidean} which with high probability
139 produces a~spanning tree within relative error~$\varepsilon$ in $\widetilde\O(\sqrt{n}\cdot \poly(1/\varepsilon))$\foot{%
140 $\widetilde\O(f) = \O(f\cdot\log^{\O(1)} f)$ and $\poly(n)=n^{\O(1)}$.}
141 queries to a~data structure containing the points. The data structure is expected
142 to answer orthogonal range queries and cone approximate nearest neighbor queries.
143 There is also a~$\widetilde\O(n\cdot \poly(1/\varepsilon))$ time approximation
144 algorithm for the MST weight in arbitrary metric spaces by Czumaj and Sohler \cite{czumaj:metric}.
145 (This is still sub-linear since the corresponding graph has roughly $n^2$ edges.)