From: Martin Mares <mj@ucw.cz>
Date: Wed, 5 Mar 2008 16:58:05 +0000 (+0100)
Subject: Special cases.
X-Git-Tag: printed~189
X-Git-Url: http://mj.ucw.cz/gitweb/?a=commitdiff_plain;h=b66b5869a0ccfd85ada5d012fe70bfc0119bda11;p=saga.git

Special cases.
---
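
Note on the floating-point example added to adv.tex below: the text relies on
the fact that the bit strings of non-negative IEEE 754 numbers, read as
integers, are ordered like the numbers themselves. The following is only
a minimal sketch of that property, not part of the patch; it assumes binary64
weights, and the helper name float_key is made up for illustration.

```python
import math
import struct


def float_key(x: float) -> int:
    """Read the IEEE 754 binary64 encoding of a non-negative float as an unsigned integer.

    Hypothetical helper: negative numbers (including -0.0) and NaNs are
    rejected, because the ordering property is only claimed for
    non-negative values.
    """
    if x != x or math.copysign(1.0, x) < 0:
        raise ValueError("only non-negative, non-NaN weights are handled in this sketch")
    # Pack as a big-endian double, then unpack the same 8 bytes as a 64-bit unsigned integer.
    return struct.unpack(">Q", struct.pack(">d", x))[0]


weights = [3.5, 0.0, 1e-300, 2.0, 1e300, 0.1]
# Sorting by the integer keys gives the same order as sorting the floats directly.
assert sorted(weights, key=float_key) == sorted(weights)
```

With such keys the edge weights can be handed to an integer MST algorithm
unchanged; negative weights would need an extra adjustment that this sketch
does not attempt.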

diff --git a/PLAN b/PLAN
index 9b562ec..40a1092 100644
--- a/PLAN
+++ b/PLAN
@@ -21,7 +21,7 @@
   .  Randomized algorithms
   .  ?? Chazelle ??
   .  ?? Pettie ??
-  .  Other classes of graphs
+  o  Special cases and related problems
 
 *  Ranking combinatorial objects
 
@@ -48,17 +48,15 @@ Spanning trees:
 
 - cite Eisner's tutorial \cite{eisner:tutorial}
 - \cite{pettie:onlineverify} online lower bound
-- mention Steiner trees
 - mention matroids
 - mention disconnected graphs
-- Euclidean MST
 - Some algorithms (most notably Fredman-Tarjan) do not need flattening
 - reference to mixed Boruvka-Jarnik
 - use the notation for contraction by a set
 - practical considerations: katriel:cycle, moret:practice (mention pairing heaps)
 - parallel algorithms: p243-cole (are there others?)
-- mention 3-regular graphs; bounded expansion?
-- floating-point weights
+- bounded expansion classes?
+- restricted cases and arborescences
 
 Models:
 
@@ -79,7 +77,6 @@ Ranking:
 
 Notation:
 
-- \O(...) as a set?
 - G has to be connected, so m=O(n)
 - impedance mismatch in terminology: contraction of G along e vs. contraction of e.
 - use \delta(X) notation
diff --git a/adv.tex b/adv.tex
index 77ac939..007217b 100644
--- a/adv.tex
+++ b/adv.tex
@@ -545,10 +545,10 @@ which need careful handling, so we omit the description of this algorithm.
 
 %--------------------------------------------------------------------------------
 
-\section{Special classes of graphs}
+\section{Special cases and related problems}
 
-Finally, we will focus our attention on various special classes of graphs
-which frequently occur in practice.
+Finally, we will focus our attention on various special cases of the minimum
+spanning tree problem, which frequently arise in practice.
 
 \examplen{Graphs with sorted edges}
 When the edges are already sorted by their weights, we can use the Kruskal's
@@ -559,19 +559,125 @@ renumber the weights to $1, \ldots, m$ and find the MST using the Fredman-Willar
 algorithm for integer weights. According to Theorem \ref{intmst} it runs in
 time $\O(m)$ on the Word-RAM.
 
-\examplen{Graphs with non-unique edge weights}
-
 \examplen{Graphs with a~small number of distinct weights}
+When the weights of edges are drawn from a~set of a~fixed size~$U$, we can
+sort them in linear time and so reduce the problem to the previous case.
+A~more practical way is to use Jarn\'\i{}k's algorithm (\ref{jarnimpl}),
+but replace the heap by an~array of $U$~buckets. As the number of buckets
+is constant, we can find the minimum in constant time and hence the whole
+algorithm runs in time $\O(m)$, even on the Pointer Machine. For large
+values of~$U$, we can build a~binary search tree or a~van Emde Boas
+tree (see Section \ref{ramdssect} and \cite{boas:vebt}) on top of the buckets to bring the complexity
+of finding the minimum down to $\O(\log U)$ or $\O(\log\log U)$ respectively.
 
 \examplen{Graphs with floating-point weights}
+A~common case of non-integer weights is that of rational numbers in floating-point (FP)
+representation. Even in this case we will be able to find the MST in linear time.
+The most common representation of binary FP numbers (as specified by the IEEE
+standard 754-1985 \cite{ieee:binfp}) has the nice property that when the
+bit strings encoding non-negative FP numbers are read as ordinary integers,
+the order of these integers is the same as that of the original FP numbers. We can
+therefore once again replace the edge weights by integers and use the linear-time
+integer algorithm. While the other FP representations (see \cite{dgoldberg:fp} for
+an~overview) do not have this property, the corresponding integers can be adjusted
+in $\O(1)$ time to the format we need. (More advanced tricks of this type have been
+employed by Thorup in \cite{thorup:floatint} to extend the integer algorithm
+for single-source shortest paths to FP numbers.)
 
 \examplen{Graphs with bounded degrees}
+For graphs with vertex degrees bounded by a~constant~$\Delta$, the problem is either
+trivial (if $\Delta<3$) or as hard as for arbitrary graphs. There is a~simple linear-time
+transformation of arbitrary graphs to graphs with maximum degree at most~3 which preserves the MST:
 
-\examplen{Euclidean MST}
+\lemman{Degree reduction}
+For every graph~$G$ there exists a~graph~$G'$ with maximum degree at most~3 and
+a~function $\pi: E(G)\rightarrow E(G')$ such that $\mst(G) = \pi^{-1}(\mst(G'))$.
+The graph $G'$ and the embedding~$\pi$ can be constructed in time $\O(m)$.
+
+\proof
+We show how to eliminate a~single vertex~$v$ of degree $d>3$; the rest
+follows by induction.
+
+Assume that $v$~has neighbors $w_1,\ldots,w_d$. We replace~$v$ and the edges~$vw_i$
+by a~path $v_1v_2\ldots v_d$ and edges~$v_iw_i$. Each edge of the path will receive
+a~weight smaller than all other weights, while the other edges will inherit the weights
+of the edges $vw_i$ they replace. The edges of the path will therefore lie in the
+MST (this is obvious from Kruskal's algorithm) and as~$G$ can be obtained from
+the new~$G'$ by contracting the path, the rest follows from the Contraction lemma
+(\ref{contlemma}).
+
+This step can be carried out in time $\O(d)$. As it replaces a~high-degree
+vertex by vertices of degree at most~3, the procedure stops after at most~$n$ such
+steps, so it takes time $\O(\sum_{v\in V}\deg(v)) = \O(m)$ including the
+time needed to find the high-degree vertices at the beginning.
+\qed
 
-\examplen{Approximating the MST}
-\cite{chazelle:mstapprox},
-\cite{czumaj:euclidean},
-\cite{czumaj:metric}.
+\examplen{Euclidean MST}
+The MST also has its analogies in the realm of geometric algorithms. Suppose
+that we have $n$~points $x_1,\ldots,x_n$ in the plane and we want to find the
+shortest system of segments connecting these points. If we want the segments to
+touch only in the given points, this is equivalent to finding an~MST of the
+complete graph on the vertices $V=\{x_1,\ldots,x_n\}$ with edge weights
+equal to the Euclidean distances. Since the graph is dense, many of the MST
+algorithms discussed run in time linear in the size of the graph, hence
+in time $\O(n^2)$.
+
+There is a~more efficient method based on the observation that the MST
+is always a~subgraph of the Delaunay tessellation of the given points
+(this was first noted by Shamos and Hoey in~\cite{shamos:closest}). The
+tessellation is a~planar graph, which guarantees that it has $\O(n)$ edges,
+and it is the dual graph of the Voronoi diagram of the points, which can
+be constructed in time $\O(n\log n)$ using for example Fortune's
+algorithm \cite{fortune:voronoi}. We can therefore reduce the problem
+to finding the MST of the tessellation, for which $\O(n\log n)$ time
+is more than enough.
+
+This approach fails for non-Euclidean metrics, but in some cases
+(in particular for the rectilinear metric) the $\O(n\log n)$ time
+is also achievable by the algorithm of Zhou et al.~\cite{zhou:nodel}
+based on the sweep-line technique and the Red rule. For other
+variations on the geometric MST, see Eppstein's survey paper
+\cite{eppstein:spanning}.
+
+\examplen{Steiner trees}
+The constraint that the segments in the previous example are allowed to touch
+each other only in the given points looks artificial and it is uncommon in
+practical applications (including the application to electrical transmission
+lines originally studied by Bor\o{u}vka). If we lift this restriction, we get
+the problem known under the name Steiner tree.\foot{It is named after the Swiss mathematician
+Jakob Steiner who studied a~special case of this problem in the 19th century.}
+We can also define it in terms of graphs:
+
+\defn A~\df{Steiner tree} of a~weighted graph~$(G,w)$ with a~set~$M\subseteq V$
+of \df{mandatory vertices} is a~tree~$T\subseteq G$ which contains all the mandatory
+vertices and whose weight is the minimum possible.
+
+Finding the Steiner tree has been proven to be NP-hard by Garey and
+Johnson \cite{garey:steiner,garey:rectisteiner} in both the graph version
+(even for weights $\{1,2\}$) and in the planar version with Euclidean or rectilinear
+metric. There is a~polynomial approximation algorithm with ratio $5/3$ for
+graphs due to Pr\"omel and Steger \cite{proemel:steiner} and a~polynomial-time
+approximation scheme for the Euclidean Steiner tree in an~arbitrary dimension
+by Arora \cite{arora:tspapx}.
+
+\examplen{Approximating the weight of the MST}
+Sometimes we are not interested in the actual edges forming the MST and only
+the weight matters. If we are willing to put up with a~randomized approximation,
+we can even achieve sub-linear complexity. Chazelle et al.~\cite{chazelle:mstapprox}
+have shown an~algorithm which, given $0 < \varepsilon < 1/2$, approximates
+the weight of the MST of a~graph with average degree~$d$ and edge weights from the set
+$\{1,\ldots,w\}$ in time $\O(dw\varepsilon^{-2}\cdot\log(dw/\varepsilon))$,
+producing a~weight which has relative error at most~$\varepsilon$ with probability
+at least $3/4$. They have also proven an~almost matching lower bound $\Omega(dw\varepsilon^{-2})$.
+
+For the $d$-dimensional Euclidean case, there is a~randomized approximation
+algorithm by Czumaj et al.~\cite{czumaj:euclidean} which with high probability
+approximates the weight of the MST within relative error~$\varepsilon$ in $\widetilde\O(\sqrt{n}\cdot \poly(1/\varepsilon))$\foot{%
+$\widetilde\O(f(n)) = \O(f(n)\cdot\log^{\O(1)} f(n))$ and $\poly(n)=n^{\O(1)}$.}
+queries to a~data structure containing the points. The data structure is expected
+to answer orthogonal range queries and cone approximate nearest neighbor queries.
+There is also an~$\widetilde\O(n\cdot \poly(1/\varepsilon))$ time approximation
+algorithm for MST weight in arbitrary metric spaces by Czumaj and Sohler \cite{czumaj:metric}.
+(This is still sub-linear since the corresponding graph has roughly $n^2$ edges.)
 
 \endpart
diff --git a/biblio.bib b/biblio.bib
index be57017..6d9df00 100644
--- a/biblio.bib
+++ b/biblio.bib
@@ -927,3 +927,109 @@ inproceedings{ pettie:minirand,
  publisher = {Society for Industrial and Applied Mathematics},
  address = {Philadelphia, PA, USA},
 }
+
+@book{ ieee:binfp,
+  author = {IEEE},
+  title = {IEEE Standard 754-1985 for Binary Floating-point Arithmetic},
+  year = {1985},
+  publisher = {IEEE}
+}
+
+@article{ dgoldberg:fp,
+  title={{What every computer scientist should know about floating-point arithmetic}},
+  author={Goldberg, D.},
+  journal={ACM Computing Surveys (CSUR)},
+  volume={23},
+  number={1},
+  pages={5--48},
+  year={1991},
+  publisher={ACM Press New York, NY, USA}
+}
+
+@article{ thorup:floatint,
+  title={{Floats, Integers, and Single Source Shortest Paths}},
+  author={Thorup, M.},
+  journal={Journal of Algorithms},
+  volume={35},
+  number={2},
+  pages={189--201},
+  year={2000},
+  publisher={Academic Press}
+}
+
+@article{ shamos:closest,
+  title={{Closest-point problems}},
+  author={Shamos, M. I. and Hoey, D.},
+  journal={Proceedings of the 16th Annual IEEE Symposium on Foundations of Computer Science},
+  pages={151--162},
+  year={1975}
+}
+
+@article{ fortune:voronoi,
+  title={{A sweepline algorithm for Voronoi diagrams}},
+  author={Fortune, S.},
+  journal={Algorithmica},
+  volume={2},
+  number={1},
+  pages={153--174},
+  year={1987},
+  publisher={Springer}
+}
+
+@article{ zhou:nodel,
+  title={{Efficient minimum spanning tree construction without Delaunay triangulation}},
+  author={Zhou, H. and Shenoy, N. and Nicholls, W.},
+  journal={Information Processing Letters},
+  volume={81},
+  number={5},
+  pages={271--276},
+  year={2002},
+  publisher={Elsevier}
+}
+
+@techreport{ eppstein:spanning,
+  title={{Spanning Trees and Spanners}},
+  author={Eppstein, D.},
+  year={1996},
+  institution={Information and Computer Science, University of California, Irvine}
+}
+
+@article{ garey:steiner,
+  title={{The Complexity of Computing Steiner Minimal Trees}},
+  author={Garey, M.R. and Graham, R.L. and Johnson, D.S.},
+  journal={SIAM Journal on Applied Mathematics},
+  volume={32},
+  number={4},
+  pages={835--859},
+  year={1977},
+}
+
+@article{ garey:rectisteiner,
+  title={{The Rectilinear Steiner Tree Problem is NP-Complete}},
+  author={Garey, M.R. and Johnson, D.S.},
+  journal={SIAM Journal on Applied Mathematics},
+  volume={32},
+  number={4},
+  pages={826--834},
+  year={1977},
+}
+
+@article{ proemel:steiner,
+  title={{A new approximation algorithm for the Steiner tree problem with performance ratio $5/3$}},
+  author={Pr{\"o}mel, H.J. and Steger, A.},
+  journal={Journal of Algorithms},
+  volume={36},
+  pages={89--101},
+  year={2000}
+}
+
+@article{ arora:tspapx,
+  title={{Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems}},
+  author={Arora, S.},
+  journal={Journal of the ACM (JACM)},
+  volume={45},
+  number={5},
+  pages={753--782},
+  year={1998},
+  publisher={ACM Press New York, NY, USA}
+}
diff --git a/macros.tex b/macros.tex
index 79998b1..c0ac011 100644
--- a/macros.tex
+++ b/macros.tex
@@ -46,6 +46,7 @@
 \def\Forb{{\rm Forb}}
 \def\minorof{\preccurlyeq}
 \def\per{\mathop{\rm per}}
+\def\poly{\mathop{\rm poly}}
 
 % Bit strings
 \def\0{{\bf 0}}
diff --git a/mst.tex b/mst.tex
index 866bcc1..857399a 100644
--- a/mst.tex
+++ b/mst.tex
@@ -393,7 +393,7 @@ the remaining edges, since for a connected graph the algorithm always stops with
 number of blue edges.
 \qed
 
-\impl
+\impl\id{jarnimpl}%
 The most important part of the algorithm is finding \em{neighboring edges,} i.e., edges
 of the cut $\delta(T)$. In a~straightforward implementation,
 searching for the lightest neighboring edge takes $\Theta(m)$ time, so the whole
diff --git a/notation.tex b/notation.tex
index bbf7a53..f2c84dd 100644
--- a/notation.tex
+++ b/notation.tex
@@ -10,6 +10,11 @@
 \n{$\bb R$}{the set of all real numbers}
 \n{$\bb N$}{the set of all natural numbers, including 0}
 \n{${\bb N}^+$}{the set of all positive integers}
+\n{$\O(g)$}{asymptotic~$O$: $f=\O(g)$ iff $\exists c>0, n_0: f(n)\le c\cdot g(n)$ for all~$n\ge n_0$}
+\n{$\Omega(g)$}{asymptotic~$\Omega$: $f=\Omega(g)$ iff $\exists c>0, n_0: f(n)\ge c\cdot g(n)$ for all~$n\ge n_0$}
+\n{$\Theta(g)$}{asymptotic~$\Theta$: $f=\Theta(g)$ iff $f=\O(g)$ and $f=\Omega(g)$}
+\n{$\widetilde\O(g)$}{$f=\widetilde\O(g)$ iff $f=\O(g\cdot\log^{\O(1)} g)$}
+\n{$\poly(n)$}{$f=\poly(n)$ iff $f=\O(n^c)$ for some $c$}
 \n{$T[u,v]$}{the path in a tree~$T$ joining vertices $u$ and $v$ \[heavy]}
 \n{$T[e]$}{the path in a tree~$T$ joining the endpoints of an~edge~$e$ \[heavy]}
 \n{$A\symdiff B$}{symetric difference of sets: $(A\setminus B) \cup (B\setminus A)$}
diff --git a/ram.tex b/ram.tex
index 824382e..d39d9ba 100644
--- a/ram.tex
+++ b/ram.tex
@@ -307,6 +307,7 @@ scanning all~$n$ buckets takes $\O(n+m)$ time.
 %--------------------------------------------------------------------------------
 
 \section{Data structures on the RAM}
+\id{ramdssect}
 
 There is a~lot of data structures designed specifically for the RAM, taking
 advantage of both indexing and arithmetics. In many cases, they surpass the known
@@ -676,8 +677,8 @@ to precompute a~table of the values of~$f$ for all arguments whose size is $\O(k
 
 \proof
 There are $2^{\O(k^3)}$ possible combinations of arguments of the given size and for each of
-them we spend $\O(k^c)$ time by calculating the function (for some~$c\ge 1$). It remains
-to observe that $2^{\O(k^3)}\cdot \O(k^c) = \O(2^{k^4})$.
+them we spend $\poly(k)$ time on calculating the function. It remains
+to observe that $2^{\O(k^3)}\cdot \poly(k) = \O(2^{k^4})$.
 \qed
 
 \para