From: Martin Mares <mj@ucw.cz>
Date: Mon, 2 May 2022 17:49:28 +0000 (+0200)
Subject: Hand-out for decision trees and the optimal algorithm
X-Git-Url: http://mj.ucw.cz/gitweb/?a=commitdiff_plain;h=f48cac06005370384b032133026c6978e10439c6;p=saga.git

Hand-out for decision trees and the optimal algorithm
---

diff --git a/Makefile b/Makefile
index b00abff..ed6570d 100644
--- a/Makefile
+++ b/Makefile
@@ -1,4 +1,4 @@
-all: saga.ps abstract.ps abscover.ps pubs.ps
+all: saga.pdf abstract.pdf abscover.pdf pubs.pdf
 
 CHAPTERS=cover pref mst ram adv opt dyn appl rank epilog notation
 
diff --git a/macros.tex b/macros.tex
index 3ab39f7..0a1efc1 100644
--- a/macros.tex
+++ b/macros.tex
@@ -308,6 +308,8 @@
 \edef\currentid{\currentchap.\the\seccount.\the\thmcount}
 \noindent {\bo \currentid.\enspace}}
 
+\def\elide{\advance\thmcount by 1}
+
 \def\proclaim#1{\para {\bo #1.\enspace}}
 
 \def\thm{\proclaim{Theorem}}
diff --git a/opt.tex b/opt.tex
index d71f983..a9300bf 100644
--- a/opt.tex
+++ b/opt.tex
@@ -787,117 +787,28 @@ and additionally $\O(n)$ on identifying the live vertices.
 
 %--------------------------------------------------------------------------------
 
+\vfill\eject
+
+\pageno=1
+
 \section{Decision trees}\id{dtsect}%
 
-Pettie and Ramachandran's algorithm combines the idea of robust partitioning with optimal decision
-trees constructed by brute force for very small subgraphs. In this section, we will
-explain the basics of the decision trees and prove several lemmata which will
-turn out to be useful for the analysis of time complexity of the final algorithm.
-
-Let us consider all computations of some comparison-based MST algorithm when we
-run it on a~fixed graph~$G$ with all possible permutations of edge weights.
-The computations can be described by a~binary tree. The root of the tree corresponds to the first
-comparison performed by the algorithm and depending on its result, the computation
-continues either in the left subtree or in the right subtree. There it encounters
-another comparison and so on, until it arrives at a~leaf of the tree where the
-spanning tree found by the algorithm is recorded.
-
-Formally, the decision trees are defined as follows:
-
-\defnn{Decision trees and their complexity}\id{decdef}%
-A~\df{MSF decision tree} for a~graph~$G$ is a~binary tree. Its internal vertices
-are labeled with pairs of $G$'s edges to be compared, each of the two outgoing tree edges
-corresponds to one possible result of the comparison.\foot{There are two possible
-outcomes since there is no reason to compare an~edge with itself and we, as usual,
-expect that the edge weights are distinct.}
-Leaves of the tree are labeled with spanning trees of the graph~$G$.
-
-A~\df{computation} of the decision tree on a~specific permutation of edge weights
-in~$G$ is the path from the root to a~leaf such that the outcome of every comparison
-agrees with the edge weights. The result of the computation is the spanning tree
-assigned to its final leaf.
-A~decision tree is \df{correct} iff for every permutation the corresponding
-computation results in the real MSF of~$G$ with the particular weights.
-
-The \df{time complexity} of a~decision tree is defined as its depth. It therefore
-bounds the number of comparisons spent on every path. (It need not be equal since
-some paths need not correspond to an~actual computation --- the sequence of outcomes
-on the path could be unsatisfiable.)
-
-A~decision tree is called \df{optimal} if it is correct and its depth is the minimum possible
-among the correct decision trees for the given graph.
+\para
 We will denote an~arbitrary optimal decision tree for~$G$ by~${\cal D}(G)$ and its
 complexity by~$D(G)$.
-
 The \df{decision tree complexity} $D(m,n)$ of the MSF problem is the maximum of~$D(G)$
 over all graphs~$G$ with $n$~vertices and~$m$ edges.
 
-\obs
-Decision trees are the most general deterministic comparison-based computation model possible.
-The only operations that count in its time complexity are comparisons. All
-other computation is free, including solving NP-complete problems or having
-access to an~unlimited source of non-uniform constants. The decision tree
-complexity is therefore an~obvious lower bound on the time complexity of the
-problem in all other comparison-based models.
-
-The downside is that we do not know any explicit construction of the optimal
-decision trees, nor even a~non-constructive proof of their complexity.
-On the other hand, the complexity of any existing comparison-based algorithm
-can be used as an~upper bound on the decision tree complexity. For example:
-
 \lemma
 $D(m,n) \le 4/3 \cdot n^2$.
 
-\proof
-Let us count the comparisons performed by the Contractive Bor\o{u}vka's algorithm
-(\ref{contbor}), tightening up the constants in its previous analysis in Theorem
-\ref{contborthm}. In the first iteration, each edge participates in two comparisons
-(one per endpoint), so the algorithm performs at most $2m \le 2{n\choose 2} \le n^2$
-comparisons. Then the number of vertices drops by at least a~factor of two, so
-the subsequent iterations spend at most $(n/2)^2, (n/4)^2, \ldots$ comparisons, which sums
-to less than $n^2\cdot\sum_{i=0}^\infty (1/4)^i = 4/3 \cdot n^2$. Between the Bor\o{u}vka steps,
-we flatten the multigraph to a~simple graph, which also needs some comparisons,
-but for every such comparison we remove one of the participating edges, which saves
-at least one comparison in the subsequent steps.
-\qed
-
-\para
-Of course we can get sharper bounds from the better algorithms, but we will first
-show how to find the optimal trees using brute force. The complexity of the search
-will be of course enormous, but as we already promised, we will need the optimal
-trees only for very small subgraphs.
+\elide
 
 \lemman{Construction of optimal decision trees}\id{odtconst}%
 An~optimal MST decision tree for a~graph~$G$ on~$n$ vertices can be constructed on
 the Pointer Machine in time $\O(2^{2^{4n^2}})$.
 
-\proof
-We will try all possible decision trees of depth at most $2n^2$
-(we know from the previous lemma that the desired optimal tree is shallower). We can obtain
-any such tree by taking the complete binary tree of exactly this depth
-and labeling its $2\cdot 2^{2n^2}-1$ vertices with comparisons and spanning trees. Those labeled
-with comparisons become internal vertices of the decision tree, the others
-become leaves and the parts of the tree below them are removed. There are fewer
-than $n^4$ possible comparisons and fewer than $2^{n^2}$ spanning trees of~$G$,
-so the number of candidate decision trees is bounded by
-$(n^4+2^{n^2})^{2^{2n^2+1}} \le 2^{(n^2+1)\cdot 2^{2n^2+1}} \le 2^{2^{2n^2+2}} \le 2^{2^{3n^2}}$.
-
-We enumerate the trees in an~arbitrary order, test each tree for correctness and
-find the shallowest tree among those correct. Testing can be accomplished by running
-through all possible permutations of edges, each time calculating the MSF using any
-of the known algorithms and comparing it with the result given by the decision tree.
-The number of permutations does not exceed $(n^2)! \le (n^2)^{n^2} \le n^{2n^2} \le 2^{n^3}$
-for sufficiently large~$n$ and each one can be checked in time $\O(\poly(n))$.
-
-On the Pointer Machine, trees and permutations can certainly be enumerated in time
-$\O(\poly(n))$ per object. The time complexity of the whole algorithm is therefore
-$\O(2^{2^{3n^2}} \cdot 2^{n^3} \cdot \poly(n)) = \O(2^{2^{4n^2}})$.
-\qed
-
-\paran{Basic properties of decision trees}%
-The following properties will be useful for analysis of algorithms based
-on precomputed decision trees. We will omit some technical details, referring
-the reader to Section 5.1 of the article by Pettie and Ramachandran \cite{pettie:optimal}.
+\elide
 
 \lemma\id{dtbasic}%
 The decision tree complexity $D(m,n)$ of the MSF satisfies:
@@ -906,21 +817,6 @@ The decision tree complexity $D(m,n)$ of the MSF satisfies:
 \:$D(m',n') \ge D(m,n)$ whenever $m'\ge m$ and $n'\ge n$.
 \endlist
 
-\proof
-For every $m,n>2$ there is a~graph on $n$~vertices and $m$~edges such that
-every edge lies on a~cycle. Every correct MSF decision tree for this graph
-has to compare each edge at least once. Otherwise the decision tree cannot
-distinguish between the case when an~edge has the lowest of all weights (and
-thus it is forced to belong to the MSF) and when it has the highest weight (so
-it is forced out of the MSF).
-
-Decision trees for graphs on $n'$~vertices can be used for graphs with $n$~vertices
-as well --- it suffices to add isolated vertices, which does not change the MSF.
-Similarly, we can increase $m$ to~$m'$ by adding edges parallel to an~existing
-edge and making them heavier than the rest of the graph, so that they can never
-belong to the MSF.
-\qed
-
 \defn
 Subgraphs $C_1,\ldots,C_k$ of a~graph~$G$ are called the \df{compartments} of~$G$
 iff they are edge-disjoint, their union is the whole graph~$G$ and
@@ -931,105 +827,24 @@ The clusters $C_1,\ldots,C_k$ generated by the Partition procedure of the
 previous section (Algorithm \ref{partition}) are compartments of the graph
 $H=\bigcup_i C_i$.
 
-\proof
-The first and second condition of the definition of compartments follow
-from the Partitioning theorem (\ref{partthm}), so it remains to show that $\msf(H)$
-is the union of the MSF's of the individual compartments. By the Cycle rule
-(Lemma \ref{redlemma}), an~edge $h\in H$ is not contained in $\msf(H)$ if and only if
-it is the heaviest edge on some cycle. It is therefore sufficient to prove that
-every cycle in~$H$ is contained within a~single~$C_i$.
-
-Let us consider a~cycle $K\subseteq H$ and a~cluster~$C_i$ such that it contains
-an~edge~$e$ of~$K$ and all clusters constructed later by the procedure do not contain
-any. If $K$~is not fully contained in~$C_i$, we can extend the edge~$e$ to a~maximal
-path contained in both~$K$ and~$C_i$. Since $C_i$ shares at most one vertex with the
-earlier clusters, there can be at most one edge from~$K$ adjacent to the maximal path,
-which is impossible.
-\qed
-
 \lemma
 Let $C_1,\ldots,C_k$ be compartments of a~graph~$G$. Then there exists an~optimal
 MSF decision tree for~$G$ that does not compare edges of distinct compartments.
 
-\proofsketch
-Consider a~subset~$\cal P$ of edge weight permutations~$w$ that satisfy $w(e) < w(f)$
-whenever $e\in C_i, f\in C_j, i<j$. For such permutations, no decision tree can
-gain any information on relations between edge weights in a~single compartment by
-inter-compartment comparisons --- the results of all such comparisons are determined
-in advance.
-
-Let us take an~arbitrary correct decision tree for~$G$ and restrict it to
-vertices reachable by computations on~$\cal P$. Whenever a~vertex contained
-an~inter-compartment comparison, it has lost one of its sons, so we can remove it
-by contracting its only outgoing edge. We observe that we get a~decision tree
-satisfying the desired condition and that this tree is correct.
-
-As for the correctness, the MSF of a~single~$C_i$ is uniquely determined by
-comparisons of its weights and the set~$\cal P$ contains all combinations
-of orderings of weights inside individual compartments. Therefore every
-spanning tree of every~$C_i$ and thus also of~$G$ is properly recognized.
-\qed
-
 \lemma\id{compartsum}%
 Let $C_1,\ldots,C_k$ be compartments of a~graph~$G$. Then $D(G) = \sum_i D(C_i)$.
 
-\proofsketch
-A~collection of decision trees for the individual compartments can be ``glued together''
-to a~decision tree for~$G$. We take the decision tree for~$C_1$, replace each of its leaves
-by a~copy of the tree for~$C_2$ and so on. Every leaf~$\ell$ of the compound tree will be
-labeled with the union of labels of the original leaves encountered on the path from
-the root to~$\ell$. This proves that $D(G) \le \sum_i D(C_i)$.
-
-The other inequality requires more effort. We use the previous lemma to transform
-the optimal decision tree for~$G$ to another of the same depth, but without inter-compartment
-comparisons. Then we prove by induction on~$k$ and then on the depth of the tree
-that this tree can be re-arranged, so that every computation first compares edges
-from~$C_1$, then from~$C_2$ and so on. This means that the tree can be decomposed
-to decision trees for the $C_i$'s. Also, without loss of efficiency all trees for
-a~single~$C_i$ can be made isomorphic to~${\cal D}(C_i)$.
-\qed
-
 \cor\id{dtpart}%
 If $C_1,\ldots,C_k$ are the clusters generated by the Partition procedure (Algorithm \ref{partition}),
 then $D(\bigcup_i C_i) = \sum_i D(C_i)$.
 
-\proof
-Lemma \ref{partiscomp} tells us that $C_1,\ldots,C_k$ are compartments of the graph
-$\bigcup C_i$, so we can apply Lemma \ref{compartsum} to them.
-\qed
-
 \cor\id{dttwice}%
 $2D(m,n) \le D(2m,2n)$ for every $m,n$.
 
-\proof
-For an~arbitrary graph~$G$ with $m$~edges and $n$~vertices, we create a~graph~$G_2$
-consisting of two copies of~$G$ sharing a~single vertex. The copies of~$G$ are obviously
-compartments of~$G_2$, so by Lemma \ref{compartsum} it holds that $D(G_2) = 2D(G)$.
-Taking a~maximum over all choices of~$G$ yields $D(2m,2n) \ge \max_G D(G_2) = 2D(m,n)$.
-\qed
-
 %--------------------------------------------------------------------------------
 
 \section{An optimal algorithm}\id{optalgsect}%
 
-Now that we have developed the soft heaps, partitioning and MST decision trees,
-it is simple to state Pettie and Ramachandran's MST algorithm
-and prove that it is asymptotically optimal among all MST algorithms in
-comparison-based models. Several standard MST algorithms from the previous
-chapters will also play their roles.
-
-We will describe the algorithm as a~recursive procedure. When the procedure is
-called on a~graph~$G$, it sets the parameter~$t$ to roughly $\log^{(3)} n$ and
-it calls the \<Partition> procedure to split the graph into a~collection of
-clusters of size~$t$ and a~set of corrupted edges. Then it uses precomputed decision
-trees to find the MSF of the clusters. The graph obtained by contracting
-the clusters is on the other hand dense enough, so that the Iterated Jarn\'\i{}k's
-algorithm runs on it in linear time. Afterwards we combine the MSF's of the clusters
-and of the contracted graphs, we mix in the corrupted edges and run two iterations
-of the Contractive Bor\o{u}vka's algorithm. This guarantees reduction in the number of
-both vertices and edges by a~constant factor, so we can efficiently recurse on the
-resulting graph.
-
 \algn{Optimal MST algorithm, Pettie and Ramachandran \cite{pettie:optimal}}\id{optimal}%
 \algo
 \algin A~connected graph~$G$ with an~edge comparison oracle.
@@ -1048,10 +863,6 @@ resulting graph.
 \algout The minimum spanning tree of~$G$.
 \endalgo
 
-Correctness of this algorithm immediately follows from the Partitioning theorem (\ref{partthm})
-and from the proofs of the respective algorithms used as subroutines. Let us take a~look at
-the time complexity. We will be careful to use only the operations offered by the Pointer Machine.
-
 \lemma\id{optlemma}%
 The time complexity $T(m,n)$ of the Optimal algorithm satisfies the following recurrence:
 $$
@@ -1060,48 +871,7 @@ $$
 where~$c_1$ and~$c_2$ are some positive constants and $D$~is the decision tree complexity
 from the previous section.
 
-\proof
-The first two steps of the algorithm are trivial as we have linear time at our
-disposal.
-
-By the Partitioning theorem (\ref{partthm}), the call to \<Partition> with~$\varepsilon$
-set to a~constant takes $\O(m)$ time and it produces a~collection of clusters of size
-at most~$t$ and at most $m/4$ corrupted edges. It also guarantees that the
-connected components of the union of the $C_i$'s have at least~$t$ vertices
-(unless there is just a~single component).
-
-To apply the decision trees, we will use the framework of topological computations developed
-in Section \ref{bucketsort}. We pad all clusters in~$\C$ with isolated vertices, so that they
-have exactly~$t$ vertices. We use a~computation that labels the graph with a~pointer to
-its optimal decision tree. Then we apply Theorem \ref{topothm} combined with the
-brute-force construction of optimal decision trees from Lemma \ref{odtconst}. Together they guarantee
-that we can assign the decision trees to the clusters in time:
-$$\O\Bigl(\Vert\C\Vert + t^{t(t+2)} \cdot \bigl(2^{2^{4t^2}} + t^2\bigr)\Bigr)
-= \O\Bigl(m + 2^{2^{2^t}}\Bigr)
-= \O(m).$$
-Execution of the decision tree on each cluster~$C_i$ then takes $\O(D(C_i))$ steps.
-
-The contracted graph~$G_A$ has at most $n/t = \O(n / \log^{(3)}n)$ vertices and asymptotically
-the same number of edges as~$G$, so according to Corollary \ref{ijdens}, the Iterated Jarn\'\i{}k's
-algorithm runs on it in linear time.
-
-The combined graph~$G_B$ has~$n$ vertices, but fewer than~$n$ edges from the
-individual spanning trees and at most~$m/4$ additional edges which were
-corrupted. The Bor\o{u}vka steps on~$G_B$ take $\O(m)$
-time by Lemma \ref{boruvkaiter} and they produce a~graph~$G_C$ with at most~$n/4$
-vertices and at most $n/4 + m/4 \le m/2$ edges. (The $n$~tree edges in~$G_B$ are guaranteed
-to be reduced by the Bor\o{u}vka's algorithm.) It is easy to verify that this
-graph is still connected, so we can recurse on it.
-
-The remaining steps of the algorithm can be easily performed in linear time either directly
-or in case of the contractions by the bucket-sorting techniques of Section \ref{bucketsort}.
-\qed
-
-\paran{Optimality}%
-The properties of decision tree complexity, which we have proven in the previous
-section, will help us show that the time complexity recurrence is satisfied by a~constant
-multiple of the decision tree complexity $D(m,n)$ itself. This way, we will prove
-the following theorem:
+\elide
 
 \thmn{Optimality of the Optimal algorithm}
 The time complexity of the Optimal MST algorithm \ref{optimal} is $\Theta(D(m,n))$.
@@ -1127,6 +897,8 @@ The other inequality is obvious as $D(m,n)$ is an~asymptotic lower bound on
 the time complexity of every comparison-based algorithm.
 \qed
 
+\vfill\eject
+
 \paran{Complexity of MST}%
 As we have already noted, the exact decision tree complexity $D(m,n)$ of the MST problem
 is still open and so therefore is the time complexity of the optimal algorithm. However,