From: Martin Mares Date: Wed, 2 Apr 2008 22:18:44 +0000 (+0200) Subject: Decision trees started. X-Git-Tag: printed~140 X-Git-Url: http://mj.ucw.cz/gitweb/?a=commitdiff_plain;h=dacceb3347cf2848b8a44000e8c6d19d65e1d172;p=saga.git Decision trees started. --- diff --git a/PLAN b/PLAN index 6792895..f1481d7 100644 --- a/PLAN +++ b/PLAN @@ -24,8 +24,8 @@ * Approaching Optimality - . Soft heaps - . Robust contractions + o Soft heaps + o Robust contractions . An optimal algorithm . Decision trees @@ -67,6 +67,7 @@ Spanning trees: - random sampling (propp:randommst) - mention bugs in Valeria's verification paper - Pettie's optimal algorithm runs in average linear time +- add references to other applications of decomposition Models: diff --git a/mst.tex b/mst.tex index 8b8e74f..00d818f 100644 --- a/mst.tex +++ b/mst.tex @@ -562,7 +562,7 @@ edges together. The bucket sort uses two passes with $n_i$~buckets, so it takes $\O(n_i+m_i)=\O(m_i)$. \qed -\thm +\thm\id{contborthm}% The Contractive Bor\o{u}vka's algorithm finds the MST of the input graph in time $\O(\min(n^2,m\log n))$. diff --git a/notation.tex b/notation.tex index f954ee3..e3016ae 100644 --- a/notation.tex +++ b/notation.tex @@ -85,6 +85,9 @@ \n{$\per M$}{the permanent of a~square matrix~$M$} \n{$G\crpt R$}{graph~$G$ with edges in~$R$ corrupted \[corrnota]} \n{$R_C$}{$R_C = R\cup \delta(C)$ \[corrnota]} +\n{${\cal D}(G)$}{The optimal MSF decision tree for a~graph~$G$ \[decdef]} +\n{$D(G)$}{The depth of ${\cal D}(G)$ \[decdef]} +\n{$D(m,n)$}{Decision tree complexity of MSF \[decdef]} } %-------------------------------------------------------------------------------- diff --git a/opt.tex b/opt.tex index 6ba150d..c2c87c6 100644 --- a/opt.tex +++ b/opt.tex @@ -747,6 +747,109 @@ time there. The rest of the algorithm spends $\O(m)$ time on the other operation and $\O(n)$ on identifying the live vertices. \qed +%-------------------------------------------------------------------------------- + +\section{Decision trees} + +The Pettie's algorithm combines the idea of robust partitioning with optimal decision +trees constructed by brute force for very small subgraphs. In this section, we will +explain the basics of the decision trees and prove several lemmata which will +turn out to be useful for the analysis of time complexity of the whole algorithm. + +Let us consider all computations of a~comparison-based MST algorithm when we +run it on a~fixed graph~$G$ with all possible permutations of edge weights. +They can be described by a~tree. The root of the tree corresponds to the first +comparison performed by the algorithm and depending to its result, the computation +continues either in the left subtree or in the right subtree. There it encounters +another comparison and so on, until it arrives to a~leaf of the tree where the +spanning tree found by the algorithm is recorded. + +Formally, the decision trees are defined as follows: + +\defnn{Decision trees and their complexity}\id{decdef}% +A~\df{MSF decision tree} for a~graph~$G$ is a~binary tree. Its internal vertices +are labeled with pairs of edges to be compared, each of the two outgoing edges +corresponds to one possible result of the comparison.\foot{There is no reason +to compare two identical edges.} +Leaves of the tree are labeled with spanning trees of the graph~$G$. + +A~\df{computation} of the decision tree on a~specific permutation of edge weights +in~$G$ is a~path from the root to a~leaf such that the outcome of every comparison +agrees with the edge weights. The result of the computation is the spanning tree +assigned to its final leaf. +A~decision tree is \df{correct} iff for every permutation the corresponding +computation results in the real MSF of~$G$ with the particular weights. + +The \df{time complexity} of a~decision tree is defined as its depth. It therefore +bounds the number of comparisons spent on every path. (It need not be equal since +some paths need not correspond to an~actual computation --- the sequence of outcomes +on the path can be unsatifisfiable.) + +A~decision tree is called \df{optimal} if it is correct and its depth is minimum possible +among the correct decision trees for the given graph. +We will denote an~arbitrary optimal decision tree for~$G$ by~${\cal D}(G)$ and its +complexity by~$D(G)$. + +The \df{decision tree complexity} $D(m,n)$ of the MSF problem is the maximum of~$D(G)$ +over all graphs~$G$ with $n$~vertices and~$m$ edges. + +\obs +Decision trees are the most general comparison-based computation model possible. +The only operations which count in the time complexity are the comparisons. All +other computation is free, including solving NP-complete problems or having +access to an~unlimited source of non-uniform constants. The decision tree +complexity is therefore an~obvious lower bound on the time complexity of the +problem in all other models. + +The downside is that we do not know any explicit construction of the optimal +decision trees, or at least a~non-constructive proof of their complexity. +On the other hand, the complexity of any existing comparison-based algorithm +can be used as an~upper bound for the decision tree complexity. For example: + +\lemma +$D(m,n) \le 4/3 \cdot n^2$. + +\proof +Let us count the comparisons performed by the contractive Bor\o{u}vka's algorithm. +(\ref{contbor}), tightening up the constants in the original analysis in Theorem +\ref{contborthm}. In the first phase, each edge participates in two comparisons +(one for each its endpoint), so the algorithm performs at most $2m \le 2{n\choose 2} \le n^2$ +comparisons. Then the number of vertices drops at least by a~factor of two, so +the subsequent phases spend $(n/2)^2, (n/4)^2, \ldots$ comparisons, which sums +to less than $n^2\cdot\sum_{i=0}^\infty (1/4)^i = 4/3 \cdot n^2$. +\qed + +\para +Of course we can get sharper bounds from the better algorithms, but we will first +show how to find the optimal trees using brute force. The complexity of the search +will be of course enormous, but as we already promised, we will need the optimal +trees only for very small subgraphs. + +\lemma +Optimal MST decision trees for all graphs on at most~$k$ vertices can be +constructed on the Pointer machine in time $\O(2^{2^{k^3}})$. + +\proof +There are $2^{k\choose 2} \le 2^{k^2}$ undirected graphs on~$k$ vertices. Any +graph on less than~$k$ vertices can be expanded to $k$~vertices by adding isolated +vertices, which obviously do not affect the optimal decision tree. + +For every graph, we will try all possible trees of depth at most $4/3\cdot k^2$ +(we know from the previous lemma that the optimal tree is not deeper). Each such +tree has at most $2^{2k^2}$ internal vertices and at most $2^{2k^2}$ leaves. +All internal vertices are labeled with comparisons and there are less than $k^4$ +ways how to select the edges to be compared. Each leaf is labeled with one of the +at most $2^{k^2}$ spanning trees. There are therefore at most $(2^{2k^2})^{2^{2k^2}} +\le 2^{2k^2\cdot 2^{2k^2}} \le 2^{2^{k^3}}$ candidate trees and we will enumerate them +in order of increasing depth. + +\FIXME{Continue with verifying correctness.} + +\qed + + + + %-------------------------------------------------------------------------------- \section{An optimal algorithm}