From 0e783e3fc00288b976fc0277e8948f3bc90cab98 Mon Sep 17 00:00:00 2001
From: Martin Mares
Date: Tue, 18 Mar 2008 12:49:17 +0100
Subject: [PATCH] Finished KKT.

---
 PLAN         |  7 +++----
 adv.tex      | 47 +++++++++++++++++++++++++++++++++++++++--------
 biblio.bib   | 11 +++++++++++
 macros.tex   |  1 +
 notation.tex |  2 +-
 5 files changed, 55 insertions(+), 13 deletions(-)

diff --git a/PLAN b/PLAN
index 5554906..20a4c0d 100644
--- a/PLAN
+++ b/PLAN
@@ -19,7 +19,7 @@
 o Fredman-Tarjan algorithm
 o MST verification
 o Linear-time verification
-. A randomized algorithm
+o A randomized algorithm
 . ?? Chazelle ??
 . ?? Pettie ??
 o Special cases and related problems
@@ -55,11 +55,10 @@ Spanning trees:
 - reference to mixed Boruvka-Jarnik
 - use the notation for contraction by a set
 - practical considerations: katriel:cycle, moret:practice (mention pairing heaps)
-- parallel algorithms: p243-cole (are there others?)
+- parallel algorithms: p243-cole (see also remarks in Karger and pettie:minirand)
 - bounded expansion classes?
 - restricted cases and arborescences
-- mention parallel algorithms (see remarks in Karger)
-- Pettie's paper on random bits
+- Pettie's paper on random bits (pettie:minirand)
 
 Models:
 
diff --git a/adv.tex b/adv.tex
index 58d3c5c..9928648 100644
--- a/adv.tex
+++ b/adv.tex
@@ -167,7 +167,7 @@ of edges being at most $\varrho n$.
 \rem
 The proof can be also viewed probabilistically: let $X$ be the degree
 of a vertex of~$G$ chosen uniformly at
-random. Then ${\bb E}X \le 2\varrho$, hence by the Markov's inequality
+random. Then $\E X \le 2\varrho$, hence by Markov's inequality
 ${\rm Pr}[X > 4\varrho] < 1/2$, so for at least $n/2$ vertices~$v$ we have
 $\deg(v)\le 4\varrho$.
 
@@ -1034,7 +1034,7 @@ as there are $\O(1)$ query paths per edge, the first sum is $\O(\#\hbox{comparis
 which is $\O(m)$ by Theorem \ref{verify}.
 \qed
 
-\rem
+\rem\id{pmverify}%
 Buchsbaum et al.~have recently shown in \cite{buchsbaum:verify} that linear-time
 verification can be achieved even on the pointer machine. They first solve the
 problem of finding the lowest common ancestors for a~set of pairs of vertices
@@ -1095,7 +1095,7 @@ tails, we discard the edge. Otherwise we perform a~single step of the Kruskal's
 algoritm: We check if~$F+e$ contains a~cycle. If it does, we discard~$e$,
 otherwise we add~$e$ to~$F$. At the end, we have produced the subgraph~$H$
 and its MSF~$F$.
-We can modify the algorithm by swapping the check for cycles with flipping of
+We can modify the algorithm by swapping the check for cycles with flipping the
 coin:
 \algo
 \:If $F+e$ contains a~cycle, we immediately discard~$e$ (we can flip
@@ -1160,6 +1160,9 @@ at level~$i$ is therefore at most $n/4^i$ and the depth of the tree is at most $
 As there are no more than~$2^i$ subproblems at level~$i$, the sum of all~$n_t$'s
 on that level is at most $n/2^i$, which is at most~$2n$ summed over the whole tree.
 
+We are going to show that the worst case of the KKT algorithm is no worse than
+that of the plain contractive algorithm, while the expected time complexity is linear.
+
 \lemma
 For every subproblem~$G_t$, the KKT algorithm spends time $\O(m_t+n_t)$ plus the
 time spent on the recursive calls.
@@ -1204,20 +1207,48 @@ number of edges per level cannot decrease and it remains to apply the previous l
 The expected time complexity of the KKT algorithm on the RAM is $\O(m)$.
 
 \proof
-
+The structure of the recursion tree depends on the random choices taken,
+but as its worst-case depth is at most~$\lceil \log_4 n\rceil$, the tree
+is always a~subtree of the complete binary tree of that depth. We will
+therefore prove the theorem by examining the complete tree, possibly with
+empty subproblems at some vertices.
+
+The set of all left edges in the tree (edges connecting a~parent with its left
+son) forms a~set of \df{left paths.} Let us consider the expected time spent on
+a~single left path. When walking the path downwards from its top vertex~$r$,
+the expected size of the subproblems decreases exponentially: for the left son~$\ell$
+of a~vertex~$t$, we have $n_\ell \le n_t/4$ and $\E m_\ell \le \E m_t/2$. The
+expected total time spent on the path is therefore $\O(n_r+m_r)$ and it remains
+to sum this over all left paths.
+
+With the exception of the path going from the root of the tree,
+the top~$r$ of a~left path is always a~right son of a~unique parent vertex~$t$.
+Since the subproblem~$G_r$ has been obtained from its parent subproblem~$G_t$
+by filtering out all heavy edges, we can use the Sampling lemma to obtain
+$\E m_r \le 2n_t$. The sum of the expected sizes of all top subproblems is
+then $\sum_r (n_r + \E m_r) \le \sum_t 3n_t = \O(n)$. After adding the exceptional path
+from the root, we get $\O(m+n)=\O(m)$.
 \qed
 
-\FIXME{High probability result.}
+\rem
+There is also a~high-probability version of the above theorem. According to
+Karger, Klein and Tarjan \cite{karger:randomized}, the time complexity
+of the algorithm is $\O(m)$ with probability $1-\exp(-\Omega(m))$. The proof
+again follows the recursion tree and it involves applying the Chernoff bound
+\cite{chernoff} to bound the tail probabilities.
 
 \rem
 We could also use a~slightly different formulation of the sampling lemma
 suggested by Chan \cite{chan:backward}. It changes the selection of the subgraph~$H$
 to choosing an~$mp$-edge subset of~$E(G)$ uniformly at random. The proof is then
 a~straightforward application of the backward analysis method. We however prefered
-the Karger's original version, because generating a~random subset of a~given size
-cannot be generally performed in bounded worst-case time.
+Karger's original version, because generating a~random subset of a~given size
+requires an~unbounded number of random bits in the worst case.
 
-\FIXME{Pointer machine.}
+\rem
+The only place where we need the power of the RAM is the verification algorithm,
+so we can use the pointer-machine verification algorithm mentioned in Remark \ref{pmverify}
+to bring the results of this section to the~PM.
 
 %--------------------------------------------------------------------------------
 
diff --git a/biblio.bib b/biblio.bib
index 0e406ad..01aca86 100644
--- a/biblio.bib
+++ b/biblio.bib
@@ -1098,3 +1098,14 @@ inproceedings{ pettie:minirand,
   year={1983},
   publisher={Academic Press, Inc. Orlando, FL, USA}
 }
+
+@article{ chernoff,
+  title={{A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on the Sum of Observations}},
+  author={Chernoff, H.},
+  journal={The Annals of Mathematical Statistics},
+  volume={23},
+  number={4},
+  pages={493--507},
+  year={1952},
+  publisher={Institute of Mathematical Statistics}
+}
diff --git a/macros.tex b/macros.tex
index 7eed678..2ac7bbf 100644
--- a/macros.tex
+++ b/macros.tex
@@ -49,6 +49,7 @@
 \def\minorof{\preccurlyeq}
 \def\per{\mathop{\rm per}}
 \def\poly{\mathop{\rm poly}}
+\def\E{{\bb E}}
 
 % Bit strings
 \def\0{{\bf 0}}
diff --git a/notation.tex b/notation.tex
index f2c84dd..eb70c72 100644
--- a/notation.tex
+++ b/notation.tex
@@ -41,7 +41,7 @@
 \n{$f[e]$}{as edges are two-element sets, $f[e]$ maps both endpoints of an edge~$e$}
 \n{$\varrho({\cal C})$}{edge density of a graph class~$\cal C$ \[density]}
 \n{$\deg_G(v)$}{degree of vertex~$v$ in graph~$G$; we omit $G$ if it is clear from context}
-\n{${\bb E}X$}{expected value of a~random variable~$X$}
+\n{${\E}X$}{expected value of a~random variable~$X$}
 \n{${\rm Pr}[\varphi]$}{probability that a predicate~$\varphi$ is true}
 \n{$\log n$}{a binary logarithm of the number~$n$}
 \n{$f^{(i)}$}{function~$f$ iterated $i$~times: $f^{(0)}(x):=x$, $f^{(i+1)}(x):=f(f^{(i)}(x))$}
-- 
2.39.5
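
The sampling-lemma proof touched by the adv.tex hunk around line 1095 builds the sampled subgraph H and its MSF F in one pass: edges are scanned in order of increasing weight, a coin flip with success probability p decides membership in H, and a Kruskal step keeps exactly those sampled edges that do not close a cycle in F. The following is a minimal sketch of that pass; Python, the union-find helper DSU and the name sample_and_kruskal are choices made for the sketch, not part of the committed sources.

import random

class DSU:
    """Union-find used for the Kruskal cycle test (illustrative helper)."""
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, v):
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path halving
            v = self.parent[v]
        return v

    def union(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return False   # u and v already connected: the edge would close a cycle
        self.parent[ru] = rv
        return True

def sample_and_kruskal(n, edges, p=0.5, coin=random.random):
    """Scan edges (weight, u, v) in order of increasing weight; a coin with
    success probability p decides whether an edge enters the sample H, and a
    Kruskal step keeps exactly the edges of the MSF F of H."""
    dsu = DSU(n)
    H, F = [], []
    for w, u, v in sorted(edges):
        if coin() < p:              # heads: e belongs to the sampled subgraph H
            H.append((u, v, w))
            if dsu.union(u, v):     # e does not close a cycle in F
                F.append((u, v, w))
    return H, F

# Example: H is a random subgraph of a weighted 4-cycle, F is its MSF.
if __name__ == "__main__":
    example_edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 3, 0)]  # (weight, u, v)
    print(sample_and_kruskal(4, example_edges))

Swapping the cycle test and the coin flip, as the hunk goes on to do, leaves F unchanged: an edge that closes a cycle in the current F can never enter F, whatever its coin says, so only the heavy edges stop being offered a flip.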
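
The claim in the new proof that a single left path costs $\O(n_r+m_r)$ expected time is a geometric sum; written out in the notation of the patch (a worked restatement, not part of the committed sources):

$$\sum_{i\ge 0} \left( n_{t_i} + \E m_{t_i} \right) \;\le\; \sum_{i\ge 0} \left( {n_r\over 4^i} + {\E m_{r}\over 2^i} \right) \;\le\; {4\over 3}\,n_r + 2\,\E m_r \;=\; \O(n_r + \E m_r),$$

where $t_0=r, t_1, t_2, \ldots$ are the vertices of the left path listed from its top vertex downwards and the bounds $n_{t_{i+1}} \le n_{t_i}/4$ and $\E m_{t_{i+1}} \le \E m_{t_i}/2$ are iterated $i$~times.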