From ee300863845190ad21b70473b3a075bf51837e28 Mon Sep 17 00:00:00 2001
From: Martin Mares
Date: Mon, 17 Mar 2008 17:44:36 +0100
Subject: [PATCH] More parts of KKT.

---
 adv.tex    | 65 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 macros.tex |  1 +
 2 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/adv.tex b/adv.tex
index b932458..99d8002 100644
--- a/adv.tex
+++ b/adv.tex
@@ -1053,12 +1053,15 @@ is still open even for the RAM.
 \section{A~randomized algorithm}\id{randmst}%
 
 When we analysed the contractive Bor\o{u}vka's algorithm in Section~\ref{contalg},
-we observed that while the number of vertices per phase decreases exponentially,
+we observed that while the number of vertices per iteration decreases exponentially,
 the number of edges can remain asymptotically the same. Karger, Klein and Tarjan
 \cite{karger:randomized} have overcome this problem by combining the Bor\o{u}vka's
 algorithm with filtering based on random sampling. This leads to a~randomized
 algorithm which runs in expected linear time.
+As we will need the algorithm for verification of minimality from the previous
+section, we will again assume that we are working on the RAM.
+
 
 The principle of the filtering is simple: Let us consider any spanning tree~$T$
 of the input graph~$G$. Each edge of~$G$ that is $T$-heavy is the heaviest edge
 of some cycle, so by the Red lemma (\ref{redlemma}) it cannot participate in
@@ -1113,6 +1116,64 @@ every occurrence of heads takes expected time~$1/p$, waiting for~$n$ heads must
 take $n/p$. This is the bound we wanted to achieve.
 \qed
 
+\para
+We will formulate the algorithm as a~doubly-recursive procedure. It alternately
+performs steps of the Bor\o{u}vka's algorithm and filtering based on the above lemma.
+The first recursive call computes the MSF of the sampled subgraph, the second one
+finds the MSF of the graph without the heavy edges.
+
+As in all contractive algorithms, we use edge labels to keep track of the
+original locations of the edges in the input graph. For the sake of simplicity,
+we do not mention it in the algorithm.
+
+\algn{MSF by random sampling --- the KKT algorithm}
+\algo
+\algin A~graph $G$ with an~edge comparison oracle.
+\:Remove isolated vertices from~$G$. If no vertices remain, stop and return an~empty forest.
+\:Perform two Bor\o{u}vka steps (iterations of Algorithm \ref{contbor}) on~$G$ and
+  remember the set~$B$ of edges contracted.
+\:Select subgraph~$H\subseteq G$ by including each edge independently with
+  probability $1/2$.
+\:$F\=\msf(H)$ calculated recursively.
+\:Construct $G'\subseteq G$ by removing all $F$-heavy edges.
+\:$R\=\msf(G')$ calculated recursively.
+\:Return $R\cup B$.
+\algout The minimum spanning forest of~$G$.
+\endalgo
+
+\nota
+Let us analyse the time complexity of this algorithm by studying properties of its
+\df{recursion tree.} The tree describes the subproblems processed by the recursive
+calls. For any vertex~$t$ of the tree, we denote the number of vertices and edges
+of the corresponding subproblem~$G_t$ by~$n_t$ and~$m_t$ respectively. If $m_t>0$,
+the recursion continues: the left son of~$t$ corresponds to the call on the sampled
+subgraph~$H$, the right son to the reduced graph~$G'$. The root of the recursion
+tree is obviously the original graph~$G$, the leaves are trivial graphs with no edges.
+
+\obs
+The Bor\o{u}vka steps together with the removal of isolated vertices guarantee that
+the number of vertices drops at least by a~factor of four in every recursive call.
+The size of a~subproblem~$G_t$ at level~$\ell$ is therefore at most $n/4^\ell$ and
+the depth of the tree is at most $\lceil\log_4 n\rceil$. As there are no more
+than~$2^\ell$ subproblems at level~$\ell$, the sum of all~$n_t$'s on that level is
+at most $n/2^\ell$, which is at most~$2n$ summed over the whole tree.
+
+\lemma
+For every subproblem~$G_t$, the KKT algorithm spends time $\O(m_t+n_t)$ plus the time
+spent on the recursive calls.
+
+\proof
+We know from Lemma \ref{contiter} that each Bor\o{u}vka step takes time $\O(m_t+n_t)$.\foot{We
+add $n_t$ as the graph could be disconnected.}
+The selection of the edges of~$H$ is straightforward.
+Finding the $F$-heavy edges is not, but we have already shown in Theorem \ref{ramverify}
+that linear time is sufficient on the RAM.
+\qed
+
+\thmn{Worst-case complexity of the KKT algorithm}
+The KKT algorithm runs in time $\O(\min(n^2,m\log n))$ in the worst case on the RAM.
+
+\thmn{Average-case complexity of the KKT algorithm}
+
 \rem
 We could also use a~slightly different formulation of the sampling lemma
 suggested by Chan \cite{chan:backward}. It changes the selection of the subgraph~$H$
@@ -1121,6 +1182,8 @@ a~straightforward application of the backward analysis method. We however prefer
 the Karger's original version, because generating a~random subset of a~given size
 cannot be generally performed in bounded worst-case time.
 
+\FIXME{Pointer machine.}
+
 %--------------------------------------------------------------------------------
 
 \section{Special cases and related problems}
 
diff --git a/macros.tex b/macros.tex
index 2bf71d3..7eed678 100644
--- a/macros.tex
+++ b/macros.tex
@@ -40,6 +40,7 @@
 \def\rack#1#2{\setbox0=\hbox{#1}\hbox to \wd0{#2}}
 \def\o#1{\accent23 #1}
 \def\mst{\mathop{\rm mst}}
+\def\msf{\mathop{\rm msf}}
 \def\deg{\mathop{\rm deg}}
 \def\timesalpha{\mskip2mu\alpha}
 \def\timesbeta{\mskip2mu\beta}
-- 
2.39.2
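
A note on the algorithm the patch adds: its recursive structure can be checked against a small executable sketch. The Python code below is not part of the patch, all names in it are ours, and it substitutes a naive path-maximum search for the linear-time $F$-heavy filtering of Theorem \ref{ramverify} — so it illustrates the correctness of the doubly-recursive scheme (two Borůvka steps, sampling with $p=1/2$, recursion on the sample, removal of $F$-heavy edges, recursion on the rest), not the linear expected running time. Edges carry explicit ids, playing the role of the edge labels mentioned in the patch; weights are assumed distinct.

```python
import random

class DSU:
    """Union-find over vertices; used both for Boruvka steps and contraction."""
    def __init__(self, n):
        self.p = list(range(n))
    def find(self, x):
        while self.p[x] != x:
            self.p[x] = self.p[self.p[x]]
            x = self.p[x]
        return x
    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return False
        self.p[a] = b
        return True

def boruvka_step(edges, dsu, picked):
    """Each component picks its lightest incident edge; those edges are
    contracted and their ids recorded in `picked`.  Returns the edges that
    still join different components."""
    best = {}
    for u, v, w, eid in edges:
        ru, rv = dsu.find(u), dsu.find(v)
        if ru == rv:
            continue
        for r in (ru, rv):
            if r not in best or w < best[r][2]:
                best[r] = (u, v, w, eid)
    for u, v, w, eid in best.values():
        if dsu.union(u, v):
            picked.append(eid)
    return [e for e in edges if dsu.find(e[0]) != dsu.find(e[1])]

def heaviest_on_path(adj, u, v):
    """Weight of the heaviest edge on the F-path between u and v, or None if
    they lie in different trees of F (naive DFS, not linear-time verification)."""
    stack, seen = [(u, 0)], {u}
    while stack:
        x, mx = stack.pop()
        for y, w in adj.get(x, ()):
            if y in seen:
                continue
            if y == v:
                return max(mx, w)
            seen.add(y)
            stack.append((y, max(mx, w)))
    return None

def kkt(n, edges):
    """Ids of the MSF edges of an n-vertex graph given as (u, v, weight, id)
    tuples with distinct weights."""
    if not edges:
        return []
    dsu, picked = DSU(n), []
    for _ in range(2):                           # two Boruvka steps
        edges = boruvka_step(edges, dsu, picked)
    if not edges:
        return picked
    label = {}                                   # contract; drops isolated vertices
    lab = lambda x: label.setdefault(dsu.find(x), len(label))
    edges = [(lab(u), lab(v), w, eid) for u, v, w, eid in edges]
    sample = [e for e in edges if random.random() < 0.5]
    f_ids = set(kkt(len(label), sample))         # F := msf(H), first recursion
    adj = {}
    for u, v, w, eid in edges:
        if eid in f_ids:
            adj.setdefault(u, []).append((v, w))
            adj.setdefault(v, []).append((u, w))
    light = [(u, v, w, eid) for u, v, w, eid in edges
             if (m := heaviest_on_path(adj, u, v)) is None or w <= m]
    return picked + kkt(len(label), light)       # G' := F-light edges, second recursion
```

With distinct weights the minimum spanning forest is unique, so the output can be compared edge-for-edge against Kruskal's algorithm regardless of the random sampling.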