+\proof
+Let us observe that we can obtain the forest~$F$ by running Kruskal's algorithm
+(\ref{kruskal}) combined with the random process producing~$H$ from~$G$. We sort all edges of~$G$
+by their weights and we start with an~empty forest~$F$. For each edge, we first
+flip a~biased coin (that gives heads with probability~$p$) and if it comes up
+tails, we discard the edge. Otherwise we perform a~single step of Kruskal's
+algorithm: We check whether $F+e$ contains a~cycle. If it does, we discard~$e$, otherwise
+we add~$e$ to~$F$. At the end, we have produced the subgraph~$H$ and its MSF~$F$.
+
+When we swap the order of the cycle check and the coin flip, we get an~equivalent
+algorithm which will turn out to be more convenient to analyse:
+\algo
+\:If $F+e$ contains a~cycle, we immediately discard~$e$ (we could flip
+the coin, but we need not, because the edge will be discarded regardless of
+the outcome). We note that~$e$ is $F$-heavy with respect to both the
+current state of~$F$ and the final MSF.
+\:If $F+e$ is acyclic, we flip the coin:
+\::If it comes up heads, we add~$e$ to~$F$. In this case, $e$~is neither $F$-light
+ nor $F$-heavy.
+\::If it comes up tails, we discard~$e$. Such edges are $F$-light.
+\endalgo
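The equivalent algorithm above admits a direct simulation. The following Python sketch is our own illustration (the names `DSU` and `sampled_kruskal` are ours, and distinct edge weights are assumed): it runs the coin-flipping version of Kruskal's algorithm and counts the coin flips of step~2, i.e.\ the $F$-nonheavy edges.

```python
import random

class DSU:
    """Union-find structure used to test whether F+e contains a cycle."""
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x
    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

def sampled_kruskal(n, edges, p=0.5, rng=random):
    """Kruskal's algorithm interleaved with the sampling process.

    `edges` is a list of (weight, u, v) tuples with distinct weights;
    returns the forest F (the MSF of the sampled subgraph H) together
    with the number of coin flips made in step 2.
    """
    dsu = DSU(n)
    forest, flips = [], 0
    for w, u, v in sorted(edges):
        if dsu.find(u) == dsu.find(v):
            continue            # step 1: F+e has a cycle, e is F-heavy
        flips += 1              # step 2: F+e is acyclic, flip the coin
        if rng.random() < p:    # heads: e enters H and the forest F
            dsu.union(u, v)
            forest.append((w, u, v))
        # tails: e is discarded; such an edge is F-light
    return forest, flips
```

With $p=1$ the coin always comes up heads and the procedure degenerates to plain Kruskal on the whole graph.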
+
+The number of $F$-nonheavy edges is therefore equal to the total number of coin
+flips in step~2 of this algorithm. We also know that the algorithm stops before
+it adds $n$~edges to~$F$. Therefore it flips at most as many coins as a~simple
+random process that repeatedly flips until it gets~$n$ heads. As waiting for
+each occurrence of heads takes $1/p$~flips in expectation (the number of flips
+is geometrically distributed), waiting for~$n$ heads takes $n/p$~flips in
+expectation by linearity. This is the bound we wanted to achieve.
+\qed
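The waiting-time argument can be checked empirically. This small sketch (our own illustration, not part of the proof) samples the number of flips needed for $n$ heads and compares the average with $n/p$:

```python
import random

def flips_until_n_heads(n, p, rng):
    """Count coin flips until n heads appear (a negative binomial sample)."""
    flips = heads = 0
    while heads < n:
        flips += 1
        if rng.random() < p:
            heads += 1
    return flips

rng = random.Random(42)
n, p, trials = 50, 0.5, 2000
avg = sum(flips_until_n_heads(n, p, rng) for _ in range(trials)) / trials
# The sample mean concentrates around the expectation n/p = 100.
```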
+
+\para
+We will formulate the algorithm as a~doubly-recursive procedure. It alternately
+performs steps of Bor\o{u}vka's algorithm and filtering based on the above lemma.
+The first recursive call computes the MSF of the sampled subgraph, the second one
+finds the MSF of the original graph, but without the heavy edges.
+
+As in all contractive algorithms, we use edge labels to keep track of the
+original locations of the edges in the input graph. For the sake of simplicity,
+we do not mention it in the algorithm explicitly.
+
+\algn{MSF by random sampling --- the KKT algorithm}\id{kkt}%
+\algo
+\algin A~graph $G$ with an~edge comparison oracle.
+\:Remove isolated vertices from~$G$. If no vertices remain, stop and return an~empty forest.
+\:Perform two Bor\o{u}vka steps (iterations of Algorithm \ref{contbor}) on~$G$ and
+ remember the set~$B$ of the edges having been contracted.
+\:Select a~subgraph~$H\subseteq G$ by including each edge independently with
+ probability $1/2$.
+\:$F\=\msf(H)$ calculated recursively.
+\:Construct $G'\subseteq G$ by removing all $F$-heavy edges of~$G$.
+\:$R\=\msf(G')$ calculated recursively.
+\:Return $R\cup B$.
+\algout The minimum spanning forest of~$G$.
+\endalgo
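As a concrete, if inefficient, illustration, the algorithm can be sketched in Python. This is our own sketch, not the linear-time implementation: the $F$-heavy edges are found by brute-force path maxima instead of the linear-time verification of Theorem \ref{ramverify}, edge weights are assumed distinct, and the removal of isolated vertices is replaced by relabelling during contraction. Edges carry identifiers so that the returned forest refers to the original graph.

```python
import random
from collections import defaultdict, deque

def boruvka_step(edges):
    """Each vertex (i.e., contracted component) picks its lightest incident edge."""
    best = {}
    for e in edges:
        w, u, v, eid = e
        for x in (u, v):
            if x not in best or best[x][0] > w:
                best[x] = e
    return list({e[3]: e for e in best.values()}.values())  # deduplicate

def contract(n, edges, selected):
    """Contract the selected edges, relabel the vertices, drop self-loops."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for _, u, v, _ in selected:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
    label = {r: i for i, r in enumerate(sorted({find(x) for x in range(n)}))}
    new_edges = [(w, label[find(u)], label[find(v)], eid)
                 for w, u, v, eid in edges if find(u) != find(v)]
    return len(label), new_edges

def f_heavy(adj, w, u, v):
    """(w,u,v) is F-heavy iff u,v are connected in F and w exceeds the maximum
    edge weight on the unique u-v path in F (brute-force BFS, not linear time)."""
    seen = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return w > seen[v]
        for y, wy in adj[x]:
            if y not in seen:
                seen[y] = max(seen[x], wy)
                queue.append(y)
    return False  # endpoints not connected in F: e is not F-heavy

def kkt_msf(n, edges):
    """Edge ids of the MSF; edges are (weight, u, v, id) tuples over
    vertices 0..n-1 with distinct weights."""
    if not edges:
        return set()
    B = []
    for _ in range(2):                    # two Boruvka steps
        sel = boruvka_step(edges)
        B.extend(sel)
        n, edges = contract(n, edges, sel)
    contracted = {e[3] for e in B}
    if not edges:
        return contracted
    H = [e for e in edges if random.random() < 0.5]  # sample with p = 1/2
    F_ids = kkt_msf(n, H)                            # first recursive call
    adj = defaultdict(list)
    for w, u, v, eid in edges:
        if eid in F_ids:
            adj[u].append((v, w))
            adj[v].append((u, w))
    G2 = [e for e in edges if not f_heavy(adj, e[0], e[1], e[2])]
    return kkt_msf(n, G2) | contracted               # second recursive call
```

Since the MSF of a graph with distinct weights is unique, the output is deterministic even though the recursion path depends on the coin flips.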
+
+\nota
+Let us analyse the time complexity of this algorithm by studying properties of its \df{recursion tree.}
+This tree describes the subproblems processed by the recursive calls. For any vertex~$v$
+of the tree, we denote the number of vertices and edges of the corresponding subproblem~$G_v$
+by~$n_v$ and~$m_v$ respectively.
+If $m_v>0$, the recursion continues: the left son of~$v$ corresponds to the
+call on the sampled subgraph~$H_v$, the right son to the reduced graph~$G^\prime_v$.
+(Similarly, we use letters subscripted with~$v$ for the state of the other variables
+of the algorithm.)
+The root of the recursion tree is obviously the original graph~$G$, the leaves are
+trivial graphs with no edges.
+
+\obs
+The Bor\o{u}vka steps together with the removal of isolated vertices guarantee that the number
+of vertices drops by at least a~factor of four in every recursive call. The size of a~subproblem~$G_v$
+at level~$i$ is therefore at most $n/4^i$ and the depth of the tree is at most $\lceil\log_4 n\rceil$.
+As there are no more than~$2^i$ subproblems at level~$i$, the sum of all~$n_v$'s
+on that level is at most $n/2^i$, which is at most~$2n$ when summed over the whole tree.
+
+We are going to show that the worst case of the KKT algorithm is no worse than
+that of the plain contractive algorithm, while the average case is linear.
+
+\lemma
+For every subproblem~$G_v$, the KKT algorithm spends $\O(m_v+n_v)$ time plus the cost
+of the recursive calls.
+
+\proof
+We know from Lemma \ref{contiter} that each Bor\o{u}vka step takes time $\O(m_v+n_v)$.\foot{We
+need to add $n_v$, because the graph could be disconnected.}
+The selection of the edges of~$H_v$ is straightforward.
+Finding the $F_v$-heavy edges is not, but we have already shown in Theorem \ref{ramverify}
+that linear time is sufficient on the RAM.
+\qed
+
+\thmn{Worst-case complexity of the KKT algorithm}
+The KKT algorithm runs in time $\O(\min(n^2,m\log n))$ in the worst case on the RAM.
+
+\proof
+The argument for the $\O(n^2)$ bound is similar to the analysis of the plain
+contractive algorithm. As every subproblem~$G_v$ is a~simple graph, the number
+of its edges~$m_v$ is less than~$n_v^2$. By the previous lemma, we spend time
+$\O(n_v^2)$ on it. Summing over all subproblems and recalling that $\sum_v n_v \le 2n$
+yields $\sum_v \O(n_v^2) = \O((\sum_v n_v)^2) = \O(n^2)$.
+
+In order to prove the $\O(m\log n)$ bound, it is sufficient to show that the total time
+spent on every level of the recursion tree is $\O(m)$. Suppose that $v$~is a~vertex
+of the recursion tree with its left son~$\ell$ and right son~$r$. Some edges of~$G_v$
+are removed in the Bor\o{u}vka steps, let us denote their number by~$b_v$.
+The remaining edges fall either to~$G_\ell = H_v$, or to $G_r = G^\prime_v$, or possibly
+to both.
+
+We can observe that the intersection $G_\ell\cap G_r$ cannot be large: The edges of~$H_v$ that
+are not in the forest~$F_v$ are $F_v$-heavy, so they do not end up in~$G_r$. Therefore the
+intersection can contain only the edges of~$F_v$. As $F_v$~is a~forest on the at most
+$n_v/4$ vertices left after the two Bor\o{u}vka steps, there are fewer than $n_v/4$
+such edges, and we have $m_\ell + m_r + b_v \le m_v + n_v/4$.
+
+On the other hand, the first Bor\o{u}vka step selects at least $n_v/2$ edges,
+so $b_v \ge n_v/2$. The duplication of edges between $G_\ell$ and~$G_r$ is therefore
+compensated by the edges lost to contraction and $m_\ell + m_r \le m_v$. So the total
+number of edges per level does not increase and it remains to apply the previous lemma.
+\qed