-\examplen{Euclidean MST}
-The MST also has its counterparts in the realm of geometric algorithms. Suppose
-that we have $n$~points $x_1,\ldots,x_n$ in the plane and we want to find the
-shortest system of segments connecting these points. If we want the segments to
-touch only in the given points, this is equivalent to finding the MST of the
-complete graph on the vertices $V=\{x_1,\ldots,x_n\}$ with edge weights
-defined as the Euclidean distances of the points. Since the graph is dense,
-many of the MST algorithms we have discussed run in time linear in the size
-of the graph, hence in time $\O(n^2)$.
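For dense graphs, Prim's algorithm with a simple array (no heap) already achieves time linear in the number of edges. A minimal sketch on the implicit complete graph of the points (function and variable names are illustrative, not from the source):

```python
import math

def euclidean_mst_weight(points):
    """Prim's algorithm on the implicit complete graph of the given
    points: O(n^2) time, i.e., linear in the number of edges."""
    n = len(points)
    if n == 0:
        return 0.0
    dist = [math.inf] * n      # cheapest connection of each point to the tree
    dist[0] = 0.0
    in_tree = [False] * n
    total = 0.0
    for _ in range(n):
        # pick the cheapest point not yet in the tree (linear scan)
        v = min((i for i in range(n) if not in_tree[i]), key=lambda i: dist[i])
        in_tree[v] = True
        total += dist[v]
        # relax the connection costs through v
        vx, vy = points[v]
        for u in range(n):
            if not in_tree[u]:
                d = math.hypot(points[u][0] - vx, points[u][1] - vy)
                if d < dist[u]:
                    dist[u] = d
    return total
```

Since the edge weights are computed on demand from the coordinates, the complete graph never has to be stored explicitly.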
-
-There is a~more efficient method based on the observation that the MST
-is always a~subgraph of the Delaunay tessellation of the given points
-(this was first noted by Shamos and Hoey in~\cite{shamos:closest}). The
-tessellation is a~planar graph, which guarantees that it has $\O(n)$ edges,
-and it is the dual graph of the Voronoi diagram of the given points, which can
-be constructed in time $\O(n\log n)$ using, for example, Fortune's
-algorithm \cite{fortune:voronoi}. We can therefore reduce the problem
-to finding the MST of the tessellation, for which $\O(n\log n)$ time
-is more than sufficient.
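Once the $\O(n)$ Delaunay edges are available (e.g., from a computational-geometry library such as SciPy's `scipy.spatial.Delaunay`), the second step is an ordinary sparse-graph MST computation. A hedged sketch of that step using Kruskal's algorithm, assuming the edge list is supplied by the caller:

```python
import math

def mst_from_edge_list(points, edges):
    """Kruskal's algorithm on a sparse edge list in O(m log m) time.
    For the Euclidean MST, `edges` would be the O(n) Delaunay edges;
    the MST of that subgraph equals the MST of the complete graph."""
    def length(e):
        (ax, ay), (bx, by) = points[e[0]], points[e[1]]
        return math.hypot(ax - bx, ay - by)

    parent = list(range(len(points)))
    def find(x):                       # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree, total = [], 0.0
    for e in sorted(edges, key=length):
        ru, rv = find(e[0]), find(e[1])
        if ru != rv:                   # edge joins two components: keep it
            parent[ru] = rv
            tree.append(e)
            total += length(e)
    return tree, total
```

The union-find structure keeps the whole computation within the $\O(n\log n)$ budget dominated by sorting.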
-
-This approach fails for non-Euclidean metrics, but in some cases
-(in particular for the rectilinear metric) the $\O(n\log n)$ time bound
-is also achievable by the algorithm of Zhou et al.~\cite{zhou:nodel}
-based on the sweep-line technique and the Red rule. For other
-variations on the geometric MST, see Eppstein's survey paper
-\cite{eppstein:spanning}.
-
-\examplen{Steiner trees}
-The constraint that the segments in the previous example are allowed to touch
-each other only in the given points looks artificial and it is indeed uncommon in
-practical applications (including the problem of designing electrical transmission
-lines originally studied by Bor\r{u}vka). If we lift this restriction, we get
-the problem known as the Steiner tree.\foot{It is named after the Swiss mathematician
-Jacob Steiner who studied a~special case of this problem in the 19th century.}
-We can also define it in terms of graphs:
-
-\defn A~\df{Steiner tree} of a~weighted graph~$(G,w)$ with a~set~$M\subseteq V$
-of \df{mandatory vertices} is a~tree~$T\subseteq G$ that contains all the mandatory
-vertices and whose weight is the minimum possible.
-
-For $M=V$ the Steiner tree is identical to the MST, but for an~arbitrary
-choice of the mandatory vertices the problem becomes NP-hard. This has been proven by Garey and Johnson
-\cite{garey:steiner,garey:rectisteiner} for not only the graph version with
-weights $\{1,2\}$, but also for the planar version with Euclidean or rectilinear
-metric. There is a~polynomial approximation algorithm with ratio $5/3$ for
-graphs due to Pr\"omel and Steger \cite{proemel:steiner} and a~polynomial-time
-approximation scheme for the Euclidean Steiner tree in an~arbitrary dimension
-by Arora \cite{arora:tspapx}.
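To make the connection between Steiner trees and MSTs concrete, here is a sketch of the simpler folklore factor-2 approximation (take the MST of the metric closure of the mandatory vertices), which predates the $5/3$ algorithm cited above; this is not the Prömel–Steger method, and the graph representation (adjacency lists of `(neighbor, weight)` pairs) is an assumption of this sketch:

```python
import heapq

def steiner_2approx_weight(adj, mandatory):
    """Factor-2 Steiner tree approximation: the MST of the metric closure
    on the mandatory vertices.  Expanding its edges into shortest paths
    yields a Steiner tree of at most this weight, which is known to be
    at most twice the optimum."""
    def dijkstra(src):                     # shortest paths from one vertex
        dist = {src: 0}
        pq = [(0, src)]
        while pq:
            d, v = heapq.heappop(pq)
            if d > dist.get(v, float('inf')):
                continue
            for u, w in adj[v]:
                nd = d + w
                if nd < dist.get(u, float('inf')):
                    dist[u] = nd
                    heapq.heappush(pq, (nd, u))
        return dist

    M = list(mandatory)
    closure = []                           # complete graph on M, shortest-path weights
    for i, s in enumerate(M):
        dist = dijkstra(s)
        for j in range(i + 1, len(M)):
            closure.append((dist[M[j]], i, j))

    parent = list(range(len(M)))           # Kruskal's MST on the closure
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    total = 0
    for w, i, j in sorted(closure):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            total += w
    return total
```

On a star with center~$c$ and mandatory leaves, the sketch returns~4 while the optimum is~3, illustrating that the factor-2 bound can be nearly tight.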
-
-\examplen{Approximating the weight of the MST}
-Sometimes we are not interested in the actual edges forming the MST and only
-the weight matters. If we are willing to put up with a~randomized approximation,
-we can even achieve sub-linear complexity. Chazelle et al.~\cite{chazelle:mstapprox}
-have shown an~algorithm which, given $0 < \varepsilon < 1/2$, approximates
-the weight of the MST of a~graph with average degree~$d$ and edge weights from the set
-$\{1,\ldots,w\}$ in time $\O(dw\varepsilon^{-2}\cdot\log(dw/\varepsilon))$,
-producing a~weight which has relative error at most~$\varepsilon$ with probability
-at least $3/4$. They have also proven an~almost matching lower bound $\Omega(dw\varepsilon^{-2})$.
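The estimator rests on an exact identity: for a~connected graph with integer weights in $\{1,\ldots,w\}$, the MST weight equals $n - w + \sum_{i=1}^{w-1} c_i$, where $c_i$ counts the components of the subgraph formed by edges of weight at most~$i$. A sketch of the exact (non-sampled) identity, which Chazelle et al. then approximate by estimating each $c_i$ via random sampling:

```python
def mst_weight_via_components(n, edges, w):
    """Exact form of the identity behind the sublinear estimator:
    wt(MST) = n - w + sum_{i=1}^{w-1} c_i, where c_i is the number of
    components of the subgraph of edges with weight <= i.
    `edges` is a list of (u, v, weight) triples with integer weights."""
    def components(threshold):
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        comps = n
        for u, v, wt in edges:
            if wt <= threshold:
                ru, rv = find(u), find(v)
                if ru != rv:
                    parent[ru] = rv
                    comps -= 1             # each union merges two components
        return comps

    return n - w + sum(components(i) for i in range(1, w))
```

The identity follows from counting, for each threshold~$i$, how many MST edges of weight exceeding~$i$ are needed to connect the $c_i$ components.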
-
-For the $d$-dimensional Euclidean case, there is a~randomized approximation
-algorithm by Czumaj et al.~\cite{czumaj:euclidean} which with high probability
-produces a~spanning tree within relative error~$\varepsilon$ in $\widetilde\O(\sqrt{n}\cdot \poly(1/\varepsilon))$\foot{%
-$\widetilde\O(f) = \O(f\cdot\log^{\O(1)} f)$ and $\poly(n)=n^{\O(1)}$.}
-queries to a~data structure containing the points. The data structure is expected
-to answer orthogonal range queries and cone approximate nearest neighbor queries.
-There is also a~$\widetilde\O(n\cdot \poly(1/\varepsilon))$ time approximation
-algorithm for the MST weight in arbitrary metric spaces by Czumaj and Sohler \cite{czumaj:metric}.
-(This is still sub-linear since the corresponding graph has roughly $n^2$ edges.)
+\thmn{Expected complexity of the KKT algorithm}\id{kktavg}%
+The expected time complexity of the KKT algorithm on the RAM is $\O(m)$.
+
+\proof
+The structure of the recursion tree depends on the random choices taken,
+but as its worst-case depth is at most~$\lceil \log_4 n\rceil$, the tree
+is always a~subtree of the complete binary tree of that depth. We will
+therefore prove the theorem by examining the complete tree, possibly with
+empty subproblems in some vertices.
+
+The left edges of the tree (edges connecting a~parent with its left
+son) form a~set of \df{left paths.} Let us consider the expected time spent on
+a~single left path. When walking the path downwards from its top vertex~$r$,
+the expected size of the subproblems decreases exponentially: for a~son~$\ell$
+of a~vertex~$v$, we have $n_\ell \le n_v/4$ and $\E m_\ell = \E m_v/2$. The
+expected total time spent on the path is therefore $\O(n_r+m_r)$ and it remains
+to sum this over all left paths.
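The summation along a~single left path can be made explicit. Assuming, as above, that the expected time spent at the vertex of depth~$i$ below~$r$ is linear in the size of its subproblem, the bounds $n_i \le n_r/4^i$ and $\E m_i \le \E m_r/2^i$ give
$$
\sum_{i\ge 0} \left( {n_r\over 4^i} + {\E m_r\over 2^i} \right)
\le {4\over 3}\,n_r + 2\,\E m_r = \O(n_r + m_r),
$$
since both series are geometric.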
+
+With the exception of the left path starting at the root of the tree,
+the top~$r$ of a~left path is always the right son of a~unique parent vertex~$v$.
+Since the subproblem~$G_r$ has been obtained from its parent subproblem~$G_v$
+by filtering out all heavy edges, we can use the Sampling lemma (\ref{samplemma}) to show that
+$\E m_r \le 2n_v$. The sum of the expected sizes of all top subproblems is
+then $\sum_r (n_r + \E m_r) \le \sum_v 3n_v = \O(n)$. After adding the exceptional path
+from the root, we get $\O(m+n)=\O(m)$.
+\qed
+
+\paran{High probability}%
+There is also a~high-probability version of the above theorem. According to
+Karger, Klein and Tarjan \cite{karger:randomized}, the time complexity
+of the algorithm is $\O(m)$ with probability $1-\exp(-\Omega(m))$. The proof
+again follows the recursion tree, this time applying the Chernoff bound
+\cite{chernoff} to bound the tail probabilities.
+
+\paran{Different sampling}%
+We could also use a~slightly different formulation of the Sampling lemma
+suggested by Chan \cite{chan:backward}. He changes the selection of the subgraph~$H$
+to choosing an~$mp$-edge subset of~$E(G)$ uniformly at random. The proof is then
+a~straightforward application of the backward analysis method. We have nevertheless
+preferred Karger's original version, because generating a~random subset of a~given size
+requires an~unbounded number of random bits in the worst case.
+
+\paran{On the Pointer Machine}%
+The only place where we needed the power of the RAM is finding the heavy edges,
+so we can employ the pointer-machine verification algorithm mentioned in \ref{pmverify}
+to bring the results of this section to the~PM.