+\proof
+The first two steps of the algorithm are trivial, as we have linear time at our
+disposal.
+
+By the Partitioning theorem (\ref{partthm}), the call to \<Partition> with~$\varepsilon$
+set to a~constant takes $\O(m)$ time and produces a~collection of clusters, each of size
+at most~$t$, together with at most $m/4$ corrupted edges. It also guarantees that the
+connected components of the union of the $C_i$'s have at least~$t$ vertices
+(unless there is just a~single component).
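+(Concretely, if the statement of Theorem \ref{partthm} bounds the number of corrupted
+edges by $\varepsilon m$, which is how we read it here, then the constant in question is
+$\varepsilon = 1/4$ and the bound of $m/4$ corrupted edges follows directly.)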
+
+To apply the decision trees, we will use the framework of topological computations developed
+in Section \ref{bucketsort}. We pad all clusters in~$\C$ with isolated vertices, so that they
+have exactly~$t$ vertices. We use a~computation that labels the graph with a~pointer to
+its optimal decision tree. Then we apply Theorem \ref{topothm} combined with the
+brute-force construction of optimal decision trees from Lemma \ref{odtconst}. Together they guarantee
+that we can assign the decision trees to the clusters in time:
+$$\O\Bigl(\Vert\C\Vert + t^{t(t+2)} \cdot \bigl(2^{2^{4t^2}} + t^2\bigr)\Bigr)
+= \O\Bigl(m + 2^{2^{2^t}}\Bigr)
+= \O(m).$$
+Execution of the decision tree on each cluster~$C_i$ then takes $\O(D(C_i))$ steps.
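+Let us verify the two estimates in the displayed chain. For the first one, observe that for all
+sufficiently large~$t$ we have $t(t+2)\log t + 2^{4t^2} + 1 \le 2^{2^t}$, so the whole second
+term is at most $2^{2^{2^t}}$. For the second one, recall that the algorithm sets
+$t = \Theta(\log^{(3)} n)$ with $t \le \log^{(3)} n$ (this is how we read the choice of~$t$
+in the Optimal algorithm \ref{optimal}), hence
+$$2^{2^{2^t}} \le 2^{2^{2^{\log^{(3)} n}}} = 2^{2^{\log^{(2)} n}} = 2^{\log n} = n = \O(m),$$
+where the final step uses $m \ge n-1$, which holds as~$G$ is connected.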
+
+The contracted graph~$G_A$ has at most $n/t = \O(n / \log^{(3)}n)$ vertices and asymptotically
+the same number of edges as~$G$, so according to Corollary \ref{ijdens}, the Iterated Jarn\'\i{}k's
+algorithm runs on it in linear time.
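+To see that the corollary indeed applies, we assume that it requires density $\Omega(\log^{(3)} n)$
+and denote by $n_A$ and $m_A$ the number of vertices and edges of~$G_A$. Using $m \ge n-1$
+(connectedness of~$G$), the density of~$G_A$ is
+$${m_A \over n_A} = {\Theta(m) \over \O(n / \log^{(3)} n)}
+= \Omega\Bigl({m\over n}\cdot\log^{(3)} n\Bigr) = \Omega(\log^{(3)} n).$$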
+
+The combined graph~$G_B$ has~$n$ vertices, but fewer than~$n$ edges coming from the
+individual spanning trees and at most~$m/4$ additional edges which were
+corrupted. The Bor\o{u}vka steps on~$G_B$ take $\O(m)$
+time by Lemma \ref{boruvkaiter} and they produce a~graph~$G_C$ with at most~$n/4$
+vertices and at most $n/4 + m/4 \le m/2$ edges. (The $n$~tree edges in~$G_B$ are guaranteed
+to be reduced by the Bor\o{u}vka steps; see the tally below.) It is easy to verify that this
+graph is still connected, so we can recurse on it.
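+To verify the tally of edges, note that if the fewer than~$n$ tree edges of~$G_B$ form
+a~forest (which we assume here, as they come from the individual spanning trees), then,
+since a~contraction of a~forest is again a~forest, at most $n/4$ of them can survive
+in~$G_C$, while the corrupted edges number at most $m/4$ regardless:
+$$\vert E(G_C)\vert \le {n\over 4} + {m\over 4} \le {m\over 2},$$
+where the last inequality uses $n \le m$ (a~connected graph with $m < n$ is a~tree, for
+which the problem is trivial anyway).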
+
+The remaining steps of the algorithm can be easily performed in linear time, either directly
+or, in the case of the contractions, by the bucket-sorting techniques of Section \ref{bucketsort}.
+\qed
+
+\paran{Optimality}%
+The properties of decision tree complexity, which we have proven in the previous
+section, will help us show that the time complexity recurrence is satisfied by a~constant
+multiple of the decision tree complexity $D(m,n)$ itself. This way, we will prove
+the following theorem:
+
+\thmn{Optimality of the Optimal algorithm}
+The time complexity of the Optimal MST algorithm \ref{optimal} is $\Theta(D(m,n))$.
+
+\proof
+We will prove by induction that $T(m,n) \le cD(m,n)$ for some $c>0$. The base
+case is trivial; for the induction step we will expand on the previous lemma:
+$$\vcenter{\openup\jot\halign{\strut\hfil $\displaystyle{#}$&$\displaystyle{{}#}$\hfil&\quad#\hfil\cr
+T(m,n)
+ &\le \sum_i c_1 D(C_i) + T(m/2, n/4) + c_2 m &(Lemma \ref{optlemma})\cr
+ &\le c_1 D({\textstyle\bigcup}_i C_i) + T(m/2, n/4) + c_2 m &(Corollary \ref{dtpart})\cr
+ &\le c_1 D(m,n) + T(m/2, n/4) + c_2m &(definition of $D(m,n)$)\cr
+ &\le c_1 D(m,n) + cD(m/2, n/4) + c_2m &(induction hypothesis)\cr
+ &\le c_1 D(m,n) + c/2\cdot D(m,n/2) + c_2m &(Corollary \ref{dttwice})\cr
+ &\le c_1 D(m,n) + c/2\cdot D(m,n) + 2c_2 D(m,n) &(Lemma \ref{dtbasic})\cr
+ &\le (c_1 + c/2 + 2c_2) \cdot D(m,n)&\cr
+ &\le cD(m,n). &(by setting $c=2c_1+4c_2$)\cr
+}}$$
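+(A quick check of the final choice of constants: with $c = 2c_1 + 4c_2$ we get
+$c_1 + c/2 + 2c_2 = c_1 + (c_1 + 2c_2) + 2c_2 = 2c_1 + 4c_2 = c$.)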
+The other inequality is obvious, as $D(m,n)$ is an~asymptotic lower bound on
+the time complexity of every comparison-based algorithm.
+\qed
+
+\paran{Complexity of MST}%
+As we have already noted, the exact decision tree complexity $D(m,n)$ of the MST problem
+is still open, and therefore so is the time complexity of the optimal algorithm. However,
+every time we come up with another comparison-based algorithm, we can use its complexity
+(or, more specifically, the number of comparisons it performs, which can be even lower)
+as an~upper bound on the optimal algorithm.
+
+The best explicit comparison-based algorithm known to date achieves complexity $\O(m\alpha(m,n))$.\foot{$\alpha(m,n)$
+is a~certain inverse of the Ackermann function, $\lambda_k(n)$ is the row inverse of the same
+function. See \ref{ackerinv} for the exact definitions.}
+It has been discovered by Chazelle \cite{chazelle:ackermann} as an~improvement of his previous
+$\O(m\alpha(m,n)\cdot\log\alpha(m,n))$ algorithm \cite{chazelle:almostacker}.
+It is also based on the ideas of this chapter --- in particular the soft heaps and robust contractions.
+The algorithm is very complex and it involves a~lot of elaborate
+technical details, so we only refer to the original paper here. Another algorithm of the same
+complexity, using similar ideas, has been discovered independently by Pettie \cite{pettie:ackermann}, who,
+having the optimal algorithm at hand, does not concern himself with the low-level details and only
+bounds the number of comparisons. Using either of these results, we can prove an~Ackermannian
+upper bound on the optimal algorithm:
+
+\thmn{Upper bound on complexity of the Optimal algorithm}\id{optthm}%
+The time complexity of the Optimal MST algorithm is $\O(m\alpha(m,n))$.
+
+\proof
+We bound $D(m,n)$ by the number of comparisons performed by the algorithm of Chazelle \cite{chazelle:ackermann}
+or Pettie \cite{pettie:ackermann}.
+\qed
+
+Similarly to the Iterated Jarn\'\i{}k's algorithm, this bound is in fact linear for classes of graphs
+whose density is not extremely close to constant:
+
+\cor\id{lambdacor}%
+The Optimal MST algorithm runs in linear time whenever $m\ge n\cdot \lambda_k(n)$ for any fixed~$k$.
+
+\proof
+Combine the previous theorem with Lemma \ref{alphaconst}.
+\qed
+
+\rem
+It is also known from \cite{pettie:optimal} that when we run the Optimal algorithm on a~random
+graph drawn from either $G_{n,p}$ (random graphs on~$n$ vertices, where each edge is included with probability~$p$
+independently of the other edges) or $G_{n,m}$ (we draw the graph uniformly at random from the
+set of all graphs with~$n$ vertices and $m$~edges), it runs in linear time with high probability,
+regardless of the edge weights.
+
+\paran{Models of computation}%
+Another important consequence of the optimal algorithm is that when we aim for a~linear-time
+MST algorithm (or for proving that it does not exist), we do not need to care about computational
+models at all. The elaborate RAM data structures of Chapter \ref{ramchap}, which have helped us
+so much in the case of integer weights, cannot help if we are allowed to access the edge weights
+by performing comparisons only. We can even make use of non-uniform objects given by some sort
+of oracle. Indeed, whatever trick we employ to achieve linear time complexity, we can mimic it in the
+world of decision trees, and thus use it to show that the algorithm we already have is
+also linear.
+
+This, however, applies only to deterministic algorithms --- we have shown that access to a~source
+of random bits allows us to compute the MST in expected linear time (the KKT algorithm, \ref{kkt}).
+There have been attempts to derandomize the KKT algorithm, but so far the best result in this direction
+is the randomized algorithm, also by Pettie \cite{pettie:minirand}, which achieves expected linear time
+complexity with only $\O(\log^* n)$ random bits.