+against each \<Insert>. This yields the time bound.
+\qed
+
+\rem
+When we set $\varepsilon = 1/(2n)$, the soft heap is not allowed to corrupt any
+items, so it can be used like any traditional heap. By the standard lower bound on
+sorting it therefore requires $\Omega(\log n)$ time per operation, so the
+time complexity is optimal for this choice of~$\varepsilon$. Chazelle \cite{chazelle:softheap}
+proves that it is optimal for every choice of~$\varepsilon$.
+
+The space consumed by the heap need not be linear in the \em{current} number
+of items, but should this ever matter, we can fix it easily by rebuilding the
+whole data structure completely after every $n/2$ deletes. This increases the
+number of potentially corrupted items by a factor of at most two, so it
+suffices to halve~$\varepsilon$.
+
+%--------------------------------------------------------------------------------
+
+\section{Robust contractions}
+
+Having the soft heaps at hand, we would like to use them in a~conventional MST
+algorithm in place of a~normal heap. The most efficient specimen of a~heap-based
+algorithm we have seen so far is the Iterated Jarn\'\i{}k's algorithm (\ref{itjar}).
+It is based on a~simple, yet powerful idea: Run the Jarn\'\i{}k's algorithm with
+limited heap size, so that it stops when the neighborhood of the tree becomes too
+large. Grow multiple such trees, always starting in a~vertex not visited yet. All
+these trees are contained in the MST, so by the Contraction lemma
+(\ref{contlemma}) we can contract each of them to a~single vertex and iterate
+the algorithm on the resulting graph.
+
+We can try implanting the soft heap in this algorithm, preferably in the earlier
+version without active edges (\ref{jarnik}) as the soft heap lacks the \<Decrease>
+operation. This brave but somewhat simple-minded attempt is, however, doomed to
+fail. The reason is of course the corruption of items inside the heap, which
+increases the weights of some of the edges. In the presence of corrupted
+edges, most of the theory we have so carefully built breaks down. For example,
+the Blue lemma (\ref{bluelemma}) now holds only when we consider a~cut with no
+corrupted edges, with the possible exception of the lightest edge of the cut.
+Similarly, the Red lemma (\ref{redlemma}) is true only if the heaviest edge on
+the cycle is not corrupted.
+
+There is fortunately some light in this darkness. While the basic structural
+properties of MST's no longer hold, there is a~weaker form of the Contraction
+lemma that takes the corrupted edges into account. Before we prove this lemma,
+we expand our awareness of subgraphs which can be contracted.
+
+\defn
+A~subgraph $C\subseteq G$ is \df{contractible} iff for every pair of edges $e,f\in\delta(C)$\foot{That is,
+of~$G$'s edges with exactly one endpoint in~$C$.} there exists a~path in~$C$ connecting the endpoints
+of the edges $e,f$ such that all edges on this path are lighter than either $e$ or~$f$.
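To make the definition concrete, here is a brute-force check of contractibility on a toy edge-list representation. All helper names are ours, not from the text, and we read "lighter than either $e$ or~$f$" as lighter than the heavier of the two, which is the form actually used in the proofs below; a minimal sketch, not an efficient test.

```python
from itertools import combinations

def delta(edges, C):
    """Edges of G with exactly one endpoint in the vertex set C."""
    return [e for e in edges if (e[0] in C) != (e[1] in C)]

def light_path_exists(edges, C, s, t, bound):
    """DFS from s to t inside C using only edges lighter than `bound`."""
    if s == t:
        return True
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for (a, b, w) in edges:
            if w >= bound or a not in C or b not in C:
                continue
            v = b if a == u else a if b == u else None
            if v is not None and v not in seen:
                if v == t:
                    return True
                seen.add(v)
                stack.append(v)
    return False

def is_contractible(edges, C):
    """Check the definition directly for every pair of boundary edges."""
    for e, f in combinations(delta(edges, C), 2):
        s = e[0] if e[0] in C else e[1]   # endpoint of e inside C
        t = f[0] if f[0] in C else f[1]   # endpoint of f inside C
        if not light_path_exists(edges, C, s, t, max(e[2], f[2])):
            return False
    return True
```

For instance, the set $\{1,2\}$ joined by an edge of weight~1 is contractible in a triangle whose boundary edges weigh 4 and~5, but not when the internal edge is the heaviest of the three.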
+
+\example\id{jarniscont}%
+Let us see that when we stop the Jarn\'\i{}k's algorithm at some moment and
+we take a~subgraph~$C$ induced by the constructed tree, this subgraph is contractible.
+Indeed, when we consider any two distinct edges $uv, xy$ having exactly one endpoint in~$C$
+(w.l.o.g.~$u,x\in C$ and $v,y\not\in C$),
+they enter the algorithm's heap at some time. Without loss of generality $uv$~enters it earlier.
+Before the algorithm reaches the vertex~$x$, it adds the whole path $ux$ to the tree.
+As the edge~$uv$ never leaves the heap, all edges on the path $ux$ must be lighter
+than this edge.
+
+We can now easily reformulate the Contraction lemma (\ref{contlemma}) in the language
+of contractible subgraphs. We again assume that we are working with multigraphs
+and that they need not be connected.
+Furthermore, we slightly abuse the notation in that we omit the explicit bijection
+between $G-C$ and~$G/C$, so we can write $G=C \cup (G/C)$.
+
+\lemman{Generalized contraction}
+When~$C\subseteq G$ is a~contractible subgraph, then $\msf(G)=\msf(C) \cup \msf(G/C)$.
+
+\proof
+As both sides of the equality are forests spanning the same graph, it suffices
+to show that $\msf(G) \subseteq \msf(C)\cup\msf(G/C)$.
+Let us show that edges of~$G$ that do not belong to the right-hand side
+do not belong to the left-hand side either.
+We know that the edges that
+do not participate in the MSF of some graph are exactly those which are the heaviest
+on some cycle (this is the Cycle rule from Lemma \ref{redlemma}).
+
+Whenever an~edge~$g$ lies in~$C$, but not in~$\msf(C)$, then $g$~is the heaviest edge
+on some cycle in~$C$. As this cycle is also contained in~$G$, the edge $g$~does not participate
+in $\msf(G)$ either.
+
+Similarly for $g\in (G/C)\setminus\msf(G/C)$: when the cycle does not contain
+the vertex~$c$ to which we have contracted the subgraph~$C$, this cycle is present
+in~$G$, too. Otherwise we consider the edges $e,f$ incident with~$c$ on this cycle.
+Since $C$~is contractible, there must exist a~path~$P$ in~$C$ connecting the endpoints
+of~$e$ and~$f$ in~$G$, such that all edges of~$P$ are lighter than either $e$ or~$f$
+and hence also than~$g$. Expanding~$c$ in the cycle to the path~$P$ then produces
+a~cycle in~$G$ whose heaviest edge is~$g$.
+\qed
+
+We are now ready to bring corruption back to the game and state a~``robust'' version
+of this lemma. A~notation for corrupted graphs will be handy:
+
+\nota\id{corrnota}%
+When~$G$ is a~weighted graph and~$R$ a~subset of its edges, we will use $G\crpt
+R$ to denote an arbitrary graph obtained from~$G$ by increasing the weights of
+some of the edges in~$R$. As usual, we will assume that all edges of this graph
+have pairwise distinct weights. While this is technically not true for the corruption
+caused by soft heaps, we can easily make it so.
+
+Whenever~$C$ is a~subgraph of~$G$, we will use $R^C$ to refer to the edges of~$R$ with
+exactly one endpoint in~$C$ (i.e., $R^C = R\cap \delta(C)$).
+
+\lemman{Robust contraction, Chazelle \cite{chazelle:almostacker}}\id{robcont}%
+Let $G$ be a~weighted graph and $C$~its subgraph contractible in~$G\crpt R$
+for some set~$R$ of edges. Then $\msf(G) \subseteq \msf(C) \cup \msf((G/C) \setminus R^C) \cup R^C$.
+
+\proof
+We will modify the proof of the previous lemma. We will again consider all possible types
+of edges that do not belong to the right-hand side and we will show that they are the
+heaviest edges of certain cycles. Every edge~$g$ of~$G$ lies either in~$C$, or in $H=(G/C)\setminus R^C$,
+or possibly in~$R^C$.
+
+If $g\in C\setminus\msf(C)$, then the same argument as before applies.
+
+If $g\in H\setminus\msf(H)$, we consider the cycle in~$H$ on which $g$~is the heaviest.
+When $c$ (the vertex to which we have contracted~$C$) is outside this cycle, we are done.
+Otherwise we observe that the edges $e,f$ adjacent to~$c$ on this cycle cannot be corrupted
+(they would be in~$R^C$ otherwise, which is impossible). By contractibility of~$C$ there exists
+a~path~$P$ in~$C\crpt (R\cap C)$ such that all edges of~$P$ are lighter than $e$~or~$f$ and hence
+also than~$g$. The weights of the edges of~$P$ in the original graph~$G$ cannot be higher than
+in~$G\crpt R$, so the path~$P$ is lighter than~$g$ even in~$G$ and we can again perform the
+trick with expanding the vertex~$c$ to~$P$ in the cycle. Hence $g\not\in\msf(G)$.
+\qed
+
+\para
+We still intend to mimic the Iterated Jarn\'\i{}k's algorithm. We will partition the given graph into a~collection~$\C$
+of non-overlapping contractible subgraphs called \df{clusters} and we put aside all edges that got corrupted in the process.
+We recursively compute the MSF of those subgraphs and of the contracted graph. Then we take the
+union of these MSF's and add the corrupted edges. According to the previous lemma, this does not produce
+the MSF of~$G$, but a~sparser graph containing it, on which we can continue.
+
+%%HACK%%
+\>We can formulate the exact partitioning algorithm and its properties as follows:
+
+\algn{Partition a~graph into a~collection of contractible clusters}\id{partition}%
+\algo
+\algin A~graph~$G$ with an~edge comparison oracle, a~parameter~$t$ controlling the size of the clusters,
+ and an~accuracy parameter~$\varepsilon$.
+\:Mark all vertices as ``live''.
+\:$\C\=\emptyset$, $R^\C\=\emptyset$. \cmt{Start with an~empty collection and no corrupted edges.}
+\:While there is a~live vertex~$v_0$:
+\::$T=\{v_0\}$. \cmt{the tree that we currently grow}
+\::$K=\emptyset$. \cmt{edges known to be corrupted in the current iteration}
+\::\<Create> a~soft heap with accuracy~$\varepsilon$ and \<Insert> the edges adjacent to~$v_0$ into it.
+\::While the heap is not empty and $\vert T\vert < t$:
+\:::\<DeleteMin> an~edge $uv$ from the heap, assume $u\in T$.
+\:::If $uv$ was corrupted, add it to~$K$.
+\:::If $v\in T$, drop the edge and repeat the previous two steps.
+\:::$T\=T\cup\{v\}$.
+\:::If $v$ is dead, break out of the current loop.
+\:::Insert all edges incident with~$v$ to the heap.
+\::$\C\=\C \cup \{G[T]\}$. \cmt{Record the cluster induced by the tree.}
+\::\<Explode> the heap and add all remaining corrupted edges to~$K$.
+\::$R^\C\=R^\C \cup K^T$. \cmt{Record the ``interesting'' corrupted edges.}
+\::$G\=G\setminus K^T$. \cmt{Remove the corrupted edges from~$G$.}
+\::Mark all vertices of~$T$ as ``dead''.
+\algout The collection $\C$ of contractible clusters and the set~$R^\C$ of
+corrupted edges in the neighborhood of these clusters.
+\endalgo
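The control flow of the Partition algorithm can be sketched as follows. This is a hedged illustration only: an ordinary binary heap stands in for the soft heap, so no edge is ever corrupted and the bookkeeping of $K$, $R^\C$ and \<Explode> disappears; the adjacency format and the deterministic choice of the live vertex are our own assumptions.

```python
import heapq

def partition(adj, t):
    """Grow bounded trees as in the Partition algorithm.
    `adj` maps a vertex to a list of (weight, neighbor) pairs;
    `t` limits the number of vertices per cluster."""
    live = set(adj)
    clusters = []
    while live:
        v0 = min(live)                 # any live vertex; min() for determinism
        T = {v0}
        heap = [(w, v0, u) for (w, u) in adj[v0]]
        heapq.heapify(heap)
        while heap and len(T) < t:
            w, _, v = heapq.heappop(heap)
            if v in T:
                continue               # both endpoints already in the tree
            T.add(v)
            if v not in live:          # joined an earlier tree: stop growing
                break
            for (w2, u) in adj[v]:
                heapq.heappush(heap, (w2, v, u))
        clusters.append(T)             # record the cluster induced by T
        live -= T                      # all vertices of T die
    return clusters
```

On the path $1-2-3-4$ with increasing weights and $t=2$, the sketch produces the clusters $\{1,2\}$, $\{2,3\}$, $\{3,4\}$: the clusters are edge-disjoint but may share dead vertices, exactly as in the algorithm above.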
+
+\thmn{Partitioning to contractible clusters, Chazelle \cite{chazelle:almostacker}}\id{partthm}%
+Given a~weighted graph~$G$ and parameters $\varepsilon$ ($0<\varepsilon\le 1/2$)
+and~$t$, the Partition algorithm (\ref{partition}) constructs a~collection
+$\C=\{C_1,\ldots,C_k\}$ of clusters and a~set~$R^\C$ of edges such that:
+
+\numlist\ndotted
+\:All the clusters and the set~$R^\C$ are mutually edge-disjoint.
+\:Each cluster contains at most~$t$ vertices.
+\:Each vertex of~$G$ is contained in at least one cluster.
+\:The connected components of the union of all clusters have at least~$t$ vertices each,
+ except perhaps for those which are equal to a~connected component of $G\setminus R^\C$.
+\:$\vert R^\C\vert \le 2\varepsilon m$.
+\:$\msf(G) \subseteq \bigcup_i \msf(C_i) \cup \msf\bigl((G / \bigcup_i C_i) \setminus R^\C\bigr) \cup R^\C$.
+\:The algorithm runs in time $\O(n+m\log (1/\varepsilon))$.
+\endlist
+
+\proof
+Claim~1: The Partition algorithm grows a~series of trees which induce the clusters~$C_i$ in~$G$.
+A~tree is built from edges adjacent to live vertices
+and once it is finished, all vertices of the tree die, so no edge is ever reused in another
+tree. The edges that enter~$R^\C$ are immediately deleted from the graph, so they never participate
+in any tree.
+
+Claim~2: The algorithm stops when all vertices are dead, so each vertex must have
+entered some tree.
+
+Claim~3: The trees have at most~$t$ vertices each, which limits the size of the
+$C_i$'s as well.
+
+Claim~4: We can show that each connected component has $t$~or more vertices as we already did
+in the proof of Lemma \ref{ijsize}: How can a~new tree stop growing? Either it acquires
+$t$~vertices, or it joins one of the existing trees (this only increases the
+size of the component), or the heap becomes empty (which means that the tree spans
+a~full component of the current graph and hence also of the final~$G\setminus R^\C$).
+
+Claim~5: The set~$R^\C$ is a~subset of the edges corrupted by the soft heaps over
+the course of the algorithm. As every edge is inserted to a~heap at most once
+per endpoint, the heaps can corrupt at worst $2\varepsilon m$ edges altogether.
+
+We will prove the remaining two claims as separate lemmata.
+\qed
+
+\lemman{Correctness of Partition, Claim 6 of Theorem \ref{partthm}}
+$$\msf(G) \subseteq \bigcup_i \msf(C_i) \cup \msf\biggl( \bigl(G / \bigcup_i C_i \bigr) \setminus R^\C\biggr) \cup R^\C.$$
+
+\proof
+By iterating the Robust contraction lemma (\ref{robcont}). Let $K_i$ be the set of edges
+corrupted when constructing the cluster~$C_i$ in the $i$-th phase of the algorithm,
+and similarly for the state of the other variables at that time.
+We will use induction on~$i$ to prove that the lemma holds at the end of every phase.
+
+At the beginning, the statement is obviously true, even as an~equality.
+In the $i$-th phase we construct the cluster~$C_i$ by running the partial Jarn\'\i{}k's algorithm on the graph
+$G_i = G\setminus\bigcup_{j<i} K_{\smash j}^{C_j}$.
+If it were not for corruption, the cluster~$C_i$ would be contractible, as we already know from Example \ref{jarniscont}.
+When the edges in~$K_i$ get corrupted, the cluster is contractible in some corrupted graph
+$G_i \crpt K_i$. (We have to be careful as the edges are becoming corrupted gradually,
+but it is easy to check that it can only improve the situation.) Since $C_i$~shares no edges
+with~$C_j$ for any~$j<i$, we know that~$C_i$ is also a~contractible subgraph of $(G_i / \bigcup_{j<i} C_j) \crpt K_i$.
+Now we can use the Robust contraction lemma to show that:
+$$
+\msf\biggl(\bigl(G / \bigcup_{j<i} C_j \bigr) \setminus \bigcup_{j<i} K_j^{C_j} \biggr) \subseteq
+\msf(C_i) \cup \msf\biggl(\bigl(G / \bigcup_{j\le i} C_j \bigr) \setminus \bigcup_{j\le i} K_j^{C_j} \biggr) \cup K_i^{C_i}.
+$$
+This completes the induction step, because $R^\C = \bigcup_j K_j^{C_j}$.
+\qed
+
+\lemman{Efficiency of Partition, Claim 7 of Theorem \ref{partthm}}
+The Partition algorithm runs in time $\O(n+m\log(1/\varepsilon))$.
+
+\proof
+The inner loop (steps 7--13) processes the edges of the current cluster~$C_i$ and also
+the edges in its neighborhood $\delta(C_i)$. Steps 6 and~13 insert every such edge to the heap
+at most once, so steps 8--12 visit each edge also at most once. For every edge, we spend
+$\O(\log(1/\varepsilon))$ time amortized on inserting it and $\O(1)$ on the other operations
+(by Theorem \ref{softheap} on performance of the soft heaps).
+
+Each edge of~$G$ can participate in some $C_i\cup \delta(C_i)$ only twice before both its
+endpoints die. The inner loop therefore processes $\O(m)$ edges total, so it takes $\O(m\log(1/\varepsilon))$
+time. The remaining parts of the algorithm spend $\O(m)$ time on operations with clusters and corrupted edges
+and additionally $\O(n)$ on identifying the live vertices.
+\qed
+
+%--------------------------------------------------------------------------------
+
+\section{Decision trees}\id{dtsect}%
+
+The Pettie's and Ramachandran's algorithm combines the idea of robust partitioning with optimal decision
+trees constructed by brute force for very small subgraphs. In this section, we will
+explain the basics of the decision trees and prove several lemmata which will
+turn out to be useful for the analysis of time complexity of the final algorithm.
+
+Let us consider all computations of some comparison-based MST algorithm when we
+run it on a~fixed graph~$G$ with all possible permutations of edge weights.
+The computations can be described by a~binary tree. The root of the tree corresponds to the first
+comparison performed by the algorithm and depending on its result, the computation
+continues either in the left subtree or in the right subtree. There it encounters
+another comparison and so on, until it arrives at a~leaf of the tree where the
+spanning tree found by the algorithm is recorded.
+
+Formally, the decision trees are defined as follows:
+
+\defnn{Decision trees and their complexity}\id{decdef}%
+An~\df{MSF decision tree} for a~graph~$G$ is a~binary tree. Its internal vertices
+are labeled with pairs of $G$'s edges to be compared, each of the two outgoing tree edges
+corresponds to one possible result of the comparison.\foot{There are two possible
+outcomes since there is no reason to compare an~edge with itself and we, as usual,
+expect that the edge weights are distinct.}
+Leaves of the tree are labeled with spanning trees of the graph~$G$.
+
+A~\df{computation} of the decision tree on a~specific permutation of edge weights
+in~$G$ is the path from the root to a~leaf such that the outcome of every comparison
+agrees with the edge weights. The result of the computation is the spanning tree
+assigned to its final leaf.
+A~decision tree is \df{correct} iff for every permutation the corresponding
+computation results in the real MSF of~$G$ with the particular weights.
+
+The \df{time complexity} of a~decision tree is defined as its depth. It therefore
+bounds the number of comparisons spent on every path. (It need not be equal since
+some paths need not correspond to an~actual computation --- the sequence of outcomes
+on the path could be unsatisfiable.)
+
+A~decision tree is called \df{optimal} if it is correct and its depth is minimum possible
+among the correct decision trees for the given graph.
+We will denote an~arbitrary optimal decision tree for~$G$ by~${\cal D}(G)$ and its
+complexity by~$D(G)$.
+
+The \df{decision tree complexity} $D(m,n)$ of the MSF problem is the maximum of~$D(G)$
+over all graphs~$G$ with $n$~vertices and~$m$ edges.
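These definitions are small enough to make concrete. Below is a hand-built correct MSF decision tree of depth two for the triangle on edges $a,b,c$ (the MSF consists of all edges except the heaviest, which two comparisons identify); the tuple encoding is our own toy convention, not from the text.

```python
from itertools import permutations

# Internal node: ("cmp", e, f, subtree_if_e_lighter, subtree_otherwise)
# Leaf: ("msf", frozenset_of_msf_edges)
TREE = ("cmp", "a", "b",
        ("cmp", "b", "c",                      # a < b: heaviest is b or c
         ("msf", frozenset({"a", "b"})),       # b < c: drop c
         ("msf", frozenset({"a", "c"}))),      # c < b: drop b
        ("cmp", "a", "c",                      # b < a: heaviest is a or c
         ("msf", frozenset({"a", "b"})),       # a < c: drop c
         ("msf", frozenset({"b", "c"}))))      # c < a: drop a

def run(tree, w):
    """Follow the computation path for the weight assignment w."""
    while tree[0] == "cmp":
        _, e, f, lo, hi = tree
        tree = lo if w[e] < w[f] else hi
    return tree[1]

def correct(tree):
    """Check the tree against every permutation of distinct weights."""
    for p in permutations([1, 2, 3]):
        w = dict(zip("abc", p))
        heaviest = max(w, key=w.get)
        if run(tree, w) != frozenset({"a", "b", "c"}) - {heaviest}:
            return False
    return True
```

Since three weight orderings of the heaviest edge must lead to three different leaves, depth two is also optimal here, so $D(G)=2$ for the triangle.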
+
+\obs
+Decision trees are the most general deterministic comparison-based computation model possible.
+The only operations that count in their time complexity are comparisons. All
+other computation is free, including solving NP-complete problems or having
+access to an~unlimited source of non-uniform constants. The decision tree
+complexity is therefore an~obvious lower bound on the time complexity of the
+problem in all other comparison-based models.
+
+The downside is that we do not know any explicit construction of the optimal
+decision trees, nor even a~non-constructive determination of their complexity.
+On the other hand, the complexity of any existing comparison-based algorithm
+can be used as an~upper bound on the decision tree complexity. For example:
+
+\lemma
+$D(m,n) \le 4/3 \cdot n^2$.
+
+\proof
+Let us count the comparisons performed by the Contractive Bor\o{u}vka's algorithm
+(\ref{contbor}), tightening up the constants in its previous analysis in Theorem
+\ref{contborthm}. In the first iteration, each edge participates in two comparisons
+(one per endpoint), so the algorithm performs at most $2m \le 2{n\choose 2} \le n^2$
+comparisons. Then the number of vertices drops at least by a~factor of two, so
+the subsequent iterations spend at most $(n/2)^2, (n/4)^2, \ldots$ comparisons, which sums
+to less than $n^2\cdot\sum_{i=0}^\infty (1/4)^i = 4/3 \cdot n^2$. Between the Bor\o{u}vka steps,
+we flatten the multigraph to a~simple graph, which also needs some comparisons,
+but for every such comparison we remove one of the participating edges, which saves
+at least one comparison in the subsequent steps.
+\qed
+
+\para
+Of course we can get sharper bounds from the better algorithms, but we will first
+show how to find the optimal trees using brute force. The complexity of the search
+will be of course enormous, but as we already promised, we will need the optimal
+trees only for very small subgraphs.
+
+\lemman{Construction of optimal decision trees}\id{odtconst}%
+An~optimal MSF decision tree for a~graph~$G$ on~$n$ vertices can be constructed on
+the Pointer Machine in time $\O(2^{2^{4n^2}})$.
+
+\proof
+We will try all possible decision trees of depth at most $2n^2$
+(we know from the previous lemma that the desired optimal tree is shallower). We can obtain
+any such tree by taking the complete binary tree of exactly this depth
+and labeling its $2\cdot 2^{2n^2}-1$ vertices with comparisons and spanning trees. Those labeled
+with comparisons become internal vertices of the decision tree, the others
+become leaves and the parts of the tree below them are removed. There are less
+than $n^4$ possible comparisons and less than $2^{n^2}$ spanning trees of~$G$,
+so the number of candidate decision trees is bounded by
+$(n^4+2^{n^2})^{2^{2n^2+1}} \le \bigl(2^{n^2+1}\bigr)^{2^{2n^2+1}} = 2^{(n^2+1)\cdot 2^{2n^2+1}} \le 2^{2^{3n^2}}$,
+as $n^4\le 2^{n^2}$ and $n^2+1\le 2^{n^2-1}$ for every $n\ge 2$ (smaller graphs are trivial).
+
+We will enumerate the trees in an~arbitrary order, test each of them for correctness and
+find the shallowest tree among those correct. Testing can be accomplished by running
+through all possible permutations of edges, each time calculating the MSF using any
+of the known algorithms and comparing it with the result given by the decision tree.
+The number of permutations does not exceed $(n^2)! \le (n^2)^{n^2} \le n^{2n^2} \le 2^{n^3}$
+and each one can be checked in time $\O(\poly(n))$.
+
+On the Pointer Machine, trees and permutations can be certainly enumerated in time
+$\O(\poly(n))$ per object. The time complexity of the whole algorithm is therefore
+$\O(2^{2^{3n^2}} \cdot 2^{n^3} \cdot \poly(n)) = \O(2^{2^{4n^2}})$.
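The two elementary estimates behind the counting bound can be verified with exact integer arithmetic; the helper below (our own check, comparing base-2 logarithms so the numbers stay manageable) confirms them for small~$n$.

```python
def counting_chain_holds(n):
    """Verify the two estimates behind the bound
    (n^4 + 2^(n^2))^(2^(2n^2+1)) <= 2^(2^(3n^2)) for a given n >= 2."""
    base_ok = n**4 + 2**(n*n) <= 2**(n*n + 1)          # i.e. n^4 <= 2^(n^2)
    exp_ok = (n*n + 1) * 2**(2*n*n + 1) <= 2**(3*n*n)  # compare exponents
    return base_ok and exp_ok
```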