diff --git a/adv.tex b/adv.tex
index dc4e70f..2a81be3 100644
--- a/adv.tex
+++ b/adv.tex
@@ -6,9 +6,9 @@
 \section{Minor-closed graph classes}\id{minorclosed}%
 
-The contractive algorithm given in section~\ref{contalg} has been found to perform
+The contractive algorithm given in Section~\ref{contalg} has been found to perform
 well on planar graphs, but in the general case its time complexity was not linear.
-Can we find any broader class of graphs where the algorithm is still efficient?
+Can we find any broader class of graphs where this algorithm is still linear?
 The right context turns out to be the minor-closed graph classes, which are
 closed under contractions and have bounded density.
 
@@ -18,7 +18,7 @@ from a~subgraph of~$G$ by a sequence of simple graph contractions (see \ref{simp
 \defn
 A~class~$\cal C$ of graphs is \df{minor-closed}, when for every $G\in\cal C$ and
-its every minor~$H$, the graph~$H$ lies in~$\cal C$ as well. A~class~$\cal C$ is called
+every minor~$H$ of~$G$, the graph~$H$ lies in~$\cal C$ as well. A~class~$\cal C$ is called
 \df{non-trivial} if at least one graph lies in~$\cal C$ and at least one lies outside~$\cal C$.
 
 \example
@@ -32,8 +32,8 @@ Non-trivial minor-closed classes include:
 \para
 Many of the nice structural properties of planar graphs extend to
-minor-closed classes, too (see \cite{lovasz:minors} for a~nice survey
-of this theory and \cite{diestel:gt} for some of the deeper results).
+minor-closed classes, too (see Lov\'asz \cite{lovasz:minors} for a~nice survey
+of this theory and Diestel \cite{diestel:gt} for some of the deeper results).
 The most important property is probably the characterization of such classes
 in terms of their forbidden minors.
 
@@ -52,31 +52,30 @@ For example, the planar graphs can be equivalently described as the class $\Forb
 --- this follows from Kuratowski's theorem (the theorem speaks of forbidden
 subdivisions, but while in general this is not the same as forbidden minors, it
 is for $K_5$ and $K_{3,3}$). The celebrated theorem by Robertson and Seymour
-guarantees that we can always find a~finite set of forbidden minors.
+guarantees that we can always find a~finite set of forbidden minors:
 
 \thmn{Excluded minors, Robertson \& Seymour \cite{rs:wagner}}
 For every non-trivial minor-closed graph class~$\cal C$ there exists
 a~finite set~$\cal H$ of graphs such that ${\cal C}=\Forb({\cal H})$.
+\qed
 
-\proof
 This theorem has been proven in a~long series of papers on graph minors culminating
 with~\cite{rs:wagner}. See this paper and follow the references to the previous
 articles in the series.
-\qed
 
 \para
 For analysis of the contractive algorithm,
 we will make use of another important property --- the bounded density of minor-closed
 classes. The connection between minors and density dates back to Mader in the 1960's
 and it can be proven without the use of the Robertson-Seymour
-theorem.
+theory.
 
 \defn\id{density}%
-Let $\cal C$ be a class of graphs. We define its \df{edge density} $\varrho(\cal C)$
-to be the infimum of all~$\varrho$'s such that $m(G) \le \varrho\cdot n(G)$
-holds for every $G\in\cal C$.
+Let $G$ be a~graph and $\cal C$ be a class of graphs. We define the \df{edge density}
+$\varrho(G)$ of~$G$ as the average number of edges per vertex, i.e., $m(G)/n(G)$. The
+edge density $\varrho(\cal C)$ of the class is then defined as the supremum of $\varrho(G)$ over all $G\in\cal C$.
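\example
To give a~concrete value: every planar graph on at least three vertices satisfies $m(G)\le 3n(G)-6$, so the class of planar graphs has edge density at most~3, and large triangulations show that this bound cannot be improved. This is the value of~$\varrho$ used later in this section when the algorithm is tuned for planar graphs.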
-\thmn{Mader \cite{mader:dens}}
+\thmn{Mader \cite{mader:dens}}\id{maderthm}%
For every $k\in{\bb N}$ there exists $h(k)\in{\bb R}$ such that every graph
of average degree at least~$h(k)$ contains a~subdivision of~$K_{k}$ as a~subgraph.

@@ -85,7 +84,7 @@ of average degree at least~$h(k)$ contains a~subdivision of~$K_{k}$ as a~subgrap
 Let us fix~$k$ and prove by induction on~$m$ that every graph of average
 degree at least~$2^m$ contains a~subdivision of some graph with $k$~vertices
-and ${k\choose 2}\ge m\ge k$~edges. For $m={k\choose 2}$ the theorem follows
+and $m$~edges (for $k\le m\le {k\choose 2}$). When we reach $m={k\choose 2}$, the theorem follows
 as the only graph with~$k$ vertices and ${k\choose 2}$~edges is~$K_k$.
 
 The base case $m=k$: Let us observe that when the average degree
@@ -98,13 +97,13 @@ the cycle~$C_k$.
 Induction step: Let~$G$ be a~graph with average degree at least~$2^m$ and
 assume that the theorem already holds for $m-1$. Without loss of generality,
 $G$~is connected. Consider a~maximal set $U\subseteq V$ such that the subgraph $G[U]$
-induced by~$U$ is connected and the graph $G.U$ ($G$~with $U$~contracted to
+induced by~$U$ is connected and the graph $G\sgc U$ ($G$~with $U$~contracted to
 a~single vertex) has average degree at least~$2^m$ (such~$U$ exists, because
-$G=G.U$ whenever $\vert U\vert=1$). Now consider the subgraph~$H$ induced
+$G=G\sgc U$ whenever $\vert U\vert=1$). Now consider the subgraph~$H$ induced
 in~$G$ by the neighbors of~$U$. Every $v\in V(H)$ must have $\deg_H(v) \ge 2^{m-1}$,
 as otherwise we can add this vertex to~$U$, contradicting its
 maximality. By the induction hypothesis, $H$ contains a~subdivision of some
-graph~$R$ with $r$~vertices and $m-1$ edges. Any two non-adjacent vertices
+graph~$R$ with $k$~vertices and $m-1$ edges. Any two non-adjacent vertices
 of~$R$ can be connected in the subdivision by a~path lying entirely in~$G[U]$,
 which reveals a~subdivision of a~graph with $m$~edges. \qed

@@ -112,8 +111,9 @@ which reveals a~subdivision of a~graph with $m$~edges. \qed
 Every non-trivial minor-closed class of graphs has finite edge density.
 
 \proof
-Let~$\cal C$ be any such class, $X$~its smallest excluded minor and $x=n(X)$.
-As $H\minorof K_x$, the class $\cal C$ entirely lies in ${\cal C}'=\Forb(K_x)$, so
+Let~$\cal C$ be any such class, $X$~its excluded minor with the smallest number
+of vertices~$x$.
+As $X\minorof K_x$, the class $\cal C$ is entirely contained in ${\cal C}'=\Forb(K_x)$, so
 $\varrho({\cal C}) \le \varrho({\cal C}')$ and therefore it suffices to prove the theorem
 for classes excluding a~single complete graph~$K_x$.
 
@@ -123,32 +123,29 @@
 edges, its average degree would be at least~$h(x)$, so by the previous theorem
 $G$~would contain a~subdivision of~$K_x$ and hence $K_x$ as a~minor.
 \qed
 
-\rem
-Minor-closed classes share many other interesting properties, for example bounded chromatic
-numbers of various kinds, as shown by Theorem 6.1 of \cite{nesetril:minors}.
-
 Let us return to the analysis of our algorithm.
 
-\thmn{MST on minor-closed classes \cite{mm:mst}}\id{mstmcc}%
+\thmn{MST on minor-closed classes, Tarjan \cite{tarjan:dsna}}\id{mstmcc}%
 For any fixed non-trivial minor-closed class~$\cal C$ of graphs,
 the Contractive Bor\o{u}vka's algorithm (\ref{contbor}) finds the MST of any graph of this class in time
 $\O(n)$. (The constant hidden in the~$\O$ depends on the class.)
\proof Following the proof for planar graphs (\ref{planarbor}), we denote the graph considered -by the algorithm at the beginning of the $i$-th iteration by~$G_i$ and its number of vertices +by the algorithm at the beginning of the $i$-th Bor\o{u}vka step by~$G_i$ and its number of vertices and edges by $n_i$ and $m_i$ respectively. Again the $i$-th phase runs in time $\O(m_i)$ -and $n_i \le n/2^i$, so it remains to show a linear bound for the $m_i$'s. +and we have $n_i \le n/2^i$, so it remains to show a linear bound for the $m_i$'s. Since each $G_i$ is produced from~$G_{i-1}$ by a sequence of edge contractions, -all $G_i$'s are minors of~$G$.\foot{Technically, these are multigraph contractions, +all $G_i$'s are minors of the input graph.\foot{Technically, these are multigraph contractions, but followed by flattening, so they are equivalent to contractions on simple graphs.} -So they also belong to~$\cal C$ and by the previous theorem $m_i\le \varrho({\cal C})\cdot n_i$. +So they also belong to~$\cal C$ and by the Density theorem $m_i\le \varrho({\cal C})\cdot n_i$. +The time complexity is therefore $\sum_i \O(m_i) = \sum_i \O(n_i) = \O(\sum_i n/2^i) = \O(n)$. \qed \paran{Local contractions}\id{nobatch}% The contractive algorithm uses ``batch processing'' to perform many contractions -in a single step. It is also possible to perform contractions one edge at a~time, +in a single step. It is also possible to perform them one edge at a~time, batching only the flattenings. A~contraction of an edge~$uv$ can be done in time~$\O(\deg(u))$ by removing all edges incident with~$u$ and inserting them back with $u$ replaced by~$v$. Therefore we need to find a lot of vertices with small @@ -181,7 +178,7 @@ $\deg(v)\le 4\varrho$. \:While $n(G)>1$: \::While there exists a~vertex~$v$ such that $\deg(v)\le t$: \:::Select the lightest edge~$e$ incident with~$v$. -\:::Contract~$G$ along~$e$. +\:::Contract~$e$. \:::$T\=T + \ell(e)$. \::Flatten $G$, removing parallel edges and loops. \algout Minimum spanning tree~$T$. @@ -203,8 +200,8 @@ are minors of the graph~$G$ given as the input. For the choice $t=4\varrho$, the Lemma on low-degree vertices (\ref{lowdeg}) guarantees that at the beginning of the $i$-th iteration, at least $n_i/2$ vertices have degree at most~$t$. Each selected edge removes one such vertex and -possibly increases the degree of another, so at least $n_i/4$ edges get selected. -Hence $n_i\le 3/4\cdot n_{i-1}$ and therefore $n_i\le n\cdot (3/4)^i$ and the +possibly increases the degree of another one, so at least $n_i/4$ edges get selected. +Hence $n_i\le 3/4\cdot n_{i-1}$ and $n_i\le n\cdot (3/4)^i$, so the algorithm terminates after $\O(\log n)$ iterations. Each selected edge belongs to $\mst(G)$, because it is the lightest edge of @@ -213,20 +210,20 @@ The steps 6 and~7 therefore correspond to the operation described by the Contraction Lemma (\ref{contlemma}) and when the algorithm stops, $T$~is indeed the minimum spanning tree. -It remains to analyse the time complexity of the algorithm. Since $G_i\in{\cal C}$, we have +It remains to analyse the time complexity of the algorithm. Since $G_i\in{\cal C}$, we know that $m_i\le \varrho n_i \le \varrho n/2^i$. We will show that the $i$-th iteration is carried out in time $\O(m_i)$. Steps 5 and~6 run in time $\O(\deg(v))=\O(t)$ for each~$v$, so summed -over all $v$'s they take $\O(tn_i)$, which is linear for a fixed class~$\cal C$. 
-Flattening takes $\O(m_i)$, as already noted in the analysis of the Contracting
+over all $v$'s they take $\O(tn_i)$, which is $\O(n_i)$ for a fixed class~$\cal C$.
+Flattening takes $\O(m_i)$ as already noted in the analysis of the Contracting
 Bor\o{u}vka's Algorithm (see \ref{contiter}). The whole algorithm therefore
 runs in time $\O(\sum_i m_i) = \O(\sum_i n/2^i) = \O(n)$.
 \qed
 
 \paran{Back to planar graphs}%
-For planar graphs, we can get a sharper version of the low-degree lemma,
-showing that the algorithm works with $t=8$ as well (we had $t=12$ as
+For planar graphs, we can obtain a sharper version of the low-degree lemma
+showing that the algorithm works with $t=8$ as well (we had $t=12$ from
 $\varrho=3$). While this does not change the asymptotic time complexity
 of the algorithm, the constant-factor speedup can still delight the hearts of
 its practical users.
 
@@ -238,39 +235,50 @@ have degree at most~8.
 
 \proof
 It suffices to show that the lemma holds for triangulations (if there
 are any edges missing, the situation can only get better) with at
-least 3 vertices. Since $G$ is planar, $\sum_v \deg(v) < 6n$.
+least 4 vertices. Since $G$ is planar, we have $\sum_v \deg(v) < 6n$.
 The numbers $d(v):=\deg(v)-3$ are non-negative and $\sum_v d(v) < 3n$, so
 by the same argument as in the proof of the general lemma, for at least $n/2$
-vertices~$v$ it holds that $d(v) < 6$, hence $\deg(v) \le 8$.
+vertices~$v$ it holds that $d(v) < 6$, and thus $\deg(v) \le 8$.
 \qed
 
 \rem\id{hexa}%
 The constant~8 in the previous lemma is the best we can have.
 Consider a $k\times k$ triangular grid. It has $n=k^2$ vertices, $\O(k)$ of them
-lie on the outer face and have degrees at most~6, the remaining $n-\O(k)$ interior
+lie on the outer face and they have degree at most~6, the remaining $n-\O(k)$ interior
 vertices have degree exactly~6. Therefore the number of faces~$f$ is $6/3\cdot n=2n$,
 ignoring terms of order $\O(k)$. All interior triangles can be properly colored with
 two colors, black and white. Now add a~new vertex inside each white face and connect
-it to all three vertices on the boundary of that face. This adds $f/2 \approx n$
+it to all three vertices on the boundary of that face (see the picture). This adds $f/2 \approx n$
 vertices of degree~3 and it increases the degrees of the original $\approx n$ interior
-vertices to~9, therefore about a half of the vertices of the new planar graph
+vertices to~9, therefore about a~half of the vertices of the new planar graph
 have degree~9.
 
 \figure{hexangle.eps}{\epsfxsize}{The construction from Remark~\ref{hexa}}
 
 \rem
-The observation in~Theorem~\ref{mstmcc} was also independently made by Gustedt in~\cite{gustedt:parallel}
-who studied a~parallel version of the contractive Bor\o{u}vka's algorithm applied
+The observation in~Theorem~\ref{mstmcc} was also used by Gustedt \cite{gustedt:parallel}
+to construct a~parallel version of the Contractive Bor\o{u}vka's algorithm applied
 to minor-closed classes.
 
+\rem
+The bound on the average degree needed to enforce a~$K_k$ minor, which we get from Theorem \ref{maderthm},
+is very coarse. Kostochka \cite{kostochka:lbh} and independently Thomason \cite{thomason:efc}
+have proven that an~average degree $\Omega(k\sqrt{\log k})$ is sufficient and that this
+is the best we can get.
+
+\rem
+Minor-closed classes share many other interesting properties, for example bounded chromatic
+numbers of various kinds, as shown by Theorem 6.1 of \cite{nesetril:minors}. We can expect
+that many algorithmic problems will turn out to be easy for them.
+
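Before leaving this section, let us illustrate the local-contraction algorithm in code. The following Python sketch is our own illustration, not a~part of the thesis: it assumes a~connected input graph with distinct edge weights, and it stores the graph as nested dictionaries, which keeps at most one (the lightest) edge per pair of vertices, so flattening happens implicitly during every contraction. It makes no attempt at the pointer-machine efficiency of the analysis above; it only demonstrates that the lightest edge at a~low-degree vertex can be contracted in $\O(\deg(v))$ time.

    def mst_local_contractions(adj, rho):
        # adj: dict vertex -> dict neighbor -> (weight, label); the labels
        # identify the edges of the original input graph.
        t = 4 * rho                      # degree threshold from the Lemma
        mst = []
        while len(adj) > 1:
            low = [v for v in adj if len(adj[v]) <= t]
            if not low:                  # cannot happen when the class has density rho
                raise ValueError("graph is too dense for the given rho")
            for v in low:
                if v not in adj:         # v was already contracted away
                    continue
                # the lightest edge incident with v belongs to the MST
                u, (w, label) = min(adj[v].items(), key=lambda it: it[1][0])
                mst.append(label)
                # contract v into u in O(deg(v)) time, keeping the lighter
                # of any pair of parallel edges that arises
                del adj[u][v], adj[v][u]
                for x, (wx, lx) in list(adj[v].items()):
                    del adj[x][v]
                    if x not in adj[u] or wx < adj[u][x][0]:
                        adj[u][x] = (wx, lx)
                        adj[x][u] = (wx, lx)
                del adj[v]
        return mst

Note that a~vertex popped from the candidate list may meanwhile have grown above the threshold; processing it anyway is still correct (the lightest incident edge is always in the MST under distinct weights), it merely does not contribute to the degree argument of the analysis.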
%--------------------------------------------------------------------------------

\section{Iterated algorithms}\id{iteralg}%

We have seen that the Jarn\'\i{}k's Algorithm \ref{jarnik} runs in $\Theta(m\log n)$ time.
-Fredman and Tarjan have shown a~faster implementation in~\cite{ft:fibonacci}
-using their Fibonacci heaps. In this section, we convey their results and we
-show several interesting consequences.
+Fredman and Tarjan \cite{ft:fibonacci} have shown a~faster implementation
+using their Fibonacci heaps. In this section, we will convey their results and we
+will show several interesting consequences.

The previous implementation of the algorithm used a binary heap to store all edges
separating the current tree~$T$ from the rest of the graph, i.e., edges of the cut~$\delta(T)$.
Instead of that, for each vertex outside~$T$ we will remember only the lightest edge
of the cut incident with it (we will call these edges \df{active})
and keep them in a~Fibonacci heap, ordered by weight. When we want to extend~$T$
by the lightest edge of~$\delta(T)$, it is sufficient to find the lightest active
edge~$uv$ and add this edge to~$T$ together with the new vertex~$v$.
Then we have to update the active edges as follows. The edge~$uv$ has just ceased to
-be active. We scan all neighbors~$w$ of the vertex~$v$. When $w$~is in~$T$, no action
+be active. We scan all neighbors~$w$ of the vertex~$v$. When $w$~is already in~$T$, no action
is needed. If $w$~is outside~$T$ and it was not adjacent to~$T$ (there is no active edge
remembered for it so far), we set the edge~$vw$ as active. Otherwise we check the existing
active edge for~$w$ and replace it by~$vw$ if the new edge is lighter.

The following algorithm shows how these operations translate to insertions, decreases
-and deletions on the heap.
+and deletions in the heap.

\algn{Active Edge Jarn\'\i{}k; Fredman and Tarjan \cite{ft:fibonacci}}\id{jarniktwo}%
\algo
\algin A~graph~$G$ with an edge comparison oracle.
\:$v_0\=$ an~arbitrary vertex of~$G$.
\:$T\=$ a tree containing just the vertex~$v_0$.
-\:$H\=$ a~Fibonacci heap of active edges stored as pairs $(u,v)$ where $u\in T,v\not\in T$, ordered by the weights $w(uv)$, initially empty.
+\:$H\=$ a~Fibonacci heap of active edges stored as pairs $(u,v)$ where $u\in T,v\not\in T$, ordered by the weights $w(uv)$, and initially empty.
\:$A\=$ a~mapping of vertices outside~$T$ to their active edges in the heap; initially all elements undefined.
\:\<Insert> all edges incident with~$v_0$ to~$H$ and update~$A$ accordingly.
\:While $H$ is not empty:
@@ -307,15 +315,15 @@ and deletions on the heap.
\algout Minimum spanning tree~$T$.
\endalgo

-\para
-To analyze the time complexity of this algorithm, we will use the standard
+\paran{Analysis}%
+To analyse the time complexity of this algorithm, we will use the standard
theorem on~complexity of the Fibonacci heap:

\thmn{Fibonacci heaps, Fredman and Tarjan \cite{ft:fibonacci}}
The~Fibonacci heap performs the following operations with the indicated amortized time complexities:
\itemize\ibull
\:\<Insert> (insertion of a~new element) in $\O(1)$,
-\:\<Decrease> (decreasing value of an~existing element) in $\O(1)$,
+\:\<Decrease> (decreasing the value of an~existing element) in $\O(1)$,
\:\<Merge> (merging of two heaps into one) in $\O(1)$,
\:\<DeleteMin> (deletion of the minimal element) in $\O(\log n)$,
\:\<Delete> (deletion of an~arbitrary element) in $\O(\log n)$,
@@ -343,7 +351,7 @@ thus by the previous theorem the operations take $\O(m+n\log n)$ time in total.
\qed

\cor
-For graphs with edge density at least $\log n$, this algorithm runs in linear time.
+For graphs with edge density $\Omega(\log n)$, this algorithm runs in linear time.
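The bookkeeping of active edges is easy to mirror in code. The following Python sketch is again our illustration, not the thesis's implementation: the standard heapq module is a~binary heap without a~\<Decrease> operation, so the sketch re-inserts an~edge whenever the active edge of a~vertex improves and skips the stale entries on extraction. This gives the $\O(m\log n)$ behaviour of the binary-heap version; substituting a~heap with constant-time \<Decrease> (such as a~Fibonacci heap) would recover the bound above. A~connected input graph is assumed.

    import heapq

    def active_edge_jarnik(adj, v0):
        # adj: dict vertex -> list of (weight, neighbor); distinct weights
        in_tree = {v0}
        active = {}              # A: vertex outside T -> (weight, endpoint in T)
        heap = []                # H: entries (weight, u, v) with u in T
        tree_edges = []
        for w, x in adj[v0]:
            active[x] = (w, v0)
            heapq.heappush(heap, (w, v0, x))
        while heap:
            w, u, v = heapq.heappop(heap)
            if v in in_tree or active.get(v) != (w, u):
                continue         # stale entry: v joined T, or a lighter edge won
            in_tree.add(v)
            del active[v]
            tree_edges.append((u, v, w))
            for w2, x in adj[v]: # update the active edges of v's neighbors
                if x not in in_tree and (x not in active or w2 < active[x][0]):
                    active[x] = (w2, v)
                    heapq.heappush(heap, (w2, v, x))
        return tree_edges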
\remn{Other heaps}%
We can consider using other kinds of heaps that have the property that inserts
and decreases are constant-time. The time bound of the previous theorem is however already
optimal (by the standard $\Omega(n\log n)$ lower bound on sorting by comparisons, see
for example \cite{clrs}), so the other data structures can improve only
multiplicative constants or offer an~easier implementation.

-A~nice example is a~\df{$d$-regular heap} --- a~variant of the usual binary heap
+A~nice example is the \df{$d$-regular heap} --- a~variant of the usual binary heap
in the form of a~complete $d$-regular tree. \<Insert>, \<Decrease> and other operations
involving bubbling the values up spend $\O(1)$ time at a~single level, so they
run in~$\O(\log_d n)$ time. \<Delete> and \<DeleteMin> require bubbling down, which incurs
@@ -365,7 +373,7 @@ Another possibility is to use the 2-3-heaps \cite{takaoka:twothree} or Trinomial
heaps \cite{takaoka:trinomial}. Both have the same asymptotic complexity as Fibonacci
heaps (the latter even in the worst case, but it does not matter here) and their
authors claim faster implementation. For integer weights, we can use Thorup's priority
-queues described in \cite{thorup:pqsssp} which have constant-time \<Insert> and \<Decrease>
+queues described in \cite{thorup:sssp} which have constant-time \<Insert> and \<Decrease>
and $\O(\log\log n)$ time \<DeleteMin>. (We will however omit the details since we will
show a~faster integer algorithm soon.)

@@ -373,8 +381,8 @@ show a~faster integer algorithm soon.)
As we already noted, the improved Jarn\'\i{}k's algorithm runs in linear time
for sufficiently dense graphs. In some cases, it is useful to combine it with
another MST algorithm, which identifies a~part of the MST edges and contracts
-the graph to increase its density. For example, we can perform several Bor\o{u}vka
-steps and find the rest of the MST by the Active Edge Jarn\'\i{}k's algorithm.
+them to increase the density of the graph. For example, we can perform several Bor\o{u}vka
+steps and then find the rest of the MST by the Active Edge Jarn\'\i{}k's algorithm.

\algn{Mixed Bor\o{u}vka-Jarn\'\i{}k}
\algo
@@ -399,19 +407,19 @@ and both trees can be combined in linear time, too.
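For the record, the arithmetic behind this combination is a~one-line computation (our own presentation, assuming the standard choice of $\lceil\log\log n\rceil$ Bor\o{u}vka steps): each step takes $\O(m)$ time and at least halves the number of vertices, so at most $n/\log n$ vertices enter the Jarn\'\i{}k phase and the Fibonacci-heap bound becomes linear, giving
$$
\O(m\log\log n) + \O\Bigl(m + \frac{n}{\log n}\cdot\log n\Bigr) = \O(m\log\log n)
$$
for the mixed algorithm as a~whole.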
\paran{Iterating Jarn\'\i{}k's algorithm}%
Actually, there is a~much better choice of the algorithms to combine: use the
-Active Edge Jarn\'\i{}k's algorithm multiple times, each time stopping after a~while.
+Active Edge Jarn\'\i{}k's algorithm multiple times, each time stopping it after a~while.
A~good choice of the stopping condition is to place a~limit on the size of the heap.
We start with an~arbitrary vertex, grow the tree as usual and once the heap gets too
large, we conserve the current tree and start with a~different vertex and an~empty heap. When this
process runs out of vertices, it has identified a~sub-forest of the MST, so we can
-contract the graph along the edges of~this forest and iterate.
+contract the edges of~this forest and iterate.

\algn{Iterated Jarn\'\i{}k; Fredman and Tarjan \cite{ft:fibonacci}}\id{itjar}%
\algo
\algin A~graph~$G$ with an edge comparison oracle.
\:$T\=\emptyset$. \cmt{edges of the MST}
\:$\ell(e)\=e$ for all edges~$e$. \cmt{edge labels as usual}
-\:$m_0\=m$.
+\:$m_0\=m$. \cmt{in the following, $n$ and $m$ will change with the graph}
\:While $n>1$: \cmt{We will call iterations of this loop \df{phases}.}
\::$F\=\emptyset$. \cmt{forest built in the current phase}
\::$t\=2^{\lceil 2m_0/n \rceil}$. \cmt{the limit on heap size}
@@ -423,7 +431,7 @@ contract the graph along the edges of~$F$ and iterate.
\:::Denote the resulting tree~$R$.
\:::$F\=F\cup R$.
\::$T\=T\cup \ell[F]$. \cmt{Remember MST edges found in this phase.}
-\::Contract~$G$ along all edges of~$F$ and flatten it.
+\::Contract all edges of~$F$ and flatten~$G$.
\algout Minimum spanning tree~$T$.
\endalgo

@@ -438,10 +446,10 @@ However the choice of the parameter~$t$ can seem mysterious, the following
lemma makes the reason clear:

\lemma\id{ijphase}%
-The $i$-th phase of the Iterated Jarn\'\i{}k's algorithm runs in time~$\O(m)$.
+Each phase of the Iterated Jarn\'\i{}k's algorithm runs in time~$\O(m)$.

\proof
-During the phase, the heap always contains at most~$t_i$ elements, so it takes
+During the $i$-th phase, the heap always contains at most~$t_i$ elements, so it takes
time~$\O(\log t_i)=\O(m/n_i)$ to delete an~element from the heap. The trees~$R_i^j$
are edge-disjoint, so there are at most~$n_i$ \<DeleteMin>'s over the course of the phase.
Each edge is considered at most twice (once per its endpoint), so the number
@@ -454,13 +462,13 @@ Unless the $i$-th phase is final, the forest~$F_i$ consists of at most $2m_i/t_i

\proof
As every edge of~$G_i$ is incident with at most two trees of~$F_i$, it is sufficient
to establish that there are at least~$t_i$ edges incident with every such tree, including
-connecting two vertices of the tree.
+edges connecting two vertices of the same tree.

The forest~$F_i$ evolves by additions of the trees~$R_i^j$. Let us consider the possibilities
how the algorithm could have stopped growing the tree~$R_i^j$:
\itemize\ibull
\:the heap had more than~$t_i$ elements (step~10): since each element stored in the heap
-	corresponds to a~unique edges incident with~$R_i^j$, we have enough such edges;
+	corresponds to a~unique edge incident with~$R_i^j$, we have enough such edges;
\:the algorithm just added a~vertex of~$F_i$ to~$R_i^j$ (step~9): in this case, an~existing
	tree of~$F_i$ is extended, so the number of edges incident with it cannot decrease;\foot{%
	This is the place where we needed to count the interior edges as well.}
@@ -486,8 +494,8 @@ $$
\left. \vcenter{\hbox{$\displaystyle t_i \ge 2^{2^{\scriptstyle 2^{\scriptstyle\rddots^{\scriptstyle m/n}}}} $}}\;\right\}
\,\hbox{a~tower of~$i$ exponentials.}
$$
-As soon as~$t_i\ge n$, the $i$-th phase must be final, because at that time
-there is enough space in the heap to process the whole graph. So~there are
+As soon as~$t_i\ge n$, the $i$-th phase is final, because at that time
+there is enough space in the heap to process the whole graph without stopping. So~there are
at most~$\beta(m,n)$ phases and we already know that each phase runs in linear
time (Lemma~\ref{ijphase}).
\qed

\cor
The Iterated Jarn\'\i{}k's algorithm runs in time $\O(m\log^* n)$.

\proof
-$\beta(m,n) \le \beta(1,n) = \log^* n$.
+$\beta(m,n) \le \beta(n,n) \le \log^* n$.
\qed

\cor\id{ijdens}%
at least~$\log^{(k)} n$ for some $k\in{\bb N}^+$, it runs in time~$\O(km)$.

\proof
If $m/n \ge \log^{(k)} n$, then $\beta(m,n)\le k$.
\qed

-\obs
+\paran{Integer weights}%
The algorithm spends most of the time in phases which have small heaps. Once the
heap grows to $\Omega(\log^{(k)} n)$ for any fixed~$k$, the graph gets dense enough
to guarantee that at most~$k$ phases remain. This means that if we are able to
construct a~heap of size $\Omega(\log^{(k)} n)$ with constant time per operation,
we can get a~linear-time algorithm for MST.
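How violently the heap limits grow is easy to check numerically. The following tiny Python calculation is ours, not the thesis's; it pessimistically treats the vertex count as constant (using only the derived recurrence $t_{i+1}\ge 2^{t_i}$), which can only overestimate the number of phases.

    def phases(n, m):
        # t_1 = 2^(2m/n); afterwards t_{i+1} >= 2^{t_i} (the tower above)
        t, i = 2 ** max(1, (2 * m) // n), 1
        while t < n:                  # the phase with t_i >= n is final
            t, i = 2 ** min(t, 64), i + 1   # cap: 2^64 exceeds any practical n
        return i

    print(phases(10**9, 2 * 10**9))   # -> 3

Even for a~graph with $10^9$ vertices and density~2, three phases suffice; all that is missing for a~linear-time algorithm is a~small heap with constant-time operations.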
This is the case when the weights are integers:

-\thmn{MST for graphs with integer weights, Fredman and Willard \cite{fw:transdich}}\id{intmst}%
+\thmn{MST for integer weights, Fredman and Willard \cite{fw:transdich}}\id{intmst}%
MST of a~graph with integer edge weights can be found in time $\O(m)$ on the Word-RAM.

\proof
-We will combine the Iterated Jarn\'\i{}k's algorithm with the Q-heaps from section \ref{qheaps}.
+We will combine the Iterated Jarn\'\i{}k's algorithm with the Q-heaps from Section \ref{qheaps}.
We modify the first pass of the algorithm to choose $t=\log n$ and use the Q-heap tree instead
of the Fibonacci heap. From Theorem \ref{qh} and Remark \ref{qhtreerem} we know that the
operations on the Q-heap tree run in constant time, so the modified first phase takes
time~$\O(m)$. Following the analysis of the original algorithm in the proof of Theorem
\ref{itjarthm} we obtain $t_2\ge 2^{t_1} = 2^{\log n} = n$, so the algorithm stops after the
second phase.\foot{%
-Alternatively, we can use the Q-heaps directly with $k=\log^{1/4}n$ and then stop
+Alternatively, we can use the Q-heaps directly with $k=\log^{1/4}n$ and then the algorithm stops
after the third phase.}
\qed

-\rem
+\paran{Further improvements}%
Gabow et al.~\cite{gabow:mst} have shown how to speed up the Iterated Jarn\'\i{}k's algorithm
to~$\O(m\log\beta(m,n))$.
They split the adjacency lists of the vertices to small buckets, keep each
bucket sorted and consider only the lightest edge in each bucket until it is
removed. The mechanics of the algorithm are complex and there are a~lot of technical details
which need careful handling, so we omit the description of this algorithm.
-A~better algorithm will be shown in Chapter \ref{optchap}.
+A~better algorithm will be shown in Chapter~\ref{optchap}.

%--------------------------------------------------------------------------------

-\section{Verification of minimality}
+\section{Verification of minimality}\id{verifysect}%

Now we will turn our attention to a~slightly different problem: given a~spanning
tree, how to verify that it is minimum? We will show that this can be achieved
in linear time; the tools developed here will also be used in the randomized
MST algorithm in Section~\ref{randmst}.

MST verification has been studied by Koml\'os \cite{komlos:verify}, who has
proven that $\O(m)$ edge comparisons are sufficient, but his algorithm needed
-superlinear time to find the edges to compare. Dixon, Rauch and Tarjan
-have later shown in \cite{dixon:verify} that the overhead can be reduced
+super-linear time to find the edges to compare. Dixon, Rauch and Tarjan \cite{dixon:verify}
+have later shown that the overhead can be reduced
to linear time on the RAM using preprocessing and table lookup on small
subtrees. Later, King has given a~simpler algorithm in \cite{king:verifytwo}.

In this section, we will follow Koml\'os's steps and study the comparisons
needed, saving the actual efficient implementation for later.

\para
-To verify that a~spanning~$T$ is minimum, it is sufficient to check that all
+To verify that a~spanning tree~$T$ is minimum, it is sufficient to check that all
edges outside~$T$ are $T$-heavy (by the Minimality Theorem, \ref{mstthm}). In fact, we will be
able to find all $T$-light edges efficiently. For each edge $uv\in E\setminus T$,
we will find the heaviest edge of the tree path $T[u,v]$ and compare its weight
-to $w(uv)$.
It is therefore sufficient to solve the following problem: \problem Given a~weighted tree~$T$ and a~set of \df{query paths} $Q \subseteq \{ T[u,v] \mid u,v\in V(T) \}$ -specified by their endpoints, find the heaviest edge (\df{peak}) for every path in~$Q$. +specified by their endpoints, find the heaviest edge \df{(peak)} of every path in~$Q$. \paran{Bor\o{u}vka trees}% Finding the peaks can be burdensome if the tree~$T$ is degenerated, so we will first reduce it to the same problem on a~balanced tree. We run the Bor\o{u}vka's algorithm on~$T$, which certainly produces $T$ itself, and we -record the order in which the subtrees have been merged in another tree~$B(T)$. +record the order, in which the subtrees have been merged, in another tree~$B(T)$. The peak queries on~$T$ can be then easily translated to peak queries on~$B(T)$. \defn For a~weighted tree~$T$ we define its \df{Bor\o{u}vka tree} $B(T)$ as a~rooted tree which records the execution of the Bor\o{u}vka's algorithm run on~$T$. The leaves of $B(T)$ are all the vertices of~$T$, an~internal vertex~$v$ at level~$i$ from the bottom -corresponds to a~component tree~$C(v)$ formed in the $i$-th phase of the algorithm. When +corresponds to a~component tree~$C(v)$ formed in the $i$-th iteration of the algorithm. When a~tree $C(v)$ selects an adjacent edge~$e$ and gets merged with some other trees to form a~component $C(u)$, we add an~edge $uv$ to~$B(T)$ and set its weight to $w(e)$. @@ -603,8 +611,8 @@ component $C(u)$ contains both $a$ and~$b$, and consider the sons $v_a$ and $v_b for which $a\in C(v_a)$ and $b\in C(v_b)$. As the edge~$h$ must have been selected by at least one of these components, we assume without loss of generality that it was $C(v_a)$, and hence we have $w(uv_a)=w(h)$. We will show that the -edge~$uv_a$ lies in~$P'$, because exactly one of the endpoints of~$h$ lies -in~$C(v_a)$. Both endpoints cannot lie there, since it would imply that $C(v_a)$, +edge~$uv_a$ lies in~$P'$, because exactly one of the vertices $x$, $y$ lies +in~$C(v_a)$. Both cannot lie there, since it would imply that $C(v_a)$, being connected, contains the whole path~$P$, including~$h$. On the other hand, if $C(v_a)$ contained neither~$x$ nor~$y$, it would have to be incident with another edge of~$P$ different from~$h$, so this lighter edge would be selected @@ -623,7 +631,7 @@ consider only paths that connect a~vertex with one of its ancestors. When we combine the two transforms, we get: -\lemma\id{verbranch}% +\lemman{Balancing of trees}\id{verbranch}% For each tree~$T$ on $n$~vertices and a~set~$Q$ of $q$~query paths on~$T$, it is possible to find a~complete branching tree~$T'$, together with a~set~$Q'$ of paths on~$T'$, such that the weights of the heaviest edges of the paths in~$Q$ can be deduced from @@ -649,15 +657,15 @@ The peak of every original query path is then the heavier of the peaks of its ha \paran{Bounding comparisons}% We will now describe a~simple variant of the depth-first search which finds the -peaks of all query paths of the transformed problem. As we promised, +peaks of all query paths of the balanced problem. As we promised, we will take care of the number of comparisons only, as long as all other operations are well-defined and they can be performed in polynomial time. \defn For every edge~$e=uv$, we consider the set $Q_e$ of all query paths containing~$e$. 
-The vertex of a~path, which is closer to the root, will be called its \df{top,} +The vertex of a~path, that is closer to the root, will be called the \df{top} of the path, the other vertex its \df{bottom.} -We define arrays $T_e$ and~$P_e$ as follows: $T_e$ contains +We define arrays $T_e$ and~$P_e$ as follows: $T_e$~contains the tops of the paths in~$Q_e$ in order of their increasing depth (we will call them \df{active tops} and each of them will be stored exactly once). For each active top~$t=T_e[i]$, we define $P_e[i]$ as the peak of the path $T[v,t]$. @@ -666,8 +674,9 @@ each active top~$t=T_e[i]$, we define $P_e[i]$ as the peak of the path $T[v,t]$. As for every~$i$ the path $T[v,T_e[i+1]]$ is contained within $T[v,T_e[i]]$, the edges of~$P_e$ must have non-increasing weights, that is $w(P_e[i+1]) \le w(P_e[i])$. +This leads to the following algorithm: -\alg $\(u,p,T_p,P_p)$ --- process all queries in the subtree rooted +\alg $\(u,p,T_p,P_p)$ --- process all queries located in the subtree rooted at~$u$ entered from its parent via an~edge~$p$. \id{findpeaks} @@ -680,7 +689,7 @@ the desired edge from~$P_p[i]$. \::Construct the array of tops~$T_e$ for the edge~$e$: Start with~$T_p$, remove the tops of the paths that do not contain~$e$ and add the vertex~$u$ itself - if there is a~query path which has~$u$ as its top and which has bottom somewhere + if there is a~query path which has~$u$ as its top and whose bottom lies somewhere in the subtree rooted at~$v$. \::Prepare the array of the peaks~$P_e$: Start with~$P_p$, remove the entries @@ -692,7 +701,7 @@ the desired edge from~$P_p[i]$. edge~$e$, compare $w(e)$ with weights of the edges recorded in~$P_e$ and replace those edges which are lighter by~$e$. Since $P_p$ was sorted, we can use binary search - to locate the boundary between lighter and heavier edges in~$P_e$. + to locate the boundary between the lighter and heavier edges in~$P_e$. \::Recurse on~$v$: call $\(v,e,T_e,P_e)$. \endalgo @@ -704,7 +713,7 @@ therefore start with $\(r,p_0,\emptyset,\emptyset)$. Let us account for the comparisons: \lemma\id{vercompares}% -When the procedure \ is called on the transformed problem, it +When the procedure \ is called on the balanced problem, it performs $\O(n+q)$ comparisons, where $n$ is the size of the tree and $q$ is the number of query paths. @@ -713,7 +722,7 @@ We will calculate the number of comparisons~$c_i$ performed when processing the going from the $(i+1)$-th to the $i$-th level of the tree. The levels are numbered from the bottom, so leaves are at level~0 and the root is at level $\ell\le \lceil \log_2 n\rceil$. There are $n_i\le n/2^i$ vertices -at the $i$-th level, so we consider exactly $n_i$ edges. To avoid taking a~logarithm +at the $i$-th level, so we consider exactly $n_i$ edges. To avoid taking a~logarithm\foot{All logarithms are binary.} of zero, we define $\vert T_e\vert=1$ for $T_e=\emptyset$. \def\eqalign#1{\null\,\vcenter{\openup\jot \ialign{\strut\hfil$\displaystyle{##}$&$\displaystyle{{}##}$\hfil @@ -751,7 +760,7 @@ c \le n + (q+n) + \O(n) = \O(n+q). \qedmath $$ \para -When we combine this lemma with the above reduction, we get the following theorem: +When we combine this lemma with the above reduction from general trees to balanced trees, we get the following theorem: \thmn{Verification of the MST, Koml\'os \cite{komlos:verify}}\id{verify}% For every weighted graph~$G$ and its spanning tree~$T$, it is sufficient to @@ -769,7 +778,7 @@ the tops of all query paths. 
According to Lemma \ref{vercompares}, this spends a comparisons. Since we (as always) assume that~$G$ is connected, $\O(m+n)=\O(m)$. \qed -\rem +\paran{Other applications}% The problem of computing path maxima or minima in a~weighted tree has several other interesting applications. One of them is computing minimum cuts separating given pairs of vertices in a~given weighted undirected graph~$G$. We construct a~Gomory-Hu tree~$T$ for the graph as described in \cite{gomoryhu} @@ -781,7 +790,7 @@ takes $\Omega(n^2)$ time, we could of course invest this time in precomputing th all pairs of vertices. This would however require quadratic space, so we can better use the method of this section which fits in $\O(n+q)$ space for $q$~queries. -\rem +\paran{Dynamic verification}% A~dynamic version of the problem is also often considered. It calls for a~data structure representing a~weighted forest with operations for modifying the structure of the forest and querying minima or maxima on paths. Sleator and Tarjan have shown in \cite{sleator:trees} @@ -794,7 +803,7 @@ in time $\O(mn\log n)$. \section{Verification in linear time}\id{verifysect}% We have proven that $\O(m)$ edge weight comparisons suffice to verify minimality -of a~given spanning tree. Now we will show an~algorithm for the RAM, +of a~given spanning tree. Now we will show an~algorithm for the RAM which finds the required comparisons in linear time. We will follow the idea of King from \cite{king:verifytwo}, but as we have the power of the RAM data structures from Section~\ref{bitsect} at our command, the low-level details will be easier, @@ -802,7 +811,7 @@ especially the construction of vertex and edge labels. \para First of all, let us make sure that the reduction to fully branching trees -in Lemma \ref{verbranch} can be made run in linear time. As already noticed +in the Balancing lemma (\ref{verbranch}) can be made run in linear time. As already noticed in the proof, the Bor\o{u}vka's algorithm runs in linear time. Constructing the Bor\o{u}vka tree in the process adds at most a~constant overhead to every step of the algorithm. @@ -822,10 +831,10 @@ answer lowest common ancestor queries presented online in constant time. The reductions in Lemma \ref{verbranch} can be performed in time $\O(m)$. \para -Having the reduced problem at hand, it remains to implement the procedure \ +Having the balanced problem at hand, it remains to implement the procedure \ of Algorithm \ref{findpeaks} efficiently. We need a~compact representation of the arrays $T_e$ and~$P_e$, which will allow to reduce the overhead of the algorithm -to time linear will be linear in the number of comparisons performed. To achieve +to time linear in the number of comparisons performed. To achieve this goal, we will encode the arrays in RAM vectors (see Section \ref{bitsect} for the vector operations). @@ -839,30 +848,30 @@ labels} and we note that each label requires only $\ell=\lceil \log\lceil\log n\ bits. As every tree edge is uniquely identified by its bottom vertex, we can use the same encoding for \df{edge labels.} -\em{Slots:} As we will need several operations which are not computable +\em{Slots:} As we are going to need several operations which are not computable in constant time on the RAM, we precompute tables for these operations like we did in the Q-heaps (cf.~Lemma \ref{qhprecomp}). A~table for word-sized arguments would take too much time to precompute, so we will generally store our data structures in \df{slots} of $s=\lceil 1/3\cdot\log n\rceil$ bits each. 
-We will show soon that it is possible to precompute a~table of any reasonable +We will soon show that it is possible to precompute a~table of any reasonable function whose arguments fit in two slots. \em{Top masks:} The array~$T_e$ will be represented by a~bit mask~$M_e$ called the \df{top mask.} For each of the possible tops~$t$ (i.e., the ancestors of the current vertex), we store a~single bit telling whether $t\in T_e$. Each top mask fits in $\lceil\log n\rceil$ bits and therefore in a~single machine word. If needed, it can be split to three slots. -Unions and intersections of sets of tops then translate to calling $\band$/$\bor$ +Unions and intersections of sets of tops then translate to $\band$/$\bor$ on the top masks. \em{Small and big lists:} The heaviest edge found so far for each top is stored by the algorithm in the array~$P_e$. Instead of keeping the real array, we store the labels of these edges in a~list encoded in a~bit string. Depending on the size of the list, we use one of two possible encodings: -\df{Small lists} are stored in a~vector which fits in a~single slot, with +\df{Small lists} are stored in a~vector that fits in a~single slot, with the unused fields filled by a~special constant, so that we can easily infer the length of the list. -If the data do not fit in a~small list, we use a~\df{big list} instead, which +If the data do not fit in a~small list, we use a~\df{big list} instead. It is stored in $\O(\log\log n)$ words, each of them containing a~slot-sized vector. Unlike the small lists, we use the big lists as arrays. If a~top~$t$ of depth~$d$ is active, we keep the corresponding entry of~$P_e$ in the $d$-th @@ -879,10 +888,10 @@ this, we introduce \df{pointers} as another kind of edge identifiers. A~pointer is an~index to the nearest big list on the path from the small list containing the pointer to the root. As each big list has at most $\lceil\log n\rceil$ fields, the pointer fits in~$\ell$ bits, but we need one extra bit to distinguish -between normal labels and pointers. +between regular labels and pointers. \lemman{Precomputation of tables} -When~$f$ is a~function of two arguments computable in polynomial time, we can +When~$f$ is a~function of up to two arguments computable in polynomial time, we can precompute a~table of the values of~$f$ for all values of arguments that fit in a~single slot. The precomputation takes $\O(n)$ time. @@ -893,13 +902,13 @@ possible values of arguments, so the precomputation takes time $\O(n^{2/3}\cdot\ \qed \example -As we can afford spending spending $\O(n)$ time on preprocessing, +As we can afford spending $\O(n)$ time on preprocessing, we can assume that we can compute the following functions in constant time: \itemize\ibull \:$\(x)$ --- the Hamming weight of a~slot-sized number~$x$ (we already considered this operation in Algorithm \ref{lsbmsb}, but we needed -quadratic word size for it). We can easily extend this to $\log n$-bit numbers +quadratic word size for it). We can easily extend this function to $\log n$-bit numbers by splitting the number in three slots and adding their weights. \:$\(x,k)$ --- the $k$-th set bit from the top of the slot-sized @@ -963,7 +972,7 @@ by counting bits of the top mask~$M_e$ at position~$d$ and higher \qed \lemma\id{verfh}% -The procedure \ processes an~edge~$e$ in time $\O(\log \vert T_e\vert + q_e)$, +\ processes an~edge~$e$ in time $\O(\log \vert T_e\vert + q_e)$, where $q_e$~is the number of query paths having~$e$ as its bottom edge. 
\proof @@ -993,7 +1002,7 @@ that shall be deleted by a~subsequent call to \. Pointers can be retained as they still refer to the same ancestor list. \:\em{Big from big:} We can copy the whole~$P_p$, since the layout of the -big lists is fixed and the items we do not want simply end up as unused +big lists is fixed and the items, which we do not want, simply end up as unused fields in~$P_e$. \:\em{Small from big:} We use the operation \ to construct a~list @@ -1017,43 +1026,52 @@ sub-word of~$M_e$ in the intended interval). \qeditem \endlist -\>We are now ready to combine these steps and get the following theorem: +\>We now have all the necessary ingredients to prove the following theorem +and thus conclude this section: \thmn{Verification of MST on the RAM}\id{ramverify}% -There is a~RAM algorithm, which for every weighted graph~$G$ and its spanning tree~$T$ +There is a~RAM algorithm which for every weighted graph~$G$ and its spanning tree~$T$ determines whether~$T$ is minimum and finds all $T$-light edges in~$G$ in time $\O(m)$. \proof Implement the Koml\'os's algorithm from Theorem \ref{verify} with the data structures developed in this section. -According to Lemma \ref{verfh}, it runs in time $\sum_e \O(\log\vert T_e\vert + q_e) +According to Lemma \ref{verfh}, the algorithm runs in time $\sum_e \O(\log\vert T_e\vert + q_e) = \O(\sum_e \log\vert T_e\vert) + \O(\sum_e q_e)$. The second sum is $\O(m)$ as there are $\O(1)$ query paths per edge, the first sum is $\O(\#\hbox{comparisons})$, which is $\O(m)$ by Theorem \ref{verify}. \qed +\>In Section \ref{kbestsect}, we will need a~more specialized statement: + +\cor\id{rampeaks}% +There is a~RAM algorithm which for every weighted tree~$T$ and a~set~$P$ of +paths in~$T$ calculates the peaks of these paths in time $\O(n(T) + \vert P\vert)$. + \paran{Verification on the Pointer Machine}\id{pmverify}% Buchsbaum et al.~\cite{buchsbaum:verify} have recently shown that linear-time verification can be achieved even on the Pointer Machine. They first solve the problem of finding the lowest common ancestors for a~set of pairs of vertices by batch processing: They combine an~algorithm of time complexity $\O(m\timesalpha(m,n))$ based on the Disjoint Set Union data structure with the framework of topological graph -computations developed in Section \ref{bucketsort}. Then they use a~similar +computations described in Section \ref{bucketsort}. Then they use a~similar technique for finding the peaks themselves. \paran{Online verification}% The online version of this problem has turned out to be more difficult. It calls for an~algorithm that preprocesses the tree and then answers queries for peaks of paths presented online. Pettie \cite{pettie:onlineverify} has proven an~interesting lower bound based on the inverses of the -Ackermann's function (see \ref{ackerinv}). If we want to answer queries within $t$~comparisons, we -have to invest $\Omega(n\log\lambda_t(n))$ time into preprocessing. This implies that with +Ackermann's function. If we want to answer queries within $t$~comparisons, we +have to invest $\Omega(n\log\lambda_t(n))$ time into preprocessing.\foot{$\lambda_t(n)$ is the +$t$-th row inverse of the Ackermann's function, $\alpha(n)$ is its diagonal inverse. See +\ref{ackerinv} for the exact definitions.} This implies that with preprocessing in linear time, the queries require $\Omega(\alpha(n))$ time. 
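To close the two verification sections, it may help to see the underlying problem stated as plain code. The following Python sketch (ours; naive on purpose) answers peak queries by walking the two endpoints of a~path towards their lowest common ancestor and remembering the heaviest edge seen. It spends time and comparisons proportional to the length of the path per query --- precisely the overhead that Koml\'os's counting argument and the RAM machinery above reduce to $\O(m)$ in total --- but it is a~handy reference implementation for testing them. The tree is given by a~parent map, an~assumption of this sketch rather than of the text.

    def make_depths(parent, root):
        # parent: non-root vertex -> (its parent, weight of the edge between them)
        depth = {root: 0}
        def d(v):
            if v not in depth:
                depth[v] = d(parent[v][0]) + 1
            return depth[v]
        for v in parent:
            d(v)
        return depth

    def peak(parent, depth, u, v):
        best = None                 # heaviest edge on the path so far
        while u != v:
            if depth[u] < depth[v]:
                u, v = v, u         # always climb from the deeper endpoint
            p, w = parent[u]
            best = w if best is None else max(best, w)
            u = p
        return best

    # usage: the path 1-2-3 with edge weights 5 and 7
    parent = {2: (1, 5), 3: (2, 7)}
    depth = make_depths(parent, 1)
    print(peak(parent, depth, 1, 3))    # -> 7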
%--------------------------------------------------------------------------------

\section{A randomized algorithm}\id{randmst}%

-When we analysed the contractive Bor\o{u}vka's algorithm in Section~\ref{contalg},
+When we analysed the Contractive Bor\o{u}vka's algorithm in Section~\ref{contalg},
we observed that while the number of vertices per iteration decreases exponentially,
the number of edges generally does not, so we spend $\Theta(m)$ time on every phase.
Karger, Klein and Tarjan \cite{karger:randomized} have overcome this problem by
good, but it will soon turn out that when we take~$T$ as the MST of a~randomly selected
subgraph, only a~small expected number of edges remains.

Selecting a~subgraph at random will unavoidably produce disconnected subgraphs
-at occassion, so we will drop the implicit assumption that all graphs are
+on occasion, so we will drop the implicit assumption that all graphs are
connected for this section and we will always search for the minimum spanning
forest. As we already noted (\ref{disconn}), with a~little bit of care our
algorithms and theorems keep working.

Since we need the MST verification algorithm for finding the $T$-heavy edges,
we will assume that we are working on the RAM.

-\lemman{Random sampling, Karger \cite{karger:sampling}}
+\lemman{Random sampling, Karger \cite{karger:sampling}}\id{samplemma}%
Let $H$~be a~subgraph of~$G$ obtained by including each edge independently
-with probability~$p$ and $F$~the minimum spanning forest of~$H$. Then the
-expected number of $F$-nonheavy\foot{These include $F$-light edges and also
-edges of~$F$ itself.}
+with probability~$p$. Let further $F$~be the minimum spanning forest of~$H$. Then the
+expected number of $F$-nonheavy\foot{That is, $F$-light edges and also edges of~$F$ itself.}
edges in~$G$ is at most $n/p$.

\proof
Let us observe that we can obtain the forest~$F$ by running the Kruskal's algorithm
(\ref{kruskal}) combined with the random process producing~$H$ from~$G$. We sort all
edges of~$G$ by their weights and we start with an~empty forest~$F$. For each edge, we first
-flip a~biased coin (which gives heads with probability~$p$) and if it comes up
+flip a~biased coin (that gives heads with probability~$p$) and if it comes up
tails, we discard the edge. Otherwise we perform a~single step of the Kruskal's
-algoritm: We check whether $F+e$ contains a~cycle. If it does, we discard~$e$, otherwise
+algorithm: We check whether $F+e$ contains a~cycle. If it does, we discard~$e$, otherwise
we add~$e$ to~$F$. At the end, we have produced the subgraph~$H$ and its MSF~$F$.

When we exchange the check for cycles with flipping the coin, we get an~equivalent
@@ -1109,28 +1126,28 @@ current state of~$F$ and the final MSF.
The number of $F$-nonheavy edges is therefore equal to the total number of the
coin flips in step~2 of this algorithm. We also know that the algorithm stops before
it adds $n$~edges to~$F$. Therefore it flips at most as many coins as a~simple
-random process which repeatedly flips until it gets~$n$ heads. As waiting for
-every occurence of heads takes expected time~$1/p$, waiting for~$n$ heads
-must take $n/p$. This is the bound we wanted to achieve.
+random process that repeatedly flips until it gets~$n$ heads. As waiting for
+every occurrence of heads takes expected time~$1/p$ (the distribution is geometric),
+waiting for~$n$ heads must take $n/p$. This is the bound we wanted to achieve.
\qed \para We will formulate the algorithm as a~doubly-recursive procedure. It alternatively -peforms steps of the Bor\o{u}vka's algorithm and filtering based on the above lemma. +performs steps of the Bor\o{u}vka's algorithm and filtering based on the above lemma. The first recursive call computes the MSF of the sampled subgraph, the second one -finds the MSF of the graph without the heavy edges. +finds the MSF of the original graph, but without the heavy edges. As in all contractive algorithms, we use edge labels to keep track of the original locations of the edges in the input graph. For the sake of simplicity, -we do not mention it in the algorithm. +we do not mention it in the algorithm explicitly. \algn{MSF by random sampling --- the KKT algorithm}\id{kkt}% \algo \algin A~graph $G$ with an~edge comparison oracle. \:Remove isolated vertices from~$G$. If no vertices remain, stop and return an~empty forest. \:Perform two Bor\o{u}vka steps (iterations of Algorithm \ref{contbor}) on~$G$ and - remember the set~$B$ of edges contracted. -\:Select subgraph~$H\subseteq G$ by including each edge independently with + remember the set~$B$ of the edges having been contracted. +\:Select a~subgraph~$H\subseteq G$ by including each edge independently with probability $1/2$. \:$F\=\msf(H)$ calculated recursively. \:Construct $G'\subseteq G$ by removing all $F$-heavy edges of~$G$. @@ -1141,7 +1158,7 @@ we do not mention it in the algorithm. \nota Let us analyse the time complexity of this algorithm by studying properties of its \df{recursion tree.} -The tree describes the subproblems processed by the recursive calls. For any vertex~$v$ +This tree describes the subproblems processed by the recursive calls. For any vertex~$v$ of the tree, we denote the number of vertices and edges of the corresponding subproblem~$G_v$ by~$n_v$ and~$m_v$ respectively. If $m_v>0$, the recursion continues: the left son of~$v$ corresponds to the @@ -1156,18 +1173,18 @@ The Bor\o{u}vka steps together with the removal of isolated vertices guarantee t of vertices drops at least by a~factor of four in every recursive call. The size of a~subproblem~$G_v$ at level~$i$ is therefore at most $n/4^i$ and the depth of the tree is at most $\lceil\log_4 n\rceil$. As there are no more than~$2^i$ subproblems at level~$i$, the sum of all~$n_v$'s -on that level is at most $n/2^i$, which is at most~$2n$ for the whole tree. +on that level is at most $n/2^i$, which is at most~$2n$ when summed over the whole tree. We are going to show that the worst case of the KKT algorithm is not worse than of the plain contractive algorithm, while the average case is linear. \lemma -For every subproblem~$G_v$, the KKT algorithm spends time $\O(m_v+n_v)$ plus the time -spent on the recursive calls. +For every subproblem~$G_v$, the KKT algorithm spends $\O(m_v+n_v)$ time plus the cost +of the recursive calls. \proof We know from Lemma \ref{contiter} that each Bor\o{u}vka step takes time $\O(m_v+n_v)$.\foot{We -add $n_v$ as the graph could be disconnected.} +need to add $n_v$, because the graph could be disconnected.} The selection of the edges of~$H_v$ is straightforward. Finding the $F_v$-heavy edges is not, but we have already shown in Theorem \ref{ramverify} that linear time is sufficient on the RAM. @@ -1201,7 +1218,7 @@ compensated by the loss of edges by contraction and $m_\ell + m_r \le m_v$. So t number of edges per level does not decrease and it remains to apply the previous lemma. 
\qed -\thmn{Average-case complexity of the KKT algorithm} +\thmn{Expected complexity of the KKT algorithm}\id{kktavg}% The expected time complexity of the KKT algorithm on the RAM is $\O(m)$. \proof @@ -1209,9 +1226,9 @@ The structure of the recursion tree depends on the random choices taken, but as its worst-case depth is at most~$\lceil \log_4 n\rceil$, the tree is always a~subtree of the complete binary tree of that depth. We will therefore prove the theorem by examining the complete tree, possibly with -empty subproblems at some vertices. +empty subproblems in some vertices. -The set of all left edges in the tree (edges connecting a~parent with its left +The left edges of the tree (edges connecting a~parent with its left son) form a~set of \df{left paths.} Let us consider the expected time spent on a~single left path. When walking the path downwards from its top vertex~$r$, the expected size of the subproblems decreases exponentially: for a~son~$\ell$ @@ -1222,7 +1239,7 @@ to sum this over all left paths. With the exception of the path going from the root of the tree, the top~$r$ of a~left path is always a~right son of a~unique parent vertex~$v$. Since the subproblem~$G_r$ has been obtained from its parent subproblem~$G_v$ -by filtering out all heavy edges, we can use the Sampling lemma to show that +by filtering out all heavy edges, we can use the Sampling lemma (\ref{samplemma}) to show that $\E m_r \le 2n_v$. The sum of the expected sizes of all top subproblems is then $\sum_r n_r + m_r \le \sum_v 3n_v = \O(n)$. After adding the exceptional path from the root, we get $\O(m+n)=\O(m)$. @@ -1236,10 +1253,10 @@ again follows the recursion tree and it involves applying the Chernoff bound \cite{chernoff} to bound the tail probabilities. \paran{Different sampling}% -We could also use a~slightly different formulation of the sampling lemma -suggested by Chan \cite{chan:backward}. It changes the selection of the subgraph~$H$ +We could also use a~slightly different formulation of the Sampling lemma +suggested by Chan \cite{chan:backward}. He changes the selection of the subgraph~$H$ to choosing an~$mp$-edge subset of~$E(G)$ uniformly at random. The proof is then -a~straightforward application of the backward analysis method. We however prefered +a~straightforward application of the backward analysis method. We however preferred the Karger's original version, because generating a~random subset of a~given size requires an~unbounded number of random bits in the worst case.
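Finally, the Sampling lemma (\ref{samplemma}) invites a~quick experimental sanity check. The following self-contained Python snippet is our test harness, not a~part of the thesis; all names in it are ours. It replays the coin-flipping Kruskal process from the proof of the lemma --- an edge whose endpoints are not yet connected in~$F$ is $F$-nonheavy and is the only kind that costs a~coin flip --- and compares the observed count with the $n/p$ bound.

    import random

    def nonheavy_count(n, edges, p):
        parent = list(range(n))
        def find(x):
            root = x
            while parent[root] != root:
                root = parent[root]
            while parent[x] != root:            # path compression
                parent[x], x = root, parent[x]
            return root
        count = 0
        for w, u, v in sorted(edges):           # scan edges by weight
            ru, rv = find(u), find(v)
            if ru == rv:
                continue        # F-heavy: decided without flipping a coin
            count += 1          # F-light, or an edge that will enter F itself
            if random.random() < p:
                parent[ru] = rv # heads: the edge was sampled into H, join F
        return count

    random.seed(42)
    n, p = 1000, 0.5
    edges = [(random.random(), random.randrange(n), random.randrange(n))
             for _ in range(20000)]
    avg = sum(nonheavy_count(n, edges, p) for _ in range(20)) / 20
    print(avg, "<=", n / p)     # stays below the n/p = 2000 bound on average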