the edges of~$P_e$ must have non-increasing weights, that is $w(P_e[i+1]) \le
w(P_e[i])$.
-\alg $\<FindHeavy>(u,p,T_p,P_p)$ --- process all queries in the subtree rooted
+\alg $\<FindPeaks>(u,p,T_p,P_p)$ --- process all queries in the subtree rooted
at~$u$ entered from its parent via an~edge~$p$.
\id{findheavy}
if there is a~query path which has~$u$ as its top and whose bottom lies somewhere
in the subtree rooted at~$v$.
-\::Construct the array of the peaks~$P_e$: Start with~$P_p$,
- remove the entries corresponding to the tops which are no longer active.
+\::Prepare the array of the peaks~$P_e$: Start with~$P_p$, remove the entries
+ corresponding to the tops which are no longer active. If $u$ became an~active
+ top, append~$e$ to the array.
+
+\::Finish~$P_e$:
Since the paths leading to all active tops have been extended by the
edge~$e$, compare $w(e)$ with the weights of the edges recorded in~$P_e$ and
replace the lighter ones by~$e$.
Since $P_p$ was sorted, we can use binary search
- to locate the boundary between lighter and heavier edges in~$P_e$. Finally
- append~$e$ to the array if the top $u$ became active.
+ to locate the boundary between lighter and heavier edges in~$P_e$.
-\::Recurse on~$v$: call $\<FindHeavy>(v,e,T_e,P_e)$.
+\::Recurse on~$v$: call $\<FindPeaks>(v,e,T_e,P_e)$.
\endalgo
\>As we need a~parent edge to start the recursion, we add an~imaginary parent
edge~$p_0$ of the root vertex~$r$, for which no queries are defined. We can
-therefore start with $\<FindHeavy>(r,p_0,\emptyset,\emptyset)$.
+therefore start with $\<FindPeaks>(r,p_0,\emptyset,\emptyset)$.
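+
+To make the recursion concrete, here is a~C sketch of \<FindPeaks> (our own
+illustration, independent of the compact RAM representation developed later;
+all names and the input format are assumptions of the sketch). It uses plain
+arrays and assumes that the depth of the tree stays below~64, so that the set
+of active tops of a~subtree fits into a~precomputed bitmask:
+
+  /* Each vertex stores its depth, the weight of the edge to its parent,
+   * its sons, the tops (given by depth) of the query paths whose bottom
+   * is this vertex, and a precomputed bitmask with one bit per depth of
+   * a top of a query path whose bottom lies in its subtree. */
+  #include <stdio.h>
+  #include <stdlib.h>
+  #include <stdint.h>
+
+  typedef struct vertex {
+      int depth;
+      double parent_w;             /* weight of the edge from the parent */
+      int nsons, *sons;
+      int nq, *query_tops;         /* tops of the queries ending here    */
+      uint64_t topmask;            /* active tops of the subtree         */
+  } vertex;
+
+  typedef struct { int top; double peak; } entry;
+
+  static vertex *T;                /* the tree, indexed by vertex number */
+
+  static void find_peaks(int u, const entry *Pp, int plen)
+  {
+      /* Step 1: queries with bottom u are answered from the parent's
+       * peak array (a linear scan here; the compact representation
+       * makes this a constant-time lookup per query). */
+      for (int i = 0; i < T[u].nq; i++)
+          for (int j = 0; j < plen; j++)
+              if (Pp[j].top == T[u].query_tops[i])
+                  printf("peak of query (top %d, bottom %d) = %g\n",
+                         Pp[j].top, u, Pp[j].peak);
+
+      for (int s = 0; s < T[u].nsons; s++) {          /* step 2          */
+          int v = T[u].sons[s];
+          double we = T[v].parent_w;
+          uint64_t active = T[v].topmask;             /* step 3: T_e     */
+
+          /* Step 4: keep the entries of P_p whose tops stay active and
+           * append u if it has just become an active top itself. */
+          entry *Pe = malloc((plen + 1) * sizeof(entry));
+          int inherited = 0;
+          for (int j = 0; j < plen; j++)
+              if ((active >> Pp[j].top) & 1)
+                  Pe[inherited++] = Pp[j];
+          int len = inherited;
+          if ((active >> T[u].depth) & 1)
+              Pe[len++] = (entry){ T[u].depth, we };
+
+          /* Step 5: the inherited peaks are sorted in non-increasing
+           * order, so binary search locates the first one lighter than
+           * w(e); everything from there on is replaced by w(e).  This
+           * is the only place where edge weights are compared. */
+          int lo = 0, hi = inherited;
+          while (lo < hi) {
+              int mid = (lo + hi) / 2;
+              if (Pe[mid].peak < we) hi = mid; else lo = mid + 1;
+          }
+          for (int j = lo; j < len; j++)
+              Pe[j].peak = we;
+
+          find_peaks(v, Pe, len);                     /* step 6: recurse */
+          free(Pe);
+      }
+  }
+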
Let us account for the comparisons:
\lemma\id{vercompares}%
-When the procedure \<FindHeavy> is called on the transformed problem, it
+When the procedure \<FindPeaks> is called on the transformed problem, it
performs $\O(n+q)$ comparisons, where $n$ is the size of the tree and
$q$ is the number of query paths.
of $G\setminus T$). We use the reduction from Lemma \ref{verbranch} to get
an~equivalent problem with a~full branching tree and a~set of parent-descendant
paths. The reduction costs $\O(m+n)$ comparisons.
-Then we run the \<FindHeavy> procedure (Algorithm \ref{findheavy}) to find
+Then we run the \<FindPeaks> procedure (Algorithm \ref{findheavy}) to find
the tops of all query paths. According to Lemma \ref{vercompares}, this spends another $\O(m+n)$
comparisons. Since we (as always) assume that~$G$ is connected, $\O(m+n)=\O(m)$.
\qed
\section{Verification in linear time}
We have proven that $\O(m)$ edge weight comparisons suffice to verify minimality
-of a~given spanning tree. In this section, we will show an~algorithm for the RAM,
+of a~given spanning tree. Now we will show an~algorithm for the RAM,
which finds the required comparisons in linear time. We will follow the idea
of King from \cite{king:verify}, but as we have the power of the RAM data structures
from Section~\ref{bitsect} at our command, the low-level details will be easier,
especially the construction of vertex and edge labels.
-\paran{Reduction}
+\para
First of all, let us make sure that the reduction to fully branching trees
in Lemma \ref{verbranch} can be made to run in linear time. As already noticed
in the proof, Bor\o{u}vka's algorithm runs in linear time. Constructing
Finding the common ancestors is not trivial, but Harel and Tarjan have shown
in \cite{harel:nca} that linear time is sufficient on the RAM. Several more
accessible algorithms have been developed since then (see Alstrup's survey
-paper \cite{alstrup:nca} and a~particularly elegant algorithm shown by Bender
+paper \cite{alstrup:nca} and a~particularly elegant algorithm described by Bender
and Farach-Colton in \cite{bender:lca}). Any of them implies the following
theorem:
On the RAM, it is possible to preprocess a~tree~$T$ in time $\O(n)$ and then
answer lowest common ancestor queries presented online in constant time.
-\proof
-See for example Bender and Falach-Colton \cite{bender:lca}.
-\qed
-
\cor
The reductions in Lemma \ref{verbranch} can be performed in time $\O(m)$.
\para
-Having the reduced problem at hand, we have to implement the procedure \<FindHeavy>
+Having the reduced problem at hand, it remains to implement the procedure \<FindPeaks>
of Algorithm \ref{findheavy} efficiently. We need a~compact representation of
-the arrays $T_e$ and~$P_e$ by vectors, so that the overhead of the algorithm
-will be linear in the number of comparisons performed.
+the arrays $T_e$ and~$P_e$, which will allow us to reduce the overhead of the
+algorithm to time linear in the number of comparisons performed. To achieve
+this goal, we will encode the arrays in RAM vectors (see Section \ref{bitsect}
+for the vector operations).
\defn
-\em{Vertex identifiers:} Since all vertices referred to by the procedure
-lie on the path from root to the current vertex~$u$, we modify the algorithm
-to keep a~stack of these vertices in an~array and refer to each vertex by its
+\em{Vertex identifiers:} Since all vertices processed by the procedure
+lie on the path from the root to the current vertex~$u$, we modify the algorithm
+to keep a~stack of these vertices in an~array. We will often refer to each vertex by its
index in this array, i.e., by its depth. We will call these identifiers \df{vertex
-labels} and we note that each label require only $\ell=\lceil \log\log n\rceil$
+labels} and we note that each label requires only $\ell=\lceil \log\lceil\log n\rceil\rceil$
bits. As every tree edge is uniquely identified by its bottom vertex, we can
use the same encoding for \df{edge labels.}
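+For example, for $n=10^6$ we have $\lceil\log n\rceil = 20$, so the depths range
+over $0,\ldots,19$ and a~single label needs only $\ell=\lceil\log 20\rceil=5$~bits.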
\em{Slots:} As we will need several operations which are not computable
in constant time on the RAM, we precompute tables for these operations
-like we did in the Q-Heaps (cf.~Lemma \ref{qhprecomp}). A~table for word-sized
+like we did in the Q-heaps (cf.~Lemma \ref{qhprecomp}). A~table for word-sized
arguments would take too much time to precompute, so we will generally store
-our data structures in \df{slots} of $s=1/3\cdot\lceil\log n\rceil$ bits each.
+our data structures in \df{slots} of $s=\lceil 1/3\cdot\log n\rceil$ bits each.
We will show soon that it is possible to precompute a~table of any reasonable
function whose arguments fit in two slots.
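+(Observe that a~pair of slot-sized arguments admits at most $2^{2s}\le 4\cdot n^{2/3}$
+different values, so the tables themselves occupy only $\O(n^{2/3})$ cells.)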
of the possible tops~$t$ (i.e., the ancestors of the current vertex), we store
a~single bit telling whether $t\in T_e$. Each top mask fits in $\lceil\log n\rceil$
bits and therefore in a~single machine word. If needed, it can be split into three slots.
+Unions and intersections of sets of tops then translate to calling $\bor$/$\band$
+on the top masks.
\em{Small and big lists:} The heaviest edge found so far for each top is stored
by the algorithm in the array~$P_e$. Instead of keeping the real array,
we store the labels of these edges in a~list encoded in a~bit string.
-Depending on the size of the list, we use two one of two possible encodings:
+Depending on the size of the list, we use one of two possible encodings:
\df{Small lists} are stored in a~vector which fits in a~single slot, with
-the unused fields filled by a~special constant, so that we can infer the
+the unused fields filled by a~special constant, so that we can easily infer the
length of the list.
If the data do not fit in a~small list, we use a~\df{big list} instead, which
depth~$d$ is active, we keep the corresponding entry of~$P_e$ in the $d$-th
field of the big list. Otherwise, we keep that entry unused.
-We will want to perform all operations on small lists in constant time,
+We want to perform all operations on small lists in constant time,
but we can afford spending time $\O(\log\log n)$ on every big list. This
-is true because whenever we need a~big list, $\vert T_e\vert = \Omega(\log n/\log\log n)$,
-so we need $\log\vert T_e\vert = \Omega(\log\log n)$ comparisons anyway.
+is true because whenever we use a~big list, $\vert T_e\vert = \Omega(\log n/\log\log n)$,
+hence we need $\log\vert T_e\vert = \Omega(\log\log n)$ comparisons anyway.
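+
+A~possible concrete layout of a~small list (a~C sketch with our own constants
+and names; the real structure recovers the length by a~table lookup instead of
+the loop below):
+
+  #include <stdint.h>
+
+  enum { L = 6, CAP = 5 };                 /* field width, slot capacity   */
+  #define UNUSED ((1u << L) - 1)           /* sentinel for an unused field
+                                              (assumed not a valid label)  */
+  typedef uint32_t slot;
+
+  static slot pack(const unsigned *labels, int k)   /* build a small list  */
+  {
+      slot x = 0;
+      for (int i = 0; i < CAP; i++)
+          x |= (slot)(i < k ? labels[i] : UNUSED) << (L * i);
+      return x;
+  }
+
+  static unsigned field(slot x, int i)              /* read the i-th field */
+  {
+      return (x >> (L * i)) & UNUSED;
+  }
+
+  static int length(slot x)                         /* infer the length    */
+  {
+      int k = 0;
+      while (k < CAP && field(x, k) != UNUSED)
+          k++;
+      return k;
+  }
+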
\em{Pointers:} When we need to construct a~small list containing a~sub-list
-of a~big list, we do not have enough time to see the whole big list. To solve
-this problem, we will introduce \df{pointers} as another kind of edge identifiers.
+of a~big list, we do not have enough time to see the whole big list. To handle
+this, we introduce \df{pointers} as another kind of edge identifiers.
A~pointer is an~index to the nearest big list on the path from the small
list containing the pointer to the root. As each big list has at most $\lceil\log n\rceil$
fields, the pointer fits in~$\ell$ bits, but we need one extra bit to distinguish
between normal labels and pointers.
-\lemma
+\lemman{Precomputation of tables}
When~$f$ is a~function of two arguments computable in polynomial time, we can
precompute a~table of the values of~$f$ for all values of arguments which fit
in a~single slot. The precomputation takes $\O(n)$ time.
\qed
\example
-We can assume we can compute the following functions in constant time (after $\O(n)$ preprocessing):
+As we can afford spending $\O(n)$ time on preprocessing,
+we can assume that we can compute the following functions in constant time:
\itemize\ibull
-\:$\<Weight>(x)$ --- computes the Hamming weight of a~slot-sized number~$x$
+\:$\<Weight>(x)$ --- the Hamming weight of a~slot-sized number~$x$
(we already considered this operation in Algorithm \ref{lsbmsb}, but we needed
quadratic word size for it). We can easily extend this to $\log n$-bit numbers
by splitting the number into three slots and adding their weights.
-\:$\<FindKth>(x,k)$ --- find the $k$-th set bit from the top of the slot-sized
+\:$\<FindKth>(x,k)$ --- the $k$-th set bit from the top of the slot-sized
number~$x$. Again, this can be extended to multi-slot numbers by calculating
the \<Weight> of each slot first and then finding the slot containing the
-$k$-th one.
+$k$-th~\1.
\:$\<Bits>(m)$ --- for a~slot-sized bit mask~$m$, it returns a~small list
-of the positions of bits set in~$\(m)$.
+of the positions of the bits set in~$\(m)$.
\:$\<Select>(x,m)$ --- constructs a~slot containing the substring of $\(x)$
selected by the bits set in~$\(m)$.
\endlist
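+
+To illustrate the technique, here is a~C sketch (our own; the slot size
+constant and all names are assumptions of the sketch) showing how \<Weight>
+and \<FindKth> can be tabulated for slot-sized arguments and then extended to
+numbers consisting of three slots:
+
+  /* Tables for slot-sized arguments.  With S about (log n)/3 bits, the
+   * tables have O(n^{1/3}) entries, so building them is cheap. */
+  #include <stdint.h>
+
+  enum { S = 7 };                           /* slot size, ceil(log(n)/3)  */
+  #define NSLOT (1u << S)
+
+  static uint8_t weight_tab[NSLOT];         /* Hamming weight of a slot   */
+  static int8_t  kth_tab[NSLOT][S + 1];     /* k-th set bit from the top  */
+
+  static void precompute(void)
+  {
+      for (uint32_t x = 0; x < NSLOT; x++) {
+          uint8_t w = 0;
+          for (int b = 0; b < S; b++)
+              w += (x >> b) & 1;
+          weight_tab[x] = w;
+
+          for (int k = 0; k <= S; k++)
+              kth_tab[x][k] = -1;           /* -1: no such bit            */
+          int seen = 0;
+          for (int b = S - 1; b >= 0; b--)  /* from the top of the slot   */
+              if ((x >> b) & 1)
+                  kth_tab[x][++seen] = b;
+      }
+  }
+
+  /* Weight of a number occupying three slots: add the slot weights. */
+  static unsigned weight(uint32_t x)
+  {
+      return weight_tab[x & (NSLOT - 1)]
+           + weight_tab[(x >> S) & (NSLOT - 1)]
+           + weight_tab[(x >> 2 * S) & (NSLOT - 1)];
+  }
+
+  /* Position of the k-th set bit from the top, or -1 if there is none. */
+  static int find_kth(uint32_t x, int k)
+  {
+      for (int s = 2; s >= 0; s--) {        /* topmost slot first         */
+          uint32_t part = (x >> s * S) & (NSLOT - 1);
+          int w = weight_tab[part];
+          if (k >= 1 && k <= w)
+              return s * S + kth_tab[part][k];
+          k -= w;
+      }
+      return -1;
+  }
+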
\para
-We will now show how to perform all parts of the procedure \<FindHeavy>
+We will now show how to perform all parts of the procedure \<FindPeaks>
in the required time. We will denote the size of the tree by~$n$ and the
number of query paths by~$q$.
Run depth-first search on the tree, assign the depth of a~vertex when entering
it and construct its top mask when leaving it. The top mask can be obtained
by $\bor$-ing the masks of its sons, excluding the level of the sons and
-including the tops of all query paths which have bottom at the current vertex
+including the tops of all query paths which have their bottoms at the current vertex
(the depths of the tops are already assigned).
\qed
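+
+In code, this pass could look as follows (a~C sketch with our own data layout;
+we assume that the depth of the tree stays below~64, so a~top mask fits into
+a~single 64-bit word instead of three slots):
+
+  #include <stdint.h>
+
+  typedef struct vertex {
+      int nsons, *sons;
+      int nq, *query_tops;    /* depths of tops of queries with bottom here */
+      int depth;
+      uint64_t topmask;
+  } vertex;
+
+  static vertex *T;
+
+  static void dfs(int u, int depth)
+  {
+      T[u].depth = depth;                      /* assigned when entering    */
+      uint64_t mask = 0;
+      for (int i = 0; i < T[u].nsons; i++) {
+          dfs(T[u].sons[i], depth + 1);
+          mask |= T[T[u].sons[i]].topmask;     /* OR of the sons' masks     */
+      }
+      if (T[u].nsons)
+          mask &= ~(1ull << (depth + 1));      /* exclude the sons' level   */
+      for (int i = 0; i < T[u].nq; i++)        /* include tops of queries   */
+          mask |= 1ull << T[u].query_tops[i];  /* with bottom at this vertex */
+      T[u].topmask = mask;                     /* constructed when leaving  */
+  }
+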
extracted using bit masking and shifts.
If it is a~small list, we extract the field directly, but we have to
-dereference it in case it is a pointer. We modify the recursion in \<FindHeavy>
+dereference it in case it is a~pointer. We modify the recursion in \<FindPeaks>
to pass the depth of the lowest edge endowed with a~big list and when we
encounter a~pointer, we index this big list.
\qed
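+
+A~C sketch of the lookup (our own simplified layout: every field carries one
+extra flag bit and, if the flag is set, the remaining bits index the nearest
+big list on the path to the root):
+
+  #include <stdint.h>
+
+  enum { L = 6 };                          /* label width                   */
+  #define FIELD   ((1u << (L + 1)) - 1)    /* label/pointer plus a flag bit */
+  #define PTRFLAG (1u << L)                /* set iff the field is a pointer */
+  typedef uint64_t slot;
+
+  /* Read the given field of a small list; if it holds a pointer, follow
+   * it into the big list passed down by the recursion. */
+  static unsigned fetch(slot small, int index, const unsigned *big)
+  {
+      unsigned f = (small >> ((L + 1) * index)) & FIELD;
+      return (f & PTRFLAG) ? big[f & ~PTRFLAG] : f;
+  }
+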
\qed
\lemma\id{verfh}%
-The procedure \<FindHeavy> processes an~edge~$e$ in time $\O(\log \vert T_e\vert + q_e)$,
-where $q_e$~is the number of query paths having~$e$ as its bottommost edge.
+The procedure \<FindPeaks> processes an~edge~$e$ in time $\O(\log \vert T_e\vert + q_e)$,
+where $q_e$~is the number of query paths having~$e$ as their bottom edge.
\proof
-The edge is examined in steps 1, 2, 4 and~5 of the algorithm. Step~4 is trivial
-as we have already computed the top masks and we can reconstruct the entries
-of~$T_e$ in constant time (Lemma \ref{verth}). We have to perform the other
-in constant time if $P_e$ is a~small list or $\O(\log\log n)$ if it is big.
+The edge is examined in steps 1, 3, 4 and~5 of the algorithm. We will show how to
+perform each of these steps in constant time if $P_e$ is a~small list or
+$\O(\log\log n)$ if it is big.
-\em{Step 5} involves binary search on~$P_e$ in $\O(\log\vert T_e\vert)$ comparisons,
-each of them indexes~$P_e$, which is $\O(1)$ by Lemma \ref{verth}. Rewriting the
-lighter edges is $\O(1)$ for small lists by replication and bit masking, for a~big
-list we do the same for each of its slot.
+\em{Step~1} looks up $q_e$~tops in~$P_e$ and we already know from Lemma \ref{verhe}
+how to do that in constant time per top.
-\em{Step 1} looks up $q_e$~tops in~$P_e$ and we already know from Lemma \ref{verhe}
-how to do that in constant time.
+\em{Step~3} is trivial as we have already computed the top masks and we can
+reconstruct the entries of~$T_e$ in constant time according to Lemma \ref{verth}.
-\em{Step 2} is the only non-trivial one. We already know which tops to select
+\em{Step~5} involves binary search on~$P_e$ in $\O(\log\vert T_e\vert)$ comparisons,
+each of them indexes~$P_e$, which is $\O(1)$ again by Lemma \ref{verhe}. Rewriting the
+lighter edges takes $\O(1)$ for small lists by replication and bit masking; for a~big
+list we do the same for each of its slots.
+
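+For a~small list, the rewriting can be done as in the following C sketch (the
+field width and capacity are our own constants; the last parameter is the
+boundary index found by the binary search):
+
+  #include <stdint.h>
+
+  enum { L = 6, FIELDS = 8 };               /* field width and capacity    */
+  typedef uint64_t slot;
+
+  /* lab repeated in every field; the real algorithm multiplies by the
+   * constant with a single 1 in each field instead of looping. */
+  static slot replicate(unsigned lab)
+  {
+      slot r = 0;
+      for (int i = 0; i < FIELDS; i++)
+          r |= (slot)lab << (L * i);
+      return r;
+  }
+
+  /* Replace the fields at positions from..FIELDS-1 by the label lab. */
+  static slot rewrite_tail(slot list, unsigned lab, int from)
+  {
+      slot field = ((slot)1 << L) - 1;
+      slot tail = 0;
+      for (int i = from; i < FIELDS; i++)
+          tail |= field << (L * i);
+      return (list & ~tail) | (replicate(lab) & tail);
+  }
+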
+\em{Step~4} is the only non-trivial one. We already know which tops to select
(we have the top masks $M_e$ and~$M_p$ precomputed), but we have to carefully
-extract the sublist.
-We have to handle these four cases:
+extract the sublist.
+We need to handle these four cases:
\itemize\ibull
-\:\em{Small from small:} We use \<Select> to find the fields of~$P_p$ which
-shall be deleted. Then we apply \<SubList> to actually delete them. Pointers
+\:\em{Small from small:} We use $\<Select>(T_e,T_p)$ to find the fields of~$P_p$
+which shall be deleted by a~subsequent call to \<SubList>. Pointers
can be retained as they still refer to the same ancestor list.
\:\em{Big from big:} We can copy the whole~$P_p$, since the layout of the
adjusted to be relative to the beginning of the slot (we use \<Compare>
and \<Replicate> from Algorithm \ref{vecops} and bit masking). Then we
use a~precomputed table to replace the pointers by the fields of~$B_i$
-they point to. Finally, we $\bor$ together the partial results.
+they point to. We $\bor$ together the partial results and we again have
+a~small list.
-Finally, we have to spread the fields of the small list to the whole big list.
+Finally, we have to spread the fields of this small list to the whole big list.
This is similar: for each slot of the big list, we find the part of the small
-list keeping the fields we want (\<Weight> on the sub-words of~$M_e$ before
-and above the intended interval of depths) and we use a~precomputed table
+list keeping the fields we want (we call \<Weight> on the sub-words of~$M_e$ before
+and after the intended interval of depths) and we use a~tabulated function
to shift the fields to the right locations in the slot (controlled by the
sub-word of~$M_e$ in the intended interval).
\qeditem
\endlist
+\>We are now ready to combine these steps and get the following theorem:
+
\thmn{Verification of MST on the RAM}\id{ramverify}%
There is a~RAM algorithm, which for every weighted graph~$G$ and its spanning tree~$T$
-determines whether~$T$ is minimum and finds all $T$-light edges in~$G$.
+determines whether~$T$ is minimum and finds all $T$-light edges in~$G$ in time $\O(m)$.
\proof
Implement Koml\'os's algorithm from Theorem \ref{verify} with the data
Buchsbaum et al.~have recently shown in \cite{buchsbaum:verify} that linear-time
verification can be achieved even on the pointer machine. They first solve the
problem of finding the lowest common ancestors for a~set of pairs of vertices
-by batch processing, combining an~$\O(m\alpha(m,n))$ algorithm using the Union-Find
-data structure with table lookup for small subtrees. Then they use a~similar
-technique for the path maxima themselves. The tricky part is of course the table
+by batch processing. They combine an~algorithm of time complexity $\O(m\alpha(m,n))$
+using the Union-Find data structure with table lookup for small subtrees. Then they use a~similar
+technique for finding the peaks themselves. The tricky part is of course the table
lookup, which they handle by radix-sorting pointer-based codes of the subtrees.
\rem