From 9ab438c07ce74904621b1e2421db7789d542cbd7 Mon Sep 17 00:00:00 2001
From: Martin Mares
Date: Sat, 8 Mar 2008 20:22:03 +0100
Subject: [PATCH] Fixing verification...

---
 PLAN       |   1 +
 adv.tex    | 249 +++++++++++++++++++++++++++++++++++++++++------------
 biblio.bib |  34 ++++++++
 macros.tex |   3 +
 4 files changed, 232 insertions(+), 55 deletions(-)

diff --git a/PLAN b/PLAN
index 0bd1e98..11d0215 100644
--- a/PLAN
+++ b/PLAN
@@ -84,3 +84,4 @@ Notation:
 - unify use of n(G) vs. n
 - use calligraphic letters from ams?
 - change the notation for contractions -- use double slash instead of the dot?
+- introduce \widehat\O early
diff --git a/adv.tex b/adv.tex
index b5c433b..8eff275 100644
--- a/adv.tex
+++ b/adv.tex
@@ -545,19 +545,20 @@ which need careful handling, so we omit the description of this algorithm.
 Now we will turn our attention to a~slightly different problem: given a~spanning
 tree, how can we verify that it is minimum? We will show that this can be achieved
-in linear time and it will serve as a~basis for the randomized linear-time
-MST algorithm in the next section.
+in linear time and it will serve as a~basis for a~randomized linear-time
+MST algorithm in Section~\ref{randmst}.
 
 MST verification has been studied by Koml\'os \cite{komlos:verify}, who has
 proven that $\O(m)$ edge comparisons are sufficient, but his algorithm needed
 superlinear time to find the edges to compare. Dixon, Rauch and Tarjan have
 later shown in \cite{dixon:verify} that the overhead can be reduced to linear
 time on the RAM using preprocessing and table lookup on small
-subtrees. This algorithm was then simplified by King in \cite{king:verifytwo}.
-We will follow the results of Koml\'os and King, but as we have the power of the
-RAM data structures from Section~\ref{bitsect} at our command, we will not have
-to worry about much technical details.
+subtrees. Later, King gave a~simpler algorithm in \cite{king:verifytwo}.
+In this section, we will follow Koml\'os's steps and study the comparisons
+needed, saving the actual efficient implementation for later.
+
+\para
 To verify that a~spanning tree~$T$ is minimum, it is sufficient
 to check that all edges outside~$T$ are $T$-heavy (by Theorem \ref{mstthm}).
 In fact, we will be able to find all $T$-light edges efficiently. For each edge
 $uv\in E\setminus T$,
@@ -623,57 +624,39 @@ query path $T[x,y]$ to two half-paths $T[x,a]$ and $T[a,y]$ where~$a$ is the
 \df{lowest common ancestor} of~$x$ and~$y$ in~$T$. It is therefore sufficient
 to consider only paths which connect a~vertex with one of its ancestors.
 
-Finding the common ancestors is not trivial, but Harel and Tarjan have shown
-in \cite{harel:nca} that linear time is sufficient on the RAM. Several more
-accessible algorithms have been developed since then (see the Alstrup's survey
-paper \cite{alstrup:nca} and a~particularly elegant algorithm shown by Bender
-and Falach-Colton in \cite{bender:lca}). Any of them implies the following
-theorem:
-
-\thmn{Lowest common ancestors}\id{lcathm}%
-On the RAM, it is possible to preprocess a~tree~$T$ in time $\O(n)$ and then
-answer lowest common ancestor queries presented online in constant time.
-
-\proof
-See for example Bender and Falach-Colton \cite{bender:lca}.
-\qed
-
-\para
-We summarize the reductions in the following lemma, also showing that they can
-be performed in linear time:
+When we combine the two transforms, we get:
 
 \lemma\id{verbranch}%
-For each tree~$T$ and a~set of query paths~$Q$ on~$T$, it is possible to find
-a~complete branching tree~$T'$ in linear time together with a~set~$Q'$ of query
+For each tree~$T$ on $n$~vertices and a~set of $m$~query paths~$Q$ on~$T$, it is possible to find
+a~complete branching tree~$T'$ in $\O(n)$ comparisons, together with a~set~$Q'$ of
 paths on~$T'$, such that the weights of the heaviest edges of the new paths can
-be transformed to the weights of the heaviest edges of the paths in~$Q$ in
-linear time.
-The tree $T$ has at most $2n(T)$ vertices and $\O(\log n(T))$ levels. The set~$Q'$ contains
-at most~$2\vert Q\vert$ paths and each of them connects a~vertex of~$T'$ with one
+be transformed to the weights of the heaviest edges of the paths in~$Q$ in $\O(m)$ comparisons.
+The tree $T'$ has at most $2n(T)$ vertices and $\O(\log n(T))$ levels. The set~$Q'$ contains
+at most~$2m$ paths and each of them connects a~vertex of~$T'$ with one
 of its ancestors.
 
 \proof
-The tree~$T'$ will be the Bor\o{u}vka tree for~$T$. We run the contractive version
-of the Bor\o{u}vka's algorithm (Algorithm \ref{contbor}) on~$T$. It runs in linear time,
-for example because trees are planar (Theorem \ref{planarbor}). The construction of~$T'$
-itself adds only a~constant overhead to every step of the algorithm. As~$T'$ has~$m(T)=n(T)-1$
-leaves and it is a~complete branching tree, it has at most~$m(T)$ internal vertices,
-so~$n(T')\le 2n(T)$ as promised. Since all internal vertices have at least two sons,
-the depth must be logarithmic.
+The tree~$T'$ will be the Bor\o{u}vka tree for~$T$, obtained by running the
+contractive version of Bor\o{u}vka's algorithm (Algorithm \ref{contbor})
+on~$T$. The algorithm runs in linear time, for example because trees are planar
+(Theorem \ref{planarbor}). It therefore performs only $\O(n)$ comparisons.
+
+As~$T'$ has~$m(T)=n(T)-1$ leaves and it is a~complete branching tree, it has at most~$m(T)$ internal vertices,
+so~$n(T')\le 2n(T)$ as promised. Since the number of passes of Bor\o{u}vka's
+algorithm is $\O(\log n)$, the depth of the tree must be logarithmic as well.
 
 For each query path $T[x,y]$ we find the lowest common ancestor of~$x$ and~$y$
-using Theorem \ref{lcathm} and replace the path by the two half-paths. This
-produces a~set~$Q'$ of at most~$2\vert Q\vert$ paths. If we remember the origin
-of each of the new paths, the reconstruction of answers to the original queries
-is then trivial.
+and replace the path by the two half-paths. This
+produces a~set~$Q'$ of at most~$2m$ half-paths. If we remember the origin
+of each of the new half-paths, the reconstruction of answers to the original queries
+is then just taking the maximum of the answers for the two half-paths.
 \qed
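+
+\example
+To see where the $\O(m)$ bound on the reconstruction comes from, consider
+a~single query path $T[x,y]$ and let~$a$ be the lowest common ancestor of~$x$
+and~$y$. The path is replaced by the half-paths $T[x,a]$ and $T[a,y]$, and
+if $h_1$ and~$h_2$ are the heaviest edges found for the half-paths, the
+heaviest edge of $T[x,y]$ is the heavier of $h_1$ and~$h_2$. Each of the
+$m$~original queries therefore costs at most one extra comparison.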
 
 \para
 We will now describe a~simple variant of depth-first search which finds the
-maximum-weight edges for all query paths of the transformed problem. For the
-time being, we will not care about the time complexity of the algorithm (as long
-as it is polynomial) and we will minimize only the number of edge weight comparisons
-performed.
+heaviest edges for all query paths of the transformed problem. As promised,
+we will only count the edge weight comparisons; all other operations just need
+to be well-defined and computable in polynomial time.
 
 For every edge~$e=uv$, we consider the set $Q_e$ of all query paths containing~$e$.
 The vertex of a~path which is closer to the root will be called its \df{top,}
@@ -696,7 +679,8 @@ the path in~$T_p$ and record the heaviest edge remembered for it in~$H_p$.
 
 \:First find the tops~$T$ which will be shared by all edges going from~$u$
 downwards. These are the tops from~$T_p$ except for the ones which have ceased
 to be active,
-because all query paths which were referring to them have~$u$ as their bottom.
+because all query paths which were referring to them either have~$u$ as their bottom
+or continue from~$u$ downwards by another edge.
 Select the corresponding array of the heaviest edges~$H$ from~$H_p$.
 
 \:For every son~$v$ of~$u$, do:
@@ -765,9 +749,8 @@ $$\eqalign{
 }$$
 Putting all three parts together, we conclude that:
 $$
-c \le n + m + \O(n) = \O(n+m).
+c \le n + m + \O(n) = \O(n+m). \qedmath
 $$
-\qed
 
 \para
 When we combine this lemma with the above reduction, we get the following theorem:
@@ -777,26 +760,182 @@ For every weighted graph~$G$ and its spanning tree~$T$, it is sufficient to
 perform $\O(m)$ comparisons of edge weights to determine whether~$T$ is minimum
 and to find all $T$-light edges in~$G$.
 
-\para
+\proof
 We first transform the problem to finding the heaviest edges for a~set of query
 paths in~$T$ (these are exactly the paths covered by the edges of $G\setminus T$).
 We use the reduction from Lemma \ref{verbranch} to get an~equivalent problem
 with a~full branching tree and a~set of parent-descendant
-paths. Then we run the \<FindHeavy> procedure (\ref{findheavy}) to find
-the heaviest edges and we employ Lemma \ref{vercompares} to bound the number
-of comparisons used.
+paths, which costs $\O(m+n)$ comparisons.
+Then we run the \<FindHeavy> procedure (\ref{findheavy}) to find
+the heaviest edges and according to Lemma \ref{vercompares} it spends
+another $\O(m+n)$ comparisons. Since we (as always) assume that~$G$ is connected,
+$\O(m+n)=\O(m)$.
 \qed
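+
+\rem
+The $T$-light edges found by these comparisons also serve as a~succinct
+certificate of non-minimality: if an~edge $e=uv\in E\setminus T$ is $T$-light
+and~$f$ is the heaviest edge of the path $T[u,v]$, then $T-f+e$ is a~spanning
+tree of weight $w(T)-w(f)+w(e) < w(T)$. If there is no $T$-light edge, the
+tree~$T$ is minimum by Theorem \ref{mstthm}.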
+
+\rem
+The problem of computing path maxima or minima in a~tree has several other interesting
+applications, such as computing minimum cuts separating given pairs of vertices in a~given
+weighted undirected graph~$G$. We construct a~Gomory-Hu tree~$T$ for the graph as described
+in \cite{gomoryhu} (see also \cite{bhalgat:ght} for a~more efficient algorithm running in
+time $\widetilde\O(mn)$ for unit-cost graphs). This tree has the property that for every two
+vertices $u$, $v$ of the graph~$G$, the minimum-cost edge on $T[u,v]$ has the same cost
+as the minimum cut separating $u$ and~$v$ in~$G$. Since the construction of~$T$ generally
+takes $\Omega(n^2)$ time, we could also precompute the minima for all pairs of vertices
+at no extra cost. This would however require quadratic space, while the method of this
+section needs only $\O(n+q)$ space to process~$q$ queries.
+
+\rem
+A~dynamic version of the problem is also often considered. It calls for a~data structure
+representing a~weighted tree with operations for modifying the structure of the tree
+and querying minima or maxima on paths. Sleator and Tarjan have shown in \cite{sleator:trees}
+how to do this in $\O(\log n)$ time amortized per operation, which for example
+allows an~implementation of Dinic's maximum flow algorithm \cite{dinic:flow}
+in time $\O(mn\log n)$.
+
+%--------------------------------------------------------------------------------
+
+\section{Verification in linear time}
+
+We have proven that $\O(m)$ edge weight comparisons suffice to verify minimality
+of a~given spanning tree. In this section, we will show an~algorithm for the RAM
+which finds the required comparisons in linear time. We will follow the idea
+of King from \cite{king:verify}, but as we have the power of the RAM data structures
+from Section~\ref{bitsect} at our command, the low-level details will be easier.
+
+\paran{Reduction}
+First of all, let us make sure that the reduction to fully branching trees
+in Lemma \ref{verbranch} can be made to run in linear time. As already noticed
+in the proof, Bor\o{u}vka's algorithm runs in linear time. Constructing
+the Bor\o{u}vka tree in the process adds at most a~constant overhead to every
+step of the algorithm.
+
+Finding the common ancestors is not trivial, but Harel and Tarjan have shown
+in \cite{harel:nca} that linear time is sufficient on the RAM. Several more
+accessible algorithms have been developed since then (see Alstrup's survey
+paper \cite{alstrup:nca} and a~particularly elegant algorithm shown by Bender
+and Farach-Colton in \cite{bender:lca}). Any of them implies the following
+theorem:
+
+\thmn{Lowest common ancestors}\id{lcathm}%
+On the RAM, it is possible to preprocess a~tree~$T$ in time $\O(n)$ and then
+answer lowest common ancestor queries presented online in constant time.
+
+\proof
+See for example Bender and Farach-Colton \cite{bender:lca}.
+\qed
+
+\cor
+The reductions in Lemma \ref{verbranch} can be performed in time $\O(m)$.
+
 \para
-We will now show an~efficient implementation of \<FindHeavy>, which will
-run in linear time on the RAM.
+Having the reduced problem at hand, we have to implement the procedure \<FindHeavy>
+of Algorithm \ref{findheavy} efficiently. We need a~compact representation of
+the arrays $T_e$ and~$H_e$ by vectors, so that the overhead of the algorithm
+will be linear in the number of comparisons performed.
+
+\defn
+
+\em{Vertex identifiers:} Since all vertices referred to by the procedure
+lie on the path from the root to the current vertex~$u$, we modify the algorithm
+to keep a~stack of these vertices in an~array and refer to each vertex by its
+index in this array, i.e., by its depth. We will call these identifiers \df{vertex
+labels} and we note that each label requires only $\ell=\lceil \log\log n\rceil$
+bits. As every tree edge is uniquely identified by its bottom vertex, we can
+use the same encoding for \df{edge labels.}
+
+\em{Slots:} As we will need several operations which are not computable
+in constant time on the RAM, we precompute tables for these operations
+as we did in the Q-Heaps (cf.~Lemma \ref{qhprecomp}). A~table for word-sized
+arguments would take too much time to precompute, so we will generally store
+our data structures in \df{slots} of $s=1/3\cdot\lceil\log n\rceil$ bits each.
+We will show soon that it is possible to precompute a~table of any reasonable
+function whose arguments fit in two slots.
+
+\em{Top masks:} The array~$T_e$ will be represented by bit masks. For each
+of the possible tops~$t$ (i.e., the ancestors of the current vertex), we store
+a~single bit telling whether $t\in T_e$. Each bit mask fits in $\lceil\log n\rceil$
+bits and therefore in a~single machine word. If needed, it can be split into three slots.
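+
+\example
+To compare the sizes involved: a~label occupies $\ell=\lceil\log\log n\rceil$ bits
+while a~slot offers $s=1/3\cdot\lceil\log n\rceil$ bits, so a~single slot can hold
+up to $\Theta(s/\ell)=\Theta(\log n/\log\log n)$ labels. A~top mask, on the other
+hand, needs all $\lceil\log n\rceil$ bits, that is all three slots of a~machine word.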
+
+\em{Small and big lists:} The heaviest edge found so far for each top is stored
+by the algorithm in the array~$H_e$. Instead of keeping the real array,
+we store the labels of these edges in a~list encoded in a~bit string.
+Depending on the size of the list, we use one of two possible encodings:
+\df{Small lists} are stored in a~vector which fits in a~single slot, with
+the unused entries filled by a~special constant, so that we can infer the
+length of the list.
+
+If the data do not fit in a~small list, we use a~\df{big list} instead, which
+is stored in $\O(\log\log n)$ words, each of them containing a~slot-sized
+vector. Unlike the small lists, we use the big lists as arrays. If a~top~$t$ of
+depth~$d$ is active, we keep the corresponding entry of~$H_e$ in the $d$-th
+entry of the big list. Otherwise, we keep that entry unused.
+
+We will want to perform all operations on small lists in constant time,
+but we can afford to spend time $\O(\log\log n)$ on every big list. This
+is true because whenever we need a~big list, $\vert T_e\vert = \Omega(\log n/\log\log n)$,
+so we need $\log\vert T_e\vert = \Omega(\log\log n)$ comparisons anyway.
+
+\em{Pointers:} When we need to construct a~small list containing a~sub-list
+of a~big list, we do not have enough time to see the whole big list. To solve
+this problem, we will introduce \df{pointers} as another kind of edge identifier.
+A~pointer is an~index into the nearest big list on the path from the small
+list containing the pointer to the root. As each big list has at most $\lceil\log n\rceil$
+entries, the pointer fits in~$\ell$ bits, but we need one extra bit to distinguish
+between normal labels and pointers.
+
+\lemma
+When~$f$ is a~function of two arguments computable in polynomial time, we can
+precompute a~table of the values of~$f$ for all values of arguments which fit
+in a~single slot. The precomputation takes $\O(n)$ time.
+
+\proof
+Similar to the proof of Lemma \ref{qhprecomp}. There are $\O(2^{2s}) = \O(n^{2/3})$
+possible pairs of arguments, so the precomputation takes time $\O(n^{2/3}\cdot\poly(s))
+= \O(n^{2/3}\cdot\poly(\log n)) = \O(n)$.
+\qed
+
+\example
+We can therefore assume that the following functions are computable in constant
+time (after $\O(n)$ preprocessing):
+
+\itemize\ibull
+\:$\<Weight>(x)$ --- computes the Hamming weight of a~slot-sized number~$x$
+(we already considered this operation in Algorithm \ref{lsbmsb}, but we needed
+quadratic word size for it). We can easily extend this to $\log n$-bit numbers
+by splitting the number into three slots and adding their weights.
+
+\:$\<FindKth>(x,k)$ --- finds the $k$-th set bit from the top of the slot-sized
+number~$x$. Again, this can be extended to multi-slot numbers by calculating
+the \<Weight> of each slot first and then finding the slot containing the
+$k$-th one, as the example below this list shows.
-\FIXME{GHT}
+
+\:$\<Bits>(m)$ --- for a~slot-sized bit mask~$m$, it returns a~small list
+of the positions of bits set in~$m$.
+
+\:$\<Select>(x,m)$ --- when~$x$ is a~small list and~$m$ a~bit mask, it returns
+a~small list containing the elements of~$x$ selected by the bits set in~$m$.
+\endlist
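+
+\example
+For instance, to compute the $\<Weight>$ of a~$\lceil\log n\rceil$-bit number~$x$,
+we write $x=x_2\cdot 2^{2s}+x_1\cdot 2^s+x_0$ with slot-sized $x_0,x_1,x_2$ and sum
+three table lookups: $\<Weight>(x)=\<Weight>(x_2)+\<Weight>(x_1)+\<Weight>(x_0)$.
+For $\<FindKth>(x,k)$, we first test whether $k\le\<Weight>(x_2)$: if so, the
+sought bit lies in the top slot and a~single lookup finds it; otherwise we
+decrease~$k$ by $\<Weight>(x_2)$ and continue with~$x_1$ and possibly~$x_0$
+in the same way.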
+
+\para
+We will now show how to perform all parts of the procedure \<FindHeavy>
+in the required time.
+
+\lemma
+
+\proof
+\qed
+
+\lemma
+The array $H_e$ can be indexed in constant time.
+
+\proof
+\qed
+
+
+\FIXME{Mention online version}
+
+%--------------------------------------------------------------------------------
+
+\section{A~randomized algorithm}\id{randmst}%
 
 %--------------------------------------------------------------------------------
 
diff --git a/biblio.bib b/biblio.bib
index d56cf90..10335d4 100644
--- a/biblio.bib
+++ b/biblio.bib
@@ -1044,3 +1044,37 @@ inproceedings{ pettie:minirand,
   year={1984},
   publisher={Society for Industrial and Applied Mathematics Philadelphia, PA, USA}
 }
+
+@article{ gomoryhu,
+  author = {R. E. Gomory and T. C. Hu},
+  title = {Multi-Terminal Network Flows},
+  journal = {Journal of SIAM},
+  volume = {9},
+  number = {4},
+  year = {1961},
+  pages = {551--570}
+}
+
+@inproceedings{ bhalgat:ght,
+  author = {Ramesh Hariharan and Telikepalli Kavitha and Debmalya Panigrahi and Anand Bhalgat},
+  title = {{An $\widehat\O(mn)$ Gomory-Hu tree construction algorithm for unweighted graphs}},
+  booktitle = {STOC '07: Proceedings of the thirty-ninth annual ACM symposium on Theory of computing},
+  year = {2007},
+  isbn = {978-1-59593-631-8},
+  pages = {605--614},
+  location = {San Diego, California, USA},
+  doi = {http://doi.acm.org/10.1145/1250790.1250879},
+  publisher = {ACM},
+  address = {New York, NY, USA},
+}
+
+@article{ sleator:trees,
+  author = {Sleator, D. D. and Tarjan, R. E.},
+  title = {{A data structure for dynamic trees}},
+  journal = {Journal of Computer and System Sciences},
+  volume = {26},
+  number = {3},
+  pages = {362--391},
+  year = {1983},
+  publisher = {Academic Press, Inc. Orlando, FL, USA}
+}
diff --git a/macros.tex b/macros.tex
index c0ac011..c238860 100644
--- a/macros.tex
+++ b/macros.tex
@@ -34,6 +34,7 @@
 \let\>=\noindent
 \def\qed{{\parfillskip=0pt\allowbreak\hfill\nobreak $\spadesuit$\par}}
 \def\qeditem{{\parfillskip=0pt\hfill\rlap{\hskip\rightskip\llap{$\spadesuit$}}\par}}
+\def\qedmath{\eqno{\spadesuit}}
 \def\FIXME#1{\>{\bo TODO:} #1}
 \def\symdiff{\mathbin{\Delta}}
 \def\rack#1#2{\setbox0=\hbox{#1}\hbox to \wd0{#2}}
@@ -358,6 +359,8 @@
 \def\examplen{\example\labelx}
 \def\problemn{\problem\labelx}
 
+\def\paran#1{\para {\sl #1:}}
+
 \def\proof{\noindent {\sl Proof.}\enspace}
 \def\proofsketch{\noindent {\sl Proof sketch.}\enspace}
-- 
2.39.2