edges outside~$T$ are $T$-heavy (by Theorem \ref{mstthm}). In fact, we will be
able to find all $T$-light edges efficiently. For each edge $uv\in E\setminus T$,
we will find the heaviest edge of the tree path $T[u,v]$ and compare its weight
to $w(uv)$. It is therefore sufficient to solve the following problem:
\problem
Given a~weighted tree~$T$ and a~set of \df{query paths} $Q \subseteq \{ T[u,v] ; u,v\in V(T) \}$
specified by their endpoints, find the heaviest edge (\df{peak}) for every path in~$Q$.
\para
Finding the peaks can be burdensome if the tree~$T$ is degenerate,
so we will first reduce it to the same problem on a~balanced tree. We run
the Bor\o{u}vka's algorithm on~$T$, which certainly produces $T$ itself, and we
record the order in which the subtrees have been merged in another tree~$B(T)$.
The peak queries on~$T$ can then be easily translated to peak queries on~$B(T)$.
\defn
For a~weighted tree~$T$ we define its \df{Bor\o{u}vka tree} $B(T)$ as a~rooted tree which records
at least two sons. Such trees will be called \df{complete branching trees.}
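
To make the definition concrete, the following Python fragment is a~rough sketch of one
way to record the merges of the Bor\o{u}vka's algorithm run on~$T$. It is an~illustration
only: the representation of~$T$ by edge triples $(w,u,v)$ with distinct weights, the
function name, and the output as parent pointers and parent-edge weights of~$B(T)$ are
our own choices, not part of the definition.

\verbatim
def boruvka_tree(vertices, edges):
    """Sketch: edges are triples (weight, u, v) with distinct weights.
    Returns (parent, parent_weight) describing B(T); the leaves are the vertices of T."""
    comp_of = {v: v for v in vertices}    # vertex of T -> id of its current component
    node_of = {v: v for v in vertices}    # component id -> the node of B(T) representing it
    parent, parent_weight = {}, {}
    fresh = 0
    while len(set(comp_of.values())) > 1:
        comps = set(comp_of.values())
        sel = {}                          # component -> the lightest edge leaving it
        for wgt, u, v in edges:
            cu, cv = comp_of[u], comp_of[v]
            if cu == cv:
                continue
            for c in (cu, cv):
                if c not in sel or wgt < sel[c][0]:
                    sel[c] = (wgt, cu, cv)
        boss = {c: c for c in comps}      # tiny union-find for contracting the selected edges
        def find(c):
            while boss[c] != c:
                c = boss[c]
            return c
        for wgt, cu, cv in sel.values():
            boss[find(cu)] = find(cv)
        group_node = {}                   # one new node of B(T) per merged group
        for c in comps:
            g = find(c)
            if g not in group_node:
                fresh += 1
                group_node[g] = ('internal', fresh)
            parent[node_of[c]] = group_node[g]
            parent_weight[node_of[c]] = sel[c][0]   # weight of the edge this component selected
        comp_of = {v: find(comp_of[v]) for v in vertices}
        node_of = {find(c): group_node[find(c)] for c in comps}
    return parent, parent_weight
\endverbatim
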
\lemma
For every tree~$T$ and every pair of its vertices $x,y\in V(T)$, the peak
of the path $T[x,y]$ has the same weight as the peak of~the path $B(T)[x,y]$.
\proof
Let us denote the path $T[x,y]$ by~$P$ and its heaviest edge by~$h=ab$. Similarly,
For each query path $T[x,y]$ we find the lowest common ancestor of~$x$ and~$y$
and split the path at this vertex into two half-paths. This produces a~set~$Q'$ of at most~$2q$ half-paths.
The peak of every original query path is then the heavier of the peaks of its halves.
\qed
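
For concreteness, a~small Python sketch of this splitting step follows. It assumes that
parent pointers and depths of the rooted tree are available and that query endpoints are
distinct; it uses the naive walking-up computation of the common ancestor purely for
clarity, and all names are illustrative only.

\verbatim
def split_queries(queries, parent, depth):
    """Split each query path {x, y} into its two vertical half-paths (top, bottom)."""
    halves = {}
    for x, y in queries:
        a, b = x, y
        while a != b:                       # naive LCA: repeatedly lift the deeper endpoint
            if depth[a] < depth[b]:
                a, b = b, a
            a = parent[a]
        halves[(x, y)] = [(a, x), (a, y)]   # a is now the lowest common ancestor
    return halves

def combine_peaks(halves, half_peak):
    """half_peak maps each half-path to its peak weight (computed, e.g., by the
    depth-first procedure described below); empty halves (top == bottom) are skipped."""
    return {q: max(half_peak[h] for h in hs if h[0] != h[1])
            for q, hs in halves.items()}
\endverbatim
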
\para
We will now describe a~simple variant of depth-first search which finds the
peaks of all query paths of the transformed problem. As we promised,
we will take care of the number of comparisons only, as long as all other operations
are well-defined and they can be performed in polynomial time.
\defn
For every edge~$e=uv$, we consider the set $Q_e$ of all query paths containing~$e$.
The vertex of a~path, which is closer to the root, will be called its \df{top,}
the other vertex its \df{bottom.}
We define arrays $T_e$ and~$P_e$ as follows: $T_e$ contains
the tops of the paths in~$Q_e$ in order of their increasing depth (we
will call them \df{active tops} and each of them will be stored exactly once). For
each active top~$t=T_e[i]$, we define $P_e[i]$ as the peak of the path $T[v,t]$.

\obs
As for every~$i$ the path $T[v,T_e[i+1]]$ is contained within $T[v,T_e[i]]$,
the edges of~$P_e$ must have non-increasing weights, that is $w(P_e[i+1]) \le
w(P_e[i])$.
\alg $\<FindHeavy>(u,p,T_p,P_p)$ --- process all queries in the subtree rooted
at~$u$ entered from its parent via an~edge~$p$.
\id{findheavy}
\algo
\:Process all query paths whose bottom is~$u$ and record their peaks.
This is accomplished by finding the index~$i$ of each path's top in~$T_p$ and reading
the desired edge from~$P_p[i]$.

\:For every son~$v$ of~$u$, process the edge $e=uv$:

\::Construct the array of tops~$T_e$ for the edge~$e$: Start with~$T_p$, remove
 the tops of the paths which do not contain~$e$ and add the vertex~$u$ itself
 if there is a~query path which has~$u$ as its top and which has bottom somewhere
 in the subtree rooted at~$v$.

\::Construct the array of the peaks~$P_e$: Start with~$P_p$,
 remove the entries corresponding to the tops which are no longer active.
 Since the paths leading to all active tops have been extended by the
 edge~$e$, compare $w(e)$ with weights of the edges recorded in~$P_e$ and replace
 those edges which are lighter by~$e$.
 Since $P_p$ was sorted, we can use binary search
 to locate the boundary between lighter and heavier edges in~$P_e$. Finally
 append~$e$ to the array if the top $u$ became active.

\::Recurse on~$v$: call $\<FindHeavy>(v,e,T_e,P_e)$.
\endalgo
\>As we need a~parent edge to start the recursion, we add an~imaginary parent
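
Before counting the comparisons, let us give a~rough Python rendering of the procedure.
It is only a~sketch under simplifying assumptions of our own: the rooted tree is given by
a~children map and by edge weights indexed by (parent, child) pairs, every query path is
a~parent-descendant path as in the transformed problem, the bookkeeping of active tops
uses plain sets instead of the compact structures developed below, and the recursion
simply starts with empty arrays instead of an~imaginary parent edge. Only the binary
search mirrors the comparisons counted in the next lemma.

\verbatim
from collections import defaultdict

def find_heavy(root, children, w, queries):
    """Return a dict mapping each (top, bottom) query path to the weight of its peak."""
    queries_at_bottom = defaultdict(list)        # bottom vertex -> list of query tops
    for top, bottom in queries:
        queries_at_bottom[bottom].append(top)

    tops_needed = {}                             # v -> tops of queries with bottom in subtree(v)
    def collect(v):
        need = set(queries_at_bottom[v])
        for c in children.get(v, ()):
            need |= collect(c)
        tops_needed[v] = need
        return need
    collect(root)

    answers = {}
    def recurse(u, T_p, P_p):
        # T_p: active tops above u, ordered by increasing depth
        # P_p: peak weights of the paths from u to those tops (non-increasing)
        for top in queries_at_bottom[u]:         # answer the queries whose bottom is u
            answers[(top, u)] = P_p[T_p.index(top)]
        for v in children.get(u, ()):            # process each downward edge e = uv
            we = w[(u, v)]
            T_e, P_e = [], []
            for t, peak in zip(T_p, P_p):        # keep the tops whose queries continue below v
                if t in tops_needed[v]:
                    T_e.append(t)
                    P_e.append(peak)
            lo, hi = 0, len(P_e)                 # extend the kept paths by e: binary search
            while lo < hi:                       # for the first entry lighter than w(e)
                mid = (lo + hi) // 2             # and overwrite the lighter tail
                if P_e[mid] < we:
                    hi = mid
                else:
                    lo = mid + 1
            for i in range(lo, len(P_e)):
                P_e[i] = we
            if u in tops_needed[v]:              # u becomes a new, deepest active top
                T_e.append(u)
                P_e.append(we)
            recurse(v, T_e, P_e)                 # recurse into the subtree of v
    recurse(root, [], [])
    return answers
\endverbatim
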
\lemma\id{vercompares}%
When the procedure \<FindHeavy> is called on the transformed problem, it
performs $\O(n+q)$ comparisons, where $n$ is the size of the tree and
$q$ is the number of query paths.
\proof
We will calculate the number of comparisons~$c_i$ performed when processing the edges
going from the $(i+1)$-th to the $i$-th level of the tree.
The levels are numbered from the bottom, so leaves are at level~0 and the root
is at level $\ell\le \lceil \log_2 n\rceil$. There are $n_i\le n/2^i$ vertices
at the $i$-th level, so we consider exactly $n_i$ edges. To avoid taking a~logarithm
of zero, we define $\vert T_e\vert=1$ for $T_e=\emptyset$.
\def\eqalign#1{\null\,\vcenter{\openup\jot
\ialign{\strut\hfil$\displaystyle{##}$&$\displaystyle{{}##}$\hfil
\crcr#1\crcr}}\,}
$$\vcenter{\openup\jot\halign{\strut\hfil $\displaystyle{#}$&$\displaystyle{{}#}$\hfil&\quad#\hfil\cr
c_i &\le \sum_e \left( 1 + \log \vert T_e\vert \right)&(Total cost of the binary searches.)\cr
&\le n_i + \sum_e \log\vert T_e\vert&(We sum over $n_i$ edges.)\cr
&\le n_i + n_i \cdot {\sum_e \log\vert T_e\vert \over n_i}&(Consider the average of logarithms.) \cr
&\le n_i + n_i \cdot \log{\sum_e \vert T_e\vert \over n_i}&(Logarithm is concave.) \cr
 &\le n_i + n_i \cdot \log{q+n\over n_i}&(Bound the number of tops by queries.) \cr
 &= n_i \cdot \left( 1 + \log\left({q+n\over n}\cdot{n\over n_i}\right) \right)\cr
 &= n_i + n_i\log{q+n\over n} + n_i\log{n\over n_i}.\cr
}}$$
Summing over all levels, we estimate the total number of comparisons as:
$$
c = \sum_i c_i = \left( \sum_i n_i \right) + \left( \sum_i n_i \log{q+n\over n}\right) + \left( \sum_i n_i \log{n\over n_i}\right).
$$
The first part is equal to~$n$, the second one to $n\log((q+n)/n)\le q+n$. For the third
one, we would like to plug in the bound $n_i \le n/2^i$, but we unfortunately have one~$n_i$
in the denominator. We save the situation by observing that the function $f(x)=x\log(n/x)$
is decreasing\foot{We can easily check the derivative: $f(x)=(x\ln n-x\ln x)/\ln 2$, so $f'(x)\cdot \ln2 =
\ln n - \ln x - 1$. We want $f'(x)<0$ and therefore $\ln x > \ln n - 1$, i.e., $x > n/e$.}
for $x > n/e$, so for $i\ge 2$ it holds that:
$$
n_i\log{n\over n_i} \le {n\over 2^i}\cdot\log{n\over n/2^i} = {n\over 2^i} \cdot i.
$$
The terms for $i=0$ and~$i=1$ are each at most~$n$, and $\sum_{i\ge 2} i\cdot n/2^i = \O(n)$,
so the third part sums to $\O(n)$.
Putting all three parts together, we conclude that:
$$
c \le n + (q+n) + \O(n) = \O(n+q). \qedmath
$$
\para
and to find all $T$-light edges in~$G$.
\proof
We first transform the problem to finding all peaks of a~set
of query paths in~$T$ (these are exactly the paths covered by the edges
of $G\setminus T$). We use the reduction from Lemma \ref{verbranch} to get
an~equivalent problem with a~full branching tree and a~set of parent-descendant
paths. The reduction costs $\O(m+n)$ comparisons.
Then we run the \<FindHeavy> procedure (Algorithm \ref{findheavy}) to find
the peaks of all query paths. According to Lemma \ref{vercompares}, this spends another $\O(m+n)$
comparisons. Since we (as always) assume that~$G$ is connected, $\O(m+n)=\O(m)$.
\qed
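
The verification itself is then just one pass over the non-tree edges; a~tiny sketch,
assuming a~mapping (the \<peak> dictionary below, obtained as above) from the endpoints of
each non-tree edge to the weight of the heaviest edge of its tree path:

\verbatim
def light_edges(non_tree_edges, peak):
    """Return the edges (w, u, v) outside T that are lighter than the peak of T[u, v]."""
    return [(w, u, v) for w, u, v in non_tree_edges if w < peak[(u, v)]]
\endverbatim
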
\rem
The problem of computing path maxima or minima in a~weighted tree has several other interesting
applications. One of them is computing minimum cuts separating given pairs of vertices in a~given
weighted undirected graph~$G$. We construct a~Gomory-Hu tree~$T$ for the graph as described in \cite{gomoryhu}
(see also \cite{bhalgat:ght} for a~more efficient algorithm running in time
$\widetilde\O(mn)$ for unit-cost graphs). The crucial property of this tree is that for every two
vertices $u$, $v$ of the graph~$G$, the minimum-cost edge on $T[u,v]$ has the same cost
as the minimum cut separating $u$ and~$v$ in~$G$. Since the construction of~$T$ generally
takes $\Omega(n^2)$ time, we could of course invest this time in precomputing the minima for
all pairs of vertices. This would however require quadratic space, so it is better to use
the method of this section, which needs only $\O(n+q)$ space for $q$~queries.
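
As a~small illustration, the following Python fragment answers a~single minimum-cut query
on an~already constructed Gomory-Hu tree, which we assume to be given by parent pointers,
depths and the weight of each vertex's parent edge (all these names are our own). The
naive walk shown here costs $\O(n)$ per query; the point of the method above is that $q$
such path-minimum queries need only $\O(n+q)$ comparisons in total.

\verbatim
def min_cut_query(s, t, parent, depth, up_weight):
    """Cost of the minimum cut separating s and t, read off a Gomory-Hu tree."""
    best = float("inf")
    while s != t:                       # climb towards the common ancestor
        if depth[s] < depth[t]:
            s, t = t, s                 # always lift the deeper endpoint
        best = min(best, up_weight[s])  # weight of the edge from s to its parent
        s = parent[s]
    return best
\endverbatim
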
\rem
A~dynamic version of the problem is also often considered. It calls for a~data structure
representing a~weighted forest with operations for modifying the structure of the forest
and querying minima or maxima on paths. Sleator and Tarjan have shown in \cite{sleator:trees}
how to do this in $\O(\log n)$ time amortized per operation, which leads to
an~implementation of the Dinic's maximum flow algorithm \cite{dinic:flow}
in time $\O(mn\log n)$.
%--------------------------------------------------------------------------------
\para
Having the reduced problem at hand, we have to implement the procedure \<FindHeavy>
of Algorithm \ref{findheavy} efficiently. We need a~compact representation of
the arrays $T_e$ and~$P_e$ by vectors, so that the overhead of the algorithm
will be linear in the number of comparisons performed.
\defn
bits and therefore in a~single machine word. If needed, it can be split to three slots.
\em{Small and big lists:} The heaviest edge found so far for each top is stored
by the algorithm in the array~$P_e$. Instead of keeping the real array,
we store the labels of these edges in a~list encoded in a~bit string.
Depending on the size of the list, we use one of two possible encodings:
\df{Small lists} are stored in a~vector which fits in a~single slot, with
If the data do not fit in a~small list, we use a~\df{big list} instead, which
is stored in $\O(\log\log n)$ words, each of them containing a~slot-sized
vector. Unlike the small lists, we use the big lists as arrays. If a~top~$t$ of
depth~$d$ is active, we keep the corresponding entry of~$P_e$ in the $d$-th
field of the big list. Otherwise, we keep that entry unused.
We will want to perform all operations on small lists in constant time,
\qed
\lemma\id{verth}%
The arrays $T_e$ and~$P_e$ can be indexed in constant time.
\proof
Indexing~$T_e$ is exactly the operation \<FindKth> applied on the corresponding
top mask~$M_e$.
If $P_e$ is stored in a~big list, we calculate the index of the particular
slot and the position of the field inside the slot. This field can be then
extracted using bit masking and shifts.
\qed
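
To give the flavour of the word operations involved (\<FindKth> and the slot layout are
defined elsewhere in the text and are not reproduced here), the following Python fragment
shows their intended semantics on integers used as bit masks and packed vectors. On the
RAM these operations run in constant time by the vector tricks referred to above; the
loops below are only a~specification, not the implementation.

\verbatim
def find_kth(mask, k):
    """Position of the k-th set bit of mask (k counted from 0 here)."""
    pos = 0
    while mask:
        if mask & 1:
            if k == 0:
                return pos
            k -= 1
        mask >>= 1
        pos += 1
    raise ValueError("mask has fewer than k+1 one bits")

def read_field(packed, index, width):
    """Extract the index-th field of `width' bits from a word used as a packed vector."""
    return (packed >> (index * width)) & ((1 << width) - 1)
\endverbatim
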
\lemma\id{verhe}%
For an~arbitrary active top~$t$, the corresponding entry of~$P_e$ can be
extracted in constant time.
\proof
We look up the precomputed depth~$d$ of~$t$ first.
If $P_e$ is stored in a~big list, we extract the $d$-th entry of the list.
If the list is small, we find the position of the particular field
by counting bits of the top mask~$M_e$ at position~$d$ and higher
(this is \<Weight> of $M_e$ with the lower bits masked out).
The edge is examined in steps 1, 2, 4 and~5 of the algorithm. Step~4 is trivial
as we have already computed the top masks and we can reconstruct the entries
of~$T_e$ in constant time (Lemma \ref{verth}). We have to perform the other
in constant time if $P_e$ is a~small list or $\O(\log\log n)$ if it is big.
\em{Step 5} involves binary search on~$P_e$ in $\O(\log\vert T_e\vert)$ comparisons,
each of which indexes~$P_e$ in $\O(1)$ time by Lemma \ref{verth}. Rewriting the
lighter edges is $\O(1)$ for small lists by replication and bit masking; for a~big
list we do the same for each of its slots.
\em{Step 1} looks up $q_e$~tops in~$P_e$ and we already know from Lemma \ref{verhe}
how to do that in constant time.
\em{Step 2} is the only non-trivial one. We already know which tops to select.
We have to handle these four cases:
\itemize\ibull
\:\em{Small from small:} We use \<Select> to find the fields of~$P_p$ which
shall be deleted. Then we apply \<SubList> to actually delete them. Pointers
can be retained as they still refer to the same ancestor list.
\:\em{Big from big:} We can copy the whole~$P_p$, since the layout of the
big lists is fixed and the items we do not want simply end up as unused
fields in~$P_e$.
\:\em{Small from big:} We use the operation \<Bits> to construct a~list
of pointers (we use bit masking to add the ``this is a~pointer'' flags).