Connect K best to the introductory chapters.

[saga.git] / dyn.tex
diff --git a/dyn.tex b/dyn.tex

index c0d97e6bc2f1e7e907dff4f371802eb840e61cfc..7197b22b68b2e8cd77515bc49c8ec3d6a180589b 100644 (file)
--- a/dyn.tex
+++ b/dyn.tex
@@ -410,7 +410,7 @@ anyway.)
  }
  
  At the beginning, the graph contains no edges, so both invariants are trivially
-satistifed. Newly inserted edges can enter level~0, which cannot break I1 nor~I2.
+satisfied. Newly inserted edges can enter level~0, which cannot break I1 nor~I2.
  
  When we delete a~tree edge at level~$\ell$, we split a~tree~$T$ of~$F_\ell$ to two
  trees $T_1$ and~$T_2$. Without loss of generality, let us assume that $T_1$ is the
@@ -503,7 +503,7 @@ we apply the trick from Example \ref{accel} and store~$F_0$ in a~ET-tree with $a
  This does not hurt the complexity of insertions and deletions, but allows for faster queries.
  \qed
  
-\rem
+\rem\id{dclower}%
  An~$\Omega(\log n/\log\log n)$ lower bound for the amortized complexity of the dynamic connectivity
  problem has been proven by Henzinger and Fredman \cite{henzinger:lowerbounds} in the cell
  probe model with $\O(\log n)$-bit words. Thorup has answered by a~faster algorithm
@@ -516,7 +516,7 @@ to fit our needs, so we omit the details.
  
  %--------------------------------------------------------------------------------
  
-\section{Dynamic spanning forests}
+\section{Dynamic spanning forests}\id{dynmstsect}%
  
  Let us turn our attention back to the dynamic MSF now.
  Most of the early algorithms for dynamic connectivity also imply $\O(n^\varepsilon)$
@@ -643,7 +643,7 @@ replaced it by a~system of auxiliary edges inserted at various places in the str
  We refer to the article \cite{holm:polylog} for details.
  \qed
  
-\corn{Fully dynamic MSF}
+\corn{Fully dynamic MSF}\id{dynmsfcorr}%
  There is a~fully dynamic MSF algorithm that works in time $\O(\log^4 n)$ amortized
  per operation for graphs on $n$~vertices.
  
@@ -727,5 +727,238 @@ Theorem \ref{dyncon} and $\O(\log n)$ on operations with the Sleator-Tarjan tree
  by Theorem \ref{sletar}.
  \qed
  
+%--------------------------------------------------------------------------------
+
+\section{Almost minimum trees}\id{kbestsect}%
+
+In some situations, finding the single minimum spanning tree is not enough and we are interested
+in the $K$~lightest spanning trees, usually for some small value of~$K$. Katoh, Ibaraki
+and Mine \cite{katoh:kmin} have given an~algorithm of time complexity $\O(m\log\beta(m,n) + Km)$,
+building on the MST algorithm of Gabow et al.~\cite{gabow:mst}.
+Subsequently, Eppstein \cite{eppstein:ksmallest} has discovered an~elegant preprocessing step which allows to reduce
+the running time to $\O(m\log\beta(m,n) + \min(K^2,Km))$ by eliminating edges
+which are either present in all $K$ trees or in none of them.
+We will show a~variant of their algorithm based on the MST verification
+procedure of Section~\ref{verifysect}.
+
+In this section, we will require the edge weights to be real numbers (or integers), because
+comparisons are certainly not enough to determine the second best spanning tree. We will
+assume that our computation model is able to add, subtract and compare the edge weights
+in constant time.
+
+Let us focus on finding the second best spanning tree first.
+
+\paran{Second best spanning tree}%
+Suppose that we have a~weighted graph~$G$ and a~sequence $T_1,\ldots,T_z$ of all its spanning
+trees. Also suppose that the weights of these spanning trees are distinct and that the sequence
+is ordered by weight, i.e., $w(T_1) < \ldots < w(T_z)$ and $T_1 = \mst(G)$. Let us observe
+that each tree is similar to at least one of its predecessors:
+
+\lemma\id{kbl}%
+For each $i>1$ there exists $j<i$ such that $T_i$ and~$T_j$ differ by a~single edge exchange.
+
+\proof
+We know from the Monotone exchange lemma (\ref{monoxchg}) that $T_1$ can be transformed
+to~$T_i$ by a~sequence of edge exchanges which never decreases tree weight. The last
+exchange in this sequence therefore obtains~$T_i$ from a~tree of the desired properties.
+\qed
+
+\para
+This lemma implies that the second best spanning tree~$T_2$ differs from~$T_1$ by a~single
+edge exchange and it remains to find which exchange it is. Let us consider the exchange
+of an~edge $f\in E\setminus T_1$ with an~edge $e\in T_1[f]$. We get a~tree $T_1-e+f$
+of weight $w(T_1)-w(e)+w(f)$. To obtain~$T_2$, we have to find~$e$ and~$f$ such that the
+difference $w(f)-w(e)$ is the minimum possible. Thus for every~$f$, the edge $e$~must be always
+the heaviest on the path $T_1[f]$. We can now use the algorithm from Corollary \ref{rampeaks}
+to find the heaviest edges (peaks) of all such paths and examine all possible choices of~$f$
+in linear time. This implies the following:
+
+\lemma
+Given~$G$ and~$T_1$, we can find~$T_2$ in time $\O(m)$.
+
+\paran{Third best spanning tree}%
+When we know~$T_1$ and~$T_2$, how to get~$T_3$? According to Lemma \ref{kbl}, $T_3$~can be
+obtained by a~single exchange from either~$T_1$ or~$T_2$. Therefore we need to find the
+best exchange for~$T_2$ and the second best exchange for~$T_1$ and use the better of them.
+The latter is not easy to find directly, so we will make a~minor side step.
+
+We know that $T_2=T_1-e+f$ for some edges $e$ and~$f$. We define two auxiliary graphs:
+$G_1 := G\sgc e$ ($G$~with the edge~$e$ contracted) and $G_2 := G-e$. The tree~$T_1\sgc e$ is
+obviously the MST of~$G_1$ (by the Contraction lemma) and $T_2$ is the MST of~$G_2$ (all
+$T_2$-light edges in~$G_2$ would be $T_1$-light in~$G$).
+
+\obs
+The tree $T_3$~can be obtained by a~single edge exchange in either $(G_1,T_1\sgc e)$ or $(G_2,T_2)$:
+
+\itemize\ibull
+\:If $T_3 = T_1-e'+f'$ for $e'\ne e$, then $T_3\sgc e = (T_1\sgc e)-e'+f'$ in~$G_1$.
+\:If $T_3 = T_1-e+f'$, then $T_3 = T_2 - f + f'$ in~$G_2$.
+\:If $T_3 = T_2-e'+f'$, then this exchange is found in~$G_2$.
+\endlist
+
+\>Conversely, a~single exchange in $(G_1,T_1\sgc e)$ or in $(G_2,T_2)$ corresponds
+to an~exchange in either~$(G,T_1)$ or $(G,T_2)$.
+Even stronger, a~spanning tree~$T$ of~$G$ either contains~$e$ and then $T\sgc
+e$ is a~spanning tree of~$G_1$, or $T$~doesn't contain~$e$ and so it is
+a~spanning tree of~$G_2$.
+
+Thus we can run the previous algorithm for finding the best edge exchange
+on both~$G_1$ and~$G_2$ and find~$T_3$ again in time $\O(m)$.
+
+\paran{Further spanning trees}%
+The construction of auxiliary graphs can be iterated to obtain $T_1,\ldots,T_K$
+for an~arbitrary~$K$. We will build a~\df{meta-tree} of auxilary graphs. Each node of this meta-tree
+is assigned a~minor of~$G$ and its minimum spanning tree. The root node contains~$(G,T_1)$,
+its sons have $(G_1,T_1\sgc e)$ and $(G_2,T_2)$. When $T_3$ is obtained by an~exchange
+in one of these sons, we attach two new leaves to that son and we assign the two auxiliary
+graphs derived by contracting or deleting the exchanged edge. Then we find the best
+edge exchanges among the new sons and repeat the process. By the above observation,
+each spanning tree of~$G$ is generated exactly once. Lemma \ref{kbl} guarantees that
+the trees are enumerated in the increasing order.
+
+Recalculating the best exchanges in all leaves of the meta-tree after generating each~$T_i$
+is of course not necessary, because most leaves stay unchanged. We will rather remember
+the best exchange for each leaf and keep their values in a~heap. In every step, we will
+delete the minimum from the heap and use the exchange in the particular leaf to generate
+a~new tree. Then we will create the new leaves, calculate their best exchanges and insert
+them into the heap. The algorithm is now straightforward and so will be its analysis:
+
+\algn{Finding $K$ best spanning trees}\id{kbestalg}%
+\algo
+\algin A~weighted graph~$G$, its MST~$T_1$ and an~integer $K>0$.
+\:$R\=$ a~meta tree whose vertices carry triples $(G',T',F')$. Initially
+  it contains just a~root with $(G,T_1,\emptyset)$.
+  \hfil\break
+  \cmt{$G'$ is a~minor of~$G$, $T'$~is its MST, and~$F'$ is a~set of edges of~$G$
+  that are contracted in~$G'$.}
+\:$H\=$ a~heap of quadruples $(\delta,r,e,f)$ ordered on~$\delta$, initially empty.
+  \hfil\break
+  \cmt{Each quadruple describes an~exchange of~$e$ for~$f$ in a~leaf~$r$ of~$R$ and $\delta=w(f)-w(e)$
+  is the weight gain of this exchange.}
+\:Find the best edge exchange in~$(G,T_1)$ and insert it to~$H$.
+\:$i\= 1$.
+\:While $i<K$:
+\::Delete the minimum quadruple $(\delta,r,e,f)$ from~$H$.
+\::$(G',T',F') \=$ the triple carried by the leaf~$r$.
+\::$i\=i+1$.
+\::$T_i\=(T'-e+f) \cup F'$. \cmt{The next spanning tree}
+\::$r_1\=$ a~new leaf carrying $(G'\sgc e,T'\sgc e,F'+e)$.
+\::$r_2\=$ a~new leaf carrying $(G'-e,T_i,F')$.
+\::Attach~$r_1$ and~$r_2$ as sons of~$r$.
+\::Find the best edge exchanges in~$r_1$ and~$r_2$ and insert them to~$H$.
+\algout The spanning trees $T_2,\ldots,T_K$.
+\endalgo
+
+\lemma\id{kbestl}%
+Given~$G$ and~$T_1$, we can find $T_2,\ldots,T_K$ in time $\O(Km + K\log K)$.
+
+\proof
+Generating each~$T_i$ requires finding the best exchange for two graphs and $\O(1)$
+operations on the heap. The former takes $\O(m)$ according to Corollary \ref{rampeaks},
+and each heap operation takes $\O(\log K)$.
+\qed
+
+\paran{Arbitrary weights}%
+While the assumption that the weights of all spanning trees are distinct has helped us
+in thinking about the problem, we should not forget that it is somewhat unrealistic.
+We could refine the proof of our algorithm and demonstrate that the algorithm indeed works
+without this assumption, but we will rather show that the ties can be broken easily.
+
+Let~$\delta$ be the minimum positive difference of weights of spanning trees
+of~$G$ and $e_1,\ldots,e_m$ be the edges of~$G$. We observe that it suffices to
+increase $w(e_i)$ by~$\delta_i = \delta/2^{i+1}$. The cost of every spanning tree
+has increased by at most $\sum_i\delta_i < \delta/2$, so if $T$~was lighter
+than~$T'$, it still is. On the other hand, the no two trees share the same
+weight difference, so all tree weights are now distinct.
+
+The exact value of~$\delta$ is not easy to calculate, but examination of the algorithm
+reveals that it is not needed at all. The only place where the edge weights are examined
+is when we search for the best exchange. In this case, we compare the differences of
+pairs of edge weights with each other. Each such difference is therefore adjusted
+by $\delta\cdot(2^{-i}-2^{-j})$ for some $i,j>1$, which again does not influence comparison
+of originally distinct differences. If the differences were the same, it is sufficient
+to look at their values of~$i$ and~$j$, i.e., at the identifiers of the edges.
+
+\paran{Invariant edges}%
+Our algorithm can be further improved for small values of~$K$ (which seems to be the common
+case in most applications) by the reduction of Eppstein \cite{eppstein:ksmallest}.
+We will observe that there are many edges of~$T_1$
+which are guaranteed to be contained in $T_2,\ldots,T_K$ as well, and likewise there are
+many edges of $G\setminus T_1$ which are certainly not present in those spanning trees.
+The idea is the following (we again assume that the tree weights are distinct):
+
+\defn
+For an~edge $e\in T_1$, we define its \df{gain} $g(e)$ as the minimum weight gained by an~edge exchange
+replacing~$e$. Similarly, we define $G(f)$ for $f\not\in T_1$. Put formally:
+$$\eqalign{
+g(e) &:= \min\{ w(f)-w(e) \mid f\in E, e\in T[f] \} \cr
+G(f) &:= \min\{ w(f)-w(e) \mid e\in E, e\in T[f] \}.\cr
+}$$
+
+\lemma\id{gaina}%
+When $t_1,\ldots,t_{n-1}$ are the edges of~$T_1$ in order of increasing gain,
+the edges $t_K,\ldots,t_n$ are present in all trees $T_2,\ldots,T_K$.
+
+\proof
+The best exchanges in~$T_1$ involving $t_1,\ldots,t_{K-1}$ produce~$K-1$ spanning trees
+of increasing weights. Any exchange involving $t_K,\ldots,t_n$ produces a~tree
+which is heavier or equal than those. (We are ascertained by the Monotone exchange lemma
+that the gain of such exchanges cannot be reverted by any later exchanges.)
+\qed
+
+\lemma\id{gainb}%
+When $q_1,\ldots,q_{m-n+1}$ are the edges of $G\setminus T_1$ in order of increasing gain,
+the edges $q_K,\ldots,q_{m-n+1}$ are not present in any of $T_2,\ldots,T_K$.
+
+\proof
+Similar to the previous lemma.
+\qed
+
+\para
+It is therefore sufficient to find $T_2,\ldots,T_K$ in the graph obtained from~$G$ by
+contracting the edges $t_K,\ldots,t_n$ and deleting $q_K,\ldots,q_{m-n+1}$. This graph
+has only $\O(K)$ vertices and $\O(K)$ edges. The only remaining question is how to
+calculate the gains. For edges outside~$T_1$, it again suffices to find the peaks of the
+covered paths. The gains of MST edges require a~different algorithm, but Tarjan \cite{tarjan:applpc}
+has shown that they can be obtained in time $\O(m\timesalpha(m,n))$.
+
+When we put the results of this section together, we obtain:
+
+\thmn{Finding $K$ best spanning trees}\id{kbestthm}%
+For a~given graph~$G$ with real edge weights, the $K$~best spanning trees can be found
+in time $\O(m\timesalpha(m,n) + \min(K^2,Km + K\log K))$.
+
+\proof
+First we find the MST of~$G$ in time $\O(m\timesalpha(m,n))$ using the Pettie's Optimal
+MST algorithm (Theorem \ref{optthm}). Then we calculate the gains of MST edges by the
+Tarjan's algorithm from \cite{tarjan:applpc}, again in $\O(m\timesalpha(m,n))$, and
+the gains of the other edges using our MST verification algorithm (Corollary \ref{rampeaks})
+in $\O(m)$. We Lemma \ref{gaina} to identify edges that are required, and Lemma \ref{gainb}
+to find edges that are useless. We contract the former edges, remove the latter ones
+and run Algorithm \ref{kbestalg} to find the trees. By Lemma \ref{kbestl}, it runs in
+the desired time.
+
+If~$K\ge m$, this reduction does not pay off, so we run Algorithm \ref{kbestalg}
+directly on the input graph.
+\qed
+
+\paran{Improvements}%
+It is an~interesting open question whether the algorithms of Section \ref{verifysect} can
+be modified to calculate all gains. The main procedure can, but it requires to reduce
+the input to a~balance tree first and here the Bor\o{u}vka trees fail. The Buchsbaum's
+Pointer-Machine algorithm (\ref{pmverify}) is more promising.
+
+\paran{Large~$K$}%
+When $K$~is large, re-running the verification algorithm for every change of the graph
+is too costly. Frederickson \cite{frederickson:ambivalent} has shown how to find the best
+swaps dynamically, reducing the overall time complexity of Algorithm \ref{kbestalg}
+to $\O(Km^{1/2})$ and improving Theorem \ref{kbestthm} to $\O(m\timesalpha(m,n)
++ \min( K^{3/2}, Km^{1/2} ))$. It is open if the dynamic data structures of this
+chapter could be modified to bring the complexity of finding the next tree down
+to polylogarithmic.
+
+\paran{Multiple minimum trees}%
+Another nice application of Theorem \ref{kbestthm} is finding all minimum spanning
+trees in a~graph that does not have distinct edge weights.
  
  \endpart