From 64d76c5fafb26579daa94b141e057e0ed88cb64a Mon Sep 17 00:00:00 2001 From: Martin Mares Date: Sun, 27 Apr 2008 21:42:35 +0200 Subject: [PATCH] Corrections of the K-best section. --- dyn.tex | 109 +++++++++++++++++++++++++++++++------------------------- 1 file changed, 60 insertions(+), 49 deletions(-) diff --git a/dyn.tex b/dyn.tex index 7197b22..221472c 100644 --- a/dyn.tex +++ b/dyn.tex @@ -742,19 +742,19 @@ We will show a~variant of their algorithm based on the MST verification procedure of Section~\ref{verifysect}. In this section, we will require the edge weights to be real numbers (or integers), because -comparisons are certainly not enough to determine the second best spanning tree. We will +comparisons are certainly not sufficient to determine the second best spanning tree. We will assume that our computation model is able to add, subtract and compare the edge weights in constant time. -Let us focus on finding the second best spanning tree first. +Let us focus on finding the second lightest spanning tree first. -\paran{Second best spanning tree}% +\paran{Second lightest spanning tree}% Suppose that we have a~weighted graph~$G$ and a~sequence $T_1,\ldots,T_z$ of all its spanning trees. Also suppose that the weights of these spanning trees are distinct and that the sequence is ordered by weight, i.e., $w(T_1) < \ldots < w(T_z)$ and $T_1 = \mst(G)$. Let us observe that each tree is similar to at least one of its predecessors: -\lemma\id{kbl}% +\lemman{Difference lemma}\id{kbl}% For each $i>1$ there exists $jConversely, a~single exchange in $(G_1,T_1\sgc e)$ or in $(G_2,T_2)$ corresponds +\>Conversely, a~single exchange in $(G_1,T_1/e)$ or in $(G_2,T_2)$ corresponds to an~exchange in either~$(G,T_1)$ or $(G,T_2)$. Even stronger, a~spanning tree~$T$ of~$G$ either contains~$e$ and then $T\sgc e$ is a~spanning tree of~$G_1$, or $T$~doesn't contain~$e$ and so it is @@ -807,20 +807,20 @@ on both~$G_1$ and~$G_2$ and find~$T_3$ again in time $\O(m)$. \paran{Further spanning trees}% The construction of auxiliary graphs can be iterated to obtain $T_1,\ldots,T_K$ -for an~arbitrary~$K$. We will build a~\df{meta-tree} of auxilary graphs. Each node of this meta-tree +for an~arbitrary~$K$. We will build a~\df{meta-tree} of auxiliary graphs. Each node of this meta-tree is assigned a~minor of~$G$ and its minimum spanning tree. The root node contains~$(G,T_1)$, -its sons have $(G_1,T_1\sgc e)$ and $(G_2,T_2)$. When $T_3$ is obtained by an~exchange -in one of these sons, we attach two new leaves to that son and we assign the two auxiliary +its sons have $(G_1,T_1/e)$ and $(G_2,T_2)$. When $T_3$ is obtained by an~exchange +in one of these sons, we attach two new leaves to that son and we assign to them the two auxiliary graphs derived by contracting or deleting the exchanged edge. Then we find the best -edge exchanges among the new sons and repeat the process. By the above observation, -each spanning tree of~$G$ is generated exactly once. Lemma \ref{kbl} guarantees that +edge exchanges among all leaves of the new meta-tree and repeat the process. By Observation \ref{tbobs}, +each spanning tree of~$G$ is generated exactly once. The Difference lemma guarantees that the trees are enumerated in the increasing order. Recalculating the best exchanges in all leaves of the meta-tree after generating each~$T_i$ is of course not necessary, because most leaves stay unchanged. We will rather remember -the best exchange for each leaf and keep their values in a~heap. 
In every step, we will +the best exchange for each leaf and keep the weight differences of these exchanges in a~heap. In every step, we will delete the minimum from the heap and use the exchange in the particular leaf to generate -a~new tree. Then we will create the new leaves, calculate their best exchanges and insert +a~new spanning tree. Then we will create the new leaves, calculate their best exchanges and insert them into the heap. The algorithm is now straightforward and so will be its analysis: \algn{Finding $K$ best spanning trees}\id{kbestalg}% @@ -842,7 +842,7 @@ them into the heap. The algorithm is now straightforward and so will be its anal \::$(G',T',F') \=$ the triple carried by the leaf~$r$. \::$i\=i+1$. \::$T_i\=(T'-e+f) \cup F'$. \cmt{The next spanning tree} -\::$r_1\=$ a~new leaf carrying $(G'\sgc e,T'\sgc e,F'+e)$. +\::$r_1\=$ a~new leaf carrying $(G'/e,T'/e,F'+e)$. \::$r_2\=$ a~new leaf carrying $(G'-e,T_i,F')$. \::Attach~$r_1$ and~$r_2$ as sons of~$r$. \::Find the best edge exchanges in~$r_1$ and~$r_2$ and insert them to~$H$. @@ -864,19 +864,19 @@ in thinking about the problem, we should not forget that it is somewhat unrealis We could refine the proof of our algorithm and demonstrate that the algorithm indeed works without this assumption, but we will rather show that the ties can be broken easily. -Let~$\delta$ be the minimum positive difference of weights of spanning trees +Let~$\delta$ be the minimum positive difference among the weights of all spanning trees of~$G$ and $e_1,\ldots,e_m$ be the edges of~$G$. We observe that it suffices to increase $w(e_i)$ by~$\delta_i = \delta/2^{i+1}$. The cost of every spanning tree has increased by at most $\sum_i\delta_i < \delta/2$, so if $T$~was lighter -than~$T'$, it still is. On the other hand, the no two trees share the same +than~$T'$, it still is. On the other hand, no two trees share the same weight difference, so all tree weights are now distinct. -The exact value of~$\delta$ is not easy to calculate, but examination of the algorithm +The exact value of~$\delta$ is not easy to calculate, but closer inspection of the algorithm reveals that it is not needed at all. The only place where the edge weights are examined is when we search for the best exchange. In this case, we compare the differences of pairs of edge weights with each other. Each such difference is therefore adjusted by $\delta\cdot(2^{-i}-2^{-j})$ for some $i,j>1$, which again does not influence comparison -of originally distinct differences. If the differences were the same, it is sufficient +of originally distinct differences. If two differences were identical, it is sufficient to look at their values of~$i$ and~$j$, i.e., at the identifiers of the edges. \paran{Invariant edges}% @@ -884,25 +884,25 @@ Our algorithm can be further improved for small values of~$K$ (which seems to be case in most applications) by the reduction of Eppstein \cite{eppstein:ksmallest}. We will observe that there are many edges of~$T_1$ which are guaranteed to be contained in $T_2,\ldots,T_K$ as well, and likewise there are -many edges of $G\setminus T_1$ which are certainly not present in those spanning trees. -The idea is the following (we again assume that the tree weights are distinct): +many edges of $G\setminus T_1$ which are excluded from those spanning trees. +The idea is the following (again assuming that the tree weights are distinct): \defn -For an~edge $e\in T_1$, we define its \df{gain} $g(e)$ as the minimum weight gained by an~edge exchange -replacing~$e$. 
Similarly, we define $G(f)$ for $f\not\in T_1$. Put formally: +For an~edge $e\in T_1$, we define its \df{gain} $g(e)$ as the minimum weight gained by exchanging~$e$ +for another edge. Similarly, we define the gain $G(f)$ for $f\not\in T_1$. Put formally: $$\eqalign{ g(e) &:= \min\{ w(f)-w(e) \mid f\in E, e\in T[f] \} \cr -G(f) &:= \min\{ w(f)-w(e) \mid e\in E, e\in T[f] \}.\cr +G(f) &:= \min\{ w(f)-w(e) \mid e\in T[f] \}.\cr }$$ \lemma\id{gaina}% When $t_1,\ldots,t_{n-1}$ are the edges of~$T_1$ in order of increasing gain, -the edges $t_K,\ldots,t_n$ are present in all trees $T_2,\ldots,T_K$. +the edges $t_K,\ldots,t_{n-1}$ are present in all trees $T_2,\ldots,T_K$. \proof The best exchanges in~$T_1$ involving $t_1,\ldots,t_{K-1}$ produce~$K-1$ spanning trees of increasing weights. Any exchange involving $t_K,\ldots,t_n$ produces a~tree -which is heavier or equal than those. (We are ascertained by the Monotone exchange lemma +which is heavier or equal than all those trees. (We are ascertained by the Monotone exchange lemma that the gain of such exchanges cannot be reverted by any later exchanges.) \qed @@ -917,15 +917,15 @@ Similar to the previous lemma. \para It is therefore sufficient to find $T_2,\ldots,T_K$ in the graph obtained from~$G$ by contracting the edges $t_K,\ldots,t_n$ and deleting $q_K,\ldots,q_{m-n+1}$. This graph -has only $\O(K)$ vertices and $\O(K)$ edges. The only remaining question is how to +has only $\O(K)$ vertices and $\O(K)$ edges. The only remaining hurdle is how to calculate the gains. For edges outside~$T_1$, it again suffices to find the peaks of the covered paths. The gains of MST edges require a~different algorithm, but Tarjan \cite{tarjan:applpc} -has shown that they can be obtained in time $\O(m\timesalpha(m,n))$. +has shown how to obtain them in time $\O(m\timesalpha(m,n))$. -When we put the results of this section together, we obtain: +When we put the results of this section together, we can conclude: -\thmn{Finding $K$ best spanning trees}\id{kbestthm}% -For a~given graph~$G$ with real edge weights, the $K$~best spanning trees can be found +\thmn{Finding $K$ lightest spanning trees}\id{kbestthm}% +For a~given graph~$G$ with real edge weights and a~positive integer~$K$, the $K$~best spanning trees can be found in time $\O(m\timesalpha(m,n) + \min(K^2,Km + K\log K))$. \proof @@ -933,9 +933,9 @@ First we find the MST of~$G$ in time $\O(m\timesalpha(m,n))$ using the Pettie's MST algorithm (Theorem \ref{optthm}). Then we calculate the gains of MST edges by the Tarjan's algorithm from \cite{tarjan:applpc}, again in $\O(m\timesalpha(m,n))$, and the gains of the other edges using our MST verification algorithm (Corollary \ref{rampeaks}) -in $\O(m)$. We Lemma \ref{gaina} to identify edges that are required, and Lemma \ref{gainb} -to find edges that are useless. We contract the former edges, remove the latter ones -and run Algorithm \ref{kbestalg} to find the trees. By Lemma \ref{kbestl}, it runs in +in $\O(m)$. We use Lemma \ref{gaina} to identify edges that are required, and Lemma \ref{gainb} +to find edges that are superfluous. We contract the former edges, remove the latter ones +and run Algorithm \ref{kbestalg} to find the spanning trees. By Lemma \ref{kbestl}, it runs in the desired time. If~$K\ge m$, this reduction does not pay off, so we run Algorithm \ref{kbestalg} @@ -944,21 +944,32 @@ directly on the input graph. 
 \paran{Improvements}%
 It is an~interesting open question whether the algorithms of Section \ref{verifysect} can
-be modified to calculate all gains. The main procedure can, but it requires to reduce
-the input to a~balance tree first and here the Bor\o{u}vka trees fail. The Buchsbaum's
-Pointer-Machine algorithm (\ref{pmverify}) is more promising.
+be modified to calculate all gains in linear time. The main procedure could be, but it requires having
+the input reduced to a~balanced tree beforehand and here the Bor\o{u}vka trees fail. The Buchsbaum's
+Pointer-Machine algorithm (\ref{pmverify}) seems to be more promising.

 \paran{Large~$K$}%
 When $K$~is large, re-running the verification algorithm for every change of the graph
 is too costly. Frederickson \cite{frederickson:ambivalent} has shown how to find the best
 swaps dynamically, reducing the overall time complexity of Algorithm \ref{kbestalg}
-to $\O(Km^{1/2})$ and improving Theorem \ref{kbestthm} to $\O(m\timesalpha(m,n)
+to $\O(Km^{1/2})$ and improving the bound in Theorem \ref{kbestthm} to $\O(m\timesalpha(m,n)
 + \min( K^{3/2}, Km^{1/2} ))$. It is open if the dynamic data structures of this chapter
 could be modified to bring the complexity of finding the next tree down to polylogarithmic.

 \paran{Multiple minimum trees}%
 Another nice application of Theorem \ref{kbestthm} is finding all minimum spanning
-trees in a~graph that does not have distinct edge weights.
+trees in a~graph that does not have distinct edge weights. We find a~single MST using
+any of the algorithms of the previous chapters and then we use the enumeration algorithm
+of this section to find further spanning trees as long as their weights are minimum.
+
+We can even use the reduction of the number of edges from Lemmata \ref{gaina} and \ref{gainb}:
+we start with some fixed~$K$ and when we exhaust all~$K$ trees, we double~$K$ and restart
+the whole process. The extra time spent on these restarts is bounded by the time of the
+final pass.
+
+This finally settles the question that we have asked ourselves in Section \ref{mstbasics},
+namely whether we lose anything by assuming that all weights are distinct and searching
+for the single minimum tree.

 \endpart
-- 
2.39.2
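A small worked illustration of the tie-breaking perturbation described in the patched text, on a hypothetical triangle that is not taken from dyn.tex: let the edges $e_1,e_2,e_3$ have weights $1,1,2$, so the spanning trees weigh $2$, $3$ and~$3$, and the minimum positive difference of tree weights is $\delta=1$. Increasing each $w(e_i)$ by $\delta_i=\delta/2^{i+1}$ gives

$$\eqalign{
w'(e_1) &= 1 + 1/4 = 1.25, \qquad w'(e_2) = 1 + 1/8 = 1.125, \qquad w'(e_3) = 2 + 1/16 = 2.0625, \cr
w'(\{e_1,e_2\}) &= 2.375, \qquad w'(\{e_1,e_3\}) = 3.3125, \qquad w'(\{e_2,e_3\}) = 3.1875. \cr
}$$

All tree weights are now distinct, and since no tree gained more than $\sum_i\delta_i < \delta/2 = 0.5$, the originally lightest tree $\{e_1,e_2\}$ stays the lightest.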
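The gain definitions can likewise be checked on a hypothetical triangle with made-up (distinct) weights: let $w(e_1)=1$, $w(e_2)=2$, $w(e_3)=3$, so that $T_1=\{e_1,e_2\}$, the only non-tree edge is~$e_3$ and $T[e_3]=\{e_1,e_2\}$. Then

$$\eqalign{
G(e_3) &= \min\{\, w(e_3)-w(e_1),\ w(e_3)-w(e_2) \,\} = \min\{2,1\} = 1, \cr
g(e_2) &= w(e_3)-w(e_2) = 1, \qquad g(e_1) = w(e_3)-w(e_1) = 2. \cr
}$$

The cheapest exchange has gain~$1$ and produces $T_2 = T_1 - e_2 + e_3$ of weight~$4$; for $K=2$ the edge of larger gain, $t_2=e_1$, is indeed kept in~$T_2$, as Lemma \ref{gaina} promises.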
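The claim that the restarts of the doubling scheme cost no more than the final pass can be verified by a geometric sum. Assuming that the $\O(m\timesalpha(m,n))$ preprocessing is performed only once and that a single pass with parameter~$K'$ then costs at most $c\cdot K'^2$ (the first branch of the bound in Theorem \ref{kbestthm}), the earlier passes with $K'=K/2, K/4, \ldots$ together cost at most

$$\sum_{j\ge 1} c\cdot\left({K\over 2^j}\right)^2 = c\cdot K^2 \sum_{j\ge 1} 4^{-j} = {c\cdot K^2\over 3},$$

which is dominated by the $c\cdot K^2$ of the final pass; the other branch $K'm + K'\log K'$ sums the same way, since $\sum_{j\ge 1} 2^{-j} = 1$.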