X-Git-Url: http://mj.ucw.cz/gitweb/?a=blobdiff_plain;f=dyn.tex;h=7197b22b68b2e8cd77515bc49c8ec3d6a180589b;hb=285cf4909b333868d6f1a35cc3e695c697a58e00;hp=5a7a8793fa28960fbdc689328297ad72d556f695;hpb=5a80c7a83690193aadbb716b196231b97a5b09ca;p=saga.git diff --git a/dyn.tex b/dyn.tex index 5a7a879..7197b22 100644 --- a/dyn.tex +++ b/dyn.tex @@ -68,8 +68,8 @@ therefore also of~$F$), we have to add~$e$ to~$F$. Otherwise, one of the followi Either $e$~is $F$-heavy and so the forest~$F$ is also the MSF of the new graph. Or it is $F$-light and we have to modify~$F$ by exchanging the heaviest edge~$f$ of the path $F[e]$ with~$e$. -Correctness of the former case follows immediately from the Theorem on Minimality by order -(\ref{mstthm}), because any $F'$-light would be also $F$-light, which is impossible as $F$~was +Correctness of the former case follows immediately from the Minimality Theorem (\ref{mstthm}), +because any $F'$-light would be also $F$-light, which is impossible as $F$~was minimum. In the latter case, the edge~$f$ is not contained in~$F'$ because it is the heaviest on the cycle $F[e]+e$ (by the Red lemma, \ref{redlemma}). We can now use the Blue lemma (\ref{bluelemma}) to prove that it should be replaced with~$e$. Consider the tree~$T$ @@ -185,7 +185,7 @@ from the Link-Cut tree, knowing that there is no edge to replace it). The hard p is the search for replacement edges after an~edge belonging to the MSF is deleted. This very problem also has to be solved by algorithms for fully dynamic connectivity, -we will take a~look on them first. +we will take a~look at them first. %-------------------------------------------------------------------------------- @@ -198,32 +198,36 @@ trees, but it is much simpler and instead of path operations it offers efficient subtrees. It is also possible to attach auxiliary data to vertices and edges of the original tree. \defn\id{eulseq}% -The \df{Eulerian Tour sequence} $\Eul(T)$ of a~rooted tree~$T$ is the sequence of vertices of~$T$ as visited -by the depth-first traversal of~$T$. More precisely, it is generated by the following algorithm $\(v)$ +Let~$T$ be a~rooted tree. We will call a~sequence of vertices of~$T$ its \df{Eulerian Tour sequence (ET-sequence)} +if it lists the vertices visited by the depth-first traversal of~$T$. +More precisely, it can be generated by the following procedure $\(v)$ when it is invoked on the root of the tree: \algo \:Record~$v$ in the sequence. \:For each son~$w$ of~$v$: -\::Call $\(w)$ recursively. +\::Call $\(w)$. \::Record~$w$. \endalgo -\>One of the occurrences of each vertex is defined as its \df{active occurrence} and it will -be used to store auxiliary data associated with the vertex. +\>A~single tree can have multiple ET-sequences, corresponding to different orders in which the +sons can be enumerated in step~2. + +In every ET-tree, one of the occurrences of each vertex is defined as its \df{active occurrence} and +it will be used to store auxiliary data associated with that vertex. \obs -The ET-sequence contains a~vertex of degree~$d$ exactly $d$~times except for the root which +An~ET-sequence contains a~vertex of degree~$d$ exactly $d$~times except for the root which occurs $d+1$ times. The whole sequence therefore contains $2n-1$ elements. It indeed describes the order vertices on an~Eulerian tour in the tree with all edges doubled. Let us observe what happens -to the ET-sequence when we modify the tree. +to an~ET-sequence when we modify the tree. When we \em{delete} an~edge $uv$ from the tree~$T$ (let $u$~be the parent of~$v$), the sequence -$\Eul(T) = AuvBvuC$ (with no~$u$ nor~$v$ in~$B$) splits to two ET-sequences $AuC$ and $vBv$. -If there was only a~single occurrence of~$v$, it corresponded to a~leaf and thus the second -sequence should consist of $v$~alone. +$AuvBvuC$ (with no~$u$ nor~$v$ in~$B$) splits to two sequences $AuC$ and $vBv$. +If there was only a~single occurrence of~$v$, then $v$~was a~leaf and thus the sequence +transforms from $AuvuC$ to $AuC$ and $v$~alone. -\em{Changing the root} of the tree~$T$ from~$v$ to~$w$ changes $\Eul(T)$ from $vAwBwCv$ to $wBwCvAw$. +\em{Changing the root} of the tree~$T$ from~$v$ to~$w$ changes its ET-sequence from $vAwBwCv$ to $wBwCvAw$. If $w$~was a~leaf, the sequence changes from $vAwCv$ to $wCvAw$. If $vw$ was the only edge of~$T$, -the sequence $vw$ becomes $wv$. Note that this works regardless of the presence of~$w$ inside~$B$. +the sequence $vw$ becomes $wv$. Note that this works regardless of the possible presence of~$w$ inside~$B$. \em{Joining} the roots of two trees by a~new edge makes their ET-sequences $vAv$ and~$wBw$ combine to $vAvwBwv$. Again, we have to handle the cases when $v$ or~$w$ has degree~1 separately: @@ -233,64 +237,66 @@ $v$~and~$wBw$ combine to $vwBwv$, and $v$~with~$w$ makes $vwv$. \hbox{\epsfbox{pic/ettree.eps}}\cr \noalign{\qquad\quad} \halign{#\hfil\cr - $\Eul(T_1) = 0121034546474308980$,~~$\Eul(T_2) = aba$. \cr - $\Eul(T_1-34) = 01210308980, 4546474$. \cr - $\Eul(T_1\hbox{~rooted at~3}) = 3454647430898012103$. \cr - $\Eul(T_1+0a+T_2)$: $0121034546474308980aba0$. \cr + $T_1: 0121034546474308980$,~~$T_2: aba$. \cr + $T_1-34: 01210308980, 4546474$. \cr + $T_1\hbox{~rooted at~3}: 3454647430898012103$. \cr + $T_1+0a+T_2$: $0121034546474308980aba0$. \cr }\cr }}{Trees and their ET-sequences} If any of the occurrences that we have removed from the sequence was active, there is always a~new occurrence of the same vertex that can stand in its place and inherit the auxiliary data. -The ET-trees will represent the ET-sequences by $(a,b)$-trees with the parameter~$a$ set upon +The ET-trees will store the ET-sequences as $(a,b)$-trees with the parameter~$a$ set upon initialization of the structure and with $b=2a$. We know from the standard theorems of $(a,b)$-trees (see for example \cite{clrs}) that the depth of a~tree with $n$~leaves is always $\O(\log_a n)$ and that all basic operations including insertion, deletion, search, splitting and joining the trees -run in time $\O(a\log_a n)$ in the worst case. +run in time $\O(b\log_a n)$ in the worst case. -We will use the ET-trees to maintain a~spanning forest of the current graph. The auxiliary data of -each vertex will hold a~list of edges incident with the given vertex, which do not lie in the +We will use the ET-trees to maintain a~spanning forest of the dynamic graph. The auxiliary data of +each vertex will hold a~list of edges incident with the given vertex, that do not lie in the forest. Such edges are usually called the \df{non-tree edges.} \defn -\df{Eulerian Tour trees} are a~data structure that represents a~forest of trees and a~set of non-tree +\df{Eulerian Tour trees (ET-trees)} are a~data structure that represents a~forest of trees and a~set of non-tree edges associated with the vertices of the forest. To avoid confusion, we will distinguish between -\df{original} vertices and edges (of the original trees) and the vertices and edges of the +\df{original} vertices and edges (of the given trees) and the vertices and edges of the data structure. The structure consists of: \itemize\ibull -\:A~collection of $(a,b)$-trees with some fixed parameters $a$ and~$b$. +\:A~collection of $(a,b)$-trees of some fixed parameters $a$ and~$b$. Each such tree corresponds to one of the original trees~$T$. Its leaves (in the usual tree order) correspond to the elements - of the ET-sequence $\Eul(T)$. Each two consecutive leaves $u$ and~$v$ are separated + of an~ET-sequence for~$T$. Each two consecutive leaves $u$ and~$v$ are separated by a~unique key stored in an~internal vertex of the $(a,b)$-tree. This key is used to represent the original edge~$uv$. Each original edge is therefore kept in both its orientations. -\:A~mapping $\(v)$ that maps each original vertex to the leaf containing its active occurrence. -\:A~mapping $\(e)$ that maps an~original edge~$e$ to one of the internal keys representing it. -\:A~mapping $\(k)$ that maps an~internal key~$k$ to the other internal key of the same - original edge. +\:Mappings \, \ and \: + \itemize\icirc + \:$\(v)$ maps each original vertex to the leaf containing its active occurrence; + \:$\(e)$ of an~original edge~$e$ is one of the internal keys representing~it; + \:$\(k)$ pairs an~internal key~$k$ with the other internal key of the same original edge. + \endlist \:A~list of non-tree edges placed in each leaf. The lists are allowed to be non-empty only in the leaves that represent active occurrences of original vertices. \:Boolean \df{markers} in the internal vertices that signal presence of a~non-tree - edge anywhere in the subtree rooted at that internal vertex. + edge anywhere in the subtree rooted at the internal vertex. \:Counters $\(v)$ that contain the number of leaves in the subtree rooted at~$v$. \endlist -\>The structure supports the following operations on the original trees: + +\defn +The ET-trees support the following operations on the original trees: \itemize\ibull \:\ --- Create a~single-vertex tree. \:$\(u,v)$ --- Join two different trees by an~edge~$uv$ and return a~unique identifier of this edge. \:$\(e)$ --- Split a~tree by removing the edge~$e$ given by its identifier. -\:$\(v)$ --- Return the root of the ET-tree containing the vertex~$v$. This can be used - to test whether two vertices lie in the same tree. However, the root is not guaranteed - to stay the same when the tree is modified by a~\ or \. +\:$\(u,v)$ --- Test if the vertices $u$ and~$v$ lie in the same tree. \:$\(v)$ --- Return the number of vertices in the tree containing the vertex~$v$. \:$\(v,e)$ --- Add a~non-tree edge~$e$ to the list at~$v$ and return a~unique identifier of this edge. \:$\(e)$ --- Delete a~non-tree edge~$e$ given by its identifier. -\:$\(v)$ --- Return non-tree edges stored in the same tree as~$v$. +\:$\(v)$ --- Return a~list of non-tree edges associated with the vertices + of the $v$'s tree. \endlist -\>If the non-tree edges have weights, \impl We will implement the operations on the ET-trees by translating the intended changes of the @@ -302,12 +308,12 @@ the $(a,b)$-trees. and joins them back in the different order. \ of two trees can be accomplished by making both vertices the roots of their trees first -and joining the roots by an~edge afterwards. Re-rooting again involves splits and joins of $(a,b)$-trees. +and joining the roots by an~edge afterwards. Re-rooting involves splits and joins of $(a,b)$-trees. As we can split at any occurrence of the new root vertex, we will use the active occurrence -which we remember. Joining of the roots translated to joining of the $(a,b)$-trees. +which we remember. Linking of the roots is translated to joining of the $(a,b)$-trees. -\ just follows parent pointers in the $(a,b)$-tree and it walks the path from the leaf -to the root. +\ follows parent pointers from both $u$ and~$v$ to the roots of their trees. +Then it checks if the roots are equal. \ finds the root and returns its counter. @@ -322,17 +328,17 @@ counter and, if necessary, the markers and counters on the path to the root. \ traverses the tree recursively from the root, but it does not enter the subtrees whose roots are not marked. -Analysis of the time complexity is now straightforward: +Analysis of time complexity of the operations is now straightforward: \thmn{Eulerian Tour trees, Henzinger and Rauch \cite{henzinger:randdyn}}\id{etthm}% The ET-trees perform the operations \ and \ in time $\O(a\log_a n)$, \ -in $\O(1)$, \, \, and \ in $\O(\log_a n)$, and +in $\O(1)$, \, \, \, and \ in $\O(\log_a n)$, and \ in $\O(a\log_a n)$ per edge reported. Here $n$~is the number of vertices in the original forest and $a\ge 2$ is an~arbitrary constant. \proof We set $b=2a$. Our implementation performs $\O(1)$ operations on the $(a,b)$-trees -per operation on the ET-tree, plus $\O(1)$ other operations. We apply the standard theorems +per operation on the ET-tree, plus $\O(1)$ other work. We apply the standard theorems on the complexity of $(a,b)$-trees \cite{clrs}. \qed @@ -342,18 +348,18 @@ of~$n$ can also have its beauty. Suppose that there is a~data structure which ma spanning forest of a~dynamic graph. Suppose also that the structure works in time $\O(\log^k n)$ per operation and that it reports $\O(1)$ changes in the spanning forest for every change in the graph. If we keep the spanning forest in ET-trees with $a=\log n$, the updates of the -data structure cost an~extra $\O(\log^2 n / \log\log n)$, but queries accelerate to $\O(\log +data structure cost an~additional $\O(\log^2 n / \log\log n)$, but connectivity queries accelerate to $\O(\log n/\log\log n)$. \paran{ET-trees with weights} -In some cases, we will also need to represent weighted graphs and enumerate the non-tree +In some cases, we will also need a~representation of weighted graphs and enumerate the non-tree edges in order of their increasing weights (in fact, it will be sufficient to find the lightest one, remove it and iterate). This can be handled by a~minute modification of the ET-trees. The tree edges will remember their weight in the corresponding internal keys of the ET-tree. We replace each list of non-tree edges by an~$(a,b)$-tree keeping the edges sorted by weight. -We also store the minimum element of the tree separately, so that it can be accessed in constant +We also store the minimum element of that tree separately, so that it can be accessed in constant time. The boolean \em{marker} will then become the minimum weight of a~non-tree edge attached to the particular subtree, which can be recalculated as easy as the markers can. Searching for the lightest non-tree edge then just follows the modified markers. @@ -373,7 +379,7 @@ The time bounds in Theorem \ref{etthm} hold for the weighted ET-trees, too. \section{Dynamic connectivity} The fully dynamic connectivity problem has a~long and rich history. In the 1980's, Frederickson \cite{frederickson:dynamic} -has used his topological trees to construct a~dynamic connectivity algorithm of complexity $\O(\sqrt m)$ per update +has used his topological trees to construct a~dynamic connectivity algorithm of complexity $\O(\sqrt m)$ per update and $\O(1)$ per query. Eppstein et al.~\cite{eppstein:sparsify} have introduced a~sparsification technique which can bring the updates down to $\O(\sqrt n)$. Later, several different algorithms with complexity on the order of $n^\varepsilon$ were presented by Henzinger and King \cite{henzinger:mst} and also by Mare\v{s} \cite{mares:dga}. @@ -395,39 +401,43 @@ We will maintain the following \em{invariants:} {\narrower \def\iinv{{\bo I\the\itemcount~}} \numlist\iinv -\:$F$~is the maximum spanning forest of~$G$ with respect to the levels. In other words, -if $uv$ is a~non-tree edge, then $u$ and~$v$ are connected in~$F_{\ell(uw)}$. -\:For each~$i$, the components of~$F_i$ have at most $\lfloor n/2^i \rfloor$ vertices. -This implies that all~$F_i$ for $i>L$ are empty. +\:$F$~is the maximum spanning forest of~$G$ with respect to the levels. (In other words, +if $uv$ is a~non-tree edge, then $u$ and~$v$ are connected in~$F_{\ell(uv)}$.) +\:For each~$i$, the components of~$F_i$ have at most $\lfloor n/2^i \rfloor$ vertices each. +(This implies that it does not make sense to define~$F_i$ for $i>L$, because it would be empty +anyway.) \endlist } At the beginning, the graph contains no edges, so both invariants are trivially -satistifed. Newly inserted edges can enter level~0, which cannot break I1 nor~I2. +satisfied. Newly inserted edges can enter level~0, which cannot break I1 nor~I2. When we delete a~tree edge at level~$\ell$, we split a~tree~$T$ of~$F_\ell$ to two trees $T_1$ and~$T_2$. Without loss of generality, let us assume that $T_1$ is the smaller one. We will try to find the replacement edge of the highest possible -level that connects them back. From I1, we know that such an~edge cannot belong to -level greater than~$\ell$, so we start looking for it at level~$\ell$. According +level that connects the spanning tree back. From I1, we know that such an~edge cannot belong to +a~level greater than~$\ell$, so we start looking for it at level~$\ell$. According to~I2, the tree~$T$ had at most $\lfloor n/2^\ell\rfloor$ vertices, so $T_1$ has at most $\lfloor n/2^{\ell+1} \rfloor$ of them. Thus we can increase the levels of all edges of~$T_1$ without violating either invariant. -We now start enumerating the non-tree edges incident with~$T_1$. For each such edge, -we test whether its other endpoint lies in~$T_2$. If it does, we have found the replacement -edge and we insert it to~$F_\ell$. Otherwise we move the edge one level up. (This will -be the gist of our amortization argument: We can charge most of the work on level +We now start enumerating the non-tree edges incident with~$T_1$. Each such edge +is either local to~$T_1$ or it joins $T_1$ with~$T_2$. We will therefore check each edge +whether its other endpoint lies in~$T_2$ and if it does, we have found the replacement +edge, so we insert it to~$F_\ell$ and stop. Otherwise we move the edge one level up. (This +will be the grist for the mill of our amortization argument: We can charge most of the work at level increases and we know that the level of each edge can reach at most~$L$.) If the non-tree edges at level~$\ell$ are exhausted, we try the same in the next -lower level and so on. If there is no replacement edge on level~0, the tree~$T$ +lower level and so on. If there is no replacement edge at level~0, the tree~$T$ remains disconnected. \impl -We will use a~single ET-tree with~$a$ set to~2 for each level. For the $i$-th level, the ET-tree -${\cal E}_i$ will represent the forest~$F_i$ and the non-tree edges of -level~$i$ will be attached to its vertices. +For each level, we will use a~separate ET-tree ${\cal E}_\ell$ with~$a$ set to~2, +which will represent the forest~$F_i$ and the non-tree edges at that particular level. +Besides operations on the non-tree edges, we also need to find the tree edges of level~$\ell$ +when we want to bring them one level up. This can be accomplished either by modifying the ET-trees +to attach two lists of edges attached to vertices instead of one, or by using a~second ET-tree. \algn{Insertion of an~edge} \algo @@ -446,33 +456,39 @@ level~$i$ will be attached to its vertices. \:If $uv$ is a~non-tree edge: \::Remove $uv$ from the lists of non-tree edges at both $u$ and~$v$ in~${\cal E}_{\ell}$. \:Otherwise: -\::Remove $uv$ from~$F_\ell$. -\::Call $\(uv,\ell)$. +\::Remove $uv$ from~$F_\ell$ and hence also from $F_0,\ldots,F_{\ell-1}$. +\::Call $\(uv,\ell)$ to get the replacement edge~$f$. +\::Insert $f$ to~$F_0,\ldots,F_{\ell(f)}$. \endalgo -\algn{$\(uv,i)$ -- Search for an~replacement edge for~$uv$ at level~$i$} +\algn{$\(uv,i)$ -- Search for replacement for edge~$uv$ at level~$i$} \algo +\algin An~edge~$uv$ to replace and a~level~$i$ such that there is no replacement +at levels greater than~$i$. \:Let $T_1$ and~$T_2$ be the trees in~$F_i$ containing $u$ and~$v$ respectively. \:If $n(T_1) > n(T_2)$, swap $T_1$ with~$T_2$. -\:Move all level~$i$ edges in~$T_1$ to level~$i+1$ and insert them to~${\cal E}_{i+1}$. +\:Find all level~$i$ edges in~$T_1$ using ${\cal E}_i$ and move them to level~$i+1$. \:Enumerate non-tree edges incident with vertices of~$T_1$ and stored in ${\cal E}_i$. For each edge~$xy$, $x\in T_1$, do: -\::If $y\in T_2$, add the edge $xy$ to~$F_i$ and return. -\::Otherwise increase $\ell(xy)$ by one. This includes deleting~$xy$ from~${\cal E}_i$ - and inserting it to~${\cal E}_{i+1}$. +\::If $y\in T_2$, remove~$xy$ from~${\cal E}_i$ and return it to the caller. +\::Otherwise increase $\ell(xy)$ by one. + \hfil\break + This includes deleting~$xy$ from~${\cal E}_i$ and inserting it to~${\cal E}_{i+1}$. \:If $i>0$, call $\(xy,i-1)$. +\:Otherwise return \. +\algout The replacement edge. \endalgo -\>Analysis of the time complexity is straightforward: +\>As promised, time complexity will be analysed by amortization on the levels. \thmn{Fully dynamic connectivity, Holm et al.~\cite{holm:polylog}}\id{dyncon}% -Dynamic connectivity can be maintained in time $\O(\log^2 n)$ amortized for both +Dynamic connectivity can be maintained in time $\O(\log^2 n)$ amortized per \ and \ and in time $\O(\log n/\log\log n)$ per \ in the worst case. \proof The direct cost of an~\ is $\O(\log n)$ for the operations on the ET-trees -(by Theorem \ref{etthm}). We will have it pre-pay all level increases of the new +(by Theorem \ref{etthm}). We will also have the insertion pre-pay all level increases of the new edge. Since the levels never decrease, each edge can be brought a~level up at most $L=\lfloor\log n\rfloor$ times. Every increase costs $\O(\log n)$ on the ET-tree operations, so we pay $\O(\log^2 n)$ for all of them. @@ -482,19 +498,31 @@ ET-trees. Additionally, we call \ up to $L$ times. The initialization o \ costs $\O(\log n)$ per call, the rest is paid for by the edge level increases. -To bring the complexity of \ from $\O(\log n)$ down to $\O(\log n/\log\log n)$, +To bring the complexity of the operation \ from $\O(\log n)$ down to $\O(\log n/\log\log n)$, we apply the trick from Example \ref{accel} and store~$F_0$ in a~ET-tree with $a=\max(\lfloor\log n\rfloor,2)$. This does not hurt the complexity of insertions and deletions, but allows for faster queries. \qed +\rem\id{dclower}% +An~$\Omega(\log n/\log\log n)$ lower bound for the amortized complexity of the dynamic connectivity +problem has been proven by Henzinger and Fredman \cite{henzinger:lowerbounds} in the cell +probe model with $\O(\log n)$-bit words. Thorup has answered by a~faster algorithm +\cite{thorup:nearopt} that achieves $\O(\log n\log^3\log n)$ time per update and +$\O(\log n/\log^{(3)} n)$ per query on a~RAM with $\O(\log n)$-bit words. (He claims +that the algorithm runs on a~Pointer Machine, but it uses arithmetic operations, +so it does not fit the definition of the PM we use. The algorithm only does not +need direct indexing of arrays.) So far, it is not known how to extend this algorithm +to fit our needs, so we omit the details. + %-------------------------------------------------------------------------------- -\section{Dynamic MSF} +\section{Dynamic spanning forests}\id{dynmstsect}% +Let us turn our attention back to the dynamic MSF now. Most of the early algorithms for dynamic connectivity also imply $\O(n^\varepsilon)$ algorithms for dynamic maintenance of the MSF. Henzinger and King \cite{henzinger:twoec,henzinger:randdyn} have generalized their randomized connectivity algorithm to maintain the MSF in $\O(\log^5 n)$ time per -operation, or $\O(k\log^3 n)$ if only~$k$ different values of edge weights are allowed. They have solved +operation, or $\O(k\log^3 n)$ if only $k$ different values of edge weights are allowed. They have solved the decremental version of the problem first (which starts with a~given graph and only edge deletions are allowed) and then presented a~general reduction from the fully dynamic MSF to its decremental version. We will describe the algorithm of Holm, de Lichtenberg and Thorup \cite{holm:polylog}, who have followed @@ -502,12 +530,12 @@ the same path. They have modified their dynamic connectivity algorithm to solve in $\O(\log^2 n)$ and obtained the fully dynamic MSF working in $\O(\log^4 n)$ per operation. \paran{Decremental MSF}% -Turning the algorithm from the previous section to decremental MSF requires only two +Turning the algorithm from the previous section to the decremental MSF requires only two changes: First, we have to start with the forest~$F$ equal to the MSF of the initial graph. As we require to pay $\O(\log^2 n)$ for every insertion, we can use almost arbitrary MSF algorithm to find~$F$. Second, when we search for an~replacement edge, we need to pick the lightest possible choice. We will therefore use the weighted version of the ET-trees (Corollary \ref{wtet}) -and try the lightest non-tree edge incident with the examined tree first. We must ensure +and scan the lightest non-tree edge incident with the examined tree first. We must ensure that the lower levels cannot contain a~lighter replacement edge, but fortunately the light edges tend to ``bubble up'' in the hierarchy of levels. This can be formalized as the following invariant: @@ -520,73 +548,79 @@ the following invariant: \endlist } -\>This easily implies that we select the right replacement edge: +\>This immediately implies that we always select the right replacement edge: \lemma\id{msfrepl}% Let $F$~be the minimum spanning forest and $e$ any its edge. Then among all replacement -edges for~$e$, the lightest one is on the maximum level. +edges for~$e$, the lightest one is at the maximum level. \proof Let us consider any two edges $f_1$ and~$f_2$ replacing~$e$. By minimality of~$F$ and the Cycle -rule (\ref{cyclerule}), each $f_i$ is the heaviest edge on the cycle~$C_i = F[f_i] + f_i$. +rule (Lemma \ref{redlemma}), each $f_i$ is the heaviest edge on the cycle~$C_i = F[f_i] + f_i$. In a~moment, we will show that the symmetric difference~$C$ of these two cycles is again a~cycle. -This implies that if $f_1$ is heavier than~$f_2$, then by I3 $f_1$~is the heaviest edge on~$C$, so -$\ell(f_1) \le \ell(f_2)$. Therefore the lightest of all replacement edges must have +This implies that if $f_1$ is heavier than~$f_2$, then $f_1$~is the heaviest edge on~$C$, so +$\ell(f_1) \le \ell(f_2)$ by I3. Therefore the lightest of all replacement edges must have the maximum level. Why is~$C$ a~cycle: Let $F^a$ and~$F^b$ be the trees of~$F-e$ in which the endpoints of~$e$ lie, and for -every edge~$g$ between $F^a$ and~$F^b$ let $g^a$ and~$g^b$ be its respective endpoints. +every edge~$g$ going between $F^a$ and~$F^b$ let $g^a$ and~$g^b$ be its respective endpoints. We know that $C_i$ consists of the path $F[f_i^a,e^a]$, the edge~$e$, the path $F[e^b,f_i^b]$, and the edge~$f_i$. Thus~$C$ must contain the paths $F[f_1^a,f_2^a]$ and $F[f_1^b,f_2^b]$ and the edges $f_1$ and~$f_2$, which together form a~simple cycle. \qed -We now have to make sure that the additional invariant is observed: +We now have to make sure that the additional invariant is indeed observed: \lemma\id{ithree}% -After every operation, I3 is satisfied. +After every operation, the invariant I3 is satisfied. \proof -When the structure is freshly initialized, I3 is obviously satisfied, because all edges +When the structure is freshly initialized, I3 is obviously satisfied, as all edges are at level~0. Sole deletions of edges (both tree and non-tree) cannot violate I3, so we need -to check only the replaces, in particular when an~edge~$e$ either gets its level increased -or becomes a~tree edge. - -For the violation to happen, $e$~must be the heaviest on some cycle~$C$, so by the Cycle -rule, $e$~must be non-tree. The increase of $\ell(e)$ must therefore happen when~$e$ is -considered as a~replacement edge incident with some tree~$T_1$ at level $\ell=\ell(e)$. -We will pause the computation just before this increase and we will prove that -all other edges of~$C$ already are at levels greater than~$\ell$. - -Let us first show that for edges of~$C$ incident with~$T_1$. All edges of~$T_1$ itself -already are on the higher levels as they were moved there at the very beginning of the -search for the replacement edge. As the algorithm always tries the lightest candidate -for the replacement edge first, all non-tree edges incident with~$T_1$ which are lighter -than~$e$ were already considered and thus also moved one level up. This includes all -other edges of~$C$ that are incident with~$T_1$. - -The case of edges that do not touch~$T_1$ is easy to handle: Such edges do not exist. -If they did, at least two edges of~$C$ would have to be non-tree edges connecting~$T_1$ -with the other trees at level~$\ell$, so one of them that is lighter than~$e$ would be selected as the -replacement edge before~$e$ could be considered. +to check only the replaces, in particular the place when an~edge~$e$ gets its level increased. + +For the violation to happen for the first time, $e$~must be the heaviest on +some cycle~$C$, so by the Cycle rule, $e$~must be non-tree. The increase of +$\ell(e)$ must therefore take place when~$e$ is considered as a~replacement +edge incident with some tree~$T_1$ at level $\ell=\ell(e)$. We will pause the +computation just before this increase and we will prove that all other edges +of~$C$ already are at levels greater than~$\ell$, so the violation cannot occur. + +Let us first show this for edges of~$C$ incident with~$T_1$. All edges of~$T_1$ itself +already are at the higher levels as they were moved there at the very beginning of the +search for the replacement edge. The other tree edges incident with~$T_1$ would have +lower levels, which is impossible since the invariant would be already violated. +Non-tree edges of~$C$ incident with~$T_1$ are lighter than~$e$, so they were already considered +as~candidates for the replacement edge, because the algorithm always picks the lightest +candidate first. Such edges therefore have been already moved a~level up. + +The case of edges of~$C$ that do not touch~$T_1$ is easy to handle: Such edges do not exist. +If they did, at least one more edge of~$C$ besides~$e$ would have to connect~$T_1$ with the other +trees of level~$\ell$. We already know that this could not be a~tree edge. If it were a~non-tree +edge, it could not have level greater than~$\ell$ by~I1 nor smaller than~$\ell$ by~I3. Therefore +it would be a~level~$\ell$ edge lighter than~$e$, and as such it would have been selected as the +replacement edge before $e$~was. \qed We can conclude: \thmn{Decremental MSF, Holm et al.~\cite{holm:polylog}} -When we start with a~graph with $n$~vertices and~$m\ge n$ edges and we perform -$d$~edge deletions, the MSF can be maintained in time $\O((m+d)\cdot\log^2 n)$. +When we start with a~graph on $n$~vertices with~$m$ edges and we perform a~sequence of +edge deletions, the MSF can be initialized in time $\O((m+n)\cdot\log^2 n)$ and then +updated in time $\O(\log^2 n)$ amortized per operation. \paran{Fully dynamic MSF}% The decremental MSF algorithm can be turned to a~fully dynamic one by a~blackbox -reduction whose properties are summarized by the following theorem: +reduction whose properties are summarized in the following theorem: -\thmn{MSF dynamization, Henzinger et al.~\cite{henzinger:twoec}, Holm et al.~\cite{holm:polylog}} -Suppose that we have a~decremental MSF algorithm that for any $a$,~$b$ can be initialized -on a~graph with~$a$ vertices and~$b$ edges and then it executes an~arbitrary sequence -of deletions in time $\O(b\cdot t(a,b))$, where~$t$ is a~non-decreasing function. -Then there exists a~fully dynamic MSF algorithm for a~graph on $n$~vertices, starting +\thmn{MSF dynamization, Holm et al.~\cite{holm:polylog}} +Suppose that we have a~decremental MSF algorithm with the following properties: +\numlist\ndotted +\:For any $a$,~$b$, it can be initialized on a~graph with~$a$ vertices and~$b$ edges. +\:Then it executes an~arbitrary sequence of deletions in time $\O(b\cdot t(a,b))$, where~$t$ is a~non-decreasing function. +\endlist +\>Then there exists a~fully dynamic MSF algorithm for a~graph on $n$~vertices, starting with no edges, that performs $m$~insertions and deletions in amortized time: $$ \O\left( \log^3 n + \sum_{i=1}^{\log m} \sum_{j=1}^i \; t(\min(n,2^j), 2^j) \right) \hbox{\quad per operation.} @@ -594,7 +628,7 @@ $$ \proofsketch The reduction is very technical, but its essence is the following: We maintain a~logarithmic number -of decremental structures $A_0,\ldots,A_{\log n}$ of exponentially increasing sizes. Every non-tree +of decremental structures $A_0,\ldots,A_{\lfloor\log n\rfloor}$ of exponentially increasing sizes. Every non-tree edge is contained in exactly one~$A_i$, tree edges can belong to multiple structures. When an~edge is inserted, we union it with some of the $A_i$'s, build a~new decremental structure @@ -603,20 +637,20 @@ Delete of a~non-tree edge is performed on all $A_i$'s containing it and the repl sought among the replacement edges found in these structures. The unused replacement edges then have to be reinserted back to the structure. -The original reduction of Henzinger et al.~handles the reinserts by a~mechanism of batch insertions +The original reduction of Henzinger et al.~\cite{henzinger:twoec} handles these reinserts by a~mechanism of batch insertions supported by their decremental structure, which is not available in our case. Holm et al.~have replaced it by a~system of auxiliary edges inserted at various places in the structure. We refer to the article \cite{holm:polylog} for details. \qed -\corn{Fully dynamic MSF} -There is a~fully dynamic MSF algorithm that works in amortized time $\O(\log^4 n)$ +\corn{Fully dynamic MSF}\id{dynmsfcorr}% +There is a~fully dynamic MSF algorithm that works in time $\O(\log^4 n)$ amortized per operation for graphs on $n$~vertices. \proof Apply the reduction from the previous theorem to the decremental algorithm we have -developed. This results in an~algorithm of amortized complexity $\O(\log^4 m)$ where~$m$ -is the number of operations performed. This could be more than $\O(\log^4 n)$ if +developed. This results in an~algorithm of amortized complexity $\O(\log^4\max(m,n))$ where~$m$ +is the number of operations performed. This could exceed $\O(\log^4 n)$ if $m$~is very large, but we can rebuild the whole structure after $n^2$~operations, which brings $\log m$ down to $\O(\log n)$. The $\O(n^2\log^4 n)$ cost of the rebuild then incurs only $\O(\log^4 n)$ additional cost on each operation. @@ -624,33 +658,31 @@ rebuild then incurs only $\O(\log^4 n)$ additional cost on each operation. \rem The limitation of MSF structures based on the Holm's algorithm for connectivity -to edge deletions only seems to be unavoidable. The invariant I3 could be easily +to only edge deletions seems to be unavoidable. The invariant I3 could be easily broken for many cycles at once whenever a~very light non-tree edge is inserted. We could try increasing the level of the newly inserted edge, but we would quite -possibly hit I1 before we skipped the levels of all the heavies edges on the +likely hit I1 before we managed to skip the levels of all the heaviest edges on the particular cycles. -On the other hand, if we decided to drop I3, we would encounter different problems. -We have enough time to scan all non-tree edges incident to the current tree~$T_1$ ---- we can charge it on the level increases of its tree edges and if we use the -degree reduction from Lemma \ref{degred}, there are at most two non-tree edges -per vertex. (The reduction can be used dynamically as it always translates a~single -change of the original graph to $\O(1)$ changes of the reduced graph.) The lightest -replacement edge however could also be located in the super-trees of~$T_1$ on the -lower levels, which are too large to scan and both I1 and I2 prevent us from -charging the time on increasing levels there. +On the other hand, if we decided to drop I3, we would encounter different problems. The ET-trees can +bring the lightest non-tree incident with the current tree~$T_1$, but the lightest replacement edge +could also be located in the super-trees of~$T_1$ at the lower levels, which are too large to scan +and both I1 and I2 prevent us from charging the time on increasing levels there. An~interesting special case in which insertions are possible is when all non-tree edges have the same weight. This leads to the following algorithm for dynamic MSF -on~graphs with a~small set of allowed edge weights. +on~graphs with a~small set of allowed edge weights. It is based on an~idea similar +to the $\O(k\log^3 n)$ algorithm of Henzinger and King \cite{henzinger:randdyn}, +but adapted to use the better results on dynamic connectivity we have at hand. \paran{Dynamic MSF with limited edge weights}% Let us assume for a~while that our graph has edges of only two different weights (let us say -1~and~2). We will drop our rule that all edge weights are distinct for a~moment and we recall that -the basic structural properties of the MST's from Section \ref{mstbasics} still hold. +1~and~2). We will forget our rule that all edge weights are distinct for a~moment and we recall +the observation in \ref{multiweight} that the basic structural properties of +the MST's from Section \ref{mstbasics} still hold. We split the graph~$G$ to two subgraphs~$G_1$ and~$G_2$ according to the edge -weights. We use one instance~$\C_1$ of the dynamic connectivity algorithm to maintain +weights. We use one instance~$\C_1$ of the dynamic connectivity algorithm maintaining an~arbitrary spanning forest~$F_1$ of~$G_1$, which is obviously minimum. Then we add another instance~$\C_2$ to maintain a~spanning forest~$F_2$ of the graph $G_2\cup F_1$ such that all edges of~$F_1$ are forced to be in~$F_2$. Obviously, $F_2$~is the @@ -681,7 +713,7 @@ a~replacement in~$\C_2$. This way, we can handle every insertion and deletion in time $\O(\log^2 n)$ amortized. This construction can be iterated in an~obvious way: if we have $k$~distinct edge weights, we build $k$~connectivity structures $\C_1,\ldots,\C_k$. The structure~$\C_i$ contains edges of -weight~$i$ and the MSF edges from~$\C_{i-1}$. Bounding the time complexity is then easy: +weight~$i$ together with the MSF edges from~$\C_{i-1}$. Bounding the time complexity is then easy: \thmn{MSF with limited edge weights} There is a~fully dynamic MSF algorithm that works in time $\O(k\cdot\log^2 n)$ amortized @@ -691,8 +723,242 @@ per operation for graphs on $n$~vertices with only $k$~distinct edge weights all A~change in the graph~$G$ involving an~edge of weight~$w$ causes a~change in~$\C_w$, which can propagate to~$\C_{w+1}$ and so on, possibly up to~$\C_k$. In each~$\C_i$, we spend time $\O(\log^2 n)$ by updating the connectivity structure according to -Theorem \ref{dyncon}. +Theorem \ref{dyncon} and $\O(\log n)$ on operations with the Sleator-Tarjan trees +by Theorem \ref{sletar}. +\qed + +%-------------------------------------------------------------------------------- + +\section{Almost minimum trees}\id{kbestsect}% + +In some situations, finding the single minimum spanning tree is not enough and we are interested +in the $K$~lightest spanning trees, usually for some small value of~$K$. Katoh, Ibaraki +and Mine \cite{katoh:kmin} have given an~algorithm of time complexity $\O(m\log\beta(m,n) + Km)$, +building on the MST algorithm of Gabow et al.~\cite{gabow:mst}. +Subsequently, Eppstein \cite{eppstein:ksmallest} has discovered an~elegant preprocessing step which allows to reduce +the running time to $\O(m\log\beta(m,n) + \min(K^2,Km))$ by eliminating edges +which are either present in all $K$ trees or in none of them. +We will show a~variant of their algorithm based on the MST verification +procedure of Section~\ref{verifysect}. + +In this section, we will require the edge weights to be real numbers (or integers), because +comparisons are certainly not enough to determine the second best spanning tree. We will +assume that our computation model is able to add, subtract and compare the edge weights +in constant time. + +Let us focus on finding the second best spanning tree first. + +\paran{Second best spanning tree}% +Suppose that we have a~weighted graph~$G$ and a~sequence $T_1,\ldots,T_z$ of all its spanning +trees. Also suppose that the weights of these spanning trees are distinct and that the sequence +is ordered by weight, i.e., $w(T_1) < \ldots < w(T_z)$ and $T_1 = \mst(G)$. Let us observe +that each tree is similar to at least one of its predecessors: + +\lemma\id{kbl}% +For each $i>1$ there exists $jConversely, a~single exchange in $(G_1,T_1\sgc e)$ or in $(G_2,T_2)$ corresponds +to an~exchange in either~$(G,T_1)$ or $(G,T_2)$. +Even stronger, a~spanning tree~$T$ of~$G$ either contains~$e$ and then $T\sgc +e$ is a~spanning tree of~$G_1$, or $T$~doesn't contain~$e$ and so it is +a~spanning tree of~$G_2$. + +Thus we can run the previous algorithm for finding the best edge exchange +on both~$G_1$ and~$G_2$ and find~$T_3$ again in time $\O(m)$. + +\paran{Further spanning trees}% +The construction of auxiliary graphs can be iterated to obtain $T_1,\ldots,T_K$ +for an~arbitrary~$K$. We will build a~\df{meta-tree} of auxilary graphs. Each node of this meta-tree +is assigned a~minor of~$G$ and its minimum spanning tree. The root node contains~$(G,T_1)$, +its sons have $(G_1,T_1\sgc e)$ and $(G_2,T_2)$. When $T_3$ is obtained by an~exchange +in one of these sons, we attach two new leaves to that son and we assign the two auxiliary +graphs derived by contracting or deleting the exchanged edge. Then we find the best +edge exchanges among the new sons and repeat the process. By the above observation, +each spanning tree of~$G$ is generated exactly once. Lemma \ref{kbl} guarantees that +the trees are enumerated in the increasing order. + +Recalculating the best exchanges in all leaves of the meta-tree after generating each~$T_i$ +is of course not necessary, because most leaves stay unchanged. We will rather remember +the best exchange for each leaf and keep their values in a~heap. In every step, we will +delete the minimum from the heap and use the exchange in the particular leaf to generate +a~new tree. Then we will create the new leaves, calculate their best exchanges and insert +them into the heap. The algorithm is now straightforward and so will be its analysis: + +\algn{Finding $K$ best spanning trees}\id{kbestalg}% +\algo +\algin A~weighted graph~$G$, its MST~$T_1$ and an~integer $K>0$. +\:$R\=$ a~meta tree whose vertices carry triples $(G',T',F')$. Initially + it contains just a~root with $(G,T_1,\emptyset)$. + \hfil\break + \cmt{$G'$ is a~minor of~$G$, $T'$~is its MST, and~$F'$ is a~set of edges of~$G$ + that are contracted in~$G'$.} +\:$H\=$ a~heap of quadruples $(\delta,r,e,f)$ ordered on~$\delta$, initially empty. + \hfil\break + \cmt{Each quadruple describes an~exchange of~$e$ for~$f$ in a~leaf~$r$ of~$R$ and $\delta=w(f)-w(e)$ + is the weight gain of this exchange.} +\:Find the best edge exchange in~$(G,T_1)$ and insert it to~$H$. +\:$i\= 1$. +\:While $i1$, which again does not influence comparison +of originally distinct differences. If the differences were the same, it is sufficient +to look at their values of~$i$ and~$j$, i.e., at the identifiers of the edges. + +\paran{Invariant edges}% +Our algorithm can be further improved for small values of~$K$ (which seems to be the common +case in most applications) by the reduction of Eppstein \cite{eppstein:ksmallest}. +We will observe that there are many edges of~$T_1$ +which are guaranteed to be contained in $T_2,\ldots,T_K$ as well, and likewise there are +many edges of $G\setminus T_1$ which are certainly not present in those spanning trees. +The idea is the following (we again assume that the tree weights are distinct): + +\defn +For an~edge $e\in T_1$, we define its \df{gain} $g(e)$ as the minimum weight gained by an~edge exchange +replacing~$e$. Similarly, we define $G(f)$ for $f\not\in T_1$. Put formally: +$$\eqalign{ +g(e) &:= \min\{ w(f)-w(e) \mid f\in E, e\in T[f] \} \cr +G(f) &:= \min\{ w(f)-w(e) \mid e\in E, e\in T[f] \}.\cr +}$$ + +\lemma\id{gaina}% +When $t_1,\ldots,t_{n-1}$ are the edges of~$T_1$ in order of increasing gain, +the edges $t_K,\ldots,t_n$ are present in all trees $T_2,\ldots,T_K$. + +\proof +The best exchanges in~$T_1$ involving $t_1,\ldots,t_{K-1}$ produce~$K-1$ spanning trees +of increasing weights. Any exchange involving $t_K,\ldots,t_n$ produces a~tree +which is heavier or equal than those. (We are ascertained by the Monotone exchange lemma +that the gain of such exchanges cannot be reverted by any later exchanges.) +\qed + +\lemma\id{gainb}% +When $q_1,\ldots,q_{m-n+1}$ are the edges of $G\setminus T_1$ in order of increasing gain, +the edges $q_K,\ldots,q_{m-n+1}$ are not present in any of $T_2,\ldots,T_K$. + +\proof +Similar to the previous lemma. +\qed + +\para +It is therefore sufficient to find $T_2,\ldots,T_K$ in the graph obtained from~$G$ by +contracting the edges $t_K,\ldots,t_n$ and deleting $q_K,\ldots,q_{m-n+1}$. This graph +has only $\O(K)$ vertices and $\O(K)$ edges. The only remaining question is how to +calculate the gains. For edges outside~$T_1$, it again suffices to find the peaks of the +covered paths. The gains of MST edges require a~different algorithm, but Tarjan \cite{tarjan:applpc} +has shown that they can be obtained in time $\O(m\timesalpha(m,n))$. + +When we put the results of this section together, we obtain: + +\thmn{Finding $K$ best spanning trees}\id{kbestthm}% +For a~given graph~$G$ with real edge weights, the $K$~best spanning trees can be found +in time $\O(m\timesalpha(m,n) + \min(K^2,Km + K\log K))$. + +\proof +First we find the MST of~$G$ in time $\O(m\timesalpha(m,n))$ using the Pettie's Optimal +MST algorithm (Theorem \ref{optthm}). Then we calculate the gains of MST edges by the +Tarjan's algorithm from \cite{tarjan:applpc}, again in $\O(m\timesalpha(m,n))$, and +the gains of the other edges using our MST verification algorithm (Corollary \ref{rampeaks}) +in $\O(m)$. We Lemma \ref{gaina} to identify edges that are required, and Lemma \ref{gainb} +to find edges that are useless. We contract the former edges, remove the latter ones +and run Algorithm \ref{kbestalg} to find the trees. By Lemma \ref{kbestl}, it runs in +the desired time. + +If~$K\ge m$, this reduction does not pay off, so we run Algorithm \ref{kbestalg} +directly on the input graph. \qed +\paran{Improvements}% +It is an~interesting open question whether the algorithms of Section \ref{verifysect} can +be modified to calculate all gains. The main procedure can, but it requires to reduce +the input to a~balance tree first and here the Bor\o{u}vka trees fail. The Buchsbaum's +Pointer-Machine algorithm (\ref{pmverify}) is more promising. + +\paran{Large~$K$}% +When $K$~is large, re-running the verification algorithm for every change of the graph +is too costly. Frederickson \cite{frederickson:ambivalent} has shown how to find the best +swaps dynamically, reducing the overall time complexity of Algorithm \ref{kbestalg} +to $\O(Km^{1/2})$ and improving Theorem \ref{kbestthm} to $\O(m\timesalpha(m,n) ++ \min( K^{3/2}, Km^{1/2} ))$. It is open if the dynamic data structures of this +chapter could be modified to bring the complexity of finding the next tree down +to polylogarithmic. + +\paran{Multiple minimum trees}% +Another nice application of Theorem \ref{kbestthm} is finding all minimum spanning +trees in a~graph that does not have distinct edge weights. \endpart