X-Git-Url: http://mj.ucw.cz/gitweb/?a=blobdiff_plain;f=dyn.tex;h=f07103ac08cc19f4d596c73cca1c8d1aee0545b5;hb=2673d7b09e7049aa35f3f81b6e7b395c934bdfbd;hp=502f618eedbed8667013e091250787056db9ebb2;hpb=e06622aee7d6c553e34f5446db1dce0029025d70;p=saga.git

diff --git a/dyn.tex b/dyn.tex
index 502f618..f07103a 100644
--- a/dyn.tex
+++ b/dyn.tex
@@ -6,86 +6,92 @@
\section{Dynamic graph algorithms}

-In many applications, we often need to solve a~certain graph problem for a~sequence
-of graphs that differ only a~little, so recomputing the solution from scratch for
-every graph would be a~waste of time. In such cases, we usually turn our attention
-to \df{dynamic graph algorithms.} A~dynamic algorithm is in fact a~data structure
-that remembers a~graph and offers operations that modify the structure of the graph
-(let's say by insertion and deletion of edges) and also operations that query the
-result of the problem for the current state of the graph.
-A~typical example of a~problem solved by such algorithms is dynamic maintenance of
-connected components:
+In many applications, we often need to solve a~certain graph problem for a~sequence of graphs that
+differ only a~little, so recomputing the solution for every graph from scratch would be a~waste of
+time. In such cases, we usually turn our attention to \df{dynamic graph algorithms.} A~dynamic
+algorithm is in fact a~data structure that remembers a~graph. It offers operations that modify the
+structure of the graph and also operations that query the result of the problem for the current
+state of the graph. A~typical example of a~problem of this kind is dynamic maintenance of connected
+components:

\problemn{Dynamic connectivity}
Maintain an~undirected graph under a~sequence of the following operations:
\itemize\ibull
\:$\<Init>(n)$ --- Create a~graph with $n$~isolated vertices $\{1,\ldots,n\}$.\foot{%
The structure could support dynamic addition and removal of vertices, too,
-but this is easy to add and infrequently used, so we will rather define
-the structure with a~fixed set of vertices for clarity.}
+but this is easy to add and infrequently used, so we will rather keep the set
+of vertices fixed for clarity.}
\:$\<Insert>(G,u,v)$ --- Insert an~edge $uv$ to~$G$ and return its unique
-identifier (assuming that the edge did not exist yet).
+identifier. This assumes that the edge did not exist yet.
\:$\<Delete>(G,e)$ --- Delete an~edge specified by its identifier from~$G$.
-\:$\<Connected>(G,u,v)$ --- Test if $u$ and~$v$ are in the same connected component of~$G$.
+\:$\<Connected>(G,u,v)$ --- Test if vertices $u$ and~$v$ are in the same connected component of~$G$.
\endlist

\para
We have already encountered a~special case of dynamic connectivity when implementing
Kruskal's algorithm in Section \ref{classalg}. At that time, we did not need to delete
any edges from the graph, which makes the problem substantially easier. This special
-case is sometimes called an~\df{incremental} or \df{semidynamic} graph algorithm.
-We mentioned the Union-Find data structure of Tarjan and van Leeuwen \cite{tarjan:setunion}
-which can be used for that: queries on connectedness translate to \<Find>, edge
-insertions to \<Find> followed by \<Union> if the edge connects two different
-components. This way, a~sequence of $m$~operations starting with an~empty graph
-on $n$~vertices is processed in time $\O(m\times\alpha(m,n))$ and this works even
-on the Pointer Machine. Fredman and Saks \cite{fredman:cellprobe} have proven
-a~matching lower bound in the cell-probe model which is stronger than RAM with
-$\O(\log n)$-bit words.
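
To make the incremental case concrete, the following Python sketch (only an~illustration with
ad-hoc names, not part of the thesis source) shows the Union-Find approach just described:
a~Disjoint Set Union structure with union by rank and path halving, on top of which edge
insertions and connectivity queries take $\O(\alpha(m,n))$ amortized time.

class DSU:
    def __init__(self, n):
        self.parent = list(range(n + 1))      # vertices are 1..n
        self.rank = [0] * (n + 1)

    def find(self, v):
        while self.parent[v] != v:            # walk towards the root of v's class,
            self.parent[v] = self.parent[self.parent[v]]   # halving the path on the way
            v = self.parent[v]
        return v

    def union(self, u, v):
        u, v = self.find(u), self.find(v)
        if u == v:
            return False                      # already in the same component
        if self.rank[u] < self.rank[v]:
            u, v = v, u
        self.parent[v] = u                    # hang the shallower tree under the deeper one
        if self.rank[u] == self.rank[v]:
            self.rank[u] += 1
        return True

def insert(dsu, u, v):
    """Insert the edge uv; return True if it has joined two different components."""
    return dsu.union(u, v)

def connected(dsu, u, v):
    return dsu.find(u) == dsu.find(v)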
-
-The edges that have triggered the \<Union>s form a~spanning forest of the current graph.
-So far, all known algorithms for dynamic connectivity maintain some sort of a~spanning
-tree. This suggests that a~dynamic MST algorithm could be obtained by modifying the
-dynamic connectivity algorithms. This will indeed turn out to be true. Semidynamic MST
-is easy to achieve even in the few pages of this section, but making it fully dynamic will require
-more effort, so we will review some of the required building blocks before going into that.
+case is customarily called an~\df{incremental} or \df{semidynamic} graph algorithm.
+We mentioned the Disjoint Set Union data structure of Tarjan and van Leeuwen (Theorem \ref{dfu})
+which can be used for that: Connected components are represented by equivalence classes.
+Queries on connectedness translate to \<Find>, edge insertions to \<Find>
+followed by \<Union> if the new edge joins two different components. This way,
+a~sequence of $m$~operations starting with an~empty graph on $n$~vertices is
+processed in time $\O(n+m\times\alpha(m,n))$ and this holds even for the Pointer
+Machine. Fredman and Saks \cite{fredman:cellprobe} have proven a~matching lower
+bound in the cell-probe model which is stronger than RAM with $\O(\log n)$-bit
+words.
+
+In this chapter, we will focus on the dynamic version of the minimum spanning forest.
+This problem seems to be intimately related to dynamic connectivity. Indeed, all known
+algorithms for dynamic connectivity maintain some sort of a~spanning forest. For example, in the
+incremental algorithm we have just mentioned, this forest is formed by the edges that have
+triggered the \<Union>s. This suggests that a~dynamic MSF algorithm could be obtained by modifying
+the mechanism to keep the forest minimum. This will indeed turn out to be true, although we cannot
+be sure that it will lead to the most efficient solution possible --- as of now, the known lower
+bounds are still very far away.
+
+Incremental MSF will be easy to achieve even in the few pages of this section, but making it fully
+dynamic will require more effort, so we will review some of the required building blocks before
+going into that.

We however have to answer one important question first: What should be the output of our MSF data
structure? Adding an~operation that would return the MSF of the current graph is of course
possible, but somewhat impractical as this operation has to spend $\Omega(n)$ time on the mere
writing of its output. A~better way seems to
-be to make the \<Insert> and \<Delete> operations report the list of modifications
-of the MSF caused by the change in the graph.
+be making the \<Insert> and \<Delete> operations report the list of modifications
+of the MSF implied by the change in the graph.

-Let us see what happens when we \<Insert> an~edge~$e$ to a~graph~$G$ with a~minimum spanning
+Let us see what happens when we \<Insert> an~edge~$e$ to a~graph~$G$ with its minimum spanning
forest~$F$, obtaining a~new graph~$G'$ with its MSF~$F'$. If $e$~connects two components of~$G$
(and therefore also of~$F$), we have to add~$e$ to~$F$. Otherwise, one of the following cases
happens: Either $e$~is $F$-heavy and so the forest~$F$ is also the MSF of the new graph. Or it is
$F$-light and we have to modify~$F$ by exchanging the heaviest edge~$f$ of the path $F[e]$ with~$e$.
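
For a~concrete example, let $G$~be the path on vertices $1,2,3$ with $w(12)=1$ and $w(23)=3$,
so that $F=\{12,23\}$. Inserting the edge~$13$ with weight~2 finds that the heaviest edge of the
path $F[13]$ is~$23$, which is heavier than the new edge, so the MSF changes to $\{12,13\}$ and
the structure reports ``$13$~added, $23$~removed''. With weight~4 instead, the new edge would be
$F$-heavy and the forest would stay untouched.
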
-Correctness of the former case follows immediately from the Theorem on Minimality by order
-(\ref{mstthm}), because all $F'$-light would be also $F$-light, which is impossible as $F$~was
+Correctness of the former case follows immediately from the Minimality Theorem (\ref{mstthm}),
+because any $F'$-light edge would be also $F$-light, which is impossible as $F$~was
minimum. In the latter case, the edge~$f$ is not contained in~$F'$ because it is the heaviest
on the cycle $F[e]+e$ (by the Red lemma, \ref{redlemma}). We can now use the Blue lemma
(\ref{bluelemma}) to prove that it should be replaced with~$e$. Consider the tree~$T$
-of~$F$ containing both endpoints of the edge~$e$. When we remove~$f$ from~$F$, this tree falls
-apart to two components $T_1$ and~$T_2$. The edge~$f$ was the lightest edge of the cut~$\delta_G(T_1)$
+of~$F$ that contains both endpoints of the edge~$e$. When we remove~$f$ from~$F$, this tree falls
+apart to two components $T_1$ and~$T_2$. The edge~$f$ was the lightest in the cut~$\delta_G(T_1)$
and $e$~is lighter than~$f$, so $e$~is the lightest in~$\delta_{G'}(T_1)$ and hence $e\in F'$.

A~\<Delete> of an~edge that is not contained in~$F$ does not change~$F$. When we delete an~MSF edge,
we have to reconnect~$F$ by choosing the lightest edge of the cut separating
-the new components. If there is no such replacement edge, we have deleted a~bridge,
-so the MSF has to remain disconnected.
+the new components (again the Blue lemma in action). If there is no such
+replacement edge, we have deleted a~bridge, so the MSF has to remain
+disconnected.

The idea of reporting differences in the MSF indeed works very well. We can summarize
-what we have shown by the following lemma and use it to define the dynamic MSF.
+what we have shown in the following lemma and use it to define the dynamic MSF.

\lemma
An~\<Insert> or \<Delete> of an~edge in~$G$ causes at most one edge addition, edge removal
or edge exchange in $\msf(G)$.

\problemn{Dynamic minimum spanning forest}
-Maintain an~undirected graph with weights on edges (drawn from a~totally ordered set)
+Maintain an~undirected graph with distinct weights on edges (drawn from a~totally ordered set)
and its minimum spanning forest under a~sequence of the following operations:
\itemize\ibull
\:$\<Init>(n)$ --- Create a~graph with $n$~isolated vertices $\{1,\ldots,n\}$.
@@ -95,8 +101,8 @@ and its minimum spanning forest under a~sequence of the following operations:
Return the list of additions and deletions of edges in $\msf(G)$.
\endlist

-\paran{Semidynamic MSF}%
-To obtain a~semidynamic MSF algorithm, we need to keep the forest in a~data structure which
+\paran{Incremental MSF}%
+To obtain an~incremental MSF algorithm, we need to keep the forest in a~data structure that
supports search for the heaviest edge on the path connecting a~given pair
of vertices. This can be handled efficiently by the Link-Cut trees of Sleator and Tarjan:

\thmn{Link-Cut Trees, Sleator and Tarjan \cite{sleator:trees}}\id{sletar}%
There is a~data structure that represents a~forest of rooted trees on~$n$ vertices.
Each edge of the forest has a~weight drawn from a~totally ordered set. The structure
supports the following operations in time $\O(\log n)$ amortized:\foot{%
-The Link-Cut trees offer many other operations, but we do not mention them
+The Link-Cut trees can offer a~plethora of other operations, but we do not mention them
as they are not needed in our application.}
\itemize\ibull
\:$\<Parent>(v)$ --- Return the parent of~$v$ in its tree or \<null> if $v$~is a~root.
\:$\<Root>(v)$ --- Return the root of the tree containing~$v$.
\:$\<Weight>(v)$ --- Return the weight of the edge $(\<Parent>(v),v)$.
\:$\<PathMax>(v)$ --- Return the vertex~$w$ closest to $\<Root>(v)$ such that the edge
-	$(\<Parent>(w),w)$ is the heaviest of those on the path from~$v$ to the root.
+	$(\<Parent>(w),w)$ is the heaviest of those on the path from the root to~$v$.
	If more edges have the maximum weight, break the tie arbitrarily.
	If there is no such edge ($v$~is the root itself), return \<null>.
\:$\<Link>(u,v,x)$ --- Connect the trees containing $u$ and~$v$ by an~edge $(u,v)$ of
-	weight~$x$. Assumes that $u$~is a~tree root and $v$~lies in a~different tree.
+	weight~$x$. Assumes that $v$~is a~tree root and $u$~lies in a~different tree.
\:$\<Cut>(v)$ --- Split the tree containing the non-root vertex $v$ to two trees by
-	removing the edge $(v,\<Parent>(v))$. Returns the weight of this edge.
-\:$\<Evert>(v)$ --- Modify the orientations of the edges in the tree containing~$v$
-	to make~$v$ the tree's root.
+	removing the edge $(\<Parent>(v),v)$. Returns the weight of this edge.
+\:$\<Evert>(v)$ --- Modify the orientations of edges to make~$v$ the root of its tree.
\endlist

%% \>Additionally, all edges on the path from~$v$ to $\<Root>(v)$ can be enumerated in
@@ -139,28 +144,28 @@ See \cite{sleator:trees}.
\qed

Once we have this structure, we can turn our ideas on updating of the MSF to
-an~algorithm:
+an~incremental algorithm:

-\algn{\<Insert> in a~semidynamic MSF}
+\algn{\<Insert> in an~incremental MSF}
\algo
\algin A~graph~$G$ with its MSF $F$ represented as a~Link-Cut forest, an~edge~$uv$ with weight~$w$ to be
inserted.
\:$\<Evert>(u)$. \cmt{$u$~is now the root of its tree.}
\:If $\<Root>(v) \ne u$: \cmt{$u$~and~$v$ lie in different trees.}
-\::$\<Link>(u,v,w)$. \cmt{Connect the trees.}
+\::$\<Link>(v,u,w)$. \cmt{Connect the trees.}
\::Return ``$uv$ added''.
-\:Otherwise:
+\:Otherwise: \cmt{both are in the same tree}
\::$y\=\<PathMax>(v)$.
-\::$x\=\<Parent>(x)$. \cmt{Edge~$xy$ is the heaviest on $F[uv]$.}
-\::If $\<Weight>(x) > w$: \cmt{We have to exchange~$xy$ with~$uv$.}
-\:::$\<Cut>(x)$, $\<Evert>(v)$, $\<Link>(v,y,w)$.
+\::$x\=\<Parent>(y)$. \cmt{Edge~$xy$ is the heaviest on $F[uv]$.}
+\::If $\<Weight>(y) > w$: \cmt{We have to exchange~$xy$ with~$uv$.}
+\:::$\<Cut>(y)$, $\<Evert>(v)$, $\<Link>(u,v,w)$.
\:::Return ``$uv$~added, $xy$~removed''.
\::Otherwise return ``no changes''.
\algout The list of changes in~$F$.
\endalgo

-\thmn{Semidynamic MSF}
-When only edge insertions are allowed, the MSF can be maintained in time $\O(\log n)$
+\thmn{Incremental MSF}
+When only edge insertions are allowed, the dynamic MSF can be maintained in time $\O(\log n)$
amortized per operation.

\proof
@@ -168,10 +173,559 @@ Every \<Insert> performs $\O(1)$ operations on the Link-Cut forest, which take
$\O(\log n)$ each by Theorem \ref{sletar}.
\qed

+\rem
+We can easily extend the incremental MSF algorithm to allow an~operation commonly called
+\<Backtrack> --- removal of the most recently inserted edge. It is sufficient to keep the
+history of all MSF changes in a~stack and reverse the most recent change upon backtrack.
+
+What are the obstacles to making the structure fully dynamic?
+Deletion of edges that do not belong to the MSF is trivial (we do not
+need to change anything) and so is deletion of bridges (we just remove the bridge
+from the Link-Cut tree, knowing that there is no edge to replace it). The hard part
+is the search for replacement edges after an~edge belonging to the MSF is deleted.
+
+This very problem also has to be solved by algorithms for fully dynamic connectivity,
+so we will take a~look at them first.
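
Before we do, let us make the incremental case completely concrete. The following Python sketch
(only an~illustration, not part of the thesis source) mirrors the \<Insert> algorithm above, but it
replaces the Link-Cut forest by an~explicit search for the path $F[uv]$, so a~single operation
costs $\O(n)$ instead of $\O(\log n)$:

class ToyIncrementalMSF:
    def __init__(self, n):
        # The forest F as a symmetric adjacency dictionary: adj[u][v] = weight of the edge uv.
        self.adj = {v: {} for v in range(1, n + 1)}

    def _path(self, u, v):
        """Return the edges of the path F[uv] as (parent, child) pairs, or None."""
        stack, parent = [u], {u: None}
        while stack:
            x = stack.pop()
            if x == v:
                path = []
                while parent[v] is not None:
                    path.append((parent[v], v))
                    v = parent[v]
                return path
            for y in self.adj[x]:
                if y not in parent:
                    parent[y] = x
                    stack.append(y)
        return None                                   # u and v are disconnected

    def insert(self, u, v, w):
        """Insert the edge uv of weight w and report the change of the MSF."""
        path = self._path(u, v)
        if path is None:                              # different components: just link
            self.adj[u][v] = self.adj[v][u] = w
            return f"{u}{v} added"
        x, y = max(path, key=lambda e: self.adj[e[0]][e[1]])
        if self.adj[x][y] > w:                        # uv is F-light: exchange it for xy
            del self.adj[x][y], self.adj[y][x]
            self.adj[u][v] = self.adj[v][u] = w
            return f"{u}{v} added, {x}{y} removed"
        return "no changes"                           # uv is F-heavy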
+
+%--------------------------------------------------------------------------------
+
+\section{Eulerian Tour trees}
+
+An~important stop on the road to fully dynamic algorithms is a~structure called \df{Eulerian Tour trees} or
+simply \df{ET-trees}. It is a~representation of forests introduced by Henzinger and King
+\cite{henzinger:randdyn} in their randomized dynamic algorithms. It is similar to the Link-Cut
+trees, but it is much simpler and instead of path operations it offers efficient operations on
+subtrees. It is also possible to attach auxiliary data to vertices and edges of the original tree.
+
+\defn\id{eulseq}%
+Let~$T$ be a~rooted tree. We will call a~sequence of vertices of~$T$ its \df{Eulerian Tour sequence (ET-sequence)}
+if it lists the vertices visited by the depth-first traversal of~$T$.
+More precisely, it can be generated by the following procedure $\<ET>(v)$
+when it is invoked on the root of the tree:
+\algo
+\:Record~$v$ in the sequence.
+\:For each son~$w$ of~$v$:
+\::Call $\<ET>(w)$.
+\::Record~$v$.
+\endalgo
+\>A~single tree can have multiple ET-sequences, corresponding to different orders in which the
+sons can be enumerated in step~2.
+
+In every ET-sequence, one of the occurrences of each vertex is defined as its \df{active occurrence} and
+it will be used to store auxiliary data associated with that vertex.
+
+\obs
+An~ET-sequence contains a~vertex of degree~$d$ exactly $d$~times except for the root which
+occurs $d+1$ times. The whole sequence therefore contains $2n-1$ elements. It indeed describes the
+order in which the vertices are visited by an~Eulerian tour of the tree with all edges doubled.
+Let us observe what happens to an~ET-sequence when we modify the tree.
+
+When we \em{delete} an~edge $uv$ from the tree~$T$ (let $u$~be the parent of~$v$), the sequence
+$AuvBvuC$ (with no~$u$ nor~$v$ in~$B$) splits into two sequences $AuC$ and $vBv$.
+If there was only a~single occurrence of~$v$, then $v$~was a~leaf and thus the sequence
+transforms from $AuvuC$ to $AuC$ and $v$~alone.
+
+\em{Changing the root} of the tree~$T$ from~$v$ to~$w$ changes its ET-sequence from $vAwBwCv$ to $wBwCvAw$.
+If $w$~was a~leaf, the sequence changes from $vAwCv$ to $wCvAw$. If $vw$ was the only edge of~$T$,
+the sequence $vwv$ becomes $wvw$. Note that this works regardless of the possible presence of~$w$ inside~$B$.
+
+\em{Joining} the roots of two trees by a~new edge makes their ET-sequences $vAv$ and~$wBw$
+combine to $vAvwBwv$. Again, we have to handle the cases when $v$ or~$w$ has degree~1 separately:
+$v$~and~$wBw$ combine to $vwBwv$, and $v$~with~$w$ makes $vwv$.
+
+\float{\valign{\vfil#\vfil\cr
+\hbox{\epsfbox{pic/ettree.eps}}\cr
+\noalign{\qquad\quad}
+	\halign{#\hfil\cr
+	$T_1: 0121034546474308980$,~~$T_2: aba$. \cr
+	$T_1-34: 01210308980, 4546474$. \cr
+	$T_1\hbox{~rooted at~3}: 3454647430898012103$. \cr
+	$T_1+0a+T_2$: $0121034546474308980aba0$. \cr
+	}\cr
+}}{Trees and their ET-sequences}
+
+If any of the occurrences that we have removed from the sequence was active, there is always
+a~new occurrence of the same vertex that can stand in its place and inherit the auxiliary data.
+
+The ET-trees will store the ET-sequences as $(a,b)$-trees with the parameter~$a$ set upon
+initialization of the structure and with $b=2a$. We know from the standard theorems of $(a,b)$-trees
+(see for example \cite{clrs}) that the depth of a~tree with $n$~leaves is always $\O(\log_a n)$
+and that all basic operations including insertion, deletion, search, splitting and joining the trees
+run in time $\O(b\log_a n)$ in the worst case.
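
These transformations are easy to try out on explicit sequences. The following Python sketch
(an~illustration only) keeps an~ET-sequence in a~plain list, so each operation costs $\Theta(n)$;
the real structure stores the sequence in an~$(a,b)$-tree and performs the same rearrangements by
a~constant number of splits and joins.

def et_delete(seq, u, v):
    """Delete the tree edge uv, where u is the parent of v: AuvBvuC -> (AuC, vBv)."""
    i = seq.index(v)                            # v's subtree block starts here ...
    j = len(seq) - 1 - seq[::-1].index(v)       # ... and ends here
    assert seq[i - 1] == u and seq[j + 1] == u  # the block is enclosed by occurrences of u
    return seq[:i] + seq[j + 2:], seq[i:j + 1]  # AuC and vBv

def et_reroot(seq, w):
    """Change the root to w: vAwBwCv -> wBwCvAw (splitting at w's first occurrence)."""
    if seq[0] == w:
        return seq
    i = seq.index(w)
    return seq[i:] + seq[1:i] + [w]

def et_link(seq_v, seq_w):
    """Join the root v of the first tree to the root w of the second one: vAv + wBw -> vAvwBwv."""
    return seq_v + seq_w + [seq_v[0]]

For instance, applied to the tree~$T_1$ of the figure above,
et_delete(list("0121034546474308980"), "3", "4") returns exactly the two sequences shown in the
figure's second line.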
+
+We will use the ET-trees to maintain a~spanning forest of the dynamic graph. The auxiliary data of
+each vertex will hold a~list of the edges incident with the given vertex that do not lie in the
+forest. Such edges are usually called the \df{non-tree edges.}
+
+\defn
+\df{Eulerian Tour trees (ET-trees)} are a~data structure that represents a~forest of trees and a~set of non-tree
+edges associated with the vertices of the forest. To avoid confusion, we will distinguish between
+\df{original} vertices and edges (of the given trees) and the vertices and edges of the
+data structure. The structure consists of:
+\itemize\ibull
+\:A~collection of $(a,b)$-trees of some fixed parameters $a$ and~$b$.
+	Each such tree corresponds to one of the original trees~$T$. Its
+	leaves (in the usual tree order) correspond to the elements
+	of an~ET-sequence for~$T$. Each two consecutive leaves $u$ and~$v$ are separated
+	by a~unique key stored in an~internal vertex of the $(a,b)$-tree. This key is used to represent
+	the original edge~$uv$. Each original edge is therefore kept in both its orientations.
+\:Mappings \<act>, \<edge> and \<twin>:
+	\itemize\icirc
+	\:$\<act>(v)$ maps each original vertex to the leaf containing its active occurrence;
+	\:$\<edge>(e)$ of an~original edge~$e$ is one of the internal keys representing~it;
+	\:$\<twin>(k)$ pairs an~internal key~$k$ with the other internal key of the same original edge.
+	\endlist
+\:A~list of non-tree edges placed in each leaf. The lists are allowed to be non-empty only
+	in the leaves that represent active occurrences of original vertices.
+\:Boolean \df{markers} in the internal vertices that signal presence of a~non-tree
+	edge anywhere in the subtree rooted at the internal vertex.
+\:Counters $\<leaves>(v)$ that contain the number of leaves in the subtree rooted at~$v$.
+\endlist
+
+\defn
+The ET-trees support the following operations on the original trees:
+\itemize\ibull
+\:\<Init> --- Create a~single-vertex tree.
+\:$\<Link>(u,v)$ --- Join two different trees by an~edge~$uv$ and return a~unique identifier
+	of this edge.
+\:$\<Cut>(e)$ --- Split a~tree by removing the edge~$e$ given by its identifier.
+\:$\<Connected>(u,v)$ --- Test if the vertices $u$ and~$v$ lie in the same tree.
+\:$\<Size>(v)$ --- Return the number of vertices in the tree containing the vertex~$v$.
+\:$\<InsertNontree>(v,e)$ --- Add a~non-tree edge~$e$ to the list at~$v$ and return a~unique
+	identifier of this edge.
+\:$\<DeleteNontree>(e)$ --- Delete a~non-tree edge~$e$ given by its identifier.
+\:$\<ListNontree>(v)$ --- Return a~list of non-tree edges associated with the vertices
+	of $v$'s tree.
+\endlist
+
+\impl
+We will implement the operations on the ET-trees by translating the intended changes of the
+ET-sequences to operations on the $(a,b)$-trees. The role of identifiers of the original vertices
+and edges will be of course played by pointers to the respective leaves and internal keys of
+the $(a,b)$-trees.
+
+\<Cut> of an~edge splits the $(a,b)$-tree at both internal keys representing the given edge
+and joins the pieces back in the order prescribed by the sequence transformation above.
+
+\<Link> of two trees can be accomplished by making both vertices the roots of their trees first
+and joining the roots by an~edge afterwards. Re-rooting involves splits and joins of $(a,b)$-trees.
+As we can split at any occurrence of the new root vertex, we will use the active occurrence
+which we remember. Linking of the roots is translated to joining of the $(a,b)$-trees.
+
+\<Connected> follows parent pointers from both $u$ and~$v$ to the roots of their trees.
+Then it checks if the roots are equal.
+
+\<Size> finds the root and returns its counter.
+
+\<InsertNontree> finds the leaf $\<act>(v)$ containing the list of $v$'s non-tree edges
+and inserts the new edge there. The returned identifier will consist of the pointer to
+the edge and the vertex in whose list it is stored. Then we have to recalculate the markers
+on the path from $\<act>(v)$ to the root. \<DeleteNontree> is analogous.
+
+Whenever any other operation changes a~vertex of the tree, it will also update its marker and
+counter and, if necessary, the markers and counters on the path to the root.
+
+\<ListNontree> traverses the tree recursively from the root, but it does not enter the
+subtrees whose roots are not marked.
+
+Analysis of time complexity of the operations is now straightforward:
+
+\thmn{Eulerian Tour trees, Henzinger and King \cite{henzinger:randdyn}}\id{etthm}%
+The ET-trees perform the operations \<Link> and \<Cut> in time $\O(a\log_a n)$, \<Init>
+in $\O(1)$, \<Connected>, \<Size>, \<InsertNontree>, and \<DeleteNontree> in $\O(\log_a n)$, and
+\<ListNontree> in $\O(a\log_a n)$ per edge reported. Here $n$~is the number of vertices
+in the original forest and $a\ge 2$ is an~arbitrary constant.
+
+\proof
+We set $b=2a$. Our implementation performs $\O(1)$ operations on the $(a,b)$-trees
+per operation on the ET-tree, plus $\O(1)$ other work. We apply the standard theorems
+on the complexity of $(a,b)$-trees \cite{clrs}.
+\qed
+
+\examplen{Connectivity acceleration}\id{accel}%
+In most cases, the ET-trees are used with $a$~constant, but sometimes choosing~$a$ as a~function
+of~$n$ can also have its beauty. Suppose that there is a~data structure which maintains an~arbitrary
+spanning forest of a~dynamic graph. Suppose also that the structure works in time $\O(\log^k n)$
+per operation and that it reports $\O(1)$ changes in the spanning forest for every change
+in the graph. If we keep the spanning forest in ET-trees with $a=\log n$, the updates of the
+data structure cost an~additional $\O(\log^2 n / \log\log n)$, but connectivity queries accelerate to $\O(\log
+n/\log\log n)$.
+
+\paran{ET-trees with weights}
+In some cases, we will also need a~representation of weighted graphs and the ability to enumerate
+the non-tree edges in order of their increasing weights (in fact, it will be sufficient to find the
+lightest one, remove it and iterate). This can be handled by a~minute modification of the
+ET-trees.
+
+The tree edges will remember their weight in the corresponding internal keys of the ET-tree.
+We replace each list of non-tree edges by an~$(a,b)$-tree keeping the edges sorted by weight.
+We also store the minimum element of that tree separately, so that it can be accessed in constant
+time. The boolean \em{marker} will then become the minimum weight of a~non-tree edge attached to the
+particular subtree, which can be recalculated as easily as the markers can. Searching for the
+lightest non-tree edge then just follows the modified markers.
+
+The time complexities of all operations therefore remain the same, with a~possible
+exception of the operations on non-tree edges, to which we have added the burden of
+updating the new $(a,b)$-trees. This however consists of $\O(1)$ updates per single
+call to \<InsertNontree> or \<DeleteNontree>, which takes $\O(a\log_a n)$ time only.
+We can therefore conclude:
+
+\corn{Weighted ET-trees}\id{wtet}%
+The time bounds in Theorem \ref{etthm} hold for the weighted ET-trees, too.
+
+
+%--------------------------------------------------------------------------------
+
+\section{Dynamic connectivity}
+
+The fully dynamic connectivity problem has a~long and rich history. 
In the 1980's, Frederickson \cite{frederickson:dynamic}
+has used his topological trees to construct a~dynamic connectivity algorithm of complexity $\O(\sqrt m)$ per update and
+$\O(1)$ per query. Eppstein et al.~\cite{eppstein:sparsify} have introduced a~sparsification technique which can bring the
+updates down to $\O(\sqrt n)$. Later, several different algorithms with complexity on the order of $n^\varepsilon$
+were presented by Henzinger and King \cite{henzinger:mst} and also by Mare\v{s} \cite{mares:dga}.
+A~polylogarithmic time bound was first reached by the randomized algorithm of Henzinger and King \cite{henzinger:randdyn}.
+The best result known as of now is the $\O(\log^2 n)$ time deterministic algorithm by Holm,
+de~Lichtenberg and Thorup \cite{holm:polylog}, which we will describe in this section.
+
+The algorithm will maintain a~spanning forest~$F$ of the current graph~$G$, represented by an~ET-tree
+which will be used to answer connectivity queries. The edges of~$G\setminus F$ will be stored as~non-tree
+edges in the ET-tree. Hence, an~insertion of an~edge to~$G$ either adds it to~$F$ or inserts it as a~non-tree edge.
+Deletions of non-tree edges are also easy, but when a~tree edge is deleted, we have to search for its
+replacement among the non-tree edges.
+
+To govern the search in an~efficient way, we will associate each edge~$e$ with a~level $\ell(e) \le
+L = \lfloor\log_2 n\rfloor$. For each level~$i$, we will use~$F_i$ to denote the subforest
+of~$F$ containing edges of level at least~$i$. Therefore $F=F_0 \supseteq F_1 \supseteq \ldots \supseteq F_L$.
+We will maintain the following \em{invariants:}
+
+{\narrower
+\def\iinv{{\bo I\the\itemcount~}}
+\numlist\iinv
+\:$F$~is the maximum spanning forest of~$G$ with respect to the levels. (In other words,
+if $uv$ is a~non-tree edge, then $u$ and~$v$ are connected in~$F_{\ell(uv)}$.)
+\:For each~$i$, the components of~$F_i$ have at most $\lfloor n/2^i \rfloor$ vertices each.
+(This implies that it does not make sense to define~$F_i$ for $i>L$, because it would be empty
+anyway.)
+\endlist
+}
+
+At the beginning, the graph contains no edges, so both invariants are trivially
+satisfied. Newly inserted edges are assigned level~0, which cannot break I1 nor~I2.
+
+When we delete a~tree edge at level~$\ell$, we split a~tree~$T$ of~$F_\ell$ into two
+trees $T_1$ and~$T_2$. Without loss of generality, let us assume that $T_1$ is the
+smaller one. We will try to find a~replacement edge of the highest possible
+level that connects the spanning tree back. From I1, we know that such an~edge cannot belong to
+a~level greater than~$\ell$, so we start looking for it at level~$\ell$. According
+to~I2, the tree~$T$ had at most $\lfloor n/2^\ell\rfloor$ vertices, so $T_1$ has
+at most $\lfloor n/2^{\ell+1} \rfloor$ of them. Thus we can increase the levels
+of all edges of~$T_1$ without violating either invariant.
+
+We now start enumerating the non-tree edges incident with~$T_1$. Each such edge
+is either local to~$T_1$ or it joins $T_1$ with~$T_2$. We will therefore check for each edge
+whether its other endpoint lies in~$T_2$ and if it does, we have found the replacement
+edge, so we insert it to~$F_\ell$ and stop. Otherwise we move the edge one level up. (This
+will be the grist for the mill of our amortization argument: We can charge most of the work to level
+increases and we know that the level of each edge can reach at most~$L$.)
+
+If the non-tree edges at level~$\ell$ are exhausted, we try the same in the next
+lower level and so on. 
If there is no replacement edge at level~0, the tree~$T$
+remains disconnected.
+
+\impl
+For each level~$i$, we will use a~separate ET-tree ${\cal E}_i$ with~$a$ set to~2,
+which will represent the forest~$F_i$ and the non-tree edges at that particular level.
+Besides operations on the non-tree edges, we also need to find the tree edges of level~$i$
+when we want to bring them one level up. This can be accomplished either by modifying the ET-trees
+to attach two lists of edges to each vertex instead of one, or by using a~second ET-tree.
+
+\algn{Insertion of an~edge}
+\algo
+\algin An~edge $uv$ to insert.
+\:$\ell(uv) \= 0$.
+\:Ask the ET-tree ${\cal E}_0$ if $u$ and~$v$ are in the same component. If they are:
+\::Add $uv$ to the list of non-tree edges in ${\cal E}_0$ at both $u$ and~$v$.
+\:Otherwise:
+\::Add $uv$ to~$F_0$.
+\endalgo
+
+\algn{Deletion of an~edge}
+\algo
+\algin An~edge $uv$ to delete.
+\:$\ell \= \ell(uv)$.
+\:If $uv$ is a~non-tree edge:
+\::Remove $uv$ from the lists of non-tree edges at both $u$ and~$v$ in~${\cal E}_{\ell}$.
+\:Otherwise:
+\::Remove $uv$ from~$F_\ell$ and hence also from $F_0,\ldots,F_{\ell-1}$.
+\::Call $\<Replace>(uv,\ell)$ to get the replacement edge~$f$.
+\::Insert $f$ to~$F_0,\ldots,F_{\ell(f)}$.
+\endalgo
+
+\algn{$\<Replace>(uv,i)$ -- Search for replacement for edge~$uv$ at level~$i$}
+\algo
+\algin An~edge~$uv$ to replace and a~level~$i$ such that there is no replacement
+at levels greater than~$i$.
+\:Let $T_1$ and~$T_2$ be the trees in~$F_i$ containing $u$ and~$v$ respectively.
+\:If $n(T_1) > n(T_2)$, swap $T_1$ with~$T_2$.
+\:Find all level~$i$ tree edges in~$T_1$ using ${\cal E}_i$ and move them to level~$i+1$.
+\:Enumerate non-tree edges incident with vertices of~$T_1$ and stored in ${\cal E}_i$.
+	For each edge~$xy$, $x\in T_1$, do:
+\::If $y\in T_2$, remove~$xy$ from~${\cal E}_i$ and return it to the caller.
+\::Otherwise increase $\ell(xy)$ by one.
+	\hfil\break
+	This includes deleting~$xy$ from~${\cal E}_i$ and inserting it to~${\cal E}_{i+1}$.
+\:If $i>0$, call $\<Replace>(uv,i-1)$.
+\:Otherwise return \<null>.
+\algout The replacement edge.
+\endalgo
+
+\>As promised, time complexity will be analysed by amortization on the levels.
+
+\thmn{Fully dynamic connectivity, Holm et al.~\cite{holm:polylog}}\id{dyncon}%
+Dynamic connectivity can be maintained in time $\O(\log^2 n)$ amortized per
+\<Insert> and \<Delete> and in time $\O(\log n/\log\log n)$ per \<Connected>
+in the worst case.
+
+\proof
+The direct cost of an~\<Insert> is $\O(\log n)$ for the operations on the ET-trees
+(by Theorem \ref{etthm}). We will also have the insertion pre-pay all level increases of the new
+edge. Since the levels never decrease, each edge can be brought a~level up at most
+$L=\lfloor\log n\rfloor$ times. Every increase costs $\O(\log n)$ on the ET-tree
+operations, so we pay $\O(\log^2 n)$ for all of them.
+
+A~\<Delete> costs $\O(\log^2 n)$ directly, as we might have to update all~$L$
+ET-trees. Additionally, we call \<Replace> up to $L$ times. The initialization of
+\<Replace> costs $\O(\log n)$ per call, the rest is paid for by the edge level
+increases.
+
+To bring the complexity of the operation \<Connected> from $\O(\log n)$ down to $\O(\log n/\log\log n)$,
+we apply the trick from Example \ref{accel} and store~$F_0$ in an~ET-tree with $a=\max(\lfloor\log n\rfloor,2)$.
+This does not hurt the complexity of insertions and deletions, but allows for faster queries. 
+\qed
+
+\rem
+An~$\Omega(\log n/\log\log n)$ lower bound for the amortized complexity of the dynamic connectivity
+problem has been proven by Henzinger and Fredman \cite{henzinger:lowerbounds} in the cell
+probe model with $\O(\log n)$-bit words. Thorup has answered with a~faster algorithm
+\cite{thorup:nearopt} that achieves $\O(\log n\log^3\log n)$ time per update and
+$\O(\log n/\log^{(3)} n)$ per query on a~RAM with $\O(\log n)$-bit words. (He claims
+that the algorithm runs on a~Pointer Machine, but it uses arithmetic operations,
+so it does not fit the definition of the PM we use. The algorithm merely avoids
+direct indexing of arrays.) So far, it is not known how to extend this algorithm
+to fit our needs, so we omit the details.
+
+%--------------------------------------------------------------------------------
+
+\section{Dynamic spanning forests}\id{dynmstsect}%
+
+Let us turn our attention back to the dynamic MSF now.
+Most of the early algorithms for dynamic connectivity also imply $\O(n^\varepsilon)$
+algorithms for dynamic maintenance of the MSF. Henzinger and King \cite{henzinger:twoec,henzinger:randdyn}
+have generalized their randomized connectivity algorithm to maintain the MSF in $\O(\log^5 n)$ time per
+operation, or $\O(k\log^3 n)$ if only $k$ different values of edge weights are allowed. They have solved
+the decremental version of the problem first (which starts with a~given graph and only edge deletions
+are allowed) and then presented a~general reduction from the fully dynamic MSF to its decremental version.
+We will describe the algorithm of Holm, de Lichtenberg and Thorup \cite{holm:polylog}, who have followed
+the same path. They have modified their dynamic connectivity algorithm to solve the decremental MSF
+in $\O(\log^2 n)$ and obtained the fully dynamic MSF working in $\O(\log^4 n)$ per operation.
+
+\paran{Decremental MSF}%
+Turning the algorithm from the previous section into the decremental MSF requires only two
+changes: First, we have to start with the forest~$F$ equal to the MSF of the initial
+graph. As we are allowed to pay $\O(\log^2 n)$ for every insertion, we can use almost any
+MSF algorithm to find~$F$. Second, when we search for a~replacement edge, we need to pick
+the lightest possible choice. We will therefore use the weighted version of the ET-trees (Corollary \ref{wtet})
+and scan the lightest non-tree edge incident with the examined tree first. We must ensure
+that the lower levels cannot contain a~lighter replacement edge, but fortunately the
+light edges tend to ``bubble up'' in the hierarchy of levels. This can be formalized as
+the following invariant:
+
+{\narrower
+\def\iinv{{\bo I\the\itemcount~}}
+\numlist\iinv
+\itemcount=2
+\:On every cycle, the heaviest edge has the smallest level.
+\endlist
+}
+
+\>This immediately implies that we always select the right replacement edge:
+
+\lemma\id{msfrepl}%
+Let $F$~be the minimum spanning forest and $e$~any of its edges. Then among all replacement
+edges for~$e$, the lightest one is at the maximum level.
+
+\proof
+Let us consider any two edges $f_1$ and~$f_2$ replacing~$e$. By minimality of~$F$ and the Cycle
+rule (Lemma \ref{redlemma}), each $f_i$ is the heaviest edge on the cycle~$C_i = F[f_i] + f_i$.
+In a~moment, we will show that the symmetric difference~$C$ of these two cycles is again a~cycle.
+This implies that if $f_1$ is heavier than~$f_2$, then $f_1$~is the heaviest edge on~$C$, so
+$\ell(f_1) \le \ell(f_2)$ by I3. 
Therefore the lightest of all replacement edges must have +the maximum level. + +Why is~$C$ a~cycle: +Let $F^a$ and~$F^b$ be the trees of~$F-e$ in which the endpoints of~$e$ lie, and for +every edge~$g$ going between $F^a$ and~$F^b$ let $g^a$ and~$g^b$ be its respective endpoints. +We know that $C_i$ consists of the path $F[f_i^a,e^a]$, the edge~$e$, the path $F[e^b,f_i^b]$, +and the edge~$f_i$. Thus~$C$ must contain the paths $F[f_1^a,f_2^a]$ and $F[f_1^b,f_2^b]$ and +the edges $f_1$ and~$f_2$, which together form a~simple cycle. +\qed + +We now have to make sure that the additional invariant is indeed observed: + +\lemma\id{ithree}% +After every operation, the invariant I3 is satisfied. + +\proof +When the structure is freshly initialized, I3 is obviously satisfied, as all edges +are at level~0. Sole deletions of edges (both tree and non-tree) cannot violate I3, so we need +to check only the replaces, in particular the place when an~edge~$e$ gets its level increased. + +For the violation to happen for the first time, $e$~must be the heaviest on +some cycle~$C$, so by the Cycle rule, $e$~must be non-tree. The increase of +$\ell(e)$ must therefore take place when~$e$ is considered as a~replacement +edge incident with some tree~$T_1$ at level $\ell=\ell(e)$. We will pause the +computation just before this increase and we will prove that all other edges +of~$C$ already are at levels greater than~$\ell$, so the violation cannot occur. + +Let us first show this for edges of~$C$ incident with~$T_1$. All edges of~$T_1$ itself +already are at the higher levels as they were moved there at the very beginning of the +search for the replacement edge. The other tree edges incident with~$T_1$ would have +lower levels, which is impossible since the invariant would be already violated. +Non-tree edges of~$C$ incident with~$T_1$ are lighter than~$e$, so they were already considered +as~candidates for the replacement edge, because the algorithm always picks the lightest +candidate first. Such edges therefore have been already moved a~level up. + +The case of edges of~$C$ that do not touch~$T_1$ is easy to handle: Such edges do not exist. +If they did, at least one more edge of~$C$ besides~$e$ would have to connect~$T_1$ with the other +trees of level~$\ell$. We already know that this could not be a~tree edge. If it were a~non-tree +edge, it could not have level greater than~$\ell$ by~I1 nor smaller than~$\ell$ by~I3. Therefore +it would be a~level~$\ell$ edge lighter than~$e$, and as such it would have been selected as the +replacement edge before $e$~was. +\qed + +We can conclude: + +\thmn{Decremental MSF, Holm et al.~\cite{holm:polylog}} +When we start with a~graph on $n$~vertices with~$m$ edges and we perform a~sequence of +edge deletions, the MSF can be initialized in time $\O((m+n)\cdot\log^2 n)$ and then +updated in time $\O(\log^2 n)$ amortized per operation. + +\paran{Fully dynamic MSF}% +The decremental MSF algorithm can be turned to a~fully dynamic one by a~blackbox +reduction whose properties are summarized in the following theorem: + +\thmn{MSF dynamization, Holm et al.~\cite{holm:polylog}} +Suppose that we have a~decremental MSF algorithm with the following properties: +\numlist\ndotted +\:For any $a$,~$b$, it can be initialized on a~graph with~$a$ vertices and~$b$ edges. +\:Then it executes an~arbitrary sequence of deletions in time $\O(b\cdot t(a,b))$, where~$t$ is a~non-decreasing function. 
+\endlist
+\>Then there exists a~fully dynamic MSF algorithm for a~graph on $n$~vertices, starting
+with no edges, that performs $m$~insertions and deletions in amortized time:
+$$
+\O\left( \log^3 n + \sum_{i=1}^{\log m} \sum_{j=1}^i \; t(\min(n,2^j), 2^j) \right) \hbox{\quad per operation.}
+$$
+
+\proofsketch
+The reduction is very technical, but its essence is the following: We maintain a~logarithmic number
+of decremental structures $A_0,\ldots,A_{\lfloor\log n\rfloor}$ of exponentially increasing sizes. Every non-tree
+edge is contained in exactly one~$A_i$, while tree edges can belong to multiple structures.
+
+When an~edge is inserted, we union it with some of the $A_i$'s, build a~new decremental structure
+and amortize the cost of the build over the insertions. Deletes of non-tree edges are trivial.
+Delete of a~tree edge is performed on all $A_i$'s containing it and the replacement edge is
+sought among the replacement edges found in these structures. The unused replacement edges then have
+to be reinserted into the structure.
+
+The original reduction of Henzinger et al.~\cite{henzinger:twoec} handles these reinserts by a~mechanism of batch insertions
+supported by their decremental structure, which is not available in our case. Holm et al.~have
+replaced it by a~system of auxiliary edges inserted at various places in the structure.
+We refer to the article \cite{holm:polylog} for details.
+\qed
+
+\corn{Fully dynamic MSF}
+There is a~fully dynamic MSF algorithm that works in time $\O(\log^4 n)$ amortized
+per operation for graphs on $n$~vertices.
+
+\proof
+Apply the reduction from the previous theorem to the decremental algorithm we have
+developed. This results in an~algorithm of amortized complexity $\O(\log^4\max(m,n))$ where~$m$
+is the number of operations performed. This could exceed $\O(\log^4 n)$ if
+$m$~is very large, but we can rebuild the whole structure after $n^2$~operations,
+which brings $\log m$ down to $\O(\log n)$. The $\O(n^2\log^4 n)$ cost of the
+rebuild then incurs only $\O(\log^4 n)$ additional cost on each operation.
+\qed
+
+\rem
+The limitation of MSF structures based on Holm's algorithm for connectivity
+to only edge deletions seems to be unavoidable. The invariant I3 could be easily
+broken for many cycles at once whenever a~very light non-tree edge is inserted.
+We could try increasing the level of the newly inserted edge, but we would quite
+likely hit I1 before we managed to skip the levels of all the heaviest edges on the
+particular cycles.
+
+On the other hand, if we decided to drop I3, we would encounter different problems. The ET-trees can
+supply the lightest non-tree edge incident with the current tree~$T_1$, but the lightest replacement edge
+could also be located in the super-trees of~$T_1$ at the lower levels, which are too large to scan
+and both I1 and I2 prevent us from charging the time to level increases there.
+
+An~interesting special case in which insertions are possible is when all non-tree
+edges have the same weight. This leads to the following algorithm for dynamic MSF
+on~graphs with a~small set of allowed edge weights. It is based on an~idea similar
+to the $\O(k\log^3 n)$ algorithm of Henzinger and King \cite{henzinger:randdyn},
+but adapted to use the better results on dynamic connectivity we have at hand.
+
+\paran{Dynamic MSF with limited edge weights}%
+Let us assume for a~while that our graph has edges of only two different weights (let us say
+1~and~2). 
For a~moment, we will forget our rule that all edge weights are distinct and we recall
+the observation in \ref{multiweight} that the basic structural properties of
+the MST's from Section \ref{mstbasics} still hold.
+
+We split the graph~$G$ into two subgraphs~$G_1$ and~$G_2$ according to the edge
+weights. We use one instance~$\C_1$ of the dynamic connectivity algorithm maintaining
+an~arbitrary spanning forest~$F_1$ of~$G_1$, which is obviously minimum. Then we add
+another instance~$\C_2$ to maintain a~spanning forest~$F_2$ of the graph $G_2\cup F_1$
+such that all edges of~$F_1$ are forced to be in~$F_2$. Obviously, $F_2$~is the
+MSF of the whole graph~$G$ --- if any edge of~$F_1$ were not contained in~$\msf(G)$,
+we could use the standard exchange argument to create an~even lighter spanning tree.\foot{This
+is of course the Blue lemma in action, but we have to be careful as we have not proven it
+for graphs with multiple edges of the same weight.}
+
+When a~weight~2 edge is inserted to~$G$, we insert it to~$\C_2$ and it either enters~$F_2$
+or becomes a~non-tree edge. Similarly, deletion of a~weight~2 edge is a~pure deletion in~$\C_2$,
+because such edges can be replaced only by other weight~2 edges.
+
+Insertion of edges of weight~1 needs more attention: We insert the new edge~$e$ to~$\C_1$. If~$F_1$
+stays unchanged, we are done. If $e$~enters~$F_1$, we use the Sleator-Tarjan trees
+kept for~$F_2$ to check if the new edge covers some tree edge of weight~2. If this is not
+the case, we insert the new edge to~$\C_2$ and hence also to~$F_2$ and we are done.
+Otherwise we exchange one of the covered weight~2 edges~$f$ for~$e$ in~$\C_2$. We note
+that~$e$ can inherit the level of~$f$ and $f$~can become a~non-tree edge without
+changing its level. This adjustment can be performed in time $\O(\log^2 n)$, including
+paying for the future level increases of the new edge.
+
+Deletion of weight~1 edges is once more straightforward. We delete the edge from~$\C_1$.
+If it has no replacement, we delete it from~$\C_2$ as well. If it has a~replacement,
+we delete the edge from~$\C_2$ and insert the replacement in its place as described
+above. We observe that this pair of operations causes an~insertion, deletion or
+a~replacement in~$\C_2$.
+
+This way, we can handle every insertion and deletion in time $\O(\log^2 n)$ amortized.
+This construction can be iterated in an~obvious way: if we have $k$~distinct edge weights,
+we build $k$~connectivity structures $\C_1,\ldots,\C_k$. The structure~$\C_i$ contains edges of
+weight~$i$ together with the MSF edges from~$\C_{i-1}$. Bounding the time complexity is then easy:
+
+\thmn{MSF with limited edge weights}
+There is a~fully dynamic MSF algorithm that works in time $\O(k\cdot\log^2 n)$ amortized
+per operation for graphs on $n$~vertices with only $k$~distinct edge weights allowed.
+
+\proof
+A~change in the graph~$G$ involving an~edge of weight~$w$ causes a~change in~$\C_w$,
+which can propagate to~$\C_{w+1}$ and so on, possibly up to~$\C_k$. In each~$\C_i$,
+we spend time $\O(\log^2 n)$ on updating the connectivity structure according to
+Theorem \ref{dyncon} and $\O(\log n)$ on operations with the Sleator-Tarjan trees
+by Theorem \ref{sletar}.
+\qed

\endpart