dyn.tex

   1 \ifx\endpart\undefined
   2 \input macros.tex
   3 \fi
   4
   5 \chapter{Dynamic Spanning Trees}\id{dynchap}%
   6
   7 \section{Dynamic graph algorithms}
   8
   9 In many applications, we need to solve a~certain graph problem for a~sequence
  10 of graphs that differ only a~little, so recomputing the solution from scratch for
  11 every graph would be a~waste of time. In such cases, we usually turn our attention
  12 to \df{dynamic graph algorithms.} A~dynamic algorithm is in fact a~data structure
  13 that remembers a~graph and offers operations that modify the structure of the graph
  14 and also operations that query the result of the problem for the current state
  15 of the graph. A~typical example of a~problem of this kind is dynamic
  16 maintenance of connected components:
  17
  18 \problemn{Dynamic connectivity}
  19 Maintain an~undirected graph under a~sequence of the following operations:
  20 \itemize\ibull
  21 \:$\<Init>(n)$ --- Create a~graph with $n$~isolated vertices $\{1,\ldots,n\}$.\foot{%
  22 The structure could support dynamic addition and removal of vertices, too,
  23 but this is easy to add and infrequently used, so we will rather keep the set
  24 of vertices fixed for clarity.}
  25 \:$\<Insert>(G,u,v)$ --- Insert an~edge $uv$ to~$G$ and return its unique
  26 identifier. This assumes that the edge did not exist yet.
  27 \:$\<Delete>(G,e)$ --- Delete an~edge specified by its identifier from~$G$.
  28 \:$\<Connected>(G,u,v)$ --- Test if $u$ and~$v$ are in the same connected component of~$G$.
  29 \endlist
  30
  31 \para
  32 We have already encountered a~special case of dynamic connectivity when implementing
  33 Kruskal's algorithm in Section \ref{classalg}. At that time, we did not need to delete
  34 any edges from the graph, which makes the problem substantially easier. This special
  35 case is customarily called an~\df{incremental} or \df{semidynamic} graph algorithm.
  36 We mentioned the Disjoint Set Union data structure of Tarjan and van Leeuwen (Theorem \ref{dfu})
  37 which can be used for that: Connected components are represented as equivalence classes.
  38 Queries on connectedness translate to \<Find>, edge insertions to \<Find>
  39 followed by \<Union> if the new edge joins two different components. This way,
  40 a~sequence of $m$~operations starting with an~empty graph on $n$~vertices is
  41 processed in time $\O(n+m\timesalpha(m,n))$ and this holds even for the Pointer
  42 Machine. Fredman and Saks \cite{fredman:cellprobe} have proven a~matching lower
  43 bound in the cell-probe model which is stronger than RAM with $\O(\log n)$-bit
  44 words.
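
The reduction to Disjoint Set Union described above can be sketched in Python as follows.
This is only a~minimal illustration under our own assumptions: the class name and its
methods are hypothetical, and the sketch uses path halving with union by rank, which is
one common variant of the structure and not necessarily the one analysed in the cited theorem.

```python
class IncrementalConnectivity:
    """Insert-only connectivity on vertices 1..n (hypothetical helper)."""

    def __init__(self, n):
        # Index 0 is unused so that vertices match Init(n): {1, ..., n}.
        self.parent = list(range(n + 1))
        self.rank = [0] * (n + 1)
        self.forest = []  # edges whose insertion triggered a Union

    def find(self, v):
        # Walk to the representative, halving the path on the way.
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]
            v = self.parent[v]
        return v

    def insert(self, u, v):
        """Insert edge uv; return True if it joined two components."""
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return False
        if self.rank[ru] < self.rank[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru          # union by rank
        if self.rank[ru] == self.rank[rv]:
            self.rank[ru] += 1
        self.forest.append((u, v))
        return True

    def connected(self, u, v):
        return self.find(u) == self.find(v)
```

The list \<forest> collects exactly the edges that caused a~\<Union>, so at any moment
it holds a~spanning forest of the graph built so far.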
  45
  46 The edges that have triggered the \<Union>s form a~spanning forest of the current graph.
  47 So far, all known algorithms for dynamic connectivity maintain some sort of a~spanning
  48 tree. This suggests that a~dynamic MST algorithm could be obtained by modifying the
  49 dynamic connectivity algorithms. This will indeed turn out to be true. Semidynamic MST
  50 is easy to achieve even in the few pages of this section, but making it fully dynamic will require
  51 more effort, so we will review some of the required building blocks before going into that.
  52
  53 We however have to answer one important question first: What should be the output of
  54 our MSF data structure? Adding an~operation that would return the MSF of the current
  55 graph is of course possible, but somewhat impractical as this operation has to
  56 spend $\Omega(n)$ time on the mere writing of its output. A~better way seems to
  57 be making the \<Insert> and \<Delete> operations report the list of modifications
  58 of the MSF implied by the change in the graph.
  59
  60 Let us see what happens when we \<Insert> an~edge~$e$ to a~graph~$G$ with its minimum spanning
  61 forest~$F$, obtaining a~new graph~$G'$ with its MSF~$F'$. If $e$~connects two components of~$G$ (and
  62 therefore also of~$F$), we have to add~$e$ to~$F$. Otherwise, one of the following cases happens:
  63 Either $e$~is $F$-heavy and so the forest~$F$ is also the MSF of the new graph. Or it is $F$-light
  64 and we have to modify~$F$ by exchanging the heaviest edge~$f$ of the path $F[e]$ with~$e$.
  65
  66 Correctness of the former case follows immediately from the Theorem on Minimality by order
  67 (\ref{mstthm}), because every $F$-light edge of~$G'$ would be also $F$-light in~$G$, which is impossible as $F$~was
  68 minimum. In the latter case, the edge~$f$ is not contained in~$F'$ because it is the heaviest
  69 on the cycle $F[e]+e$ (by the Red lemma, \ref{redlemma}). We can now use the Blue lemma
  70 (\ref{bluelemma}) to prove that it should be replaced with~$e$. Consider the tree~$T$
  71 of~$F$ that contains both endpoints of the edge~$e$. When we remove~$f$ from~$F$, this tree falls
  72 apart to two components $T_1$ and~$T_2$. The edge~$f$ was the lightest edge of the cut~$\delta_G(T_1)$
  73 and $e$~is lighter than~$f$, so $e$~is the lightest in~$\delta_{G'}(T_1)$ and hence $e\in F'$.
  74
  75 A~\<Delete> of an~edge that is not contained in~$F$ does not change~$F$. When we delete
  76 an~MSF edge, we have to reconnect~$F$ by choosing the lightest edge of the cut separating
  77 the new components (again the Blue lemma in action). If there is no such
  78 replacement edge, we have deleted a~bridge, so the MSF has to remain
  79 disconnected.
  80
  81 The idea of reporting differences in the MSF indeed works very well. We can summarize
  82 what we have shown in the following lemma and use it to define the dynamic MSF.
  83
  84 \lemma
  85 An~\<Insert> or \<Delete> of an~edge in~$G$ causes at most one edge addition, edge
  86 removal or edge exchange in $\msf(G)$.
  87
  88 \problemn{Dynamic minimum spanning forest}
  89 Maintain an~undirected graph with distinct weights on edges (drawn from a~totally ordered set)
  90 and its minimum spanning forest under a~sequence of the following operations:
  91 \itemize\ibull
  92 \:$\<Init>(n)$ --- Create a~graph with $n$~isolated vertices $\{1,\ldots,n\}$.
  93 \:$\<Insert>(G,u,v,w)$ --- Insert an~edge $uv$ of weight~$w$ to~$G$. Return its unique
  94         identifier and the list of additions and deletions of edges in $\msf(G)$.
  95 \:$\<Delete>(G,e)$ --- Delete an~edge specified by its identifier from~$G$.
  96         Return the list of additions and deletions of edges in $\msf(G)$.
  97 \endlist
  98
  99 \paran{Semidynamic MSF}%
 100 To obtain a~semidynamic MSF algorithm, we need to keep the forest in a~data structure that
 101 supports search for the heaviest edge on the path connecting a~given pair
 102 of vertices. This can be handled efficiently by the Link-Cut trees of Sleator and Tarjan:
 103
 104 \thmn{Link-Cut Trees, Sleator and Tarjan \cite{sleator:trees}}\id{sletar}%
 105 There is a~data structure that represents a~forest of rooted trees on~$n$ vertices.
 106 Each edge of the forest has a~weight drawn from a~totally ordered set. The structure
 107 supports the following operations in time $\O(\log n)$ amortized:\foot{%
 108 The Link-Cut trees offer many other operations, but we do not mention them
 109 as they are not needed in our application.}
 110 \itemize\ibull
 111 \:$\<Parent>(v)$ --- Return the parent of~$v$ in its tree or \<null> if $v$~is a~root.
 112 \:$\<Root>(v)$ --- Return the root of the tree containing~$v$.
 113 \:$\<Weight>(v)$ --- Return the weight of the edge $(\<Parent>(v),v)$.
 114 \:$\<PathMax>(v)$ --- Return the vertex~$w$ closest to $\<Root>(v)$ such that the edge
 115         $(\<Parent>(w),w)$ is the heaviest of those on the path from the root to~$v$.
 116         If more edges have the maximum weight, break the tie arbitrarily.
 117         If there is no such edge ($v$~is the root itself), return \<null>.
 118 \:$\<Link>(u,v,x)$ --- Connect the trees containing $u$ and~$v$ by an~edge $(u,v)$ of
 119         weight~$x$. Assumes that $u$~is a~tree root and $v$~lies in a~different tree.
 120 \:$\<Cut>(v)$ --- Split the tree containing the non-root vertex $v$ to two trees by
 121         removing the edge $(\<Parent>(v),v)$. Returns the weight of this edge.
 122 \:$\<Evert>(v)$ --- Modify the orientations of the edges in the tree containing~$v$
 123         to make~$v$ the tree's root.
 124 \endlist
 125
 126 %% \>Additionally, all edges on the path from~$v$ to $\<Root>(v)$ can be enumerated in
 127 %% time $\O(\ell + \log n)$, where $\ell$~is the length of that path. This operation
 128 %% (and also the path itself) is called $\<Path>(v)$.
 129 %%
 130 %% \>If the weights are real numbers (or in general an~arbitrary group), the $\O(\log n)$
 131 %% operations also include:
 132 %%
 133 %% \itemize\ibull
 134 %% \:$\<PathWeight>(v)$ --- Return the sum of the weights on $\<Path>(v)$.
 135 %% \:$\<PathUpdate>(v,x)$ --- Add~$x$ to the weights of all edges on $\<Path>(v)$.
 136 %% \endlist
 137
 138 \proof
 139 See \cite{sleator:trees}.
 140 \qed
 141
 142 Once we have this structure, we can turn our ideas on updating of the MSF to
 143 an~incremental algorithm:
 144
 145 \algn{\<Insert> in a~semidynamic MSF}
 146 \algo
 147 \algin A~graph~$G$ with its MSF $F$ represented as a~Link-Cut forest, an~edge~$uv$
 148 with weight~$w$ to be inserted.
 149 \:$\<Evert>(u)$. \cmt{$u$~is now the root of its tree.}
 150 \:If $\<Root>(v) \ne u$: \cmt{$u$~and~$v$ lie in different trees.}
 151 \::$\<Link>(u,v,w)$. \cmt{Connect the trees.}
 152 \::Return ``$uv$ added''.
 153 \:Otherwise: \cmt{both are in the same tree}
 154 \::$y\=\<PathMax>(v)$.
 155 \::$x\=\<Parent>(y)$.  \cmt{Edge~$xy$ is the heaviest on $F[uv]$.}
 156 \::If $\<Weight>(y) > w$: \cmt{We have to exchange~$xy$ with~$uv$.}
 157 \:::$\<Cut>(y)$, $\<Evert>(v)$, $\<Link>(v,u,w)$.
 158 \:::Return ``$uv$~added, $xy$~removed''.
 159 \::Otherwise return ``no changes''.
 160 \algout The list of changes in~$F$.
 161 \endalgo
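
The case analysis of this algorithm can be tried out with the following Python sketch.
It is not the Link-Cut implementation: as a~simplifying assumption, the forest is kept
in plain adjacency dictionaries and the path $F[uv]$ is found by a~depth-first search,
so each insertion costs $\O(n)$ instead of $\O(\log n)$. The class name and the return
format (a~pair of lists of added and removed forest edges) are our own choices, and we
assume $u\ne v$ and distinct edge weights.

```python
class NaiveIncrementalMSF:
    """Insert-only MSF with a naive forest (hypothetical stand-in for Link-Cut trees)."""

    def __init__(self, n):
        # adj[v] maps every forest neighbour of v to the weight of the joining edge.
        self.adj = {v: {} for v in range(1, n + 1)}

    def _path(self, u, v):
        """Return the forest path from u to v as a vertex list, or None."""
        prev = {u: None}
        stack = [u]
        while stack:
            x = stack.pop()
            if x == v:
                break
            for y in self.adj[x]:
                if y not in prev:
                    prev[y] = x
                    stack.append(y)
        if v not in prev:
            return None
        path = [v]
        while prev[path[-1]] is not None:
            path.append(prev[path[-1]])
        return path[::-1]

    def _link(self, u, v, w):
        self.adj[u][v] = w
        self.adj[v][u] = w

    def insert(self, u, v, w):
        """Insert edge uv of weight w; return (added, removed) forest edges."""
        path = self._path(u, v)
        if path is None:                  # u and v lie in different trees
            self._link(u, v, w)
            return [(u, v)], []
        # Heaviest edge xy on the path F[uv].
        x, y = max(zip(path, path[1:]), key=lambda e: self.adj[e[0]][e[1]])
        if self.adj[x][y] > w:            # uv is F-light: exchange xy for uv
            del self.adj[x][y]
            del self.adj[y][x]
            self._link(u, v, w)
            return [(u, v)], [(x, y)]
        return [], []                     # uv is F-heavy: no changes
```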
 162
 163 \thmn{Incremental MSF}
 164 When only edge insertions are allowed, the dynamic MSF can be maintained in time $\O(\log n)$
 165 amortized per operation.
 166
 167 \proof
 168 Every \<Insert> performs $\O(1)$ operations on the Link-Cut forest, which take
 169 $\O(\log n)$ each by Theorem \ref{sletar}.
 170 \qed
 171
 172 \rem
 173 We can easily extend the semidynamic MSF algorithm to allow an~operation commonly called
 174 \<Backtrack> --- removal of the most recently inserted edge. It is sufficient to keep the
 175 history of all MSF changes in a~stack and reverse the most recent change upon backtrack.
 176
 177 What are the obstacles to making the structure fully dynamic?
 178 Deletion of edges that do not belong to the MSF is trivial (we do not
 179 need to change anything) and so is deletion of bridges (we just remove the bridge
 180 from the Link-Cut tree, knowing that there is no edge to replace it). The hard part
 181 is the search for replacement edges after an~edge of the MSF is deleted.
 182
 183 This very problem also has to be solved by algorithms for fully dynamic connectivity,
 184 so we will take a~look at them first.
 185
 186 %--------------------------------------------------------------------------------
 187
 188 \section{Eulerian Tour trees}
 189
 190 An~important stop on the road to fully dynamic algorithms has the name \df{Eulerian Tour trees} or
 191 simply \df{ET-trees}. It is a~representation of forests introduced by Henzinger and King
 192 \cite{henzinger:randdyn} in their randomized dynamic algorithms. It is similar to the one by Sleator
 193 and Tarjan, but it is much simpler and instead of path operations it offers efficient operations on
 194 subtrees. It is also possible to attach auxiliary data to vertices and edges of the original tree.
 195
 196 \defn\id{eulseq}%
 197 The \df{Eulerian Tour sequence} $\Eul(T)$ of a~rooted tree~$T$ is the sequence of vertices of~$T$ as visited
 198 by the depth-first traversal of~$T$. More precisely, it is generated by the following algorithm $\<ET>(v)$
 199 when it is invoked on the root of the tree:
 200 \algo
 201 \:Record~$v$ in the sequence.
 202 \:For each son~$w$ of~$v$:
 203 \::Call $\<ET>(w)$ recursively.
 204 \::Record~$v$.
 205 \endalgo
 206 \>One of the occurrences of each vertex is defined as its \df{active occurrence} and it will
 207 be used to store auxiliary data associated with the vertex.
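
As a~small illustration of this definition, the following Python sketch generates the
ET-sequence of a~rooted tree. The function name and the input format (a~dictionary
mapping each vertex to the list of its sons) are assumptions made for this example only.

```python
def euler_tour(children, root):
    """Return the ET-sequence of a rooted tree as a list of vertices."""
    seq = []

    def visit(v):
        seq.append(v)                 # first visit of v
        for w in children.get(v, []):
            visit(w)
            seq.append(v)             # back in v after finishing the son w

    visit(root)
    return seq


# Root 1 has sons 2 and 3, vertex 2 has a single son 4.
tour = euler_tour({1: [2, 3], 2: [4]}, 1)
print(tour)
```

For this four-vertex tree, the sequence has $2\cdot 4-1 = 7$ elements.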
 208
 209 \obs
 210 The ET-sequence contains a~vertex of degree~$d$ exactly $d$~times except for the root which
 211 occurs $d+1$ times. The whole sequence therefore contains $2n-1$ elements. It indeed describes the
 212 order of vertices on an~Eulerian tour in the tree with all edges doubled. Let us observe what happens
 213 to the ET-sequence when we modify the tree.
 214
 215 When we \em{delete} an~edge $uv$ from the tree~$T$ (let $u$~be the parent of~$v$), the sequence
 216 $\Eul(T) = AuvBvuC$ (with no~$u$ nor~$v$ in~$B$) splits to two ET-sequences $AuC$ and $vBv$.
 217 If there was only a~single occurrence of~$v$, it corresponded to a~leaf and thus the second
 218 sequence should consist of $v$~alone.
 219
 220 \em{Changing the root} of the tree~$T$ from~$v$ to~$w$ changes $\Eul(T)$ from $vAwBwCv$ to $wBwCvAw$.
 221 If $w$~was a~leaf, the sequence changes from $vAwCv$ to $wCvAw$. If $vw$ was the only edge of~$T$,
 222 the sequence $vwv$ becomes $wvw$. Note that this works regardless of the presence of~$w$ inside~$B$.
 223
 224 \em{Joining} the roots of two trees by a~new edge makes their ET-sequences $vAv$ and~$wBw$
 225 combine to $vAvwBwv$. Again, we have to handle the cases when $v$ or~$w$ is an~isolated vertex separately:
 226 $v$~and~$wBw$ combine to $vwBwv$, and $v$~with~$w$ makes $vwv$.
 227
 228 If any of the occurrences that we have removed from the sequence was active, there is always
 229 a~new occurrence of the same vertex that can stand in its place and inherit the auxiliary data.
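
The three transformations of ET-sequences can be checked on plain Python lists. The sketch
below is only meant to illustrate the sequence manipulations, so it takes linear time; the
function names are hypothetical and we assume that the vertices have distinct labels.

```python
def reroot(seq, w):
    """Rotate an ET-sequence so that it starts and ends with w."""
    i = seq.index(w)
    # Drop the old root's closing occurrence, rotate, and close the tour with w.
    return seq[i:-1] + seq[:i] + [w]


def link(seq_v, seq_w):
    """Join the roots v and w of two ET-sequences by a new edge; v stays the root."""
    return seq_v + seq_w + [seq_v[0]]


def cut(seq, v):
    """Remove the edge between the non-root vertex v and its parent.

    Returns the ET-sequence of the remaining tree and that of the subtree of v.
    """
    i = seq.index(v)                          # first occurrence of v
    j = len(seq) - 1 - seq[::-1].index(v)     # last occurrence of v
    # seq[j + 1] is the occurrence of the parent that follows the subtree; skip it.
    return seq[:i] + seq[j + 2:], seq[i:j + 1]


tour = [1, 2, 4, 2, 1, 3, 1]
rest, sub = cut(tour, 2)
```

All occurrences of a~vertex~$v$ lie between its first and last occurrence, and this
block is exactly the ET-sequence of the subtree of~$v$, which is why \<cut> needs no
special case for leaves.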
 230
 231 The ET-trees will represent the ET-sequences by $(a,b)$-trees with the parameter~$a$ set upon
 232 initialization of the structure and with $b=2a$. We know from the standard theorems of $(a,b)$-trees
 233 (see for example \cite{clrs}) that the depth of a~tree with $n$~leaves is always $\O(\log_a n)$
 234 and that all basic operations including insertion, deletion, search, splitting and joining the trees
 235 run in time $\O(a\log_a n)$ in the worst case.
 236
 237 We will use the ET-trees to maintain a~spanning forest of the current graph. The auxiliary data of
 238 each vertex will hold a~list of edges incident with the given vertex, which do not lie in the
 239 forest. Such edges are usually called the \df{non-tree edges.}
 240
 241 \defn
 242 \df{Eulerian Tour trees} are a~data structure that represents a~forest of trees and a~set of non-tree
 243 edges associated with the vertices of the forest. To avoid confusion, we will distinguish between
 244 \df{original} vertices and edges (of the original trees) and the vertices and edges of the
 245 data structure. The structure consists of:
 246 \itemize\ibull
 247 \:A~collection of $(a,b)$-trees with some fixed parameters $a$ and~$b$.
 248         Each such tree corresponds to one of the original trees~$T$. Its
 249         leaves (in the usual tree order) correspond to the elements
 250         of the ET-sequence $\Eul(T)$. Each two consecutive leaves $u$ and~$v$ are separated
 251         by a~unique key stored in an~internal vertex of the $(a,b)$-tree. This key is used to represent
 252         the original edge~$uv$. Each original edge is therefore kept in both its orientations.
 253 \:A~mapping $\<act>(v)$ that maps each original vertex to the leaf containing its active occurrence.
 254 \:A~mapping $\<edge>(e)$ that maps an~original edge~$e$ to one of the internal keys representing it.
 255 \:A~mapping $\<twin>(k)$ that maps an~internal key~$k$ to the other internal key of the same
 256         original edge.
 257 \:A~list of non-tree edges placed in each leaf. The lists are allowed to be non-empty only
 258         in the leaves that represent active occurrences of original vertices.
 259 \:Boolean \df{markers} in the internal vertices that signal presence of a~non-tree
 260         edge anywhere in the subtree rooted at that internal vertex.
 261 \endlist
 262 \>The structure supports the following operations on the original trees:
 263 \itemize\ibull
 264 \:\<Create> --- Create a~single-vertex tree.
 265 \:$\<Link>(u,v)$ --- Join two different trees by an~edge~$uv$ and return a~unique identifier
 266         of this edge.
 267 \:$\<Cut>(e)$ --- Split a~tree by removing the edge~$e$ given by its identifier.
 268 \:$\<Root>(v)$ --- Return the root of the ET-tree containing the vertex~$v$. This can be used
 269         to test whether two vertices lie in the same tree. However, the root is not guaranteed
 270         to stay the same when the tree is modified by a~\<Link> or \<Cut>.
 271 \:$\<InsertNontree>(v,e)$ --- Add a~non-tree edge~$e$ to the list at~$v$ and return a~unique
 272         identifier of this edge.
 273 \:$\<DeleteNontree>(e)$ --- Delete a~non-tree edge~$e$ given by its identifier.
 274 \:$\<ScanNontree>(v)$ --- Return non-tree edges stored in the same tree as~$v$.
 275 \endlist
 276
 277 \impl
 278 We will implement the operations on the ET-trees by translating the intended changes of the
 279 ET-sequences to operations on the $(a,b)$-trees. The role of identifiers of the original vertices
 280 and edges will be of course played by pointers to the respective leaves and internal keys of
 281 the $(a,b)$-trees.
 282
 283 \<Cut> of an~edge splits the $(a,b)$-tree at both internal keys representing the given edge
 284 and joins the resulting parts back in a~different order.
 285
 286 \<Link> of two trees can be accomplished by making both vertices the roots of their trees first
 287 and joining the roots by an~edge afterwards. Re-rooting again involves splits and joins of $(a,b)$-trees.
 288 As we can split at any occurrence of the new root vertex, we will use the active occurrence
 289 which we remember. Joining of the roots translates to joining of the $(a,b)$-trees.
 290
 291 \<Root> just follows parent pointers in the $(a,b)$-tree and it walks the path from the leaf
 292 to the root.
 293
 294 \<InsertNontree> finds the leaf $\<act>(v)$ containing the list of $v$'s non-tree edges
 295 and inserts the new edge there. The returned identifier will consist of the pointer to
 296 the edge and the vertex in whose list it is stored. Then we have to recalculate the markers
 297 on the path from $\<act>(v)$ to the root. \<DeleteNontree> is analogous. Whenever any
 298 other operation changes a~vertex of the tree, it will also update its marker and, if necessary,
 299 the markers on the path to the root.
 300
 301 \<ScanNontree> traverses the tree recursively from the root, but it does not enter the
 302 subtrees whose roots are not marked.
 303
 304 Analysis of the time complexity is now straightforward:
 305
 306 \thmn{Eulerian Tour trees, Henzinger and King \cite{henzinger:randdyn}}
 307 The ET-trees perform the operations \<Link> and \<Cut> in time $\O(a\log_a n)$, \<Create>
 308 in $\O(1)$, \<Root>, \<InsertNontree>, and \<DeleteNontree> in $\O(\log_a n)$, and
 309 \<ScanNontree> in $\O(a\log_a n)$ per edge reported. Here $n$~is the number of vertices
 310 in the original forest and $a\ge 2$ is an~arbitrary parameter, which may depend on~$n$.
 311
 312 \proof
 313 We set $b=2a$. Our implementation performs $\O(1)$ operations on the $(a,b)$-trees
 314 per operation on the ET-tree, plus $\O(1)$ other operations. We apply the standard theorems
 315 on the complexity of $(a,b)$-trees \cite{clrs}.
 316 \qed
 317
 318 \examplen{Connectivity acceleration}
 319 In most cases, the ET-trees are used with $a$~constant, but sometimes choosing~$a$ as a~function
 320 of~$n$ can also have its beauty. Suppose that there is a~data structure which maintains an~arbitrary
 321 spanning forest of a~dynamic graph. Suppose also that the structure works in time $\O(\log^k n)$
 322 per operation and that it reports $\O(1)$ changes in the spanning forest for every change
 323 in the graph. If we keep the spanning forest in ET-trees with $a=\log n$, the updates of the
 324 data structure cost an~extra $\O(\log^2 n / \log\log n)$, but queries accelerate to $\O(\log
 325 n/\log\log n)$.
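
Both bounds follow by substituting $a=\log n$ into the complexities of the ET-tree
operations and using $\log_a n = \log n/\log a$:
$$
a\log_a n = \log n\cdot{\log n\over\log\log n} = {\log^2 n\over\log\log n},
\qquad
\log_a n = {\log n\over\log\log n}.
$$
The first quantity is the cost of a~single \<Link> or \<Cut>, and every update of the
graph causes $\O(1)$ of them. The second is the cost of \<Root>, assuming that
a~connectivity query is answered by comparing the roots of the two vertices.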
 326
 327
 328 \endpart