From 3235ed72f55ba2d4fb4df9d03e7a166a971bbd9c Mon Sep 17 00:00:00 2001 From: Martin Mares Date: Fri, 11 Apr 2008 13:00:29 +0200 Subject: [PATCH] Analysis of ET-trees. --- PLAN | 1 + dyn.tex | 104 +++++++++++++++++++++++++++++++++++++++++++++++--------- 2 files changed, 89 insertions(+), 16 deletions(-) diff --git a/PLAN b/PLAN index 02e8af2..11f71ac 100644 --- a/PLAN +++ b/PLAN @@ -78,6 +78,7 @@ Models: - Tarjan79 is mentioned by Pettie to define Pointer machines - add references to the C language - PM: unify yardsticks +- all data structures should mention space complexity Dynamic: diff --git a/dyn.tex b/dyn.tex index bdaf64e..967c866 100644 --- a/dyn.tex +++ b/dyn.tex @@ -190,9 +190,8 @@ we will take a~look on them first. An~important stop on the road to fully dynamic algorithms has the name \df{Eulerian Tour trees} or simply \df{ET-trees}. It is a~representation of forests introduced by Henzinger and King \cite{henzinger:randdyn} in their randomized dynamic algorithms. It is similar to the one by Sleator -and Tarjan, but without orientation and without the path operations. Their advantage lies in -simplicity and also in their ease of extension to contain various auxiliary data attached to -vertices and edges of the original tree. +and Tarjan, but it is much simpler and instead of path operations it offers efficient operations on +subtrees. It is also possible to attach auxiliary data to vertices and edges of the original tree. \defn\id{eulseq}% The \df{Eulerian Tour sequence} $\Eul(T)$ of a~rooted tree~$T$ is the sequence of vertices of~$T$ as visited @@ -204,7 +203,7 @@ when it is invoked on the root of the tree: \::Call $\(w)$ recursively. \::Record~$w$. \endalgo -\>One of the occurrences of each vertex is defined as its \df{designated occurrence} and it will +\>One of the occurrences of each vertex is defined as its \df{active occurrence} and it will be used to store auxiliary data associated with the vertex. \obs @@ -218,15 +217,15 @@ $\Eul(T) = AuvBvuC$ (with no~$u$ nor~$v$ in~$B$) splits to two ET-sequences $AuC If there was only a~single occurrence of~$v$, it corresponded to a~leaf and thus the second sequence should consist of $v$~alone. -\em{Changing the root} of the tree~$T$ from~$v$ to~$w$ changes $\Eul(T) = vAwBwCv$ (no~$w$ in~$B$) -to $wBwCvAw$. If $w$~was a~leaf, the sequence changes from $vAwCv$ to $wCvAw$. If $vw$ was the only -edge of~$T$, the sequence $vw$ becomes $wv$. +\em{Changing the root} of the tree~$T$ from~$v$ to~$w$ changes $\Eul(T)$ from $vAwBwCv$ to $wBwCvAw$. +If $w$~was a~leaf, the sequence changes from $vAwCv$ to $wCvAw$. If $vw$ was the only edge of~$T$, +the sequence $vw$ becomes $wv$. Note that this works regardless of the presence of~$w$ inside~$B$. \em{Joining} the roots of two trees by a~new edge makes their ET-sequences $vAv$ and~$wBw$ combine to $vAvwBwv$. Again, we have to handle the cases when $v$ or~$w$ has degree~1 separately: $v$~and~$wBw$ combine to $vwBwv$, and $v$~with~$w$ makes $vwv$. -If any of the occurrences that we have removed from the sequence was designated, there is always +If any of the occurrences that we have removed from the sequence was active, there is always a~new occurrence of the same vertex that can stand in its place and inherit the auxiliary data. The ET-trees will represent the ET-sequences by $(a,b)$-trees with the parameter~$a$ set upon @@ -240,17 +239,90 @@ each vertex will hold a~list of edges incident with the given vertex, which do n forest. Such edges are usually called the \df{non-tree edges.} \defn -An~\df{Eulerian Tour tree} is a~data structure that represents a~tree~$T$ and a~set of non-tree -edges associated with the vertices of the tree. It contains: +\df{Eulerian Tour trees} are a~data structure that represents a~forest of trees and a~set of non-tree +edges associated with the vertices of the forest. To avoid confusion, we will distinguish between +\df{original} vertices and edges (of the original trees) and the vertices and edges of the +data structure. The structure consists of: \itemize\ibull -\:An~$(a,b)$-tree~$Q$ whose leaves (in the usual tree order) correspond to the elements +\:A~collection of $(a,b)$-trees with some fixed parameters $a$ and~$b$. + Each such tree corresponds to one of the original trees~$T$. Its + leaves (in the usual tree order) correspond to the elements of the ET-sequence $\Eul(T)$. Each two consecutive leaves $u$ and~$v$ are separated - by a~unique key stored in an~internal vertex of~$Q$ which is used to represent the - edge~$uv$. This way, each edge is stored in both its orientations. -\:A~mapping $\(v)$ which maps each vertex of~$T$ to the leaf of~$Q$ containing its - designated occurence. -\:Mappings $\_1(e)$ and $\_2(e)$ that map each edge of~$T$ to one of the keys inside~$Q$ + by a~unique key stored in an~internal vertex of the $(a,b)$-tree. This key is used to represent + the original edge~$uv$. Each original edge is therefore kept in both its orientations. +\:A~mapping $\(v)$ that maps each original vertex to the leaf containing its active occurrence. +\:A~mapping $\(e)$ that maps an~original edge~$e$ to one of the internal keys representing it. +\:A~mapping $\(k)$ that maps an~internal key~$k$ to the other internal key of the same + original edge. +\:A~list of non-tree edges placed in each leaf. The lists are allowed to be non-empty only + in the leaves that represent active occurrences of original vertices. +\:Boolean \df{markers} in the internal vertices that signal presence of a~non-tree + edge anywhere in the subtree rooted at that internal vertex. \endlist +\>The structure supports the following operations on the original trees: +\itemize\ibull +\:\ --- Create a~single-vertex tree. +\:$\(u,v)$ --- Join two different trees by an~edge~$uv$ and return a~unique identifier + of this edge. +\:$\(e)$ --- Split a~tree by removing the edge~$e$ given by its identifier. +\:$\(v)$ --- Return the root of the ET-tree containing the vertex~$v$. This can be used + to test whether two vertices lie in the same tree. However, the root is not guaranteed + to stay the same when the tree is modified by a~\ or \. +\:$\(v,e)$ --- Add a~non-tree edge~$e$ to the list at~$v$ and return a~unique + identifier of this edge. +\:$\(e)$ --- Delete a~non-tree edge~$e$ given by its identifier. +\:$\(v)$ --- Return non-tree edges stored in the same tree as~$v$. +\endlist + +\impl +We will implement the operations on the ET-trees by translating the intended changes of the +ET-sequences to operations on the $(a,b)$-trees. The role of identifiers of the original vertices +and edges will be of course played by pointers to the respective leaves and internal keys of +the $(a,b)$-trees. + +\ of an~edge splits the $(a,b)$-tree at both internal keys representing the given edge +and joins them back in the different order. + +\ of two trees can be accomplished by making both vertices the roots of their trees first +and joining the roots by an~edge afterwards. Re-rooting again involves splits and joins of $(a,b)$-trees. +As we can split at any occurrence of the new root vertex, we will use the active occurrence +which we remember. Joining of the roots translated to joining of the $(a,b)$-trees. + +\ just follows parent pointers in the $(a,b)$-tree and it walks the path from the leaf +to the root. + +\ finds the leaf $\(v)$ containing the list of $v$'s non-tree edges +and inserts the new edge there. The returned identifier will consist from the pointer to +the edge and the vertex in whose list it is stored. Then we have to recalculate the markers +on the path from $\(v)$ to the root. \ is analogous. Whenever any +other operation changes a~vertex of the tree, it will also update its marker and, if necessary, +the markers on the path to the root. + +\ traverses the tree recursively from the root, but it does not enter the +subtrees whose roots are not marked. + +Analysis of the time complexity is now straightforward: + +\thmn{Eulerian Tour trees, Henzinger and Rauch \cite{henzinger:randdyn}} +The ET-trees perform the operations \ and \ in time $\O(a\log_a n)$, \ +in $\O(1)$, \, \, and \ in $\O(\log_a n)$, and +\ in $\O(a\log_a n)$ per edge reported. Here $n$~is the number of vertices +in the original forest and $a\ge 2$ is an~arbitrary constant. + +\proof +We set $b=2a$. Our implementation performs $\O(1)$ operations on the $(a,b)$-trees +per operation on the ET-tree, plus $\O(1)$ other operations. We apply the standard theorems +on the complexity of $(a,b)$-trees \cite{clrs}. +\qed + +\examplen{Connectivity acceleration} +In most cases, the ET-trees are used with $a$~constant, but sometimes choosing~$a$ as a~function +of~$n$ can also have its beauty. Suppose that there is a~data structure which maintains an~arbitrary +spanning forest of a~dynamic graph. Suppose also that the structure works in time $\O(\log^k n)$ +per operation and that it reports $\O(1)$ changes in the spanning forest for every change +in the graph. If we keep the spanning forest in ET-trees with $a=\log n$, the updates of the +data structure cost an~extra $\O(\log^2 n / \log\log n)$, but queries accelerate to $\O(\log +n/\log\log n)$. \endpart -- 2.39.2