Analysis of ET-trees.

author Martin Mares <mj@ucw.cz>

Fri, 11 Apr 2008 11:00:29 +0000 (13:00 +0200)

committer Martin Mares <mj@ucw.cz>

Fri, 11 Apr 2008 11:00:29 +0000 (13:00 +0200)
author Martin Mares <mj@ucw.cz>
Fri, 11 Apr 2008 11:00:29 +0000 (13:00 +0200)
committer Martin Mares <mj@ucw.cz>
Fri, 11 Apr 2008 11:00:29 +0000 (13:00 +0200)
diff --git a/PLAN b/PLAN

index 02e8af206f40fa59ae90008cb53e7dcadd05a20b..11f71acdc022bdceff15f336586448ce689a6f54 100644 (file)
--- a/PLAN
+++ b/PLAN
@@ -78,6 +78,7 @@ Models:
  - Tarjan79 is mentioned by Pettie to define Pointer machines
  - add references to the C language
  - PM: unify yardsticks
+- all data structures should mention space complexity
  
  Dynamic:
  
diff --git a/dyn.tex b/dyn.tex

index bdaf64ee068e87f108affae32887dc2c02c3ab65..967c86675473d9c19a272b648cac1492b6174fc8 100644 (file)
--- a/dyn.tex
+++ b/dyn.tex
@@ -190,9 +190,8 @@ we will take a~look on them first.
  An~important stop on the road to fully dynamic algorithms has the name \df{Eulerian Tour trees} or
  simply \df{ET-trees}. It is a~representation of forests introduced by Henzinger and King
  \cite{henzinger:randdyn} in their randomized dynamic algorithms. It is similar to the one by Sleator
-and Tarjan, but without orientation and without the path operations. Their advantage lies in
-simplicity and also in their ease of extension to contain various auxiliary data attached to
-vertices and edges of the original tree.
+and Tarjan, but it is much simpler and instead of path operations it offers efficient operations on
+subtrees. It is also possible to attach auxiliary data to vertices and edges of the original tree.
  
  \defn\id{eulseq}%
  The \df{Eulerian Tour sequence} $\Eul(T)$ of a~rooted tree~$T$ is the sequence of vertices of~$T$ as visited
@@ -204,7 +203,7 @@ when it is invoked on the root of the tree:
  \::Call $\<ET>(w)$ recursively.
  \::Record~$w$.
  \endalgo
-\>One of the occurrences of each vertex is defined as its \df{designated occurrence} and it will
+\>One of the occurrences of each vertex is defined as its \df{active occurrence} and it will
  be used to store auxiliary data associated with the vertex.
  
  \obs
@@ -218,15 +217,15 @@ $\Eul(T) = AuvBvuC$ (with no~$u$ nor~$v$ in~$B$) splits to two ET-sequences $AuC
  If there was only a~single occurrence of~$v$, it corresponded to a~leaf and thus the second
  sequence should consist of $v$~alone.
  
-\em{Changing the root} of the tree~$T$ from~$v$ to~$w$ changes $\Eul(T) = vAwBwCv$ (no~$w$ in~$B$)
-to $wBwCvAw$. If $w$~was a~leaf, the sequence changes from $vAwCv$ to $wCvAw$. If $vw$ was the only
-edge of~$T$, the sequence $vw$ becomes $wv$.
+\em{Changing the root} of the tree~$T$ from~$v$ to~$w$ changes $\Eul(T)$ from $vAwBwCv$ to $wBwCvAw$.
+If $w$~was a~leaf, the sequence changes from $vAwCv$ to $wCvAw$. If $vw$ was the only edge of~$T$,
+the sequence $vw$ becomes $wv$. Note that this works regardless of the presence of~$w$ inside~$B$.
  
  \em{Joining} the roots of two trees by a~new edge makes their ET-sequences $vAv$ and~$wBw$
  combine to $vAvwBwv$. Again, we have to handle the cases when $v$ or~$w$ has degree~1 separately:
  $v$~and~$wBw$ combine to $vwBwv$, and $v$~with~$w$ makes $vwv$.
  
-If any of the occurrences that we have removed from the sequence was designated, there is always
+If any of the occurrences that we have removed from the sequence was active, there is always
  a~new occurrence of the same vertex that can stand in its place and inherit the auxiliary data.
  
  The ET-trees will represent the ET-sequences by $(a,b)$-trees with the parameter~$a$ set upon
@@ -240,17 +239,90 @@ each vertex will hold a~list of edges incident with the given vertex, which do n
  forest. Such edges are usually called the \df{non-tree edges.}
  
  \defn
-An~\df{Eulerian Tour tree} is a~data structure that represents a~tree~$T$ and a~set of non-tree
-edges associated with the vertices of the tree. It contains:
+\df{Eulerian Tour trees} are a~data structure that represents a~forest of trees and a~set of non-tree
+edges associated with the vertices of the forest. To avoid confusion, we will distinguish between
+\df{original} vertices and edges (of the original trees) and the vertices and edges of the
+data structure. The structure consists of:
  \itemize\ibull
-\:An~$(a,b)$-tree~$Q$ whose leaves (in the usual tree order) correspond to the elements
+\:A~collection of $(a,b)$-trees with some fixed parameters $a$ and~$b$.
+       Each such tree corresponds to one of the original trees~$T$. Its
+       leaves (in the usual tree order) correspond to the elements
         of the ET-sequence $\Eul(T)$. Each two consecutive leaves $u$ and~$v$ are separated
-       by a~unique key stored in an~internal vertex of~$Q$ which is used to represent the
-       edge~$uv$. This way, each edge is stored in both its orientations.
-\:A~mapping $\<dsg>(v)$ which maps each vertex of~$T$ to the leaf of~$Q$ containing its
-       designated occurence.
-\:Mappings $\<edge>_1(e)$ and $\<edge>_2(e)$ that map each edge of~$T$ to one of the keys inside~$Q$ 
+       by a~unique key stored in an~internal vertex of the $(a,b)$-tree. This key is used to represent
+       the original edge~$uv$. Each original edge is therefore kept in both its orientations.
+\:A~mapping $\<act>(v)$ that maps each original vertex to the leaf containing its active occurrence.
+\:A~mapping $\<edge>(e)$ that maps an~original edge~$e$ to one of the internal keys representing it.
+\:A~mapping $\<twin>(k)$ that maps an~internal key~$k$ to the other internal key of the same
+       original edge.
+\:A~list of non-tree edges placed in each leaf. The lists are allowed to be non-empty only
+       in the leaves that represent active occurrences of original vertices.
+\:Boolean \df{markers} in the internal vertices that signal presence of a~non-tree
+       edge anywhere in the subtree rooted at that internal vertex.
  \endlist
+\>The structure supports the following operations on the original trees:
+\itemize\ibull
+\:\<Create> --- Create a~single-vertex tree.
+\:$\<Link>(u,v)$ --- Join two different trees by an~edge~$uv$ and return a~unique identifier
+       of this edge.
+\:$\<Cut>(e)$ --- Split a~tree by removing the edge~$e$ given by its identifier.
+\:$\<Root>(v)$ --- Return the root of the ET-tree containing the vertex~$v$. This can be used
+       to test whether two vertices lie in the same tree. However, the root is not guaranteed
+       to stay the same when the tree is modified by a~\<Link> or \<Cut>.
+\:$\<InsertNontree>(v,e)$ --- Add a~non-tree edge~$e$ to the list at~$v$ and return a~unique
+       identifier of this edge.
+\:$\<DeleteNontree>(e)$ --- Delete a~non-tree edge~$e$ given by its identifier.
+\:$\<ScanNontree>(v)$ --- Return non-tree edges stored in the same tree as~$v$.
+\endlist
+
+\impl
+We will implement the operations on the ET-trees by translating the intended changes of the
+ET-sequences to operations on the $(a,b)$-trees. The role of identifiers of the original vertices
+and edges will be of course played by pointers to the respective leaves and internal keys of
+the $(a,b)$-trees.
+
+\<Cut> of an~edge splits the $(a,b)$-tree at both internal keys representing the given edge
+and joins them back in the different order.
+
+\<Link> of two trees can be accomplished by making both vertices the roots of their trees first
+and joining the roots by an~edge afterwards. Re-rooting again involves splits and joins of $(a,b)$-trees.
+As we can split at any occurrence of the new root vertex, we will use the active occurrence
+which we remember. Joining of the roots translated to joining of the $(a,b)$-trees.
+
+\<Root> just follows parent pointers in the $(a,b)$-tree and it walks the path from the leaf
+to the root.
+
+\<InsertNontree> finds the leaf $\<act>(v)$ containing the list of $v$'s non-tree edges
+and inserts the new edge there. The returned identifier will consist from the pointer to
+the edge and the vertex in whose list it is stored. Then we have to recalculate the markers
+on the path from $\<act>(v)$ to the root. \<DeleteNontree> is analogous. Whenever any
+other operation changes a~vertex of the tree, it will also update its marker and, if necessary,
+the markers on the path to the root.
+
+\<ScanNontree> traverses the tree recursively from the root, but it does not enter the
+subtrees whose roots are not marked.
+
+Analysis of the time complexity is now straightforward:
+
+\thmn{Eulerian Tour trees, Henzinger and Rauch \cite{henzinger:randdyn}}
+The ET-trees perform the operations \<Link> and \<Cut> in time $\O(a\log_a n)$, \<Create>
+in $\O(1)$, \<Root>, \<InsertNontree>, and \<DeleteNontree> in $\O(\log_a n)$, and
+\<ScanNontree> in $\O(a\log_a n)$ per edge reported. Here $n$~is the number of vertices
+in the original forest and $a\ge 2$ is an~arbitrary constant.
+
+\proof
+We set $b=2a$. Our implementation performs $\O(1)$ operations on the $(a,b)$-trees
+per operation on the ET-tree, plus $\O(1)$ other operations. We apply the standard theorems
+on the complexity of $(a,b)$-trees \cite{clrs}.
+\qed
+
+\examplen{Connectivity acceleration}
+In most cases, the ET-trees are used with $a$~constant, but sometimes choosing~$a$ as a~function
+of~$n$ can also have its beauty. Suppose that there is a~data structure which maintains an~arbitrary
+spanning forest of a~dynamic graph. Suppose also that the structure works in time $\O(\log^k n)$
+per operation and that it reports $\O(1)$ changes in the spanning forest for every change
+in the graph. If we keep the spanning forest in ET-trees with $a=\log n$, the updates of the
+data structure cost an~extra $\O(\log^2 n / \log\log n)$, but queries accelerate to $\O(\log
+n/\log\log n)$.
  
  
  \endpart
author	Martin Mares <mj@ucw.cz>
	Fri, 11 Apr 2008 11:00:29 +0000 (13:00 +0200)
committer	Martin Mares <mj@ucw.cz>
	Fri, 11 Apr 2008 11:00:29 +0000 (13:00 +0200)