Cleaned up topological computations and unification.

author Martin Mares <mj@ucw.cz>

Mon, 7 Apr 2008 14:38:20 +0000 (16:38 +0200)

committer Martin Mares <mj@ucw.cz>

Mon, 7 Apr 2008 14:38:20 +0000 (16:38 +0200)
author Martin Mares <mj@ucw.cz>
Mon, 7 Apr 2008 14:38:20 +0000 (16:38 +0200)
committer Martin Mares <mj@ucw.cz>
Mon, 7 Apr 2008 14:38:20 +0000 (16:38 +0200)
diff --git a/PLAN b/PLAN

index 74de0983f0537df4838ccd0bb32cfed0440bb3b0..342bf8173a96ad7f8c1666229535ff1d039d3f54 100644 (file)
--- a/PLAN
+++ b/PLAN
@@ -98,6 +98,7 @@ Notation:
  - change the notation for contractions -- use double slash instead of the dot?
  - introduce \widehat\O early
  - unify { x ; ... }, { x | ...} and { x : ... }
+- capitalize Pointer Machine
  
  Varia:
  
diff --git a/ram.tex b/ram.tex

index af55492abec14903bc412626abf0502e02065959..6691434d1da432ecb9d1e5c89ae4c2216b46e029 100644 (file)
--- a/ram.tex
+++ b/ram.tex
@@ -304,6 +304,13 @@ Each such structure will be endowed with a~pointer to the head of the list of it
  the corresponding bucket. Inserting an~edge to a~bucket can be then done in constant time
  and scanning the contents of all~$n$ buckets takes $\O(n+m)$ time.
  
+At last, we must not forget that while it was easy to \df{normalize} the pairs on the RAM
+by putting the smaller identifier first, this fails on the PM because we can directly
+compare the identifiers only for equality. We can work around this again by bucket-sorting:
+we sort the multiset $\{ (x,i) \mid \hbox{$x$~occurs in the $i$-th pair} \}$ on~$x$.
+Then we reset all pairs and re-insert the values back in their increasing order.
+This is also $\O(n+m)$.
+
  \paran{Tree isomorphism}%
  Another nice example of pointer-based radix sorting is a~Pointer machine algorithm for
  deciding whether two rooted trees are isomorphic. Let us assume for a~moment that
@@ -329,11 +336,8 @@ reordering of elements. We therefore sort the codes inside each tuple and then
  sort the tuples, which brings the equivalent tuples together.
  
  The first sort (inside the tuples) would be easy on the RAM, but on the PM we
-have no means of comparing two identifiers for anything else than equality.
-To work around this, we sort the set $\{ (x,i,j) \mid \hbox{$x$ is the $i$-th
-element of the $j$-th tuple} \}$ on~$x$, reset all tuples and insert the elements
-back in the increasing order of~$x$, ignoring the original positions. The second
-sort is a~straightforward $k$-pass bucket sort.
+have to use the normalization trick mentioned above. The second sort is
+a~straightforward $k$-pass bucket sort.
  
  If we are not careful, a~single sorting pass takes $\O(n_d + n)$ time, because
  while we have only $n_d$~items to sort, we have to scan all $n$~buckets. This can
@@ -356,7 +360,7 @@ in time $\O(\sum_d (n_d + n_{d+1})) = \O(n)$.
  The unification of sequences by bucket sorting will be useful in many
  other situations, so we will state it as a~separate lemma:
  
-\lemman{Unification of sequences}\id{uniflemma}%
+\lemman{Sequence unification}\id{suniflemma}%
  Partitioning of a~collection of sequences $S_1,\ldots,S_n$, whose elements are
  arbitrary pointers and symbols from a~finite alphabet, to equality classes can
  be performed on the Pointer machine in time $\O(n + \sum_i \vert S_i \vert)$.
@@ -436,18 +440,27 @@ by the edges' labels. Finally we append a~special terminator to mark the boundar
  between the code of this vertex and its successor.
  
  \obs
-The canonical encoding is well defined in the sense that non-isomorphic graphs always
+The canonical encoding is well defined in the sense that non-iso\-morphic graphs always
  receive different encodings. Obviously, encodings of isomorphic graphs can differ,
  depending on the order of vertices and also of the adjacency lists. A~graph
  on~$n$ vertices with $m$~edges is assigned an~encoding of length at most $2n+2m$ ---
  for each vertex, we record its label and a~single terminator; edges contribute
  by identifiers and labels. These encodings can be constructed in linear time and
-we will use them for our unification of graphs:
+in the same time we can also create a~graph corresponding to any encoding.
+We will use the encodings for our unification of graphs:
+
+\defn
+For a~collection~$\C$ of graphs, we define $\vert\C\vert$ as the number of graphs in
+the collection and $\Vert\C\Vert$ as their total size, i.e., $\Vert\C\Vert = \sum_{G\in\C} n(G) + m(G)$.
  
-\lemman{Unification of graphs}\id{uniflemma}%
-A~collection~$\C$ of labeled graphs can be partitioned into classes which
-share the same canonical encoding in time $\O(\Vert\C\Vert)$, where $\Vert\C\Vert$
-is the total size of the collection, i.e., $\sum_{G\in\C} n(G) + m(G)$.
+\lemman{Graph unification}\id{guniflemma}%
+A~collection~$\C$ of labeled graphs can be partitioned into classes which share the same
+canonical encoding in time $\O(\Vert\C\Vert)$ on the Pointer machine.
+
+\proof
+Construct canonical encodings of all the graphs and then apply the Sequence unification lemma
+(\ref{suniflemma}) on them.
+\qed
  
  \para
  When we want to perform a~topological computation on a~collection~$\C$ of graphs
@@ -467,23 +480,24 @@ A~graph on~$k$ vertices has less than~$k^2/2$ edges, so the canonical encodings
  all such graphs are shorter than $2k + 2k^2/2 = k(k+2)$. Each element of the encoding
  is either a~vertex identifier, or a~symbol, or a~separator, so it can attain at most $k+s$
  possible values for some fixed~$s$.
-We can therefore enumerate all possible encodings, convert them to a~collection $\cal G$
-of all generic graphs and run the computation on all of them in time $\O(\vert{\cal G}\vert \cdot T(k))
-= \O((k+s)^{k(k+2)}\cdot T(k))$.
+We can therefore enumerate all possible encodings and convert them to a~collection $\cal G$
+of all generic graphs such that $\vert{\cal G}\vert \le (k+s)^{k(k+2)}$ and $\Vert{\cal G}\Vert
+\le \vert{\cal G}\vert \cdot k^2$.
  
-Then we use the Unification lemma (\ref{uniflemma}) on the union of the collections
+We run the computation on all generic graphs in time $\O(\vert{\cal G}\vert \cdot T(k))$
+and then we use the Unification lemma (\ref{guniflemma}) on the union of the collections
  $\C$ and~$\cal G$ to match the generic graphs with the equivalent actual graphs in~$\C$
-in time $\O(\Vert\C\Vert + \Vert{\cal G}\Vert) = \O(\Vert\C\Vert + \vert{\cal G}\vert \cdot k^2)$.
+in time $\O(\Vert\C\Vert + \Vert{\cal G}\Vert)$.
  Finally we create a~copy of the generic result for each of the actual graphs.
  If the computation uses pointers to the input vertices in its output, we have to
-redirect them to the actual input vertices, but we can do that by associating
+redirect them to the actual input vertices, which we can do by associating
  the output vertices that refer to an~input vertex with the corresponding places
  in the encoding of the input graph. This way, the whole output can be generated in time
  $\O(\Vert\C\Vert + \Vert{\cal G}\Vert)$.
  \qed
  
  \rem
-The topological computations and the Unification lemma will play important
+The topological computations and the Graph unification lemma will play important
  roles in Sections \ref{verifysect} and \ref{optalgsect}.
  
  %--------------------------------------------------------------------------------
author	Martin Mares <mj@ucw.cz>
	Mon, 7 Apr 2008 14:38:20 +0000 (16:38 +0200)
committer	Martin Mares <mj@ucw.cz>
	Mon, 7 Apr 2008 14:38:20 +0000 (16:38 +0200)