From: Martin Mares
Date: Tue, 3 Jun 2008 20:34:58 +0000 (+0200)
Subject: Abstracted the chapter on ranks.
X-Git-Tag: phd-final~17
X-Git-Url: http://mj.ucw.cz/gitweb/?a=commitdiff_plain;h=4e6cc98bb3be0f4de2a15e5f1e8729c39069f90b;p=saga.git

Abstracted the chapter on ranks.
---

diff --git a/abstract.tex b/abstract.tex
index 575371b..06184e1 100644
--- a/abstract.tex
+++ b/abstract.tex
@@ -1152,9 +1152,7 @@ on both~$G_1$ and~$G_2$ and find~$T_3$ again in time $\O(m)$.
 \paran{Further spanning trees}%
 The construction of auxiliary graphs can be iterated to obtain $T_1,\ldots,T_K$
 for an~arbitrary~$K$. We will build a~\df{meta-tree} of auxiliary graphs. Each node of this meta-tree
-carries a~graph\foot{This graph is always derived from~$G$ by a~sequence of edge deletions
-and contractions. It is tempting to say that it is a~minor of~$G$, but this is not true as we
-preserve multiple edges.} and its minimum spanning tree. The root node contains~$(G,T_1)$,
+carries a~graph and its minimum spanning tree. The root node contains~$(G,T_1)$,
 its sons have $(G_1,T_1/e)$ and $(G_2,T_2)$. When $T_3$ is obtained by an~exchange
 in one of these sons, we attach two new leaves to that son and we let them carry the
 two auxiliary graphs derived by contracting or deleting the exchanged edge. Then we find the best
@@ -1177,9 +1175,277 @@ When we combine this with the previous construction, we get the following theore
 For a~given graph~$G$ with real edge weights and a~positive integer~$K$, the $K$~best
 spanning trees can be found in time $\O(m\times\alpha(m,n) + \min(K^2,Km + K\log K))$.
 
+\chapter{Ranking Combinatorial Structures}\id{rankchap}%
+
+\section{Ranking and unranking}\id{ranksect}%
+
+The techniques for building efficient data structures on the RAM, which we have described
+in Section~\ref{ramds}, can also be used for a~variety of problems related
+to ranking of combinatorial structures. Generally, the problems are stated
+in the following way:
+
+\defn\id{rankdef}%
+Let~$C$ be a~set of objects and~$\prec$ a~linear order on~$C$. The \df{rank}
+$R_{C,\prec}(x)$ of an~element $x\in C$ is the number of elements $y\in C$ such that $y\prec x$.
+We will call the function $R_{C,\prec}$ the \df{ranking function} for $C$ ordered by~$\prec$
+and its inverse $R^{-1}_{C,\prec}$ the \df{unranking function} for $C$ and~$\prec$. When the set
+and the order are clear from the context, we will use plain~$R(x)$ and $R^{-1}(x)$.
+Also, when $\prec$ is defined on a~superset~$C'$ of~$C$, we naturally extend $R_C(x)$
+to elements $x\in C'\setminus C$.
+
+\example
+Let us consider the set $C_k=\{\0,\1\}^k$ of all binary strings of length~$k$ ordered
+lexicographically. Then $R^{-1}(i)$ is the $i$-th smallest element of this set, that
+is the number~$i$ written in binary and padded to~$k$ digits (i.e., $\(i)_k$ in the
+notation of Section~\ref{ramds}). Obviously, $R(x)$ is the integer whose binary
+representation is the string~$x$.
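To make the example concrete, here is a~minimal Python sketch of both functions
for~$C_k$; the names {\tt rank} and {\tt unrank} are ours and stand for $R$ and
$R^{-1}$ restricted to strings of length~$k$.

# Rank and unrank of binary strings of length k in lexicographic order:
# R(x) is the integer whose binary representation is x, and R^{-1}(i) is
# the number i written in binary and padded to k digits.

def rank(x: str) -> int:
    return int(x, 2)

def unrank(i: int, k: int) -> str:
    return f"{i:0{k}b}"

assert rank("0110") == 6
assert unrank(6, 4) == "0110"
assert all(rank(unrank(i, 4)) == i for i in range(16))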
+
 %--------------------------------------------------------------------------------
-\chapter{Ranking Combinatorial Structures}\id{rankchap}%
+\section{Ranking of permutations}
+\id{pranksect}
+
+One of the most common ranking problems is ranking of permutations on the set~$[n]=\{1,2,\ldots,n\}$.
+This is frequently used to create arrays indexed by permutations: for example in Ruskey's algorithm
+for finding Hamilton cycles in Cayley graphs (see~\cite{ruskey:ham} and \cite{ruskey:hce})
+or when exploring state spaces of combinatorial puzzles like Loyd's Fifteen \cite{ss:fifteen}.
+Many other applications are surveyed by Critani et al.~\cite{critani:rau} and in
+most cases, the time complexity of the whole algorithm is limited by the efficiency
+of the (un)ranking functions.
+
+The permutations are usually ranked according to their lexicographic order.
+In fact, an~arbitrary order is often sufficient if the ranks are used solely
+for indexing of arrays. The lexicographic order, however, has the additional advantage
+of a~nice structure, which allows various operations on permutations to be
+performed directly on their ranks.
+
+Na\"\i{}ve algorithms for lexicographic ranking require time $\Theta(n^2)$ in the
+worst case \cite{reingold:catp} and even on average~\cite{liehe:raulow}.
+This can be easily improved to $\O(n\log n)$ by using a~binary search
+tree to calculate inversions, by a~divide-and-conquer technique, or by clever
+use of modular arithmetic (all three algorithms are described in Knuth
+\cite{knuth:sas}). Myrvold and Ruskey \cite{myrvold:rank} mention further
+improvements to $\O(n\log n/\log \log n)$ by using the RAM data structures of Dietz
+\cite{dietz:oal}.
+
+Linear time complexity was reached by Myrvold and Ruskey \cite{myrvold:rank}
+for a~non-lexicographic order, which is defined locally by the history of the
+data structure --- in fact, they introduce a~linear-time unranking algorithm
+first and then derive the inverse ranking algorithm from it, without describing the order
+explicitly. However, they leave the problem of lexicographic ranking open.
+
+We will describe a~general procedure which, when combined with suitable
+RAM data structures, yields a~linear-time algorithm for lexicographic
+(un)ranking.
+
+\nota\id{brackets}%
+We will view permutations on a~finite set $A\subseteq {\bb N}$ as ordered $\vert A\vert$-tuples
+(in other words, arrays) containing every element of~$A$ exactly once. We will
+use square brackets to index these tuples: $\pi=(\pi[1],\ldots,\pi[\vert A\vert])$,
+and sub-tuples: $\pi[i\ldots j] = (\pi[i],\pi[i+1],\ldots,\pi[j])$.
+The lexicographic ranking and unranking functions for the permutations on~$A$
+will be denoted by~$L(\pi,A)$ and $L^{-1}(i,A)$ respectively.
+
+\obs\id{permrec}%
+Let us first observe that permutations have a~simple recursive structure.
+If we fix the first element $\pi[1]$ of a~permutation~$\pi$ on the set~$[n]$, the
+elements $\pi[2], \ldots, \pi[n]$ form a~permutation on $[n]-\{\pi[1]\} = \{1,\ldots,\pi[1]-1,\pi[1]+1,\ldots,n\}$.
+The lexicographic order of two permutations $\pi$ and~$\pi'$ on the original set is then determined
+by $\pi[1]$ and $\pi'[1]$; only if these elements are equal is the order decided
+by the lexicographic comparison of the permutations $\pi[2\ldots n]$ and $\pi'[2\ldots n]$.
+Moreover, when we fix $\pi[1]$, all permutations on the smaller set occur exactly
+once, so the rank of $\pi$ is $(\pi[1]-1)\cdot (n-1)!$ plus the rank of
+$\pi[2\ldots n]$.
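\example
For a~worked instance of this recursion, take $n=3$ and $\pi=(3,1,2)$. The permutations
preceding~$\pi$ in the lexicographic order are exactly the four permutations that begin
with 1 or~2, and the recursion indeed yields
$$
(\pi[1]-1)\cdot(n-1)! + L((1,2),\{1,2\}) = 2\cdot 2! + 0 = 4.
$$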
+
+This gives us a~reduction from (un)ranking of permutations on $[n]$ to (un)rank\-ing
+of permutations on an~$(n-1)$-element set, which suggests a~straightforward
+algorithm, but unfortunately this set is different from $[n-1]$ and it even
+depends on the value of~$\pi[1]$. We could renumber the elements to get $[n-1]$,
+but it would require linear time per iteration. To avoid this, we generalize the
+problem to permutations on subsets of $[n]$. For a~permutation $\pi$ on a~set
+$A\subseteq [n]$ of size~$m$, similar reasoning gives a~simple formula:
+$$
+L((\pi[1],\ldots,\pi[m]),A) = R_A(\pi[1]) \cdot (m-1)! +
+L((\pi[2],\ldots,\pi[m]), A\setminus\{\pi[1]\}),
+$$
+which uses the ranking function~$R_A$ for~$A$. This recursive formula immediately
+translates to the following recursive algorithms for both ranking and unranking
+(described for example in \cite{knuth:sas}):
+
+\alg $\<Rank>(\pi,i,n,A)$: Compute the rank of a~permutation $\pi[i\ldots n]$ on~$A$.
+\id{rankalg}
+\algo
+\:If $i\ge n$, return~0.
+\:$a\=R_A(\pi[i])$.
+\:$b\=\<Rank>(\pi,i+1,n,A \setminus \{\pi[i]\})$.
+\:Return $a\cdot(n-i)! + b$.
+\endalgo
+
+\>We can call $\<Rank>(\pi,1,n,[n])$ for ranking on~$[n]$, i.e., to calculate
+$L(\pi,[n])$.
+
+\alg $\<Unrank>(j,i,n,A)$: Return an~array~$\pi$ such that $\pi[i\ldots n]$ is the $j$-th permutation on~$A$.
+\id{unrankalg}
+\algo
+\:If $i>n$, return $(0,\ldots,0)$.
+\:$x\=R^{-1}_A(\lfloor j/(n-i)! \rfloor)$.
+\:$\pi\=\<Unrank>(j\bmod (n-i)!,i+1,n,A\setminus \{x\})$.
+\:$\pi[i]\=x$.
+\:Return~$\pi$.
+\endalgo
+
+\>We can call $\<Unrank>(j,1,n,[n])$ for the unranking problem on~$[n]$, i.e., to calculate $L^{-1}(j,[n])$.
+
+\paran{Representation of sets}%
+The most time-consuming parts of the above algorithms are of course the operations
+on the set~$A$. If we store~$A$ in a~data structure of a~known time complexity, the complexity
+of the whole algorithm is easy to calculate:
+
+\lemma\id{ranklemma}%
+Suppose that there is a~data structure maintaining a~subset of~$[n]$ under a~sequence
+of deletions, which supports ranking and unranking of elements, and that
+the time complexity of a~single operation is at most~$t(n)$.
+Then lexicographic ranking and unranking of permutations can be performed in time $\O(n\cdot t(n))$.
+
+If we store~$A$ in an~ordinary array, we have insertion and deletion in constant time,
+but ranking and unranking in~$\O(n)$, so $t(n)=\O(n)$ and the algorithm is quadratic.
+Binary search trees give $t(n)=\O(\log n)$. The data structure of Dietz \cite{dietz:oal}
+improves it to $t(n)=\O(\log n/\log \log n)$. In fact, all these variants are equivalent
+to the classical algorithms based on inversion vectors, because at the time of processing~$\pi[i]$,
+the value of $R_A(\pi[i])$ is exactly the number of elements forming inversions with~$\pi[i]$.
+
+To obtain linear time complexity, we will make use of the representation of
+vectors by integers on the RAM as developed in Section~\ref{ramds}. We observe
+that since the words of the RAM need to be able to hold integers as large as~$n!$,
+the word size must be at least $\log n! = \Theta(n\log n)$. Therefore the whole
+set~$A$ fits in~$\O(1)$ words and we get:
+
+\thmn{Lexicographic ranking of permutations}
+When we order the permutations on the set~$[n]$ lexicographically, both ranking
+and unranking can be performed on the RAM in time~$\O(n)$.
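The following Python sketch is a~direct transcription of the two algorithms above.
It keeps the set~$A$ as a~sorted list, so a~single $R_A$ query or deletion costs $\O(n)$
and the whole computation is quadratic, matching the array-based variant discussed
after the lemma rather than the $\O(n)$ word-RAM version; the function names are ours.

from math import factorial

# Transcription of the Rank/Unrank recursion, written iteratively.
# A is a sorted Python list, so R_A and its inverse cost O(n) per step
# and ranking or unranking a whole permutation costs O(n^2).

def perm_rank(pi):
    """Lexicographic rank L(pi, A) of a permutation pi of the set A = set(pi)."""
    A = sorted(pi)
    r = 0
    for i, x in enumerate(pi):
        r += A.index(x) * factorial(len(pi) - i - 1)   # R_A(pi[i]) * (n-i)!
        A.remove(x)                                    # A := A minus {pi[i]}
    return r

def perm_unrank(j, A):
    """The permutation of rank j (0-based) on the set A, i.e. L^{-1}(j, A)."""
    A = sorted(A)
    pi = []
    while A:
        f = factorial(len(A) - 1)
        x = A[j // f]              # R_A^{-1} of floor(j / (n-i)!)
        pi.append(x)
        A.remove(x)
        j %= f
    return pi

assert perm_rank([3, 1, 2]) == 4
assert perm_unrank(4, [1, 2, 3]) == [3, 1, 2]
assert all(perm_rank(perm_unrank(j, [1, 2, 3, 4])) == j for j in range(24))

Replacing the sorted list by the vector representation of Section~\ref{ramds} is what
brings the cost of a~single step down to a~constant, which gives the theorem above.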
+
+\paran{The case of $k$-permutations}%
+Our algorithm can also be generalized to lexicographic ranking of
+\df{$k$-permutations,} that is, of ordered $k$-tuples of distinct elements drawn from the set~$[n]$.
+There are $n^{\underline k} = n\cdot(n-1)\cdot\ldots\cdot(n-k+1)$
+such $k$-permutations and they have a~recursive structure similar to that of
+the permutations.
+Unfortunately, the ranks of $k$-permutations can be much smaller than~$n!$, so we can no
+longer rely on the same data structure fitting in a~constant number of word-sized integers.
+For example, if $k=1$, the ranks are $\O(\log n)$-bit numbers, but the data
+structure still requires $\Theta(n\log n)$ bits.
+
+We take a~minor detour and remember the complement of~$A$ instead, that is
+the set of the at most~$k$ elements we have already seen. We will call this set~$H$
+(because it describes the ``holes'' in~$A$). Since $\Omega(k\log n)$ bits are needed
+to represent the rank, the vector representation of~$H$ certainly fits in a~constant
+number of words. When we translate the operations on~$A$ to operations on~$H$,
+again stored as a~vector, we get:
+
+\thmn{Lexicographic ranking of $k$-permutations}
+When we order the $k$-per\-mu\-ta\-tions on the set~$[n]$ lexicographically, both
+ranking and unranking can be performed on the RAM in time~$\O(k)$.
+
+\section{Restricted permutations}
+
+Another interesting class of combinatorial objects that can be counted and
+ranked are restricted permutations. The archetypal members of this class are
+the permutations without a~fixed point, i.e., permutations~$\pi$ such that $\pi(i)\ne i$
+for all~$i$. These are also called \df{derangements} or \df{hatcheck permutations.}
+We will present a~general (un)ranking method for any class of restricted
+permutations and derive a~linear-time algorithm for the derangements from it.
+
+\defn\id{permnota}%
+We will fix a~non-negative integer~$n$ and use ${\cal P}$ for the set of
+all~permutations on~$[n]$.
+A~\df{restriction graph} is a~bipartite graph~$G$ whose parts are two copies
+of the set~$[n]$. A~permutation $\pi\in{\cal P}$ satisfies the restrictions
+if $(i,\pi(i))$ is an~edge of~$G$ for every~$i$.
+
+\paran{Equivalent formulations}%
+We will follow the path trodden by Kaplansky and Riordan
+\cite{kaplansky:rooks} and charted by Stanley in \cite{stanley:econe}.
+We will relate restricted permutations to placements of non-attacking
+rooks on a~hollow chessboard.
+
+\defn
+\itemize\ibull
+\:A~\df{board} is the grid $B=[n]\times [n]$. It consists of $n^2$ \df{squares.}
+\:A~\df{trace} of a~permutation $\pi\in{\cal P}$ is the set of squares \hbox{$T(\pi)=\{ (i,\pi(i)) ; i\in[n] \}$. \hskip-4em} %%HACK
+\endlist
+
+\obs\id{rooksobs}%
+The traces of permutations (and thus the permutations themselves) correspond
+exactly to placements of $n$ rooks on the board in a~way such that the rooks do
+not attack each other (i.e., there is at most one rook in every row and
+likewise in every column; as there are $n$~rooks, there must be exactly one of them in
+every row and column). When speaking about \df{rook placements,} we will always
+mean non-attacking placements.
+
+Restricted permutations then correspond to placements of rooks on a~board with
+some of the squares removed. The \df{holes} (missing squares) correspond to the
+non-edges of~$G$, so $\pi\in{\cal P}$ satisfies the restrictions iff
+$T(\pi)$ avoids the holes.
+
+Placements of~$n$ rooks (and therefore also restricted permutations) can also be
+equated with perfect matchings in the restriction graph~$G$. The edges
+of the matching correspond to the squares occupied by the rooks, the condition
+that no two rooks share a~row or a~column translates to the edges not touching
+each other, and the use of exactly~$n$ rooks is equivalent to the matching
+being perfect.
+
+There is also a~well-known correspondence between the perfect matchings
+in a~bipartite graph and the non-zero summands in the formula for the permanent
+of the bipartite adjacency matrix~$M$ of the graph. This holds because the
+non-zero summands are in one-to-one correspondence with the placements
+of~$n$ rooks on the corresponding board. The number of restricted
+permutations is therefore equal to the permanent of the matrix~$M$.
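As an~illustration of this correspondence, the following Python sketch counts the
permutations satisfying a~restriction graph by expanding the permanent of the zero-one
matrix~$M$ along its first row. It is exponential in general and is meant only to show
the equivalence; the function name is ours, and the demo matrix encodes the derangements
on~$[4]$.

# Count restricted permutations as the permanent of the 0-1 bipartite
# adjacency matrix M: M[i][j] == 1 iff the permutation may map i+1 to j+1.
# Naive expansion along the first row -- exponential, for illustration only.

def permanent(M):
    if not M:
        return 1
    total = 0
    first, rest = M[0], M[1:]
    for j, entry in enumerate(first):
        if entry:
            # delete row 0 and column j, recurse on the smaller board
            minor = [row[:j] + row[j + 1:] for row in rest]
            total += permanent(minor)
    return total

# Derangements of [4]: the holes are the diagonal squares, i.e. M[i][i] = 0.
n = 4
M = [[0 if i == j else 1 for j in range(n)] for i in range(n)]
assert permanent(M) == 9   # there are exactly 9 derangements of a 4-element set

The ranking algorithm described below needs exactly such permanents of matrices obtained
from~$M$ by deleting rows and columns; the time to compute them is the quantity~$t(n)$
in the theorem on restricted permutations.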
+
+The diversity of the characterizations of restricted permutations brings
+both good and bad news. The good news is that we can use the
+plethora of known results on bipartite matchings. Most importantly, we can efficiently
+determine whether there exists at least one permutation satisfying a~given set of restrictions:
+
+\thm
+There is an~algorithm which decides in time $\O(n^{1/2}\cdot m)$ whether there exists
+a~permutation satisfying a~given restriction graph. Here $n$ and~$m$ denote the number
+of vertices and edges of the restriction graph.
+
+The bad news is that computing the permanent is known to be~$\#\rm P$-complete even
+for zero-one matrices (as proven by Valiant \cite{valiant:permanent}).
+Since a~ranking function for a~set of~matchings can be used to count all such
+matchings, we obtain the following theorem:
+
+\thm\id{pcomplete}%
+If there is a~polynomial-time algorithm for lexicographic ranking of permutations with
+a~set of restrictions which is a~part of the input, then $\rm P=\#P$.
+
+However, the hardness of computing the permanent is the only obstacle.
+We show that whenever we are given a~set of restrictions for which
+the counting problem is easy (and it is also easy for subgraphs obtained
+by deleting vertices), ranking is easy as well. The key will be once again
+a~recursive structure, similar to the one we have seen in the case of plain
+permutations in \ref{permrec}. We get:
+
+\thmn{Lexicographic ranking of restricted permutations}
+Suppose that we have a~family of matrices ${\cal M}=\{M_1,M_2,\ldots\}$ such that $M_n\in \{0,1\}^{n\times n}$
+and it is possible to calculate the permanent of~$M'$ in time $\O(t(n))$ for every matrix $M'$
+obtained by deletion of rows and columns from~$M_n$. Then there exist algorithms
+for ranking and unranking in ${\cal P}_{A,M_n}$ (the set of permutations on an~$n$-element set~$A$
+satisfying the restrictions given by~$M_n$) in time $\O(n^4 + n^2\cdot t(n))$
+if $M_n$ and the set~$A$ are given as a~part of the input.
+
+Our time bound for ranking of general restricted permutations is obviously very coarse.
+Its main purpose was to demonstrate that many special cases of the ranking problem can indeed be computed in polynomial time.
+For most families of restriction matrices, we can do much better. These speedups are hard to state formally
+in general (they depend on the structure of the matrices), but we demonstrate them on the
+specific case of derangements. We show that each matrix can be sufficiently characterized
+by two numbers: the order of the matrix and the number of zeroes in it. We find a~recurrence
+for the permanent, based on these parameters, which we use to precalculate all
+permanents in advance. When we plug it into the general algorithm, we get:
+
+\thmn{Ranking of derangements}%
+For every~$n$, the derangements on the set~$[n]$ can be ranked and unranked according to the
+lexicographic order in time~$\O(n)$ after spending $\O(n^2)$ time on the initialization of auxiliary tables.
 
 \chapter{Bibliography}