X-Git-Url: http://mj.ucw.cz/gitweb/?a=blobdiff_plain;ds=sidebyside;f=notation.tex;h=e02cccf67e49f23f78e77d15a54e569d54adbcb5;hb=124f193d14bed24539c5cd59407637cc38dbe740;hp=1466c9da9292ede751410cb5e7deab2b3db94fd7;hpb=786b715f6f60744b90f73ad7d283277a40c4aeca;p=saga.git

diff --git a/notation.tex b/notation.tex
index 1466c9d..e02cccf 100644
--- a/notation.tex
+++ b/notation.tex
@@ -2,46 +2,106 @@
 \input macros.tex
 \fi
 
-\chapter{Notation}
+\chapter{Notation}\id{notapp}%
+
+\section{Symbols}
 
 {\obeylines\parskip=0pt
-\def\n#1#2{\>\hbox to 6em{#1 \dotfill} #2}
-\def\[#1]{[\ref{#1}]}
-\n{$\bb R$}{the set of all real numbers}
-\n{$\bb N$}{the set of all natural numbers, including 0}
-\n{${\bb N}^+$}{the set of all positive integers}
+\def\n#1#2{\>\hangindent=6em\hangafter=1 \hbox to 6em{#1 \dotfill~}#2}
+\def\[#1]{~{\it(\ref{#1})}}
+
+\n{$A(x,y)$}{Ackermann's function \[ackerdef]}
+\n{$A(x)$}{diagonal of Ackermann's function \[ackerdef]}
+\n{$\band$}{bitwise conjunction: $(x\band y)[i]=1$ iff $x[i]=1 \land y[i]=1$}
+\n{$C_k$}{cycle on~$k$ vertices}
+\n{${\cal D}(G)$}{optimal MSF decision tree for a~graph~$G$ \[decdef]}
+\n{$D(G)$}{depth of ${\cal D}(G)$ \[decdef]}
+\n{$D(m,n)$}{decision tree complexity of MSF for $m$~edges and $n$~vertices \[decdef]}
+\n{$D_n$}{$n\times n$ matrix with 0's on the main diagonal and 1's elsewhere \[hatrank]}
+\n{$\deg_G(v)$}{degree of vertex~$v$ in graph~$G$; we omit $G$ if it is clear from context}
+\n{$E(G)$}{set of edges of a graph~$G$}
+\n{$E$}{$E(G)$ when the graph~$G$ is clear from context}
+\n{${\E}X$}{expected value of a~random variable~$X$}
+\n{$K_k$}{complete graph on~$k$ vertices}
+\n{$L(\pi,A)$}{lexicographic ranking function for permutations on a~set~$A\subseteq{\bb N}$ \[brackets]}
+\n{$L^{-1}(i,A)$}{lexicographic unranking function, the inverse of~$L$ \[brackets]}
+\n{$\log n$}{binary logarithm of the number~$n$}
+\n{$\log^* n$}{iterated logarithm: $\log^*n := \min\{i \mid \log^{(i)}n \le 1\}$; the inverse of~$2\tower n$}
+\n{$\lsb(x)$}{position of the lowest bit set in~$x$ \[lsbmsb]}
+\n{$\msb(x)$}{position of the highest bit set in~$x$ \[lsbmsb]}
+\n{MSF}{minimum spanning forest \[mstdef]}
+\n{$\msf(G)$}{the unique minimum spanning forest of a graph~$G$ \[mstnota]}
+\n{MST}{minimum spanning tree \[mstdef]}
+\n{$\mst(G)$}{the unique minimum spanning tree of a graph~$G$ \[mstnota]}
+\n{$m(G)$}{number of edges of a graph~$G$, that is $\vert E(G)\vert$}
+\n{$m$}{$m(G)$ when the graph~$G$ is clear from context}
+\n{$\bb N$}{set of all non-negative integers}
+\n{${\bb N}^+$}{set of all positive integers}
+\n{$N_0(M)$}{number of permutations satisfying the restrictions~$M$ \[restnota]}
+\n{$n(G)$}{number of vertices of a graph~$G$, that is $\vert V(G)\vert$}
+\n{$n$}{$n(G)$ when the graph~$G$ is clear from context}
+\n{$\bnot$}{bitwise negation: $(\bnot x)[i]=1-x[i]$}
+\n{$\O(g)$}{asymptotic~$O$: $f=\O(g)$ iff $\exists c,n_0>0: f(n)\le c\cdot g(n)$ for all~$n\ge n_0$}
+\n{$\widetilde\O(g)$}{$f=\widetilde\O(g)$ iff $f=\O(g\cdot\log^{\O(1)} g)$}
+\n{$\bor$}{bitwise disjunction: $(x\bor y)[i]=1$ iff $x[i]=1 \lor y[i]=1$}
+\n{${\cal P}_A$}{set of all permutations on a~set~$A$ \[restnota]}
+\n{${\cal P}_{A,M}$}{set of all permutations on~$A$ satisfying the restrictions~$M$ \[restnota]}
+\n{$\per M$}{permanent of a~square matrix~$M$}
+\n{$\poly(n)$}{$f=\poly(n)$ iff $f=\O(n^c)$ for some $c$}
+\n{${\rm Pr}[\varphi]$}{probability that a predicate~$\varphi$ is true}
+\n{$\bb R$}{set of all real numbers}
+\n{$R_{C,\prec}(x)$}{rank of~$x$ in a~set~$C$ ordered by~$\prec$ \[rankdef]}
+\n{$R^{-1}_{C,\prec}(i)$}{unrank of~$i$: the $i$-th smallest element of a~set~$C$ ordered by~$\prec$ \[rankdef]}
+\n{$V(G)$}{set of vertices of a graph~$G$}
+\n{$V$}{$V(G)$ when the graph~$G$ is clear from context}
+\n{$W$}{word size of the RAM \[wordsize]}
+\n{$w(e)$}{weight of an edge $e$}
+\n{$\bxor$}{bitwise non-equivalence: $(x\bxor y)[i]=1$ iff $x[i]\ne y[i]$}
+
+\n{$\alpha(n)$}{diagonal inverse of Ackermann's function \[ackerinv]}
+\n{$\alpha(m,n)$}{$\alpha(m,n) := \min\{ x\ge 1 \mid A(x,4\lceil m/n\rceil) > \log n \}$ \[ackerinv]}
+\n{$\beta(m,n)$}{$\beta(m,n) := \min\{i \mid \log^{(i)}n \le m/n \}$ \[itjarthm]}
+\n{$\delta_G(U)$}{the cut separating $U\subset V(G)$ from $V(G)\setminus U$ \[deltanota]}
+\n{$\delta_G(v)$}{edges of a one-vertex cut, i.e., $\delta_G(\{v\})$ \[deltanota]}
+\n{$\Theta(g)$}{asymptotic~$\Theta$: $f=\Theta(g)$ iff $f=\O(g)$ and $f=\Omega(g)$}
+\n{$\lambda_i(n)$}{inverse of the $i$-th row of Ackermann's function \[ackerinv]}
+\n{$\varrho({\cal C})$}{edge density of a graph class~$\cal C$ \[density]}
+\n{$\Omega(g)$}{asymptotic~$\Omega$: $f=\Omega(g)$ iff $\exists c,n_0>0: f(n)\ge c\cdot g(n)$ for all~$n\ge n_0$}
+
+%%\n{$x := y$}{$x$ is defined as~$y$}

 \n{$T[u,v]$}{the path in a tree~$T$ joining vertices $u$ and $v$ \[heavy]}
 \n{$T[e]$}{the path in a tree~$T$ joining the endpoints of an~edge~$e$ \[heavy]}
 \n{$A\symdiff B$}{symmetric difference of sets: $(A\setminus B) \cup (B\setminus A)$}
 \n{$G-e$}{graph $G$ with edge $e$ removed}
 \n{$G+e$}{graph $G$ with edge $e$ added}
-\n{$w(e)$}{weight of an edge $e$}
-\n{$V(G)$}{set of vertices of a graph~$G$}
-\n{$E(G)$}{set of edges of a graph~$G$}
-\n{$n(G)$}{number of vertices of a graph~$G$, that is $\vert V(G)\vert$}
-\n{$m(G)$}{number of edges of a graph~$G$, that is $\vert E(G)\vert$}
-\n{$V,E,n,m$}{when used without $(G)$, they refer to the input of the current algorithm}
-\n{$\delta_G(U)$}{all edges connecting $U\subset V(G)$ with $V(G)\setminus U$; we usually omit the~$G$}
-\n{$\delta_G(v)$}{the edges of a one-vertex cut, i.e., $\delta_G(\{v\})$}
-\n{MST}{minimum spanning tree \[mstdef]}
-\n{MSF}{minimum spanning forest \[mstdef]}
-\n{$\mst(G)$}{the unique minimum spanning tree of a graph~$G$ \[mstnota]}
-\n{$X \choose k$}{a set of all $k$-element subsets of a set~$X$}
+\n{$G[U]$}{subgraph induced by a~set $U\subset V(G)$}
+\n{$X \choose k$}{the set of all $k$-element subsets of a set~$X$}
 \n{$G/e$}{multigraph contraction \[contract]}
-\n{$G.e$}{simple graph contraction \[simpcont]}
-\n{$\alpha(n)$}{the inverse Ackermann's function}
-\n{$f[X]$}{function applied to a set: $f[X]:=\{ f(x) ; x\in X \}$}
+\n{$G\sgc e$}{simple graph contraction \[simpcont]}
+\n{$G/X$, $G\sgc X$}{contraction by a~set $X$ of vertices or edges \[setcont]}
+\n{$f[X]$}{function applied to a set: $f[X]:=\{ f(x) \mid x\in X \}$}
 \n{$f[e]$}{as edges are two-element sets, $f[e]$ maps both endpoints of an edge~$e$}
-\n{$\varrho({\cal C})$}{edge density of a graph class~$\cal C$ \[density]}
-\n{$\deg_G(v)$}{degree of vertex~$v$ in graph~$G$; we omit $G$ if it is clear from context}
-\n{${\bb E}X$}{expected value of a~random variable~$X$}
-\n{${\rm Pr}[\varphi]$}{probability that a predicate~$\varphi$ is true}
-\n{$\log n$}{a binary logarithm of the number~$n$}
 \n{$f^{(i)}$}{function~$f$ iterated $i$~times: $f^{(0)}(x):=x$, $f^{(i+1)}(x):=f(f^{(i)}(x))$}
 \n{$2\tower n$}{the tower function (iterated exponential): $2\tower 0:=1$, $2\tower (n+1):=2^{2\tower n}$}
-\n{$\log^* n$}{the iterated logarithm: $\log^*n := \min\{i: \log^{(i)}n < 1\}$; the inverse of~$2\tower n$}
-\n{$\beta(m,n)$}{$\beta(m,n) := \min\{i: \log^{(i)}n < m/n \}$ \[itjarthm]}
-\n{$W$}{word size of the RAM \[wordsize]}
+\n{$\(x)$}{number~$x\in{\bb N}$ written in binary \[bitnota]}
+\n{$\(x)_b$}{$\(x)$ zero-padded to exactly $b$ bits \[bitnota]}
+\n{$x[i]$}{when $x\in{\bb N}$: the value of the $i$-th bit of~$x$ \[bitnota]}
+\n{$x[B]$}{when $x\in{\bb N}$: the values of the bits at positions in the set~$B$ \[qhnota]}
+\n{$\pi[i]$}{when $\pi$ is a~sequence: the $i$-th element of~$\pi$, starting with $\pi[1]$ \[brackets]}
+\n{$\pi[i\ldots j]$}{the subsequence $\pi[i], \pi[i+1], \ldots, \pi[j]$}
+\n{$\sigma^k$}{the string~$\sigma$ repeated $k$~times \[bitnota]}
+\n{$\0$, $\1$}{bits in a~bit string \[bitnota]}
+\n{$\equiv$}{congruence modulo a~given number}
+\n{$\bf x$}{vector with elements $x_1,\ldots,x_d$; $x$ is its bitwise encoding \[vecnota]}
+\n{$x \shl n$}{bitwise shift of~$x$ by $n$~positions to the left: $x\shl n = x\cdot 2^n$}
+\n{$x \shr n$}{bitwise shift of~$x$ by $n$~positions to the right: $x\shr n = \lfloor x/2^n \rfloor$}
+\n{$[n]$}{the set $\{1,2,\ldots,n\}$ \[pranksect]}
+\n{$n^{\underline k}$}{$k$-th falling factorial power: $n\cdot(n-1)\cdot\ldots\cdot(n-k+1)$ \[kpranksect]}
+\n{$H\minorof G$}{$H$ is a~minor of~$G$ \[minordef]}
+\n{$G\crpt R$}{graph~$G$ with edges in~$R$ corrupted \[corrnota]}
+\n{$R^C$}{$R^C = R\cap \delta(C)$ \[corrnota]}
+\n{$M^{i,j}$}{the matrix $M$ with $i$-th row and $j$-th column deleted \[restnota]}
+
 }
 
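+\example
+As an~illustration of the bitwise notation, take $x=13$: we have $\(13)=\1\1\0\1$
+and $\(13)_6=\0\0\1\1\0\1$, hence $13[0]=1$, $13[1]=0$, $\lsb(13)=0$ and $\msb(13)=3$.
+Furthermore, $13\shl 2=13\cdot 2^2=52$ and $13\shr 2=\lfloor 13/4\rfloor=3$.
+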
 %--------------------------------------------------------------------------------
@@ -53,167 +113,144 @@
 better define it carefully, following \cite{diestel:gt}:
 
 \defn
 A~\df{multigraph} is an ordered triple $(V,E,M)$, where $V$~is the set
 of vertices, $E$~is the set of edges, taken as abstract objects disjoint
-with the vertices, and $M$ is a mapping $E\mapsto V \cup {V \choose 2}$
+with the vertices, and $M$ is a mapping $E\rightarrow {V \choose 2} \cup {V \choose 1}$
 which assigns to each edge either a pair of vertices or a single vertex (if the
 edge is a loop).
 
 \proclaim{Notation}%
-When the meaning is clear from the context, we use our notation originally
-defined for graphs even for multigraphs. For example, $xy\in E(G)$ becomes a
+When the meaning is clear from the context, we use the standard graph notation
+even for multigraphs. For example, $xy\in E(G)$ becomes a
 shorthand for $\exists e\in E(G)$ such that $M(G)(e) = \{x,y\}$. Also, we
 consider multigraphs with no multiple edges or loops and simple graphs to be
 the same objects, although they formally differ.
 
 \defn\id{contract}%
-Let $G=(V,E,M)$ be a multigraph and $e=xy$ its edge. \df{(Multigraph) contraction of~$G$ along~$e$}
+Let $G=(V,E,M)$ be a multigraph and $e=xy$ an~arbitrary edge of~$G$.
+The \df{(multigraph) contraction of~$e$ in~$G$}
 produces a multigraph $G/e=(V',E',M')$ such that:
 $$\eqalign{
 V' &= (V(G) \setminus \{x,y\}) \cup \{v_e\},\quad\hbox{where $v_e$ is a new vertex,}\cr
 E' &= E(G) - \{e\},\cr
-M'(f) &= \{ m(v) ; v\in M(f) \} \quad\hbox{for every $f=\in E'$, and}\cr
-m(x) &= \cases{v_e & \hbox{for $v=x,y,$}\cr v & \hbox{otherwise.}} \cr
+M'(f) &= \{ m(v) \mid v\in M(f) \} \quad\hbox{for every $f\in E'$, and}\cr
+m(v) &= \cases{v_e & \hbox{for $v=x,y,$}\cr \noalign{\vskip5pt} v & \hbox{otherwise.}} \cr
 }$$
 
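+\example
+For a~concrete instance of this definition, let $G$~be the multigraph on vertices $\{x,y,z\}$
+with edges $e=xy$, $h=xy$, $f=xz$ and~$g=yz$. In~$G/e$, the new vertex~$v_e$ replaces
+both $x$ and~$y$, so the edge~$h$ turns into a~loop with $M'(h)=\{v_e\}$, while $f$ and~$g$
+become a~pair of parallel edges with $M'(f)=M'(g)=\{v_e,z\}$.
+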
-Sometimes we need contraction for simple graphs as well. It corresponds to performing
-the multigraph contraction, unifying parallel edges and deleting loops.
+We sometimes also need to contract edges in simple graphs. It is equivalent to performing
+the multigraph contraction and then unifying parallel edges and deleting loops.
 
 \defn\id{simpcont}%
-Let $G=(V,E)$ a simple graph and $e=xy$ its edge. \df{(Simple graph) contraction of~$G$ along~$e$}
-produces a graph $G.e=(V',E')$ such that:
+Let $G=(V,E)$ be a simple graph and $e=xy$ an~arbitrary edge of~$G$.
+The \df{(simple graph) contraction of~$e$ in~$G$}
+produces a graph $G\sgc e=(V',E')$ such that:
 $$\eqalign{
 V' &= (V(G) \setminus \{x,y\}) \cup \{v_e\},\quad\hbox{where $v_e$ is a new vertex,}\cr
-E' &= \{ \{m(x),m(y)\} ; xy\in E \land m(x)\ne m(y) \},\cr
-m(x) &= \cases{v_e & \hbox{for $v=x,y,$}\cr v & \hbox{otherwise.}} \cr
+E' &= \{ \{m(x),m(y)\} \mid xy\in E \land m(x)\ne m(y) \},\cr
+m(v) &= \cases{v_e & \hbox{for $v=x,y,$}\cr \noalign{\vskip5pt} v & \hbox{otherwise.}} \cr
 }$$
 
+\defn\id{setcont}%
+We can also extend the above definitions to contractions of a~set of vertices or edges.
+For $F\subseteq E(G)$, the graph $G/F$ is defined as $(G/f_1)/f_2/\ldots/f_k$, where
+$f_1,\ldots,f_k$ are the elements of~$F$ (the result obviously does not depend on the order of the edges).
+For $U\subseteq V(G)$, we define $G/U$
+as the graph with all vertices of~$U$ merged into a~single vertex, that is $(G\cup U^*)/U^*$,
+where $U^*$ is the complete graph on~$U$. Similarly for $G\sgc F$ and $G\sgc U$.
+
 %--------------------------------------------------------------------------------
 
-\section{Models of computation}
-
-Traditionally, computer scientists use a~variety of computational models
-for a~formalism in which their algorithms are stated. If we were studying
-NP-completeness, we could safely assume that all the models are equivalent,
-possibly up to polynomial slowdown, which is negligible. In our case, the
-differences between good and not-so-good algorithms are on a~much smaller
-scale, so we need to replace yardsticks with a~micrometer and define our
-computation model carefully.
-
-We would like to keep the formalism close enough to the reality of the contemporary
-computers. This rules out Turing machines and similar sequentially addressed
-models, but even the remaining models are subtly different. For example, some of them
-allow indexing of arrays in constant time, while the others have no such operation
-and arrays have to be emulated with pointer structures, requiring $\Omega(\log n)$
-time to access a~single element of an~$n$-element array. It is hard to say which
-way is superior --- most ``real'' computers have instructions for constant-time
-indexing, but on the other hand it seems to be physically impossible to fulfil
-this promise with an~arbitrary size of memory. Indeed, at the level of logical
-gates, the depth of the actual indexing circuits is logarithmic.
-
-In recent decades, most researchers in the area of combinatorial algorithms
-were considering two computational models: the Random Access Machine and the Pointer
-Machine. The former one is closer to the programmer's view of a~real computer,
-the latter one is slightly more restricted and ``asymptotically safe.''
-We will follow this practice and study our algorithms in both models.
-
-\para
-The \df{Random Access Machine (RAM)} is not a~single model, but rather a~family
-of closely related models, sharing the following properties.
-(See Cook and Reckhow \cite{cook:ram} for one of the common formal definitions
-and Hagerup \cite{hagerup:wordram} for a~thorough description of the differences
-between the RAM variants.)
- -The \df{memory} of the model is represented by an~array of \df{memory cells} -addressed by non-negative integers, each of them containing a~single non-negative integer. -The \df{program} is a~sequence of \df{instructions} of two basic kinds: calculation -instructions and control instructions. - -\df{Calculation instructions} have two source arguments and one destination -argument, each \df{argument} being either an~immediate constant (not available -as destination), a~directly addressed memory cell (specified by its number) -or an~indirectly addressed memory cell (its address is stored in a~directly -addressed memory cell). - -\df{Control instructions} include branches (to a~specific instruction in -the program), conditional branches (e.g., jump if two arguments specified as -in the calculation instructions are equal) and an~instruction to halt the program. - -At the beginning of the computation, the memory contains the input data -in specified memory cells and arbitrary values in all other cells. -Then the program is executed one instruction at a~time. When it halts, -specified memory cells are interpreted as the program's output. - -\para\id{wordsize}% -In the description of the RAM family, we have omitted several properties -on~purpose, because different members of the family define them differently. -These are: the size of the available integers, the time complexity of a~single -instruction, the space complexity of a~single memory cell and the repertoire -of operations available in calculation instructions. - -If we impose no limits on the magnitude of the numbers and we assume that -arithmetic and logical operations work on them in constant time, we get -a~very powerful parallel computer --- we can emulate an~exponential number -of parallel processors using arithmetics and suddenly almost everything can be -computed in constant time, modulo encoding and decoding of input and output. -Such models are unrealistic and there are two basic possibilities how to -avoid this behavior: - -\numlist\ndotted -\:Keep unbounded numbers, but increase costs of instructions: each instruction - consumes time proportional to the number of bits of the numbers it processes, - including memory addresses. Similarly, space usage is measured in bits, - counting not only the values, but also the addresses of the respective memory - cells. -\:Place a~limit on the size of the numbers ---define the \df{word size~$W$,} - the number of bits available in the memory cells--- and keep the cost of - of instructions and memory cells constant. The word size must not be constant, - since we can address only~$2^W$ cells of memory. If the input of the algorithm - is stored in~$N$ cells, we need~$W\ge\log N$ just to be able to read the input. - On the other hand, we are interested in polynomial-time algorithms only, so $\Theta(\log N)$-bit - numbers should be sufficient. In practice, we pick~$w$ to be the larger of - $\Theta(\log N)$ and the size of integers used in the algorithm's input and output. -\endlist - -Both restrictions easily avoid the problems of unbounded parallelism. The first -choice is theoretically cleaner and Cook et al.~show nice correspondences to the -standard complexity classes, but the calculations of time and space complexity tend -to be somewhat tedious. What more, when compared with the RAM with restricted -word size, the complexities are usually exactly $\Theta(w)$ times higher. 
-This does not hold in general (consider a~program which uses many small numbers
-and $\O(1)$ large ones), but it is true for the algorithms we are interested in.
-Therefore we will always assume that the operations have unit cost and we make
-sure that all numbers are limited by the available word size.
-
-\para
-As for the choice of RAM operations, the following three instruction sets are used:
-
-\itemize\ibull
-\:\df{Word-RAM} --- allows the ``C-language operators'', i.e., addition,
- subtraction, multiplication, division, remainder, bitwise {\sc and,} {\sc or,} exclusive
- {\sc or} ({\sc xor}), negation ({\sc not}) and bitwise shifts ($\ll$ and~$\gg$).
-\:\df{${\rm AC}^0$-RAM} --- allows all operations from the class ${\rm AC}^0$, i.e.,
- those computable by constant-depth polynomial-size boolean circuits with unlimited
- fan-in and fan-out. This includes all operations of the Word-RAM except for multiplication,
- division and remainders, and also many other operations like computing the Hamming
- weight (number of bits set in a~given number).
-\:Both restrictions at once.
-\endlist
-
-Thorup discusses the usual techniques employed by RAM algorithms in~\cite{thorup:aczero}
-and he shows that they work on both Word-RAM and ${\rm AC}^0$-RAM, but the combination
-of the two restrictions is too weak. On the other hand, taking the intersection of~${\rm AC}^0$
-with the instruction set of modern processors (like the multimedia instructions of Intel's Pentium4)
-is already strong enough.
-
-\FIXME{References to CPU manuals}
-
-We will therefore use the Word-RAM instruction set, mentioning differences with
-${\rm AC}^0$-RAM where necessary.
-
-\nota
-When speaking of the \df{RAM,} we implicitly mean the version with numbers limited
-by a~specified word size of $W$~bits, unit cost of operations and memory cells and the instruction
-set of the Word-RAM. This corresponds to the usage in recent algorithmic literature,
-although the authors rarely mention the details.
+\section{Ackermann's function and its inverses}\id{ackersec}%
+
+Ackermann's function is an~extremely quickly growing function introduced
+by Ackermann \cite{ackermann:function} in the context of
+computability theory. Its original purpose was to demonstrate that not every recursive
+function is also primitive recursive. At first sight, it does not
+seem related to efficient algorithms at all. Its various inverses, however, occur in
+analyses of algorithms and mathematical structures surprisingly often:
+we meet them in Section \ref{classalg} in the time complexity of the Disjoint Set Union
+data structure and also in the best known upper bound on the decision tree
+complexity of minimum spanning trees in Section \ref{optalgsect}. Another
+important application is in the complexity of Davenport-Schinzel sequences (see
+Klazar's survey \cite{klazar:gdss}), but as far as we know, these are not otherwise
+related to the topic of our study.
+
+Various sources differ in the exact definition of both Ackermann's
+function and its inverse, but most of these differences are in factors that
+are negligible in the light of the giant asymptotic growth of the function.\foot{%
+To quote Pettie \cite{pettie:onlineverify}: ``In the field of algorithms \& complexity,
+Ackermann's function is rarely defined the same way twice. We would not presume to buck
+such a~well-established precedent. Here is a~slight variant.''}
+We will use the definition by double recursion given by Tarjan \cite{tarjan:setunion},
+which is predominant in the literature on graph algorithms.
+
+\defn\id{ackerdef}%
+\df{Ackermann's function} $A(x,y)$ is a~function on non-negative integers defined as follows:
+$$\eqalign{
+A(0,y) &:= 2y, \cr
+A(x,0) &:= 0, \cr
+A(x,1) &:= 2 \quad \hbox{for $x\ge 1$}, \cr
+A(x,y) &:= A(x-1, A(x,y-1)) \quad \hbox{for $x\ge 1$, $y\ge 2$}. \cr
+}$$
+The functions $A(x,\cdot)$ are called the \df{rows} of $A(x,y)$, similarly $A(\cdot,y)$ are
+its \df{columns.}
+
+Sometimes, a~single-parameter version of this function is also used. It is defined
+as the diagonal of the previous function, i.e., $A(x):=A(x,x)$.
+
+\example
+We can try evaluating $A(x,y)$ at a~few points:
+$$\eqalign{
+A(x,2) &= A(x-1, A(x,1)) = A(x-1,2) = \cdots = A(0,2) = 4, \cr
+A(1,y) &= A(0, A(1,y-1)) = 2A(1,y-1) = \cdots = 2^{y-1}A(1,1) = 2^y, \cr
+A(2,y) &= A(1, A(2,y-1)) = 2^{A(2,y-1)} = 2\tower y \hbox{~~(the tower of exponentials),} \cr
+A(3,y) &= \hbox{the tower function iterated $y$~times,} \cr
+A(4,3) &= A(3,A(4,2)) = A(3,4) = A(2,A(3,3)) = A(2,A(2,A(3,2))) = \cr
+       &= A(2,A(2,4)) = 2\tower(2\tower 4) = 2\tower 65536. \cr
+}$$
+
+\paran{Inverses}%
+As is common for functions of multiple parameters, there is no single function which
+could claim the title of the one true inverse of Ackermann's function.
+The following three functions related to the inverse of the function~$A$ are often considered:
+
+\defn\id{ackerinv}%
+The \df{$i$-th row inverse} $\lambda_i(n)$ of Ackermann's function is defined by:
+$$
+\lambda_i(n) := \min\{ y \mid A(i,y) > \log n \}.
+$$
+The \df{diagonal inverse} $\alpha(n)$ is defined by:
+$$
+\alpha(n) := \min\{ x \mid A(x) > \log n \}.
+$$
+The two-parameter \df{alpha function} $\alpha(m,n)$ is defined for $m\ge n$ by:
+$$
+\alpha(m,n) := \min\{ x\ge 1 \mid A(x,4\lceil m/n\rceil) > \log n \}.
+$$
+
+\example
+$\lambda_1(n) = \O(\log\log n)$, $\lambda_2(n) = \O(\log^* n)$, $\lambda_3(n)$ grows even slower
+and $\alpha(n)$ is asymptotically smaller than $\lambda_i(n)$ for any fixed~$i$.
+
+\obs
+It is easy to verify that all rows are strictly increasing and so are all
+columns except the first three, which are constant. As the rows are increasing,
+$\alpha(m,n)$ is non-increasing in~$m$, so for a~fixed~$n$ it is maximized at $m=n$.
+Hence $\alpha(m,n) \le 3$ whenever $\log n < A(3,4)$, which covers all values of~$n$
+that are likely to occur in practice.
+
+\lemma
+$\alpha(m,n) \le \alpha(n)+1$.
+
+\proof
+We know that
+$A(x,4\lceil m/n\rceil) \ge A(x,4) = A(x-1,A(x,3)) \ge A(x-1,x-1)$, where the last
+inequality follows from $A(x,3)\ge x-1$. Therefore $A(x,4\lceil m/n\rceil)$
+rises above $\log n$ no later than $A(x-1,x-1)$ does.
+\qed
+
+\lemma\id{alphaconst}%
+When $i$~is a~fixed constant and $m\ge n\cdot \lambda_i(n)$, then $\alpha(m,n) \le i$.
+
+\proof
+The choice of~$m$ guarantees that $A(x,4\lceil m/n\rceil) \ge A(x,\lambda_i(n))$.
+Since the columns of~$A$ are non-decreasing, the latter is at least
+$A(i,\lambda_i(n)) > \log n$ for all $x \ge i$.
+\qed
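+
+\example
+To get a~feeling for the concrete values, we can evaluate $\alpha(n,n)$ directly from
+the definition: $A(1,4)=16$, $A(2,4)=2\tower 4=65536$ and $A(3,4)=2\tower 65536$. Hence
+$\alpha(n,n)=1$ whenever $\log n<16$, $\alpha(n,n)=2$ whenever $16\le\log n<65536$,
+and $\alpha(n,n)=3$ whenever $65536\le\log n<2\tower 65536$.
 
 \endpart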