From 9a0ea083594219f5d391bf6f5cc59d867a815c9e Mon Sep 17 00:00:00 2001
From: Martin Mares
Date: Tue, 3 Jun 2008 21:47:55 +0200
Subject: [PATCH] More abstract art.

---
 abstract.tex | 307 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 307 insertions(+)

diff --git a/abstract.tex b/abstract.tex
index 4028ebe..575371b 100644
--- a/abstract.tex
+++ b/abstract.tex
@@ -870,6 +870,313 @@ optimal algorithm:
 \thm
 The time complexity of the Optimal algorithm is $\O(m\times\alpha(m,n))$.
 
+\chapter{Dynamic Spanning Trees}\id{dynchap}%
+
+\section{Dynamic graph algorithms}
+
+In many applications, we often need to solve a~certain graph problem for a~sequence of graphs that
+differ only a~little, so recomputing the solution for every graph from scratch would be a~waste of
+time. In such cases, we usually turn our attention to \df{dynamic graph algorithms.} A~dynamic
+algorithm is in fact a~data structure that remembers a~graph. It offers operations that modify the
+structure of the graph and also operations that query the result of the problem for the current
+state of the graph. A~typical example of a~problem of this kind is dynamic maintenance of connected
+components:
+
+\problemn{Dynamic connectivity}
+Maintain an~undirected graph under a~sequence of the following operations:
+\itemize\ibull
+\:$\<Init>(n)$ --- Create a~graph with $n$~isolated vertices $\{1,\ldots,n\}$.\foot{%
+The structure could support dynamic addition and removal of vertices, too,
+but this is easy to add and infrequently used, so we will rather keep the set
+of vertices fixed for clarity.}
+\:$\<Insert>(G,u,v)$ --- Insert an~edge $uv$ into~$G$ and return its unique
+identifier. This assumes that the edge did not exist yet.
+\:$\<Delete>(G,e)$ --- Delete an~edge specified by its identifier from~$G$.
+\:$\<Connected>(G,u,v)$ --- Test if vertices $u$ and~$v$ are in the same connected component of~$G$.
+\endlist
+
+In this chapter, we will focus on the dynamic version of the minimum spanning forest.
+This problem seems to be intimately related to dynamic connectivity. Indeed, all known
+algorithms for dynamic connectivity maintain some sort of a~spanning forest.
+This suggests that a~dynamic MSF algorithm could be obtained by modifying the
+mechanics of the data structure to keep the forest minimum.
+We however have to answer one important question first: What should be the output of
+our MSF data structure? Adding an~operation that returns the MSF of the current
+graph would of course be possible, but somewhat impractical as this operation would have to
+spend $\Omega(n)$ time on the mere writing of its output. A~better way seems to
+be making the \<Insert> and \<Delete> operations report the list of modifications
+of the MSF implied by the change in the graph. It is easy to prove that $\O(1)$
+modifications always suffice, so we can formulate our problem as follows:
+
+\problemn{Dynamic minimum spanning forest}
+Maintain an~undirected graph with distinct weights on edges (drawn from a~totally ordered set)
+and its minimum spanning forest under a~sequence of the following operations:
+\itemize\ibull
+\:$\<Init>(n)$ --- Create a~graph with $n$~isolated vertices $\{1,\ldots,n\}$.
+\:$\<Insert>(G,u,v,w)$ --- Insert an~edge $uv$ of weight~$w$ into~$G$. Return its unique
+ identifier and the list of additions and deletions of edges in $\msf(G)$.
+\:$\<Delete>(G,e)$ --- Delete an~edge specified by its identifier from~$G$.
+ Return the list of additions and deletions of edges in $\msf(G)$.
+\endlist
+
+\paran{Incremental MSF}%
+In case only edge insertions are allowed, the problem reduces to finding the heaviest
+edge (peak) on the tree path covered by the newly inserted edge and replacing the peak
+if needed. This can be handled quite efficiently by using the Link-Cut trees of Sleator
+and Tarjan \cite{sleator:trees}. We obtain a~logarithmic time bound:
+
+\thmn{Incremental MSF}
+When only edge insertions are allowed, the dynamic MSF can be maintained in time $\O(\log n)$
+amortized per operation.
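+
+% An illustrative Python sketch of the incremental rule above, kept here as a
+% source comment. It stores the spanning forest in plain adjacency lists and
+% finds the peak of the covered tree path by a naive DFS, so one insertion
+% costs O(n) instead of the O(log n) amortized achievable with the Link-Cut
+% trees; it assumes at most one edge per vertex pair, and all names are ad hoc.
+%
+%   class IncrementalMSF:
+%       def __init__(self, n):
+%           # adjacency of the current MSF: vertex -> {neighbour: edge weight}
+%           self.adj = {v: {} for v in range(1, n + 1)}
+%
+%       def _tree_path(self, u, v):
+%           # the unique forest path from u to v as a list of edges, or None
+%           stack, prev = [u], {u: None}
+%           while stack:
+%               x = stack.pop()
+%               if x == v:
+%                   path = []
+%                   while prev[x] is not None:
+%                       path.append((prev[x], x))
+%                       x = prev[x]
+%                   return path
+%               for y in self.adj[x]:
+%                   if y not in prev:
+%                       prev[y] = x
+%                       stack.append(y)
+%           return None
+%
+%       def insert(self, u, v, w):
+%           # returns the list of modifications of the MSF
+%           path = self._tree_path(u, v)
+%           if path is None:                      # the new edge joins two trees
+%               self.adj[u][v] = self.adj[v][u] = w
+%               return [('add', (u, v), w)]
+%           peak = max(path, key=lambda e: self.adj[e[0]][e[1]])
+%           pw = self.adj[peak[0]][peak[1]]
+%           if w >= pw:                           # the new edge is too heavy
+%               return []
+%           del self.adj[peak[0]][peak[1]]        # exchange the peak for it
+%           del self.adj[peak[1]][peak[0]]
+%           self.adj[u][v] = self.adj[v][u] = w
+%           return [('del', peak, pw), ('add', (u, v), w)]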
+
+\section{Dynamic connectivity}
+
+The fully dynamic connectivity problem has a~long and rich history. In the 1980's, Frederickson \cite{frederickson:dynamic}
+used his topological trees to construct a~dynamic connectivity algorithm of complexity $\O(\sqrt m)$ per update and
+$\O(1)$ per query. Eppstein et al.~\cite{eppstein:sparsify} have introduced a~sparsification technique which can bring the
+updates down to $\O(\sqrt n)$. Later, several different algorithms with complexity on the order of $n^\varepsilon$
+were presented by Henzinger and King \cite{henzinger:mst} and also by Mare\v{s} \cite{mares:dga}.
+A~polylogarithmic time bound was first reached by the randomized algorithm of Henzinger and King \cite{henzinger:randdyn}.
+The best result known as of now is the $\O(\log^2 n)$ time deterministic algorithm by Holm,
+de~Lichtenberg and Thorup \cite{holm:polylog}, which we will describe in this section.
+
+The algorithm will maintain a~spanning forest~$F$ of the current graph~$G$, represented by an~ET-tree
+which will be used to answer connectivity queries. The edges of~$G\setminus F$ will be stored as non-tree
+edges in the ET-tree. Hence, an~insertion of an~edge into~$G$ either adds it to~$F$ or inserts it as a~non-tree edge.
+Deletions of non-tree edges are also easy, but when a~tree edge is deleted, we have to search for its
+replacement among the non-tree edges.
+
+To govern the search in an~efficient way, we will associate each edge~$e$ with a~level $\ell(e) \le
+L = \lfloor\log_2 n\rfloor$. For each level~$i$, we will use~$F_i$ to denote the subforest
+of~$F$ containing edges of level at least~$i$. Therefore $F=F_0 \supseteq F_1 \supseteq \ldots \supseteq F_L$.
+We will maintain the following \em{invariants:}
+
+{\narrower
+\def\iinv{{\bo I\the\itemcount~}}
+\numlist\iinv
+\:$F$~is the maximum spanning forest of~$G$ with respect to the levels. (In other words,
+if $uv$ is a~non-tree edge, then $u$ and~$v$ are connected in~$F_{\ell(uv)}$.)
+\:For each~$i$, the components of~$F_i$ have at most $\lfloor n/2^i \rfloor$ vertices each.
+(This implies that it does not make sense to define~$F_i$ for $i>L$, because it would be empty
+anyway.)
+\endlist
+}
+
+At the beginning, the graph contains no edges, so both invariants are trivially
+satisfied. Newly inserted edges enter level~0, which cannot break I1 nor~I2.
+
+When we delete a~tree edge at level~$\ell$, we split a~tree~$T$ of~$F_\ell$ into two
+trees $T_1$ and~$T_2$. Without loss of generality, let us assume that $T_1$ is the
+smaller one. We will try to find the replacement edge of the highest possible
+level that connects the spanning tree back. From I1, we know that such an~edge cannot belong to
+a~level greater than~$\ell$, so we start looking for it at level~$\ell$. According
+to~I2, the tree~$T$ had at most $\lfloor n/2^\ell\rfloor$ vertices, so $T_1$ has
+at most $\lfloor n/2^{\ell+1} \rfloor$ of them. Thus we can move all level~$\ell$
+edges of~$T_1$ to level~$\ell+1$ without violating either invariant.
+
+We now start enumerating the non-tree edges incident with~$T_1$. Each such edge
+is either local to~$T_1$ or it joins $T_1$ with~$T_2$. We will therefore check for each edge
+whether its other endpoint lies in~$T_2$, and if it does, we have found the replacement
+edge, so we insert it into~$F_\ell$ and stop. Otherwise we move the edge one level up. (This
+will be the grist for the mill of our amortization argument: We can charge most of the work to level
+increases and we know that the level of each edge can reach at most~$L$.)
+
+If the non-tree edges at level~$\ell$ are exhausted, we try the same at the next
+lower level, and so on. If there is no replacement edge at level~0, the tree~$T$
+remains disconnected.
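+
+% An illustrative Python sketch of the deletion machinery just described, kept
+% here as a source comment. Every forest F_i and the non-tree edge sets are
+% plain Python containers and connectivity inside a level is recomputed by
+% DFS, so only the level bookkeeping behind invariants I1 and I2 is shown; the
+% polylogarithmic bounds need the ET-trees of the real implementation. Names
+% are ad hoc and at most one edge per vertex pair is assumed.
+%
+%   import math
+%   from collections import defaultdict
+%
+%   class LevelledConnectivity:
+%       def __init__(self, n):
+%           self.max_level = int(math.log2(n)) if n > 1 else 0  # never exceeded, by I2
+%           self.level = {}                      # edge (u,v), u<v  ->  its level
+%           self.is_tree = {}                    # edge -> True iff it belongs to F
+%           self.forest = defaultdict(set)       # adjacency lists of F (all levels)
+%           self.nontree_at = defaultdict(set)   # vertex -> incident non-tree edges
+%
+%       @staticmethod
+%       def _key(u, v):
+%           return (u, v) if u < v else (v, u)
+%
+%       def _component(self, u, min_level):
+%           # the tree of F_{min_level} containing u (tree edges of level >= min_level)
+%           seen, stack = {u}, [u]
+%           while stack:
+%               x = stack.pop()
+%               for y in self.forest[x]:
+%                   if self.level[self._key(x, y)] >= min_level and y not in seen:
+%                       seen.add(y)
+%                       stack.append(y)
+%           return seen
+%
+%       def connected(self, u, v):
+%           return v in self._component(u, 0)
+%
+%       def insert(self, u, v):
+%           e = self._key(u, v)
+%           self.level[e] = 0                    # new edges enter level 0
+%           if self.connected(u, v):             # already spanned: a non-tree edge
+%               self.is_tree[e] = False
+%               self.nontree_at[u].add(e)
+%               self.nontree_at[v].add(e)
+%           else:                                # joins two trees of F
+%               self.is_tree[e] = True
+%               self.forest[u].add(v)
+%               self.forest[v].add(u)
+%
+%       def delete(self, u, v):
+%           e = self._key(u, v)
+%           lvl = self.level.pop(e)
+%           if not self.is_tree.pop(e):          # non-tree edges leave silently
+%               self.nontree_at[u].discard(e)
+%               self.nontree_at[v].discard(e)
+%               return
+%           self.forest[u].remove(v)             # a tree edge: search for a replacement,
+%           self.forest[v].remove(u)             # starting at the level of the deleted edge
+%           for i in range(lvl, -1, -1):
+%               if self._replace(u, v, i):
+%                   return
+%
+%       def _replace(self, u, v, i):
+%           t1, t2 = self._component(u, i), self._component(v, i)
+%           if len(t1) > len(t2):
+%               t1, t2 = t2, t1                  # t1 is the smaller tree
+%           for x in t1:                         # move t1's level-i tree edges one level up
+%               for y in self.forest[x]:
+%                   if self.level[self._key(x, y)] == i:
+%                       self.level[self._key(x, y)] = i + 1
+%           for x in t1:                         # scan level-i non-tree edges incident with t1
+%               for e in list(self.nontree_at[x]):
+%                   if self.level[e] != i:
+%                       continue
+%                   a, b = e
+%                   if (b if a == x else a) in t2:   # joins t1 and t2: the replacement
+%                       self.nontree_at[a].discard(e)
+%                       self.nontree_at[b].discard(e)
+%                       self.is_tree[e] = True
+%                       self.forest[a].add(b)
+%                       self.forest[b].add(a)
+%                       return True
+%                   self.level[e] = i + 1            # local to t1: move it one level up
+%           return False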
+
+The implementation uses the Euler Tour trees of Henzinger and King \cite{henzinger:randdyn}
+to represent the forests~$F_\ell$ together with the non-tree edges at each particular level.
+A~simple amortized analysis using the levels yields the following result:
+
+\thmn{Fully dynamic connectivity, Holm et al.~\cite{holm:polylog}}\id{dyncon}%
+Dynamic connectivity can be maintained in time $\O(\log^2 n)$ amortized per
+\<Insert> and \<Delete> and in time $\O(\log n/\log\log n)$ per \<Connected>
+in the worst case.
+
+\rem\id{dclower}%
+An~$\Omega(\log n/\log\log n)$ lower bound for the amortized complexity of the dynamic connectivity
+problem has been proven by Henzinger and Fredman \cite{henzinger:lowerbounds} in the cell
+probe model with $\O(\log n)$-bit words. Thorup has answered with a~faster algorithm
+\cite{thorup:nearopt} that achieves $\O(\log n\log^3\log n)$ time per update and
+$\O(\log n/\log^{(3)} n)$ per query on a~RAM with $\O(\log n)$-bit words. (He claims
+that the algorithm runs on a~Pointer Machine, but it uses arithmetic operations,
+so it does not fit the definition of the PM we use. The only thing the algorithm does
+not need is direct indexing of arrays.) So far, it is not known how to extend this algorithm
+to fit our needs, so we omit the details.
+
+\section{Dynamic spanning forests}\id{dynmstsect}%
+
+Let us turn our attention back to the dynamic MSF.
+Most of the early algorithms for dynamic connectivity also imply $\O(n^\varepsilon)$
+algorithms for dynamic maintenance of the MSF. Henzinger and King \cite{henzinger:twoec,henzinger:randdyn}
+have generalized their randomized connectivity algorithm to maintain the MSF in $\O(\log^5 n)$ time per
+operation, or $\O(k\log^3 n)$ if only $k$ different values of edge weights are allowed. They have solved
+the decremental version of the problem first (which starts with a~given graph and allows only edge
+deletions) and then presented a~general reduction from the fully dynamic MSF to its decremental version.
+We will describe the algorithm of Holm, de Lichtenberg and Thorup \cite{holm:polylog}, who have followed
+the same path. They have modified their dynamic connectivity algorithm to solve the decremental MSF
+in $\O(\log^2 n)$ and obtained the fully dynamic MSF working in $\O(\log^4 n)$ per operation.
+
+\paran{Decremental MSF}%
+Turning the algorithm from the previous section into the decremental MSF requires only two
+changes: First, we have to start with the forest~$F$ equal to the MSF of the initial
+graph. As we are allowed to pay $\O(\log^2 n)$ for every insertion, we can use an~almost arbitrary
+MSF algorithm to find~$F$. Second, when we search for a~replacement edge, we need to pick
+the lightest possible choice. We will therefore use a~weighted version of the ET-trees.
+We must ensure that the lower levels cannot contain a~lighter replacement edge,
+but fortunately the light edges tend to ``bubble up'' in the hierarchy of
+levels. This can be formalized in the form of the following invariant:
+
+{\narrower
+\def\iinv{{\bo I\the\itemcount~}}
+\numlist\iinv
+\itemcount=2
+\:On every cycle, the heaviest edge has the smallest level.
+\endlist
+}
+
+\>This immediately implies that we always select the right replacement edge:
+
+\lemma\id{msfrepl}%
+Let $F$~be the minimum spanning forest and $e$ any of its edges. Then among all replacement
+edges for~$e$, the lightest one is at the maximum level.
+
+A~brief analysis also shows that the invariant I3 is observed by all operations
+on the structure. We can conclude:
+
+\thmn{Decremental MSF, Holm et al.~\cite{holm:polylog}}
+When we start with a~graph on $n$~vertices with~$m$ edges and we perform a~sequence of
+edge deletions, the MSF can be initialized in time $\O((m+n)\cdot\log^2 n)$ and then
+updated in time $\O(\log^2 n)$ amortized per operation.
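+
+% An illustrative Python sketch of the decremental structure, kept here as a
+% source comment. The MSF is built by Kruskal's algorithm, connectivity is
+% decided by a DFS over the current forest, and a deleted tree edge is
+% replaced by the lightest remaining edge reconnecting the two pieces (the
+% choice justified by the lemma above). The level machinery that gives the
+% O(log^2 n) amortized bound is omitted entirely; names are ad hoc.
+%
+%   from collections import defaultdict
+%
+%   class DecrementalMSF:
+%       def __init__(self, weighted_edges):
+%           self.forest = set()      # MSF edges, stored as (u, v, w)
+%           self.nontree = set()     # all remaining edges
+%           for u, v, w in sorted(weighted_edges, key=lambda e: e[2]):  # Kruskal
+%               if v in self._component(u):
+%                   self.nontree.add((u, v, w))
+%               else:
+%                   self.forest.add((u, v, w))
+%
+%       def _component(self, s):
+%           # vertices reachable from s along the current forest edges
+%           adj = defaultdict(set)
+%           for u, v, _ in self.forest:
+%               adj[u].add(v)
+%               adj[v].add(u)
+%           seen, stack = {s}, [s]
+%           while stack:
+%               x = stack.pop()
+%               for y in adj[x] - seen:
+%                   seen.add(y)
+%                   stack.append(y)
+%           return seen
+%
+%       def delete(self, edge):
+%           # returns the list of modifications of the MSF
+%           if edge in self.nontree:             # non-tree edges leave silently
+%               self.nontree.remove(edge)
+%               return []
+%           self.forest.remove(edge)
+%           side = self._component(edge[0])      # one piece of the broken tree
+%           crossing = [e for e in self.nontree
+%                       if (e[0] in side) != (e[1] in side)]
+%           if not crossing:                     # no replacement: F stays split
+%               return [('del', edge)]
+%           repl = min(crossing, key=lambda e: e[2])  # lightest reconnecting edge
+%           self.nontree.remove(repl)
+%           self.forest.add(repl)
+%           return [('del', edge), ('add', repl)]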
+
+\paran{Fully dynamic MSF}%
+The decremental MSF algorithm can be turned into a~fully dynamic one by a~blackbox
+reduction whose properties are summarized in the following theorem:
+
+\thmn{MSF dynamization, Holm et al.~\cite{holm:polylog}}
+Suppose that we have a~decremental MSF algorithm with the following properties:
+\numlist\ndotted
+\:For any $a$,~$b$, it can be initialized on a~graph with~$a$ vertices and~$b$ edges.
+\:Then it executes an~arbitrary sequence of deletions in time $\O(b\cdot t(a,b))$, where~$t$ is a~non-decreasing function.
+\endlist
+\>Then there exists a~fully dynamic MSF algorithm for a~graph on $n$~vertices, starting
+with no edges, that performs $m$~insertions and deletions in amortized time:
+$$
+\O\left( \log^3 n + \sum_{i=1}^{\log m} \sum_{j=1}^i \; t(\min(n,2^j), 2^j) \right) \hbox{\quad per operation.}
+$$
+
+\corn{Fully dynamic MSF}\id{dynmsfcorr}%
+There is a~fully dynamic MSF algorithm that works in time $\O(\log^4 n)$ amortized
+per operation for graphs on $n$~vertices.
+
+\paran{Dynamic MSF with limited edge weights}%
+If the set from which the edge weights are drawn is small, we can take a~different
+approach. If only two values are allowed, we split the graph into subgraphs $G_1$ and~$G_2$
+induced by the edges of the respective weights and we maintain separate connectivity
+structures (together with a~spanning tree) for $G_1$ and $G_2 \cup T_1$ (where $T_1$
+is a~spanning tree of~$G_1$). We can easily modify the structure for $G_2\cup
+T_1$ to prefer the edges of~$T_1$. This ensures that the spanning tree of $G_2\cup T_1$
+will be the MST of the whole~$G$.
+
+If there are more possible values, we simply iterate this construction: the $i$-th
+structure contains edges of weight~$i$ and the edges of the spanning tree from the
+$(i-1)$-th structure. We get:
+
+\thmn{MSF with limited edge weights}
+There is a~fully dynamic MSF algorithm that works in time $\O(k\cdot\log^2 n)$ amortized
+per operation for graphs on $n$~vertices with only $k$~distinct edge weights allowed.
+
+\section{Almost minimum trees}\id{kbestsect}%
+
+In some situations, finding the single minimum spanning tree is not enough and we are interested
+in the $K$~lightest spanning trees, usually for some small value of~$K$. Katoh, Ibaraki
+and Mine \cite{katoh:kmin} have given an~algorithm of time complexity $\O(m\log\beta(m,n) + Km)$,
+building on the MST algorithm of Gabow et al.~\cite{gabow:mst}.
+Subsequently, Eppstein \cite{eppstein:ksmallest} has discovered an~elegant preprocessing step which allows the
+running time to be reduced to $\O(m\log\beta(m,n) + \min(K^2,Km))$ by eliminating edges
+which are either present in all $K$ trees or in none of them.
+We will show a~variant of their algorithm based on the MST verification
+procedure of Section~\ref{verifysect}.
+
+In this section, we will require the edge weights to be numeric, because
+comparisons are certainly not sufficient to determine the second best spanning tree. We will
+assume that our computation model is able to add, subtract and compare the edge weights
+in constant time.
+
+Let us focus on finding the second lightest spanning tree first.
+
+\paran{Second lightest spanning tree}%
+Suppose that we have a~weighted graph~$G$ and a~sequence $T_1,\ldots,T_z$ of all its spanning
+trees. Also suppose that the weights of these spanning trees are distinct and that the sequence
+is ordered by weight, i.e., $w(T_1) < \ldots < w(T_z)$ and $T_1 = \mst(G)$. Let us observe
+that each tree is similar to at least one of its predecessors:
+
+\lemman{Difference lemma}\id{kbl}%
+For each $i>1$ there exists $j<i$ such that $T_i$ and~$T_j$ differ by a~single edge exchange.
+
+In particular, the second lightest spanning tree~$T_2$ differs from~$T_1$ by a~single edge
+exchange: $T_2 = T_1 - e + f$ for some tree edge~$e$ and non-tree edge~$f$ lying on a~common
+cycle. For a~fixed~$f$, the best choice of~$e$ is the heaviest (peak) edge on the tree path
+covered by~$f$, so the best exchange minimizes $w(f)-w(e)$ over the non-tree edges~$f$. The
+peaks of all covered paths, and hence~$T_2$, can be found in time $\O(m)$ by the MST
+verification procedure of Section~\ref{verifysect}.
+
+To look for further trees, we construct two auxiliary graphs: $G_1 = G/e$ together with its
+minimum spanning tree $T_1/e$, and $G_2 = G - e$ together with its minimum spanning tree~$T_2$.
+
+\obs\id{tbobs}%
+Every spanning tree of~$G$ corresponds to a~spanning tree of exactly one of the graphs $G_1$
+and~$G_2$: the trees containing~$e$ correspond to the spanning trees of~$G_1$, while the
+remaining ones are exactly the spanning trees of~$G_2$.
+
+\>Thus we can run the previous algorithm for finding the best edge exchange
+on both~$G_1$ and~$G_2$ and find~$T_3$ again in time $\O(m)$.
+
+\paran{Further spanning trees}%
+The construction of auxiliary graphs can be iterated to obtain $T_1,\ldots,T_K$
+for an~arbitrary~$K$. We will build a~\df{meta-tree} of auxiliary graphs. Each node of this meta-tree
+carries a~graph\foot{This graph is always derived from~$G$ by a~sequence of edge deletions
+and contractions. It is tempting to say that it is a~minor of~$G$, but this is not true as we
+preserve multiple edges.} and its minimum spanning tree. The root node contains~$(G,T_1)$,
+its sons have $(G_1,T_1/e)$ and $(G_2,T_2)$. When $T_3$ is obtained by an~exchange
+in one of these sons, we attach two new leaves to that son and we let them carry the two auxiliary
+graphs derived by contracting or deleting the exchanged edge. Then we find the best
+edge exchanges among all leaves of the new meta-tree and repeat the process. By Observation \ref{tbobs},
+each spanning tree of~$G$ is generated exactly once. The Difference lemma guarantees that
+the trees are enumerated in increasing order of weight. So we get:
+
+\lemma\id{kbestl}%
+Given~$G$ and~$T_1$, we can find $T_2,\ldots,T_K$ in time $\O(Km + K\log K)$.
+
+\paran{Invariant edges}%
+Our algorithm can be further improved for small values of~$K$ (which seems to be the common
+case in most applications) by the reduction of Eppstein \cite{eppstein:ksmallest}.
+We will observe that there are many edges of~$T_1$
+which are guaranteed to be contained in $T_2,\ldots,T_K$ as well, and likewise there are
+many edges of $G\setminus T_1$ which are excluded from all those spanning trees.
+When we combine this with the previous construction, we get the following theorem:
+
+\thmn{Finding $K$ lightest spanning trees}\id{kbestthm}%
+For a~given graph~$G$ with real edge weights and a~positive integer~$K$, the $K$~best spanning trees can be found
+in time $\O(m\times\alpha(m,n) + \min(K^2,Km + K\log K))$.
+
 %--------------------------------------------------------------------------------
 
 \chapter{Ranking Combinatorial Structures}\id{rankchap}%
-- 
2.39.2