From 7bc4f10b763512ad5777a1d1ea0d9069d78a72b2 Mon Sep 17 00:00:00 2001 From: Martin Mares Date: Wed, 2 Apr 2008 15:32:13 +0200 Subject: [PATCH] More robust contractions. --- adv.tex | 2 +- opt.tex | 132 ++++++++++++++++++++++++++++++++++++++++++++++++++------ 2 files changed, 121 insertions(+), 13 deletions(-) diff --git a/adv.tex b/adv.tex index 6836d43..f5f3360 100644 --- a/adv.tex +++ b/adv.tex @@ -449,7 +449,7 @@ Each edge is considered at most twice (once per its endpoint), so the number of the other heap operations is~$\O(m_i)$. Together, it equals $\O(m_i + n_i\log t_i) = \O(m_i+m) = \O(m)$. \qed -\lemma +\lemma\id{ijsize}% Unless the $i$-th phase is final, the forest~$F_i$ consists of at most $2m_i/t_i$ trees. \proof diff --git a/opt.tex b/opt.tex index 7aa4c54..07da833 100644 --- a/opt.tex +++ b/opt.tex @@ -28,11 +28,13 @@ The soft heap contains a~set of distinct items from a~totally ordered universe a supports the following operations: \itemize\ibull \:$\(\varepsilon)$ --- create an~empty soft heap with the given accuracy parameter~$\varepsilon$, -\:$\(H,x)$ --- insert a~new element~$x$ to the heap~$H$, -\:$\(P,Q)$ --- merge two heaps into one, more precisely move all elements of a~heap~$Q$ +\:$\(H,x)$ --- insert a~new item~$x$ to the heap~$H$, +\:$\(P,Q)$ --- merge two heaps into one, more precisely move all items of a~heap~$Q$ to the heap~$P$, destroying~$Q$ in the process (both heaps must have the same~$\varepsilon$), -\:$\(H)$ --- delete the minimum element of the heap~$H$ and return its value +\:$\(H)$ --- delete the minimum item of the heap~$H$ and return its value (optionally signalling that the value has been corrupted). +\:$\(H)$ --- destroy the heap and return a~list of all items contained in it + (again optionally marking those corrupted). \endlist \examplen{Linear-time selection} @@ -273,6 +275,11 @@ Let us translate these ideas to real (pseudo)code: \algout A~modified soft queue. \endalgo +\para +\ is trivial once we know how to recognize the corrupted items. It simply examines +all queues in the heap, walks the trees and the item lists of all vertices. It records +all items seen, the corrupted ones are those that different from their \. + \paran{Analysis of accuracy} The description of the operations is now complete, so let us analyse their behavior and verify that we have delivered what we promised --- first the accuracy of @@ -457,6 +464,17 @@ leftmost path and therefore it can be also paid for by \. (Incidentally, this was the only place where we needed the invariant.) \qed +\s are easy not only to implement, but also to analyse: + +\lemma +Every \ takes $\O(1)$ time amortized. + +\proof +As all queues, vertices and items examined by \ are forever gone from the heap, +we can charge the constant time spent on each of them against the operations +that have created them. +\qed + It remains to take care of the calculation with ranks: \lemma\id{shyards}% @@ -482,7 +500,7 @@ and their sum telescopes, again to the real cost of the meld. Now we can put the bits together and laurel our effort with the following theorem: -\thmn{Performance of soft heaps, Chazelle \cite{chazelle:softheap}} +\thmn{Performance of soft heaps, Chazelle \cite{chazelle:softheap}}\id{softheap}% A~soft heap with error rate~$\varepsilon$ ($0<\varepsilon\le 1/2$) processes a~sequence of operations starting with an~empty heap and containing $n$~\s in time $\O(n\log(1/\varepsilon))$ on the Pointer machine. At every moment, the @@ -509,8 +527,12 @@ by rebuilding the whole data structure completely after $n/2$ deletes. This increases the number of potentially corrupted items, but at worst twice, so it suffices to decrease~$\varepsilon$ twice. +%-------------------------------------------------------------------------------- + \section{Robust contractions} +\def\C{{\cal C}} + Having the soft heaps at hand, we would like to use them in a~conventional MST algorithm in place of a~usual heap. The most efficient specimen of a~heap-based algorithm we have seen so far is the Iterated Jarn\'\i{}k's algorithm (\ref{itjar}). @@ -561,7 +583,7 @@ between $G-C$ and~$G/C$, so we can write $G=C \cup (G/C)$. When~$C\subseteq G$ is a~contractible subgraph, then $\msf(G)=\msf(C) \cup \msf(G/C)$. \proof -As both sides of the equality are trees on the same number of vertices, it suffices +As both sides of the equality are forests spanning the same graph, it suffices to show that $\msf(G) \subseteq \msf(C)\cup\msf(G/C)$. We know that the edges which does not participate in the MSF are exactly those which are the heaviest on some cycle (this is the Cycle rule \ref{cyclerule}). @@ -615,14 +637,100 @@ trick with expanding the vertex~$c$ to~$P$ in the cycle~$C$. Hence $g\not\in\mst \qed \para -It therefore suffices to run the Jarn\'\i{}k's algorithm with a~soft heap to find the -$(G\crpt R)$-contractible subgraph~$C$ and then recurse three times: first find $\msf(C)$ -and $\msf((G/C)\setminus R_C)$, then put together the pieces which are known to contain -$\msf(G)$ by the Robust contraction lemma and find the MSF of this graph. As in the Iterative -Jarn\'\i{}k's algorithm, we will construct as many non-overlapping contractible subgraphs in a~single -phase of the algorithm as possible. Formally speaking, we will use the following partitioning -procedure. +Our plan still is to mimic the Iterative Jarn\'\i{}k's algorithm. We will grow a~collection~$\C$ +of non-overlapping contractible subgraphs and put aside the edges which got corrupted in the process. +We recursively compute the MSF of the subgraphs and of the contracted graph. Then we take the +union of these MSF's and add the corrupted edges. According to the previous lemma, this does not produce +the final MSF, but a~sparser graph containing it on which we can continue. More precisely: + +\corn{Gradual robust contraction}\id{gradrc}% +Suppose that $G$~is a~weighted graph, $R$~a~set of its edges, and $\C=\{C_1,\ldots,C_k\}$ is a~collection +of subgraphs contractible in~$G\crpt R_\C$, where $R_\C = R_{C_1} \cup \ldots \cup R_{C_k}$. Then: +$$ +\msf(G) = \msf\left( \bigcup_i \msf(C_i) \cup \msf\biggl( \biggl( G / \bigcup_i C_i \biggr) \setminus R_\C \biggr) \cup R_\C \right). +$$ + +\proof +By iterating the previous lemma and noting that subgraphs contractible in~$G\crpt R^\prime_\C$ +for a~subset~$R^\prime_\C$ of $R_\C$ are also contractible in~$G\crpt R_\C$. +\qed +We can formulate the exact partitioning algorithm as follows: +\algn{Partition a~graph to a~collection of contractible subgraphs}\id{partition}% +\algo +\algin A~graph~$G$ with an~edge comparison oracle, a~parameter~$t$ controlling the size of the subgraphs, + and an~accuracy parameter~$\varepsilon$. +\:Mark all vertices as ``live''. +\:$\C\=\emptyset$, $R_\C\=\emptyset$. \cmt{Start with an~empty collection and no corrupted edges.} +\:While there is a~live vertex~$v_0$: +\::$K=\{v_0\}$. \cmt{vertices of the tree we currently grow} +\::$R=\emptyset$. \cmt{edges known to be corrupted} +\::\ a~soft heap with accuracy~$\varepsilon$ and \ the edges adjacent to~$v_0$ in it. +\::While the heap is not empty and $\vert K\vert \le t$: +\:::\ an~edge $uv$ from the heap, assume $u\in K$. +\:::If $uv$ was corrupted, add it to~$R$. +\:::If $v\in K$, drop the edge and repeat the previous two steps. +\:::$K\=K\cup\{v\}$. +\:::If $v$ is dead, break out of the current loop. +\:::Insert all edges incident with~$v$ to the heap. +\::$\C\=\C \cup \{G[K]\}$. \cmt{Record the subgraph.} +\::\ the heap and add all remaining corrupted edges to~$R$. +\::$R_\C\=R_\C \cup \{ R_K \}$. \cmt{Record the ``interesting'' corrupted edges.} +\::$G\=G\setminus R_K$. \cmt{Remove the corrupted edges from~$G$.} +\::Mark all vertices of~$K$ as ``dead''. +\algout The collection $\C$ of contractible subgraphs and the set~$R_\C$ of +corrupted edges in the neighborhood of these subgraphs. +\endalgo + +\lemman{Partitioning to contractible subgraphs, Pettie \cite{pettie:optimal}}\id{partlemma}% +Given a~weighted graph~$G$ and parameters $\varepsilon$ ($0<\varepsilon\le 1/2$) +and~$t$, the Partition algorithm \ref{partition} constructs a~collection +$\C=\{C_1,\ldots,C_k\}$ of subgraphs and a~set~$R_\C$ of corrupted edges such that: + +\numlist\ndotted +\:All the $C_i$'s and the set~$R_\C$ are edge-disjoint. +\:Each $C_i$ contains at most~$t$ vertices. +\:Each vertex of~$G$ is contained in at least one~$C_i$. +\:The connected components of the union of all $C_i$'s have at least~$t$ vertices each, + except perhaps for those which are equal to a~connected component of~$G\setminus R_\C$. +\:$\vert R_\C\vert \le 2\varepsilon m$. +\:$\msf(G) = \msf\Bigl( \bigcup_i \msf(C_i) \cup \msf\bigl((G / \bigcup_i C_i) \setminus R_\C\bigr) \cup R_C \Bigr)$. +\:The algorithm runs in time $\O(n+m\log (1/\varepsilon))$. +\endlist + +\proof +The algorithm grows a~series of trees. A~tree is built from edges adjacent to live vertices +and once it is finished, all vertices of the tree die, so no edge is ever reused in another +tree. Edges of~$R_\C$ are immediately deleted from the graph, so they never participate +in any tree. The algorithm stops only if all vertices are dead, so each vertex must have +entered some tree. Each $C_i$ is induced by some tree and the trees have at most~$t$ vertices +each, which limits the size of the $C_i$'s as well. This proves claims 1--3. + +Claim~4: We can show that each connected component has $t$~or more vertices as we already did +in the proof of Lemma \ref{ijsize}: How can a~new tree stop growing? Either it already +has~$t$ vertices, or it joins one of the existing trees (which only increases the +size of the component), or the heap becomes empty (which means that the tree spans +a~full component of the current graph and hence also of~$G\setminus R_\C$). + +Claim~5: The set~$R_\C$ contains a~subset of edges corrupted by the soft heaps over +the course of the algorithm. As every edge is inserted to a~heap at most once +in every direction, the heaps can corrupt at worst $2\varepsilon m$ edges altogether. + +Claim~6 follows from the above Corollary \ref{gradrc} on Gradual robust contraction. +\FIXME{It does not so far!} + +It remains to calculate the time complexity (claim~7): +The inner loop (steps 7--13) processes the edges of the current subgraph~$G[K]$ and also +the edges in its neighborhood $\delta(G[K])$. Steps 6 and~13 insert each such edge to the heap +at most once, so steps 8--12 also visit each edge at most once. For every edge, we spend +$\O(\log(1/\varepsilon))$ time amortized on inserting it and $\O(1)$ on the other operations +(by Theorem \ref{softheap} on performance of the soft heaps). + +Each edge of~$G$ can participate in some $G[K]\cup \delta(G[K])$ only twice, then both of its ends +are dead. The inner loop therefore processes $\O(m)$ edges total, so we spend $\O(m\log(1/\varepsilon))$ +time there. The rest of the algorithm spends $\O(m)$ time on the other operations with subgraphs +and $\O(n)$ on identifying the live vertices. +\qed \endpart -- 2.39.2