From: Martin Mares
Date: Fri, 1 Feb 2008 10:37:17 +0000 (+0100)
Subject: Bug fixes.
X-Git-Tag: printed~260
X-Git-Url: http://mj.ucw.cz/gitweb/?a=commitdiff_plain;h=2aa9121ad54fa2fbf9980f8480b50114d9c41f6e;p=saga.git

Bug fixes.
---

diff --git a/PLAN b/PLAN
index 93e5940..961ace3 100644
--- a/PLAN
+++ b/PLAN
@@ -53,6 +53,7 @@ TODO:
 - Some algorithms (most notably Fredman-Tarjan) do not need flattening
 - mention in-place radix-sorting?
 - ranking of permutations on general sets, relationship with integer sorting
+- reference to mixed Boruvka-Jarnik

 Notation:

diff --git a/adv.tex b/adv.tex
index afb1f72..e8c0fca 100644
--- a/adv.tex
+++ b/adv.tex
@@ -172,18 +172,19 @@ has degree~9.
 \section{Using Fibonacci heaps}
 \id{fibonacci}

-We have seen that the Jarn\'\i{}k's Algorithm \ref{jarnik} runs in $\O(m\log n)$ time
-(and this bound can be easily shown to be tight). Fredman and Tarjan have shown a~faster
-implementation in~\cite{ft:fibonacci} using their Fibonacci heaps. In this section,
-we convey their results and we show several interesting consequences.
+We have seen that the Jarn\'\i{}k's Algorithm \ref{jarnik} runs in $\Theta(m\log n)$ time.
+Fredman and Tarjan have shown a~faster implementation in~\cite{ft:fibonacci}
+using their Fibonacci heaps. In this section, we convey their results and we
+show several interesting consequences.

-The previous implementation of the algorithm used a binary heap to store all neighboring
-edges of the cut~$\delta(T)$. Instead of that, we will remember the vertices adjacent
-to~$T$ and for each such vertex~$v$ we will keep the lightest edge~$uv$ such that $u$~lies
-in~$T$. We will call these edges \df{active edges} and keep them in a~heap, ordered by weight.
+The previous implementation of the algorithm used a binary heap to store all edges
+separating the current tree~$T$ from the rest of the graph, i.e., edges of the cut~$\delta(T)$.
+Instead of that, we will remember the vertices adjacent to~$T$ and for each such vertex~$v$ we
+will maintain the lightest edge~$uv$ such that $u$~lies in~$T$. We will call these edges \df{active edges}
+and keep them in a~Fibonacci heap, ordered by weight.

 When we want to extend~$T$ by the lightest edge of~$\delta(T)$, it is sufficient to
-find the lightest active edge~$uv$ and add this edge to~$T$ together with a new vertex~$v$.
+find the lightest active edge~$uv$ and add this edge to~$T$ together with the new vertex~$v$.
 Then we have to update the active edges as follows. The edge~$uv$ has just ceased to be
 active. We scan all neighbors~$w$ of the vertex~$v$. When $w$~is in~$T$, no action is
 needed. If $w$~is outside~$T$ and it was not adjacent to~$T$ (there is no active edge
@@ -198,8 +199,8 @@ and deletions on the heap.
 \algin A~graph~$G$ with an edge comparison oracle.
 \:$v_0\=$ an~arbitrary vertex of~$G$.
 \:$T\=$ a tree containing just the vertex~$v_0$.
-\:$H\=$ a~heap of active edges stored as pairs $(u,v)$ where $u\in T,v\not\in T$, ordered by the weights $w(vw)$, initially empty.
-\:$A\=$ an~auxiliary array mapping vertices outside~$T$ to their active edges in the heap; initially all elements undefined.
+\:$H\=$ a~Fibonacci heap of active edges stored as pairs $(u,v)$ where $u\in T,v\not\in T$, ordered by the weights $w(uv)$, initially empty.
+\:$A\=$ a~mapping of vertices outside~$T$ to their active edges in the heap; initially all elements undefined.
 \:\<Insert> all edges incident with~$v_0$ to~$H$ and update~$A$ accordingly.
 \:While $H$ is not empty:
 \::$(u,v)\=\<DeleteMin>(H)$.
@@ -211,6 +212,10 @@ and deletions on the heap.
 \algout Minimum spanning tree~$T$.
 \endalgo

+\para
+To analyze the time complexity of this algorithm, we will use the standard
+theorem on~the complexity of the Fibonacci heap:
+
 \thmn{Fibonacci heaps} The~Fibonacci heap performs the following operations
 with the indicated amortized time complexities:
 \itemize\ibull
@@ -220,7 +225,7 @@ with the indicated amortized time complexities:
 \:\<DeleteMin> (deletion of the minimal element) in $\O(\log n)$,
 \:\<Delete> (deletion of an~arbitrary element) in $\O(\log n)$,
 \endlist
-\>where $n$ is the maximum number of elements present in the heap at the time of
+\>where $n$ is the number of elements present in the heap at the time of
 the operation.

 \proof
@@ -229,7 +234,7 @@ heap and the proof of this theorem.
 \qed

 \thm
-Algorithm~\ref{jarniktwo} with a~Fibonacci heap finds the MST of the input graph in time~$\O(m+n\log n)$.
+Algorithm~\ref{jarniktwo} with the Fibonacci heap finds the MST of the input graph in time~$\O(m+n\log n)$.

 \proof
 The algorithm always stops, because every edge enters the heap~$H$ at most once.
@@ -238,8 +243,8 @@ it gives the correct answer.

 The time complexity is $\O(m)$ plus the cost of the heap operations. The algorithm
 performs at most one \<Insert> or \<Decrease> per edge and exactly one \<DeleteMin>
-per vertex and there are at most $n$ elements in the heap at any given time,
-so by the previous theorem the operations take $\O(m+n\log n)$ time in total.
+per vertex. There are at most $n$ elements in the heap at any given time,
+thus by the previous theorem the operations take $\O(m+n\log n)$ time in total.
 \qed

 \cor
@@ -256,15 +261,15 @@ A~nice example is a~\df{$d$-regular heap} --- a~variant of the usual binary
 heap in the form of a~complete $d$-regular tree. \<Insert>, \<Decrease> and other
 operations involving bubbling the values up spend $\O(1)$ time at a~single
 level, so they run in~$\O(\log_d n)$ time. \<DeleteMin> and \<Delete> require bubbling down, which incurs
-comparison with all~$d$ sons at every level, so they run in~$\O(d\log_d n)$.
+comparison with all~$d$ sons at every level, so they spend $\O(d\log_d n)$ time.
 With this structure, the time complexity of the whole algorithm
-is $\O(nd\log_d n + m\log_d n)$, which suggests setting $d=m/n$, giving $\O(m\log_{m/n}n)$.
+is $\O(nd\log_d n + m\log_d n)$, which suggests setting $d=m/n$, yielding $\O(m\log_{m/n}n)$.
 This is still linear for graphs with density at~least~$n^{1+\varepsilon}$.

 Another possibility is to use the 2-3-heaps \cite{takaoka:twothree} or Trinomial
 heaps \cite{takaoka:trinomial}. Both have the same asymptotic complexity as Fibonacci
-heaps (the latter even in worst case, but it does not matter here) and their
-authors claim implementation advantages.
+heaps (the latter even in the worst case, but it does not matter here) and their
+authors claim a~faster implementation.

 \FIXME{Mention Thorup's Fibonacci-like heaps for integers?}

@@ -274,7 +279,7 @@ for sufficiently dense graphs. In some cases, it is useful to combine it with
 another MST algorithm, which identifies a~part of the MST edges and contracts
 the graph to increase its density. For example, we can perform several iterations
 of the Contractive Bor\o{u}vka's algorithm and find the rest of the
-MST by the above version of Jarn\'\i{}k's algorithm.
+MST by the Jarn\'\i{}k's algorithm.

 \algn{Mixed Bor\o{u}vka-Jarn\'\i{}k}
 \algo
@@ -301,9 +306,9 @@ and both trees can be combined in linear time, too.
 \para
 Actually, there is a~much better choice of the algorithms to combine: use the
 improved Jarn\'\i{}k's algorithm multiple times, each time stopping after a~while.
-The good choice of the stopping condition is to place a~limit on the size of the heap.
-Start with an~arbitrary vertex, grow the tree as usually and once the heap gets too large,
-conserve the current tree and start with a~different vertex and an~empty heap. When this
+A~good choice of the stopping condition is to place a~limit on the size of the heap.
+We start with an~arbitrary vertex, grow the tree as usual and once the heap gets too large,
+we conserve the current tree and start with a~different vertex and an~empty heap. When this
 process runs out of vertices, it has identified a~sub-forest of the MST, so we can
 contract the graph along the edges of~this forest and iterate.

@@ -315,12 +320,12 @@ contract the graph along the edges of~this forest and iterate.
 \:$m_0\=m$.
 \:While $n>1$: \cmt{We will call iterations of this loop \df{phases}.}
 \::$F\=\emptyset$. \cmt{forest built in the current phase}
-\::$t\=2^{2m_0/n}$. \cmt{the limit on heap size}
+\::$t\=2^{\lceil 2m_0/n \rceil}$. \cmt{the limit on heap size}
 \::While there is a~vertex $v_0\not\in F$:
 \:::Run the improved Jarn\'\i{}k's algorithm (\ref{jarniktwo}) from~$v_0$, stop when:
 \::::all vertices have been processed, or
 \::::a~vertex of~$F$ has been added to the tree, or
-\::::the heap had more than~$t$ elements.
+\::::the heap has grown to more than~$t$ elements.
 \:::Denote the resulting tree~$R$.
 \:::$F\=F\cup R$.
 \::$T\=T\cup \ell[F]$. \cmt{Remember MST edges found in this phase.}
@@ -330,8 +335,8 @@ contract the graph along the edges of~this forest and iterate.

 \nota
 For analysis of the algorithm, let us denote the graph entering the $i$-th
-phase by~$G_i$ and likewise with the other parameters. The trees from which
-$F_i$~has been constructed will be called $R_i^1, \ldots, R_i^{z_i}$. The
+phase by~$G_i$ and likewise with the other parameters. Let the trees from which
+$F_i$~has been constructed be called $R_i^1, \ldots, R_i^{z_i}$. The
 non-indexed $G$, $m$ and~$n$ will correspond to the graph given as~input.

 \para
@@ -344,7 +349,7 @@ The $i$-th phase of the Iterated Jarn\'\i{}k's algorithm runs in time~$\O(m)$.
 \proof
 During the phase, the heap always contains at most~$t_i$ elements, so it takes
 time~$\O(\log t_i)=\O(m/n_i)$ to delete an~element from the heap. The trees~$R_i^j$
-are disjoint, so there are at most~$n_i$ \<DeleteMin>'s over the course of the phase.
+are edge-disjoint, so there are at most~$n_i$ \<DeleteMin>'s over the course of the phase.
 Each edge is considered at most twice (once per its endpoint), so the number of the
 other heap operations is~$\O(m_i)$. Together, it equals $\O(m_i + n_i\log t_i) = \O(m_i+m) = \O(m)$.
 \qed
@@ -354,19 +359,17 @@ Unless the $i$-th phase is final, the forest~$F_i$ consists of at most $2m_i/t_i

 \proof
 As every edge of~$G_i$ is incident with at most two trees of~$F_i$, it is sufficient
-to establish that there are at least~$t_i$ edges incident with the vertices of every
-such tree~(*).
+to establish that there are at least~$t_i$ edges incident with every such tree, including
+those connecting two vertices of the tree.

 The forest~$F_i$ evolves by additions of the trees~$R_i^j$.
 Let us consider the possibilities how the algorithm could have stopped growing
 the tree~$R_i^j$:
 \itemize\ibull
-\:the heap had more than~$t_i$ elements (step~10): since the elements stored in the heap correspond
- to some of the edges incident with vertices of~$R_i^j$, the condition~(*) is fulfilled;
+\:the heap had more than~$t_i$ elements (step~10): since each element stored in the heap
+ corresponds to a~unique edge incident with~$R_i^j$, we have enough such edges;
 \:the algorithm just added a~vertex of~$F_i$ to~$R_i^j$ (step~9): in this case, an~existing
 tree of~$F_i$ is extended, so the number of edges incident with it cannot decrease;\foot{%
- To make this true, we counted the edges incident with the \em{vertices} of the tree
- instead of edges incident with the tree itself, because we needed the tree edges
- to be counted as well.}
+ This is the place where we needed to count the interior edges as well.}
 \:all vertices have been processed (step~8): this can happen only in the final phase.
 \qeditem
 \endlist
@@ -381,12 +384,12 @@ loop is eventually terminated.
 The resulting subgraph~$T$ is equal to $\mst(G)$, a~subgraph of~$\mst(G_i)$ and
 the $F_i$'s are glued together according to the Contraction lemma (\ref{contlemma}).

-Let us bound the sizes of the graphs processed in individual phases. As the vertices
+Let us bound the sizes of the graphs processed in the individual phases. As the vertices
 of~$G_{i+1}$ correspond to the components of~$F_i$, by the previous lemma $n_{i+1}\le
-2m_i/t_i$. Then $t_{i+1} = 2^{2m/n_{i+1}} \ge 2^{2m/(2m_i/t_i)} = 2^{(m/m_i)\cdot t_i} \ge 2^{t_i}$,
+2m_i/t_i$. Then $t_{i+1} = 2^{\lceil 2m/n_{i+1} \rceil} \ge 2^{2m/n_{i+1}} \ge 2^{2m/(2m_i/t_i)} = 2^{(m/m_i)\cdot t_i} \ge 2^{t_i}$,
 therefore:
 $$
-\left. \vcenter{\hbox{$\displaystyle t_i \ge 2^{2^{\scriptstyle 2^{\scriptstyle\vdots^{\scriptstyle m/n}}}} $}}\;\right\}
+\left. \vcenter{\hbox{$\displaystyle t_i \ge 2^{2^{\scriptstyle 2^{\scriptstyle\rddots^{\scriptstyle m/n}}}} $}}\;\right\}
 \,\hbox{a~tower of~$i$ exponentials.}
 $$
 As soon as~$t_i\ge n$, the $i$-th phase must be final, because at that time
@@ -423,7 +426,7 @@ which need careful handling, so we omit the description of this algorithm.

 %--------------------------------------------------------------------------------

-\section{Verification of minimality}
+%\section{Verification of minimality}

 \endpart

diff --git a/macros.tex b/macros.tex
index f04fa00..1b0d8c1 100644
--- a/macros.tex
+++ b/macros.tex
@@ -43,6 +43,10 @@
 \def\timesbeta{\mskip2mu\beta}
 \def\tower{\mathop\uparrow}

+% A reversed version of \ddots with extra space at the top to get good alignment of exponents.
+\def\rddots{\mathinner{\mkern1mu\raise\p@\vbox{\kern7\p@\hbox{.}}\mkern2mu
+    \raise4\p@\hbox{.}\mkern2mu\raise7\p@\hbox{.}\raise11\p@\hbox{}\mkern1mu}}
+
 % Footnotes
 \newcount\footcnt
 \footcnt=0
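The active-edge bookkeeping of Algorithm jarniktwo in the diff above (the heap H of active edges and the mapping A) can be illustrated by the following Python sketch. It is only an illustration under stated assumptions: Python's heapq is a binary heap without a Decrease operation, so superseded entries are skipped lazily and the running time is O(m log n) rather than the O(m + n log n) obtained with a Fibonacci heap; the function name jarnik_mst and the input format are ours, not part of the source.

import heapq

def jarnik_mst(n, adj):
    """Illustrative sketch only: grow an MST from vertex 0, keeping one active
    edge per vertex outside the tree.

    n   -- number of vertices 0..n-1 (the graph is assumed to be connected)
    adj -- adjacency lists: adj[v] is a list of (weight, neighbour) pairs
    Returns the MST as a list of (weight, u, v) triples.
    """
    in_tree = [False] * n
    active = {}     # the mapping A: outside vertex v -> (weight, u) of its active edge uv
    heap = []       # the heap H: (weight, v); superseded entries are skipped when popped
    tree = []

    def scan(v):
        # Scan the neighbours of a freshly added vertex v and update the active edges.
        for weight, w in adj[v]:
            if in_tree[w]:
                continue
            if w not in active or weight < active[w][0]:
                active[w] = (weight, v)              # Insert or Decrease
                heapq.heappush(heap, (weight, w))    # lazy stand-in for Decrease

    in_tree[0] = True
    scan(0)
    while heap:
        weight, v = heapq.heappop(heap)              # DeleteMin
        if in_tree[v] or active[v][0] != weight:
            continue                                 # stale entry left behind by "Decrease"
        u = active[v][1]
        in_tree[v] = True
        tree.append((weight, u, v))
        scan(v)
    return tree

For example, jarnik_mst(3, [[(1, 1), (3, 2)], [(1, 0), (2, 2)], [(3, 0), (2, 1)]]) returns the two lighter edges of the weighted triangle, [(1, 0, 1), (2, 1, 2)].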
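The Mixed Boruvka-Jarnik combination described above first runs a few iterations of the Contractive Boruvka's algorithm to shrink the graph and only then switches to the heap-based Jarnik's algorithm. The sketch below covers the contraction part only; the number of iterations is left as a parameter (the diff does not show the value used in the algorithm), ties between equal weights are broken by edge index so that the selected lightest edges are well defined, the graph is assumed to have no self-loops, and the helper name boruvka_contract is ours. The remaining MST edges of the contracted graph can then be found, e.g., by the jarnik_mst sketch above.

def boruvka_contract(n, edges, iterations):
    """Illustrative sketch only: perform the given number of Contractive Boruvka
    iterations.

    edges -- list of (weight, u, v); ties are broken by edge index
    Returns (chosen, n2, edges2): indices of MST edges found so far and the
    contracted graph on n2 vertices (self-loops dropped, parallel edges kept,
    every surviving edge still carrying its original index).
    """
    chosen = []
    live = [(w, u, v, i) for i, (w, u, v) in enumerate(edges)]
    for _ in range(iterations):
        if n <= 1 or not live:
            break
        best = [None] * n                    # lightest edge incident with each vertex
        for w, u, v, i in live:
            for x in (u, v):
                if best[x] is None or (w, i) < (best[x][0], best[x][3]):
                    best[x] = (w, u, v, i)
        # Merge the endpoints of all selected edges with a simple union-find.
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        picked = {e[3] for e in best if e is not None}
        chosen.extend(sorted(picked))
        for w, u, v, i in live:
            if i in picked:
                parent[find(u)] = find(v)
        # Contract: renumber the components and drop the edges inside them.
        roots = sorted({find(v) for v in range(n)})
        newid = {r: j for j, r in enumerate(roots)}
        live = [(w, newid[find(u)], newid[find(v)], i)
                for w, u, v, i in live if find(u) != find(v)]
        n = len(roots)
    return chosen, n, live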
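Finally, the Iterated Jarnik's algorithm from the diff grows many trees per phase, each with the heap limited to t = 2^{ceil(2*m_0/n)} elements, and contracts the resulting forest between phases. The sketch below follows that outline under the same caveats as before: heapq with lazy invalidation stands in for the Fibonacci heap (so stale entries also count toward the heap-size limit here), a union-find structure replaces the explicit contraction of step 13, edge weights are assumed to be distinct, and all names (iterated_jarnik_mst, DSU) are ours.

import heapq
import math

class DSU:
    """Union-find over the vertices of the current (contracted) graph."""
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x
    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def iterated_jarnik_mst(n, edges):
    """Illustrative sketch only: edges is a list of (weight, u, v) of a connected
    graph; returns the indices of the MST edges."""
    mst = []
    m0 = len(edges)                                   # m_0 stays fixed over all phases
    live = [(w, u, v, i) for i, (w, u, v) in enumerate(edges)]
    while n > 1:
        t = 2 ** math.ceil(2 * m0 / n)                # heap-size limit of this phase
        adj = [[] for _ in range(n)]
        for w, u, v, i in live:
            adj[u].append((w, v, i))
            adj[v].append((w, u, i))
        dsu = DSU(n)
        done = [False] * n                            # done[v]: v already lies in the forest F
        for v0 in range(n):
            if done[v0]:
                continue
            done[v0] = True
            heap, active = [], {}                     # active edges of the tree grown from v0
            def relax(v):
                for w, x, i in adj[v]:
                    if dsu.find(x) == dsu.find(v0):
                        continue                      # x is already in the current tree
                    if x not in active or w < active[x][0]:
                        active[x] = (w, i)
                        heapq.heappush(heap, (w, x))
            relax(v0)
            while heap and len(heap) <= t:            # stop once the heap grows past t
                w, x = heapq.heappop(heap)
                if dsu.find(x) == dsu.find(v0) or active[x][0] != w:
                    continue                          # stale heap entry
                mst.append(active[x][1])              # the lightest active edge joins the MST
                dsu.union(x, v0)
                if done[x]:
                    break                             # we ran into an earlier tree of F
                done[x] = True
                relax(x)
        # Contract the graph along F: renumber components, drop the edges inside them.
        roots = sorted({dsu.find(v) for v in range(n)})
        newid = {r: j for j, r in enumerate(roots)}
        live = [(w, newid[dsu.find(u)], newid[dsu.find(v)], i)
                for w, u, v, i in live if dsu.find(u) != dsu.find(v)]
        n = len(roots)
    return sorted(mst)

Using a union-find structure here is a shortcut: it lets each phase keep working with the original vertex numbering while still treating a tree that ran into an earlier one as a single component, which is exactly what the contraction between phases achieves in the algorithm itself.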