From ad2801e3226e092b1cc36bc1658e8dccd61b278d Mon Sep 17 00:00:00 2001
From: Martin Mares
Date: Tue, 8 Apr 2008 10:51:21 +0200
Subject: [PATCH] Improve description of classical algorithms.

---
 PLAN         |  2 +-
 dyn.tex      | 16 +++++-----
 mst.tex      | 87 ++++++++++++++++++++++++++++++++++++----------------
 notation.tex |  4 +++
 4 files changed, 75 insertions(+), 34 deletions(-)

diff --git a/PLAN b/PLAN
index 3208200..174f3a6 100644
--- a/PLAN
+++ b/PLAN
@@ -97,7 +97,7 @@ Notation:
 
 - introduce \widehat\O early
 - unify { x ; ... }, { x | ...} and { x : ... }
 - capitalize Pointer Machine
-- define \alpha(m,n) and use it instead of \alpha(n)
+- define \alpha(m,n)
 
 Varia:
diff --git a/dyn.tex b/dyn.tex
index b16f43a..e8ecc4e 100644
--- a/dyn.tex
+++ b/dyn.tex
@@ -19,10 +19,10 @@ connected components:
 \problemn{Dynamic connectivity}
 Maintain an~undirected graph under a~sequence of the following operations:
 \itemize\ibull
-\:$\<Init>(n)$ --- create a~graph with $n$~isolated vertices,\foot{%
-The structure could support dynamic additional and removal of vertices, too,
-but this is easy to add and infrequently used, so we will rather use a~fixed
-set of vertices for clarity.}
+\:$\<Init>(n)$ --- create a~graph with $n$~isolated vertices $\{1,\ldots,n\}$,\foot{%
+The structure could support dynamic addition and removal of vertices, too,
+but this is easy to add and infrequently used, so we will rather define
+the structure with a~fixed set of vertices for clarity.}
 \:$\<Insert>(G,u,v)$ --- insert an~edge $uv$ to~$G$ and return its unique
 identifier (assuming that the edge did not exist yet),
 \:$\<Delete>(G,e)$ --- delete an~edge specified by its identifier from~$G$,
@@ -32,9 +32,11 @@ identifier (assuming that the edge did not exist yet),
 
 \para
 We have already encountered a~special case of dynamic connectivity when implementing
 the Kruskal's algorithm in Section \ref{classalg}. At that time, we did not need to delete
-any edges from the graph (this is sometimes called a~\df{semidynamic graph algorithm}),
-which makes the problem substantially easier. We mentioned the Union-Find data structure
-of Tarjan and van Leeuwen ....
+any edges from the graph, which makes the problem substantially easier. This special
+case is sometimes called an~\df{incremental} or \df{semidynamic} graph algorithm.
+We mentioned the Union-Find data structure of Tarjan and van Leeuwen that is able
+to handle this case in time $\O(n+m\timesalpha(m,n))$, where $\alpha(m,n)$ is the
+inverse Ackermann function.
 
 %--------------------------------------------------------------------------------
diff --git a/mst.tex b/mst.tex
index f054c1e..591d70c 100644
--- a/mst.tex
+++ b/mst.tex
@@ -324,6 +324,14 @@ For each of them, we first show the general version of the algorithm, then we pr
 it gives the correct result and finally we discuss the time complexity of various
 implementations.
 
+\paran{Bor\o{u}vka's algorithm}%
+The oldest MST algorithm is based on a~simple idea: grow a~forest in a~sequence of
+iterations until it becomes connected. We start with a~forest of isolated
+vertices. In each iteration, we let each tree of the forest select the lightest
+edge of those having exactly one endpoint in the tree (we will call such edges
+the \df{neighboring edges} of the tree). We add all such edges to the forest and
+proceed with the next iteration.
+
 \algn{Bor\o{u}vka \cite{boruvka:ojistem}, Choquet \cite{choquet:mst}, Sollin \cite{sollin:mst} and others}
 \algo
 \algin A~graph~$G$ with an edge comparison oracle.
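
For illustration, the iteration scheme described in the Bor\o{u}vka paragraph added
above could be sketched in Python roughly as follows. This is only an illustrative
sketch, not part of the patch: the function names and the (weight, u, v) edge triples
are assumptions, vertices are labelled 0..n-1, and edge weights are assumed to be
distinct as elsewhere in the thesis. A small union-find structure is used only to
identify the trees of the current forest.

    def boruvka_forest(n, edges):
        """Boruvka iterations: repeatedly let every tree of the forest pick its
        lightest neighboring edge and add all selected edges at once."""
        parent = list(range(n))              # union-find over the trees of the forest

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]    # path halving
                x = parent[x]
            return x

        forest = set()
        while True:
            best = {}                        # lightest neighboring edge of each tree
            for w, u, v in edges:
                ru, rv = find(u), find(v)
                if ru == rv:
                    continue                 # both endpoints lie in the same tree
                for r in (ru, rv):
                    if r not in best or w < best[r][0]:
                        best[r] = (w, u, v)
            if not best:
                break                        # no tree has a neighboring edge left
            for w, u, v in best.values():
                ru, rv = find(u), find(v)
                if ru != rv:                 # the two trees may have merged already
                    parent[ru] = rv
                    forest.add((w, u, v))
        return forest

Every iteration joins each remaining tree to at least one other tree, so the number
of trees at least halves; there are O(log n) iterations of O(m) work each, matching
the O(m log n) bound proved in the hunk below.
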
@@ -386,6 +394,12 @@ Bor\o{u}vka's algorithm finds the MST in time $\O(m\log n)$.
 Follows from the previous lemmata.
 \qed
 
+\paran{Jarn\'\i{}k's algorithm}%
+The next algorithm, discovered independently by Jarn\'\i{}k, Prim and Dijkstra, is similar
+to Bor\o{u}vka's algorithm, but instead of the whole forest it concentrates on
+a~single tree. It starts with a~single vertex and it repeatedly extends the tree
+by the lightest neighboring edge until it spans the whole graph.
+
 \algn{Jarn\'\i{}k \cite{jarnik:ojistem}, Prim \cite{prim:mst}, Dijkstra \cite{dijkstra:mst}}\id{jarnik}%
 \algo
 \algin A~graph~$G$ with an edge comparison oracle.
@@ -408,10 +422,9 @@ number of blue edges.
 \qed
 
 \impl\id{jarnimpl}%
-The most important part of the algorithm is finding \em{neighboring edges,} i.e., edges
-of the cut $\delta(T)$. In a~straightforward implementation,
-searching for the lightest neighboring edge takes $\Theta(m)$ time, so the whole
-algorithm runs in time $\Theta(mn)$.
+The most important part of the algorithm is finding \em{neighboring edges.}
+In a~straightforward implementation, searching for the lightest neighboring
+edge takes $\Theta(m)$ time, so the whole algorithm runs in time $\Theta(mn)$.
 
 We can do much better by using a binary heap to hold all neighboring edges.
 In each iteration, we find and delete the
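
The binary-heap implementation discussed in the hunk above might look roughly like
this in Python. Again this is only an illustrative sketch with assumed names, not
part of the patch; it is the common lazy variant that skips stale heap entries
instead of deleting edges whose other endpoint has already joined the tree, which
keeps the same O(m log n) bound.

    import heapq

    def jarnik_tree(n, adj, start=0):
        """Grow a single tree from `start`, always adding the lightest
        neighboring edge; adj[u] is a list of (weight, v) pairs."""
        in_tree = [False] * n
        in_tree[start] = True
        heap = [(w, start, v) for w, v in adj[start]]    # neighboring edges
        heapq.heapify(heap)
        tree = []
        while heap:
            w, u, v = heapq.heappop(heap)    # lightest candidate edge
            if in_tree[v]:
                continue                     # stale entry: v joined the tree earlier
            in_tree[v] = True
            tree.append((w, u, v))
            for w2, x in adj[v]:             # new neighboring edges of v
                if not in_tree[x]:
                    heapq.heappush(heap, (w2, v, x))
        return tree

Each edge enters the heap at most once, so the heap never holds more than m entries
and the total running time stays within O(m log n).
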
@@ -428,10 +441,17 @@ Jarn\'\i{}k's algorithm finds the MST of a~given graph in time $\O(m\log n)$.
 \rem
 We will show several faster implementations in section \ref{fibonacci}.
 
-\algn{Kruskal \cite{kruskal:mst}, the Greedy algorithm}
+\paran{Kruskal's algorithm}%
+The last of the three classical algorithms processes the edges of the
+graph~$G$ greedily. It starts with an~empty forest and it takes the edges of~$G$
+in order of their increasing weights. For every edge, it checks whether its
+addition to the forest produces a~cycle, and if it does not, the edge is added.
+Otherwise, the edge is dropped and not considered again.
+
+\algn{Kruskal \cite{kruskal:mst}}
 \algo
 \algin A~graph~$G$ with an edge comparison oracle.
-\:Sort edges of~$G$ by their increasing weight.
+\:Sort edges of~$G$ by their increasing weights.
 \:$T\=\emptyset$. \cmt{an empty spanning subgraph}
 \:For all edges $e$ in their sorted order:
 \::If $T+e$ is acyclic, add~$e$ to~$T$.
@@ -453,38 +473,53 @@ so~$T$ must be the~MST.
 
 \impl
 Except for the initial sorting, which in general takes $\Theta(m\log m)$ time, the only
-other non-trivial operation is the detection of cycles. What we need is a data structure
+other non-trivial operation is the detection of cycles. What we need is a~data structure
 for maintaining connected components, which supports queries and edge insertion.
-(This is also known under the name Disjoint Set Union problem, i.e., maintenance
-of an~equivalence relation on a~set with queries on whether two elements are equivalent
-and the operation of joining two equivalence classes into one.)
-The following theorem shows that it can be done with surprising efficiency.
+This is closely related to the well-known Disjoint Set Union problem:
+
+\problemn{Disjoint Set Union (DSU)}
+Maintain an~equivalence relation on a~finite set under a~sequence of operations \<Union>
+and \<Find>. The \<Find> operation tests whether two elements are equivalent and \<Union>
+joins two different equivalence classes into one.
 
-\thmn{Incremental connectivity}%
-When only edge insertions and connectivity queries are allowed, connected components
-can be maintained in $\O(\alpha(n))$ time amortized per operation.
+\para
+We can maintain the connected components of our forest~$T$ as equivalence classes. When we want
+to add an~edge~$uv$, we first call $\<Find>(u,v)$ to check if both endpoints of the edge lie in
+the same component. If they do not, addition of this edge connects both components into one,
+so we perform $\<Union>(u,v)$ to merge the equivalence classes.
+
+Tarjan and van Leeuwen have shown that there is a~data structure for the DSU problem
+with surprising efficiency:
+
+\thmn{Disjoint Set Union, Tarjan and van Leeuwen \cite{tarjan:setunion}}%
+Starting with a~trivial equivalence with single-element classes, a~sequence of operations
+consisting of $n$~\<Union>s intermixed with $m\ge n$~\<Find>s can be processed in time
+$\O(m\timesalpha(m,n))$, where $\alpha(m,n)$ is the inverse Ackermann function.\foot{See Section \ref{ackersec} for a~precise definition.}
 
 \proof
-Proven by Tarjan and van Leeuwen in \cite{tarjan:setunion}.
-See Chapter~\ref{dynchap} for more context.
+See \cite{tarjan:setunion}.
 \qed
 
-\FIXME{Define Ackermann's function. Use $\alpha(m,n)$?}
+This gives us the following theorem:
+
+\thm\id{kruskal}%
+Kruskal's algorithm finds the MST of a given graph in time $\O(m\log n)$.
+If the edges are already sorted by their weights, the time drops to
+$\O(m\timesalpha(m,n))$.
+
+\proof
+We spend $\O(m\log n)$ on sorting, $\O(m\timesalpha(m,n))$ on processing the sequence
+of \<Union>s and \<Find>s, and $\O(m)$ on all other work.
+\qed
 
 \rem
-The cost of the operations on components is of course dwarfed by the complexity
+The cost of the \<Union> and \<Find> operations is of course dwarfed by the complexity
 of sorting, so a much simpler (at least in terms of its analysis) data
 structure would be sufficient, as long as it has $\O(\log n)$ amortized
 complexity per operation. For example, we can label vertices with identifiers of the
-corresponding components and always recolor the smaller of the two components.
+corresponding components and always relabel the smaller of the two components.
 
-\thm\id{kruskal}%
-Kruskal's algorithm finds the MST of a given graph in time $\O(m\log n)$
-or $\O(m\timesalpha(n))$ if the edges are already sorted by their weights.
-
-\proof
-Follows from the above analysis.
-\qed
+We will study dynamic maintenance of connected components in more detail in Chapter~\ref{dynchap}.
 
 %--------------------------------------------------------------------------------
diff --git a/notation.tex b/notation.tex
index b53fdf1..caaf93e 100644
--- a/notation.tex
+++ b/notation.tex
@@ -140,4 +140,8 @@ does not depend on the order of edges).
 For $U\subseteq V(G)$, we define $G/U$ as the graph with all vertices of~$U$ merged to a~single
 vertex, that is $(G\cup U^*)/U^*$, where $U^*$ is the complete graph on~$U$. Similarly for $G.F$ and $G.U$.
 
+%--------------------------------------------------------------------------------
+
+\section{Ackermann function and its inverse}\id{ackersec}%
+
 \endpart
-- 
2.39.2
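
For completeness, here is a rough Python sketch of Kruskal's algorithm on top of
a DSU structure, tying together the pieces added in mst.tex above. It is
illustrative only and not part of the patch; the names are assumptions, and the
union by size with path compression used below is just one of the variants covered
by the Tarjan-van Leeuwen bound.

    def kruskal_forest(n, edges):
        """Kruskal's algorithm: n vertices 0..n-1, edges as (weight, u, v)
        triples; returns the edges of the minimum spanning forest."""
        # DSU with union by size and path compression; its amortized cost per
        # operation is bounded by the inverse Ackermann function.
        parent = list(range(n))
        size = [1] * n

        def find(x):
            root = x
            while parent[root] != root:
                root = parent[root]
            while parent[x] != root:         # path compression
                parent[x], x = root, parent[x]
            return root

        def union(u, v):
            ru, rv = find(u), find(v)
            if ru == rv:
                return False                 # u and v already connected: drop the edge
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru                  # attach the smaller class below the larger
            size[ru] += size[rv]
            return True

        forest = []
        for w, u, v in sorted(edges):        # the dominant O(m log m) sorting step
            if union(u, v):                  # cycle test via the DSU; merge if acyclic
                forest.append((w, u, v))
        return forest

If the edges arrive already sorted, the call to sorted() disappears and the
Find/Union processing with its inverse-Ackermann amortized cost dominates, as in
the Kruskal theorem above.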