From 583801851a8de6cc312a112ef170d4a279d55b3d Mon Sep 17 00:00:00 2001 From: Martin Mares Date: Fri, 7 Mar 2008 22:28:43 +0100 Subject: [PATCH] Komlos's theorem. --- adv.tex | 136 +++++++++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 109 insertions(+), 27 deletions(-) diff --git a/adv.tex b/adv.tex index 3a174c8..dde5acb 100644 --- a/adv.tex +++ b/adv.tex @@ -544,7 +544,7 @@ which need careful handling, so we omit the description of this algorithm. \section{Verification of minimality} Now we will turn our attention to a~slightly different problem: given a~spanning -tree, how to verify that it is minimal? We will show that this can be achieved +tree, how to verify that it is minimum? We will show that this can be achieved in linear time and it will serve as a~basis for the randomized linear-time MST algorithm in the next section. @@ -558,7 +558,7 @@ We will follow the results of Koml\'os and King, but as we have the power of the RAM data structures from Section~\ref{bitsect} at our command, we will not have to worry about much technical details. -To verify that a~spanning~$T$ is minimal, it is sufficient to check that all +To verify that a~spanning~$T$ is minimum, it is sufficient to check that all edges outside~$T$ are $T$-heavy (by Theorem \ref{mstthm}). In fact, we will be able to find all $T$-light edges efficiently. For each edge $uv\in E\setminus T$, we will find the heaviest edge of the tree path $T[u,v]$ and compare its weight @@ -642,7 +642,7 @@ See for example Bender and Falach-Colton \cite{bender:lca}. We summarize the reductions in the following lemma, also showing that they can be performed in linear time: -\lemma +\lemma\id{verbranch}% For each tree~$T$ and a~set of query paths~$Q$ on~$T$, it is possible to find a~complete branching tree~$T'$ in linear time together with a~set~$Q'$ of query paths on~$T'$, such that the weights of the heaviest edges of the new paths can @@ -669,43 +669,125 @@ is then trivial. \qed \para -We will now describe a~simple variant of the depth-first search which finds the -maximum-weight edges for all query paths. +We will now describe a~simple variant of depth-first search which finds the +maximum-weight edges for all query paths of the transformed problem. For the +time being, we will not care about the time complexity of the algorithm (as long +as it is polynomial) and we will minimize only the number of edge weight comparisons +performed. For every edge~$e=uv$, we consider the set $Q_e$ of all query paths containing~$e$. -The vertex of a~path which is closer to the root will be called its \df{top,} +The vertex of a~path, which is closer to the root, will be called its \df{top,} the other vertex its \df{bottom.} We define arrays $T_e$ and~$H_e$ as follows: $T_e$ contains the tops of the paths in~$Q_e$ in order of their increasing depth (we will call them \df{active tops} and each of them will be stored exactly once). For -each active top~$t=T_e[i]$, the $H_e[i]$ is defined as the heaviest edge of the path $T[v,t]$. +each active top~$t=T_e[i]$, we define $H_e[i]$ as the heaviest edge of the path $T[v,t]$. As for every~$i$ the path $T[v,T_e[i+1]]$ is contained within $T[v,T_e[i]]$, -the edges in~$H_e$ must have non-increasing weights: $w(H_e[i+1]) \le +the edges in~$H_e$ must have non-increasing weights, that is $w(H_e[i+1]) \le w(H_e[i])$. -The algorithm constructs the $T_e$'s and $H_e$'s recursively. We assume that the -root has a~parent edge~$r$ and we define $T_r$ and~$H_r$ as empty arrays (no query -path contains~$r$). When we arrived to a~vertex~$u$ via an~edge~$p$ from its parent, -we visit each son~$v$ of~$u$ and we calculate $T_e$ and~$H_e$ for the edge $e=uv$ -from $T_p$ and~$H_p$: +\alg $\(u,p,T_p,H_p)$ --- process all queries in the subtree rooted +at~$u$ entered from its parent via an~edge~$p$. +\id{findheavy} -\itemize\ibull -\:First we remove all tops which are no longer contained in paths crossing the -current edge~$e$. -%For example, we can keep a~use count for every active top and -%decrease counters of the tops of all paths whose bottom is~$u$. +\algo +\:Process all query paths whose bottom is~$u$. For each of them, find the top of +the path in~$T_p$ and record the heaviest edge remembered for it in~$H_p$. -\:Then we update the heaviest edges for all tops. The new edge~$e$ will replace -all lighter edges in $H_p$. Since $H_p$ is sorted, we can use binary search -to locate the boundary between lighter and heavier edges in it. +\:First find the tops~$T$ which will be shared by all edges going from~$u$ downwards. +These are the tops from~$T_p$ except for the ones which have ceased to be active, +because all query paths which were referring to them have~$u$ as their bottom. +Select the corresponding array of the heaviest edges~$H$ from~$H_p$. + +\:For every son~$v$ of~$u$, do: + +\::Construct the array of tops~$T_e$ for the edge $e=uv$. It contains all tops + from~$T$ and possibly also the vertex~$u$ itself if there is a~query path + which has~$u$ as its top and which has bottom somewhere in the subtree + rooted at~$v$. + +\::Construct the array of the heaviest edges~$H_e$. For all tops from~$T$, just compare + the weight of the edge recorded in~$H$ with $w(e)$ and if it was smaller, + replace the remembered edge by~$e$. Since $H_p$ was sorted, so is~$H$ + and we can use binary search to locate the boundary between lighter and + heavier edges in~$H$. + For the new top~$u$, the edge~$e$ itself is obviously the heaviest. + +\::Recurse on~$v$: call $\(v,e,T_e,H_e)$. +\endalgo + +\>As we need a~parent edge to start the recursion, we add an~imaginary parent +edge~$p_0$ of the root vertex~$r$, for which no queries are defined. We can +therefore start with $\(r,p_0,\emptyset,\emptyset)$. + +Let us account for the comparisons: + +\lemma\id{vercompares}% +When the procedure \ is called on the transformed problem, it +performs $\O(n+m)$ comparisons, where $n$ is the size of the tree and +$m$ is the number of query paths. + +\proof +We will calculate the number of comparisons~$c_i$ performed when processing the edges +going from the $(i+1)$-th to the $i$-th level of the tree. +The levels are numbered from the bottom, so leaves are at level~0, the root +is at level $\ell\le \lceil \log_2 n\rceil$ and there are $n_i\le n/2^i$ vertices +at the $i$-th level, so we consider exactly $n_i$ edges. +\def\eqalign#1{\null\,\vcenter{\openup\jot + \ialign{\strut\hfil$\displaystyle{##}$&$\displaystyle{{}##}$\hfil + \crcr#1\crcr}}\,} +$$\vcenter{\openup\jot\halign{\strut\hfil $\displaystyle{#}$&$\displaystyle{{}#}$\hfil&\quad#\hfil\cr +c_i &\le \sum_e \left( 1 + \log \vert T_e\vert \right)&(Total cost of the binary search.)\cr + &\le n_i + \sum_e \log\vert T_e\vert&(We sum over $n_i$ edges.)\cr + &\le n_i + n_i \cdot {\sum_e \log\vert T_e\vert \over n_i}&(Consider the average of logarithms.) \cr + &\le n_i + n_i \cdot \log{\sum_e \vert T_e\vert \over n_i}&(Logarithm is concave.) \cr + &\le n_i + n_i \cdot \log{m\over n_i}&(Bound the number of tops by queries.) \cr + &= n_i \cdot \left( 1 + \log\left({m\over n}\cdot{n\over n_i}\right) \right)\cr + &= n_i + n_i\log{m\over n} + n_i\log{n\over n_i}.\cr +}}$$ +Summing over all levels, we estimate the total number of comparisons as: +$$ +c = \sum_i c_i = \left( \sum_i n_i \right) + \left( \sum_i n_i \log{m\over n}\right) + \left( \sum_i n_i \log{n\over n_i}\right). +$$ +The first part is equal to~$n$, the second one to $n\log(m/n)\le m$. For the third +one, we would like to plug in $n_i \le n/2^i$, but we unfortunately have one~$n_i$ +in the denominator. We save the situation by observing that the function $f(x)=x\log(n/x)$ +is decreasing\foot{We can easily check the derivative: $f(x)=(x\ln n-x\ln x)/\ln 2$, so $f'(x)\cdot \ln2 = +\ln n - \ln x - 1$. We want $f'(x)<0$ and therefore $\ln x > \ln n - 1$, i.e., $x > n/e$.} +for $x > n/e$, so except for $i=0$ and $i=1$ it holds that: +$$ +n_i\log{n\over n_i} \le {n\over 2^i}\cdot\log{n\over n/2^i} = {n\over 2^i} \cdot i. +$$ +We can therefore rewrite the third part as: +$$\eqalign{ +\sum_i n_i\log{n\over n_i} &\le n_0\log{n\over n_0} + n_1\log{n\over n_1} + n\cdot\sum_{i\ge 2}{i\over 2^i} \le\cr +&\le n\log1 + n_1\cdot {n\over n_1} + n\cdot\O(1) = \O(n).\cr +}$$ +Putting all three parts together, we conclude that: +$$ +c \le n + m + \O(n) = \O(n+m). +$$ +\qed + +\para +When we combine this lemma with the above reduction, we get the following theorem: + +\thmn{Verification of the MST, Koml\'os \cite{komlos:verify}} +For every weighted graph~$G$ and its spanning tree~$T$, it is sufficient to +perform $\O(m)$ comparisons of edge weights to determine whether~$T$ is minimum +and to find all $T$-light edges in~$G$. + +\para +We first transform the problem to finding the heaviest edges for a~set +of query paths in~$T$ (these are exactly the paths covered by the edges +of $G\setminus T$). We use the reduction from Lemma \ref{verbranch} to get +an~equivalent problem with a~full branching tree and a~set of parent-descendant +paths. Then we run the \ procedure (\ref{findheavy}) to find +the heaviest edges and employ Lemma \ref{vercompares} to bound the number +of comparisons used. +\qed -\:Finally we have to add~$u$ as a~new top of there is a~query path which has~$u$ -at its top and its bottom lies anywhere in the subtree rooted at~$v$. -\endlist -So far, we will be interested in minimizing only the number of comparisons of -edge weights, so we will postpone the exact implementation of the first and the -third step for a~while. -- 2.39.2