Advanced Algorithms Notes

Notes for Advanced Algorithm


20180905 Min-Cut and Max-Cut

Min-Cut

Deterministic Algorithm

Randomized Algorithm

  • Karger’s Contraction Algorithm

    • Random Contract (Choose $e$ randomly)

      fig1_1

    • While($|V| \gt 2$) Repeat

    • $O(n)$ for 1 contraction, $n-2$ contractions, $O(n^2)$ for Karger

    • For any multigraph with $n$ verticles, the RandomContract algorithm returns a minimum cut with probability at least $\frac{2}{n(n-1)}$

    • Run RandomContract independently for $t=\frac{n(n-1)\ln n}{2}$ times, the probability that a minimum cut is found is at least $1-\frac{1}{n}$

    • $O(n^2)$ for 1 time, $O(n^4\log n)$for all ($Pr \geq 1-\frac{1}{n}$)

  • For any graph $G(V,E)$ of $n$ vertices, the number of distinct minimum cuts in $G$ is at most $\frac{n(n-1)}{2}$

  • Fast Min-Cut

    • Random Contract

    • Fastcut

      fig1_2

    • A : The minimum-cut C survives all contractions in RandomContract(G,t)

      B : size of min-cut is unchanged after RandomContract(G,t)

      $Pr[B]\geq Pr[A]\geq \frac{t(t-1)}{n(n-1)}\geq \frac{1}{2} (t=\lceil 1+\frac{n}{\sqrt2}\rceil)$

    • For any multigraph with $n$ vertices, the FastCut algorithm returns a minimum cut with probability $\Omega(\frac{1}{\log n})$ in time $O(n^2\log n)$

    • $O(n^2\log n)​$ for 1 time, $O(n^2\log^3n)​$ for all ($Pr \geq 1-O(\frac{1}{n})​$)

Max-Cut

Greedy Algorithm

  • Greedy Heuristics

    fig1_3

  • Approximation Ratio

    • $|E|=\sum_{i=1}^n(|E(S_i,{v_i})|+|E(T_i,{v_i}|))$
    • $SOL_G=\sum_{i=1}^n\max (|E(S_i,{v_i})|+|E(T_i,{v_i}|)$
    • $SOL_G\geq \frac{|E|}{2}\geq\frac{OPT_G}{2}$

Derandomization by Conditional Expectation

  • For every vertex $v\in V$ let $X_v\in{0, 1}$ be a uniform and independent random bit which indicates whether $v$ joins $S$ or $T$.
  • $\mathbb{E}[|E(S,T)|]=\sum_{uv\in E}\mathbb{E}[I[X_u\neq X_v]]=\sum_{uv\in E}Pr[X_u\neq X_v]=\frac{|E|}{2}\geq \frac{OPT_G}{2}$
  • Find the solution satisfied above(☆)

Derandomization by Pairwise Independence

  • Use $\log n$ independent variables, with xor

20180912 Fingerprinting

Polynomial Identity Testing (PIT)

  • Given as input two polynomials, determine whether they are identical.
  • $f\equiv g \Leftrightarrow f-g\equiv 0$

Deterministic Algorithm

  • Evaluate $f(x_1), f(x_2), \dots, f(x_{d+1})$ over $d+1$ distinct elements $x_1, x_2, \dots, x_{d+1}$ from the field $\mathbb{F}$ and check whether they are all zero.

  • Fundamental Theorem of Algebra

    Any non-zero univariate polynomial of degree $d$ has at most $d$ roots.

Randomized Algorithm

  • fig2_1

  • $Pr[f(r)=0]\leq\frac{d}{|S|}$

Communication Complexity of Equality

  • $EQ(a,b)=\begin{cases}1 & \text{if a=b,} \\ 0 & \text{otherwise.}\end{cases}$

  • Theorem (Yao 1979)

    Any deterministic communication protocol computing EQ on two $n$-bit strings costs $n$ bits of communication in the worst-case.

  • $f(x)=\sum_{i=0}^{n-1}a_ix^i$ and $g(x)=\sum_{i=0}^{n-1}b_ix^i$, defined over finite field $\mathbb{Z}_p=\lbrace 0,1,\dots,p-1\rbrace $ for some suitable $p$

  • fig2_2

  • Communication complexity bounded with $O(\log p)$, $Pr[f(r)=g(r)]\leq\frac{n-1}{p}$, choosing $p$ to be a prime in the interval $[n^2,2n^2]$, error prability of false positive at most $O(\frac{1}{n})$

  • Schwartz-Zippel Theorem

    Let $f\in\mathbb{F}[x_1,x_2,\dots,x_n]$ be a multivariate polynomial of degree $d$ over a field $\mathbb{F}$. If $f \not\equiv0$, then for any finite set $S \subset\mathbb{F}$, and $r_1,r_2,\dots,r_n\in S$ chosen uniformly and independently at random, $Pr[f(r_1,r_2,\dots,r_n)=0]\leq\frac{d}{|S|}$.

    Proof(★) of induction

    Notes :

    1. The proof of $n=1$ is based on fundamental theorem of algebra.
    2. Straightly consider multivariate polynomial $f(x_1,x_2,\dots,x_n)$ as univariate polynomial $f_{r_1,r_2,\dots,r_{n-1}}(r_n)$ with $Pr[f_{r_1,r_2,\dots,r_{n-1}}(r_n)=0]\leq\frac{d}{|S|}$is wrong, because $Pr[f_{r_1,r_2,\dots,r_{n-1}}(r_n)=0]=Pr[\text{This univariate polynomial$\equiv0$}]+Pr[\text{The coefficient$=0$}]$ and the only truth is $Pr[\text{This univariate $\equiv0$}]\leq\frac{d}{|S|}$.

Fingerprinting

  • Fingerprints function $FING(\cdot)$
  • $FING(\cdot)$ is a function, meaning that if $X=Y$ then $FING(X)=FING(Y)$.
  • It is much easier to compute and compare the fingerprints.
  • Ideally, the domain of fingerprints is much smaller than the domain of original objects, so storing and comparing fingerprints are easy. This means the fingerprint function $FING(\cdot)$ cannot be an injection (one-to-one mapping), so it’s possible that different $X$ and $Y$ are mapped to the same fingerprint. We resolve this by making fingerprint function randomized, and for $X\neq Y$, we want the probability $Pr[FING(X)=FING(Y)]$ to be small.

Communication protocols for Equality by fingerprinting

Fingerprinting by randomized checksum

  • Now we consider a new fingerprint function: We treat each input string $x\in\lbrace 0,1\rbrace ^n$ as the binary representation of a number, and let $FING(x)=x\text{ mod }p$ for some random prime $p$ chosen from $[k]=\lbrace 0,1,\dots,k-1\rbrace $.

  • fig2_3

  • The number of bits to be communicated is obviously $O(\log k)$.

  • Let $z=x-y$. $Pr[z\text{ mod }p=0]\leq\frac{\text{the number of prime divisors }z}{\text{the number of primes in }[k]}$.

  • Prime Number Theorem

    Let $\pi(k)$ denote the number of primes less than $k$. Then $\pi(k) \sim\frac{k}{\ln k}$ as $k\sim\infty$.

  • Therefore, by choosing $k=2n^2\ln n$, $0\lt z\lt 2^n$, and a random prime $p\in[k]$, $Pr[z\text{ mod }p]\leq\frac{n}{\pi(k)}\sim\frac{1}{n}$.

    $Pr[FING(x)=FING(y)]\leq Pr[|x-y|\text{ mod }p=0]\leq\frac{1}{n}$.

Checking distinctness

  • Given a sequence $x_1,x_2,\dots,x_n\in\lbrace 1,2,\dots,n\rbrace $, check whether every element of $\lbrace 1,2,\dots,n\rbrace $ appears exactly once.

Fingerprinting multisets

  • fig2_4

  • $FING(A)=\prod_{i=1}^n(r-a_i)$

  • Theorem (Lipton 1989)

    Let $A=\lbrace a_1,a_2,\dots,a_n\rbrace $ and $B=\lbrace b_1,b_2,\dots,b_n\rbrace $ be two multisets whose elements are from $\lbrace 1,2,\dots,n\rbrace $. If $A\neq B$, then $Pr[FING(A)=FING(B)]=O(\frac{1}{n})$.

    Proof(★)


20180919 Hashing and Sketching