We've been discussing Divide and Conquer algorithms.
A divide-and-conquer algorithm, at each step, divides the problem into subproblems, solves each, then combines the results to arrive at the final solution.
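Schematically, as a generic Python sketch (here `is_small`, `solve_directly`, `divide`, and `combine` are placeholders for the problem-specific pieces, not real functions):
def divide_and_conquer(problem):
    # base case: solve small instances directly
    if is_small(problem):
        return solve_directly(problem)
    # divide the problem into independent subproblems
    subproblems = divide(problem)
    # conquer: solve each subproblem recursively
    solutions = [divide_and_conquer(p) for p in subproblems]
    # combine the subresults into the final solution
    return combine(solutions)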
Recurrences can easily be written and interpreted from the perspective of divide-and-conquer algorithms.
We can solve recurrences by drawing and analyzing the Recursion Trees that result from them.
The cost of the entire tree is then the sum of the costs of all of the levels of the tree.
In the tree method, we generalize the cost of every level as a function of that level, $C(i)$. E.g.,
$$C(i) = \frac{n^2}{2^i},\quad 3^i n,\quad n,\quad \lg{\frac{3n}{2^i}},\quad 1,\quad \ldots$$And sum up over all levels:
$$\sum_{i=0}^{\log n} C(i)$$We solve the summation through simplification and by using properties of series to arrive at a final runtime.
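For example, with $C(i) = \frac{n^2}{2^i}$, the sum is a geometric series:
$$\sum_{i=0}^{\lg n} \frac{n^2}{2^i} = n^2 \sum_{i=0}^{\lg n} 2^{-i} \le 2n^2 \in O(n^2)$$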
The tree method is tedious, but it works. Often, we can exploit patterns to perform our analysis more quickly.
One such shortcut is the brick method, which we define and apply below.
With our powerful ideas of divide & conquer, recurrences, and our techniques for solving recurrences, let's see how we can put them to use to make algorithmic headway on a problem where it seems there is no headway to be made!
Multiplication!
Here is an elementary school algorithm for integer multiplication:
Input: $n$ bit integers $x= \langle x_{n-1}, \ldots, x_0\rangle$ and $y = \langle y_{n-1}, \ldots, y_0\rangle$
Output: $x \cdot y$
Example: '1001'$\times$'1101'
def int2binary(n):
    # convert an integer to a list of its binary digit characters
    return list('{0:b}'.format(n))
int2binary(13)
['1', '1', '0', '1']
nine = int2binary(9)
print(nine)
thirteen = int2binary(13)
print(thirteen)
9*13
['1', '0', '0', '1']
['1', '1', '0', '1']
117
      1001
    x 1101
    ======
      1001
     0000
    1001
+  1001
 =========
   1110101
def binary2int(n):
    # parse a string of binary digits back into an integer
    return int(n, 2)
binary2int('1110101')
117
What is the running time of the "elementary school" algorithm?
For two $n$-digit inputs: $O(n^2)$, since there are $n$ partial products of up to $n$ bits each, and summing them takes $O(n^2)$ bit operations. The total number of bits in the solution is at most $2n$.
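As a sketch, here is the shift-and-add version of the elementary school algorithm on Python integers (the function name is ours; it does one shifted addition per bit of $y$):
def elementary_multiply(x, y):
    # grade-school binary multiplication: for each 1-bit of y,
    # add a copy of x shifted into that bit's position
    total = 0
    shift = 0
    while y > 0:
        if y & 1:
            total += x << shift   # add shifted partial product
        y >>= 1
        shift += 1
    return total
elementary_multiply(9, 13)
117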
Can we improve on this using Divide and Conquer?
How can we break our problem into smaller parts?
Instead of the elementary school algorithm, consider splitting each $n$-digit input in half. Can we multiply recursively?
Let $x = \langle x_L, x_R\rangle$, $y = \langle y_L, y_R\rangle$. Then,
$\begin{align} x &=& 2^{n/2} x_L + x_R \:\:\:\:\:\: \hbox{e.g.} \: 1001: 2^2 (10) + (01)\\ y &=& 2^{n/2} y_L + y_R \:\:\:\:\:\: \hbox{e.g.} \: 1101: 2^2 (11) + (01)\\ \end{align} $
What does multiplying by $2^{n/2}$ do, and is it efficient?
$2^2 [11] \rightarrow [1100] \:\:$ (shift two places to the left).
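A quick check in Python, reusing `int2binary` from above:
print(int2binary(3))       # ['1', '1']
print(int2binary(3 << 2))  # ['1', '1', '0', '0'] -- shift by 2 = multiply by 2^2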
Multiplication the long way
$\begin{align} x &=& 2^{n/2} x_L + x_R \:\:\:\:\:\: \hbox{e.g.} \: 1001: 2^2 (10) + (01)\\ y &=& 2^{n/2} y_L + y_R \:\:\:\:\:\: \hbox{e.g.} \: 1101: 2^2 (11) + (01)\\ \end{align} $
So then,
$\begin{align} x\cdot y &=& (2^{n/2} x_L + x_R)(2^{n/2} y_L + y_R) \\ &=& 2^n (x_L \cdot y_L) + 2^{n/2} (x_L \cdot y_R + x_R \cdot y_L) + (x_R \cdot y_R) \\ \end{align} $
We've converted one multiplication of size $(n,n)$ into four multiplications of size $(\frac{n}{2}, \frac{n}{2})$, plus a constant number of shifts and additions, each costing $O(n)$.
What recursive algorithm and recurrence are suggested by this observation?
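One candidate, sketched on Python integers (`dc_multiply` is our name, and we assume $n$ is a power of two for simplicity):
def dc_multiply(x, y, n):
    # multiply n-bit nonnegative integers with four recursive calls
    if n == 1:
        return x * y                               # single-bit base case
    half = n // 2
    x_L, x_R = x >> half, x & ((1 << half) - 1)    # split into high/low halves
    y_L, y_R = y >> half, y & ((1 << half) - 1)
    LL = dc_multiply(x_L, y_L, half)
    LR = dc_multiply(x_L, y_R, half)
    RL = dc_multiply(x_R, y_L, half)
    RR = dc_multiply(x_R, y_R, half)
    # combine with shifts (multiplication by powers of two) and additions
    return (LL << n) + ((LR + RL) << half) + RR
dc_multiply(9, 13, 4)
117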
$W(n) = 4W(n/2) + n$
What is the solution to this recurrence using the brick method? Is it root-dominated, or leaf-dominated?

$C(\hbox{root}) = n$
$C(\hbox{level}\:1) = 4(\frac{n}{2}) = 2 \cdot n$
In general, $C(\hbox{level}\:i) = 4^i \cdot \frac{n}{2^i} = 2^i \cdot n$: the level costs are geometrically increasing as we descend the recursion tree.
A recurrence is leaf-dominated if there is an $\alpha > 1$ such that, for every node $v$ with children $D(v)$:
$$C(v) \leq \frac{1}{\alpha} \sum_{u \in D(v)} C(u)$$
Here, let $\alpha \leftarrow 2$:
$n \le \frac{1}{2}\cdot 2 \cdot n$
$W(n) = 4W(n/2) + n$
The cost of a leaf-dominated recurrence is $O(L)$, where $L$ is the number of leaves (assuming each leaf does constant work).
nodes per level: $1, 4, 16, \ldots, 4^i, \ldots, 4^{\lg n} = n^{\lg 4} = n^2$
This is a leaf-dominated recurrence that is $O(n^2)$ -- the same as the elementary school algorithm!
Now, what is the span of this algorithm if implemented in parallel?
Assuming the four recursive multiplications can be done in parallel, and that addition has span $O(n)$, we get that
$$S(n) = S(n/2) + n$$which yields a span of ...
brick method
$S(\hbox{root})=n$
$S(\hbox{level 1}) = \frac{n}{2}$
$\rightarrow$ root dominated
$S(n) \in O(n)$. This is much better than the span of the grade school algorithm, which is $O(n^2)$!
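Unrolling the recurrence confirms this:
$$S(n) = \sum_{i=0}^{\lg n} \frac{n}{2^i} \le 2n \in O(n)$$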
What parallelism is achieved?
Parallelism $= \frac{W}{S} = \frac{n^2}{n} = n$
Can we do any better?
Recall that
$\begin{align} x\cdot y &=& (2^{n/2} x_L + x_R)(2^{n/2} y_L + y_R) \\ &=& 2^n (x_L \cdot y_L) + 2^{n/2} (x_L \cdot y_R + x_R \cdot y_L) + (x_R \cdot y_R) \\ \end{align} $
Observation:
$\begin{align} (x_L + x_R)\cdot (y_L + y_R) &=& (x_L\cdot y_L) + \pmb{(x_L\cdot y_R) + (x_R\cdot y_L)} + (x_R\cdot y_R)\\ \end{align} $
$\begin{align} \pmb{(x_L\cdot y_R) + (x_R\cdot y_L)} = (x_L + x_R)\cdot (y_L + y_R) - (x_L\cdot y_L) - (x_R\cdot y_R)\\ \end{align}$
How does our observation help us?
If we calculate $(x_L\cdot y_L)$, $(x_R\cdot y_R)$, and $(x_L + x_R)\cdot (y_L + y_R)$, that is three recursive multiplications.
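For example, with $x = 1001$ ($9$) and $y = 1101$ ($13$):
$\begin{align} x_L \cdot y_L &=& 2 \cdot 3 = 6\\ x_R \cdot y_R &=& 1 \cdot 1 = 1\\ (x_L + x_R) \cdot (y_L + y_R) &=& 3 \cdot 4 = 12\\ \end{align}$
so the middle term is $12 - 6 - 1 = 5$, and $x \cdot y = 2^4 \cdot 6 + 2^2 \cdot 5 + 1 = 96 + 20 + 1 = 117 = 9 \cdot 13$.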
So with 3 recursive multiplications and a constant number of extra additions and subtractions (each $O(n)$), we then get that $W(n) = 3W(n/2) + n$. What is the running time?
brick method
$C(\hbox{root}) = n$
$C(\hbox{level 1}) = \frac{3n}{2}$
$\frac{3}{2} > 1 \Rightarrow$ leaf dominated.
But, there are fewer leaves this time. Why?
nodes per level: $1, 3, 9, \ldots 3^i \ldots 3^{\lg n}=n^{\lg 3} \:\:\:\: (\hbox{by}\: a^{\log_b c} = c^{\log_b a})$
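To see why $3^{\lg n} = n^{\lg 3}$:
$$3^{\lg n} = \left(2^{\lg 3}\right)^{\lg n} = 2^{(\lg 3)(\lg n)} = \left(2^{\lg n}\right)^{\lg 3} = n^{\lg 3}$$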
Using the brick method, this is still a leaf-dominated recurrence, and thus the running time is $O(n^{\log_2 3})$ (approximately $O(n^{1.58})$, versus $O(n^2)$ earlier).
This is known as the Karatsuba-Ofman algorithm (1962), and is the earliest known divide-and-conquer algorithm for integer multiplication. It is actually implemented in Intel hardware!
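As a sketch on Python integers (the function name and the small-value cutoff below are our choices; a production version would tune the cutoff empirically):
def karatsuba(x, y):
    # Karatsuba: three recursive multiplications instead of four
    if x < 16 or y < 16:
        return x * y                               # small-value cutoff
    n = max(x.bit_length(), y.bit_length())
    half = n // 2
    x_L, x_R = x >> half, x & ((1 << half) - 1)    # split into high/low halves
    y_L, y_R = y >> half, y & ((1 << half) - 1)
    LL = karatsuba(x_L, y_L)                       # x_L * y_L
    RR = karatsuba(x_R, y_R)                       # x_R * y_R
    # one product recovers both cross terms: x_L*y_R + x_R*y_L
    mid = karatsuba(x_L + x_R, y_L + y_R) - LL - RR
    return (LL << (2 * half)) + (mid << half) + RR
karatsuba(1234, 5678)
7006652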
So, we've decreased work from $O(n^2)$ to $O(n^{\lg 3})$.
Our span stays the same: $O(n)\:\:$ Why?
Parallelism $= \frac{W}{S} = \frac{n^{\lg 3}}{n} \approx n^{.58} < n$
So, our parallelism went down. Is that good or bad?
Schönhage and Strassen (1971) gave an algorithm that runs in $O(n\log n \log\log n)$ time.
In 2007, Fürer gave an algorithm that runs in $n \log n \cdot 2^{O(\log^* n)}$ time.
What is the fastest possible sequential algorithm for integer multiplication? In parallel?