Wednesday, June 22, 2011

Introduction to Algorithms by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein: Chapter 15

The authors introduce the concept of dynamic programming in this chapter, where "programming" refers to a tabular method. Unlike the divide-and-conquer technique, where a problem is divided into independent sub-problems, dynamic programming is applicable when the sub-problems are not independent and share common sub-sub-problems. A divide-and-conquer approach does more work than necessary, solving the common sub-sub-problems repeatedly. A dynamic programming algorithm solves every sub-sub-problem once and stores the result in a table, thereby avoiding the work of recomputing the answer every time the sub-sub-problem is encountered. This simple technique sometimes transforms exponential-time algorithms into polynomial-time algorithms. Dynamic programming is typically applicable to optimization problems, where there are multiple solutions, each with a value attached, and we need to find a solution with an optimal value.
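As a toy illustration of the table idea (mine, not the chapter's): the naive recursion for Fibonacci numbers re-solves the same sub-problems exponentially often, while a table-filling version solves each sub-problem once.

```python
def fib_naive(n):
    # Recomputes fib(k) for the same k over and over: exponential time.
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

def fib_table(n):
    # Solves each sub-problem once and stores it in a table: linear time.
    table = [0, 1]
    for i in range(2, n + 1):
        table.append(table[i - 1] + table[i - 2])
    return table[n]
```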
The development of a dynamic programming algorithm is divided into four steps:
  1. Characterize the structure of an optimal solution.
  2. Recursively define the value of an optimal solution.
  3. Compute the value of an optimal solution in a bottom-up approach.
  4. Construct an optimal solution from computed information.
The authors discuss the problem of Assembly-Line Scheduling. There are two assembly lines in the factory of a car manufacturer. Each assembly line has $ n $ stations numbered $ 1, 2, ..., n $. Station $ j $ on assembly line $ i $ is denoted $ S_{i,j} $. The assembly time required at station $ S_{i,j} $ is denoted by $ a_{i,j} $. The time to transfer a chassis away from assembly line $ i $ after having gone through station $ S_{i,j} $ is $ t_{i,j} $, where $ i = 1, 2 $ and $ j = 1, 2, ..., n - 1 $. There is an entry time $ e_{i} $ for a chassis to enter assembly line $ i $ and an exit time $ x_i $ for the completed auto to exit assembly line $ i $. The problem is to determine which stations to choose from line 1 and which to choose from line 2 in order to minimize the total time through the factory for one auto. The brute-force approach, i.e. determining the fastest way through the factory by enumerating all ways and computing how long each takes, would require $ \Omega(2^n) $ time, which is infeasible for large $ n $.

The authors then apply the above steps to this problem as below:
  1. The structure of an optimal solution: The authors prove that an optimal solution to the problem contains an optimal solution to a sub-problem, i.e. the fastest way through station $ S_{i,j} $ contains the fastest way through either $ S_{1, j-1} $ or $ S_{2, j-1} $.
  2. The recursive solution: The authors define $ f_i[j] $ to be the fastest time to get a chassis all the way from the start through station $ S_{i,j} $ and derive the following recursive relationship
    • $$ f_1[j] = \begin{cases} e_1 + a_{1,1}, & \text{if } j = 1 \\ \min(f_1[j-1]+a_{1,j},\ f_2[j-1]+t_{2,j-1}+a_{1,j}), & \text{if } j \ge 2 \end{cases} $$
    • $$ f_2[j] = \begin{cases} e_2 + a_{2,1}, & \text{if } j = 1 \\ \min(f_2[j-1]+a_{2,j},\ f_1[j-1]+t_{1,j-1}+a_{2,j}), & \text{if } j \ge 2 \end{cases} $$
  3. Computing the fastest times: The fastest total time through the factory is $ f^* = \min(f_1[n] + x_1, f_2[n] + x_2) $, and the authors provide an algorithm that computes it in $ \Theta(n) $ time (sketched in code after this list).
  4. Constructing the fastest way: The authors then show how to construct the fastest way through the factory for a given sample problem.
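Here is a minimal Python sketch of step 3; the function name, the 0-indexed nested-list layout, and the O(1)-space rolling scalars are my choices, not the book's pseudocode, which fills full $ f_i[j] $ and line-choice tables so that step 4 can reconstruct the path:

```python
def fastest_way(a, t, e, x):
    """Assembly-line scheduling.

    a[i][j]: assembly time at the (j+1)-th station of line i+1 (0-indexed)
    t[i][j]: time to transfer away from line i+1 after its (j+1)-th station
    e[i], x[i]: entry and exit times for line i+1
    Returns the minimum total time through the factory.
    """
    n = len(a[0])
    f1 = e[0] + a[0][0]  # fastest time through S_{1,1}
    f2 = e[1] + a[1][0]  # fastest time through S_{2,1}
    for j in range(1, n):
        # Either stay on the same line or transfer from the other line.
        f1, f2 = (min(f1 + a[0][j], f2 + t[1][j - 1] + a[0][j]),
                  min(f2 + a[1][j], f1 + t[0][j - 1] + a[1][j]))
    return min(f1 + x[0], f2 + x[1])
```

For instance, fastest_way([[7,9,3,4,8,4],[8,5,6,4,5,7]], [[2,3,1,3,4],[2,1,2,2,1]], [2,4], [3,2]) returns 38 (this data is, I believe, the chapter's sample instance).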
The authors then discuss the matrix-chain multiplication problem, which is defined as follows: Given a chain $<A_1,A_2,...,A_n>$ of $n$ matrices, where for $i = 1, 2, ..., n$, matrix $A_i$ has dimension $ p_{i-1} \times p_i $, fully parenthesize the product $A_1A_2...A_n$ in a way that minimizes the number of scalar multiplications. If the number of alternative parenthesizations of a sequence of $n$ matrices is denoted by $P(n)$, then
$$ P(n) = \begin{cases}
1 & \text{if } n = 1,\\
\displaystyle\sum\limits_{k=1}^{n-1}P(k) \times P(n-k) & \text{if } n \ge 2.
\end{cases} $$
This recurrence yields the Catalan numbers, which grow as $ \Omega(4^n/n^{3/2}) $, so brute-force enumeration is again infeasible.
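The growth is easy to observe directly; a small sketch (the helper name is mine, not the book's) that evaluates the recurrence, using Python's built-in memoization so the evaluation itself stays cheap:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def num_parens(n):
    """P(n): the number of ways to fully parenthesize a chain of n matrices."""
    if n == 1:
        return 1
    return sum(num_parens(k) * num_parens(n - k) for k in range(1, n))

# num_parens(10) == 4862, num_parens(20) == 1767263190: exponential growth.
```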
The authors then apply the dynamic programming steps to the problem as below:
  1. The structure of an optimal parenthesization: The authors prove that an optimal parenthesization of the matrix chain $A_iA_{i+1}...A_j$ contains within it optimal parenthesizations of the sub-chains $A_iA_{i+1}...A_k$ and $A_{k+1}...A_j$, where $i \le k < j$.
  2. A recursive solution: Let $m[i,j]$ be the minimum number of scalar multiplications needed to compute the matrix product $A_{i..j} = A_iA_{i+1}...A_j$. It can be recursively defined as follows:
    • $$ m[i,j] = \begin{cases} 0, & \text{if } i = j \\ \min\limits_{i \le k < j}\{m[i,k] + m[k+1,j] + p_{i-1}p_{k}p_{j}\}, & \text{if } i < j \end{cases} $$
  3. Computing the optimal costs: The authors point out that choosing $i$ and $j$ such that $1 \le i \le j \le n$ gives $ \Theta(n^2) $ distinct sub-problems in all, and provide an algorithm to compute $m[1,n]$ with a running time of $ O(n^3) $.
  4. Constructing an optimal solution: The authors provide a method that prints the parenthesization achieving the minimum number of scalar multiplications (both steps are sketched in code below).
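The following Python sketch loosely transcribes the chapter's bottom-up procedure and the parenthesization printer, plus the memoized top-down variant discussed a little further on; the function names and the dictionary memo table are my choices:

```python
def matrix_chain_order(p):
    """Bottom-up matrix-chain multiplication: O(n^3) time, O(n^2) space.

    p: dimension list, so matrix A_i is p[i-1] x p[i] for i = 1..n.
    Returns (m, s) where m[i][j] is the minimum scalar-multiplication
    count for A_i..A_j and s[i][j] is the optimal split point k.
    """
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):          # sub-chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = float('inf')
            for k in range(i, j):           # try every split (A_i..A_k)(A_{k+1}..A_j)
                cost = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if cost < m[i][j]:
                    m[i][j], s[i][j] = cost, k
    return m, s

def print_optimal_parens(s, i, j):
    """Recover the optimal parenthesization from the split table s."""
    if i == j:
        return "A%d" % i
    k = s[i][j]
    return "(%s%s)" % (print_optimal_parens(s, i, k),
                       print_optimal_parens(s, k + 1, j))

def memoized_matrix_chain(p):
    """Top-down variant: same O(n^3) bound, but the table is filled
    on demand by the recursion (memoization)."""
    n = len(p) - 1
    memo = {}
    def lookup(i, j):
        if (i, j) not in memo:
            if i == j:
                memo[i, j] = 0
            else:
                memo[i, j] = min(lookup(i, k) + lookup(k + 1, j)
                                 + p[i - 1] * p[k] * p[j]
                                 for k in range(i, j))
        return memo[i, j]
    return lookup(1, n)
```

For example, for dimensions p = [10, 100, 5, 50] the minimum is 7500 scalar multiplications, achieved by the parenthesization ((A1A2)A3).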
The authors then discuss two characteristics that a problem must have for dynamic programming to be applicable:
  • Optimal substructure: If an optimal solution to the problem contains optimal solutions to sub-problems, the problem is a probable candidate for dynamic programming (and for the greedy method too).
  • Overlapping sub-problems: The space of sub-problems must be small, and a recursive algorithm for the problem must solve the same sub-problems over and over. Typically, the total number of distinct sub-problems is polynomial in the input size.
The authors then discuss the concept of memoization, where a table of sub-problem solutions is maintained but the control structure for filling the table is that of a recursive algorithm: each sub-problem is solved on first encounter and merely looked up thereafter. The authors provide a memoized algorithm for matrix-chain multiplication (cf. memoized_matrix_chain in the sketch above) and prove that it too runs in $O(n^3)$ time.
The authors then discuss the longest common sub-sequence problem. Given a sequence $X = <x_1,x_2,...,x_m>$, another sequence $Z=<z_1,z_2,...,z_k>$ is a sub-sequence of $X$ if there exists a strictly increasing sequence $<i_1,i_2,...,i_k>$ of indices of $X$ such that for all $ j = 1,2,...,k $, we have $x_{i_j} = z_j$. In the longest common sub-sequence problem, we need to find a maximum-length sub-sequence common to two sequences $X = <x_1,x_2,...,x_m>$ and $Y = <y_1,y_2,...,y_n>$. Given the sequence $ X = <x_1,x_2,...,x_m> $, the $i$th prefix of $ X $, for $ i = 0,1,...,m $, is $ X_i = <x_1,x_2,...,x_i> $. The authors then apply the dynamic programming steps to the problem as below:
  1. Characterizing a longest common sub-sequence: Let $ X = <x_1,x_2,...,x_m> $ and $ Y = <y_1,y_2,...,y_n> $ be sequences, and let $ Z = <z_1,z_2,...,z_k> $ be any longest common sub-sequence of $X$ and $Y$. Then one of the following three holds:
    • If $ x_m = y_n $ then $ z_k = x_m = y_n $ and $ Z_{k-1} $ is a longest common sub-sequence of $X_{m-1}$ and $Y_{n-1}$.
    • If $ x_m \ne y_n $ then $ z_k \ne x_m $ implies that $ Z $ is a longest common sub-sequence of $X_{m-1}$ and $Y$.
    • If $ x_m \ne y_n $ then $ z_k \ne y_n $ implies that $ Z $ is a longest common sub-sequence of $X$ and $Y_{n-1}$.
  2. A recursive solution: If the length of a longest common sub-sequence of the sequences $ X_{i} $ and $ Y_{j} $ is defined as $ c[i,j] $ then the following recursive formula can be derived based on the observation in 1 above
    • $$ c[i,j] =
      \begin{cases} 0 & \text{ if } i=0 \text{ or } j=0,\\
      c[i-1,j-1] + 1 & \text{ if } i,j > 0 \text{ and } x_i = y_j, \\
      max(c[i-1,j],c[i,j-1]) & \text{ if } i,j > 0 \text{ and } x_i \ne y_j.
      \end{cases}$$
  3. Computing the length of a longest common sub-sequence: The authors then provide an $ O(mn) $-time algorithm to compute the length of a longest common sub-sequence.
  4. Constructing a longest common sub-sequence: The authors then provide an algorithm to print a longest common sub-sequence from the tables generated in step 3 (both steps are sketched in code after this list).
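A compact Python sketch of steps 3 and 4; note that instead of the book's auxiliary table of "arrows" this reconstruction walks the $c$ table directly, which is a standard simplification, and the names are mine:

```python
def lcs_length(x, y):
    """Fill the c table: c[i][j] = LCS length of prefixes x[:i] and y[:j].

    O(m*n) time and space.
    """
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c

def lcs_backtrack(c, x, y):
    """Walk the c table from c[m][n] back to c[0][0] to recover one LCS."""
    i, j, out = len(x), len(y), []
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i, j = i - 1, j - 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))
```

For the pair X = "ABCBDAB", Y = "BDCABA" (the chapter's example, if I recall it correctly), the LCS length is 4, with "BCBA" as one witness.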
The authors then discuss the optimal binary search tree problem. A sequence of $n$ keys $ K= <k_1,k_2,...,k_n> $ is given in sorted order, with each key $k_i$ having an associated probability $p_i$ that a search will be for $k_i$. There are $n+1$ dummy keys $ <d_0,d_1,...,d_n> $ such that $ d_0 < k_1 $, $ k_n < d_n $ and $ k_i < d_i < k_{i+1} $ for each $ i = 1,2,...,n-1 $, with a probability $q_i$ associated with each dummy key $d_i$ that a search will be for $d_i$ (i.e. for a value not among the keys). The problem is to build a binary search tree for which the expected number of nodes visited per search is minimized. The authors then apply the dynamic programming steps to the problem as below:
  1. The structure of an optimal binary search tree: The authors show that an optimal binary search tree contains optimal binary search trees as the sub-trees rooted at the left and right children of its root.
  2. A recursive solution: If $ e[i,j] $ is the expected cost of searching an optimal binary search tree containing the keys $ k_i,...,k_j $, and the sum of all the probabilities in the sub-tree with keys $ k_i,...,k_j $ is $ w(i,j) = \displaystyle\sum\limits_{l=i}^{j}p_l + \sum\limits_{l=i-1}^{j}q_l $, then $$ e[i,j] = \begin{cases} q_{i-1} & \text{if } j=i-1, \\
    \min\limits_{i \le r \le j}\{ e[i,r-1] + e[r+1,j] + w(i,j) \} & \text{if } i \le j. \end{cases} $$
  3. Computing the expected search cost of an optimal binary search tree: The authors give a $ \Theta(n^3) $-time algorithm to calculate the expected search cost of an optimal binary search tree (sketched in code after this list).
  4. Constructing an optimal binary search tree: The authors leave it as an exercise to write an algorithm that outputs the optimal binary search tree described by the root table generated in step 3.
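A Python transcription of step 3's computation, modeled on the chapter's procedure; the list layout (1-based p with an unused slot at index 0) and the names are my choices:

```python
def optimal_bst(p, q):
    """Expected-cost tables for an optimal binary search tree: Theta(n^3) time.

    p[1..n]: search probabilities of the keys (p[0] is an unused placeholder);
    q[0..n]: search probabilities of the dummy keys.
    Returns (e, root): e[i][j] is the expected cost of an optimal tree
    over keys k_i..k_j, and root[i][j] is the index of that tree's root.
    """
    n = len(p) - 1
    e = [[0.0] * (n + 1) for _ in range(n + 2)]   # e[i][j], i = 1..n+1, j = i-1..n
    w = [[0.0] * (n + 1) for _ in range(n + 2)]   # w[i][j]: probability mass of the sub-tree
    root = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 2):
        e[i][i - 1] = q[i - 1]   # empty key range: only the dummy key d_{i-1}
        w[i][i - 1] = q[i - 1]
    for length in range(1, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            e[i][j] = float('inf')
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            for r in range(i, j + 1):             # try each key k_r as the root
                cost = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if cost < e[i][j]:
                    e[i][j], root[i][j] = cost, r
    return e, root
```

If I recall the chapter's five-key example correctly (p = [0, .15, .10, .05, .10, .20], q = [.05, .10, .05, .05, .05, .10]), this computes an expected search cost of 2.75 with $k_2$ at the root.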
