In a previous post (http://blog.bytecode.tech/the-power-of-generic-algorithms/), we mentioned the need to take the physical limitations of time and space into account when writing an algorithm. In this post, we will show how the anatomy of an algorithm changes as successive optimisations are added. This change is reflected, above all, in the shift from a functional to an imperative style.

As we know, functional programming lets us develop in a declarative way: specify **what** to do and leave the *how* to the computer. Imperative programming, on the other hand, cares about both.

As functional programming is focused on the *what*, it is better suited to solving problems at a higher level of abstraction, whereas the need to deal with implementation details makes imperative programming more error-prone and its solutions less portable across different domains.
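As a small illustration of the two styles, here is a minimal sketch on a toy problem that is not part of the original post (the function names are hypothetical): summing the squares of the even numbers in a collection.

```scala
// Hypothetical toy example, unrelated to the LIS code below.

// Functional style: states what is wanted.
def sumEvenSquaresFunctional(xs: List[Int]): Int =
  xs.filter(_ % 2 == 0).map(x => x * x).sum

// Imperative style: also spells out how to traverse and accumulate.
def sumEvenSquaresImperative(xs: Array[Int]): Int = {
  var total = 0
  var i = 0
  while (i < xs.length) {
    if (xs(i) % 2 == 0) total += xs(i) * xs(i)
    i += 1
  }
  total
}
```

Both functions compute the same value; the second one simply exposes the index handling and the accumulator that the first one leaves to the library.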

Having said that, optimising an algorithm requires doing things in a very specific way. In that case, only an imperative approach will do.

#### Longest Increasing Subsequence

To illustrate these ideas, let’s consider the problem of the Longest Increasing Subsequence (LIS).

As is well known, the standard solution is O(N^2) and there is an optimised O(N log N) version.
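For reference, the quadratic solution is usually presented as the textbook dynamic programming recurrence below. This is a sketch under that assumption (the post does not spell out which formulation it has in mind), and the name `lisLengthQuadratic` is hypothetical.

```scala
// Assumed textbook formulation: best(i) is the length of the longest
// increasing subsequence that ends exactly at arr(i).
def lisLengthQuadratic(arr: Vector[Int]): Int =
  if (arr.isEmpty) 0
  else {
    val best = Array.fill(arr.size)(1)
    // Two nested loops over the input give the O(N^2) cost.
    for (i <- arr.indices; j <- 0 until i if arr(j) < arr(i))
      best(i) = math.max(best(i), best(j) + 1)
    best.max
  }
```

It only returns the length, not the subsequence itself, which keeps the sketch short.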

##### Solution 1

Let’s consider a first approach using functional programming. This approach builds all possible increasing subsequences and then selects the longest one.

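    import scala.collection.mutable

    def solution1(arr: List[Int]): List[Int] = {
      // Each stored subsequence is reversed: its head is its last element
      val subSequences: mutable.Set[List[Int]] = mutable.Set()
      arr.foreach { x =>
        subSequences ++= (subSequences
          .filter(_.head < x) // subsequences that x can extend
          .map(x :: _) match {
            case s if s.isEmpty => Set(List(x)) // x extends nothing: start a new subsequence
            case s => s
          })
      }
      subSequences.map(s => (s, s.size)).maxBy(_._2)._1.reverse
    }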

To see this solution in action, let’s consider the array **[5,2,7,4,3,8]**.

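Below is the list of increasing subsequences ending at the element visited in each iteration over the array. (The code above adds the singleton [x] only when x cannot extend any existing subsequence, so it skips the other singletons and anything that would be built on top of them; this does not affect the result.)

Iteration | Subsequences |
---|---|
1 | [5] |
2 | [2] |
3 | [5,7] [2,7] [7] |
4 | [2,4] [4] |
5 | [2,3] [3] |
6 | [5,8] [2,8] [5,7,8] [2,7,8] [7,8] [2,4,8] [4,8] [2,3,8] [3,8] [8] |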

And the longest subsequences are: [5,7,8], [2,7,8], [2,4,8], [2,3,8]

The code is easy to understand, with no need to handle painful loop indices (although there are two nested loops disguised in the methods `foreach` and `filter`/`map`). Given that the intermediate solutions are stored and used to build the new ones, this algorithm can be seen as an example of dynamic programming.

However, this approach is too naive: generating so many subsequences makes the application run out of memory for inputs as small as 200 elements.
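To get a feeling for how quickly the number of subsequences grows, the following sketch counts all increasing subsequences of an input without materialising them. The function name is hypothetical and the count is an upper bound on what Solution 1 keeps in memory, since that code skips some singletons.

```scala
// endingAt(i) = number of increasing subsequences whose last element is arr(i):
// the singleton [arr(i)] plus every subsequence ending at a smaller, earlier element.
def countIncreasingSubsequences(arr: Vector[Int]): BigInt = {
  val endingAt = Array.fill(arr.size)(BigInt(1))
  for (i <- arr.indices; j <- 0 until i if arr(j) < arr(i))
    endingAt(i) += endingAt(j)
  endingAt.sum
}
```

For the example array [5,2,7,4,3,8] this gives 19, the number of entries in the table above. For an already sorted input of n distinct elements it gives 2^n - 1, which explains why a couple of hundred elements are enough to exhaust memory.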

##### Solution 2

If we are only interested in getting any of the longest subsequences, we can apply our first optimisation by concatenating each new element only to the longest of the previous subsequences (if there is more than one such subsequence, we take any one of them).

Iteration | Subsequences |
---|---|
1 | [5] |
2 | [2] |
3 | [2,7] |
4 | [2,4] |
5 | [2,3] |
6 | [2,3,8] |

And here’s the code. It is just a small variant of the previous solution, yet it delivers a significant speed improvement and avoids running out of memory (in this case, the number of subsequences is at most the number of elements in the original array).

    import scala.collection.mutable

    def solution2(arr: List[Int]): List[Int] = {
      val subSequences: mutable.Set[List[Int]] = mutable.Set()
      arr.foreach { x =>
        subSequences += (subSequences
          .filter(_.head < x) // subsequences that x can extend
          .map(l => (l, l.size)) match {
            case s if s.isEmpty => List(x)
            case s => x :: s.maxBy(_._2)._1 // extend only the longest one
          })
      }
      subSequences.map(s => (s, s.size)).maxBy(_._2)._1.reverse
    }
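The same idea can also be expressed without any mutable state. The sketch below is not from the original post (`solution2Fold` is a hypothetical name); it threads the list of subsequences through a `foldLeft` and, unlike the version above, returns an empty list for an empty input.

```scala
// Each stored subsequence is reversed, so its head is its last element.
def solution2Fold(arr: List[Int]): List[Int] = {
  val all = arr.foldLeft(List.empty[List[Int]]) { (subs, x) =>
    val extendable = subs.filter(_.head < x)
    // Extend only the longest subsequence that x can be appended to.
    val best = if (extendable.isEmpty) Nil else extendable.maxBy(_.size)
    (x :: best) :: subs
  }
  if (all.isEmpty) Nil else all.maxBy(_.size).reverse
}
```

Whether this reads better than the mutable version is a matter of taste; the cost is still quadratic because every element scans all the subsequences built so far.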

##### Solution 3

Despite the improvement introduced by solution 2, we are still calculating too many sequences. In the previous example, there are 2 sequences of size 1 and 3 of size 2. However, without loss of generality, we can keep just the sequence of each size ending with the smallest number. For instance, any increasing sequence built on top of [2,7] and [2,4] can also be built on top of [2,3], therefore we can keep just [2,3].

Here’s the algorithm in action

Iteration | Subsequence length | Subsequences |
---|---|---|
1 | 1 | [5] |
2 | 1 | [2] |
3 | 1 | [2] |
3 | 2 | [2,7] |
4 | 1 | [2] |
4 | 2 | [2,4] |
5 | 1 | [2] |
5 | 2 | [2,3] |
6 | 1 | [2] |
6 | 2 | [2,3] |
6 | 3 | [2,3,8] |

And the code:

    def solution3(arr: List[Int]): List[Int] = {
      // subSequences(k) holds the (reversed) increasing subsequence of length k + 1
      // that ends with the smallest value found so far
      val subSequences: Array[List[Int]] = new Array(arr.size)
      subSequences(0) = List(arr(0))
      for (i <- Range(1, arr.size)) subSequences(i) = List(Integer.MAX_VALUE) // placeholder
      var maxIdx = 0 // index of the longest subsequence built so far
      for (i <- Range(1, arr.size)) {
        var j = maxIdx
        while (j >= 0) {
          if (subSequences(j).head < arr(i)) {
            subSequences(j + 1) = arr(i) :: subSequences(j)
            if (j == maxIdx) maxIdx += 1
            j = -1 // to break the loop
          } else if (j == 0) {
            subSequences(0) = List(arr(i))
            j -= 1
          } else j -= 1
        }
      }
      subSequences.map(s => (s, s.size)).maxBy(_._2)._1.reverse
    }

This is a great improvement! The total number of subsequences in memory is never greater than the number of elements of the LIS.

Unfortunately, we are now forced to handle the loops at a low level, entering the realm of imperative programming and losing the simplicity of the functional version.

**Note**: an additional optimisation introduced in this solution is to limit the loop over “subSequences” to those elements with actual values.
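Since index-based code like this is easier to get wrong, one way to gain confidence is to compare it with an obviously correct reference on small inputs. The sketch below is an assumption about how such a check could look, not something taken from the post; both function names are hypothetical.

```scala
// Exhaustive reference: enumerates all 2^n subsequences, so only usable for tiny inputs.
def bruteForceLisLength(arr: List[Int]): Int = {
  def isIncreasing(s: List[Int]): Boolean =
    s.zip(s.drop(1)).forall { case (a, b) => a < b }
  def subsequences(xs: List[Int]): List[List[Int]] = xs match {
    case Nil => List(Nil)
    case h :: t =>
      val rest = subsequences(t)
      rest ++ rest.map(h :: _)
  }
  subsequences(arr).filter(isIncreasing).map(_.size).max
}

// True when the candidate returns the reference length for every input.
def agreesWithReference(candidate: List[Int] => Int, inputs: Seq[List[Int]]): Boolean =
  inputs.forall(in => candidate(in) == bruteForceLisLength(in))
```

A list-returning solution can be checked by passing `xs => solution(xs).size` as the candidate.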

##### Solution 4

If we are just interested in knowing the length of the LIS, we can go further still. Looking at the solution above, the only thing we need to store is the last element of each subsequence, as the length of the LIS is given by `maxIdx + 1`.

Iteration | Subsequence length | Last element |
---|---|---|
1 | 1 | 5 |
2 | 1 | 2 |
3 | 1 | 2 |
3 | 2 | 7 |
4 | 1 | 2 |
4 | 2 | 4 |
5 | 1 | 2 |
5 | 2 | 3 |
6 | 1 | 2 |
6 | 2 | 3 |
6 | 3 | 8 |

This version is very similar to the previous one, and the improvement is small.

    def solution4(arr: List[Int]): Int = {
      // subSequences(k) is the smallest last element of an increasing subsequence of length k + 1
      val subSequences: Array[Int] = new Array(arr.size)
      subSequences(0) = arr(0)
      for (i <- Range(1, arr.size)) subSequences(i) = Integer.MAX_VALUE // placeholder
      var maxIdx = 0
      for (i <- Range(1, arr.size)) {
        var j = maxIdx
        while (j >= 0) {
          if (subSequences(j) < arr(i)) {
            subSequences(j + 1) = arr(i)
            if (j == maxIdx) maxIdx += 1
            j = -1 // to break the loop
          } else if (j == 0) {
            subSequences(0) = arr(i)
            j -= 1
          } else j -= 1
        }
      }
      maxIdx + 1
    }

##### Solution 5

Although a small improvement, the previous version paves the way for the last optimisation: replacing the linear search with a binary search, which makes this solution O(N log N).

    def solution5(arr: List[Int]): Int = {
      // Returns the largest index in [lowerBound, upperBound] whose value is
      // strictly smaller than valueSearched
      @annotation.tailrec
      def binarySearch(arr: Array[Int], valueSearched: Int, lowerBound: Int, upperBound: Int): Int = {
        if (lowerBound > upperBound) lowerBound - 1
        else {
          val mid = (lowerBound + upperBound) / 2
          if (valueSearched <= arr(mid)) binarySearch(arr, valueSearched, lowerBound, mid - 1)
          else binarySearch(arr, valueSearched, mid + 1, upperBound)
        }
      }
      // subSequences(k) is the smallest last element of an increasing subsequence of length k;
      // index 0 is a sentinel assumed to be smaller than any input value
      val subSequences: Array[Int] = new Array(arr.size + 1)
      subSequences(0) = Integer.MIN_VALUE
      for (i <- Range(1, arr.size + 1)) subSequences(i) = Integer.MAX_VALUE
      var maxIdx = 0
      // Iterate over the elements directly: every element is visited, including the
      // first one, and we avoid the linear cost of indexing into a List
      arr.foreach { x =>
        val idx = binarySearch(subSequences, x, 0, maxIdx)
        subSequences(idx + 1) = x
        if (maxIdx < idx + 1) maxIdx = idx + 1
      }
      maxIdx
    }
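As an alternative to a hand-written search, the JDK’s `java.util.Arrays.binarySearch` can locate the insertion point. The following is a sketch of that variant rather than the code used in the post; `lisLengthWithJdkSearch` and `tails` are hypothetical names.

```scala
import java.util.Arrays

def lisLengthWithJdkSearch(arr: Seq[Int]): Int = {
  // tails(k) is the smallest last element of an increasing subsequence of length k + 1;
  // only the first `len` slots are meaningful and they are strictly increasing.
  val tails = new Array[Int](arr.size)
  var len = 0
  arr.foreach { x =>
    // Returns the index of x if present, otherwise -(insertionPoint) - 1.
    val pos = Arrays.binarySearch(tails, 0, len, x)
    val insertAt = if (pos >= 0) pos else -(pos + 1)
    tails(insertAt) = x
    if (insertAt == len) len += 1
  }
  len
}
```

This version needs no sentinel value, at the price of decoding the negative return value that the JDK uses to report a missing key.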

##### Examples

Here are the results of running the different solutions on randomly generated inputs of different sizes. “Solution 1” has been left out as it runs out of memory for inputs greater than 100.

Clearly, the results confirm the expected performance. For instance, for an input of 10000 elements, the LIS has 198 elements and the times range from the 36 seconds of solution 2 to the 352 milliseconds of solution 5.

Array length | LIS length | time2 (ms) | time3 (ms) | time4 (ms) | time5 (ms) |
---|---|---|---|---|---|
1000 | 58 | 363 | 48 | 21 | 4 |
2000 | 85 | 795 | 129 | 131 | 14 |
3000 | 106 (solution5 reported 105) | 1965 | 354 | 335 | 19 |
4000 | 117 | 3341 | 953 | 980 | 55 |
5000 | 139 | 5953 | 1881 | 1872 | 88 |
6000 | 150 | 9043 | 2952 | 2957 | 131 |
7000 | 164 | 13542 | 3743 | 3717 | 180 |
8000 | 170 | 19721 | 5776 | 5740 | 240 |
9000 | 181 | 26030 | 8064 | 7942 | 299 |
10000 | 198 | 36755 | 9954 | 9750 | 352 |
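Measurements of this kind can be taken with a very small harness. The sketch below is an assumption about how that might be done, not the code behind the numbers above; `timed` and `randomInput` are hypothetical helpers, and the value range of the random numbers is arbitrary.

```scala
import scala.util.Random

// Runs `body` once and reports its result together with the elapsed time in milliseconds.
def timed[A](label: String)(body: => A): (A, Long) = {
  val start = System.nanoTime()
  val result = body
  val elapsedMs = (System.nanoTime() - start) / 1000000
  println(s"$label $result time: $elapsedMs")
  (result, elapsedMs)
}

// Seeded generator, so that every solution is measured on the same input.
def randomInput(n: Int, seed: Long): List[Int] = {
  val rnd = new Random(seed)
  List.fill(n)(rnd.nextInt(100000))
}
```

For example, `timed("solution5")(solution5(randomInput(10000, 42L)))` would print one line per run. Single runs on the JVM are affected by warm-up and garbage collection, so figures obtained this way are indicative only.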

The repository containing the source code included in this post can be found on GitHub.