# Algorithms

## Goals

• Appreciate the role of algorithms in solving problems.
• Recognize the different categories of algorithm complexity.
• Understand the two types of looping used in algorithms.
• Explore common searching algorithms.
• Learn general techniques for creating algorithms.

## Concepts

• analysis of algorithms
• algorithm
• base case
• big O notation
• binary logarithm
• binary search
• brute-force
• common logarithm
• complexity
• constant time
• divide and conquer
• efficiency
• exhaustive search
• exponential time
• factorial
• factorial time
• general case
• greedy algorithm
• heuristic
• iterative
• knapsack problem
• linear time
• logarithmic time
• O(1)
• O(cn)
• O(log n)
• O(n)
• O(n2)
• O(n!)
• optimization problems
• NP-complete
• polynomial time
• premature optimization
• pseudocode
• recursion
• searching
• sorting
• traveling salesperson problem

## Lesson

An algorithm is a set of steps for solving a problem. Algorithms are usually compared to a recipe you might use for making a cake, but what makes algorithms different from each other is not so much different ingredients, but the order and manner of using those ingredients. Many times one algorithm can be shown to be better than another in some aspect, even though they may use the same ingredients and result in the same “cake”. Two of the most common applications of algorithms is searching for things and sorting items in a particular order.

You already used an algorithm when you first learned to use variables. The task was something like this:

What are the first 3 even integers?

Because you you had not yet learned about loops or studied methods in depth, your solution might have looked something like the one below. This solution may even be more sophisticated than the one you wrote, because you had not yet learned the importance of separating the algorithm parameters `start` and `step` from the algorithm logic to make it more understandable and reusable.

### Analysis

The algorithm above gets the job done nicely, but there may be other ways to arrive at the same answer. Another way be faster or use less memory. There may not be a single algorithm that is better than the others in all aspects; you may need to make trade-offs based upon the needs of your application. With analysis of algorithms you can characterize different aspects of an algorithms in order to make an intelligent decision about which one to use.

#### Loops

One of the easiest ways to categorize an algorithm is based upon how it uses loops. Now that you've learned about methods and control flow, you wouldn't copy `number += step` just to print even integers. Moreover that wouldn't be very flexible, as you'd have to change the code for different numbers of integers. Instead your algorithm is likely to use loops to avoid repeating code.

Let's modify the problem above slightly to calculate just a single `n`th even integer. We can use a loop to avoid repeating the code for each step, and put the algorithm in a method to make it self-contained.

Given `n`, what is `n`th even integer after zero?

This algorithm uses a single loop, self-contained within the method. It doesn't call any other methods, and mostly importantly the method doesn't call itself. A method that uses loops that do not call back to the method itself is said to use an iterative algorithm.

#### Complexity

Believe it or not, how well an algorithm runs with an arbitrary set of data, or its efficiency, is usually not very significant in itself. What is usually more important is how the algorithm behaves when the problem gets bigger; this is called its complexity, or its “order of growth”.

##### Linear Time

Take a look at the algorithm above for calculating the `n`th even integer. Each time it calculates another integer, it performs `number += step`. Let's assume that this step takes `1ms` (an arbitrary value, probably not realistic) to perform, and that you want to retrieve the fourth even integer. The algorithm takes about `1ms + 1ms + 1ms + 1ms = 4ms` to calculate four integers. (We started with the first even integer, after `0`, and we're ignoring the time taken by the first step, `number = start`.) What does `4ms` tell us? Not a lot in real life, because it's likely that we would want to use this algorithm to calculate different numbers of integers.

Things get more interesting if we want to calculate the 25th even integer. Still assuming each step takes `1ms`, it's easy to see that the algorithm would take `1ms × 25 = 25ms` to complete. What about the 999th even integer? It would take `1ms × 999 = 999ms` (almost a second) to finish. We could even write a formula for it:

``complexity = time_per_step * number_of_steps``

What this is saying is that, for each step added, you simply add one more unit of time. The unit of time, whether it's 1ms or 5 hours, stays the same in the formula. You might have seen this formula before. The general formula for a line is `y = mx + b`. If the line goes through the y axis at the origin (the y-intercept variable `b` is `0`), the formula becomes simply `y = mx`, which is the same formula as the one above! If the size of the problem is 0, we expect the time the algorithm takes to be about 0, so a y-intercept of `0` makes sense in the line formula.

We can therefore say that the order of growth of this algorithm is linear. The bigger the problem grows, the longer the algorithm takes to completes, but it increases by the same amount with each step added. In other words, the time the algorithm takes is directly proportional to the size of the input.

##### Constant Time

Now that the algorithm is working and you have (hopefully) added unit tests for regression testing, it's OK to try to improve the algorithm. You might have noticed a pattern in the data: the fourth even integer is 8. The fifth even integer is 10. The sixth even integer is 12. In a stroke of inspiration, you decide to replace the loop with a simple formula:

That's certainly easier to understand, but is it any better? Assuming that calculating `n * 2` takes `1ms` (the same as `number += step`), you see that for the first even integer after zero (an input of `1`), both algorithms take about `1ms`. But what about the 999th even integer? The iterative algorithm takes `999ms`, while the formula still only takes `1ms`. The pattern is illustrated in the following table:

No matter how large the problem grows, the algorithm based on the formula will always take the same amount of time to complete! This is referred to as constant time, it is obviously much better than linear time (although linear order of growth wasn't so bad to begin with).

#### Big O Notation

Computer scientists have a special way of communicating the complexity of an algorithm, referred to as big O notation. The capital letter “O” is used before a set of parentheses to indicate the order of growth of the algorithm. Inside the parentheses is some value indicating how the running time of the algorithm grows in relation to the size of the of the data, represented by `n`.

Here are the two orders of growth you've seen so far, represented in big O notation. The big O is pronounced “order”.

O(n)
Linear time. If a data set is size `n`, then the running time is `n` as well. Put another way, if you increase the data size, the running time increases by the same number of units.
O(1)
Constant time. The variable `n` isn't even included in the notation, because it is irrelevant; the running time stays the same regardless of `n`.

### Searching

One of the most common applications is searching for some item. There are several reasons you might want to search for something:

• You might want to keep track of whether you've seen an item before. Maybe you are interested in knowing how many people have visited a store more than once, for instance.
• You might want to know other information about the item. You may might know a car's vehicle identification number, for example, and want to retrieve more information about the car.

#### Exhaustive Search

Let's start with a simple array of integers, and attempt to search for an item using the most obvious method: starting at the first array element and progressing until you find the object or reach the end of the array. This is called an exhaustive search.

As you can see from the loop, this algorithm is iterative. What is the complexity of this algorithm? You can try a few test searches using an array with the following numbers: `11`, `38`, `4`, `15`, `12`, `10`, `5`, `7`, `32`, `3`.

##### Best Case

Let's try to find the value `11`. (We'll continue to pretend that each iteration of the loop takes `1ms`.) Because this is the first number in the array, the algorithm finds it in the first loop iteration, taking only 1ms! What if we were to add another number to the array, to make 11 positions instead of 10? This algorithm still takes `1ms`. What if we had an array of 999 elements? This algorithm would still find the number `11` in only `1ms`! Does this mean that this exhaustive search has constant time complexity? It seems to—at least in the best-case scenario in which the item we're looking for is first in line.

##### Average Case

Let's try searching for a few other values:

search for `11`
`1ms`
search for `15`
`4ms`
search for `3`
`10ms`
search for `10`
`6ms`

There doesn't really seem to be any pattern here, except that the average search time is just over `5ms`, which is the time it take to get about halfway through the array. If there are `n` elements in the array, it needs on average `n/2` iterations, that is to say `½n` iterations, to find a value. (Remember that when analyzing complexity, the coefficients are ignored.) The running time of the average case is proportional to the number of items to search. This is linear complexity.

##### Worst Case

What is the worst performance this algorithm could possibly show? That would be the case in which we search for a number that is not present in the array, because the algorithm will have to look at all the positions to make sure that the item isn't in one of them. For the sample array of 10 values, this would take 10 iterations of the loop, or `10ms` in our example. For an array of 11 values, it would take `11ms`. With 999 values, in the array, it would take `999ms`. This is obviously linear time.

The complexity of an exhaustive search therefore varies based upon whether you are discussing the best case, average case, or worst case.

#### Binary Search

The linear time of an exhaustive search is not too bad, but is there some way we could make it faster? If only there were some way we didn't have to look at all the values when searching. What if we were to first sort the values?

If we know that the values are sorted in ascending order, we could start looking at any position in the array and if the value was not found, we would know in which direction to continue looking! For example, if we were to start looking for the number `11` at `array`, we would find the value `7`. The value `7` is not `11`, but because the values are sorted we know that `11` must be at some position greater than that of `array`! We don't even have to look at `array` and `array`, because the value `11` could not possibly be in one of those positions.

##### Divide and Conquer

This approach to dividing up the problem and looking at portions of the data independently is a common technique in creating algorithms. It's called divide and conquer. If we know that the numbers to search are sorted, we can start searching in the middle of the array. If the number we encounter is not the one we're searching for, we can immediately “throw away” part of the numbers and continue searching those for which we've “narrowed down” our search.

The divide and conquer technique is the basis of the binary search algorithm:

1. Require that the input data be sorted.
2. Start searching halfway through the remaining input data.
3. If the encountered number is the one we're looking for, return it.
4. If the encountered number is too high, throw away all the values above it and repeat step 2.
5. If the encountered number is too low, throw away all the values below it and repeat step 2.
##### Binary Search Complexity

Think back to how we thought about complexity for an exhaustive search. For the worst case scenario, adding one unit of data to the problem resulted in one more loop iteration, that is, the time it took went up by one unit. When plotting on a graph, if advancing `x` by one unit results in advancing `y` by one unit, we get a plot of a straight line, so an exhaustive search runs in linear time. But we can also think about this backwards: how many more data units will be required to create one more loop iteration in an exhaustive search? The answer is one more data unit; there is a `1:1` correspondence between the input data and the number of loop iterations.

Now think about binary search. Remember that each loop iteration throws away half the data. So instead of asking how many more iterations one more unit of data would require, ask instead how many more units of data would require to add one more iteration to the loop? Because each loop iteration takes care of half the data, the number of loop iterations increase by one each time you double number of input units. If you have 32 data units, a worst case would take around 5 iterations to find the correct item (16, 8, 4, 2, 1), cutting data in half each time around. If you have 64 data units, this only increases the number of iterations by one, producing 6 iterations: 32, 16, 8, 4, 2, 1.

The number of iterations of a binary search is the inverse of some square of 2. This is the binary logarithm; if there are `n` data units to be searched, the worst-case number of loops will be `log2 n`. For this reason algorithms like binary search are said to show logarithmic growth, and they are indicated as O(log n) in big O notation.

As you can see from the chart, logarithmic growth is much slower than linear growth. As the problem gets larger, algorithms with logarithmic running time outperform algorithms with linear growth by a large margin.

##### Data Preparation

To use the binary search algorithm, you must first sort the data to be searched. Although sorting will be covered in a future lesson, for now you could use the Java utility method `java.util.Arrays.sort(int[] a)`.

But doesn't sorting the data itself take time, and detract from the efficiency of the search? Yes, if you consider the time it takes to sort the data and the time it takes to search the data, the combined time could be longer than even a brute-force, exhaustive search! But consider that we only have to sort the data once to get the benefits. Once the data is sorted, we can search again and again, and each of those searches will run in logarithmic time.

### Recursion

Iteration is not the only type of looping that can be performed in an algorithm. Another way for an algorithm to repeat code is for a method to call itself; this is called recursion. Because recursion is an alternative to iteration, most algorithms written using iteration can be rewritten to use recursion, and vice-versa. Many algorithms are more naturally written one or the other, however.

The key to recursion is knowing when to stop. Just like the condition of a `while(…)` loop or a `for(…)` loop, in recursion a method needs to know when to stop calling itself; this condition is called the base case (as opposed to the general case in the other recursions). With a check for the base case, the method will keep calling itself, modifying the passed arguments each time, until the base case is reached.

Consider the mathematical operation factorial, in which a positive number is multiplied by every number less than itself, stopping at `1`. The factorial operation is represented by the `!` symbol, and the definition of factorial says that `0!` is equal to `1`. For example `5!` is equal to `5 × 4 × 3 × 2 × 1`. With an iterative approach it is easy enough to use a `while(…)` or a `for(…)` loop to go through each of the numbers:

If we think about how factorial works, we realize that if `5! = 5 × 4 × 3 × 2 × 1`, and if `4! = 4 × 3 × 2 × 1`, then `5! = 5 × 4!`. Similarly `4! = 4 × 3!`. In other words, the factorial of any number is equal to that number multiplied by the factorial of one minus the number—down to `0`, for which `0!` is `1`. In more mathematical terms, `n!` is equal to `n × (n - 1)!`. Here's how we might take advantage of this property to convert the iterative algorithm to a recursive algorithm:

The `factorial(…)` method calls itself to find the factorial of `n - 1`, which in turn calls itself to find the factorial of `n - 1 - 1`, and so on. So why doesn't this recursion go on forever? It is because eventually `factorial(…)` is called with the base case `0`, which instead of calling the method recursively will return some value, breaking the cycle.

The following table, when read from the bottom up, indicates the sequence of recursive calls made to calculate `factorial(5)`.

`factorial(0)` `return 1;` `return 1 * factorial(0):` `return 2 * factorial(1);` `return 3 * factorial(2);` `return 4 * factorial(3);` `return 5 * factorial(4);`

Using these concepts we can convert the iterative binary search algorithm above to use recursion.

### Optimization Problems

Imagine that you win a computer programming contest, and as a prize you get to choose any two computer games from an online store. The only restriction is that the total price of both games cannot exceed \$29.

You decide that you will try to find the most expensive two games available without going over the maximum price. Alongside searching and sorting, this is a third common category called optimization problems. In these sort of problems you need to find some optimal combination, such as the most or least of something within certain constraints. An optimization problem may have more than one solution.

Algorithms work with values; what those values mean depends on the problem. Let's use the same list of values we were using before, assuming they represent prices of computer games. You may already see that the optimal solution is to choose the two games that cost \$15 and \$12 for a total price of \$27. But what steps would you use to solve this problem using an algorithm?

Think about choosing two games; if you choose the game at `array`, you can pair it with any of the other games. If the rules of the contest allow, you could even pair the game with itself (choosing two copies of the same game). Thus if you choose the game at `array` as your first prize, you have 10 possibilities for the second game.

Similarly if you choose the game at `array` as the first game, you again have 10 possibilities for the second game, and so on. For each of the 10 games, you could pair it with one of 10 games (or one of the other 9 games, depending on whether you were allowed to choose the same game twice). This suggests the following brute-force approach:

The crux of this algorithm is the nested loop; each of the 10 prices is being compared against each of the 10 prices themselves, for `10 × 10 = 100` combinations to check. You can see that the same thing happens if the length of the input list grows: for 11 prices, the number of iterations would be `11 × 11 = 121`.

In general if the length of the prices list is `n`, we can say that the number of iterations is `n × n` or `n2`. An algorithm with this sort of complexity is referred to has running in quadratic time, and is denoted as O(n2) in big O notation. From the figure you can see that O(n2) complexity increases much more rapidly than any of the running times we've discussed earlier.

Here is a table to put this into perspective, showing just how fast these different classes of algorithms increase in running time as the problem size increases. For illustration each iteration is assumed to take `1ms`, and as an example O(log n) is assumed to have log2 n complexity.

#### Knapsack Problem

The above task of choosing two optimal computer game prices is really a variation of a the more general knapsack problem. This type of problem traditionally uses the illustration of a knapsack (or backpack) which has a limited amount of space or weight, and you want to choose items to fill the knapsack as completely as possible. In the most common version of this problem, which we'll use here, you are allowed to choose at most one of each item of a particular value.

Imagine that after winning the programming contest, the rules say that as a prize you can choose any number of computer games as long as the total price does not exceed \$29. We'll continue using the same list of computer game prices shown earlier.

As a first step at attacking this problem we'll think of how a brute-force approach would work. Remember that including a game twice isn't allowed, and if order doesn't matter

1. First look at every computer game by itself. (There are 10 options to check.)
2. Then look at every group of two games. This will require `10! / (2! x 8!) = 55` iterations, as you might already know from the earlier problem.
3. Then look at every group of three games.
4. Then look at every group of eight games. (The missing game can be any 2, so there are 45 combinations to check.)
5. Then look at every group of nine games. (The missing game can be any one of the 10, so there are 10 combinations to check.)
6. Check the group of all 10 games. (There is only one such group.)

The number of iterations this will require may not be immediately obvious. Realize that you need to check every number and combination of game. Each of the 10 games can either be chosen or not chosen, which is a choice of two options (“yes” or “no”). With two options for 10 items, this results in a `210 = 1024` possibilities. Similarly with only 5 games to choose from, each game still has an option of being chosen or not, resulting in `25 = 32` possibilities.

As the size of the problem `n` grows, the number of iterations needed to check all possibilities will be `2n`. When the running time is some constant `c` to the power of `n`, it is called exponential time. In big O notation this is generally shown as O(cn), and from the figure you can see that this running time increasing even faster than even O(n2) complexity!

#### Greedy Algorithms

We say that brute-force approach to the knapsack problem explained above is correct because it gives the true answer. But algorithms with running times of O(n2) and O(cn) are so slow, you'll want to avoid them for all but small problems. Sometimes there is a way to find an answer that isn't “correct”, but is “good enough”. For your computer programming prize, would you rather wait a year for your algorithm to determine the absolute best combination of computer games within the \$29 limit, or would you rather take just a few seconds and walk away with computer games in your knapsack that represent a pretty good estimation of the best value?

A technique for creating algorithms that may not be correct but that provide good estimates, is to use a heuristic, a “short-cut” rule that gets you close to the correct answer in a shorter amount of time. One of the most important heuristics is that used in so-called greedy algorithms: repeatedly pick the best answer available without exhaustively trying all possibilities.

Let's see how you could create a greedy algorithm to arrive at a good value of video games without going over \$29. Rather than using Java code, we'll show pseudocode. This is a common way of discussion algorithms, illustrating the programming logic without using any particular programming language syntax.

Let's look at how the knapsack would be filled based upon the two different approaches:

Greedy algorithm (heuristic solution)

The algorithm will fill the knapsack in the following order. When it comes to the \$12 game, it will add it into the knapsack because it fits at that point.

1. `\$11 <= \$29?` Yes.
2. `\$11 + \$38 = \$49 <= \$29?` No.
3. `\$11 + \$4 = \$15 <= \$29?` Yes.
4. `\$11 + \$4 + \$15 = \$30 <= \$29?` No.
5. `\$11 + \$4 + \$12 = \$27 <= \$29?` Yes.
6. `\$11 + \$4 + \$12 + \$10 = \$37 <= \$29?` No.
7. `\$11 + \$4 + \$12 + \$5 = \$32 <= \$29?` No.
8. `\$11 + \$4 + \$12 + \$7 = \$34 <= \$29?` No.
9. `\$11 + \$4 + \$12 + \$34 = \$61 <= \$29?` No.
10. `\$11 + \$4 + \$12 + \$3 = \$30 <= \$29?` No.

Result: `\$11 + \$4 + \$12 = \$27`

Brute-force algorithm (correct solution)

The brute-force algorithm will not be fooled into placing the \$12 game into the knapsack. By trying all the possibilities, it realizes leaving out the \$12 game will leave room for a \$10 and a \$3 game, which is \$1 closer to the \$29 maximum.

Result: `\$11 + \$4 + \$10 + \$3 = \$28`

This greedy algorithm iterates through the list of games a single time. For each game, the algorithm simply includes it in the knapsack if it doesn't push the price over the maximum total (\$29 in the example scenario). Because the games are only iterated once, you should already recognize that this is linear running time. By switching to a greedy approach, we may not get the exact answer, but we've reduced the running time from O(cn) to O(n). This is a huge performance increase! And the result is still pretty close to the correct answer.

#### Traveling Salesperson

One of the most famous optimization problems revolves around the story of a salesperson wishing to visit a certain number of cities. The salesperson knows that time is money, and therefore wants to spend the least amount of time traveling—which is what makes it an optimization problem. Each two cities are separated by some distance (or travel time, based upon how the problem is phrased), and the salesperson wants to know which city to start at and which order of cities to visit so as to visit all the cities in the shortest time possible.

This traveling salesperson problem is famous because it is hard. All the running times we've talked about so far are called polynomial times, because their running time can be expressed by a simple mathematical power relationship of the problem size `n`. But the best correct solution anyone has found for the traveling salesperson problem is a brute-force approach, using four cities as an example:

1. For the first city, the salesperson can start at any of the of the four cities.
2. For each of the four cities, the salesperson has three more cities to choose from.
3. For each of those cities, there are two more cities left.
4. Whichever of the two cities the salesperson chose, there will be one more city remaining.

Therefore with four cities, a brute-force approach will require trying `4 × 3 ``× 2 ``× 1 = 24` different combinations. You should already recognize this as `4!`, or “four factorial”, and the algorithm indeed runs in factorial time, shown as O(n!) in big O notation. With only four cities, the running time doesn't seem so bad. With 10 cities, however, the running time jumps to 3,628,800 iterations. The figure indicates that factorial time O(n!) increases even faster than O(cn) and O(n2) complexity! For all but the smallest problems, algorithms with O(n!) complexity are simply unusable in the real world, at least on a single computer—simply because we don't have the time to wait for them.

## Review

### Summary Graph of O(1), O(log n), O(n), O(n2), O(cn), and O(n!) running times.

• searching
• sorting
• optimization

#### Techniques

• data structuring
• divide and conquer
• greedy
• heuristics

#### Algorithm Analysis

• iterative / recursive
• complexity

### Gotchas

• Don't try to make an algorithm run faster until you have shown that it gives the right answer.
• Just because an algorithm runs quickly for a certain set of data, it may not be the best choice. It's order of growth is probably more important; make sure it doesn't drastically slow down when given larger problems.
• Ignore constants and coefficients when determining algorithm complexity.
• Don't forget to add a check (usually up-front) for the base case in a recursive method to know when to stop.

### In the Real World

• Most of the basic algorithms have already been implemented in standard libraries. However it is important to know how to think about and analyze algorithms in order to choose the best one for the job, as well as to design your own algorithm for new problems that come up.

If you're stuck trying to create an algorithm, try the following process:

1. Create an interface for your algorithm. Make sure the interface clear about what data it expects and what it promises to return.
2. Craft a brute-force algorithm that solves the problem based upon your real-world understanding, without regard to how fast the algorithm runs.
3. Create unit tests that verify the initial algorithm functions correctly.
4. Analyze the algorithm and determining its complexity. See if you can change your approach to produce a faster running time.
5. Use the unit tests to make sure your algorithm still works.
6. If there is a need for further speed improvements, see if you can improve the logic of the algorithm.
7. Continue to use the unit tests for regression testing, making sure that no bugs were introduced during optimization.

### Self Evaluation

• What are three of the most common types of problems for which you can use algorithms?
• What is “premature optimization”, and why should you avoid it?
• What is the difference between the “efficiency” and the “complexity” of an algorithm?
• How do you use constants and coefficients when determining the complexity of algorithms?
• Which is better, linear time or constant time? Why?
• When might you want to use a brute-force approach when creating an algorithm?
• Should you pay more attention to an algorithm's best-time, average-time, or worst-time performance?
• Why is a “binary search” so-called, and how does that relate to its running time?
• What is the difference between the binary logarithm and the common logarithm, and how does it influence the classification of algorithm complexity?
• What is the key factor that makes a technique “recursive”?
• How would you choose between an iterative or a recursive technique?
• What sort of loops in an algorithm might indicate that you are dealing with a slower running time than linear?
• Give an example of an optimization problem.
• What is a “heuristic” and when would you want to use one?