Binary Search

Example Problem

Imagine that we have a list of \(N\) integers given to us in ascending order. We are also given a value x, and our goal is to find the first value that is greater than or equal to x (or determine that there is no such value). For example, we could have \(N\) = 13 and a list of numbers 2, 3, 7, 10, 12, 15, 18, 21, 25, 30, 32, 35, 39:

Solution 1

We can loop through the array until we find the first value ≥ x. In python, this might look like:

# Assume that the list is called L
# e.g. L = [2, 3, 7, 10, 12, 15, 18, 21, 25, 30, 32, 35, 39]
# x is the target value, e.g. x = 8

found = False

for val in L:
    if val >= x:
        print(x)
        found = True
        break

if not found:
    print("No value!")
    

This has a worst-case time complexity of \(O(N)\) because it might loop through the entire list.

Solution 2: Binary Search

Because the list is sorted, we can use an algorithm called binary search to improve the time complexity. Here is a video of binary search being applied to our example list with x = 22.

In binary search, we first check the middle position in the list. Because the list is sorted, this tells us a lot of information:

In either case, we can immediately eliminate half of the list. We can then repeat this process until there is only one option remaining.

Time Complexity

Each time binary search checks a value, it eliminates half of the remaining possibilities. For example, if the array had \(N\) = 64 values, then the number of possibilities would go 64 → 32 → 16 → 8 → 4 → 2 → 1, requiring only 6 checks! In general, the number of times that you need to halve a number before reaching 1 is represented by the base-2 logarithm function. For example, \(\log_2(64) = 6\) and \(\log_2(1\,000\,000) \approx 19.93\). This means that if our list had 1000000 numbers, we could find the answer by checking just 20 of them!

We could say that binary search has a time complexity of \(O(\log_2(N))\). However, if you have learned logarithms in school, you may know that The 'change of base rule' tells us that logarithms with different bases are multiples of each other. Because of this, we can omit the base 2, since this represents a constant factor (don't worry if you don't understand this, it's not that important).

Putting this all together, binary search has a time complexity of \(O(\log(N))\).

Coding Binary Search

Binary search is notoriously difficult to code because it is very easy to have off-by-one errors that cause incorrect answers or infinite loops! I suggest trying to code it yourself (you can submit to Bernard's Magic Needles), and look at my code if you get stuck.

My binary search code

'''
 Returns the first value in the list A which is greater than or equal to x
 or -1 if no such value exists
'''
def bsearch(A, x):
    lo = 0
    hi = len(A)-1
    while lo != hi:
        mid = (lo+hi)//2
        if A[mid] >= x:
            hi = mid
        else:
            lo = mid+1
        
    # Check that the value exists
    if A[lo] >= x:
        return A[lo]
    else:
        return -1;
/*
 * Returns the first value in the array A which is greater than or equal to x
 * or -1 if no such value exists
 */
int bsearch(int n, int *A, int x) {
    int lo = 0, hi = n-1;
    while (lo != hi) {
        int mid = (lo+hi)/2;
        if (A[mid] >= x) {
            hi = mid;
        } else {
            lo = mid+1;
        }
    }
    
    // Check the value exists
    if (A[lo] >= x) return A[lo];
    else return -1;
}

Medusa's Snakes (AIO 2019)

Let's try to apply what we have learned to solve a challenging problem. Read Medusa's Snakes and spend some time thinking about how to solve it. If you get stuck, look at the hints below.

Hint 1

This problem is a maximisation problem, because it is asking us for the maximum possible venom level. Let's begin by solving a simpler problem: given some value x, is it possible to create a snake with a venom level of exactly x? This is called a decision problem, because the answer is either yes or no. Try to solve this problem in \(O(N)\) time, without using binary search.

For example, consider the first sample input KSEESNANNAAKNKESE. For x = 1 or 2, the answer is yes, because we can make a snake with venom level 1 or 2. For x = 3 or greater, the answer is no, because we cannot make a snake with these venom levels.

Hint 2 (solution to hint 1)

To check if there is a snake with venom level x, we can try to find one. We construct our snake as follows, using a greedy algorithm:

  • Find the first x S's in the string.
  • Find the first x N's in the string that occur after the S's.
  • Find the first x A's in the string that occur after the N's.
  • Find the first x K's in the string that occur after the A's.
  • Find the first x E's in the string that occur after the K's.

This will always find a snake with venom level x if one exists. Why? Using the first x S's maximises the remaining characters to use in the rest of the snake. If we skipped over an S to use a different one, we might also skip over an N that is required to finish our snake, but the method we use guarantees we don't skip over letters unless we are required to. The same argument can be applied to the other letters.

By using a for loop to iterate through the letters, this algorithm runs in \(O(N)\) time.

Hint 3

Try to use binary search, alongside the problem and solution from hints 1 and 2, to solve the original problem (Medusa's Snakes) in \(O(N \log N)\) time.

Note that the exact situation you apply the binary search to will not be the same as the example problem from above. Try to use the underlying idea of halving the search space.

Hint 4

We could use the ideas from hints 1 and 2 to make an \(O(N^2)\) solution to the problem (which would score partial marks but be too slow to score 100). Specifically, we know how to tell if a snake can be made for a given venom level x, so we can simply loop through all possible values of x and check if they are possible. The venom level can't be larger than \(N\) - in fact it can't be larger than \(N/5\) - so there are \(O(N)\) possible values of x and we can check if each of them is possible in \(O(N)\) time. This gives us an overall time complexity of \(O(N^2)\).

Notice that as we try possible values of x, starting from 0 and increasing, the answer will be yes until we reach the maximum venom level, then it will always be no afterwards. Can we use this to speed up our search for the maximum venom level?

Hint 5 (solution to hints 3/4)

In hint 2, we found an \(O(N)\) time algorithm to check whether we can make a snake with venom level x (for any x). Said another way, this algorithm allows us to check whether or not the answer is at least x.

We can use a technique called binary searching for the answer. That is, we use binary search to continuously halve the options for the answer, until only one remains. In code, that would look something like this:

def check(x):
    # TODO: write a function that returns True if the answer is >= x, and False Otherwise
    # This function runs the solution from hint 2

lo = 0 # the answer is always at least 0
hi = N # the answer is always at most N 
       # (we could replace this with N//5, but this would not improve the time complexity)

while lo != hi:
    mid = (lo+hi+1)//2 # I have a +1 so that this rounds up rather than down. 
                       # If I rounded down instead, this could cause an infinite loop
    if check(mid):
        lo = mid # the answer is >= mid
    else:
        hi = mid-1 # the answer is < mid

print(lo) # the answer is lo (alternatively hi)
bool check(int x) {
    // TODO: write a function that returns true if the answer is >= x, and false Otherwise
    // This function runs the solution from hint 2
}

int lo = 0; // the answer is always at least 0
int hi = n; // the answer is always at most N 
              // (we could replace this with N/5, but this would not improve the time complexity)

while (lo != hi) {
    int mid = (lo+hi+1)/2; // I have a +1 so that this rounds up rather than down.
                           // If I rounded down instead, this could cause an infinite loop
    if (check(mid)) {
        lo = mid; // the answer is >= mid
    } else {
        hi = mid-1; // the answer is < mid
    }
}

printf("%d\n", lo); // the answer is lo (alternatively hi)

The function check runs in \(O(N)\) time and is called \(O(\log N)\) times by the binary search, giving an overall time complexity of \(O(N \log N)\). This is fast enough to score all 100 points.

Exercises

Click here to go back.