Programming Binary Search

I'll show the trick to pass them at the very end of this story.

Meanwhile, I'll appreciate if you share your results via an anonymous twitter poll at this link.

Generic binary searchIt is easier to think about binary search when solving a more general problem than finding an element in a sorted list.

It is quite a handy algorithm when you are trying to solve a problem that has the following properties.

The answer to the problem is a number.

You don’t know the answer, but given the answer you can code the algorithm to check if the number is an answer to the problem in a relatively straightforward way.

So, essentially you have a predicate function ok(x) that returns true when x is a potential solution.

Your task is to find the minimal x so that ok(x) returns true.

There is one crucial property that needs to be satisfied to apply a binary search algorithm.

The predicate has to be monotonic.

It means that if x is the lowest number making the predicate true, then it is also true for all the larger values of x.

For example, the following table shows the values of a monophonic predicate over integers in range from 1 to 6: x : 1 2 3 4 5 6ok(x) : F F T T T TIn binary search algorithm we introduce two variables.

One variable keeps the index of the left side of the currently searched range and is commonly shortened to l (however, make sure your programming environment uses a font that makes it look distinct from the number 1), while the second one keep index of the right side (r).

The algorithm maintains the following invariant throughout its operation:ok(l) is falseok(r) is trueBinary search is a primitive example of a divide and conquer algorithm.

It repeatedly halves the range from l to r while maintaining this invariant until l and r only differ by one, thus exactly pinpointing the indices at which monotonic predicate turns from false to true: x : 1 2 3 4 5 6ok(x) : F F T T T T ^ ^ l r // at the endSo the whole solution is to run the following loop:while (r – l > 1) { val m = (l + r) / 2 // compute the midpoint of l.

r interval if (ok(m)) r = m // Maintain invariant: ok(r) is true else l = m // Maintain invariant: ok(l) is false }If you keep the loop invariant in mind when writing this code it is easy to see what exactly needs to be put into each of the branches of the if statement.

No chance for any kind of off-by-one errors.

But we need to initialize the values of l and r before that start of the algorithm so that the invariant holds.

How do we find such an initial l that ok(l) is false?.In general, we cannot set l to the left of our search range (index 1 in the above picture), because the predicate might already be true there.

The same problem is with r.

Now, in some specific problems we might happen to know initial values of l and r that maintain the invariant, but when it is not case we can use the approach called a sentinel value.

We set the initial l to the number before the beginning of the search range and just pretend ok(l) is false there and set r to the number after the end of the search range and just pretend ok(r) is true.

x : 0 1 2 3 4 5 6 7ok(x) : [F] F F T T T T [T] // [x] are virtual values ^ ^ l r // at the beginningWe don’t need our predicate function to return those virtual true and false values.

The binary search algorithm only ever evaluates predicate ok(m) for values of m strictly between the current values of l and r.

The initial values work as sentinels that ensure that predicate is only evaluated in the search range.

Array searchThe original problem of searching an element in a sorted list is just a specific instance of a binary search when predicate ok(i) is list[i] >= value.

Valid indices in the list run from 0 to list.

size – 1, so using the approach of sentinel values we start with l = -1 and r = list.

size.

We are asked to return the first i where the predicate is true, so we consult our invariant and see that at the end we need to return r as the answer.

Here is the simple code that implements searchLowerBound function and passes all the main tests:Integer overflowUnfortunately, this code will not pass the bonus tests.

The size of the list in those tests is equal Int.

MAX_VALUE and integer overflow problems in two places in this code prevent it from working correctly.

The first one is r – l > 1 check.

At the very start, when l = -1 and r = Int.

MAX_VALUE the result of r – l overflows integer range, becomes negative, makes the loop condition false, causes immediate exit from the loop, and produce a wrong answer.

A better way to write this check is l + 1 < r.

It is always correct and overflow-free even when you solve a general binary search problem on the full range of integers from l = Int.

MIN_VALUE to r = Int.

MAX_VALUE, since l is always less that r and thus l + 1 never overflows.

The other problem is in the code that computes the middle as (l + r) / 2.

This sum might overflow when l and r are close to the end of the integer range.

We cannot use unsigned arithmetics here, since we start with l at a negative value.

We could use Long type, but what if we had to solve the similar problem with Long indices?.The right way to fix it in this particular case is to write (l + r) ushr 1, using an unsigned shift to the right instead of division by two, relying on the fact that here the values of m are never negative and overflow in addition never goes beyond the highest bit.

This expression yields the correct value for this particular range of values of l and r that this binary search algorithm can produce.

Side note: However, it is not a generic solution to compute a mid-range for the full range of integers.

You'd either need to use higher-precision integers or more complicated bit-fiddling tricks.

I'd highly recommend reading Hacker's Delight for those who are interested to learn more about those (see item 2–5 for this particular problem).

All in all, the following code successfully passes all the tests for the original array search problem, including the bonus ones for the maximal list size:ConclusionBinary search seems like an easy problem, but is tricky to code right in practice.

Being able to write algorithms like this one is an essence of competitive programming that I’ve previously covered in the “Programming as a Sport”.

But learning them is also known to boost your chances at a technical interview and teaches you important principles of algorithm design that you can apply anywhere.

.

. More details

Leave a Reply