Binary Heap

From GPWiki
Jump to: navigation, search

A binary heap is a useful data structure if you have a collection of objects to which you are adding random objects and need to remove the object with the highest (or lowest, or the extreme of some other ordering) value. You could use a dynamic array or a list, but then you'll have to compare all the elements each time you want to find the biggest one. A binary tree is already better, but a binary heap is more space efficient, and simpler.

So when does this situation of needing to remove the element with the highest value occur? There are two common uses. The first one is a priority queue, you want to enqueue elements, and handle the one with the highest priority first. Another one occurs in the A* algorithm, where you want to extend the path that is looking the most promising first.

The way a binary heap works is that you have an array (you can use a fixed size if you know the maximum number of elements that are going to be on the heap, otherwise a dynamic array is a better idea) on which elements are organized in a way that resembles a binary tree. The element that currently has the highest value sits on the first position of the array (position 0), then at position 1 and 2 are two elements that are lower than the top element. At position 3 and 4 are two elements that are lower than element 1, and at position 5 and 6 are two elements that are lower than element 2.

This picture illustrates the array. The lines indicate the lower-than relations, and the numbers indicate the position in the array where the element can be found. Binary heap.png

Note that there is no guarantee that element 4 is lower than element 2, even though it is lower in the table. For example, 0 could be 10, 1 could be 9, 4 could be 7 and 2 could be 5. The lower-than relations are intact.

The elements that an object has relations (the lines in the picture) to can be easily found. Take element number X. The element above it is (X - 1) / 2 (round down if the result is a fraction), and the elements below it are (X + 1) * 2 - 1 and (X + 1) * 2. (These formulae are a little simpler when 1 is used as the top element.)

So how does an element get added? You put it at the end of the array (increasing the size by 1), and let it 'float' up. For example, if the elements in the picture represent the array, and only position 0-8 are full, a new element is put in at position 9. You find the element above it ((9 - 1) / 2 = 4) and compare it to that element. If the new element has a bigger value, you swap the elements and continue on position 4, comparing it to to (4 - 1) / 2 = 1. If the new element is smaller than the existing element, the heap is in a correct state again.

Removing the top element is a little more involved, but still no rocket science. You start by replacing the top element with the element that is currently at the bottom, at the end of the array. Shrink the array by 1 to remove the old instance of this bottom element. Then you let this element 'sink' down from the top. This 'sinking' goes like this: You find the elements below the current element (starting at 0 that is element 1 and 2), and take the largest one. If this element is larger than the element that is sinking, you swap them and continue sinking from the position of that element. When the floating element has a greater value than the elements below it, the heap is in order again.

This algorithm takes only a little space (just the space to store the elements) and inserting and removing elements is very fast (O(log(N)) at worst). Of course it is only useful in some situations, but in those situations this is the best thing I know of.