Create a Max-Heap with a Randomized Algorithm in Python

What will we cover in this tutorial?

We will create a heap, or more specifically, a max-heap. A max-heap is a tree structure where the node value of every parent is greater or equal to the children.

In this tutorial we will implement a max-heap with a binary tree and use a randomized approach to keep it balanced.

You might be wondering why to make it randomized. Simply, said, to keep it simple and keep operations on average O(log(n)).

Step 1: Recall what a max-heap is

A max-heap is a tree structure where the node value of every parent is greater or equal to the children.

Example of a max-heap

A heap will have two primary functions.

  1. Insert an element and still keep the max-heap structure.
  2. Get and remove maximum element (the element at the root) and still keep the max-heap structure.

The goal is to be able to do these operations in O(log(n)) time.

Step 2: Understand what a randomized algorithm can do for you

Randomized algorithms help you achieve great performance on average while keeping the algorithms simple.

To get keep the operations of a heap at worst-case O(log(n)), you need to keep the binary tree structure balanced. This requires complex ways to ensure that.

Instead, just put in the leaves in the three randomly, and you will get the same result with very high probability. Hence, you will end up with an average time of O(log(n)) for the operations.

Step 3: Insert into a max-heap

Now to the fun part. The code.

Let’s start simple and create a Node to represent the nodes in the binary tree, which will represent the max-heap.

class Node:
    def __init__(self, element):
        self.element = element
        self.left = None
        self.right = None

The node needs to be able to keep the element (which should be comparable), and a left and right child.

From the above Node class you can create an arbitrary binary tree.

The max-heap insert function can be implemented by a recursive and randomized approach in the following manner.

import random


class Heap:
    def __init__(self):
        self.head = None

    def _insert(self, element, node):
        # if element is larger then node.element, switch
        if element > node.element:
            element, node.element = node.element, element

        # check if available node
        if node.left is None:
            node.left = Node(element)
            return
        if node.right is None:
            node.right = Node(element)
            return

        # Choose a random node (here is the randomness hidden)
        if random.randint(0, 1) == 0:
            self._insert(element, node.left)
        else:
            self._insert(element, node.right)

    def insert(self, element):
        if self.head is None:
            self.head = Node(element)
        else:
            self._insert(element, self.head)

The function insert(…) checks for the special case, if there are no nodes in the binary tree so far and inserts if so. Otherwise, it will forward the call to the recursive and randomized function _insert(….), which also takes the head (root) of the tree as argument.

A recursive function in this case could be at any node, but starting from the head (root) node. It will do the same all the way down.

  1. Check if element of node is smaller than element to insert. If so, switch them.
  2. Check if node has a free child (left or right). If so, use it to insert new node with element and return.
  3. If none of the above, choose a random child (left or right) and call recursive down.

That is it. It will most likely create a well balanced binary tree.

See example here.

               +---------------36---------------+               
       +-------29-------+              +-------34-------+       
   +---27---+      +---20---+      +---32---+      +---33---+   
 +- 3-+  +-13-+  +- 6-+  +- 2-+  +-24-+  +-31-+  +-25-+  +-25-+ 
  0   1           4              16               6             

In simple ascii representation of the binary tree representing the max-heap. That the binary tree keeps a balance like that ensures that the insertion will be O(log(n)) on average.

Step 4: Delete the maximum element from the heap (and return it)

Deleting the maximum element will remove the root (head) of the binary tree. Then we need to take the larger child and move it up. That obviously makes an empty space in the child below. Hence, we need to do the same operation below.

This sounds recursive, doesn’t it?

import random


class Heap:
    def __init__(self):
        self.head = None

    def _insert(self, element, node):
        # if element is larger than node.element, switch
        if element > node.element:
            element, node.element = node.element, element

        # check if available node
        if node.left is None:
            node.left = Node(element)
            return
        if node.right is None:
            node.right = Node(element)
            return

        if random.randint(0, 1) == 0:
            self._insert(element, node.left)
        else:
            self._insert(element, node.right)

    def insert(self, element):
        if self.head is None:
            self.head = Node(element)
        else:
            self._insert(element, self.head)

    def get_max(self):
        return self.head.element

    def _delete_max(self, node):
        if node.left is None and node.right is None:
            return None

        if node.left is None:
            return node.right

        if node.right is None:
            return node.left

        if node.right.element > node.left.element:
            node.element = node.right.element
            node.right = self._delete_max(node.right)
            return node
        else:
            node.element = node.left.element
            node.left = self._delete_max(node.left)
            return node

    def delete_max(self):
        if self.head is None:
            return None

        max_element = self.head.element
        self.head = self._delete_max(self.head)
        return max_element

The delete_max function takes care of the special case where there are no elements (or nodes) in the binary tree. Then it takes the largest element and calls the recursive _delete_max(…) function with the head (root) as argument.

The _delete_max(…) function does the following.

  1. Checks for special case where node has no children. If so, return None.
  2. Check if one child is not there, if so return the existing child.
  3. Otherwise, take the child with the larger element. Take the larger element and assign it to node (remember, we have removed the element form the calling node), and call recursive down with larger child on _delete_max(…) and assign result to larger child node.

That can be a bit confusing at first. But try it out.

This operation also only has O(log(n)) performance on average. And as elements are put randomly, then removing them in order (maximum elements), will remove elements randomly and keep the binary tree balanced on average case.

Step 5: The full code and a simple print function of the tree

The full code can be found here.

import random


class Node:
    def __init__(self, element):
        self.element = element
        self.left = None
        self.right = None


class Heap:
    def __init__(self):
        self.head = None

    def _insert(self, element, node):
        # if element is larger than node.element, switch
        if element > node.element:
            element, node.element = node.element, element

        # check if available node
        if node.left is None:
            node.left = Node(element)
            return
        if node.right is None:
            node.right = Node(element)
            return

        if random.randint(0, 1) == 0:
            self._insert(element, node.left)
        else:
            self._insert(element, node.right)

    def insert(self, element):
        if self.head is None:
            self.head = Node(element)
        else:
            self._insert(element, self.head)

    def get_max(self):
        return self.head.element

    def _delete_max(self, node):
        if node.left is None and node.right is None:
            return None

        if node.left is None:
            return node.right

        if node.right is None:
            return node.left

        if node.right.element > node.left.element:
            node.element = node.right.element
            node.right = self._delete_max(node.right)
            return node
        else:
            node.element = node.left.element
            node.left = self._delete_max(node.left)
            return node

    def delete_max(self):
        if self.head is None:
            return None

        max_element = self.head.element
        self.head = self._delete_max(self.head)
        return max_element

    def _get_depth(self, node):
        if node is None:
            return 0
        left = self._get_depth(node.left)
        right = self._get_depth(node.right)
        if left > right:
            return 1 + left
        else:
            return 1 + right

    def get_depth(self):
        return self._get_depth(self.head)

    def _print_heap(self, current_level, request_level, depth, node):
        characters_per_level = 4*2**depth
        characters_per_node = characters_per_level // (2**(current_level + 1))
        if current_level == request_level:
            if node is not None:
                space_fill = characters_per_node // 4 - 1
                if request_level == depth - 1:
                    print(' '*space_fill + ' ' + ' '*space_fill + f'{node.element:2d}' + ' '*space_fill + ' ' + ' '*space_fill, end='')
                else:
                    print(' '*space_fill + '+' + '-'*space_fill + f'{node.element:2d}' + '-'*space_fill + '+' + ' '*space_fill, end='')
            else:
                print(' '*characters_per_node, end='')
        else:
            if node is not None:
                self._print_heap(current_level + 1, request_level, depth, node.left)
                self._print_heap(current_level + 1, request_level, depth, node.right)
            else:
                self._print_heap(current_level + 1, request_level, depth, None)
                self._print_heap(current_level + 1, request_level, depth, None)

    def print_heap(self):
        depth = self._get_depth(self.head)
        for i in range(depth):
            self._print_heap(0, i, depth, self.head)
            print()

Notice that the print function also is recursive.

Leave a Reply