Avoid Common Object Oriented Programming Pitfalls

What will we cover in this tutorial?

I did them myself. When you first learn about object oriented programming, you get exited about it and want to turn everything into objects.

Very often it fails to make the code easier to maintain and understand. Yes, that is one core goal of creating object oriented programming, to make it easier to maintain and understand.

Often we just turn the traditional way of programming into objects. But that is not the point. The point is to model it in an object oriented way.

Before that, let’s just look at why we use object oriented programming.

Step 1: Why use object oriented programming?

When introducing object oriented programming to beginners it often introduces too many concepts at once.

Simply because object oriented programming is awesome and can do so many things.

Let’s just start simple with the core of the object oriented programming idea.

We want to make your program easier to understand and maintain.

That is the goal for any programmer. Also, when it comes to object oriented programming.

Object oriented programming tries to make the link between the real world and the programming world as close as possible. We humans understand the objects in the real world better than we understand how a computer pushes bits and bytes around in the memory and CPU.

Luckily, we do not need to understand all that. We just need to understand how to program the computer through a programming language.

Object oriented programming tries to make that easier with modeling the programs with objects, which are related to the way we humans understand things.

Let’s try with a simple example of a Stack.

A Stack has a top and bottom, it has operations push and pop.

How would you model the above without using object oriented programming.

stack_size = 8
stack = [None]*stack_size
top = -1 # empty stack

def pop():
    global top, stack
    if top == -1:
        return None
    else:
        top -= 1
        return stack[top + 1]

def push(element):
    global top, stack, stack_size
    if top + 1 >= stack_size:
        stack += [None]*stack_size
        stack_size *= 2
    top += 1
    stack[top] = element

def print_stack():
    global top, stack, stack_size
    for i in range(top + 1):
        print(stack[i], '', end='')
    print(" (size: " + str(stack_size) + ")")


print_stack()
for i in range(10):
    push(i)
print_stack()
for i in range(8):
    pop()
print_stack()

That is confusing code, right?

First of all, we use global variables. That makes the function calls hard to understand, as they have side effects.

This is made more confusing than necessary. It does use Python lists like an Array. That is not needed, but it is just to exemplify how difficult it is to make intuitive code if you model the world like a computer works.

Step 2: Using object oriented programming to solve the above (first step – the less bad, but still not good solution)

First time you are asked to create stack using an object oriented approach, you will probably do it in a non intuitive way.

Say, you think of a Stack like a object. The stack is the full object.

What does a stack consists of?

Well items which are piled on top of each other, and you can take the top off.

How could that be modeled?

Often the straight forward way from a classical thinking way into the a class.

class Stack:
    def __init__(self):
        self.stack_size = 8
        self.stack = [None]*self.stack_size
        self.top = -1 # empty stack

    def pop(self):
        if self.top == -1:
            return None
        else:
            self.top -= 1
            return self.stack[self.top + 1]

    def push(self, element):
        if self.top + 1 >= self.stack_size:
            self.stack += [None]*self.stack_size
            self.stack_size *= 2
        self.top += 1
        self.stack[self.top] = element

    def print_stack(self):
        for i in range(self.top + 1):
            print(self.stack[i], '', end='')
        print(" (size: " + str(self.stack_size) + ")")


s = Stack()
s.print_stack()
for i in range(10):
    s.push(i)
s.print_stack()
for i in range(8):
    s.pop()
s.print_stack()

If you inspect the code, it is actually the same code. Which in some ways has improved it.

  1. We do not have global variables anymore. They are tied to the Stack class.
  2. The function calls push and pop are also tied to the Stack class.
  3. Also, when we use the Stack, it is clear from the context, as it is tied to the variable s, in this case.

So what is wrong?

Well, it is still not simple to understand what happens in the code. It takes time to understand, even with this simple code.

The functions use variables like top and assigns it to -1 if the stack is empty. It requires you to investigate pop and the constructor to understand that.

The key is to keep it simple.

So how to do that?

Step 3: A simple way to model it

Let’s take a look at the drawing again.

A stack.

A stack actually consists of object on it. Hence, the stack is the abstraction that keeps objects in a specific order.

That said, we need to model that closer, as it will be easier to understand for the reader of the code.

The objects on the stack we will call Nodes.

What characterizes a Node? It lies either on the bottom or on top of another Node.

What characterizes a Stack? It knows the top of the stack and can push and pop Nodes.

How can that be turned into code?

class Node:
    def __init__(self, element=None, next_node=None):
        self.element = element
        self.next_node = next_node


class Stack:
    def __init__(self):
        self.top = None

    def push(self, element):
        self.top = Node(element, self.top)

    def pop(self):
        element = self.top.element
        self.top = self.top.next_node
        return element


s = Stack()
for i in range(20):
    s.push(i)
for i in range(10):
    print(s.pop(), '', end='')
print()

How is that code? Easier to understand?

Of course, normally a Stack would as a minimum have a helper function is_empty() to return if stack is empty. That can be added easily. You see how?

class Node:
    def __init__(self, element=None, next_node=None):
        self.element = element
        self.next_node = next_node


class Stack:
    def __init__(self):
        self.top = None

    def push(self, element):
        self.top = Node(element, self.top)

    def pop(self):
        element = self.top.element
        self.top = self.top.next_node
        return element

    def is_empty(self):
        return self.top == None


s = Stack()
for i in range(20):
    s.push(i)
while not s.is_empty():
    print(s.pop(), '', end='')
print()

Do you get the sense of it now?

Model the code as closely to the reality we humans understand. It makes the code easy to read and understand. Hence, easy to maintain.

3 Steps to Understand and Implement a Queue in Python

What will we cover in this tutorial?

We will cover what a Queue is. A Queue is a data structure used by computers. It resembles a queue as we know it.

The key things a Queue should have are efficient operations for insertion (enqueue) and removal (dequeue) of the queue. In this tutorial we will understand what a is, how to represent it, and how to implement the efficient operations.

Step 1: Understand a Queue

A Queue is like a queue in real life.

A Queue is used in computer science in many scenarios. One scenario is when a resource is shared among multiple consumers, then a Queue is set in front.

This resembles the real world, where we use queues in the grocery store, pharmacy, you name it. In all stores, where we have more consumers than registers (the resource) to serve.

A Queue serves the principle, First-in-first-out or FIFO.

A simple diagram shows the operations of a Queue.

A queue with the two operations enqueue and dequeue

The above queue shows the direction of the Queue and the order they have been added symbolized with the numbers 1 to 8, where 8 is about to be added.

The operations on the Queue are as follows.

  • enqueue Adds an element to the back of the Queue.
  • dequeue Removes the element of the front of the Queue.

Normally, a Queue would also have a function is_empty, which checks whether the Queue is empty.

Step 2: How to represent a Queue item with an element

How do you represent the above items of the Queue?

Well, we need to be able to keep an order of the Queue. Let’s try to draw it and see what we can figure out.

The structure we need to keep.

Surprised? Well, the Queue goest from left to right, while the arrows between the items go from right to left.

Actually, the items of the Queue can be represented by a simple Node class.

class Node:
    def __init__(self, element=None, next_node=None):
        self.element = element
        self.next_node = next_node

This simple Node class has an element and the pointer to next_node. Hence, the next_node are the arrows in the diagram above. And the element are representing the numbers in the above diagram. Obviously, the element can contain anything.

Step 3: Create a Queue class to represent the enqueue and dequeue operations

Let’s start in the simple.

Given the Nodes, we need a class Queue with a head and tail to implement the operations enqueue and dequeue

The above suggest that if we have head and tail pointer, we can have what we need to implement a Queue data structure.

The enqueue function has the special case where the Queue is empty, otherwise it will do as follows.

  • enqueue Create a new Node with the element. Point tails next_node at created Node. Then update tail to point at created Node.

Here we can implement it as follows.

class Queue:
    def __init__(self):
        self.head = None
        self.tail = None

    def enqueue(self, element):
        if self.head is None:
            self.head = self.tail = Node(element)
        else:
            node = Node(element)
            self.tail.next_node = node
            self.tail = node

The dequeue is a bit more involved.

  • dequeue Get the element from the Node that head points at. If head Node and tail Node is the same (that is if it is the last Node in the Queue), then point tail and head at None. Otherwise, set the head to point at the the next_node head points at.
class Queue:
    def __init__(self):
        self.head = None
        self.tail = None

    def enqueue(self, element):
        if self.head is None:
            self.head = self.tail = Node(element)
        else:
            node = Node(element)
            self.tail.next_node = node
            self.tail = node

    def dequeue(self):
        element = self.head.element
        if self.tail == self.head:
            self.tail = self.head = None
        else:
            self.head = self.head.next_node
        return element

Notice that the dequeue does fail if the Queue is empty. This is dealt with by creating a is_empty function, which returns whether the Queue is empty.

class Queue:
    def __init__(self):
        self.head = None
        self.tail = None

    def enqueue(self, element):
        if self.head is None:
            self.head = self.tail = Node(element)
        else:
            node = Node(element)
            self.tail.next_node = node
            self.tail = node

    def dequeue(self):
        element = self.head.element
        if self.tail == self.head:
            self.tail = self.head = None
        else:
            self.head = self.head.next_node
        return element

    def is_empty(self):
        return self.head is None

That is it.

The full code including a string representation to enable it to be printed

See the full code below. It also contains the __str__(…) function, witch enables it to be printed. Also, how to use it.

class Queue:
    def __init__(self):
        self.head = None
        self.tail = None

    def enqueue(self, element):
        if self.head is None:
            self.head = self.tail = Node(element)
        else:
            node = Node(element)
            self.tail.next_node = node
            self.tail = node

    def dequeue(self):
        element = self.head.element
        if self.tail == self.head:
            self.tail = self.head = None
        else:
            self.head = self.head.next_node
        return element

    def is_empty(self):
        return self.head is None

    def __str__(self):
        if self.head is None:
            return None
        node = self.head
        result = "["
        while node is not None:
            result += str(node.element) + " "
            node = node.next_node
        return result[:-1] + "]"


q = Queue()
for i in range(10):
    q.enqueue(i)
while not q.is_empty():
    print(q.dequeue(), '', end='')

The above code will result in the following output.

[0 1 2 3 4 5 6 7 8 9]
0 1 2 3 4 5 6 7 8 9

3 Steps to Understand and Implement a Stack in Python

What will we cover in this tutorial?

We will learn what a Stack is and how to implement it in Python.

But why bother, when you can use a Python list as a Stack? Good question. If you understand how a Stack works, you will know when to use it to efficiently solve problems. The best way to learn something, is by trying to implement it.

Why use a Stack? Check out this tutorial on how to Solve a Maze or Find the Nearest Smallest Element Problem.

Step 1: Understand Stack

A Stack in computer science is like a stack of plates. You can add one on top at the time. Or you can remove the top plate one by one.

Stack with 4 items on it, then a new item (5) is added by push operation, then the last item is removed (5) by pop.

A stack can have any number of items on it. In the above illustration you see the following.

  1. (left) A stack with 4 elements (1, 2, 3, 4).
  2. (mid) A stack of 4 elements where a fifth element is added by a push operation.
  3. (right) A stack that had 5 elements, but where the last item (5) is removed by a pop operation.

That is a Stack has two vital operations push and pop. Also, these operations should be done in constant time, O(1). That is they should be done with a performance, which is independent of the size of the stack. Say, if the Stack has 2 elements, then a push (or a pop) takes the same time to execute if the Stack has 2,000,000 elements.

Notice the following with a Stack. The last item you add on the Stack is the first one to get removed. That is Last-in-first-out or LIFO.

Step 2: How to represent a Stack item (Node) in Python code

As always, keep things simple.

How can we represent the above efficiently, such that the operations are not becoming expensive.

We only need to keep track of two things.

  1. The item added before. Starting with the first item to be added to point at None.
  2. A pointer to the top of the Stack (not in diagram above).

If we have that we can create a Stack. We can make the operations as follows.

  • push – Find the top item of the stack using pointer from 2. Then add the new item and let it point at top item. Let stack pointer point at new item.
  • pop – Remove top item. Let stack pointer point at item pointed by top item.

Hence, we need a way to represent a item. Such an item is called a Node. This Node should have two things.

  1. The element it represents.
  2. A pointer to the next node.

This can be turned into Python code like this.

class Node:
    def __init__(self, element=None, next_node=None):
        self.element = element
        self.next_node = next_node

This class can take the element and next_node pointer as optional arguments as constructed.

Step 3: Creating the stack class to represent the operations push and pop

Let’s start by creating a new class Stack.

class Stack:
    def __init__(self):
        self.stack = None

This creates an empty Stack, where we have the pointer self.stack to point at None.

To add an element can be done by adding a new Node and let it point at the current self.stack.

class Stack:
    def __init__(self):
        self.stack = None

    def push(self, element):
        node = Node(element, self.stack)
        self.stack = node

This works for the initial element push’ed on the Stack and for any element push’ed afterwards.

Now we need to pop the top element of the Stack.

class Stack:
    def __init__(self):
        self.stack = None

    def push(self, element):
        node = Node(element, self.stack)
        self.stack = node

    def pop(self):
        element = self.stack.element
        self.stack = self.stack.next_node
        return element

We need to get the element. Update the self.stack pointer to point to the element below.

Notice, we do not handle the case where the stack is empty, this will cause the pop function to fail. This is handled by adding a a is_empty function, to check whether the Stack is empty.

class Stack:
    def __init__(self):
        self.stack = None

    def push(self, element):
        node = Node(element, self.stack)
        self.stack = node

    def pop(self):
        element = self.stack.element
        self.stack = self.stack.next_node
        return element

    def is_empty(self):
        return self.stack == None

The full code with a print function

Here the full code is provided with a way to represent the Stack as a string. That way you can print it out.

class Node:
    def __init__(self, element=None, next_node=None):
        self.element = element
        self.next_node = next_node


class Stack:
    def __init__(self):
        self.stack = None

    def push(self, element):
        node = Node(element, self.stack)
        self.stack = node

    def pop(self):
        element = self.stack.element
        self.stack = self.stack.next_node
        return element

    def is_empty(self):
        return self.stack == None

    def __str__(self):
        if self.stack is None:
            return None
        node = self.stack
        result = "["
        while node is not None:
            result += str(node.element) + " "
            node = node.next_node
        return result[:-1] + "]"


stack = Stack()
for i in range(10):
    stack.push(i)
print(stack)
for _ in range(8):
    stack.pop()
print(stack)

Where the example code in the end will give the following output.

[9 8 7 6 5 4 3 2 1 0]
[1 0]

Create a Max-Heap with a Randomized Algorithm in Python

What will we cover in this tutorial?

We will create a heap, or more specifically, a max-heap. A max-heap is a tree structure where the node value of every parent is greater or equal to the children.

In this tutorial we will implement a max-heap with a binary tree and use a randomized approach to keep it balanced.

You might be wondering why to make it randomized. Simply, said, to keep it simple and keep operations on average O(log(n)).

Step 1: Recall what a max-heap is

A max-heap is a tree structure where the node value of every parent is greater or equal to the children.

Example of a max-heap

A heap will have two primary functions.

  1. Insert an element and still keep the max-heap structure.
  2. Get and remove maximum element (the element at the root) and still keep the max-heap structure.

The goal is to be able to do these operations in O(log(n)) time.

Step 2: Understand what a randomized algorithm can do for you

Randomized algorithms help you achieve great performance on average while keeping the algorithms simple.

To get keep the operations of a heap at worst-case O(log(n)), you need to keep the binary tree structure balanced. This requires complex ways to ensure that.

Instead, just put in the leaves in the three randomly, and you will get the same result with very high probability. Hence, you will end up with an average time of O(log(n)) for the operations.

Step 3: Insert into a max-heap

Now to the fun part. The code.

Let’s start simple and create a Node to represent the nodes in the binary tree, which will represent the max-heap.

class Node:
    def __init__(self, element):
        self.element = element
        self.left = None
        self.right = None

The node needs to be able to keep the element (which should be comparable), and a left and right child.

From the above Node class you can create an arbitrary binary tree.

The max-heap insert function can be implemented by a recursive and randomized approach in the following manner.

import random


class Heap:
    def __init__(self):
        self.head = None

    def _insert(self, element, node):
        # if element is larger then node.element, switch
        if element > node.element:
            element, node.element = node.element, element

        # check if available node
        if node.left is None:
            node.left = Node(element)
            return
        if node.right is None:
            node.right = Node(element)
            return

        # Choose a random node (here is the randomness hidden)
        if random.randint(0, 1) == 0:
            self._insert(element, node.left)
        else:
            self._insert(element, node.right)

    def insert(self, element):
        if self.head is None:
            self.head = Node(element)
        else:
            self._insert(element, self.head)

The function insert(…) checks for the special case, if there are no nodes in the binary tree so far and inserts if so. Otherwise, it will forward the call to the recursive and randomized function _insert(….), which also takes the head (root) of the tree as argument.

A recursive function in this case could be at any node, but starting from the head (root) node. It will do the same all the way down.

  1. Check if element of node is smaller than element to insert. If so, switch them.
  2. Check if node has a free child (left or right). If so, use it to insert new node with element and return.
  3. If none of the above, choose a random child (left or right) and call recursive down.

That is it. It will most likely create a well balanced binary tree.

See example here.

               +---------------36---------------+               
       +-------29-------+              +-------34-------+       
   +---27---+      +---20---+      +---32---+      +---33---+   
 +- 3-+  +-13-+  +- 6-+  +- 2-+  +-24-+  +-31-+  +-25-+  +-25-+ 
  0   1           4              16               6             

In simple ascii representation of the binary tree representing the max-heap. That the binary tree keeps a balance like that ensures that the insertion will be O(log(n)) on average.

Step 4: Delete the maximum element from the heap (and return it)

Deleting the maximum element will remove the root (head) of the binary tree. Then we need to take the larger child and move it up. That obviously makes an empty space in the child below. Hence, we need to do the same operation below.

This sounds recursive, doesn’t it?

import random


class Heap:
    def __init__(self):
        self.head = None

    def _insert(self, element, node):
        # if element is larger than node.element, switch
        if element > node.element:
            element, node.element = node.element, element

        # check if available node
        if node.left is None:
            node.left = Node(element)
            return
        if node.right is None:
            node.right = Node(element)
            return

        if random.randint(0, 1) == 0:
            self._insert(element, node.left)
        else:
            self._insert(element, node.right)

    def insert(self, element):
        if self.head is None:
            self.head = Node(element)
        else:
            self._insert(element, self.head)

    def get_max(self):
        return self.head.element

    def _delete_max(self, node):
        if node.left is None and node.right is None:
            return None

        if node.left is None:
            return node.right

        if node.right is None:
            return node.left

        if node.right.element > node.left.element:
            node.element = node.right.element
            node.right = self._delete_max(node.right)
            return node
        else:
            node.element = node.left.element
            node.left = self._delete_max(node.left)
            return node

    def delete_max(self):
        if self.head is None:
            return None

        max_element = self.head.element
        self.head = self._delete_max(self.head)
        return max_element

The delete_max function takes care of the special case where there are no elements (or nodes) in the binary tree. Then it takes the largest element and calls the recursive _delete_max(…) function with the head (root) as argument.

The _delete_max(…) function does the following.

  1. Checks for special case where node has no children. If so, return None.
  2. Check if one child is not there, if so return the existing child.
  3. Otherwise, take the child with the larger element. Take the larger element and assign it to node (remember, we have removed the element form the calling node), and call recursive down with larger child on _delete_max(…) and assign result to larger child node.

That can be a bit confusing at first. But try it out.

This operation also only has O(log(n)) performance on average. And as elements are put randomly, then removing them in order (maximum elements), will remove elements randomly and keep the binary tree balanced on average case.

Step 5: The full code and a simple print function of the tree

The full code can be found here.

import random


class Node:
    def __init__(self, element):
        self.element = element
        self.left = None
        self.right = None


class Heap:
    def __init__(self):
        self.head = None

    def _insert(self, element, node):
        # if element is larger than node.element, switch
        if element > node.element:
            element, node.element = node.element, element

        # check if available node
        if node.left is None:
            node.left = Node(element)
            return
        if node.right is None:
            node.right = Node(element)
            return

        if random.randint(0, 1) == 0:
            self._insert(element, node.left)
        else:
            self._insert(element, node.right)

    def insert(self, element):
        if self.head is None:
            self.head = Node(element)
        else:
            self._insert(element, self.head)

    def get_max(self):
        return self.head.element

    def _delete_max(self, node):
        if node.left is None and node.right is None:
            return None

        if node.left is None:
            return node.right

        if node.right is None:
            return node.left

        if node.right.element > node.left.element:
            node.element = node.right.element
            node.right = self._delete_max(node.right)
            return node
        else:
            node.element = node.left.element
            node.left = self._delete_max(node.left)
            return node

    def delete_max(self):
        if self.head is None:
            return None

        max_element = self.head.element
        self.head = self._delete_max(self.head)
        return max_element

    def _get_depth(self, node):
        if node is None:
            return 0
        left = self._get_depth(node.left)
        right = self._get_depth(node.right)
        if left > right:
            return 1 + left
        else:
            return 1 + right

    def get_depth(self):
        return self._get_depth(self.head)

    def _print_heap(self, current_level, request_level, depth, node):
        characters_per_level = 4*2**depth
        characters_per_node = characters_per_level // (2**(current_level + 1))
        if current_level == request_level:
            if node is not None:
                space_fill = characters_per_node // 4 - 1
                if request_level == depth - 1:
                    print(' '*space_fill + ' ' + ' '*space_fill + f'{node.element:2d}' + ' '*space_fill + ' ' + ' '*space_fill, end='')
                else:
                    print(' '*space_fill + '+' + '-'*space_fill + f'{node.element:2d}' + '-'*space_fill + '+' + ' '*space_fill, end='')
            else:
                print(' '*characters_per_node, end='')
        else:
            if node is not None:
                self._print_heap(current_level + 1, request_level, depth, node.left)
                self._print_heap(current_level + 1, request_level, depth, node.right)
            else:
                self._print_heap(current_level + 1, request_level, depth, None)
                self._print_heap(current_level + 1, request_level, depth, None)

    def print_heap(self):
        depth = self._get_depth(self.head)
        for i in range(depth):
            self._print_heap(0, i, depth, self.head)
            print()

Notice that the print function also is recursive.

Solve a Maze in Python with a Stack

What will we cover in this tutorial?

Given an input of a maze, how can we solve it?

That is – given an input like this the following.

#############
S #     #   #
# #  ##   # #
# #  ## ### #
#       #   #
######### # #
#         # E
#############

How can we find a path through the maze starting from S and ending at E.

#############
S.#.....#...#
#.#. ##...#.#
#.#. ## ###.#
#...    #  .#
######### #.#
#         #.E
#############

We will create a program that can do that in Python.

Step 1: Reading and representing the Maze

An good representation of the maze makes it easier to understand the code. Hence, the first part is to figure out how to represent the maze.

To clarify, imagine that this maze.

v

Was represented by one long string, like this.

maze = "#############S #     #   ## #  ##   # ## #  ## ### ##       #   ########## # ##         # E#############"

While this is possible, if you also know the dimensions (8 rows and 13 columns), it makes things difficult to navigate in. Say, you are somewhere in the maze, and you want to see what is right above you. How do you do that?

Yes, you can do it, but it is complex.

Now image you have the following representation.

maze = [['#', '#', '#', '#', '#', '#', '#', '#', '#', '#', '#', '#', '#'],
        ['S', ' ', '#', ' ', ' ', ' ', ' ', ' ', '#', ' ', ' ', ' ', '#'],
        ['#', ' ', '#', ' ', ' ', '#', '#', ' ', ' ', ' ', '#', ' ', '#'],
        ['#', ' ', '#', ' ', ' ', '#', '#', ' ', '#', '#', '#', ' ', '#'],
        ['#', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '#', ' ', ' ', ' ', '#'],
        ['#', '#', '#', '#', '#', '#', '#', '#', '#', ' ', '#', ' ', '#'],
        ['#', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '#', ' ', 'E'],
        ['#', '#', '#', '#', '#', '#', '#', '#', '#', '#', '#', '#', '#']]

This will make it easier to navigate in the maze.

Then maze[1][0] represents row 1 (2nd row) and column 0 (1st column), which is S.

If we assume that the maze is represented in a file, then we can read that file and convert it with the following code.

def load_maze(file_name):
    f = open(file_name)
    maze = f.read()
    f.close()
    return maze

def convert_maze(maze):
    converted_maze = []
    lines = maze.splitlines()
    for line in lines:
        converted_maze.append(list(line))
    return converted_maze


maze = load_maze("maze.txt")
maze = convert_maze(maze)
print(maze)

Step 2: Creating a print function of the maze

As you probably noticed, the print statement of the maze (print(maze)) did not do a very good job.

[['#', '#', '#', '#', '#', '#', '#', '#', '#', '#', '#', '#', '#'], ['S', ' ', '#', ' ', ' ', ' ', ' ', ' ', '#', ' ', ' ', ' ', '#'], ['#', ' ', '#', ' ', ' ', '#', '#', ' ', ' ', ' ', '#', ' ', '#'], ['#', ' ', '#', ' ', ' ', '#', '#', ' ', '#', '#', '#', ' ', '#'], ['#', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '#', ' ', ' ', ' ', '#'], ['#', '#', '#', '#', '#', '#', '#', '#', '#', ' ', '#', ' ', '#'], ['#', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '#', ' ', 'E'], ['#', '#', '#', '#', '#', '#', '#', '#', '#', '#', '#', '#', '#']]

In order to be able to follow what happens in the code later on, it would be nice with a better representation of the code.

def print_maze(maze):
    for row in maze:
        for item in row:
            print(item, end='')
        print()


print_maze(maze)

This will print the maze in a more human readable way.

#############
S #     #   #
# #  ##   # #
# #  ## ### #
#       #   #
######### # #
#         # E
#############

Step 3: Finding the starting point of the maze

First thing first. We need to find where we enter the maze. This is where the S is located.

We could have some assumptions here, say, that the starting point is always located on the outside borders of the maze. To make it more general, we will assume that the starting point, S, can be located anywhere in the maze.

This makes the algorithm to find it easy, but of course not effective in worst case, as we need to check all locations in the maze.

def find_start(maze):
    for row in range(len(maze)):
        for col in range(len(maze[0])):
            if maze[row][col] == 'S':
                return row, col


start = find_start(maze)
print(start)

Which will go through all rows and columns one by one and check if it is the starting point. When found, it will return it immediately.

Notice, that we do assume that we have a maze with at least one row, also that the starting point exists. If not, it will not return anything.

Step 4: Find if there is a path in the maze

This is where a stack comes in handy. If you are new to what a stack is, then check out this tutorial.

For the purpose here, we are not looking for performance, we just need the concept of a stack.

To see if there is a way through the maze, you need to check all paths. As some leads to the same points, we need to keep track of where we have been. If we reach a dead end, where we have been, we need to backtrack to the last point with options, and check them out.

Wow. That was a lot of talking. How can we understand that a bit better.

Let’s make a simple maze and see it.

###
S E
###

When you are at start, S, you have (in theory) 4 possible ways to go. Up, Down, Right, Left. In the above example you can only go Right. Let’s do that, and mark the position we just left.

###
[email protected]
###

Now we have marked where we have been with X and where we are with @. Then we can check all possible ways. Up, Down, Right, Left. Again we can only go Right, as Left we have already been there.

Luckily, going Right is the exit E and we are done.

Now that was a simple maze. Now how does a stack come into the picture? Well, we used it secretly and in a very simple case.

So let’s try a bit more complex case.

#####
S # E
#   #
# # #
#####

Now let’s also have a stack with the positions we can visit. We will use the coordinate system as the representation of the list of lists we use to represent the maze. That means that the upper left corner is (0, 0) and the lower right corner is (5, 5) in this case.

The starting point is (1 , 0).

Now, let’s have a stack in the picture and put the staring point on that stack.

#####
S # E
#   #
# # #
#####      stack -> (1, 0)

When we enter we look for points on the stack. Yes, we have one point (1, 0). Then we pop it off and check all possible positions. Up, Down, Right, Left. Only one is possible, Left, which is (1, 1). Updating where we been, then it looks like this.

#####
X # E
#   #
# # #
#####      stack -> (1, 1)

Notice, that the stack changed and we marked the first point with X.

We do the same. Pop of the stack and get (1, 1) and check Up, Down, Right, Left. Only down is possible, which leaves us in (2, 1). Now the picture looks like this.

#####
XX# E
#   #
# # #
#####      stack -> (2, 1)

Then we do the same. We pop of the stack and get (2, 1) and check Up, Down, Right, Left. Now we can go both Down and Right, which is (3, 1) and (2, 2), respectively. Hence, we push both of them on the stack.

#####
XX# E
#X  #
# # #               (3, 1)
#####      stack -> (2, 2)

Depending on the order we put thins on stack, it will look like that.

Now we pop of the top element (3, 1) and check Up, Down, Right, Left. But there are no way to go from there, it is a dead end. Then we mark that spot and do not push anything on the stack.

#####
XX# E
#X  #
#X# #
#####      stack -> (2, 2)

Now we pop (2, 2) of the stack, as this is the first possibility we have to go another way in the maze. Then check Up, Down, Right, Left. We can only go Right. After marking pushing we have.

#####
XX# E
#XX #
#X# #
#####      stack -> (3, 2)

Now we can both go up and down and add that to the stack.

#####
XX# E
#XXX#
#X# #               (3, 2)
#####      stack -> (3, 4)

Now we pop (3, 2) of the stack. Then we check all and see we can only go Right.

#####
XX#XE
#XXX#
#X# #               (4, 2)
#####      stack -> (3, 4)

Then we pop (4, 2) of the stack and see it is the exit, E. That is just what we needed.

The stack keeps track on all the possible choices we have along the way. If we reach a dead end (also, if we have visited all around us and marked it by X), we can look at the last possible choice we skipped, which is on the top of the stack.

That actually means, that we do not need to move backwards along the road we went, we can go directly to the point on the stack. We know we can reach that point, as we have already visited a neighboring point, which is connected. Also, as we mark everything along the way, we do not go in endless loops. We only visit every point at most once.

Now, if there was no way to the exit, E, this will eventually terminate when there is no points on the stack.

Step 5: Implementing the algorithm

The above can be made into a simple algorithm to determine whether there is a way through the maze.

The pseudo code for the algorithm could be something like this.

stack.push(start)
while stack is not empty
    location = stack.pop()

    if location is marked continue to next location in loop

    if location is marked as exit - return True (we found the exit)

    for loc as Up, Down, Left, Right:
        if loc is valid and not marked add to stack

return False (if all locations are tried (stack empty) there is not way to the exit)

Now let’s turn that into code.

# This is only a helper function to see if we have a valid positino.
def is_valid_position(maze, pos_r, pos_c):
    if pos_r < 0 or pos_c < 0:
        return False
    if pos_r >= len(maze) or pos_c >= len(maze[0]):
        return False
    if maze[pos_r][pos_c] in ' E':
        return True
    return False


def solve_maze(maze, start):
    # We use a Python list as a stack - then we have push operations as append, and pop as pop.
    stack = []

    # Add the entry point (as a tuple)
    stack.append(start)

    # Go through the stack as long as there are elements
    while len(stack) > 0:
        pos_r, pos_c = stack.pop()

        if maze[pos_r][pos_c] == 'E':
            print("GOAL")
            return True

        if maze[pos_r][pos_c] == 'X':
            # Already visited
            continue

        # Mark position as visited
        maze[pos_r][pos_c] = 'X'
        # Check for all possible positions and add if possible
        if is_valid_position(maze, pos_r - 1, pos_c):
            stack.append((pos_r - 1, pos_c))
        if is_valid_position(maze, pos_r + 1, pos_c):
            stack.append((pos_r + 1, pos_c))
        if is_valid_position(maze, pos_r, pos_c - 1):
            stack.append((pos_r, pos_c - 1))
        if is_valid_position(maze, pos_r, pos_c + 1):
            stack.append((pos_r, pos_c + 1))

    # We didn't find a path, hence we do not need to return the path
    return False

The full code

Here we present the full code. It prints out the state through the processing. It expects a file called maze.txt with the maze in it. See below for a possible file content.

def load_maze(file_name):
    f = open(file_name)
    maze = f.read()
    f.close()
    return maze


def convert_maze(maze):
    converted_maze = []
    lines = maze.splitlines()
    for line in lines:
        converted_maze.append(list(line))
    return converted_maze


def print_maze(maze):
    for row in maze:
        for item in row:
            print(item, end='')
        print()


def find_start(maze):
    for row in range(len(maze)):
        for col in range(len(maze[0])):
            if maze[row][col] == 'S':
                return row, col



from time import sleep


# This is only a helper function to see if we have a valid positino.
def is_valid_position(maze, pos_r, pos_c):
    if pos_r < 0 or pos_c < 0:
        return False
    if pos_r >= len(maze) or pos_c >= len(maze[0]):
        return False
    if maze[pos_r][pos_c] in ' E':
        return True
    return False


def solve_maze(maze, start):
    # We use a Python list as a stack - then we have push operations as append, and pop as pop.
    stack = []

    # Add the entry point (as a tuple)
    stack.append(start)

    # Go through the stack as long as there are elements
    while len(stack) > 0:
        pos_r, pos_c = stack.pop()

        print("Current position", pos_r, pos_c)

        if maze[pos_r][pos_c] == 'E':
            print("GOAL")
            return True

        if maze[pos_r][pos_c] == 'X':
            # Already visited
            continue

        # Mark position as visited
        maze[pos_r][pos_c] = 'X'
        # Check for all possible positions and add if possible
        if is_valid_position(maze, pos_r - 1, pos_c):
            stack.append((pos_r - 1, pos_c))
        if is_valid_position(maze, pos_r + 1, pos_c):
            stack.append((pos_r + 1, pos_c))
        if is_valid_position(maze, pos_r, pos_c - 1):
            stack.append((pos_r, pos_c - 1))
        if is_valid_position(maze, pos_r, pos_c + 1):
            stack.append((pos_r, pos_c + 1))


        # To follow the maze
        print('Stack:' , stack)
        print_maze(maze)


    # We didn't find a path, hence we do not need to return the path
    return False


maze = load_maze("maze.txt")
maze = convert_maze(maze)
print_maze(maze)
start = find_start(maze)
print(start)
print(solve_maze(maze, start))

Then for some possible content of file maze.txt needed by the program above. To make the above code work, save the content below in a file called maze.txt in the same folder from where you run the code.

#############
S #     #   #
# #  ##   # #
# #  ## ### #
#       #   #
######### # #
#         # E
#############

Further work

The above code only answers if there is a path from S to E in the maze. It would also be interesting to print a route out. Further improvements could be finding the shortest path.

Queue vs Python list – Comparing the Performance – Can a simple Queue beat the default Python list?

How to profile a program in Python

In this video we will see how cProfile (default Python library) can help you to get run-times from your Python program.

Queue vs Python lists

In this video we will compare the performance of a simple Queue implemented directly into Python (no optimisations) with the default Python list.

Can it compare with it on performance?

This is where time complexity analysis come into the picture. A Queue insert and deletion is O(1) time complexity. A Python list used as a queue has O(n) time complexity.

But does the performance and run-time show the same? Here we compare the run-time by using cProfile in Python.

Want to learn more about Linked-lists, Stacks and Queues?

Check out my Course on Linked Lists, Stacks and Queues

Find the Nearest Smaller Element on Left Side in an Array – Understand the Challenge to Solve it Efficiently

The Nearest Smaller Element problem explained:

Given an array (that is a list) of integers, for each element find all the nearest element smaller on the left side of it.

The naive solution has time complexity O(n^2). Can you solve it in O(n)? Well, you need to have a Stack to do that.

The naive solution is for each element to check all the elements on the left of it, to find the first one which is smaller.

The worst case run time for that would be O(n^2). For an array of length n, it would take: 0 + 1 + 2 + 3 + … + (n-1) comparisons. = (n-1)*n/2 = O(n^2) comparisons.

But with a stack we can improve that.

Want to learn more about Stacks?

Check out my Course on Linked Lists, Stacks and Queues

How to Implement a Stack in Python and Check the Run-time Performance

We will cover the following in this article

  • What is a Stack – a short introduction
  • How to implement a Stack in Python
  • Investigate the run-time performance

What is a Stack

A Stack is a useful concept that is used in daily life, and hence, a concept that is important to understand and master as a programmer.

To understand Stacks just think of a stack of plates. There are two main operations you can do. First, you can add a plate on top of the stack. That operation is called push adds the element on top of the stack. Second, you can remove the top plate of the stack. That operation is called pop, and returns the top element of the stack.

In the diagram below a Stack is pictured. It contains of a Stack of element on the left side. The operation push of the element 03 is executed and results is pictured on the right side. Notice, that the push operation puts the element on top of the stack.

Below the operation pop is executed. Notice, that the pop operation takes from the top of the stack.

Implementation of a Stack in Python

It is a good idea to have a helper class Node that represents the elements on the stack.

class Node:
    def __init__(self, element=None, next_node=None):
        self.element = element
        self.next_node = next_node

The actual functionality of the Stack is kept in a Stack class.

class Stack:
    def __init__(self):
        self.stack = None

    def push(self, element):
        self.stack = Node(element, self.stack)

    def pop(self):
        element = self.stack.element
        self.stack = self.stack.next_node
        return element

    def is_empty(self):
        return self.stack is None

Now you can use your stack. Like the example below.

s = Stack()
for i in range(5):
    s.push(i)
while not s.is_empty():
    print(s.pop())

Will give the following output.

4
3
2
1
0

Notice the order of the element being removed from the stack by pop.

Run-time Performance

If we look at how the stack perform in order of the data size. To investigate the run-time performance the cProfile library is a good choice and simple to use. The following piece of code will help you investigate the performance.

import cProfile

def profile_stack(n):
    s = Stack()
    for i in range(n):
        s.push(i)
    while not s.is_empty():
        s.pop()


cProfile.run("profile_stack(100000)")

See the following graph.

As you see, the push and pop operations are constant, O(1), resulting in a linear performance of n push and pop operations as in the above experiment.

Comparing Performance of Python list as a Stack – How a wrong implementation can ruin performance

A Stack?

A Stack is using the principle first-in-last-out.

It is like a stack of plates. The last one you put on the top is the first one you take.

How can you implement them in Python? Well, we are in luck, you can use a Stack, and if done correctly, you will have the same performance as an actual Stack implementation will have.

But first, how can you do it wrong?

Well, you might think that the first element of the list is the top of your stack, hence in you will insert the elements on the first position, and, hence, remove them from the first position as well.

# Create a list as a stack
s = []

# Insert into the first position.
element = 7
s.insert(0, element)

# Remove from the first position.
s.pop(0)

Sounds about right?

Let’s test that and compare it with a different approach. To add the newest element to the end of the list, and, hence, remove them from the end of the list.

# Create a list and use it as stack
s = []

# Insert element in last postion
element = 7
s.append(element)

# Remove from the last position
s.pop()

Let’s check the performance of those two approaches.

Comparing the performance of the two approaches

How do you compare. You can use cProfile library. It is easy to use and informative results

See the sample code below, which compares the two approaches by create a stack each and inserting n elements to it and removing them afterwards.

import cProfile


def profile_list_as_queue_wrong(n):
    s = []
    for i in range(n):
        s.insert(0, i)
    while len(s) > 0:
        s.pop(0)


def profile_list_as_queue_correct(n):
    s = []
    for i in range(n):
        s.append(i)
    while len(s) > 0:
        s.pop()


def profile(n):
    profile_list_as_queue_wrong(n)
    profile_list_as_queue_correct(n)


cProfile.run("profile(100000)")

The results are given here.

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    5.842    5.842 <string>:1(<module>)
        1    0.078    0.078    0.107    0.107 Stack.py:12(profile_list_as_queue_correct)
        1    0.000    0.000    5.842    5.842 Stack.py:20(profile)
        1    0.225    0.225    5.735    5.735 Stack.py:4(profile_list_as_queue_wrong)
   200002    0.017    0.000    0.017    0.000 {len}
   100000    0.007    0.000    0.007    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   100000    3.547    0.000    3.547    0.000 {method 'insert' of 'list' objects}
   200000    1.954    0.000    1.954    0.000 {method 'pop' of 'list' objects}
        2    0.014    0.007    0.014    0.007 {range}

Observe that the “wrong” implementation takes over 5 seconds and the “correct” takes approximately 0.1 second. Over a factor 50 in difference.

Looking into the details

If we look at the complexities given by Python, it explains it all.

The Python lists amortised complexities are given on this page.

And you notice that the append and pop (last element) are O(1), which means constant time. Constant time, means that the operations are independent on the size of the lists. That means the correct implementation gives O(n) time complexity.

On the other hand, the insert and pop(0) have linear performance. That basically means that we with the wrong implementation end up with O(n^2) time complexity.

How to Implement a Queue in Python and Compare Performance with a Python list

What will we cover in this article

  • What is a Queue?
  • Implement a Queue in Python
  • Make performance testing of it
  • Compare it with performance of a Python list

What is a Queue?

We all know what a queue is. You go to the grocery store and get spinach, strawberry and bananas for your shake. Then you see a long line of people in front of the register. That line is a queue.

The same holds in programming. You create queues to process data or input of any kind.

How to implement a Queue in Python

It is easier than you think.

First you create a Node class to represent each node in a queue. A node is an abstraction to represent a point to the next node and the actual element.

class Node:
    def __init__(self, element=None, next_node=None):
        self.element = element
        self.next_node = next_node


Then you create the class for the Queue.

class Queue:
    def __init__(self):
        self.head = None
        self.tail = None

    def enqueue(self, element):
        if self.head is None:
            self.head = self.tail = Node(element)
        else:
            n = Node(element, self.tail)
            self.tail.next_node = n
            self.tail = n

    def dequeue(self):
        element = self.head.element
        if self.tail == self.head:
            self.tail = self.head = None
        else:
            self.head = self.head.next_node
        return element

    def is_empty(self):
        return self.head is None

How does it work. Let’s make a simple example.

q = Queue()
for i in range(10):
    q.enqueue(i)

while not q.is_empty():
    print(q.dequeue())

Which will output.

0
1
2
3
4
5
6
7
8
9

Yes! You guessed it.

How do we test performance?

I like to use the cProfile library. It is easy to use and gives informative results.

So how do you test performance? You simply import the cProfile library and use the cProfile.run(…) call.

You also need to do some operations to see how your Queue performs. See the code as an example.

import cProfile


def profile_queue(n):
    q = Queue()
    for i in range(n):
        q.enqueue(i)
    while not q.is_empty():
        q.dequeue()


def profile(n):
    profile_queue(n)


cProfile.run("profile(100000)")

Which will result in the following output.

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.819    0.819 <string>:1(<module>)
   100000    0.310    0.000    0.351    0.000 Queue.py:11(enqueue)
   100000    0.308    0.000    0.308    0.000 Queue.py:19(dequeue)
   100000    0.041    0.000    0.041    0.000 Queue.py:2(__init__)
   100001    0.021    0.000    0.021    0.000 Queue.py:27(is_empty)
        1    0.132    0.132    0.819    0.819 Queue.py:34(profile_queue)
        1    0.000    0.000    0.819    0.819 Queue.py:42(profile)
        1    0.000    0.000    0.000    0.000 Queue.py:7(__init__)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.008    0.008    0.008    0.008 {range}

The interesting line is line 9, which tells us how much time is spend in the call to profile_queue.

But is the result good?

We need to compare it to other implementations.

Performance testing the Queue with a Python list

Python lists are used for anything. Can we use a Python list as a Queue. Of course. Let’s try to implement that and compare it to our Queue.

import cProfile


def profile_queue(n):
    q = Queue()
    for i in range(n):
        q.enqueue(i)
    while not q.is_empty():
        q.dequeue()


def profile_list_as_queue(n):
    q = []
    for i in range(n):
        q.insert(0,i)
    while len(q) > 0:
        q.pop()


def profile(n):
    profile_queue(n)
    profile_list_as_queue(n)


cProfile.run("profile(100000)")

How does that compare? Let’s see.

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    3.680    3.680 <string>:1(<module>)
   100000    0.295    0.000    0.331    0.000 Queue.py:11(enqueue)
   100000    0.298    0.000    0.298    0.000 Queue.py:19(dequeue)
   100000    0.036    0.000    0.036    0.000 Queue.py:2(__init__)
   100001    0.019    0.000    0.019    0.000 Queue.py:27(is_empty)
        1    0.104    0.104    0.756    0.756 Queue.py:34(profile_queue)
        1    0.101    0.101    2.924    2.924 Queue.py:42(profile_list_as_queue)
        1    0.000    0.000    3.680    3.680 Queue.py:50(profile)
        1    0.000    0.000    0.000    0.000 Queue.py:7(__init__)
   100001    0.005    0.000    0.005    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   100000    2.805    0.000    2.805    0.000 {method 'insert' of 'list' objects}
   100000    0.012    0.000    0.012    0.000 {method 'pop' of 'list' objects}
        2    0.004    0.002    0.004    0.002 {range}

Wow. Our Queue is way faster than the Python list.

But how is it comparing in general?

Comparing the performance of the Queue and a Python list as a Queue.

While it is difficult to see, the performance of the Queue is O(n) (linear) while the performance of the Python list as a Queue is O(n^2).

Hence, the Queue will outperform the Python list for this use case.