Deleting Elements of a Python List while Iterating

What will we cover in this tutorial?

  • Understand the challenge with deleting elements while iterating over a Python list.
  • How to delete element from a Python list while iterating over it.

Step 1: What happens when you just delete elements from a Python list while iterating over it?

Let’s first try this simple example to understand the challenge of deleting element in a Python list while iterating over it.

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
for e in a:
  a.remove(e)
print(a)

Now, looking at this piece of code, it would seem to be intended to delete all elements. But that is not happening. See, the output is.

[1, 3, 5, 7, 9]

Seems like every second element is deleted. Right?

Let’s try to understand that. When we enter the the loop we see the following view.

for e (= 0, first element) in a (= [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]):
  a.remove(e)

Then the first element is removed on the second line, then the view is.

for e (= 0, first element) in a (= [1, 2, 3, 4, 5, 6, 7, 8, 9]):
  a.remove(e) (a = [1, 2, 3, 4, 5, 6, 7, 8, 9])

Going into the second iteration it looks like this.

for e (= 2, second element) in a (= [1, 2, 3, 4, 5, 6, 7, 8, 9]):
  a.remove(e)

Hence, we see that the iterator takes the second element, which now is the number 2.

This explains why the every second number is deleted from the list.

Step 2: What if we use index instead

Good idea. Let’s see what happens.

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
for i, e in enumerate(a):
  a.pop(i)
print(a)

Which results in the same.

[1, 3, 5, 7, 9]

What if we iterate directly over the index by using the length of the list.

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
for i in range(len(a)):
  a.pop(i)
print(a)

Oh, no.

Traceback (most recent call last):
  File "main.py", line 3, in <module>
    a.pop(i)
IndexError: pop index out of range

I get it. It is because the len(a) is invoked in the first iteration and results to 10. Then when we reach i = 5, we have already pop’ed 5 elements and have only 5 elements left. Hence, out of bound.

Not convinced?

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
for i in range(len(a)):
  print(i, len(a), a)
  a.pop(i)
print(a)

Resulting to.

0 10 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
1 9 [1, 2, 3, 4, 5, 6, 7, 8, 9]
2 8 [1, 3, 4, 5, 6, 7, 8, 9]
3 7 [1, 3, 5, 6, 7, 8, 9]
4 6 [1, 3, 5, 7, 8, 9]
5 5 [1, 3, 5, 7, 9]
Traceback (most recent call last):
  File "main.py", line 4, in <module>
    a.pop(i)
IndexError: pop index out of range

But what to do?

Step 3: How to delete elements while iterating over a list

The problem we want to solve is not to delete all the element. It is to delete entries based on their values or some conditions, where we need to interpret the values of the elements.

How can we do that?

By using list comprehension or by making a copy. Or is it the same, as list comprehension is creating a new copy, right?

Okay, one step at the time. Just see the following example.

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
a = [i for i in a if i % 2 == 0]
print(a)

Resulting in a copy of the the original list with only the even elements.

[0, 2, 4, 6, 8]

To see it is a copy you can evaluate the following code.

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = a
a = [i for i in a if i % 2 == 0]
print(a)
print(b)

Resulting in the following, where you see the variable a get’s a new copy of it and the variable b refers to the original (and unmodified version).

[0, 2, 4, 6, 8]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Hence, the effect of the list comprehension construction above is as the following code shows.

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# a = [i for i in a if i % 2 == 0]
c = []
for i in a:
    if i % 2 == 0:
        c.append(i)
a = c
print(a)

Getting the what you want.

[0, 2, 4, 6, 8]

Next steps

You can make the criteria more advanced by making the criteria by a function call.

def criteria(v):
  # some advanced code that returns True of False

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
a = [i for i in a if criteria(i)]

And if you want to keep a state of all previous criteria, then you can even use an Object to keep that stored.

class State:
  # ...
  def criteria(self, v):
    # ...

s = State()
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
a = [i for i in a if s.criteria(i)]

Also, check out this tutorial that makes some observations on performance on list comprehensions.

Comparing Performance of Python list as a Stack – How a wrong implementation can ruin performance

A Stack?

A Stack is using the principle first-in-last-out.

It is like a stack of plates. The last one you put on the top is the first one you take.

How can you implement them in Python? Well, we are in luck, you can use a Stack, and if done correctly, you will have the same performance as an actual Stack implementation will have.

But first, how can you do it wrong?

Well, you might think that the first element of the list is the top of your stack, hence in you will insert the elements on the first position, and, hence, remove them from the first position as well.

# Create a list as a stack
s = []

# Insert into the first position.
element = 7
s.insert(0, element)

# Remove from the first position.
s.pop(0)

Sounds about right?

Let’s test that and compare it with a different approach. To add the newest element to the end of the list, and, hence, remove them from the end of the list.

# Create a list and use it as stack
s = []

# Insert element in last postion
element = 7
s.append(element)

# Remove from the last position
s.pop()

Let’s check the performance of those two approaches.

Comparing the performance of the two approaches

How do you compare. You can use cProfile library. It is easy to use and informative results

See the sample code below, which compares the two approaches by create a stack each and inserting n elements to it and removing them afterwards.

import cProfile


def profile_list_as_queue_wrong(n):
    s = []
    for i in range(n):
        s.insert(0, i)
    while len(s) > 0:
        s.pop(0)


def profile_list_as_queue_correct(n):
    s = []
    for i in range(n):
        s.append(i)
    while len(s) > 0:
        s.pop()


def profile(n):
    profile_list_as_queue_wrong(n)
    profile_list_as_queue_correct(n)


cProfile.run("profile(100000)")

The results are given here.

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    5.842    5.842 <string>:1(<module>)
        1    0.078    0.078    0.107    0.107 Stack.py:12(profile_list_as_queue_correct)
        1    0.000    0.000    5.842    5.842 Stack.py:20(profile)
        1    0.225    0.225    5.735    5.735 Stack.py:4(profile_list_as_queue_wrong)
   200002    0.017    0.000    0.017    0.000 {len}
   100000    0.007    0.000    0.007    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   100000    3.547    0.000    3.547    0.000 {method 'insert' of 'list' objects}
   200000    1.954    0.000    1.954    0.000 {method 'pop' of 'list' objects}
        2    0.014    0.007    0.014    0.007 {range}

Observe that the “wrong” implementation takes over 5 seconds and the “correct” takes approximately 0.1 second. Over a factor 50 in difference.

Looking into the details

If we look at the complexities given by Python, it explains it all.

The Python lists amortised complexities are given on this page.

And you notice that the append and pop (last element) are O(1), which means constant time. Constant time, means that the operations are independent on the size of the lists. That means the correct implementation gives O(n) time complexity.

On the other hand, the insert and pop(0) have linear performance. That basically means that we with the wrong implementation end up with O(n^2) time complexity.