Sort a Python List of Strings of Integers or a Mixture

What will we cover in this tutorial?

  • How can you sort a list of strings containing integers by the integer value?
  • Or what if it contains both strings containing integers and integers?
  • Finally, how do you sort when the integer is only a substring of each string?

Why sorting a list of integers represented as strings fails

First off, we need to understand why this cannot be solved by just calling sort on the list.

Let’s just try with an example.

l = ['4', '8', '12', '23', '4']
l.sort()
print(l)

Which will result in the following list.

['12', '23', '4', '4', '8']

As you can see, the list is sorted in lexicographical order and not by the numeric value the strings represent.
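If you want to convince yourself, here is a small sketch that compares the same values as text and as numbers. The names as_text and as_numbers are only used for illustration.

```python
# Strings are compared character by character, so '1' < '4' decides
# the order before the second digit of '12' is ever looked at.
print('12' < '4')   # True: compared as text
print(12 < 4)       # False: compared as numbers

as_text = sorted(['4', '8', '12', '23', '4'])
as_numbers = sorted([4, 8, 12, 23, 4])

print(as_text)      # ['12', '23', '4', '4', '8']
print(as_numbers)   # [4, 4, 8, 12, 23]
```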

How to solve this

Solving this is quite straightforward if you know your way around Python. If you look in the documentation, you will see that sort takes a key argument. But if you are new to this, what does that mean?

key specifies a function of one argument that is used to extract a comparison key from each list element

Python docs.

Still not comfortable with it? Let’s try to figure it out together. If you are new to Python, you might not know that you can pass functions as arguments like any other value.

The key argument is a function that is applied to every item in the list. The outputs of that function are compared to decide the order, while the list itself keeps the original items.
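To see the idea in isolation, here is a minimal sketch that uses the built-in len as the key. The list of words is made up for this illustration and has nothing to do with the integer problem yet.

```python
words = ['banana', 'fig', 'pear']

# len is passed as a value (no parentheses); sort calls it on each item.
keys = [len(w) for w in words]   # the values that get compared
words.sort(key=len)

print(keys)    # [6, 3, 4]
print(words)   # ['fig', 'pear', 'banana']
```

Notice that the list still contains the original strings. The key is only used to decide the order.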

That is great news. Why?

I am glad you asked. If we just pass the int function as the key, it converts each string to an integer, that integer is used for the comparison, and our problem is solved.

Let’s try.

l = ['4', '8', '12', '23', '4']
l.sort(key=int)
print(l)

This results in the following list.

['4', '4', '8', '12', '23']

How simple is that?

What if my list is a mixture of integers and strings of integers?

What is your wild guess?

l = ['4', '8', 12, '23', 4]
l.sort(key=int)
print(l)

Notice that some of the integers are not strings any more. Let’s see the output.

['4', 4, '8', 12, '23']

It works, because int() returns an integer unchanged and converts a string of digits to an integer. This is why we love Python!
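It is worth seeing what the key saves us from. The sketch below, using the illustrative names mixed, failed and result, tries to sort the same list without a key first. In Python 3 a string and an integer cannot be ordered against each other, so that attempt raises a TypeError.

```python
mixed = ['4', '8', 12, '23', 4]

try:
    sorted(mixed)            # no key: str and int cannot be compared
    failed = False
except TypeError as err:
    failed = True
    print('TypeError:', err)

# With key=int every item is compared by its integer value.
result = sorted(mixed, key=int)
print(result)   # ['4', 4, '8', 12, '23']
```

The string '4' ends up before the integer 4 because Python's sort is stable: items with equal keys keep their original relative order.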

But what if it is more complex?

A more complex example of sorting

Say we have a list of strings like this one.

l = ['4 dollars', '8 dollars', '12 dollars', '23 dollars', '4 dollars']

The story is something like this. You ask a lot of providers how much they charge for a specific service. The answers are given in the list and you want to go through them starting with the lowest price.

We can just do the same, right?

l = ['4 dollars', '8 dollars', '12 dollars', '23 dollars', '4 dollars']
l.sort(key=int)
print(l)

Wrong!

Traceback (most recent call last):
  File "main.py", line 2, in <module>
    l.sort(key=int)
ValueError: invalid literal for int() with base 10: '4 dollars'

The string is not just an integer. It contains more information.

Luckily, we can pass any function as the key. Let’s try to create one.

def comp(o):
  return int(o.split()[0])

l = ['4 dollars', '8 dollars', '12 dollars', '23 dollars', '4 dollars']
l.sort(key=comp)
print(l)

And the output is as desired.

['4 dollars', '4 dollars', '8 dollars', '12 dollars', '23 dollars']

Too fast? Let’s just analyse our function comp. It contains only one return statement. Try to read it from inside out.

o.split() splits the string on whitespace into a list of words. Hence, the call '4 dollars'.split() will result in ['4', 'dollars'].

Then o.split()[0] will return the first item of that list, i.e. '4'.

Finally, we cast it to an integer by int(o.split()[0]).

Remember that the comparison is done by the output of the function, that is what the function returns, which in this case is the integer represented by the first item in the string.
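The comp function assumes that the integer is the first word of the string. If it can appear anywhere in the string, one possible approach is to search for the first run of digits with a regular expression. This is only a sketch: first_int is a hypothetical helper, the offers list is made up, and negative numbers and decimals are not handled.

```python
import re

def first_int(text):
    """Return the first run of digits in text as an int (hypothetical helper)."""
    match = re.search(r'\d+', text)
    if match is None:
        # Fail loudly instead of guessing where such an item should go.
        raise ValueError(f'no integer found in {text!r}')
    return int(match.group())

offers = ['costs 12 dollars', 'only 4 dollars', 'about 23 dollars', '8 dollars']
offers.sort(key=first_int)
print(offers)
# ['only 4 dollars', '8 dollars', 'costs 12 dollars', 'about 23 dollars']
```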

What about lambda?

Lambda? Yes, lambda functions are also a hot subject.

A lambda function is just a compact way to write a simple function that you pass as an argument to another function, such as sort in this case.

Let’s see if we can do that.

l = ['4 dollars', '8 dollars', '12 dollars', '23 dollars', '4 dollars']
l.sort(key=lambda o: int(o.split()[0]))
print(l)

Resulting in the same output.

['4 dollars', '4 dollars', '8 dollars', '12 dollars', '23 dollars']
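If you are unsure whether the lambda really does the same thing as comp, a quick check is to compare the keys both produce. In this sketch the lambda is stored in the name comp_lambda purely so it can be called directly; normally you would just write it inline as above.

```python
def comp(o):
    return int(o.split()[0])

# Same body as comp, written as a lambda expression.
comp_lambda = lambda o: int(o.split()[0])

l = ['4 dollars', '8 dollars', '12 dollars', '23 dollars', '4 dollars']
keys_def = [comp(s) for s in l]
keys_lambda = [comp_lambda(s) for s in l]

print(keys_def)                  # [4, 8, 12, 23, 4]
print(keys_def == keys_lambda)   # True
```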

Do lambda functions feel a bit like magic? We advise you to read this tutorial on the subject.