# Performance comparison of Numba vs Vectorization vs Lambda function with NumPy

## What will we cover in this tutorial?

We will continue our investigation of Numba from the previous tutorial.

Numba is a just-in-time compiler for Python that works amazingly well with NumPy. As we saw in the last tutorial, the built-in vectorization can, depending on the case and the size of the instance, be faster than Numba.
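To illustrate what just-in-time compilation means in practice, the sketch below times the first and the second call of a Numba-compiled function. The function name total and the array size are illustrative choices, not taken from the previous tutorial. The try/except fallback is only an assumption to keep the sketch runnable where Numba is not installed. In that case nothing is compiled and the two timings will simply be similar.

```python
import time
import numpy as np

try:
    from numba import jit
except ImportError:
    # Assumption: fall back to a no-op decorator so the sketch still runs
    def jit(*args, **kwargs):
        return lambda func: func

@jit(nopython=True)
def total(a):
    # Sum all elements with explicit loops (hypothetical example function)
    s = 0.0
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            s += a[i, j]
    return s

x = np.random.rand(200, 200)

start = time.perf_counter()
first = total(x)   # with Numba, this call includes the compilation
first_elapsed = time.perf_counter() - start

start = time.perf_counter()
second = total(x)  # with Numba, this call reuses the compiled code
second_elapsed = time.perf_counter() - start

print("First call  = %s" % first_elapsed)
print("Second call = %s" % second_elapsed)
```

This is the reason the measurements in this tutorial call each Numba function once before the timing starts.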

Here we will explore that further and also see how Numba compares with lambda functions. Lambda functions have the advantage that they can be passed as an argument down to a library that can optimize the performance and not depend on slow Python code.

## Step 1: Example of Vectorization slower than Numba

In the previous tutorial we only investigated an example of vectorization that was faster than Numba. Here we will see that this is not always the case.

```python
import numpy as np
from numba import jit
import time

size = 100
x = np.random.rand(size, size)
y = np.random.rand(size, size)
iterations = 100000

@jit(nopython=True)
def numba(a, b):
    # Explicit element-wise loop, compiled to machine code by Numba
    c = np.zeros(a.shape)
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            c[i, j] = a[i, j] + b[i, j]
    return c

def vectorized(a, b):
    # NumPy's built-in element-wise addition
    return a + b

# We call the function once, to precompile the code
z = numba(x, y)
start = time.time()
for _ in range(iterations):
    z = numba(x, y)
end = time.time()
print("Elapsed (numba, precompiled) = %s" % (end - start))
start = time.time()
for _ in range(iterations):
    z = vectorized(x, y)
end = time.time()
print("Elapsed (vectorized) = %s" % (end - start))
```

Varying the size of the NumPy array, we can compare the performance of the two in the graph below, where it is clear that the vectorized approach is slower.
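The measurements behind such a comparison can be collected with a small helper like the one sketched below. The helper name benchmark, the chosen sizes and the iteration count are assumptions for illustration. Only the vectorized addition is timed here, but the Numba function from the listing above could be passed in the same way.

```python
import time
import numpy as np

def benchmark(func, sizes, iterations=100):
    # Return a dict mapping array size to elapsed seconds for func(x, y)
    results = {}
    for size in sizes:
        x = np.random.rand(size, size)
        y = np.random.rand(size, size)
        func(x, y)  # warm-up call, so a JIT-compiled func is compiled before timing
        start = time.perf_counter()
        for _ in range(iterations):
            func(x, y)
        results[size] = time.perf_counter() - start
    return results

def vectorized(a, b):
    return a + b

timings = benchmark(vectorized, sizes=[10, 50, 100, 200])
for size, elapsed in timings.items():
    print("size = %4d, elapsed (vectorized) = %s" % (size, elapsed))
```

The resulting dictionary can then be plotted or printed as a table, one entry per array size.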

## Step 2: Try a more complex example comparing vectorization and Numba

An if-then-else can be expressed in vectorized form using the NumPy where function.

```python
import numpy as np
from numba import jit
import time

size = 1000
x = np.random.rand(size, size)
iterations = 1000

@jit(nopython=True)
def numba(a):
    # Start from zeros and set 1 wherever the condition holds
    c = np.zeros(a.shape)
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            if a[i, j] < 0.5:
                c[i, j] = 1
    return c

def vectorized(a):
    # Vectorized if-then-else: 1 where a < 0.5, otherwise 0
    return np.where(a < 0.5, 1, 0)

# We call the numba function to precompile it before we measure it
z = numba(x)
start = time.time()
for _ in range(iterations):
    z = numba(x)
end = time.time()
print("Elapsed (numba, precompiled) = %s" % (end - start))
start = time.time()
for _ in range(iterations):
    z = vectorized(x)
end = time.time()
print("Elapsed (vectorized) = %s" % (end - start))
```

This results in the following comparison.

That is close, but the vectorized approach is a bit faster.
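To make the correspondence between the loop and the where call concrete, here is a small check on a fixed array. The array values are made up for the illustration. One detail worth noting is that np.where with the integer constants 1 and 0 returns an integer array, while the loop fills a float array created by np.zeros, so the two versions return equal values with different dtypes.

```python
import numpy as np

a = np.array([[0.1, 0.7],
              [0.5, 0.3]])

# Loop version: start from zeros and set 1 where the condition holds
looped = np.zeros(a.shape)
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        if a[i, j] < 0.5:
            looped[i, j] = 1

# Vectorized version of the same if-then-else
vectorized = np.where(a < 0.5, 1, 0)

print(looped)                              # 1 where a < 0.5, else 0 (floats)
print(vectorized)                          # same values, stored as integers
print(np.array_equal(looped, vectorized))  # True
print(looped.dtype, vectorized.dtype)
```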

## Step 3: Compare Numba with lambda functions

I am very curious about this. Lambda functions are controversial in Python, and many are not happy about them, as their syntax does not feel aligned with the rest of Python. On the other hand, lambda functions have the advantage that you can pass them down into a library that can optimize over the for-loops.

```python
import numpy as np
from numba import jit
import time

size = 1000
x = np.random.rand(size, size)
iterations = 1000

@jit(nopython=True)
def numba(a):
    c = np.zeros((size, size))
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            c[i, j] = a[i, j] + 1
    return c

def lambda_run(a):
    # NumPy arrays have no apply method, so (as an assumption about the intent)
    # the lambda is called on the whole array and NumPy does the looping
    add_one = lambda x: x + 1
    return add_one(a)

# Call the numba function to precompile it before time measurement
z = numba(x)
start = time.time()
for _ in range(iterations):
    z = numba(x)
end = time.time()
print("Elapsed (numba, precompiled) = %s" % (end - start))
start = time.time()
for _ in range(iterations):
    z = lambda_run(x)
end = time.time()
print("Elapsed (lambda) = %s" % (end - start))
```

Resulting in the following performance comparison.

This is again tight, but the lambda approach is still a bit faster.

Remember, this is a simple lambda function, and we cannot conclude that lambda functions in general are faster than using Numba.
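As a small illustration of why the way the lambda is applied matters, the sketch below applies the same lambda to a NumPy array in two ways. Calling the lambda on the whole array lets NumPy perform the loop in compiled code, while np.vectorize calls the lambda once per element from Python. The NumPy documentation describes np.vectorize as a convenience rather than a performance tool. The names add_one, whole_array and per_element are illustrative only.

```python
import time
import numpy as np

add_one = lambda v: v + 1

x = np.random.rand(300, 300)

# 1) Call the lambda on the whole array: NumPy performs the element-wise loop
start = time.perf_counter()
whole_array = add_one(x)
print("Elapsed (lambda on whole array) = %s" % (time.perf_counter() - start))

# 2) Wrap the lambda with np.vectorize: the lambda is called for each element
start = time.perf_counter()
per_element = np.vectorize(add_one)(x)
print("Elapsed (np.vectorize) = %s" % (time.perf_counter() - start))

# Both approaches compute the same values
print(np.allclose(whole_array, per_element))
```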

## Conclusion

What we have learned since the last tutorial is that there are examples where simple vectorization is slower than Numba. This still leads to the conclusion that performance highly depends on the task. Further, the lambda function seems to give promising performance. Again, this should be compared to the slow approach of a Python for-loop without Numba's just-in-time compiled machine code.
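For reference, such a baseline could look like the sketch below, which times a plain Python double loop against the vectorized expression for adding 1 to every element. The function name python_loop and the small array size are assumptions chosen so the loop finishes quickly. No timings are claimed here, since they depend on the machine.

```python
import time
import numpy as np

def python_loop(a):
    # Same loops as the Numba function in Step 3, but without @jit
    c = np.zeros(a.shape)
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            c[i, j] = a[i, j] + 1
    return c

x = np.random.rand(200, 200)

start = time.perf_counter()
z_loop = python_loop(x)
print("Elapsed (Python loop) = %s" % (time.perf_counter() - start))

start = time.perf_counter()
z_vec = x + 1
print("Elapsed (vectorized) = %s" % (time.perf_counter() - start))

# Both versions compute the same result
print(np.array_equal(z_loop, z_vec))
```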

### 2 thoughts on “Performance comparison of Numba vs Vectorization vs Lambda function with NumPy”

1. I believe Numba is slower because of the numpy.zeros call inside the function, which puts it at a disadvantage compared to the vectorized version. In both cases where Numba is slower, you can improve its results by returning the result “in place”.
In the lambda case this is trivial, with a+=1; in the where case you could also replace a with the result of the comparison (as 0. or 1.), or you could use a boolean type that does not require as much memory.

• Hi I.,
Good question.
I would say, it depends on what you measure.

What the Numba version does is the same as the vectorized version does. It creates a new NumPy array of same size and returns the result in it.

If you look for speed, then you can do it in-place, as you suggest. But then I would argue, that it is not a fair comparison.

Let me know what you think.

Cheers, Rune