## What will we cover in this tutorial?

We will continue the investigation of Numba that we started in the previous tutorial.

**Numba** is a just-in-time compiler for **Python** that works amazingly well with **NumPy**. As we saw in the last tutorial, NumPy's built-in **vectorization** can, depending on the task and the size of the problem, be faster than **Numba**.

Here we will explore that further, and also see how **Numba** compares with **lambda** functions. **Lambda** functions have the advantage that they can be passed as an argument down to a library, which can then optimize the performance instead of depending on slow **Python** code.
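As a quick illustration of that idea (a minimal sketch added here, not code from the tutorial): a lambda can be applied to a whole NumPy array in one call, so the per-element loop runs in NumPy's compiled code rather than in the Python interpreter.

```python
import numpy as np

data = np.arange(5)

# The lambda is called once, on the whole array; NumPy runs the
# per-element loop in compiled code via broadcasting.
apply_op = lambda arr: arr * 2 + 1
print(apply_op(data))  # [1 3 5 7 9]
```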

## Step 1: Example of Vectorization slower than Numba

In the previous tutorial we only investigated an example where vectorization was faster than Numba. Here we will see that this is not always the case.

```
import numpy as np
from numba import jit
import time

size = 100
x = np.random.rand(size, size)
y = np.random.rand(size, size)
iterations = 100000

@jit(nopython=True)
def add_numba(a, b):
    c = np.zeros(a.shape)
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            c[i, j] = a[i, j] + b[i, j]
    return c

def add_vectorized(a, b):
    return a + b

# Call the function once, to precompile the code
z = add_numba(x, y)

start = time.time()
for _ in range(iterations):
    z = add_numba(x, y)
end = time.time()
print("Elapsed (numba, precompiled) = %s" % (end - start))

start = time.time()
for _ in range(iterations):
    z = add_vectorized(x, y)
end = time.time()
print("Elapsed (vectorized) = %s" % (end - start))
```

Varying the size of the NumPy arrays, we can compare the performance of the two approaches in the graph below, where it is clear that the vectorized approach is slower.
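One plausible explanation for this (an interpretation added here, not stated in the original post) is per-call overhead: for small arrays, the fixed cost of each NumPy call dominates the actual arithmetic. A minimal sketch that makes this visible by measuring cost per element at two sizes:

```python
import time
import numpy as np

def time_add(size, iterations=10000):
    # Time `iterations` vectorized additions of two size x size arrays
    a = np.random.rand(size, size)
    b = np.random.rand(size, size)
    start = time.time()
    for _ in range(iterations):
        a + b
    return time.time() - start

small = time_add(10)
large = time_add(100)
# Cost per element is far higher for the small array: the fixed
# per-call overhead dominates when there is little work per call.
print(small / (10 * 10), large / (100 * 100))
```

The Numba loop avoids part of that per-call machinery once compiled, which is one way it can win at small sizes.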

## Step 2: Try some more complex example comparing vectorized and Numba

An if-then-else can be expressed in vectorized form using the **NumPy** **where** function.

```
import numpy as np
from numba import jit
import time

size = 1000
x = np.random.rand(size, size)
iterations = 1000

@jit(nopython=True)
def numba(a):
    c = np.zeros(a.shape)
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            if a[i, j] < 0.5:
                c[i, j] = 1
    return c

def vectorized(a):
    return np.where(a < 0.5, 1, 0)

# We call the numba function to precompile it before we measure it
z = numba(x)

start = time.time()
for _ in range(iterations):
    z = numba(x)
end = time.time()
print("Elapsed (numba, precompiled) = %s" % (end - start))

start = time.time()
for _ in range(iterations):
    z = vectorized(x)
end = time.time()
print("Elapsed (vectorized) = %s" % (end - start))
```

This results in the comparison shown below. It is close, but the vectorized approach is a bit faster.
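As a side note (an observation added here, not part of the original measurement): for this particular if-then-else, the comparison itself already produces the 0/1 pattern, so `np.where` can be replaced by a simple cast.

```python
import numpy as np

a = np.array([[0.2, 0.7], [0.4, 0.9]])

with_where = np.where(a < 0.5, 1, 0)
# The boolean comparison already is the 0/1 answer; casting it
# to int gives the same result without the explicit branch.
with_cast = (a < 0.5).astype(int)

print(np.array_equal(with_where, with_cast))  # True
```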

## Step 3: Compare Numba with lambda functions

I am very curious about this. Lambda functions are controversial in Python, and many are not happy with them, as their syntax is not well aligned with the rest of Python. On the other hand, lambda functions have the advantage that you can pass them down into a library, which can then optimize away the for-loops.

```
import numpy as np
from numba import jit
import time

size = 1000
x = np.random.rand(size, size)
iterations = 1000

@jit(nopython=True)
def numba(a):
    c = np.zeros(a.shape)
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            c[i, j] = a[i, j] + 1
    return c

def lambda_run(a, func):
    # The lambda is applied to the whole array in one call,
    # so NumPy runs the loop in optimized code
    return func(a)

# Call the numba function to precompile it before time measurement
z = numba(x)

start = time.time()
for _ in range(iterations):
    z = numba(x)
end = time.time()
print("Elapsed (numba, precompiled) = %s" % (end - start))

start = time.time()
for _ in range(iterations):
    z = lambda_run(x, lambda v: v + 1)
end = time.time()
print("Elapsed (lambda) = %s" % (end - start))
```

Resulting in the performance comparison below. This is again tight, but the lambda approach is still a bit faster.

Remember, this is a simple lambda function, and we cannot conclude that lambda functions in general are faster than using Numba.
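To see why the conclusion does not generalize (an illustration added here, using `np.vectorize` as the example): the lambda is only fast when it can be applied to the whole array at once. If it must be called once per element, as `np.vectorize` does, the advantage disappears, because that is essentially a Python-level loop.

```python
import numpy as np

a = np.random.rand(200, 200)

whole_array = (lambda v: v + 1)(a)               # one call, loop runs inside NumPy
per_element = np.vectorize(lambda v: v + 1)(a)   # one Python call per element

# Same result, but the per-element version is dramatically slower
print(np.allclose(whole_array, per_element))  # True
```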

## Conclusion

The main learning since the last tutorial is that we have found an example where simple vectorization is slower than Numba. This reinforces the conclusion that performance highly depends on the task. Further, the lambda function approach seems to give promising performance. Again, both should be compared to the slow baseline of a plain Python for-loop without Numba's just-in-time compiled machine code.


## Comments

**Reader comment:** I believe Numba is slower because of the `numpy.zeros` call inside the function, which puts it at a disadvantage compared to the vectorized version. In both cases where Numba is slower, you can improve its results by returning the result in place.

In the lambda case this is trivial with `a += 1`; in the `where` case you could also replace `a` with the result of the comparison (as 0.0 or 1.0), or use a boolean type that does not require as much memory.

Hi I.,

Good question.

I would say it depends on what you measure.

The Numba version does the same as the vectorized version: it creates a new NumPy array of the same size and returns the result in it.

If you look for speed, then you can do it in place, as you suggest. But then I would argue that it is not a fair comparison.

Let me know what you think.

Cheers, Rune