## What will we cover in this tutorial?

You just want your code to run fast, right? **Numba** is a just-in-time compiler for **Python** that works amazingly well with **NumPy**. Does that mean we should always use Numba?

Well, let’s try some examples out and learn. If you know about **NumPy**, you know you should use **vectorization** to get speed. Does **Numba** beat that?

## Step 1: Let’s learn how Numba works

**Numba** compiles **Python** code into machine code and runs it. What does just-in-time mean? The first time you call a function you have marked for compilation, Numba compiles it and then runs it. On any later call with the same argument types, it just runs the code that is already compiled.

Let’s try that.

    import time

    import numpy as np
    from numba import jit


    @jit(nopython=True)
    def full_sum_numba(a):
        # Sum every element with explicit loops; Numba compiles this function.
        total = 0.0
        for i in range(a.shape[0]):
            for j in range(a.shape[1]):
                total += a[i, j]
        return total


    size = 10000  # 10000 x 10000 float64 values, roughly 800 MB of memory
    x = np.random.rand(size, size)

    # First call: the measured time includes the compilation.
    start = time.time()
    full_sum_numba(x)
    end = time.time()
    print("Elapsed (Numba) = %s" % (end - start))

    # Second call: runs the already compiled code.
    start = time.time()
    full_sum_numba(x)
    end = time.time()
    print("Elapsed (Numba) = %s" % (end - start))

This prints something like the following.

    Elapsed (Numba) = 0.41634082794189453
    Elapsed (Numba) = 0.11176300048828125

The two calls clearly differ in run time.

Oh, did you get what happened in the code? Well, if you put **@jit(nopython=True)** in front of a function, **Numba** will try to compile it and run it as machine code.

As you see above, the first call has an overhead in run time, because Numba first compiles the function and then runs it. The second time, the function is already compiled and can run immediately.
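If you only care about the run time of the compiled code, one common pattern is to make an untimed warm-up call first. The following is a minimal sketch under assumptions: the tiny warm-up array and the smaller 500 x 500 input are illustrative choices, not part of the original code, and the `try`/`except` fallback is only there so the sketch also runs where Numba is not installed.

```python
import time

import numpy as np

try:
    from numba import jit
except ImportError:
    # Assumption for portability only: without Numba, run as plain Python.
    def jit(*args, **kwargs):
        return lambda func: func


@jit(nopython=True)
def full_sum_numba(a):
    total = 0.0
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            total += a[i, j]
    return total


# Warm-up: a tiny array with the same dtype and number of dimensions
# triggers the compilation before the measurement starts.
full_sum_numba(np.zeros((1, 1)))

x = np.random.rand(500, 500)

start = time.perf_counter()
result = full_sum_numba(x)
elapsed = time.perf_counter() - start
print("Elapsed (Numba, after warm-up) = %s" % elapsed)
```

The warm-up works because Numba compiles per argument type, not per array shape, so the 1 x 1 array and the 500 x 500 array share the same compiled code.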

## Step 2: Compare Numba just-in-time code to native Python code

So let us compare how much you gain by using Numba just-in-time (**@jit**) in our code.

    import time

    import numpy as np
    from numba import jit


    def full_sum(a):
        # Plain Python version: every loop iteration runs in the interpreter.
        total = 0.0
        for i in range(a.shape[0]):
            for j in range(a.shape[1]):
                total += a[i, j]
        return total


    @jit(nopython=True)
    def full_sum_numba(a):
        # Same loops, but compiled to machine code by Numba.
        total = 0.0
        for i in range(a.shape[0]):
            for j in range(a.shape[1]):
                total += a[i, j]
        return total


    size = 10000  # 10000 x 10000 float64 values, roughly 800 MB of memory
    x = np.random.rand(size, size)

    start = time.time()
    full_sum(x)
    end = time.time()
    print("Elapsed (No Numba) = %s" % (end - start))

    # First Numba call: includes the compilation time.
    start = time.time()
    full_sum_numba(x)
    end = time.time()
    print("Elapsed (Numba) = %s" % (end - start))

    # Second Numba call: runs the already compiled code.
    start = time.time()
    full_sum_numba(x)
    end = time.time()
    print("Elapsed (Numba) = %s" % (end - start))

Here we added a plain Python function without **@jit** in front, so we can compare it with the one that has it. The result is as follows.

    Elapsed (No Numba) = 38.08543515205383
    Elapsed (Numba) = 0.41634082794189453
    Elapsed (Numba) = 0.11176300048828125

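That is some difference: about 38 seconds for the plain Python loop against about 0.11 seconds for the already compiled Numba call, roughly 340 times faster. Also, we have plotted a few more runs in the graph below.

The advantage of Numba over the plain Python loop seems pretty evident.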
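Before trusting a speed comparison, it helps to confirm that both versions compute the same value and to time each one more than once. The following is a minimal sketch under assumptions: `best_time` is a hypothetical helper that is not part of the original code, the array is much smaller than in the tutorial so the plain Python loop finishes quickly, and the `try`/`except` fallback is only there so the sketch also runs where Numba is not installed.

```python
import time

import numpy as np

try:
    from numba import jit
except ImportError:
    # Assumption for portability only: without Numba, run as plain Python.
    def jit(*args, **kwargs):
        return lambda func: func


def full_sum(a):
    total = 0.0
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            total += a[i, j]
    return total


# Compile a second version of the same function; full_sum itself stays plain Python.
full_sum_numba = jit(nopython=True)(full_sum)


def best_time(func, arg, repeats=3):
    """Hypothetical helper: return the fastest of `repeats` timed calls, in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(arg)
        timings.append(time.perf_counter() - start)
    return min(timings)


x = np.random.rand(300, 300)

# Both versions must agree before their timings are worth comparing.
# This first Numba call also absorbs the one-off compilation cost.
same_result = np.isclose(full_sum(x), full_sum_numba(x))

python_time = best_time(full_sum, x)
numba_time = best_time(full_sum_numba, x)
print("Same result:", same_result)
print("Elapsed (No Numba) = %s" % python_time)
print("Elapsed (Numba) = %s" % numba_time)
print("Speed-up = %.1fx" % (python_time / numba_time))
```

Taking the fastest of a few runs is one way to reduce noise from other processes; the exact speed-up will depend on your machine and on the array size.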

## Step 3: Comparing it with Vectorization

If you don’t know what vectorization is, we can recommend this tutorial. The point of vectorization is to move the expensive for-loops out of your Python code and into a single function call, so that optimized code runs them.
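To make the idea concrete, here is a minimal sketch of the same sum written once with explicit loops and once as a single vectorized call. The small 3 x 4 array and the name `full_sum_loop` are illustrative assumptions; `full_sum_vectorized` mirrors the function used later in this step.

```python
import numpy as np


def full_sum_loop(a):
    # Every element is visited by the Python interpreter.
    total = 0.0
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            total += a[i, j]
    return total


def full_sum_vectorized(a):
    # The loops run inside NumPy's optimized implementation.
    return a.sum()


a = np.arange(12, dtype=np.float64).reshape(3, 4)  # values 0.0 .. 11.0
loop_result = full_sum_loop(a)
vectorized_result = full_sum_vectorized(a)
print(loop_result, vectorized_result)  # both are 66.0
```

Both functions return the same value; the only difference is where the looping happens.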

That sounds a lot like what Numba can do. It can change the expensive for-loops into fast machine code.

But which one is faster?

Well, I think there are two parameters to try out. First, the size of the problem. Second, the number of iterations, to see if it matters.

    import time

    import numpy as np
    from numba import jit


    @jit(nopython=True)
    def full_sum_numba(a):
        # Explicit loops, compiled to machine code by Numba.
        total = 0.0
        for i in range(a.shape[0]):
            for j in range(a.shape[1]):
                total += a[i, j]
        return total


    def full_sum_vectorized(a):
        # The loops run inside NumPy.
        return a.sum()


    iterations = 1000  # defined here, but not used in this single-run snippet
    size = 10000  # 10000 x 10000 float64 values, roughly 800 MB of memory
    x = np.random.rand(size, size)

    start = time.time()
    full_sum_vectorized(x)
    end = time.time()
    print("Elapsed (Vectorized) = %s" % (end - start))

    # First Numba call: includes the compilation time.
    start = time.time()
    full_sum_numba(x)
    end = time.time()
    print("Elapsed (Numba) = %s" % (end - start))

    # Second Numba call: runs the already compiled code.
    start = time.time()
    full_sum_numba(x)
    end = time.time()
    print("Elapsed (Numba) = %s" % (end - start))
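One way to vary both parameters is a small benchmark loop over sizes and iterations. This is only a sketch of how such a sweep could look, not the code behind the measurements in this tutorial: `benchmark` is a hypothetical helper, and the sizes and iteration count are assumed values kept small so it finishes quickly.

```python
import time

import numpy as np


def full_sum_vectorized(a):
    return a.sum()


def benchmark(func, sizes, iterations):
    """Return {size: seconds for `iterations` calls on a size x size array}."""
    results = {}
    for size in sizes:
        x = np.random.rand(size, size)
        func(x)  # untimed first call, so a JIT-compiled func is already compiled
        start = time.perf_counter()
        for _ in range(iterations):
            func(x)
        results[size] = time.perf_counter() - start
    return results


# Hypothetical parameter values, kept small so the sketch finishes quickly.
timings = benchmark(full_sum_vectorized, sizes=[10, 100, 1000], iterations=100)
for size, seconds in timings.items():
    print("size = %5d, iterations = 100: %s" % (size, seconds))
```

Passing a Numba-compiled function such as `full_sum_numba` as `func` would give the matching numbers for the other approach.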

First, the run time as a function of the size.

It is interesting that Numba is faster for small problem sizes, while the vectorized approach seems to outperform Numba for bigger sizes.

And the number of iterations only makes the difference bigger.

This is not surprising, as the code in a vectorized call can be more specifically optimized than the more general-purpose Numba approach.

## Conclusion

Does that mean that Numba does not pay off?

No, not at all. First of all, we have only tried one vectorized operation, which was obviously very easy to optimize. Secondly, not all loops can be turned into vectorized code. In general, it is difficult to carry state in a vectorized approach. Hence, if you need to keep track of some internal state in a loop, it can be difficult to find a vectorized solution.
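As an illustration of a loop with internal state, consider a running sum that is clamped to a range after every step. This example is an assumption chosen for illustration and does not come from the measurements above: `clamped_running_sum` is a hypothetical function, and the `try`/`except` fallback is only there so the sketch also runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import jit
except ImportError:
    # Assumption for portability only: without Numba, run as plain Python.
    def jit(*args, **kwargs):
        return lambda func: func


@jit(nopython=True)
def clamped_running_sum(values, low, high):
    out = np.empty_like(values)
    state = 0.0  # internal state carried from one iteration to the next
    for i in range(values.shape[0]):
        # Each step depends on the already clamped previous state.
        state = min(max(state + values[i], low), high)
        out[i] = state
    return out


values = np.array([1.0, 2.0, -5.0, 3.0, 4.0])
result = clamped_running_sum(values, 0.0, 4.0)
print(result)  # [1. 3. 0. 3. 4.]

# Clipping a plain cumulative sum is not equivalent.
naive = np.clip(np.cumsum(values), 0.0, 4.0)
print(naive)  # [1. 3. 0. 1. 4.]
```

The two results differ at the fourth element, because the clamping changes the state that all later steps build on. A loop like this is easy to write and a natural candidate for **@jit**.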