Quick NumPy Tutorial

What is NumPy?

NumPy is a scientific library that provides a multidimensional array object along with fast routines that operate on it. NumPy is short for Numerical Python.

When we talk about NumPy, we often refer to the powerful ndarray, which is the multidimensional array (N-dimensional array).

Here are a few comparisons between Python lists and ndarrays.

ndarray | Python list
Has a fixed size at creation. | Is dynamic. You can add and remove elements.
All elements have the same type. | Elements have types independent of each other.
Can execute fast mathematical operations with simple syntax. | Needs loops to perform operations on each element.
Comparison between NumPy ndarray and Python list

Examples showing the difference: Fixed after creation

NumPy is by convention imported with import numpy as np. To create an ndarray, you can use the array call as shown below.

import numpy as np

data = np.array([[1, 2, 3], [1, 2, 3]])
print(data)

This creates a 2-dimensional array object with 2 by 3 elements.

[[1 2 3]
 [1 2 3]]
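As a small sketch, the dimensions of the array above can be inspected through the ndim, shape and size attributes of the ndarray:

```python
import numpy as np

data = np.array([[1, 2, 3], [1, 2, 3]])

print(data.ndim)   # number of dimensions: 2
print(data.shape)  # length along each dimension: (2, 3)
print(data.size)   # total number of elements: 6
```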

Once created, the ndarray has a fixed size. You cannot add new dimensions or elements to it in place.
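To illustrate what fixed size means in practice, here is a minimal sketch using np.append. It does not grow the existing array. It returns a new, larger array and leaves the original untouched. The name bigger is just chosen for this example.

```python
import numpy as np

data = np.array([[1, 2, 3], [1, 2, 3]])

# np.append builds and returns a new array with the extra row.
bigger = np.append(data, [[4, 5, 6]], axis=0)

print(data.shape)    # (2, 3), the original is unchanged
print(bigger.shape)  # (3, 3)
```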

A Python list is more flexible.

my_list = []
my_list.append(2)
my_list.append(4)
my_list.remove(2)
print(my_list)

This demonstrates the flexibility and power of Python lists. It is simple to add and remove elements. The above code results in the following output.

[4]

Examples showing the difference: One type

The type of an ndarray is stored in dtype. The interesting thing is that every element must have the same type.

import numpy as np

data = np.random.randn(2, 3)

print(data)
print(data.dtype)

It will result in a random ndarray of type float64.

[[-0.85925182 -0.89247774 -2.40920842]
 [ 0.84647869  0.27631307 -0.80772023]]
float64

An interesting way to demonstrate that only one type can be present in an ndarray is to try creating it from a mixture of ints and floats.

import numpy as np

data = np.array([[1.0, 2, 3], [1, 2, 3]])
print(data)
print(data.dtype)

Because at least one element is a float, all elements are cast to float64.

[[1. 2. 3.]
 [1. 2. 3.]]
float64

A Python list, by contrast, keeps the type of each element.

my_list = [1.0, 2, 3]
print(my_list)

Here the first element is a float, while the second and third elements are ints.

[1.0, 2, 3]
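As a further sketch of the one-type rule, the float does not need to come first for the whole ndarray to become float64, and an existing array can be converted explicitly with astype. The variable names are only for illustration. The exact integer dtype depends on the platform, so the example prints only its kind.

```python
import numpy as np

ints = np.array([1, 2, 3])          # only ints: an integer dtype
mixed = np.array([1, 2, 3.5])       # one float anywhere: float64
as_float = ints.astype(np.float64)  # explicit conversion, returns a new array

print(ints.dtype.kind)   # i (integer)
print(mixed.dtype)       # float64
print(as_float.dtype)    # float64
```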

Examples showing the difference: No loops needed

Many operations can be performed directly on the ndarray.

import numpy as np

data = np.random.randn(2, 3)

print(data)
print(data*10)
print(data + data)

This results in the following output.

[[ 1.18303358 -2.20017954  0.46294824]
 [-0.56508587  0.0990272  -1.8431866 ]]
[[ 11.83033584 -22.00179538   4.62948243]
 [ -5.65085867   0.990272   -18.43186601]]
[[ 2.36606717 -4.40035908  0.92589649]
 [-1.13017173  0.1980544  -3.6863732 ]]

As expected, right? The point is that multiplying and adding work element by element out of the box.

The equivalent for a Python list would be the following.

my_list = [1, 2, 3]
for i in range(len(my_list)):
    my_list[i] *= 10

for i in range(len(my_list)):
    my_list[i] += my_list[i]

And it is not even the same, because the loops overwrite the old elements of the list, while the ndarray expressions return new arrays and leave the original unchanged.
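One more difference is worth a short sketch: the * and + operators exist for Python lists too, but they repeat and concatenate instead of doing arithmetic. A list comprehension is one way to get the element-wise result without overwriting the original list. The names my_arr and doubled are only for illustration.

```python
import numpy as np

my_list = [1, 2, 3]
my_arr = np.array(my_list)

# On a list, * repeats and + concatenates.
print(my_list * 2)        # [1, 2, 3, 1, 2, 3]
print(my_list + my_list)  # [1, 2, 3, 1, 2, 3]

# On an ndarray, the same operators work element by element.
print(my_arr * 2)         # [2 4 6]
print(my_arr + my_arr)    # [2 4 6]

# A list comprehension builds a new list and keeps the original.
doubled = [x * 2 for x in my_list]
print(doubled)            # [2, 4, 6]
print(my_list)            # [1, 2, 3]
```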

Another way to compare differences

At first glance, ndarrays might seem inflexible, given all their restrictions compared with Python lists. That is true, but the benefit is speed.

import time
import numpy as np

my_arr = np.arange(1000000)
my_list = list(range(1000000))

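# Time 10 rounds of doubling every element of the ndarray.
start = time.time()
for _ in range(10):
    my_arr2 = my_arr * 2
end = time.time()
print(end - start)

# Time 10 rounds of the same operation on the Python list.
start = time.time()
for _ in range(10):
    my_list2 = [x * 2 for x in my_list]
end = time.time()
print(end - start)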

This resulted in the following timings, in seconds.

0.03456306457519531
0.9373760223388672

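In this run the ndarray version was about 27 times faster than the list version. In general, ndarray operations are often 10-100 times faster than the equivalent Python list code, which makes a considerable difference in scientific calculations.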
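If you want to repeat the measurement yourself, the standard library timeit module is an alternative to calling time.time() by hand. The following is only a sketch under assumptions: the smaller array size n and the variable names are chosen for illustration, and the timings will differ from machine to machine, so no expected output is shown.

```python
import timeit
import numpy as np

n = 100_000  # smaller than above so the sketch runs quickly
my_arr = np.arange(n)
my_list = list(range(n))

# Each call runs the statement 10 times and returns the total seconds.
arr_time = timeit.timeit(lambda: my_arr * 2, number=10)
list_time = timeit.timeit(lambda: [x * 2 for x in my_list], number=10)

print("ndarray:", arr_time)
print("list:   ", list_time)
```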