NumPy vs Pandas

What will we cover in this tutorial?

A high-level view of the differences between the NumPy and Pandas libraries in Python. We will also briefly explore how their performance differs in one specific use case.

Top level differences between NumPy and Pandas

First of all, the two libraries serve different purposes.

  • NumPy is made to manage n-dimensional numerical data. Think of it when you need to handle a lot of numerical data that is all of the same type, for example organized in rows and columns.
  • Pandas is made for tabular data. This could be data from an Excel sheet, where you have various types of data organized in rows and columns.
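As a small illustration of the two points above, the sketch below builds a NumPy array and a Pandas DataFrame and inspects their element types. The column names and values are made up for this example.

```python
import numpy as np
import pandas as pd

# A NumPy array holds a single element type for all of its entries.
grid = np.array([[1, 2, 3], [4, 5, 6]])
grid_kind = grid.dtype.kind      # 'i' means signed integer

# Mixing integers and floats does not give mixed types:
# NumPy converts everything to one common type (here a float type).
mixed = np.array([1, 2.5])
mixed_kind = mixed.dtype.kind    # 'f' means floating point

# A DataFrame can hold a different type in each column, like a spreadsheet.
# The columns below are hypothetical example data.
table = pd.DataFrame({
    "name": ["Ann", "Bob"],
    "age": [31, 45],
    "height_m": [1.68, 1.82],
})
column_kinds = {col: dtype.kind for col, dtype in table.dtypes.items()}

print(grid.dtype, grid.shape)
print(mixed.dtype)
print(table.dtypes)
```

Printing `table.dtypes` shows one type per column, while the array reports a single `dtype` for all of its entries.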

They also differ in their core data structures.

  • NumPy is built around the ndarray data type, which is created with fixed dimensions and a single element type.
  • Pandas is built around Series and DataFrames, which are more dynamic after creation.
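To make "more dynamic after creation" concrete, here is a minimal sketch. It assumes nothing beyond NumPy and Pandas themselves, and the column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# "Growing" a NumPy array does not change the original array.
# np.append returns a new array with the extra element.
a = np.arange(3)
b = np.append(a, 3)
same_object = b is a             # False: b is a separate, new array

# A DataFrame can get new columns, of new types, after it was created.
df = pd.DataFrame({"x": [1, 2, 3]})
df["y"] = df["x"] * 2.5          # add a float column
df["label"] = ["a", "b", "c"]    # add a text column

print(a.shape, b.shape, same_object)
print(df)
```

The array `a` keeps its shape of three elements, while `df` ends up with three columns of different types.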

Performance comparison of NumPy and Pandas

If you had to guess which one is faster, would you say Pandas? Of course not. NumPy can be far faster than Pandas, at least for small arrays.

Why?

Let us first examine it.

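import time

import numpy as np
import pandas as pd

size = 100
# Keep the total number of element multiplications fixed at 100,000,000.
iterations = 100_000_000 // size

# Time element-wise multiplication of a NumPy array.
a = np.arange(size)
start = time.perf_counter()
for _ in range(iterations):
    a2 = a * a
end = time.perf_counter()
print("NumPy: ", end - start)

# Time the same multiplication on a Pandas Series holding the same data.
n = pd.Series(a)
start = time.perf_counter()
for _ in range(iterations):
    n2 = n * n
end = time.perf_counter()
print("Pandas:", end - start)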

This results in the following comparison.

[Figure: NumPy vs Pandas]

I find it very interesting that Pandas is so slow compared to NumPy for small instances. For larger instances Pandas seems to gain the advantage for a while, but in the end NumPy still seems to come out ahead.

The flexibility of Pandas has a cost, and that cost is high for small instances when doing arithmetic operations like the one in the example above.
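One concrete piece of that flexibility is that arithmetic on two Series matches values by index label, not by position. The sketch below, with made-up labels and values, shows the difference. It is only an illustration of the extra work Pandas does, not a full account of where the time goes.

```python
import pandas as pd

# Two Series with the same labels in a different order (hypothetical data).
s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["c", "b", "a"])

# Pandas aligns on the labels before multiplying:
# "a" pairs with "a", "b" with "b", "c" with "c".
by_label = s1 * s2

# The underlying NumPy arrays are multiplied position by position.
by_position = s1.to_numpy() * s2.to_numpy()

print(by_label)
print(by_position)
```

With NumPy there are no labels to match, so there is less to do per operation.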

Next steps

Investigate further how NumPy and Pandas compare in performance for various functions.

Pandas and NumPy support a lot of functions in a vectorized way, which could be interesting to investigate. Do the restrictions of NumPy arrays give the underlying C/C++ code an advantage in performance?
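A possible starting point for that investigation is sketched below. The helper name `compare`, the chosen operations, and the sizes are assumptions made for this example, and the timings will vary from machine to machine.

```python
import timeit

import numpy as np
import pandas as pd


def compare(size, repeats=200):
    """Time a few vectorized operations on an ndarray and on a Series.

    Returns a dict mapping operation name to (numpy_seconds, pandas_seconds).
    """
    a = np.arange(size, dtype=np.float64)
    s = pd.Series(a)
    cases = {
        "multiply": (lambda: a * a, lambda: s * s),
        "sum": (lambda: a.sum(), lambda: s.sum()),
        "sqrt": (lambda: np.sqrt(a), lambda: np.sqrt(s)),
    }
    return {
        name: (
            timeit.timeit(numpy_op, number=repeats),
            timeit.timeit(pandas_op, number=repeats),
        )
        for name, (numpy_op, pandas_op) in cases.items()
    }


for size in (100, 10_000, 100_000):
    for name, (t_numpy, t_pandas) in compare(size).items():
        print(f"size={size:>7} {name:<8} NumPy {t_numpy:.4f}s  Pandas {t_pandas:.4f}s")
```

Adding more entries to `cases` is enough to try other functions.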

