    NumPy vs Pandas

    What we will cover in this tutorial

    A high-level view of the differences between the NumPy and Pandas libraries in Python. We will also briefly explore how their performance differs in one specific use case.

    Top level differences between NumPy and Pandas

    First of all, the two libraries serve different purposes.

    • NumPy is made to manage n-dimensional numerical data. Think of it when you need to handle a lot of numerical data that is all of the same type, for example organized in rows and columns.
    • Pandas is made for tabular data. This could be data from an Excel sheet, where you have various types of data organized in rows and columns.
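    To make the distinction concrete, here is a minimal sketch. The variable names, column names, and values are made up for illustration and are not from any real dataset.

```python
import numpy as np
import pandas as pd

# NumPy: the whole array shares a single numerical type.
prices = np.array([[1.5, 2.0], [3.25, 4.0]])
print(prices.dtype)  # float64
print(prices.shape)  # (2, 2)

# Mixing an int and a float still gives one common type for all elements.
mixed = np.array([1, 2.5])
print(mixed.dtype)   # float64

# Pandas: every column keeps its own type, like columns in a spreadsheet.
table = pd.DataFrame({
    "name": ["apple", "pear"],   # text
    "price": [1.5, 2.0],         # floats
    "in_stock": [True, False],   # booleans
})
print(table.dtypes)
```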

    There are more differences.

    • NumPy is built around the data type ndarray, which is created with fixed dimensions and only one element type.
    • Pandas is built around Series and DataFrames, which are more dynamic after creation.
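    A small sketch of what "fixed" versus "dynamic" can look like in practice. The names a, b, df and s are only illustrative.

```python
import numpy as np
import pandas as pd

# An ndarray does not grow in place: np.append builds a new array
# and leaves the original untouched.
a = np.arange(3)
b = np.append(a, 3)
print(a.shape, b.shape)  # (3,) (4,)

# A DataFrame can get a new column, even of another type, after creation.
df = pd.DataFrame({"x": [1, 2, 3]})
df["y"] = ["a", "b", "c"]
print(df.shape)  # (3, 2)

# A Series can be enlarged by assigning to a label that does not exist yet.
s = pd.Series([1, 2, 3])
s.loc[3] = 4
print(len(s))  # 4
```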

    Performance comparison of NumPy and Pandas

    Which one would you guess is faster? Pandas? Not quite. NumPy can be far faster than Pandas, especially on small arrays.

    Why?

    Let us first examine it.

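    import time

    import numpy as np
    import pandas as pd

    size = 100                         # number of elements per array
    iterations = 100_000_000 // size   # keep the total element count constant

    # Time element-wise multiplication on a NumPy array.
    a = np.arange(size)
    start = time.perf_counter()
    for _ in range(iterations):
        a2 = a * a
    end = time.perf_counter()
    print(end - start)

    # Time the same operation on a Pandas Series built from the same data.
    n = pd.Series(a)
    start = time.perf_counter()
    for _ in range(iterations):
        n2 = n * n
    end = time.perf_counter()
    print(end - start)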
    

    Which results in the following comparison.

    [Figure: NumPy vs Pandas performance comparison]

    I find it very interesting that Pandas is so slow compared to NumPy for small sizes. For larger sizes the advantage seems to shift towards Pandas, but eventually NumPy still seems to come out ahead.

    Well, the flexibility of Pandas has a cost, and that cost is high for small instances when performing arithmetic operations as we did in the example above.
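    If you want to see how the gap changes with size on your own machine, a sketch along these lines could be a starting point. The helper time_square, the chosen sizes, and the repeat counts are assumptions for illustration, not the setup used for the comparison above, and the numbers you get will depend on your hardware and library versions.

```python
import timeit

import numpy as np
import pandas as pd


def time_square(obj, repeats):
    """Return the average number of seconds one `obj * obj` takes."""
    return timeit.timeit(lambda: obj * obj, number=repeats) / repeats


results = []
for size in (10, 1_000, 100_000):
    a = np.arange(size)
    s = pd.Series(a)
    # Fewer repeats for larger inputs so the sketch finishes quickly.
    repeats = max(10, 20_000 // size)
    numpy_time = time_square(a, repeats)
    pandas_time = time_square(s, repeats)
    results.append((size, numpy_time, pandas_time))
    print(f"size={size:>7}  numpy={numpy_time:.2e}s  pandas={pandas_time:.2e}s")
```

    Dividing by the number of repeats gives a per-operation time, which makes it easier to see the fixed overhead each Series operation carries regardless of size.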

    Next steps

    Investigate further how NumPy and Pandas compare in performance for various functions.

    Pandas and NumPy support a lot of functions in a vectorized way, which could be interesting to investigate. Do the restrictions of NumPy arrays give the underlying C/C++ code an advantage in performance?
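    As one possible starting point for that investigation, the sketch below applies the same vectorized function, np.sqrt, to an ndarray and to a Series holding the same values, checks that the results agree, and times both. The array length and repeat count are arbitrary choices for illustration.

```python
import timeit

import numpy as np
import pandas as pd

a = np.linspace(0.0, 1.0, 1_000)
s = pd.Series(a)

# NumPy universal functions accept a Series and return a Series.
np_result = np.sqrt(a)
pd_result = np.sqrt(s)

# Series.to_numpy() gives the values back as an ndarray for comparison.
same = np.allclose(np_result, pd_result.to_numpy())
print(same)  # True

# Time both variants; the absolute numbers depend on your machine.
repeats = 1_000
np_time = timeit.timeit(lambda: np.sqrt(a), number=repeats)
pd_time = timeit.timeit(lambda: np.sqrt(s), number=repeats)
print(f"ndarray: {np_time:.4f}s  Series: {pd_time:.4f}s")
```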
