Pandas for Financial Stock Analysis

What will we cover?

In this tutorial we will get familiar with DataFrames, the primary data structure in Pandas.

We will learn how to read historical stock price data from Yahoo! Finance and load it into a DataFrame. This will be done by exporting a CSV file from Yahoo! Finance and loading the data. Later we will learn how to read the data directly from the Yahoo! Finance API.

A DataFrame is similar to an Excel sheet: it holds data in rows and columns, as we will see in this lesson.

Then we will learn how to use the index of dates. This will be necessary when we make calculations later on.

The first part of the tutorial will give the foundation of what you need to know about DataFrames for financial analysis.

Step 1: Read the stock prices from Yahoo! Finance as CSV

In this first lesson we will download historical stock prices from Yahoo! Finance as a CSV file and import them into a DataFrame in our Jupyter notebook environment.

If you are new to CSV files and DataFrames, don’t worry. That is what we will cover here.

Let’s start by going to Yahoo! Finance and downloading the CSV file. In this course we have used Apple, but feel free to make similar calculations on a stock of your choice.

Go to Yahoo! Finance, search for AAPL (the ticker for Apple), press Historical Data, and download the CSV data file.

The CSV data file will contain Comma Separated Values (CSV) similar to this.

Date,Open,High,Low,Close,Adj Close,Volume

The first line shows the column names (Date, Open, High, Low, Close, Adj Close, Volume). Then each line contains a data entry for a given day.
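To make the format concrete, here is a minimal sketch that reads a header line and one data line with Python's built-in csv module. The text is held in a string rather than in the downloaded file, and the data row reuses values that appear later in this tutorial.

```python
import csv
import io

# Two lines in the Yahoo! Finance layout (sample values from this tutorial)
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2021-01-04,133.520004,133.610001,126.760002,129.410004,129.410004,143301900
"""

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)      # first line: the column names
first_row = next(reader)   # second line: the entry for one trading day

# Pair each column name with the value for that day
record = dict(zip(header, first_row))
print(header)
print(record["Date"], record["Close"])
```

At this point every value is still plain text. Turning the values into numbers and dates is what Pandas will do for us in the next step.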

Step 2: Read the stock prices from CSV to Pandas DataFrame

In Jupyter Notebook, start by importing the Pandas library. This is needed in order to load the data into a DataFrame.

import pandas as pd
data = pd.read_csv("AAPL.csv", index_col=0, parse_dates=True)

The read_csv(…) function does all the work for us. It reads the CSV file AAPL.csv, which is the file you downloaded from Yahoo! Finance (or from the zip-file downloaded above). The file needs to be located in the same folder you are working from in your Jupyter notebook.

The arguments in read_csv(…) are the following.

  • index_col=0 sets the first column of the CSV file to be the index. In this case, it is the Date column.
  • parse_dates=True ensures that dates in the CSV file are interpreted as dates. This is important if you want to take advantage of the index being a datetime.
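To see what the two arguments change, here is a small sketch that loads the same data with and without them. As an assumption for the example, the CSV text is kept in a string and read through io.StringIO instead of AAPL.csv, so it runs without the download. The rows reuse values shown in this tutorial.

```python
import io
import pandas as pd

# Small sample in the Yahoo! Finance layout (stand-in for AAPL.csv)
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2021-01-04,133.520004,133.610001,126.760002,129.410004,129.410004,143301900
2021-01-05,128.889999,131.740005,128.429993,131.009995,131.009995,97664900
2021-01-06,127.720001,131.050003,126.379997,126.599998,126.599998,155088000
"""

# Without the arguments: Date stays an ordinary column and the index is 0, 1, 2
plain = pd.read_csv(io.StringIO(csv_text))

# With the arguments from this step: Date becomes the index, parsed as datetimes
data = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

print(type(plain.index).__name__)
print(type(data.index).__name__)
print(list(data.columns))
```

In the second DataFrame the Date column is no longer among the columns, because it has moved into the index.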

Step 3: Explore data types of columns and index

In the video lesson we explore the data types of the columns and the index, for example with the dtypes and index attributes of the DataFrame. This reveals the data type of each column and the type of the index. Notice that each column has its own data type.
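The exploration can be sketched as follows. The sample string and io.StringIO are assumptions that stand in for AAPL.csv so the example is self-contained. With your own file, keep the read_csv call from Step 2.

```python
import io
import pandas as pd

# Small sample in the Yahoo! Finance layout (stand-in for AAPL.csv)
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2021-01-04,133.520004,133.610001,126.760002,129.410004,129.410004,143301900
2021-01-05,128.889999,131.740005,128.429993,131.009995,131.009995,97664900
"""
data = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

# One data type per column: prices are floats, Volume is an integer
print(data.dtypes)

# The index holds the parsed dates
print(data.index)
```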

Step 4: Indexing and slicing with DataFrames

We can use loc to look up a row by its date in the index, for example the date 2020-01-27. This will show the data for that specific date. If you get an error, it might be because your dataset does not contain that date. Choose another one to see something similar to this.

Open         7.751500e+01
High         7.794250e+01
Low          7.622000e+01
Close        7.723750e+01
Adj Close    7.657619e+01
Volume       1.619400e+08
Name: 2020-01-27 00:00:00, dtype: float64
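A lookup like the one above can be sketched as follows. The sample string is an assumption that stands in for AAPL.csv, and its first row reuses the values from the output above.

```python
import io
import pandas as pd

# Small sample in the Yahoo! Finance layout (stand-in for AAPL.csv)
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2020-01-27,77.515,77.9425,76.22,77.2375,76.57619,161940000
2021-01-04,133.520004,133.610001,126.760002,129.410004,129.410004,143301900
"""
data = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

# Look up a single trading day by its date
row = data.loc["2020-01-27"]
print(row)

# Check whether a date is in the index before looking it up
has_date = pd.Timestamp("2020-01-26") in data.index
print(has_date)
```

Notice that Volume is printed as a float here. A single row is returned as one Series, so the integer Volume is converted to the same float64 type as the prices.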

A more advanced option is to use an interval (or slice, as it is called). Slicing with loc on a DataFrame is done with a starting and ending index, .loc[start:end], or with an open-ended index, .loc[start:], which takes data from start to the last row.

For example, slicing from 2021-01-01 with an open end gives all the data starting from that date. Notice that there is no data on January 1st, but since the index is interpreted as a datetime, pandas can figure out the first date after it.

            Open        High        Low         Close    Adj Close       Volume
2021-01-04  133.520004  133.610001  126.760002  129.410004  129.410004  143301900
2021-01-05  128.889999  131.740005  128.429993  131.009995  131.009995  97664900
2021-01-06  127.720001  131.050003  126.379997  126.599998  126.599998  155088000
2021-01-07  128.360001  131.630005  127.860001  130.919998  130.919998  109578200
2021-01-08  132.429993  132.630005  130.229996  132.050003  132.050003  105158200
2021-01-11  129.190002  130.169998  128.500000  128.979996  128.979996  100620900
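Here is a sketch of both kinds of slices. The sample string is an assumption that stands in for AAPL.csv and reuses rows shown in this tutorial.

```python
import io
import pandas as pd

# Small sample in the Yahoo! Finance layout (stand-in for AAPL.csv)
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2020-01-27,77.515,77.9425,76.22,77.2375,76.57619,161940000
2021-01-04,133.520004,133.610001,126.760002,129.410004,129.410004,143301900
2021-01-05,128.889999,131.740005,128.429993,131.009995,131.009995,97664900
2021-01-06,127.720001,131.050003,126.379997,126.599998,126.599998,155088000
2021-01-07,128.360001,131.630005,127.860001,130.919998,130.919998,109578200
"""
data = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

# Open-ended slice: everything from 2021-01-01 to the last row
recent = data.loc["2021-01-01":]
print(recent)

# Slice with both a start and an end; both end dates are included
window = data.loc["2021-01-05":"2021-01-06"]
print(window)
```

Unlike ordinary Python slices, a loc slice includes the end label, so the second slice contains both January 5th and January 6th.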

Similarly, you can slice with an open-ended start, .loc[:end], which takes data from the first row up to and including end.
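A slice with an open start could look like this. The end date 2021-01-05 is chosen only for illustration, and the sample string is an assumption that stands in for AAPL.csv.

```python
import io
import pandas as pd

# Small sample in the Yahoo! Finance layout (stand-in for AAPL.csv)
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2020-01-27,77.515,77.9425,76.22,77.2375,76.57619,161940000
2021-01-04,133.520004,133.610001,126.760002,129.410004,129.410004,143301900
2021-01-05,128.889999,131.740005,128.429993,131.009995,131.009995,97664900
2021-01-06,127.720001,131.050003,126.379997,126.599998,126.599998,155088000
"""
data = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

# Everything from the first row up to and including 2021-01-05
until = data.loc[:"2021-01-05"]
print(until)
```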


Another important way to index into DataFrames is iloc[], which selects rows by integer position instead of by date. You can index from the start with positions 0, 1, 2, 3, … or from the end with -1, -2, -3, -4, …
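Position-based indexing can be sketched like this. The sample string is an assumption that stands in for AAPL.csv and reuses rows shown in this tutorial.

```python
import io
import pandas as pd

# Small sample in the Yahoo! Finance layout (stand-in for AAPL.csv)
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2021-01-04,133.520004,133.610001,126.760002,129.410004,129.410004,143301900
2021-01-05,128.889999,131.740005,128.429993,131.009995,131.009995,97664900
2021-01-06,127.720001,131.050003,126.379997,126.599998,126.599998,155088000
2021-01-07,128.360001,131.630005,127.860001,130.919998,130.919998,109578200
"""
data = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

first = data.iloc[0]       # first row
last = data.iloc[-1]       # last row
first_two = data.iloc[:2]  # positions 0 and 1; the end position is excluded

print(first.name, last.name)
print(first_two)
```

Slices with iloc follow the usual Python rule and exclude the end position, which is different from the date slices with loc.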

Want to learn more?

This is part of a 2-hour full video course in 8 parts about Technical Analysis with Python.
