What will we cover in this tutorial?
In this tutorial we will show how to visualize time series with Matplotlib. We will do that in a Jupyter notebook, and you can download the resources (the notebook and the data used) from here.
Step 1: What is a time series?
I am happy you asked.
The easiest way to understand it is to show it. If you have downloaded the resources and started the Jupyter notebook, execute the following lines.
import pandas as pd
data = pd.read_csv("stock_data.csv", index_col=0, parse_dates=True)
data.head()
This will produce the following output.
                 High        Low       Open      Close       Volume  Adj Close
Date
2020-01-02  86.139999  84.342003  84.900002  86.052002   47660500.0  86.052002
2020-01-03  90.800003  87.384003  88.099998  88.601997   88892500.0  88.601997
2020-01-06  90.311996  88.000000  88.094002  90.307999   50665000.0  90.307999
2020-01-07  94.325996  90.671997  92.279999  93.811996   89410500.0  93.811996
2020-01-08  99.697998  93.646004  94.739998  98.428001  155721500.0  98.428001
Notice that the far-left column is called Date and that it is the index. Each index entry is a time value, in this case a date.
Time series data is data “stamped” by a time. In this case, it is time indexed by dates.
The data you see is historical stock prices.
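To see what the time index gives you, here is a minimal sketch. It only assumes the first three rows of the output above, pasted into an in-memory CSV as a stand-in for stock_data.csv. The names csv_text, sample and first_days are made up for this illustration.

```python
import io

import pandas as pd

# First three rows of the output above, as an in-memory CSV
# (a stand-in for stock_data.csv so the sketch is self-contained).
csv_text = """Date,High,Low,Open,Close,Volume,Adj Close
2020-01-02,86.139999,84.342003,84.900002,86.052002,47660500.0,86.052002
2020-01-03,90.800003,87.384003,88.099998,88.601997,88892500.0,88.601997
2020-01-06,90.311996,88.000000,88.094002,90.307999,50665000.0,90.307999
"""

sample = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

# parse_dates=True turns the index into a DatetimeIndex instead of plain strings
print(type(sample.index).__name__)

# With a time index you can select a date range by label (both ends included)
first_days = sample.loc["2020-01-02":"2020-01-03", "Close"]
print(first_days)
```

Without parse_dates=True the index would hold plain strings, and the date-aware axis you get when plotting would be lost.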
Step 2: How to visualize data with Matplotlib
import matplotlib.pyplot as plt
%matplotlib notebook
data.plot()
This will result in a chart similar to this one.
This is not impressive. It seems like something is wrong.
Actually, nothing is wrong. The call does exactly what you asked for: it plots all 6 columns together in one chart. Because the Volume values are in the tens of millions while the prices are around 100, all the other columns are squeezed together at the bottom of the y-axis and look like a single straight line (the brown one).
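Two common ways around the scale problem are sketched below: leave Volume out, or give it its own y-axis. This is only an illustration under assumptions. The demo DataFrame is synthetic data made up to mimic the scales above, and it is not the tutorial's stock_data.csv.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data (assumption): prices near 90, volume in the tens of millions
dates = pd.date_range("2020-01-02", periods=20, freq="D")
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    {
        "Open": 90 + rng.normal(0, 2, 20),
        "Close": 90 + rng.normal(0, 2, 20),
        "Volume": rng.integers(40_000_000, 160_000_000, 20).astype(float),
    },
    index=dates,
)

# Option 1: drop Volume so the price columns share a sensible scale
ax_prices = demo.drop(columns="Volume").plot()

# Option 2: keep Volume, but draw it against a second y-axis on the right
ax_both = demo.plot(secondary_y=["Volume"])
```

With the real data you would call the same methods on data instead of demo.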
Step 3: Matplotlib has a functional and an object-oriented interface
This is often a bit confusing at first.
Matplotlib has both a functional and an object-oriented interface. So far we have used the functional one.
Try executing the following in your Jupyter notebook.
data['My col'] = data['Volume']*0.5
data['My col'].plot()
It will seem like nothing happened.
But take another look at your previous plot.
It was updated with a new line. Instead of creating a new chart (or figure), the call added the line to the existing one.
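You can check this behavior without looking at the chart. The sketch below uses a tiny made-up DataFrame (demo, with columns a and b, all hypothetical) and compares the axes objects that the two plot() calls return.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

plt.close("all")  # start from a clean slate

# Tiny made-up time series (assumption, not the tutorial's data)
dates = pd.date_range("2020-01-02", periods=5, freq="D")
demo = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0, 4.0, 5.0], "b": [5.0, 4.0, 3.0, 2.0, 1.0]},
    index=dates,
)

# DataFrame.plot() without ax= opens a new figure and returns its axes
ax_first = demo.plot()
lines_before = len(ax_first.get_lines())

# Series.plot() without ax= draws on the current axes instead
ax_second = (demo["a"] * 0.5).plot()

print(ax_second is ax_first)                       # same axes object?
print(len(ax_first.get_lines()) - lines_before)    # how many lines were added
```

Both calls returning the same axes object is exactly why the new line lands in the old chart.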
Step 4: How to make a new figure
What to do?
Well, you need to use the object oriented interface of Matplotlib.
You can do that as follows.
fig1, ax1 = plt.subplots()
data['My col'].plot(ax=ax1)
This will produce what you are looking for: a new figure.
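Once you hold the axes object yourself, you can also label the chart through it. Here is a short sketch with made-up data (the demo DataFrame and the titles are assumptions for illustration) that creates two independent figures.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in data (assumption)
dates = pd.date_range("2020-01-02", periods=5, freq="D")
demo = pd.DataFrame(
    {"Close": [86.1, 88.6, 90.3, 93.8, 98.4], "Volume": [47.7e6, 88.9e6, 50.7e6, 89.4e6, 155.7e6]},
    index=dates,
)

# Each plt.subplots() call creates a fresh figure with its own axes
fig_a, ax_a = plt.subplots()
demo["Close"].plot(ax=ax_a)       # ax= tells pandas where to draw
ax_a.set_title("Close")
ax_a.set_ylabel("Price")

fig_b, ax_b = plt.subplots()
demo["Volume"].plot(ax=ax_b)
ax_b.set_title("Volume")
```

Because each series is sent to an explicit axes, the order in which you run the cells no longer decides where a line ends up.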
Step 5: Make multiple plots in one figure
This is getting fun.
How can you create multiple plots in one figure?
You set that up when you create the figure, by asking subplots for a grid of axes.
fig2, ax2 = plt.subplots(2, 2)
data['Open'].plot(ax=ax2[0, 0])
data['High'].plot(ax=ax2[0, 1])
data['Low'].plot(ax=ax2[1, 0])
data['Close'].plot(ax=ax2[1, 1])
plt.tight_layout()
Notice that subplots(2, 2) creates a 2-by-2 array of axes, and you pick one with ax2[row, column] when you plot.
This should result in this chart.
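With more panels, repeating one line per column gets tedious. A possible alternative is to loop over the axes array. The sketch below uses made-up numbers in a demo DataFrame (an assumption, not the tutorial's data) and adds a title to each panel so you can tell them apart.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in data (assumption)
dates = pd.date_range("2020-01-02", periods=5, freq="D")
demo = pd.DataFrame(
    {
        "Open": [84.9, 88.1, 88.1, 92.3, 94.7],
        "High": [86.1, 90.8, 90.3, 94.3, 99.7],
        "Low": [84.3, 87.4, 88.0, 90.7, 93.6],
        "Close": [86.1, 88.6, 90.3, 93.8, 98.4],
    },
    index=dates,
)

columns = ["Open", "High", "Low", "Close"]

# axes is a 2x2 NumPy array of Axes; sharex=True keeps the date axes aligned
fig, axes = plt.subplots(2, 2, sharex=True, figsize=(8, 6))

# axes.flat walks the grid row by row, so it pairs up with the column list
for ax, column in zip(axes.flat, columns):
    demo[column].plot(ax=ax)
    ax.set_title(column)

fig.tight_layout()
```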
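Step 6: Make a bar chart
A bar chart of the daily Volume can be made as follows. (Strictly speaking this is a bar chart with one bar per date, not a histogram.)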
fig3, ax3 = plt.subplots()
data.loc[:'2020-01-31', 'Volume'].plot.bar(ax=ax3)
Notice that we only take the first month of the Volume data here (data.loc[:'2020-01-31', 'Volume']).
This should result in this figure.
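The plot.bar call above draws one bar per trading day. If you instead want a histogram in the strict sense, meaning how often each range of Volume values occurs, a sketch could look like this. The volume series is synthetic and the choice of 10 bins is arbitrary. Both are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic daily volume (assumption, not the tutorial's data)
dates = pd.date_range("2020-01-02", periods=60, freq="D")
rng = np.random.default_rng(1)
volume = pd.Series(rng.integers(40_000_000, 160_000_000, 60).astype(float), index=dates, name="Volume")

fig_hist, ax_hist = plt.subplots()

# plot.hist groups the values into bins and counts how many fall into each
volume.plot.hist(bins=10, ax=ax_hist)
ax_hist.set_xlabel("Volume")
ax_hist.set_ylabel("Number of days")
```

Here the x-axis shows volume ranges rather than dates, so the time order of the data is no longer visible.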
Step 7: Save the figures
This is straightforward.
fig1.savefig("figure-1.png")
fig2.savefig("figure-2.png")
fig3.savefig("figure-3.png")
The figures above should now be available in the directory where you are running your Jupyter notebook.
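If you want more control over where the file goes and how it looks, savefig accepts a full path as well as options such as dpi and bbox_inches. The sketch below is one way to do it. The temporary directory, the file name figure-demo.png and the plotted numbers are all assumptions for illustration.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in data (assumption)
dates = pd.date_range("2020-01-02", periods=5, freq="D")
close = pd.Series([86.1, 88.6, 90.3, 93.8, 98.4], index=dates, name="Close")

fig, ax = plt.subplots()
close.plot(ax=ax)

# Hypothetical output location; replace with any directory you like
out_dir = Path(tempfile.mkdtemp())
out_path = out_dir / "figure-demo.png"

# dpi raises the resolution, bbox_inches="tight" trims surrounding whitespace
fig.savefig(out_path, dpi=150, bbox_inches="tight")
print(out_path.exists())
```

The file format is inferred from the extension, so changing the name to end in .pdf or .svg would save a vector version instead.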