Excel Automation with Simple Moving Average from Python

What will we cover in this tutorial?

We will retrieve the historic stock prices and calculate the moving average. Then we will export the data to Excel and insert a chart, but all done from Python.

See the in depth explanation in the YouTube video. It also gives advice on how to interpret the Simple Moving Averages (SMA).

Step 1: Read historic stock prices

We will use the Pandas-datarader to get the historic prices of NFLX (the ticker for Netflix).

import pandas_datareader as pdr
import datetime as dt

ticker = "NFLX"
start = dt.datetime(2019, 1, 1)

data = pdr.get_data_yahoo(ticker, start)
print(data.head())

And you will get the historic data for Netflix from January 1st, 2019.

	High	Low	Open	Close	Volume	Adj Close
Date						
2019-01-02	269.750000	256.579987	259.279999	267.660004	11679500	267.660004
2019-01-03	275.790009	264.429993	270.200012	271.200012	14969600	271.200012
2019-01-04	297.799988	278.540009	281.880005	297.570007	19330100	297.570007
2019-01-07	316.799988	301.649994	302.100006	315.339996	18620100	315.339996
2019-01-08	320.589996	308.010010	319.980011	320.269989	15359200	320.269989

Step 2: Understand Moving Average

We will calculate the Simple Moving Average as defined on Investopedia.

Simple Moving Average

The Simple Moving Average (Now just referred to as Moving Average or MA) is defined by a period of days.

That is, the MA of a period of 10 (MA10) will take the average value of the last 10 close prices. This is done in a rolling way, hence, we will get a MA10 for every trading day in our historic data, except the first 9 days in our dataset.

We can similarly calculate a MA50 and MA200, which is a Moving Average of the last 50 and 200 days, respectively.

Step 3: Calculating the Moving Averages

We can do that by using rolling and mean.

And it is magic.

data['MA10'] = data['Close'].rolling(10).mean()
data['MA50'] = data['Close'].rolling(50).mean()
data['MA200'] = data['Close'].rolling(200).mean()

print(data.tail())

That was easy, right?

	High	Low	Open	Close	Volume	Adj Close	MA10	MA50	MA200
Date									
2021-01-12	501.089996	485.670013	500.000000	494.250000	5990400	494.250000	515.297998	502.918599	477.08175
2021-01-13	512.349976	493.010010	495.500000	507.790009	5032100	507.790009	512.989999	503.559600	477.76590
2021-01-14	514.500000	499.579987	507.350006	500.859985	4177400	500.859985	510.616995	503.894399	478.39270
2021-01-15	506.320007	495.100006	500.000000	497.980011	5890200	497.980011	506.341998	504.109600	479.06220
2021-01-19	509.250000	493.540009	501.000000	501.769989	11996900	501.769989	504.232999	504.205999	479.72065

Step 4: Visualize it with Matplotlib

We can see the data with Matplotlib.

import matplotlib.pyplot as plt

data[['Close', 'MA10', 'MA50']].loc['2020-01-01':].plot()
plt.show()

Resulting in the following plot.

The output

Where you can see how the MA10 and MA50 move according to the price.

Step 5: Export to Excel

Now we will export the data to Excel.

For this we need to import Pandas and use the XlsxWriter engine, where you can find the details of the code.

The code can be found here.

import pandas as pd

data = data.loc['2020-01-01':]
data = data.iloc[::-1]
writer = pd.ExcelWriter("technical.xlsx", 
                        engine='xlsxwriter', 
                        date_format = 'yyyy-mm-dd', 
                        datetime_format='yyyy-mm-dd')

sheet_name = 'Moving Average'
data[['Close', 'MA10', 'MA50']].to_excel(writer, sheet_name=sheet_name)


worksheet = writer.sheets[sheet_name]
workbook = writer.book

# Create a format for a green cell
green_cell = workbook.add_format({
    'bg_color': '#C6EFCE',
    'font_color': '#006100'
})

# Create a format for a red cell
red_cell = workbook.add_format({
    'bg_color': '#FFC7CE',                            
    'font_color': '#9C0006'
})


# Set column width of Date
worksheet.set_column(0, 0, 15)


for col in range(1, 4):
    # Create a conditional formatted of type formula
    worksheet.conditional_format(1, col, len(data), col, {
        'type': 'formula',                                    
        'criteria': '=C2>=D2',
        'format': green_cell
    })

    # Create a conditional formatted of type formula
    worksheet.conditional_format(1, col, len(data), col, {
        'type': 'formula',                                    
        'criteria': '=C2<D2',
        'format': red_cell
    })

# Create a new chart object.
chart1 = workbook.add_chart({'type': 'line'})

# Add a series to the chart.
chart1.add_series({
        'name': "MA10",
        'categories': [sheet_name, 1, 0, len(data), 0],
        'values': [sheet_name, 1, 2, len(data), 2],
})

# Create a new chart object.
chart2 = workbook.add_chart({'type': 'line'})

# Add a series to the chart.
chart2.add_series({
        'name': 'MA50',
        'categories': [sheet_name, 1, 0, len(data), 0],
        'values': [sheet_name, 1, 3, len(data), 3],
})

# Combine and insert title, axis names
chart1.combine(chart2)
chart1.set_title({'name': sheet_name + " " + ticker})
chart1.set_x_axis({'name': 'Date'})
chart1.set_y_axis({'name': 'Price'})

# Insert the chart into the worksheet.
worksheet.insert_chart('F2', chart1)

writer.close()

Where the output will be something similar to this.

Generated Excel sheet

How to Plot Time Series with Matplotlib

What will we cover in this tutorial?

In this tutorial we will show how to visualize time series with Matplotlib. We will do that using Jupyter notebook and you can download the resources (the notebook and data used) from here.

Step 1: What is a time series?

I am happy you asked.

The easiest way to understand it, is to show it. If you downloaded the resources and started the Jupyter notebook execute the following lines.

import pandas as pd

data = pd.read_csv("stock_data.csv", index_col=0, parse_dates=True)

data.head()

This will produce the following output.

	High	Low	Open	Close	Volume	Adj Close
Date						
2020-01-02	86.139999	84.342003	84.900002	86.052002	47660500.0	86.052002
2020-01-03	90.800003	87.384003	88.099998	88.601997	88892500.0	88.601997
2020-01-06	90.311996	88.000000	88.094002	90.307999	50665000.0	90.307999
2020-01-07	94.325996	90.671997	92.279999	93.811996	89410500.0	93.811996
2020-01-08	99.697998	93.646004	94.739998	98.428001	155721500.0	98.428001

You notice the the far left column is called Date and that is the index. This index has a time value, in this case, a date.

Time series data is data “stamped” by a time. In this case, it is time indexed by dates.

The data you see is historic stock prices.

Step 2: How to visualize data with Matplotlib

The above data is kept in a DataFrame (Pandas data object), this makes it straight forward to visualize it.

import matplotlib.pyplot as plt
%matplotlib notebook

data.plot()

Which will result in a chart similar to this one.

Result

This is not impressive. It seems like something is wrong.

Actually, there is not. It just does what you ask for. It plots all the 6 columns all together in one chart. Because the Volume is such a high number, all the other columns are in the same brown line (the one that looks straight).

Step 3: Matplotlib has a functional and object oriented interface

This is often a bit confusing at first.

But Matplotlib has a functional and object oriented interface. We used the functional.

If you try to execute the following in your Jupyter notebook.

data['My col'] = data['Volume']*0.5
data['My col'].plot()

It would seem like nothing happened.

But then investigate your previous plot.

Previous plot

It got updated with a new line. Hence, instead of creating a new chart (or figure) it just added it to the existing one.

If you want to learn more about functional and object oriented way of using Matplotlib we recommend this tutorial.

Step 4: How to make a new figure

What to do?

Well, you need to use the object oriented interface of Matplotlib.

You can do that as follows.

fig1, ax1 = plt.subplots()
data['My col'].plot(ax=ax1)

Which will produce what you are looking for. A new figure.

The new figure

Step 5: Make multiple plots in one figure

This is getting fun.

How can you create multiple plots in one figure?

On creating you actually do that.

fig2, ax2 = plt.subplots(2, 2)

data['Open'].plot(ax=ax2[0, 0])
data['High'].plot(ax=ax2[0, 1])
data['Low'].plot(ax=ax2[1, 0])
data['Close'].plot(ax=ax2[1, 1])
plt.tight_layout()

Notice that subplots(2, 2) creates a 2 times 2 array of axis you can use to create a plot.

This should result in this chart.

Result

Step 6: Make a histogram

This can be done as follows.

fig3, ax3 = plt.subplots()

data.loc[:'2020-01-31', 'Volume'].plot.bar(ax=ax3)

Notice that we only take the first month of the Volume data here (data.loc[:’2020-01-31′, ‘Volume’]).

This should result in this figure.

Step 7: Save the figures

This is straight forward.

fig1.savefig("figure-1.png")
fig2.savefig("figure-2.png")
fig3.savefig("figure-3.png")

And the above figures should be available in the same location you are running your Jupyter notebook.

Next step

If you want to learn more about functional and object oriented way of using Matplotlib we recommend this tutorial.

Create Financial Analysis and With Multiple Sheets and Charts in Excel from Python

What will we cover in this tutorial?

We will get financial data from FPM cloud (Free stock API) for the last years and generate a 3 sheet Excel sheet with charts similar to this. All done from Python.

Step 1: Collect the data

In this tutorial we are only going to use the example data on Apple, that is free available without registering on FMP Cloud. If you want to do it on another stock yo will need to register on their site.

What we need is the income statement and cash flow statement. They are available as JSON on their page (income statement and cash flow statement).

As our main library we will use Pandas.

The data can be collected with the following code (NOTICE: The code syntax highlighter changes & to &amp; in the URLs => remove the amp;).

import pandas as pd

income_statement = pd.read_json("https://fmpcloud.io/api/v3/income-statement/AAPL?limit=120&apikey=demo")
income_statement.set_index('date', inplace=True)
cash_flow = pd.read_json("https://fmpcloud.io/api/v3/cash-flow-statement/AAPL?limit=120&apikey=demo")
cash_flow.set_index('date', inplace=True)

Notice that we set the index to be the date column. This makes the further work easier.

Step 2: Prepare the data

The next step we need to do is to make the necessary calculations and prepare the data.

We are only interested Revenue, Earnings Per Share (EPS) and Free Cash Flow (FCF). So let’s take that data and keep it in a DataFrame (The Pandas main data structure).

data = income_statement[['revenue', 'eps']].copy()
data['fcf'] = cash_flow['freeCashFlow']

Now the data should look something like this.

	        revenue	        eps	fcf
date			
2020-09-26	274515000000	3.3600	7.336500e+10
2019-09-28	260174000000	2.9925	5.889600e+10
2018-09-29	265595000000	3.0025	6.412100e+10
2017-09-30	229234000000	2.3175	5.080300e+10
2016-09-24	215639000000	2.0875	5.349700e+10

Step 3: Calculate Percentage change

We are actually not interested in the actual values. The only thing we care about is the percentage growth from year to year.

As the data is in reverse order, and we need to calculate from the beginning to the end. We first need to order the dates from old to newest.

Then it is straight forward to calculate using pct_change() method.

data = data.sort_index()

data['rev %-chg'] = data['revenue'].pct_change()
data['eps %-chg'] = data['eps'].pct_change()
data['fcf %-chg'] = data['fcf'].pct_change()

Now the data should look similar to this (or the tail of it).

	        revenue	        eps	fcf	        rev %-chg	      eps %-chg	fcf %-chg
date						
2016-09-24	215639000000	2.0875	5.349700e+10	-0.077342	-0.100216	-0.233326
2017-09-30	229234000000	2.3175	5.080300e+10	0.063045	0.110180	-0.050358
2018-09-29	265595000000	3.0025	6.412100e+10	0.158620	0.295577	0.262150
2019-09-28	260174000000	2.9925	5.889600e+10	-0.020411	-0.003331	-0.081487
2020-09-26	274515000000	3.3600	7.336500e+10	0.055121	0.122807	0.245670

Step 5: Preparing data used in each Excel sheet

The next step is to prepare the data for each sheet in the Excel document.

We only need the last 10 years of data and will use loc to get that. Then we create a DataFrame for each sheet.

data_set = data.loc['2010-01-01':]
data_set.index = data_set.index.year

rev = data_set[['revenue', 'rev %-chg']].copy()
eps = data_set[['eps', 'eps %-chg']].copy()
fcf = data_set[['fcf', 'fcf %-chg']].copy()

Notice that we set the index only to be the year.

Step 6: Creating the Excel sheet

To create the Excel sheet we will use Pandas with the XlsxWriter.

This can be done as follows.

def create_sheet(writer, df, sheet_name):
    df.to_excel(writer, sheet_name=sheet_name)

writer = pd.ExcelWriter("financials.xlsx", engine='xlsxwriter')
workbook = writer.book

create_sheet(writer, rev, 'Revenue')
create_sheet(writer, eps, 'EPS')
create_sheet(writer, fcf, 'FCF')

workbook.close()

This will create a 3-sheet Excel document. One sheet for each DataFrame (rev, eps, fcf).

The function create_sheet(…) is practical, as each DataFrame is similar and we need to create the same sheet for each DataFrame.

The result will be in the Excel document financial.xlsx.

Step 7: Creating the charts

Now it pays to have the function create_sheet(…).

The full Excel document can be created by the following code.

def create_sheet(writer, df, sheet_name):
    df.to_excel(writer, sheet_name=sheet_name)
    worksheet = writer.sheets[sheet_name]
    workbook = writer.book
      
    chart = workbook.add_chart({'type': 'line'})
    
    chart.add_series({
        'name': sheet_name,
        'categories': [sheet_name, 1, 0, 11, 0],
        'values': [sheet_name, 1, 1, 11, 1],
    })
    
    column_chart = workbook.add_chart({'type': 'column'})
    
    column_chart.add_series({
        'name': sheet_name,
        'categories': [sheet_name, 1, 0, 11, 0],
        'values': [sheet_name, 1, 2, 11, 2],
        'y2_axis': True,
    })
    
    chart.combine(column_chart)
    chart.set_title({'name': sheet_name})
    chart.set_x_axis({'name': 'Date'})
    chart.set_y_axis({'name': 'Value'})
    column_chart.set_y2_axis({'name': '%-chg'})
    
    worksheet.insert_chart("E1", chart)

writer = pd.ExcelWriter("financials.xlsx", engine='xlsxwriter')
workbook = writer.book

create_sheet(writer, rev, 'Revenue')
create_sheet(writer, eps, 'EPS')
create_sheet(writer, fcf, 'FCF')

workbook.close()writer = pd.ExcelWriter("financials.xlsx", engine='xlsxwriter')
workbook = writer.book

create_sheet(writer, rev, 'Revenue')
create_sheet(writer, eps, 'EPS')
create_sheet(writer, fcf, 'FCF')

workbook.close()

The details of creating the charts in Excel can be found on XlsxWriter. Basically, it is sending informations to the XlsxWriter engine through dictionaries. You need to send the values that you would set in Excel, if you were working inside there.

Again, the result will be in the Excel document financial.xlsx.