In this tutorial we will explore some aspects of Pandas-Datareader, an invaluable way to get data from many sources, including the World Bank and Yahoo! Finance.
Specifically, we will investigate whether the GDP of a country is correlated with the stock market.
In the previous tutorial we looked at GDP per capita and compared it between countries. GDP per capita is a good way to compare the economies of different countries.
In this tutorial we will look at total GDP, using the NY.GDP.MKTP.CD indicator, which is GDP in current US$.
We can extract the data by using the download function from the Pandas-datareader library.
from pandas_datareader import wb
gdp = wb.download(indicator='NY.GDP.MKTP.CD', country='US', start=1990, end=2019)
print(gdp)
This results in the following output.
                    NY.GDP.MKTP.CD
country       year
United States 2019  21427700000000
              2018  20580223000000
              2017  19485393853000
              2016  18707188235000
              2015  18219297584000
              2014  17521746534000
              2013  16784849190000
              2012  16197007349000
              2011  15542581104000
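The output above is indexed by two levels, country and year. To get a feel for that layout without calling the World Bank API, here is a minimal sketch that rebuilds three of the printed rows by hand. The frame is a hypothetical stand-in: the values are copied from the output above, and the index layout (year stored as a string) is an assumption based on that output.

```python
import pandas as pd

# Hand-built stand-in for the wb.download() result (assumed layout:
# rows indexed by country and year-as-string, one indicator column).
index = pd.MultiIndex.from_tuples(
    [("United States", "2019"), ("United States", "2018"), ("United States", "2017")],
    names=["country", "year"],
)
gdp = pd.DataFrame(
    {"NY.GDP.MKTP.CD": [21427700000000, 20580223000000, 19485393853000]},
    index=index,
)

# Look up a single value with a (country, year) key.
value_2018 = gdp.loc[("United States", "2018"), "NY.GDP.MKTP.CD"]

# Pull out one country as a plain series indexed by year, oldest first.
us_series = gdp["NY.GDP.MKTP.CD"].xs("United States", level="country").sort_index()

print(value_2018)
print(us_series)
```

Note that the year is a string here, not a number or a date, which is why it has to be converted before it can be lined up with daily stock prices.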
Then we need to gather data from the stock market. Since we are looking at the US stock market, the S&P 500 index is a good indicator of the market.
The ticker of S&P 500 is ^GSPC (yes, with the ^).
The Yahoo! Finance API is a great place to collect this type of data.
import pandas_datareader as pdr
import datetime as dt
start = dt.datetime(1990, 1, 1)
end = dt.datetime(2019, 12, 31)
sp500 = pdr.get_data_yahoo("^GSPC", start, end)['Adj Close']
print(sp500)
This results in the following output.
Date
1990-01-02 359.690002
1990-01-03 358.760010
1990-01-04 355.670013
1990-01-05 352.200012
1990-01-08 353.790009
...
2019-12-24 3223.379883
2019-12-26 3239.909912
2019-12-27 3240.020020
2019-12-30 3221.290039
2019-12-31 3230.780029
A good way to see if there is a correlation is simply by visualizing it.
This can be done with a few tweaks.
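import pandas_datareader as pdr
import pandas as pd
import matplotlib.pyplot as plt
import datetime as dt
from pandas_datareader import wb
# Download yearly US GDP in current US$ from the World Bank
gdp = wb.download(indicator='NY.GDP.MKTP.CD', country='US', start=1990, end=2019)
# Reshape to one column per country, indexed by year;
# drop=True discards the indicator-name level so only numeric columns remain
gdp = gdp.unstack().T.reset_index(0, drop=True)
# Convert the year strings to timestamps (January 1 of each year)
gdp.index = pd.to_datetime(gdp.index, format='%Y')
# Download daily S&P 500 prices from Yahoo! Finance
start = dt.datetime(1990, 1, 1)
end = dt.datetime(2019, 12, 31)
sp500 = pdr.get_data_yahoo("^GSPC", start, end)['Adj Close']
# The outer join keeps every date from both series; interpolation fills the gaps
data = sp500.to_frame().join(gdp, how='outer')
data = data.interpolate(method='linear')
# Plot the S&P 500 on the left y-axis and US GDP on the secondary (right) y-axis
ax = data['Adj Close'].plot()
ax = data['United States'].plot(ax=ax, secondary_y=True)
plt.show()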
The GDP data needs to be reshaped by unstacking, transposing, and resetting the index. Then the index is converted from year strings into actual timestamps, so it can be aligned with the stock data.
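The reshaping is easier to follow one step at a time. The sketch below applies the steps to a small hand-built frame. The frame is a hypothetical stand-in for the World Bank download, with values copied from the output shown earlier. It passes drop=True to reset_index so that the indicator name does not linger as an extra text column.

```python
import pandas as pd

# Hand-built stand-in for the wb.download() result (assumed layout).
index = pd.MultiIndex.from_product(
    [["United States"], ["2019", "2018", "2017"]], names=["country", "year"]
)
gdp = pd.DataFrame(
    {"NY.GDP.MKTP.CD": [21427700000000, 20580223000000, 19485393853000]},
    index=index,
)

# Step 1: unstack() moves the innermost index level (year) into the columns.
wide = gdp.unstack()
# Step 2: .T flips the frame, so rows are (indicator, year) and columns are countries.
tall = wide.T
# Step 3: drop index level 0 (the indicator name), leaving the year as the index.
by_year = tall.reset_index(0, drop=True)
# Step 4: parse the year strings into timestamps (January 1 of each year).
by_year.index = pd.to_datetime(by_year.index, format="%Y")

print(by_year)
```

The result has one column per country and a DatetimeIndex, which is the shape needed to join it with a daily price series.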
We use an outer join to get all the dates from both time series. Then we interpolate linearly to fill the gaps between the yearly GDP values in the graph.
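To see what the outer join and the interpolation do, here is a small sketch on made-up numbers. The series names mirror the ones in the tutorial, but the dates and values are invented for illustration only.

```python
import numpy as np
import pandas as pd

# Made-up daily series standing in for the S&P 500 prices.
daily = pd.Series(
    [100.0, 101.0, 102.0, 103.0],
    index=pd.to_datetime(["2000-01-03", "2000-01-04", "2000-01-05", "2000-01-06"]),
    name="Adj Close",
)
# Made-up sparse series standing in for yearly GDP; its first date
# (a Saturday) does not appear in the daily series at all.
sparse = pd.DataFrame(
    {"United States": [10.0, 50.0]},
    index=pd.to_datetime(["2000-01-01", "2000-01-06"]),
)

# The outer join keeps the union of both indexes, sorted by date.
joined = daily.to_frame().join(sparse, how="outer")
# method="linear" fills interior gaps, treating rows as equally spaced.
filled = joined.interpolate(method="linear")

print(joined)
print(filled)
```

The sparse column is filled with evenly stepped values between its two known points. The first row of Adj Close stays empty, because there is no earlier value to interpolate from.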
Finally, we plot the Adj Close of the S&P 500 index and the GDP of the United States in the same graph, using the secondary y-axis for the GDP. That way, both series share the same time axis on the x-axis.
In the resulting graph, the two series could look correlated, which is most visible in the aftermath of 2008.
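The secondary y-axis technique used for the plot can be tried without downloading anything. The sketch below uses invented numbers and hypothetical column names (price and gdp), and it selects the non-interactive Agg backend so that it also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window is opened
import matplotlib.pyplot as plt
import pandas as pd

# Invented data: two series on very different scales sharing one date index.
dates = pd.date_range("2000-01-01", periods=5, freq="D")
toy = pd.DataFrame(
    {
        "price": [100.0, 102.0, 101.0, 105.0, 107.0],
        "gdp": [1.0e13, 1.1e13, 1.2e13, 1.3e13, 1.4e13],
    },
    index=dates,
)

# The first plot creates the axes with the left-hand y-axis.
ax_left = toy["price"].plot()
# secondary_y=True draws on a twin axes with its own right-hand y-axis.
ax_right = toy["gdp"].plot(ax=ax_left, secondary_y=True)
ax_left.set_ylabel("price")
ax_right.set_ylabel("gdp")
```

Because each series gets its own y-axis, the shapes of the two curves can be compared even though their values differ by many orders of magnitude.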
Let’s try to make some correlation calculations.
First, let’s not just rely on how US GDP correlates with the US stock market. Let us also relate it to other countries’ GDP and see how they relate to the strongest economy in the world.
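import pandas_datareader as pdr
import pandas as pd
import matplotlib.pyplot as plt
import datetime as dt
from pandas_datareader import wb
# Download yearly GDP in current US$ for several countries
gdp = wb.download(indicator='NY.GDP.MKTP.CD', country=['NO', 'FR', 'US', 'GB', 'DK', 'DE', 'SE'], start=1990, end=2019)
# Reshape to one numeric column per country, indexed by year
gdp = gdp.unstack().T.reset_index(0, drop=True)
gdp.index = pd.to_datetime(gdp.index, format='%Y')
# Download daily S&P 500 prices from Yahoo! Finance
start = dt.datetime(1990, 1, 1)
end = dt.datetime(2019, 12, 31)
sp500 = pdr.get_data_yahoo("^GSPC", start, end)['Adj Close']
# Join on all dates and fill the gaps before computing correlations
data = sp500.to_frame().join(gdp, how='outer')
data = data.interpolate(method='linear')
# Pairwise correlation between all columns
print(data.corr())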
Here we include the GDP of several more countries to test our hypothesis. The output is the following correlation matrix.
                Adj Close   Denmark    France   Germany    Norway    Sweden  United Kingdom  United States
Adj Close        1.000000  0.729701  0.674506  0.727289  0.653507  0.718829        0.759239       0.914303
Denmark          0.729701  1.000000  0.996500  0.986769  0.975780  0.978550        0.955674       0.926139
France           0.674506  0.996500  1.000000  0.982225  0.979767  0.974825        0.945877       0.893780
Germany          0.727289  0.986769  0.982225  1.000000  0.953131  0.972542        0.913443       0.916239
Norway           0.653507  0.975780  0.979767  0.953131  1.000000  0.978784        0.933795       0.878704
Sweden           0.718829  0.978550  0.974825  0.972542  0.978784  1.000000        0.930621       0.916530
United Kingdom   0.759239  0.955674  0.945877  0.913443  0.933795  0.930621        1.000000       0.915859
United States    0.914303  0.926139  0.893780  0.916239  0.878704  0.916530        0.915859       1.000000
Now that is interesting. The US stock market (Adj Close) correlates most strongly with the US GDP (0.914). Not surprising.
Of the other chosen countries, the GDP of the United Kingdom is the second most correlated with the US stock market (0.759), followed by Denmark (0.730). The GDPs of all the countries correlate strongly with the US GDP, with Norway correlating the least (0.879).
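Reading a ranking off a wide matrix by eye is error-prone, so it can help to sort one column. The sketch below does that for the Adj Close column, with the values typed in from the matrix printed above rather than downloaded again.

```python
import pandas as pd

# Correlations with the S&P 500 (Adj Close), copied from the printed matrix.
adj_close_corr = pd.Series(
    {
        "Denmark": 0.729701,
        "France": 0.674506,
        "Germany": 0.727289,
        "Norway": 0.653507,
        "Sweden": 0.718829,
        "United Kingdom": 0.759239,
        "United States": 0.914303,
    },
    name="Adj Close",
)

# Sort from the strongest to the weakest correlation.
ranked = adj_close_corr.sort_values(ascending=False)
print(ranked)
```

With the full data frame from the tutorial, the same ranking should be obtainable with data.corr()['Adj Close'].drop('Adj Close').sort_values(ascending=False).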
Continue the exploration of World Bank data.