    How to Calculate Correlation between Stock Price Movements with Python

    What will we cover?

    In this lesson we will learn about the correlation of assets, how to calculate it, and how it relates to risk.

    The learning objectives of this tutorial are:

    • What is correlation and how to use it
    • Calculate correlation
    • Find negatively correlated assets

    Step 1: What is Correlation

    Correlation is a statistic that measures the degree to which two variables move in relation to each other. Correlation measures association, but doesn’t show if x causes y or vice versa.

    The correlation between two stocks is a number from -1 to 1 (both inclusive).

    • A positive correlation means that when stock x goes up, we expect stock y to go up, and vice versa.
    • A negative correlation means that when stock x goes up, we expect stock y to go down, and vice versa.
    • A zero correlation means that we cannot say anything about how the two stocks move in relation to each other.
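
    The three cases above can be illustrated with a minimal sketch. It uses synthetic random returns rather than real stock data, and the column names (moves_with_x, moves_against_x, unrelated) are hypothetical labels chosen for this example only.

```python
import numpy as np
import pandas as pd

# Synthetic data only: random numbers standing in for daily returns
rng = np.random.default_rng(0)
x = rng.normal(0, 0.01, 1000)        # returns of a hypothetical "stock x"
noise = rng.normal(0, 0.01, 1000)    # independent noise

returns = pd.DataFrame({
    "x": x,
    "moves_with_x": x + 0.5 * noise,        # mostly follows x
    "moves_against_x": -x + 0.5 * noise,    # mostly mirrors x
    "unrelated": rng.normal(0, 0.01, 1000), # independent of x
})

# Correlation of every column with x
corr_with_x = returns.corr()["x"]
print(corr_with_x)
```

    The first series should come out clearly positive, the second clearly negative, and the third close to zero.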

    The formula for calculating the correlation is quite a mouthful.
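
    For reference, and assuming the formula meant here is the standard Pearson correlation coefficient (the default method of the pandas corr() function), it can be written for two series x and y with n observations and means \bar{x} and \bar{y} as:

```latex
\rho_{x,y} =
  \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
       {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,
        \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```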

    Step 2: Calculate the Correlation with DataFrames (pandas)

    Luckily, a pandas DataFrame can calculate it for us, so we do not need to implement the formula ourselves.

    Let’s get started. First, we need to load some time series of historic stock prices.

    See this tutorial on how to work with portfolios.

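    import pandas as pd
    import pandas_datareader as pdr
    import datetime as dt
    import numpy as np

    tickers = ['AAPL', 'TWTR', 'IBM', 'MSFT']
    start = dt.datetime(2020, 1, 1)

    # Download daily price data for all tickers from the start date until today
    data = pdr.get_data_yahoo(tickers, start)
    # Keep only the adjusted close prices (one column per ticker)
    data = data['Adj Close']

    # Daily log returns: log(price today / price yesterday); the first row is NaN
    log_returns = np.log(data/data.shift())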
    

    The last line also calculates the daily log returns.
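
    To see what that expression does, here is a small sketch on made-up prices for a single hypothetical stock. The numbers are invented for illustration only.

```python
import numpy as np
import pandas as pd

# Made-up closing prices for one hypothetical stock
prices = pd.Series([100.0, 110.0, 99.0, 105.0])

# Same expression as in the tutorial: log of today's price over yesterday's
log_returns = np.log(prices / prices.shift())

# The first value is NaN because there is no previous day to compare with
print(log_returns)

# Log returns add up: their sum equals the log return over the whole period
# (sum() skips the NaN in the first row)
total = log_returns.sum()
print(np.isclose(total, np.log(prices.iloc[-1] / prices.iloc[0])))
```

    The additivity shown in the last lines is one common reason to work with log returns instead of simple percentage returns.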

    The correlation can be calculated as follows.

    log_returns.corr()
    

    That was easy, right? Remember that we calculate it on the log returns, not on the raw prices, to keep all the stocks on the same range.

    Symbols      AAPL      TWTR       IBM      MSFT
    Symbols
    AAPL     1.000000  0.531973  0.518204  0.829547
    TWTR     0.531973  1.000000  0.386493  0.563909
    IBM      0.518204  0.386493  1.000000  0.583205
    MSFT     0.829547  0.563909  0.583205  1.000000
    

    We see that the correlation on the diagonal is 1.0. This is expected, since the diagonal shows the correlation of each stock with itself (AAPL and AAPL, and so forth).

    Other than that, we can conclude that AAPL and MSFT are the most correlated pair, with a correlation of about 0.83.
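
    With only four tickers the strongest pair is easy to spot by eye, but for a larger matrix it can be found programmatically. The following is one possible sketch; it rebuilds the matrix from the values printed above so that it runs without downloading any data.

```python
import numpy as np
import pandas as pd

tickers = ["AAPL", "TWTR", "IBM", "MSFT"]
# Correlation values copied from the output printed in this tutorial
corr = pd.DataFrame(
    [[1.000000, 0.531973, 0.518204, 0.829547],
     [0.531973, 1.000000, 0.386493, 0.563909],
     [0.518204, 0.386493, 1.000000, 0.583205],
     [0.829547, 0.563909, 0.583205, 1.000000]],
    index=tickers, columns=tickers,
)

# Keep only the cells above the diagonal (k=1) so each pair appears once
# and the trivial 1.0 entries are masked out as NaN
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Flatten to a Series indexed by (row, column), remove the masked cells,
# and sort from the highest to the lowest correlation
pairs = upper.stack().dropna().sort_values(ascending=False)

print(pairs)
print(pairs.index[0], pairs.iloc[0])
```

    The first entry of pairs is the most correlated pair and the last entry is the least correlated one.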

    Step 3: Calculate the correlation to the general market

    Let’s add the S&P 500 to our DataFrame.

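    # ^GSPC is the Yahoo! Finance ticker for the S&P 500 index
    sp500 = pdr.get_data_yahoo("^GSPC", start)

    # Add the index's daily log returns as a new column (pandas aligns it on the date index)
    log_returns['SP500'] = np.log(sp500['Adj Close']/sp500['Adj Close'].shift())

    log_returns.corr()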
    

    In the resulting correlation matrix, we see that AAPL and MSFT are the stocks most correlated with the S&P 500 index. This is not surprising, as they make up a large part of the market-cap weight of the index.
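
    To read such a matrix more easily, one option is to pull out just the SP500 column and sort it. The sketch below does this on synthetic data, not on real prices: the "market" series is random, and TRACKS_MARKET and LOOSELY_TIED are hypothetical stocks invented for the example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 500
market = rng.normal(0, 0.01, n)  # synthetic "index" log returns

# Hypothetical stocks: each one is the market move plus its own noise
log_returns = pd.DataFrame({
    "TRACKS_MARKET": market + rng.normal(0, 0.003, n),  # little own noise
    "LOOSELY_TIED": market + rng.normal(0, 0.03, n),    # a lot of own noise
})
log_returns["SP500"] = market

# Take the SP500 column of the correlation matrix, drop the trivial 1.0
# entry for SP500 itself, and sort from most to least correlated
to_market = log_returns.corr()["SP500"].drop("SP500").sort_values(ascending=False)
print(to_market)
```

    The same last line should work unchanged on the log_returns DataFrame from this step, since it only relies on the SP500 column being present.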

    Step 4: Find Negatively Correlated Assets when Investing using Python

    We will add a helper function that shows how a new ticker correlates with the assets we already have.

    Here, we are particularly interested in negative correlations.

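    def test_correlation(ticker):
        # Download the prices for the new ticker from the same start date
        df = pdr.get_data_yahoo(ticker, start)
        # Work on a copy so the global log_returns DataFrame is left unchanged
        lr = log_returns.copy()
        lr[ticker] = np.log(df['Adj Close']/df['Adj Close'].shift())
        # Correlation matrix including the new ticker
        return lr.corr()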
    

    This can help us find assets with a negative correlation.

    Now, let’s test.

    test_correlation("TLT")
    

    In the resulting correlation matrix, we find the negative correlation we are looking for.
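
    Instead of scanning the matrix by eye, the negative entries can also be filtered out in code. The following is a minimal sketch on synthetic data; negative_correlations is a hypothetical helper that is not part of the tutorial's code, and STOCK_A, STOCK_B and HEDGE are made-up series.

```python
import numpy as np
import pandas as pd

def negative_correlations(log_returns, ticker):
    """Return the columns negatively correlated with `ticker`, most negative first."""
    corr = log_returns.corr()[ticker].drop(ticker)
    return corr[corr < 0].sort_values()

# Synthetic data: two series that share a common driver, one that moves against it
rng = np.random.default_rng(2)
n = 500
base = rng.normal(0, 0.01, n)
log_returns = pd.DataFrame({
    "STOCK_A": base + rng.normal(0, 0.005, n),
    "STOCK_B": base + rng.normal(0, 0.005, n),
    "HEDGE": -0.5 * base + rng.normal(0, 0.005, n),
})

result = negative_correlations(log_returns, "STOCK_A")
print(result)
```

    Only the HEDGE series should be listed here, because STOCK_B moves together with STOCK_A.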

    Step 5: Visualize the negative correlation

    This can be visualized to get a better understanding as follows.

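    import matplotlib.pyplot as plt
    # Jupyter magic for interactive plots (only works inside a notebook)
    %matplotlib notebook

    def visualize_correlation(ticker1, ticker2):
        df = pdr.get_data_yahoo([ticker1, ticker2], start)
        df = df['Adj Close']
        # Normalize both series to start at 1.0 so they can be compared on one axis
        df = df/df.iloc[0]
        fig, ax = plt.subplots()
        df.plot(ax=ax)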
    

    Calling visualize_correlation("AAPL", "TLT") plots the two normalized price series in one chart, where we see that when AAPL goes down, TLT goes up.

    The same can be done for the S&P 500 index and TLT with visualize_correlation("^GSPC", "TLT").
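
    The key line in visualize_correlation is df = df/df.iloc[0], which divides every column by its first value. A small sketch with made-up prices (EXPENSIVE and CHEAP are hypothetical tickers) shows the effect without any plotting or downloading.

```python
import pandas as pd

# Made-up prices on very different scales
prices = pd.DataFrame({
    "EXPENSIVE": [400.0, 420.0, 380.0],
    "CHEAP": [20.0, 19.0, 22.0],
})

# Divide each column by its first price; pandas matches the row to the columns
normalized = prices / prices.iloc[0]
print(normalized)
```

    After normalization both columns start at 1.0, so a value of 1.05 reads as "up 5% since the first day" regardless of the original price level.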

    Want to learn more?

    This is part of a 2.5-hour full video course in 8 parts about Risk and Return.

    In the next lesson you will learn how to use Linear Regression to Calculate the Beta to the General Market (S&P 500).
