    3 Easy Steps to Get Started With Machine Learning: Understand the Concept and Implement Linear Regression in Python

    What will we cover in this article?

    • What is Machine Learning and how can it help you?
    • How does Machine Learning work?
    • A first example of Linear Regression in Python

    Step 1: How can Machine Learning help you?

    Machine Learning is a hot topic these days, and it is easy to get confused when people talk about it. But what is Machine Learning and how can it help you?

    I found the following explanation quite good.

    Classical vs modern (No machine learning vs machine learning) approach to predictions.

    In the classical computing model everything is programmed into the algorithms. This has the limitation that all decision logic needs to be understood before usage. And if things change, we need to modify the program.

    With the modern computing model (Machine Learning) this paradigm changes. We feed the algorithms with data, and based on that data, the program makes its decisions.
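    The difference between the two models can be sketched in a few lines. This is only an illustration: the pricing rule and the toy data are made up for this example, and LinearRegression from scikit-learn stands in for "the algorithm".

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Classical approach: the decision logic is written by hand.
# The rule (3 per unit of size plus a base of 10) is a made-up example.
def classical_price(size):
    return 3 * size + 10

# Machine Learning approach: the rule is learned from example data.
# These toy observations are hypothetical and follow price = 3 * size + 10.
sizes = np.array([[10], [20], [30], [40]])
prices = np.array([40, 70, 100, 130])

model = LinearRegression()
model.fit(sizes, prices)

learned_price = model.predict([[50]])[0]
print(classical_price(50), learned_price)
```

    If the relationship between size and price changes, the hand-written function has to be rewritten, while the model only needs to be fitted again on new data.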

    While this can seem abstract, it is a big change in how we think about programming. Machine Learning has helped computers solve problems like:

    • Improved search engine results.
    • Voice recognition.
    • Number plate recognition.
    • Categorisation of pictures.
    • …and the list goes on.

    Step 2: How does Machine Learning work?

    I’m glad you asked. I was wondering about that myself.

    On a high level you can divide Machine Learning into two phases.

    • Phase 1: Learning
    • Phase 2: Prediction

    The Learning phase is divided into four steps.

    Machine Learning: The Learning Phase: Training data, Pre-processing, Learning, Testing

    It all starts with a training set (training data). This data set should represent the type of data that the Machine Learning model will be used to predict from in Phase 2 (prediction).

    The pre-processing step is about cleaning up data. While Machine Learning is awesome, it cannot figure out what good data looks like. You need to do the cleaning yourself, as well as transform the data into the desired format.
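    As a rough idea of what cleaning can look like, here is a minimal sketch. The raw values and the clean helper are hypothetical; real data sets usually need more than this.

```python
# Hypothetical raw measurements: strings with stray spaces and missing values.
raw_sizes = ["50", " 60 ", None, "35", "n/a", "55"]

def clean(values):
    """Keep only entries that can be parsed as numbers, converted to float."""
    cleaned = []
    for value in values:
        if value is None:
            continue
        try:
            cleaned.append(float(value.strip()))
        except ValueError:
            # Skip entries such as "n/a" that are not numbers
            continue
    return cleaned

sizes = clean(raw_sizes)
print(sizes)
```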

    Then for the magic, the learning step. There are three main paradigms in machine learning.

    • Supervised: you tell the algorithm the right answer for each data item. Each data item from the training set is tagged with the right answer (for example, its category).
    • Unsupervised: the learning algorithm is not given the right answers and has to find structure in the data itself.
    • Reinforcement: the algorithm learns which actions to take based on the rewards it received for past actions.
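    The first two paradigms can be contrasted with a small sketch. The toy data is made up, and LinearRegression and KMeans from scikit-learn are just one possible choice of supervised and unsupervised algorithm. Reinforcement learning needs an environment that hands out rewards, so it is left out here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: six items described by a single feature.
features = np.array([[1.0], [1.2], [0.8], [9.0], [9.5], [8.7]])

# Supervised: every item comes with the right answer.
answers = np.array([10.0, 12.0, 8.0, 90.0, 95.0, 87.0])
supervised = LinearRegression().fit(features, answers)

# Unsupervised: no answers are given; KMeans groups similar items itself.
unsupervised = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)
groups = unsupervised.labels_
print(groups)
```

    The supervised model can only be fitted because answers exists. KMeans never sees it and still separates the small values from the large ones.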

    Finally, testing is done to see if the model is good. Before learning, the available data is divided into a training set and a test set. The model learns from the training set only, and the test set is then used to check how well it predicts data it has not seen. If it predicts poorly, a new model might be necessary.
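    Dividing the data could look like the sketch below. It assumes scikit-learn's train_test_split helper; the inputs, the answers, and the 30% test share are arbitrary choices for illustration.

```python
from sklearn.model_selection import train_test_split

# Hypothetical data: ten inputs and their matching answers.
inputs = [[i] for i in range(10)]
answers = [2 * i for i in range(10)]

# Hold back 30% of the data for testing; random_state makes the split repeatable.
train_x, test_x, train_y, test_y = train_test_split(
    inputs, answers, test_size=0.3, random_state=42
)
print(len(train_x), len(test_x))
```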

    After that the Prediction Phase begins.

    How Machine Learning predicts new data.

    Once the model has been created, it is used to make predictions from new data.

    Step 3: A first example of Linear Regression in Python

    Installing the libraries

    Linear regression is a linear approach to modelling the relationship between a scalar response and one or more explanatory variables. In this example we model a single variable. Put another way, we want to fit a line (y = a*x + b) to points on a graph.

    To do that, we need to import various libraries.

    # Importing matplotlib to make a plot
    import matplotlib.pyplot as plt
    # numpy lets us work with numbers as arrays
    import numpy as np
    # linear_model contains the LinearRegression model
    from sklearn import linear_model
    

    The matplotlib library is used here to make a plot. It is a comprehensive library for creating static, animated, and interactive visualizations in Python. If you do not have it installed, you can install it by typing the following command in a terminal.

    pip install matplotlib
    

    NumPy is a powerful library for computing with N-dimensional arrays. If needed, you can install it by typing the following command in a terminal.

    pip install numpy
    

    Finally, you need the linear_model from the sklearn library, which you can install by typing the following command in a terminal.

    pip install scikit-learn
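
    To check that the three installations worked, one option is to import each package and print its version. This is only a quick sanity check; the versions printed depend on your environment.

```python
import matplotlib
import numpy
import sklearn

# Each package exposes its version as a string
for package in (matplotlib, numpy, sklearn):
    print(package.__name__, package.__version__)
```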
    

    Training data set

    This simple example will let you make a linear regression on the following data set.

    # data set
    prices = [245, 312, 279, 308, 199, 409, 200, 400, 230]
    size = [50, 60, 35, 55, 30, 65, 30, 75, 25]
    

    Here some items are sold, and each item has a size. The first item was sold for 245 ($) and had a size of 50 (something). The next item was sold for 312 ($) and had a size of 60 (something).

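    The sizes need to be reshaped before we model them, because scikit-learn expects the input as a 2D array with one row per item and one column per variable.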

    # Importing matplotlib, numpy and sklearn
    import matplotlib.pyplot as plt
    # numpy lets us work with numbers as arrays
    import numpy as np
    # linear_model contains the LinearRegression model
    from sklearn import linear_model
    # data set
    prices = [245, 312, 279, 308, 199, 409, 200, 400, 230]
    size = [50, 60, 35, 55, 30, 65, 30, 75, 25]
    # reshape the input for regression: -1 infers the number of rows (items), 1 is the number of columns
    size2 = np.array(size).reshape((-1, 1))
    print(size2)
    

    Which results in the following output.

    [[50]
     [60]
     [35]
     [55]
     [30]
     [65]
     [30]
     [75]
     [25]]
    

    Hence, reshape((-1, 1)) transforms the flat list of 9 values into a 2D array with 9 rows and a single column. The -1 tells NumPy to work out the number of rows itself.
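
    The effect of the reshape is easiest to see by comparing the shape of the array before and after. A small sketch using the same sizes (the names flat and column are only for this illustration):

```python
import numpy as np

size = [50, 60, 35, 55, 30, 65, 30, 75, 25]

flat = np.array(size)           # 1D array with 9 values
column = flat.reshape((-1, 1))  # 2D array: 9 rows, 1 column

print(flat.shape)    # (9,)
print(column.shape)  # (9, 1)
```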

    Then for the linear regression.

    # Importing matplotlib, numpy and sklearn
    import matplotlib.pyplot as plt
    # numpy lets us work with numbers as arrays
    import numpy as np
    # linear_model contains the LinearRegression model
    from sklearn import linear_model
    # data set
    prices = [245, 312, 279, 308, 199, 409, 200, 400, 230]
    size = [50, 60, 35, 55, 30, 65, 30, 75, 25]
    # reshape the input for regression: -1 infers the number of rows (items), 1 is the number of columns
    size2 = np.array(size).reshape((-1, 1))
    print(size2)
    # create the model and fit it to the data
    regr = linear_model.LinearRegression()
    regr.fit(size2, prices)
    print("Coefficients", regr.coef_)
    print("Intercept", regr.intercept_)
    

    Which prints out the coefficient (a) and the intercept (b) of a formula y = a*x + b.
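
    If you want to see where a and b come from, they can also be computed by hand. LinearRegression fits by ordinary least squares, so the sketch below should give the same values up to rounding. It uses the textbook formula for a single variable and np.polyfit as a cross-check; the variable names are chosen for this illustration.

```python
import numpy as np

prices = [245, 312, 279, 308, 199, 409, 200, 400, 230]
size = [50, 60, 35, 55, 30, 65, 30, 75, 25]

x = np.array(size, dtype=float)
y = np.array(prices, dtype=float)

# Closed-form least squares for a single variable
a = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b = y.mean() - a * x.mean()

# np.polyfit with degree 1 returns the slope first, then the intercept
a_check, b_check = np.polyfit(x, y, 1)
print(a, b)
```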

    Now you can predict future prices, when given a size.

    # How to predict
    size_new = 60
    price = size_new * regr.coef_ + regr.intercept_
    print(price)
    print(regr.predict([[size_new]]))
    

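    You can either compute the price directly from the formula (the line starting with price =) or use the regression model (the regr.predict call). Note that regr.predict expects a 2D input, which is why size_new is wrapped in double brackets.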
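
    The model can also predict several prices in one call, as long as the sizes are arranged as one row per item. A minimal sketch, where the three new sizes are made up for this example:

```python
import numpy as np
from sklearn import linear_model

prices = [245, 312, 279, 308, 199, 409, 200, 400, 230]
size = [50, 60, 35, 55, 30, 65, 30, 75, 25]

regr = linear_model.LinearRegression()
regr.fit(np.array(size).reshape((-1, 1)), prices)

# Hypothetical new sizes, arranged as one row per item
new_sizes = np.array([40, 60, 80]).reshape((-1, 1))
predicted = regr.predict(new_sizes)

for s, p in zip(new_sizes.ravel(), predicted):
    print(s, round(float(p), 2))
```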

    Finally, you can plot the linear regression as a graph.

    # Importing matplotlib, numpy and sklearn
    import matplotlib.pyplot as plt
    # numpy lets us work with numbers as arrays
    import numpy as np
    # linear_model contains the LinearRegression model
    from sklearn import linear_model
    # data set
    prices = [245, 312, 279, 308, 199, 409, 200, 400, 230]
    size = [50, 60, 35, 55, 30, 65, 30, 75, 25]
    # reshape the input for regression: -1 infers the number of rows (items), 1 is the number of columns
    size2 = np.array(size).reshape((-1, 1))
    print(size2)
    # create the model and fit it to the data
    regr = linear_model.LinearRegression()
    regr.fit(size2, prices)
    # Here we plot the graph: the fitted line y = a*x + b for sizes 20 to 99
    x = np.arange(20, 100)
    y = regr.coef_ * x + regr.intercept_
    plt.plot(x, y)
    # the original data points
    plt.scatter(size, prices, color='black')
    plt.ylabel('prices')
    plt.xlabel('size')
    plt.show()
    

    Which results in the following graph.

    Example of linear regression in Python

    Conclusion

    This is obviously a simple example of linear regression, as it only has one variable. It shows you how to set up the environment in Python, fit a linear model, and make a simple plot.
