
    4 Easy Steps to Understand Unsupervised Machine Learning with an Example in Python

    Step 1: Learn what unsupervised machine learning is

    An unsupervised machine learning model takes unlabelled (uncategorised) data and lets the algorithm determine the answer for us.

    Unsupervised Machine Learning model - takes unstructured data and finds patterns itself

    The unsupervised machine learning model takes data without apparent structure and tries to identify patterns itself in order to create categories.

    Step 2: Understand the main types of unsupervised machine learning

    There are two main types of unsupervised machine learning.

    • Clustering: Used for grouping data into categories without knowing any labels beforehand.
    • Association: A rule-based method for discovering interesting relations between variables in large databases.

    In clustering, the main algorithms used are K-means, hierarchical clustering, and the hidden Markov model.

    In association, the main algorithms used are Apriori and FP-growth.

    Step 3: How does K-means work

    K-means works in iterative steps.

    Finding the optimal k-means clustering is an NP-hard problem, which means there is no known efficient way to solve it in the general case. In practice, heuristic algorithms are used that converge quickly to a local optimum: they find a good solution fast, though not necessarily the best one, and that is often good enough.

    Enough theory.

    How does the algorithm work?

    • Step 1: Start with a set of k means. These can be chosen by taking k random points from the dataset (called the Forgy initialisation method).
    • Step 2: Assign each data point to the cluster of the nearest mean. Hence, each data point is assigned to exactly one cluster.
    • Step 3: Recalculate the means (also called centroids) as the centre of the points in each cluster, which moves them towards a local optimum.

    Steps 2 and 3 are repeated until the grouping in Step 2 does not change any more.
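
    The three steps above can be sketched in plain NumPy. This is only a minimal illustration, not the implementation scikit-learn uses: the function name k_means and the parameters seed and max_iter are made up for this example, and keeping the old mean when a cluster ends up empty is an assumption. The seven points are the same ones used in the Python example in Step 4.

```python
import numpy as np


def k_means(points, k, seed=0, max_iter=100):
    """Hypothetical helper: initialise, assign, recompute, repeat."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct data points as the initial means.
    means = points[rng.choice(len(points), size=k, replace=False)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest mean.
        distances = np.linalg.norm(points[:, None, :] - means[None, :, :], axis=2)
        new_labels = distances.argmin(axis=1)
        # Stop when the grouping no longer changes.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 3: move each mean to the centre of its cluster.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # assumption: keep the old mean if a cluster is empty
                means[j] = members.mean(axis=0)
    return labels, means


points = np.array([[1, 2], [2, 4], [0.3, 2.5], [9.2, 8.5],
                   [2.4, 0.3], [9, 11], [12, 10]], dtype=float)
labels, means = k_means(points, k=2)
print(labels)  # one cluster index (0 or 1) per point
print(means)   # one row per cluster centre
```

    Which cluster gets index 0 and which gets index 1 depends on the random initial choice, so only the grouping itself is meaningful, not the label numbers.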

    Step 4: A simple Python example with the k-means algorithm

    In this example we assume you know how to install the needed libraries. If not, then see the following article.

    First off, you need to import the needed libraries.

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib import style
    from sklearn.cluster import KMeans
    

    In the first basic example we are only going to plot some points on a graph.

    style.use('ggplot')
    x = [1, 2, 0.3, 9.2, 2.4,  9, 12]
    y = [2, 4, 2.5, 8.5, 0.3, 11, 10]
    plt.scatter(x, y)
    plt.show()
    

    The first line sets the style of the graph. Then we have the coordinates in the lists x and y, which is the format that scatter expects.

    Output of the plot from scatter plotter in Python.

    An advantage of plotting the points first is that it helps you figure out how many clusters to use. Here it looks like there are two “groups” of points, which translates into using two clusters.

    To continue, we want to use the k-means algorithm with two clusters.

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib import style
    from sklearn.cluster import KMeans
    style.use('ggplot')
    x = [1, 2, 0.3, 9.2, 2.4,  9, 12]
    y = [2, 4, 2.5, 8.5, 0.3, 11, 10]
    # Transform the input coordinates into the (n_samples, 2) array that KMeans expects
    X = []
    for i in range(len(x)):
        X.append([x[i], y[i]])
    X = np.array(X)
    # Fit k-means with two clusters
    kmeans = KMeans(n_clusters=2)
    kmeans.fit(X)
    labels = kmeans.labels_
    # The fitted cluster centres (means), needed for the plot below
    centroids = kmeans.cluster_centers_
    # Use a different color for each cluster
    colors = ['g.', 'r.']
    for i in range(len(X)):
        # Plot the points one at a time
        plt.plot(X[i][0], X[i][1], colors[labels[i]], markersize=10)
    # Plot the centres (or means)
    plt.scatter(centroids[:, 0], centroids[:, 1], marker="x", s=150, linewidths=5, zorder=10)
    plt.show()
    

    This produces the following plot.

    Example of k means algorithm used on simple dataset

    Considerations when using K-Means algorithm

    We could have used 3 clusters instead. That would have resulted in the following output.

    Using 3 clusters instead of two in the k-means algorithm

    Three clusters is not a good fit for this dataset, but that can be hard to see without a visual representation of the data.
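
    When the data cannot be plotted so easily, one rough aid is to compare the fitted models numerically. The sketch below is an illustration under assumptions rather than a recommended recipe: it fits KMeans for 1, 2 and 3 clusters on the same seven points and prints inertia_, the sum of squared distances from each point to its cluster centre. The values n_init=10 and random_state=0 are chosen here only to make the run repeatable.

```python
import numpy as np
from sklearn.cluster import KMeans

x = [1, 2, 0.3, 9.2, 2.4, 9, 12]
y = [2, 4, 2.5, 8.5, 0.3, 11, 10]
X = np.column_stack((x, y))  # shape (7, 2): one row per point

inertia = {}
for k in (1, 2, 3):
    # n_init and random_state are illustrative choices, not tuned values
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertia[k] = model.inertia_
    print(k, round(inertia[k], 2))
```

    Inertia shrinks whenever more clusters are added, so a lower number alone does not mean a better choice. What to look for is where the improvement levels off: here the drop from 1 to 2 clusters should be far larger than the drop from 2 to 3, which matches what the scatter plot suggests.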

    Uses of K-Means algorithm

    Here are some interesting uses of the K-means algorithm:

    • Personalised marketing to users
    • Identifying fake news
    • Spam filter in your inbox
