How to Export Pandas DataFrame to Excel and Create a Trendline Graph of Scatter Plot

What will we cover in this tutorial?

We will have some data in a Pandas DataFrame, which we want to export to an Excel sheet. Then we want to create a Scatter plot graph and fit that to a Excel trendline.

Step 1: Get the data

You might have some data already that you want to use. It can be from a HTML page (example) or CSV file.

For this purpose here we just generate some random data to use. We will use NumPy’s uniform function to generate it.

import pandas as pd
import numpy as np

# Generate some random increasing data
data = pd.DataFrame(
    {'A': [np.random.uniform(0.1*i, 0.1*i + 1) for i in range(100)],
     'B': [np.random.uniform(0.1*i, 0.1*i + 1) for i in range(100)]}
)
print(data)

Which will generate some slightly increasing data, which is nice to fit a graph to.

The output could look something like this.

            A          B
0    0.039515   0.778077
1    0.451888   0.210705
2    0.992493   0.961428
3    0.317536   1.046444
4    1.220419   1.388086

Step 2: Create an Excel XlsxWriter engine

This step might require that you install the XlsxWriter library, which is needed from the Pandas library.

This can be done by the following command.

pip install xlsxwriter

Now we can create the engine in our code.

import pandas as pd
import numpy as np

# Generate some random increasing data
data = pd.DataFrame(
    {'A': [np.random.uniform(0.1*i, 0.1*i + 1) for i in range(100)],
     'B': [np.random.uniform(0.1*i, 0.1*i + 1) for i in range(100)]}
)
# Create a Pandas Excel writer using XlsxWriter
excel_file = 'output.xlsx'
sheet_name = 'Data set'
writer = pd.ExcelWriter(excel_file, engine='xlsxwriter')

This will setup a Excel writer engine and be ready to write to file output.xlsx.

Step 3: Write the data to Excel and create a scatter graph with a fitted Trendline

This can be done by the following code, which uses the add_series function to insert a graph.

import pandas as pd
import numpy as np

# Generate some random increasing data
data = pd.DataFrame(
    {'A': [np.random.uniform(0.1*i, 0.1*i + 1) for i in range(100)],
     'B': [np.random.uniform(0.1*i, 0.1*i + 1) for i in range(100)]}
)
# Create a Pandas Excel writer using XlsxWriter
excel_file = 'output.xlsx'
sheet_name = 'Data set'
writer = pd.ExcelWriter(excel_file, engine='xlsxwriter')
data.to_excel(writer, sheet_name=sheet_name)
# Access the XlsxWriter workbook and worksheet objects from the dataframe.
workbook = writer.book
worksheet = writer.sheets[sheet_name]
# Create a scatter chart object.
chart = workbook.add_chart({'type': 'scatter'})
# Get the number of rows and column index
max_row = len(data)
col_x = data.columns.get_loc('A') + 1
col_y = data.columns.get_loc('B') + 1
# Create the scatter plot, use a trendline to fit it
chart.add_series({
    'name':       "Samples",
    'categories': [sheet_name, 1, col_x, max_row, col_x],
    'values':     [sheet_name, 1, col_y, max_row, col_y],
    'marker':     {'type': 'circle', 'size': 4},
    'trendline': {'type': 'linear'},
})
# Set name on axis
chart.set_x_axis({'name': 'Concentration'})
chart.set_y_axis({'name': 'Measured',
                  'major_gridlines': {'visible': False}})
# Insert the chart into the worksheet in field D2
worksheet.insert_chart('D2', chart)
# Close and save the Excel file
writer.save()

Result

The result should be similar to this.

The resulting Excel sheet.

That is how it can be done.

How To Solve Tower of Hanoi with Recursion

What will we cover in this tutorial?

We will first explain and understand Tower of Hanoi programming challenge. It is one of those programming challenges that are highly liked among programmers.

  • Understanding the problem is easy.
  • Solving it seems difficult.
  • Using recursion makes it easy and elegant to solve.

You should be one of those that masters the Tower of Hanoi.

Step 1: Understand the Tower of Hanoi challenge

Tower of Hanoi is a mathematical game, which has three rules. Before we set the rules, let’s see how our universe looks like.

A basic setup of Tower of Hanoi with 3 disks and 3 towers (often called rods)

The disk all have different sizes as pictured above.

The goal is to move all the disks from on tower (rod) to another one with the following 3 rules.

  1. You can only move one disk at the time.
  2. You can only take the top disk and place on top of another tower (rod).
  3. You cannot place a bigger disk on top of a smaller disk.

The first two rules combined means that you can only take one top disk and move it.

Say, in the above we have moved the disk 1 from the first to the second tower (rod).

After that move, we can move disk 2 or disk 1.

The third rule says, that we cannot move disk 2 on top of disk 1.

Those are the 3 rules of the game.

Now do yourself a favor and try to think how you would solve that. How do you get from here.

To here following the 3 rules.

Step 2: Recall recursion and unleash the power of it

Recursion is a method of solving a problem where the solution depends on solutions to smaller instances of the same problem.

https://en.wikipedia.org/wiki/Recursion_(computer_science)

While that is a beautiful and perfect definition – there is still need to exemplify what that means.

A simple example is to sum up the numbers from 1 to n.

It can be a bit difficult to connect the definition of recursions to getting the sum of the integers 1 + 2+ 3 + … + (n – 1) + n.

Let’s first try to do in the iterative way.

def summation(n):
    sum = 0
    for i in range(1, n + 1):
        sum += i
    return sum
print(summation(10))

While there is thing wrong with the above solution, you can turn that into a recursive function by the following.

def summation(n):
    if n == 0:
        return 0
    return n + summation(n - 1)
print(summation(10))

As you see, the problem of summing the numbers from 1 to n, is actually reversed to sum the numbers n, n – 1, n – 2, …, 1. You also notice, that it is true that the summation(n) is equal to n + summation(n – 1).

Wow, that was it. We are breaking the problem down to a problem of smaller size. Just like the definition said.

Also, notice the importance to have a base case in the function. In the above we choose for n == 0 to return 0. This ensures that the recursive calls to not continue forever (or when the Python interpreters stops due to maximum recursion depth).

Try it.

def summation(n):
    return n + summation(n - 1)
print(summation(10))

Well, what did we gain from making the function recursive?

Good question. The above might not be a good example of how recursion helps you. The example of Tower of Hanoi will show you the benefit. It will make your code easy and straight forward.

Step 3: Implement Tower of Hanoi with a recursive function

Now we need to think recursive. Consider the problem again.

How can we break that down to a smaller problem?

Think backwards. Just like the summation from Step 2. What do we need to make happen if we should move disk 3 from first tower (rod) to the last tower (rod)?

Exactly. Then we can move disk 3 to the final destination.

And after that, we should move the smaller problem of the 2 disks on top fo disk 3.

Wow. That is the formula. It is all you need to know.

  1. Move the smaller problem of 2 disks from first tower (rod) to second tower (rod).
  2. Move the big disk from first tower (rod) to last tower (rod).
  3. Move the smaller problem of 2 disks from second tower (rod) to last tower (rod).

Now we need to generalize that.

First understand that there can be any number of disks in an instance of Tower of Hanoi. This means that the problem starts with n disks. If we number the towers (rods) 0, 1, and 2.

Then we have that all n disks start on tower (rod) 0 and should end in tower (rod) 2.

Now we can break down the problem to the following. Given n disks that needs to be moved from start_tower to dest_tower (destination), using aux_tower (auxiliary).

  1. Move subproblem of n – 1 disks from start_tower to aux_tower.
  2. Move disk n to dest_tower.
  3. Move subproblem of n – 1 disk from aux_tower to dest_tower.

See the code below.

class Towers:
    def __init__(self, disks=3):
        self.disks = disks
        self.towers = [[]]*3
        self.towers[0] = [i for i in range(self.disks, 0, -1)]
        self.towers[1] = []
        self.towers[2] = []
    def __str__(self):
        output = ""
        for i in range(self.disks, -1, -1):
            for j in range(3):
                if len(self.towers[j]) > i:
                    output += " " + str(self.towers[j][i])
                else:
                    output += "  "
            output += "\n"
        return output + "-------"
    def move(self, from_tower, dest_tower):
        disk = self.towers[from_tower].pop()
        self.towers[dest_tower].append(disk)

def solve_tower_of_hanoi(towers, n, start_tower, dest_tower, aux_tower):
    # Base case - do nothing
    if n == 0:
        return
    # Move subproblem of n - 1 disks from start_tower to aux_tower.
    solve_tower_of_hanoi(towers, n - 1, start_tower, aux_tower, dest_tower)
    # Move disk n to dest_tower.
    towers.move(start_tower, dest_tower)
    print(towers)
    # Move subproblem of n - 1 disk from aux_tower to dest_tower.
    solve_tower_of_hanoi(towers, n - 1, aux_tower, dest_tower, start_tower)

t = Towers()
print(t)
solve_tower_of_hanoi(t, len(t.towers), 0, 2, 1)

The code includes a simple print function to see the trace of the movings.

Too fast?

Now this often seems a bit too fast. Didn’t we leave out all the subproblems?

That is the beauty of it. We actually didn’t

You tell the machine how to solve the problem using a smaller instance of the problem. This was done with the three things we did. First, move the subproblem away. Second, move the the biggest disk. Finally, move the subproblem on top of biggest disk.

How does that solve it all. See, you solve it for general n, hence, the smaller subproblems solves it by the same formula and we ensure the base case when there are no disks to move.

Master Selection Sort in Python in 3 Steps

What will we cover in this tutorial

Selection sort is one of the simplest sorting algorithms, which is a good algorithm to start with. While the algorithm is considered to be slow, it has the advantage of not using auxiliary space.

Step 1: Understand the Selection Sort algorithm

The goal of sorting is to take an unsorted array of integers and sort it.

Example given below.

[97, 29, 53, 92, 42, 36, 12, 57, 90, 76, 85, 81, 12, 61, 45, 3, 83, 34, 7, 48]
to
[3, 7, 12, 12, 29, 34, 36, 42, 45, 48, 53, 57, 61, 76, 81, 83, 85, 90, 92, 97]

The algorithm is the most intuitive way of sorting a list.

It works as follows.

  1. Go through the list to be sorted and find the smallest element.
  2. Switch the smallest element with the first position.

If you started with the following list.

[97, 29, 53, 92, 42, 36, 12, 57, 90, 76, 85, 81, 12, 61, 45, 3, 83, 34, 7, 48]

You would now have this list.

[3, 29, 53, 92, 42, 36, 12, 57, 90, 76, 85, 81, 12, 61, 45, 97, 83, 34, 7, 48]

Notice, that now we have the smallest element in the front of the list, we know that the second smallest element must be somewhere in the list starting from the second position all the way to the end.

Hence, you can repeat step the above 2 steps on the list excluding the first element.

This will give you the following list.

[3, 7, 53, 92, 42, 36, 12, 57, 90, 76, 85, 81, 12, 61, 45, 97, 83, 34, 29, 48]

Now we have that the first two elements are sorted, while the rest of the list is not sorted.

Hence, we can repeat the two steps again on the unsorted part of the list.

If we continue this until the we reach the end of the list. This should give us a sorted list.

Step 2: Implementation of Selection Sort

A beautiful thing about Selection Sort is that it does not use any auxiliary memory. If you are new to sorting, then this can be a big advantage if sorting large data sets.

The disadvantage of Selection Sort is the time complexity.

We will come back to that later.

The code of Selection Sort can be done in the following manner.

def selection_sort(list_to_sort):
    for i in range(len(list_to_sort)):
        index_of_min_value = i
        for j in range(i + 1, len(list_to_sort)):
            if list_to_sort[j] < list_to_sort[index_of_min_value]:
                index_of_min_value = j
        list_to_sort[i], list_to_sort[index_of_min_value] = list_to_sort[index_of_min_value], list_to_sort[i]

list_to_sort = [97, 29, 53, 92, 42, 36, 12, 57, 90, 76, 85, 81, 12, 61, 45, 3, 83, 34, 7, 48]
selection_sort(list_to_sort)
print(list_to_sort)

This will produce the correct output.

[3, 7, 12, 12, 29, 34, 36, 42, 45, 48, 53, 57, 61, 76, 81, 83, 85, 90, 92, 97]

Step 3: The time complexity of Selection Sort algorithm

Now this is the sad part of this simple algorithm. It does not perform good. A sorting algorithm is considered efficient if it runs in O(n log(n)), which Selection Sort does not.

The simple time complexity analysis is as follows.

Assume we have a list of n unsorted integers. Then the first iteration of the list will make n – 1 comparisons, the second iteration will make n – 2 comparisons, and so forth all the way down to 1 comparison.

This is the sum of 1 to n – 1, which is found by this formula (n – 1)(n – 2)/2, which is O(n^2).

Other than that the algorithm does n swapping of numbers. This is O(n).

This combines the algorithm to O(n + n^2) = O(n^2).

Next Step

This should wake your appetite to understand how you can make more efficient sorting.

Another good example of a simple sorting algorithm is the Insertion Sort algorithm.

For more efficient algorithm you should check out the Merge Sort algorithm.

If you want to be serious about sorting, check out my online course on the subject.

Why use Exceptions? How-to use Exceptions in the correct way!

What will we cover in this tutorial?

The key to use exceptions correct is to understand why we use them.

The best way to understand why to use exceptions is to see what happens when we do not use exceptions in our code.

After that exploration we will show how it can solve the examples with exceptions.

Step 1: In the perfect world!

Remember those times before the exception was invented?

No? Well, of course not. It is long time ago, in a distant past in a programming language you probably have not heard about.

As an aspiring programmer, you have probably seen and heard about exceptions, but never put them to use yourself.

The best way is to create examples that can be greatly simplified with exceptions.

Let’s keep the world simple.

def add(a, b):
    return a + b

print(add(2, 3))

This would print out the result 5.

As long as the world is clean and everybody uses the function correct, there is nothing to worry about.

But what if someone calls the function with wrong types of arguments. Then the function does not do as expected.

def add(a, b):
    return a + b

print(add("2", "3"))

This will print 23, which might be a bit surprising if you are not familiar with the function, and possibly how Python uses addition on strings.

So how to handle it.

Step 2: In the real world where people do not use things as intended

As you will realize in your careers of programming, it happens that other programmers do not use your function correctly.

Why would they do that?

Maybe they don’t take the time to read the good documentation you wrote. Or maybe they are just careless and working under hard time pressure.

Even more funny, it could be you. It could be code you wrote a year ago (or less), and the documentation was not that good as you thought, hence, you use your own function incorrectly.

To continue the simple example. One way to handle wrong input without exceptions could be as follows.

def add(a, b):
    if type(a) is not int or type(b) is not int:
        return None
    return a + b

result = add("2", "3")
if result is not None:
    print(result)
else:
    print("Invalid input format to function add!")

Oh, no! What happened. Actually, your function is not that difficult to understand. It just starts by checking if the input format is as expected. If not, return an error code. As this is Python, an error code can simply be the value None.

That might seem fine. But what about the user of your function? Now the user needs to validate that the correct format of the result was returned.

This makes the code more complex. Also, the user of your function needs to know more details of how your function handles invalid input, or whatever can go wrong in your function.

The problem is, that you pass on a problem to someone else, which needs to know about the details of how your function works.

This is the opposite of decoupling. And decoupling is considered good. Easier to program, easier to change, easier to maintain, to mention a few benefits.

Step 3: Solve it with exceptions

The core idea of exceptions is to handle the when something out of the ordinary happens from the main logic of a program.

What? Yes, the expected case in our simple function, is that the user provides integers as input. But it might happen they do not call it with integers. That is unexpected, that is something out of the ordinary, and this breaks the logic in how you would build the program.

So how should we do in the above example?

def add(a, b):
    if type(a) is not int or type(b) is not int:
        raise Exception("Input format incorrect")
    return a + b

Actually, your part of the obligations becomes easier.

How?

Because now you do not need to write complex documentation of the error return codes in your program. The documentation is kept automatically in the exception. It will even guide the user to where in the code the exception comes from.

def add(a, b):
    if type(a) is not int or type(b) is not int:
        raise Exception("Input format incorrect")
    return a + b

result = add("2", "3")

This would give the following result.

Traceback (most recent call last):
  File "/Users/admin/PycharmProjects/LearningSpace/test2.py", line 7, in <module>
    result = add("2", "3")
  File "/Users/admin/PycharmProjects/LearningSpace/test2.py", line 3, in add
    raise Exception("Input format incorrect")
Exception: Input format incorrect

Which leads the user of the function to where in the code it went wrong. Then it will be easier for them to figure out what is wrong and how to fix it.

Step 4: Try-catch from the user

It happens that the user of your function would like to handle the special case.

They now master what is going wrong, and want to inform their users of the problem.

def add(a, b):
    if type(a) is not int or type(b) is not int:
        raise Exception("Input format incorrect")
    return a + b

try:
    result = add("2", "3")
except:
    print("The input format is incorrect")

This leads to something you might not notice at first. It leads to a good easy flow in your program. It is easy to read and understand.

It is clear that this is not intended flow in the except-part of the program. Also, the normal flow in the program would be in the try-part.

It makes it easier to understand, which is the ultimate goal of a good programmer. Make your code easy to read for others. That is, including you in less than 6 months (we do forget our own code and the logic in the programs we write faster than we expect).