## What will we cover in this tutorial?

In this tutorial we will show how Kruskal’s algorithm works. First, we will just give an example of what it does.

Consider the following undirected edge-weighted graph. The vertices (or nodes) are numbered from 0 to 7 to make it simpler later. It contains weighted edges; for example, the weight of the edge between vertex (or node) 0 and 1 is 6. Similarly, the weight between vertex (or node) 0 and 2 is 3.

Our goal is to find a minimum spanning tree (or forest, if the graph is not connected). That means we need to find a subset of the edges that connects all vertices (or nodes) with the minimum sum of weights.

First, notice that all the vertices (or nodes) are connected, and that we have chosen the edges with the lowest weights between them.

That is the challenge we want to solve. Given any undirected edge-weighted graph, find the minimum spanning tree of that graph.

In this tutorial we will understand and find a solution using Kruskal’s algorithm, which solves the above problem for an undirected edge-weighted graph.

## Step 1: High level understanding of Kruskal’s algorithm

On Wikipedia, the following pseudo code is given.

```
algorithm Kruskal(G) is
    F := ∅
    for each v ∈ G.V do
        MAKE-SET(v)
    for each (u, v) in G.E ordered by weight(u, v), increasing do
        if FIND-SET(u) ≠ FIND-SET(v) then
            F := F ∪ {(u, v)}
            UNION(FIND-SET(u), FIND-SET(v))
    return F
```

This is actually quite compact. So let us first try to understand what happens in the code. Below we describe the pseudo code in bullets 1 to 5.

1. The algorithm Kruskal takes an argument G, which represents the graph.
2. It creates F as the empty set. The set F represents the forest, and we will add edges to it as the algorithm iterates over the edges later.
3. Then it creates a set for each vertex (or node) in the graph. How these sets are represented is not given by the algorithm. Also, each set initially contains only one vertex. As the algorithm iterates along, it will join (make unions of) the sets as the vertices are connected.
4. Then we iterate over the edges, which are ordered in increasing order of weight. That is, we take the edge with the lowest weight first (if more edges have the same weight, we just take any of them).
   - It will check whether the vertices (u and v) that the edge connects are in the same set.
   - If they are not in the same set, it will add the edge to the forest, F, and join the two sets into one.
5. After iterating over all edges, we return the forest F.
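The bullets above can be sketched directly with plain Python sets standing in for MAKE-SET, FIND-SET and UNION. This is only an illustrative sketch (the function name `kruskal_naive` and the linear-scan `find` are ours, not the efficient structure implemented later in this tutorial):

```
def kruskal_naive(num_vertices, edges):
    # edges: list of (u, v, weight) tuples
    forest = []
    # Bullet 3: one singleton set per vertex.
    sets = [{v} for v in range(num_vertices)]

    def find(v):
        # Return the set containing v (linear scan; simple but slow).
        for s in sets:
            if v in s:
                return s

    # Bullet 4: iterate over edges in increasing order of weight.
    for (u, v, w) in sorted(edges, key=lambda e: e[2]):
        set_u, set_v = find(u), find(v)
        if set_u is not set_v:
            forest.append((u, v, w))
            sets.remove(set_v)
            set_u |= set_v  # union the two sets
    return forest

print(kruskal_naive(3, [(0, 1, 1), (1, 2, 2), (0, 2, 3)]))  # [(0, 1, 1), (1, 2, 2)]
```

The linear scan in `find` makes this version slow on large graphs, which is exactly why the tutorial later switches to a disjoint-set data structure.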

## Step 2: A visual walk-through of Kruskal’s algorithm

Still a lot to figure out. Let’s try with an example. The illustration below depicts the graph, G, given as argument to Kruskal’s algorithm in bullet 1 (in the previous step of the tutorial).

At bullet 2, we create an empty set, F, to represent the solution.

Then, at bullet 3, we make a singleton set for each vertex. We can represent that with a graph with no edges.

At bullet 4 we start iterating over the edges. We take the edge with the lowest weight from G. As there are two with weight 1, we can take either. Below we represent the sets as each group of vertices that are connected (only vertex 5 and 7 are connected); the rest are sets of a single vertex. The solution, F, is represented by the edges in the diagram.

Then we take the next edge; again it is an edge with weight 1. We check whether its endpoints are in the same set. As they are not, we add the edge and join the sets.

And we continue like that, taking the edge with the lowest weight among the remaining ones and adding it whenever its endpoints are in different sets.

Now it will continue to iterate over the remaining edges, but none of them will be added: with only one set left, every edge connects vertices that are already in the same set.

Hence, it will return F = [(2-4), (5-7), (1-2), (3-7), (0-2), (4-7), (5-6)] in bullet 5 after the last iteration of the loop.

## Step 3: Representing the graph

We need two classes to implement Kruskal’s algorithm. One for representing the graph and one for representing the sets.

Let’s start with the graph. What is relevant in a graph? Let’s take a look at it again.

Here are the features of the class.

1. It contains numbered vertices from 0 to 7. That is, we need to know how many vertices there are.
2. It contains edges between the vertices. An edge can be represented by two vertex indices and a weight.
3. In the pseudo code, we also see that we need to access the number of vertices, and to get the edges sorted by weight in increasing order.

Now how do you do that?

```
class Graph:
    def __init__(self, vertices):
        self.vertices = vertices
        self.edges = []

    def add_edge(self, u, v, weight):
        self.edges.append((u, v, weight))

    def sort_edges(self):
        self.edges = sorted(self.edges, key=lambda item: item[2])

    def get_edges(self):
        return self.edges

    def number_of_vertices(self):
        return self.vertices
```

In this way, we can create a graph with a number of vertices and add edges. When we have done that, we can sort the edges and get them. Also, we can get the number of vertices.

We can represent the above graph with the following code.

```
g = Graph(8)
# The edges below are the ones identified in the walk-through in Step 2,
# plus the 0-1 edge of weight 6 mentioned earlier.
g.add_edge(2, 4, 1)
g.add_edge(5, 7, 1)
g.add_edge(1, 2, 2)
g.add_edge(3, 7, 2)
g.add_edge(0, 2, 3)
g.add_edge(4, 7, 3)
g.add_edge(5, 6, 3)
g.add_edge(0, 1, 6)
```

That should represent the graph we have used in this tutorial.
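As a quick sanity check of the sorting, here is a compact, self-contained run. The class is re-stated with the `add_edge` method it assumes, and the three edges are just a small sample from the introduction:

```
class Graph:
    def __init__(self, vertices):
        self.vertices = vertices
        self.edges = []

    def add_edge(self, u, v, weight):
        self.edges.append((u, v, weight))

    def sort_edges(self):
        # Sort edges by the third tuple element, the weight.
        self.edges = sorted(self.edges, key=lambda item: item[2])

    def get_edges(self):
        return self.edges

    def number_of_vertices(self):
        return self.vertices


g = Graph(3)
g.add_edge(0, 1, 6)
g.add_edge(0, 2, 3)
g.add_edge(1, 2, 2)
g.sort_edges()
print(g.get_edges())  # [(1, 2, 2), (0, 2, 3), (0, 1, 6)]
```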

## Step 4: Representing the sets in the algorithm

A simple temptation would be to represent the sets in the algorithm as a Python list of sets. But that would not give the desired time complexity of the algorithm.

On Wikipedia it says the code is implemented with a disjoint-set data structure. The disjoint-set data structure has very efficient lookups of which set a vertex belongs to, as well as fast unions. That is needed to keep the complexity of Kruskal’s algorithm at O(E log(V)), where E is the number of edges and V the number of vertices.

The disjoint-set data structure supports lookup and union operations in O(log(V)) time (or better, with further optimizations). Hence, it keeps the total time complexity at O(E log(V)). More about that later.

Here is an implementation of the disjoint-set data structure in Python.

```
class DisjointSet:
    def __init__(self, size):
        self.parent = []
        self.rank = []
        for node in range(size):
            self.parent.append(node)
            self.rank.append(0)

    def find(self, v):
        if self.parent[v] == v:
            return v
        else:
            return self.find(self.parent[v])

    def union(self, u, v):
        u_root = self.find(u)
        v_root = self.find(v)
        if self.rank[u_root] < self.rank[v_root]:
            self.parent[u_root] = v_root
        elif self.rank[u_root] > self.rank[v_root]:
            self.parent[v_root] = u_root
        else:
            self.parent[v_root] = u_root
            self.rank[u_root] += 1
```

We will not go into details of the time complexities of this data structure.

Notice the following.

1. The constructor creates the number of disjoint sets given by its argument, each initially containing a single vertex.
2. Find returns the root of the set. There is one uniquely defined vertex in each set that is the root; hence, for any two vertices in the same set, the find function will return the same root.
3. The union function joins two sets based on a rank, in order to keep the time complexity low.
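As an optional refinement, the find method can additionally apply path compression, pointing every visited vertex directly at the root. Combined with union by rank, this makes the operations effectively constant time (amortized). The variant below is a sketch of that optimization, not part of the tutorial’s main code:

```
class DisjointSet:
    def __init__(self, size):
        self.parent = list(range(size))
        self.rank = [0] * size

    def find(self, v):
        # Path compression: point v directly at the root while searching.
        if self.parent[v] != v:
            self.parent[v] = self.find(self.parent[v])
        return self.parent[v]

    def union(self, u, v):
        u_root, v_root = self.find(u), self.find(v)
        if u_root == v_root:
            return
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[u_root] < self.rank[v_root]:
            u_root, v_root = v_root, u_root
        self.parent[v_root] = u_root
        if self.rank[u_root] == self.rank[v_root]:
            self.rank[u_root] += 1
```

This class is a drop-in replacement for the one above; the behavior of find and union is the same, only faster on deep trees.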

## Step 5: Kruskal’s algorithm

Now we have the understanding and the two helper classes for Kruskal’s algorithm. The implementation is then almost identical to the pseudo code from Wikipedia, which is given here for comparison.

```
algorithm Kruskal(G) is
    F := ∅
    for each v ∈ G.V do
        MAKE-SET(v)
    for each (u, v) in G.E ordered by weight(u, v), increasing do
        if FIND-SET(u) ≠ FIND-SET(v) then
            F := F ∪ {(u, v)}
            UNION(FIND-SET(u), FIND-SET(v))
    return F
```

And the actual Python code is given here.

```
def kruskal(graph):
    forest = []
    graph.sort_edges()
    disjoint_set = DisjointSet(graph.number_of_vertices())
    for (u, v, weight) in graph.get_edges():
        if disjoint_set.find(u) != disjoint_set.find(v):
            forest.append((u, v, weight))
            disjoint_set.union(u, v)
    return forest
```

Now let’s try our example given in this tutorial.

Below you have the full code.

```
class Graph:
    def __init__(self, vertices):
        self.vertices = vertices
        self.edges = []

    def add_edge(self, u, v, weight):
        self.edges.append((u, v, weight))

    def sort_edges(self):
        self.edges = sorted(self.edges, key=lambda item: item[2])

    def get_edges(self):
        return self.edges

    def number_of_vertices(self):
        return self.vertices


class DisjointSet:
    def __init__(self, size):
        self.parent = []
        self.rank = []
        for node in range(size):
            self.parent.append(node)
            self.rank.append(0)

    def find(self, v):
        if self.parent[v] == v:
            return v
        else:
            return self.find(self.parent[v])

    def union(self, u, v):
        u_root = self.find(u)
        v_root = self.find(v)
        if self.rank[u_root] < self.rank[v_root]:
            self.parent[u_root] = v_root
        elif self.rank[u_root] > self.rank[v_root]:
            self.parent[v_root] = u_root
        else:
            self.parent[v_root] = u_root
            self.rank[u_root] += 1


def kruskal(graph):
    forest = []
    graph.sort_edges()
    disjoint_set = DisjointSet(graph.number_of_vertices())
    for (u, v, weight) in graph.get_edges():
        if disjoint_set.find(u) != disjoint_set.find(v):
            forest.append((u, v, weight))
            disjoint_set.union(u, v)
    return forest


# The edges of the example graph (as identified in the walk-through in Step 2,
# plus the 0-1 edge of weight 6 mentioned earlier).
g = Graph(8)
g.add_edge(2, 4, 1)
g.add_edge(5, 7, 1)
g.add_edge(1, 2, 2)
g.add_edge(3, 7, 2)
g.add_edge(0, 2, 3)
g.add_edge(4, 7, 3)
g.add_edge(5, 6, 3)
g.add_edge(0, 1, 6)

result = kruskal(g)
for r in result:
    print(r)
```

This code will output.

```
(2, 4, 1)
(5, 7, 1)
(1, 2, 2)
(3, 7, 2)
(0, 2, 3)
(4, 7, 3)
(5, 6, 3)
```

These are the same edges as we identified manually in Step 2 of this tutorial.

## Bonus: Complexity of the algorithm

Let’s have the algorithm here again.

```
def kruskal(graph):
    forest = []
    graph.sort_edges()
    disjoint_set = DisjointSet(graph.number_of_vertices())
    for (u, v, weight) in graph.get_edges():
        if disjoint_set.find(u) != disjoint_set.find(v):
            forest.append((u, v, weight))
            disjoint_set.union(u, v)
    return forest
```

We have that E is the number of edges and V is the number of vertices in the graph.

Then the sort on the edges uses O(E log(E)). But since E is upper bounded by V^2, we have O(log(E)) = O(log(V^2)) = O(2 log(V)) = O(log(V)). Hence, the sort runs in O(E log(V)).

The for-loop over the edges runs E times. In each iteration we call find and possibly union on the disjoint-set data structure. These calls are upper bounded by O(log(V)). Hence, the main loop runs in O(E log(V)).

All in all O(E log(V) + E log(V)) = O(E log(V)).

## What will we cover in this tutorial?

In this tutorial we will make a dynamic programming solution to the Edit Distance Problem.

Consider the following problem. Given strings u = “abbabc” and v = “ababac”, and the operations to insert, remove, or change a character in string u: what is the minimum number of operations needed to transform u into v?

1. Remove b: u = “abbabc” -> u’ = “ababc”.
2. Insert a: u’ = “ababc” -> u’’ = “ababac”.

Hence, with two operations we transform u into v.

How can you make a program that solves the general problem for two arbitrary strings, u and v, with minimal operations? And even better, how can we solve it with dynamic programming and how does that help us?

## Step 1: Solve the Edit Distance Problem without Dynamic Programming

It is a good idea to solve a problem with a straightforward approach first. This makes you understand the problem at its core.

To get started with solving a problem like this, it is often a good idea to look at some simple base cases.

Assume that u and v are either the empty string ('') or one character ('a', 'b', or 'c', to limit it).

We have some cases here.

1. If we have that u = '' (the empty string) and v = 'a', then we need to insert 'a' into u to get v.
2. If we have that v = '' and u = 'b', then we need to remove a character from u to get v.
3. Then, if we have that u = 'a' and v = 'a', then we do not need to do anything, as they are equal.
4. Finally, if we have that u = 'a' and v = 'b', then we can use the change operation to turn u into v.

Using that logic, we can turn that into a recursive function, which processes the last character of the string with the above operations.

There is a catch, of course: we need to consider case 4. In that case, we should try all three options and use the best one.

Let’s try to write that as Python code.

```
def naive_edit_distance(u, v):
    if len(u) == 0:
        return len(v)  # Number of insert operations to fill out
    if len(v) == 0:
        return len(u)  # Number of delete operations to remove chars
    if u[-1] == v[-1]:  # Characters are identical, no operation needed
        return naive_edit_distance(u[:-1], v[:-1])
    return min(naive_edit_distance(u, v[:-1]) + 1,       # insert operation
               naive_edit_distance(u[:-1], v) + 1,       # delete operation
               naive_edit_distance(u[:-1], v[:-1]) + 1)  # change operation
```

Now, let us first see how this works. Assume we call it with the following arguments.

```
naive_edit_distance("ab", "bc")
```

Then let’s make a call tree to see how that would run with the above code.

The results are then propagated all the way up the tree, where each inner call takes the minimum value of its children.

The result is that 2 operations are optimal. Also, notice that the operations are not unique; we see two paths giving the same number of operations.
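We can confirm the result by actually running the call (the function is re-stated here so the snippet runs on its own):

```
def naive_edit_distance(u, v):
    if len(u) == 0:
        return len(v)  # inserts needed to build v
    if len(v) == 0:
        return len(u)  # deletes needed to empty u
    if u[-1] == v[-1]:  # last characters match, no operation
        return naive_edit_distance(u[:-1], v[:-1])
    return min(naive_edit_distance(u, v[:-1]) + 1,       # insert
               naive_edit_distance(u[:-1], v) + 1,       # delete
               naive_edit_distance(u[:-1], v[:-1]) + 1)  # change

print(naive_edit_distance("ab", "bc"))  # 2
```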

## Step 2: Solving the Edit Distance problem with dynamic programming

If you investigate the above call tree, you will notice that some calls are identical. E.g., the call with u = 'a' and v = 'b' occurs twice, and each time we make the same 6 calls below it.

With dynamic programming you can avoid calculating the same calls again and again.

The easiest way to turn the above algorithm into a dynamic programming algorithm is by storing the value of each call in a global variable. Then for each calculated value, store it, and before making a new calculation, check if it already exists.

This turns into the following code.

```
# A global dictionary to store already calculated values
results = {}

def get_distance_dynamic_programming(u, v):
    global results
    if (u, v) in results:  # Check if this call is already calculated
        return results[(u, v)]
    if len(u) == 0:
        results[(u, v)] = len(v)  # Number of insert operations to fill out
    elif len(v) == 0:
        results[(u, v)] = len(u)  # Number of delete operations to remove chars
    elif u[-1] == v[-1]:  # Characters are identical, no operation needed
        results[(u, v)] = get_distance_dynamic_programming(u[:-1], v[:-1])
    else:
        results[(u, v)] = min(
            get_distance_dynamic_programming(u, v[:-1]) + 1,       # insert operation
            get_distance_dynamic_programming(u[:-1], v) + 1,       # delete operation
            get_distance_dynamic_programming(u[:-1], v[:-1]) + 1)  # change operation
    return results[(u, v)]
```

Hence, the main difference is that we have a global variable called results: at the beginning we check whether the call value (the arguments u and v) already exists, and if so, return the stored value.

Otherwise, we do the same as before, except that we store the value in results before returning it.
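As a side note, Python’s standard library can do this caching for us: functools.lru_cache memoizes a function by its arguments. The sketch below is an equivalent formulation (the name `edit_distance` is ours), not the tutorial’s original code:

```
from functools import lru_cache

@lru_cache(maxsize=None)  # cache every (u, v) result automatically
def edit_distance(u, v):
    if len(u) == 0:
        return len(v)
    if len(v) == 0:
        return len(u)
    if u[-1] == v[-1]:
        return edit_distance(u[:-1], v[:-1])
    return min(edit_distance(u, v[:-1]) + 1,       # insert
               edit_distance(u[:-1], v) + 1,       # delete
               edit_distance(u[:-1], v[:-1]) + 1)  # change

print(edit_distance("abbabc", "ababac"))  # 2
```

This avoids the explicit global dictionary while keeping the same asymptotic behavior.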

## Step 3: Return the operations used to transform the string

Imagine you need to return the operations you used to transform string u to v.

This can be done by the following code.

```
# Solution which returns the operations
operations = {}

def get_distance_operations(u, v):
    global operations
    if (u, v) in operations:
        return operations[(u, v)]
    if len(u) == 0:
        operations[(u, v)] = (len(v), [("insert", c) for c in v])
    elif len(v) == 0:
        operations[(u, v)] = (len(u), [("remove", c) for c in u])
    elif u[-1] == v[-1]:
        operations[(u, v)] = get_distance_operations(u[:-1], v[:-1])
    else:
        val1, ops1 = get_distance_operations(u, v[:-1])
        val2, ops2 = get_distance_operations(u[:-1], v)
        val3, ops3 = get_distance_operations(u[:-1], v[:-1])
        if val1 <= val2 and val1 <= val3:
            operations[(u, v)] = (val1 + 1, ops1 + [("insert", v[-1])])
        elif val2 <= val1 and val2 <= val3:
            operations[(u, v)] = (val2 + 1, ops2 + [("remove", u[-1])])
        else:
            operations[(u, v)] = (val3 + 1, ops3 + [("change", u[-1] + "->" + v[-1])])
    return operations[(u, v)]

res = get_distance_operations("abbabc", "ababac")
print(res)
```

The above code would return the following output.

```
(2, [('remove', 'b'), ('insert', 'a')])
```

That is amazing. Another cool problem solved.

## What will we cover in this tutorial?

The Knapsack Problem is defined as follows.

Given a set of items, each with a weight and a value, determine the number of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible.

https://en.wikipedia.org/wiki/Knapsack_problem

In this tutorial we will give an efficient algorithm, the greedy approximation algorithm, which gives at least 1/2 of the optimal solution.

## Step 1: Understand the Knapsack Problem

Imagine you have 3 items. Item 1 has weight 2 kg with value \$20. The next item, item 2, has weight 1 kg with value \$15. Finally, item 3 has weight 3 kg, but value \$35.

The challenge is that you can only bring 4 kg. How do you get the most value with you?

You can take item 1 and item 2; then you have 3 kg for \$35 of value. You cannot take item 3 as well, as that would add up to 6 kg, which is too much.

On the other hand, you could take item 2 and item 3, then you have 4 kg in total and for \$50.

The Knapsack Problem is to determine the most valuable items to take within some bound of weight.

Now this seems easy to solve, or does it? Actually, the Knapsack Problem is NP-complete, which means it is hard to solve efficiently.
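For an instance this small, we can still brute-force every subset of items to confirm that \$50 is indeed the best possible. A self-contained sketch, with the weights and values taken from the example above:

```
from itertools import combinations

# Example items: weights in kg, values in dollars (from the text above).
weights = [2, 1, 3]
values = [20, 15, 35]
W = 4  # weight limit in kg

best = 0
# Try every subset of the items and keep the best feasible value.
for r in range(len(weights) + 1):
    for combo in combinations(range(len(weights)), r):
        if sum(weights[i] for i in combo) <= W:
            best = max(best, sum(values[i] for i in combo))

print(best)  # 50
```

This exhaustive search takes O(2^n) time, which is exactly why it cannot scale and approximations become interesting.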

## Step 2: A greedy approximation algorithm to solve it efficiently

Finding the optimal solution is difficult, but there exist efficient approximation algorithms that are easy to implement.

Let’s take a look at the greedy approximation algorithm.

```
def greedy_algorithm(w, v, W):
    ordered = []
    for vi, wi in zip(v, w):
        ordered.append((vi/wi, vi, wi))
    ordered.sort(reverse=True)
    X = []
    weight = 0
    val = 0
    for _, vi, wi in ordered:
        if weight + wi <= W:
            X.append((vi, wi))
            weight += wi
            val += vi
    return X, val
```

The above algorithm is not final. But let’s start by understanding what it does.

First it orders the items by the highest value per weight. Then it simply adds the items in that order, as long as they fit.

Hence, take the items from Step 1.

Then we would have them ordered: Item 2 (15 \$/kg), Item 3 (11.67 \$/kg), and Item 1 (10 \$/kg). Hence, the algorithm adds Item 2 first, as it gives the most value per kg. Then it adds Item 3, as this is the remaining item that adds the most value per kg. Then there is no more room to add anything.

Why would this be optimal? Well, you add the items in the order of what adds the most value per weight.
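To see the greedy function in action on the items from Step 1, here is a self-contained run (the function is re-stated; w and v are the lists of weights and values, W the weight limit):

```
def greedy_algorithm(w, v, W):
    # Order items by value per weight, highest first.
    ordered = []
    for vi, wi in zip(v, w):
        ordered.append((vi / wi, vi, wi))
    ordered.sort(reverse=True)
    X = []
    weight = 0
    val = 0
    # Add items in that order, as long as they still fit.
    for _, vi, wi in ordered:
        if weight + wi <= W:
            X.append((vi, wi))
            weight += wi
            val += vi
    return X, val

# Item 1: 2 kg, $20; Item 2: 1 kg, $15; Item 3: 3 kg, $35; limit 4 kg.
items, value = greedy_algorithm([2, 1, 3], [20, 15, 35], 4)
print(items, value)  # [(15, 1), (35, 3)] 50
```

On this instance the greedy choice happens to match the optimum from Step 1, but as the next step shows, that is not always the case.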

## Step 3: Modifying the greedy approximation to give at least 1/2 of the optimal solution

In the example from Step 2, it all turned out fine.

But we have a problem.

Imagine the items. Item 1′ of value \$15 and 1 kg. Then Item 2′ of value \$40 and 4 kg. We still assume we can bring 4 kg.

The above algorithm would order it by Item 1′ (15 \$/kg) and Item 2′ (10 \$/kg).

That would only result in taking Item 1′ of 1 kg and value \$15, because the next item, Item 2′, weighs too much to bring along.

The optimal solution is taking Item 2′ of \$40 of value.

The algorithm from the previous step can be modified a bit to ensure at least 1/2 of the value of the optimal solution. Notice that the example above does not even achieve that.

Let’s examine the following modification.

```
def modified_greedy_algorithm(w, v, W):
    ordered = []
    for vi, wi in zip(v, w):
        ordered.append((vi/wi, vi, wi))
    ordered.sort(reverse=True)
    S1 = []
    weight = 0
    val = 0
    for _, vi, wi in ordered:
        if weight + wi <= W:
            S1.append((vi, wi))
            weight += wi
            val += vi
        else:
            # First item that does not fit: compare it alone with S1.
            S2 = [(vi, wi)]
            val2 = vi
            if val > val2:
                return S1, val
            else:
                return S2, val2
    return S1, val  # Every item fit in the knapsack.
```

It is almost identical, except that when it reaches the first item which does not fit, it checks whether that item alone gives a higher value than the items already added.
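Running the modified version on the counterexample (Item 1′: 1 kg, \$15; Item 2′: 4 kg, \$40; limit 4 kg) shows it now picks the better option. The function is re-stated here so the snippet runs on its own, with a fallback return added for the case where every item fits:

```
def modified_greedy_algorithm(w, v, W):
    ordered = []
    for vi, wi in zip(v, w):
        ordered.append((vi / wi, vi, wi))
    ordered.sort(reverse=True)
    S1 = []
    weight = 0
    val = 0
    for _, vi, wi in ordered:
        if weight + wi <= W:
            S1.append((vi, wi))
            weight += wi
            val += vi
        else:
            # First item that does not fit: compare it alone with S1.
            S2 = [(vi, wi)]
            val2 = vi
            if val > val2:
                return S1, val
            else:
                return S2, val2
    return S1, val  # Every item fit in the knapsack.

items, value = modified_greedy_algorithm([1, 4], [15, 40], 4)
print(items, value)  # [(40, 4)] 40
```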

Why does that give at least 1/2 of the optimal solution?

Well, if we could take partial items, then the optimal value would be bounded as follows. Let k be the first item that does not fit. Then items 1, 2, …, k − 1, and a portion of item k form the optimal fractional solution. Let that portion be alpha, where alpha is greater than or equal to 0 and less than or equal to 1.

Hence an upper bound of the optimal value is given by v_1 + v_2 + … + v_(k-1) + alpha*v_k = V_max_bound.

Hence, either items 1, 2, …, k − 1 together, or item k alone, must be worth at least half of V_max_bound: their values sum to at least v_1 + … + v_(k-1) + v_k ≥ V_max_bound, so the larger of the two is at least V_max_bound / 2. This shows that the above algorithm gives at least half of the optimal value.