What will we cover?
How to create a video like the one below using Pandas + GeoPandas + OpenCV in Python.
- Collect the newest COVID-19 data in Python using Pandas.
- Prepare the data and calculate the values needed to create a Choropleth map.
- Get the Choropleth map from GeoPandas and prepare it to be combined with the data.
- Render the data frame by frame as images for the video.
- Combine it all into a video using OpenCV.
Step 1: Get the daily reported COVID-19 data worldwide
This data is available from the European Centre for Disease Prevention and Control and can be found here.
All we need is to download the CSV file, which has all the historic data from all the reporting countries.
This can be done as follows.
import pandas as pd
# Just to get more rows, columns and display width
pd.set_option('display.max_rows', 300)
pd.set_option('display.max_columns', 300)
pd.set_option('display.width', 1000)
# Get the updated data
table = pd.read_csv("https://opendata.ecdc.europa.eu/covid19/casedistribution/csv")
print(table)
This will give us an idea of how the data is structured.
dateRep day month year cases deaths countriesAndTerritories geoId countryterritoryCode popData2019 continentExp Cumulative_number_for_14_days_of_COVID-19_cases_per_100000
0 01/10/2020 1 10 2020 14 0 Afghanistan AF AFG 38041757.0 Asia 1.040961
1 30/09/2020 30 9 2020 15 2 Afghanistan AF AFG 38041757.0 Asia 1.048847
2 29/09/2020 29 9 2020 12 3 Afghanistan AF AFG 38041757.0 Asia 1.114565
3 28/09/2020 28 9 2020 0 0 Afghanistan AF AFG 38041757.0 Asia 1.343261
4 27/09/2020 27 9 2020 35 0 Afghanistan AF AFG 38041757.0 Asia 1.540413
... ... ... ... ... ... ... ... ... ... ... ... ...
46221 25/03/2020 25 3 2020 0 0 Zimbabwe ZW ZWE 14645473.0 Africa NaN
46222 24/03/2020 24 3 2020 0 1 Zimbabwe ZW ZWE 14645473.0 Africa NaN
46223 23/03/2020 23 3 2020 0 0 Zimbabwe ZW ZWE 14645473.0 Africa NaN
46224 22/03/2020 22 3 2020 1 0 Zimbabwe ZW ZWE 14645473.0 Africa NaN
46225 21/03/2020 21 3 2020 1 0 Zimbabwe ZW ZWE 14645473.0 Africa NaN
[46226 rows x 12 columns]
First we want to convert dateRep to a date object (it cannot be seen in the output above, but the dates are stored as strings). Then we use that as the index for easier access later.
import pandas as pd
# Just to get more rows, columns and display width
pd.set_option('display.max_rows', 300)
pd.set_option('display.max_columns', 300)
pd.set_option('display.width', 1000)
# Get the updated data
table = pd.read_csv("https://opendata.ecdc.europa.eu/covid19/casedistribution/csv")
# Convert dateRep to date object
table['date'] = pd.to_datetime(table['dateRep'], format='%d/%m/%Y')
# Use date for index
table = table.set_index('date')
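The explicit format string matters because dateRep is written day first. As a minimal sketch, assuming a hand-made frame in the same layout (the three rows and the name `sample` are made up for illustration and are not the ECDC download), the conversion and a lookup by date could look like this:

```python
import pandas as pd

# Toy rows in the same day/month/year layout as dateRep (illustrative values only)
sample = pd.DataFrame({
    "dateRep": ["01/10/2020", "30/09/2020", "29/09/2020"],
    "cases": [14, 15, 12],
})

# Parse the day-first strings explicitly, then index by the parsed date
sample["date"] = pd.to_datetime(sample["dateRep"], format="%d/%m/%Y")
sample = sample.set_index("date")

# 01/10/2020 is read as 1 October, not 10 January
first = sample.index[0]

# With a date index, a single day can be looked up directly
cases_on_sep_30 = sample.loc[pd.Timestamp("2020-09-30"), "cases"]
print(first.month, cases_on_sep_30)
```

Without the format argument, Pandas has to guess whether the first number is the day or the month, so stating it avoids silently swapped dates.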
Step 2: Prepare data and compute values needed for plot
What makes sense to plot?
Good question. In a Choropleth map you color each region according to a value. Here, the higher a country's value, the darker the red.
If we plotted the number of new COVID-19 cases, the value would be high simply because a country has a large population. Hence, the number of COVID-19 cases per 100,000 people is used.
New COVID-19 cases per 100,000 people can be volatile and change drastically from day to day. To even that out, a 7-day rolling sum can be used. That is, for each day you take the sum of the last 7 days, and you continue that process through your data.
To make it even less volatile, the average over the last 14 days of the 7-day rolling sum is used.
And no, it is not just something invented by me. It is used by the authorities in my home country to decide which countries are open for travel.
It can be calculated from the data above as follows.
def get_stat(country_code, table):
    data = table.loc[table['countryterritoryCode'] == country_code]
    data = data.reindex(index=data.index[::-1])
    data['7 days sum'] = data['cases'].rolling(7).sum()
    data['7ds/100000'] = data['7 days sum'] * 100000 / data['popData2019']
    data['14 mean'] = data['7ds/100000'].rolling(14).mean()
    return data
The above function takes the table from Step 1 and extracts the rows for one country based on its country code. Then it reverses the data to get the dates in chronological order.
After that, it computes the 7-day rolling sum. Then it scales that sum to new cases per 100,000 people using the country's population. Finally, it computes the 14-day average (mean) of that value.
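To see what the rolling windows do, here is a small worked sketch on made-up numbers. The case counts, the population of 200,000 and the window of 2 in the last step are assumptions chosen to keep the arithmetic easy to follow. They are not part of the real data, and get_stat uses a window of 14.

```python
import pandas as pd

# Made-up daily case counts for a hypothetical country of 200,000 people
cases = pd.Series(
    [10, 20, 30, 40, 50, 60, 70, 80],
    index=pd.date_range("2020-03-01", periods=8),
)
population = 200_000

# Sum over the current day and the six days before it;
# the first six entries are NaN because the window is not full yet
seven_day_sum = cases.rolling(7).sum()

# Scale to cases per 100,000 inhabitants
per_100k = seven_day_sum * 100_000 / population

# Smooth once more; a window of 2 stands in for the 14 days used in get_stat
smoothed = per_100k.rolling(2).mean()

print(seven_day_sum.tolist())
print(smoothed.tolist())
```

The same effect shows up in get_stat: rolling(7) leaves the first 6 rows empty and rolling(14) on top of that leaves another 13, so the first 19 days of each country have no '14 mean' value.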
Step 3: Get the Choropleth map data and prepare it
GeoPandas is an amazing library for creating Choropleth maps. But it does need your attention when you combine it with other data.
Here we want to combine it with the COVID-19 data using the country codes (ISO_A3). If you inspect the map data, some of the countries are missing that code.
Other than that, the code is straightforward.
import pandas as pd
import geopandas
# Just to get more rows, columns and display width
pd.set_option('display.max_rows', 300)
pd.set_option('display.max_columns', 300)
pd.set_option('display.width', 1000)
# Get the updated data
table = pd.read_csv("https://opendata.ecdc.europa.eu/covid19/casedistribution/csv")
# Convert dateRep to date object
table['date'] = pd.to_datetime(table['dateRep'], format='%d/%m/%Y')
# Use date for index
table = table.set_index('date')
def get_stat(country_code, table):
    data = table.loc[table['countryterritoryCode'] == country_code]
    data = data.reindex(index=data.index[::-1])
    data['7 days sum'] = data['cases'].rolling(7).sum()
    data['7ds/100000'] = data['7 days sum'] * 100000 / data['popData2019']
    data['14 mean'] = data['7ds/100000'].rolling(14).mean()
    return data
# Read the data to make a choropleth map
# Note: geopandas.datasets was removed in GeoPandas 1.0, so this call needs an older version
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
world = world[(world.pop_est > 0) & (world.name != "Antarctica")]
# Store data per country to make it easier
data_by_country = {}
for index, row in world.iterrows():
    # The world data is not fully updated with ISO_A3 names
    if row['iso_a3'] == '-99':
        country = row['name']
        if country == "Norway":
            world.at[index, 'iso_a3'] = 'NOR'
            row['iso_a3'] = "NOR"
        elif country == "France":
            world.at[index, 'iso_a3'] = 'FRA'
            row['iso_a3'] = "FRA"
        elif country == 'Kosovo':
            world.at[index, 'iso_a3'] = 'XKX'
            row['iso_a3'] = "XKX"
        elif country == "Somaliland":
            world.at[index, 'iso_a3'] = '---'
            row['iso_a3'] = "---"
        elif country == "N. Cyprus":
            world.at[index, 'iso_a3'] = '---'
            row['iso_a3'] = "---"
    # Add the data for the country
    data_by_country[row['iso_a3']] = get_stat(row['iso_a3'], table)
This creates a dictionary (data_by_country) with the needed data for each country. We store the data per country because not all countries have the same number of data points.
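The chain of elif branches above could also be written as a lookup table. The following is only a sketch of that alternative: `world_sample` is a plain Pandas stand-in for the GeoPandas table, and `ISO_FIXES` and `fix_iso_a3` are hypothetical names introduced here, not part of GeoPandas.

```python
import pandas as pd

# Stand-in for the GeoPandas world table; only the two columns used here
world_sample = pd.DataFrame({
    "name": ["Norway", "France", "Kosovo", "Somaliland", "Denmark"],
    "iso_a3": ["-99", "-99", "-99", "-99", "DNK"],
})

# Same replacements as the elif chain, kept in one place
ISO_FIXES = {
    "Norway": "NOR",
    "France": "FRA",
    "Kosovo": "XKX",
    "Somaliland": "---",
    "N. Cyprus": "---",
}


def fix_iso_a3(frame):
    """Return a copy where '-99' codes are replaced using ISO_FIXES."""
    fixed = frame.copy()
    missing = fixed["iso_a3"] == "-99"
    # Names without an entry in ISO_FIXES keep the '-99' placeholder
    fixed.loc[missing, "iso_a3"] = (
        fixed.loc[missing, "name"].map(ISO_FIXES).fillna("-99")
    )
    return fixed


fixed_sample = fix_iso_a3(world_sample)
print(fixed_sample["iso_a3"].tolist())
```

Adding another country then only requires one more entry in the dictionary, and the loop that fills data_by_country can stay as it is.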
Step 4: Create a Choropleth map for each date and save it as an image
This can be achieved by using matplotlib.
The idea is to go through all dates and, for each country, check whether it has data for that date. If it does, that value is used; otherwise the country keeps the value 0.
import pandas as pd
import geopandas
import matplotlib.pyplot as plt
# Just to get more rows, columns and display width
pd.set_option('display.max_rows', 300)
pd.set_option('display.max_columns', 300)
pd.set_option('display.width', 1000)
# Get the updated data
table = pd.read_csv("https://opendata.ecdc.europa.eu/covid19/casedistribution/csv")
# Convert dateRep to date object
table['date'] = pd.to_datetime(table['dateRep'], format='%d/%m/%Y')
# Use date for index
table = table.set_index('date')
def get_stat(country_code, table):
    data = table.loc[table['countryterritoryCode'] == country_code]
    data = data.reindex(index=data.index[::-1])
    data['7 days sum'] = data['cases'].rolling(7).sum()
    data['7ds/100000'] = data['7 days sum'] * 100000 / data['popData2019']
    data['14 mean'] = data['7ds/100000'].rolling(14).mean()
    return data
# Read the data to make a choropleth map
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
world = world[(world.pop_est > 0) & (world.name != "Antarctica")]
# Store data per country to make it easier
data_by_country = {}
for index, row in world.iterrows():
    # The world data is not fully updated with ISO_A3 names
    if row['iso_a3'] == '-99':
        country = row['name']
        if country == "Norway":
            world.at[index, 'iso_a3'] = 'NOR'
            row['iso_a3'] = "NOR"
        elif country == "France":
            world.at[index, 'iso_a3'] = 'FRA'
            row['iso_a3'] = "FRA"
        elif country == 'Kosovo':
            world.at[index, 'iso_a3'] = 'XKX'
            row['iso_a3'] = "XKX"
        elif country == "Somaliland":
            world.at[index, 'iso_a3'] = '---'
            row['iso_a3'] = "---"
        elif country == "N. Cyprus":
            world.at[index, 'iso_a3'] = '---'
            row['iso_a3'] = "---"
    # Add the data for the country
    data_by_country[row['iso_a3']] = get_stat(row['iso_a3'], table)
# Create an image per date
for day in pd.date_range('12-31-2019', '10-01-2020'):
    print(day)
    world['number'] = 0.0
    for index, row in world.iterrows():
        if day in data_by_country[row['iso_a3']].index:
            world.at[index, 'number'] = data_by_country[row['iso_a3']].loc[day]['14 mean']
    world.plot(column='number', legend=True, cmap='OrRd', figsize=(15, 5))
    plt.title(day.strftime("%Y-%m-%d"))
    plt.savefig(f'image-{day.strftime("%Y-%m-%d")}.png')
    plt.close()
This will create an image for each day. These images will be combined.
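The inner loop over all countries runs once per day. One possible alternative, sketched here on toy data, is to build a single table with one row per date and one column per country and then look up a whole day at once. The names `toy_by_country`, `wide`, `iso_codes` and `values_for_day` are hypothetical stand-ins, with `toy_by_country` playing the role of data_by_country and `iso_codes` the role of world['iso_a3'].

```python
import pandas as pd

dates = pd.date_range("2020-03-01", periods=3)

# Toy stand-in for data_by_country: AAA has three days of data, BBB only two
toy_by_country = {
    "AAA": pd.DataFrame({"14 mean": [1.0, 2.0, 3.0]}, index=dates),
    "BBB": pd.DataFrame({"14 mean": [5.0, 6.0]}, index=dates[1:]),
}

# One column per country, one row per date; days without data become NaN
wide = pd.DataFrame({code: df["14 mean"] for code, df in toy_by_country.items()})

# Stand-in for world['iso_a3']; CCC has no data at all
iso_codes = pd.Series(["AAA", "BBB", "CCC"])


def values_for_day(day):
    """Look up every country's value for one day, defaulting to 0.0."""
    return iso_codes.map(wide.loc[day]).fillna(0.0)


first_day = values_for_day(dates[0]).tolist()
last_day = values_for_day(dates[2]).tolist()
print(first_day, last_day)
```

Note one difference from the loop above: this sketch also replaces NaN values (for example the first days before the rolling windows are full) with 0.0, whereas the loop copies them into the map as they are.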
Step 5: Create a video from images with OpenCV
Using OpenCV to create a video from a sequence of images is quite easy. The only thing you need to ensure is that it reads the images in the correct order.
import cv2
import glob
img_array = []
filenames = glob.glob('image-*.png')
filenames.sort()
for filename in filenames:
    print(filename)
    img = cv2.imread(filename)
    height, width, layers = img.shape
    size = (width, height)
    img_array.append(img)
out = cv2.VideoWriter('covid.avi', cv2.VideoWriter_fourcc(*'DIVX'), 15, size)
for i in range(len(img_array)):
    out.write(img_array[i])
out.release()
Here we use the VideoWriter from OpenCV, writing 15 frames per second with the DIVX codec.
This results in this video.
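Sorting the filenames as plain strings gives the right frame order only because the images from Step 4 are named with the year first. A small sketch using only the standard library illustrates this. The day-first names are an assumption added for contrast and are not used anywhere in the steps above.

```python
from datetime import date, timedelta

# Three consecutive days that cross a month boundary
days = [date(2020, 9, 30) + timedelta(days=i) for i in range(3)]

# Year-first names, as produced in Step 4
iso_names = [f"image-{d.strftime('%Y-%m-%d')}.png" for d in days]

# Day-first names, shown only to illustrate what would go wrong
day_first_names = [f"image-{d.strftime('%d-%m-%Y')}.png" for d in days]

# A plain string sort keeps the year-first names in chronological order ...
iso_sorted_ok = sorted(iso_names) == iso_names

# ... but puts 01-10-2020 before 30-09-2020 for the day-first names
day_first_sorted_ok = sorted(day_first_names) == day_first_names

print(iso_sorted_ok, day_first_sorted_ok)
```

If you change the filename pattern in Step 4, it is worth keeping the year-month-day order so that filenames.sort() still returns the frames chronologically.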
How can I change the date that the video starts?
Hi Augustin,
After the index is set to use date, you can limit the start date:
table = table[table.index >= '2021-01-01']