What will we cover in this tutorial?
- As an example we will use an HTML table from a Wikipedia page, in this case the one listing countries by meat consumption.
- We will see how to read the table data into a Pandas DataFrame with a single call.
- Then how to merge it with a DataFrame containing data to color countries.
- Finally, how to add the colors to a Leaflet map using a Python library.
Step 1: Read the data to a Pandas DataFrame
First we need to inspect the page we are going to parse. In this case it is the Wikipedia page listing countries by meat consumption.

What we want to do is gather the data from the table and plot it on a world map, using colors to indicate the meat consumption.

The easiest way to work with data is by using pandas DataFrames. The pandas library has a read_html function, which returns all tables from a webpage.
This can be achieved by the following code. If you use read_html for the first time, you will need to install lxml, see this tutorial for details.
import pandas as pd
# The URL we will read our data from
url = 'https://en.wikipedia.org/wiki/List_of_countries_by_meat_consumption'
# read_html returns a list of tables from the URL
tables = pd.read_html(url)
# The data is in the first table - note that this can change over time, since Wikipedia pages are updated frequently
table = tables[0]
print(table.head())
Resulting in the following output.
Country Kg/person (2002)[9][note 1] Kg/person (2009)[10]
0 Albania 38.2 NaN
1 Algeria 18.3 19.5
2 American Samoa 24.9 26.8
3 Angola 19.0 22.4
4 Antigua and Barbuda 56.0 84.3
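Since the position of the table in the returned list can change as the page is edited, read_html's match parameter can select tables by their text content instead of hard-coding an index. Here is a minimal sketch using an inline HTML string as a stand-in for the Wikipedia page (the table contents are made up for illustration):

```python
from io import StringIO

import pandas as pd

# A tiny HTML table standing in for the Wikipedia page
html = """
<table>
  <tr><th>Country</th><th>Kg/person (2002)</th></tr>
  <tr><td>Albania</td><td>38.2</td></tr>
  <tr><td>Algeria</td><td>18.3</td></tr>
</table>
"""

# match= keeps only tables whose text matches the pattern,
# which is more robust than hard-coding tables[0]
tables = pd.read_html(StringIO(html), match="Country")
table = tables[0]
print(table.shape)  # → (2, 2)
```

The same match argument can be passed when reading from the Wikipedia URL directly.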
Step 2: Merge the data with a world map
The next thing we want to do is map the table onto a world map that we can color.
This can be done by using geopandas. Note that the bundled naturalearth_lowres dataset used below was deprecated in geopandas 0.14 and removed in 1.0, so the examples require an older geopandas version (or an equivalent Natural Earth file).
import pandas as pd
import geopandas
# The URL we will read our data from
url = 'https://en.wikipedia.org/wiki/List_of_countries_by_meat_consumption'
# read_html returns a list of tables from the URL
tables = pd.read_html(url)
# The data is in the first table - note that this can change over time, since Wikipedia pages are updated frequently
table = tables[0]
print(table.head())
# Read the geopandas dataset
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
print(world.head())
Which results in the following output.
Country Kg/person (2002)[9][note 1] Kg/person (2009)[10]
0 Albania 38.2 NaN
1 Algeria 18.3 19.5
2 American Samoa 24.9 26.8
3 Angola 19.0 22.4
4 Antigua and Barbuda 56.0 84.3
pop_est continent name iso_a3 gdp_md_est geometry
0 920938 Oceania Fiji FJI 8374.0 MULTIPOLYGON (((180.00000 -16.06713, 180.00000...
1 53950935 Africa Tanzania TZA 150600.0 POLYGON ((33.90371 -0.95000, 34.07262 -1.05982...
2 603253 Africa W. Sahara ESH 906.5 POLYGON ((-8.66559 27.65643, -8.66512 27.58948...
3 35623680 North America Canada CAN 1674000.0 MULTIPOLYGON (((-122.84000 49.00000, -122.9742...
4 326625791 North America United States of America USA 18560000.0 MULTIPOLYGON (((-122.84000 49.00000, -120.0000...
Here we can see that the Country column of the table DataFrame should be matched against the name column of the world DataFrame.
Let's do the merge on those columns.
import pandas as pd
import geopandas
# The URL we will read our data from
url = 'https://en.wikipedia.org/wiki/List_of_countries_by_meat_consumption'
# read_html returns a list of tables from the URL
tables = pd.read_html(url)
# The data is in the first table - note that this can change over time, since Wikipedia pages are updated frequently
table = tables[0]
# Read the geopandas dataset
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
# Merge the two DataFrames together
table = world.merge(table, how="left", left_on=['name'], right_on=['Country'])
print(table.head())
Which results in the following output.
pop_est continent ... kg/person (2009)[10] kg/person (2017)[11]
0 920938 Oceania ... 38.8 NaN
1 53950935 Africa ... 9.6 6.82
2 603253 Africa ... NaN NaN
3 35623680 North America ... 94.3 69.99
4 326625791 North America ... 120.2 98.60
[5 rows x 10 columns]
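Country names that differ between the two sources silently end up as NaN after a left merge. pandas' indicator parameter to merge can surface the unmatched rows; here is a small self-contained sketch with made-up stand-in data (real mismatches, such as differing spellings of the same country, would show up the same way):

```python
import pandas as pd

# Stand-ins for the world and table DataFrames
world = pd.DataFrame({'name': ['Fiji', 'Tanzania', 'United States of America']})
table = pd.DataFrame({'Country': ['Tanzania', 'United States'],
                      'kg/person': [9.6, 120.2]})

merged = world.merge(table, how='left', left_on='name',
                     right_on='Country', indicator=True)
# Rows tagged 'left_only' had no match in the consumption table
unmatched = merged.loc[merged['_merge'] == 'left_only', 'name']
print(list(unmatched))  # → ['Fiji', 'United States of America']
```

Checking this list tells you which countries will be missing from the map because their names did not line up.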
Here we also notice that some rows did not get any data from table, resulting in NaN values. To get a clearer view we will remove those rows.
import pandas as pd
import geopandas
# The URL we will read our data from
url = 'https://en.wikipedia.org/wiki/List_of_countries_by_meat_consumption'
# read_html returns a list of tables from the URL
tables = pd.read_html(url)
# The data is in the first table - note that this can change over time, since Wikipedia pages are updated frequently
table = tables[0]
# Read the geopandas dataset
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
# Merge the two DataFrames together
table = world.merge(table, how="left", left_on=['name'], right_on=['Country'])
# Clean data: remove rows with no data
table = table.dropna(subset=['kg/person (2002)[9][note 1]'])
The rows can be removed by using dropna; note that the subset argument restricts the NaN check to the listed column(s), so rows with NaN in other columns are kept.
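To see what subset does in isolation, here is a minimal, self-contained example (the column name is shortened for readability):

```python
import pandas as pd

df = pd.DataFrame({'name': ['Fiji', 'Tanzania'],
                   'kg/person (2002)': [None, 9.6]})

# subset= restricts the NaN check to the listed column(s);
# rows that are NaN there are dropped, all others are kept
cleaned = df.dropna(subset=['kg/person (2002)'])
print(list(cleaned['name']))  # → ['Tanzania']
```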
Step 3: Add the data as colors on an interactive world map
Finally, you can use folium to create a Leaflet map.
import pandas as pd
import folium
import geopandas
# The URL we will read our data from
url = 'https://en.wikipedia.org/wiki/List_of_countries_by_meat_consumption'
# read_html returns a list of tables from the URL
tables = pd.read_html(url)
# The data is in the first table - note that this can change over time, since Wikipedia pages are updated frequently
table = tables[0]
# Read the geopandas dataset
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
# Merge the two DataFrames together
table = world.merge(table, how="left", left_on=['name'], right_on=['Country'])
# Clean data: remove rows with no data
table = table.dropna(subset=['kg/person (2002)[9][note 1]'])
# Create a map
my_map = folium.Map()
# Add the data
folium.Choropleth(
    geo_data=table,
    name='choropleth',
    data=table,
    columns=['Country', 'kg/person (2002)[9][note 1]'],
    key_on='feature.properties.name',
    fill_color='OrRd',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Meat consumption in kg/person'
).add_to(my_map)
my_map.save('meat.html')
This results in an HTML page with an interactive world map, like this one.
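The key_on argument deserves a short note: folium matches the first entry of columns ('Country') against the path that key_on names inside each GeoJSON feature produced from geo_data. A toy feature (made up for illustration) shows what the path 'feature.properties.name' points at:

```python
# A toy GeoJSON feature, like those geopandas produces for geo_data
feature = {
    "type": "Feature",
    "properties": {"name": "Algeria"},
    "geometry": {"type": "Point", "coordinates": [3.0, 28.0]},
}

# key_on='feature.properties.name' tells folium to look up this value
# and match it against the 'Country' column of the data
value = feature["properties"]["name"]
print(value)  # → Algeria
```

If the values at that path do not line up with the data's key column, the corresponding countries are simply left uncolored.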