Plot World Data to Map Using Python in 3 Easy Steps

What will we cover in this tutorial

  • As an example we will use the HTML table from a Wikipedia page - in this case the one listing countries by meat consumption.
  • We will see how to read the table data into a Pandas DataFrame with a single call.
  • Then how to merge it with a DataFrame containing the data needed to color countries on a map.
  • Finally, how to add the colors to a Leaflet map using a Python library.

Step 1: Read the data to a Pandas DataFrame

We need to inspect the page we are going to parse. In this case it is the Wikipedia page listing world meat consumption.


What we want to do is gather the data from the table and plot it on a world map, using colors to indicate the meat consumption.

End result

The easiest way to work with data is by using Pandas DataFrames. The Pandas library has a read_html function, which returns all tables from a webpage.

This can be achieved with the following code. If you are using read_html for the first time, you will need to install lxml; see this tutorial for details.

import pandas as pd

# The URL we will read our data from
url = 'https://en.wikipedia.org/wiki/List_of_countries_by_meat_consumption'
# read_html returns a list of tables from the URL
tables = pd.read_html(url)

# The data is in the first table - this can change over time, as Wikipedia is updated constantly.
table = tables[0]

print(table.head())

Resulting in the following output.

               Country  kg/person (2002)[9][note 1] kg/person (2009)[10]
0              Albania                         38.2                  NaN
1              Algeria                         18.3                 19.5
2       American Samoa                         24.9                 26.8
3               Angola                         19.0                 22.4
4  Antigua and Barbuda                         56.0                 84.3
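A side note: the Wikipedia column names carry footnote markers such as [9] and [note 1]. If you prefer cleaner labels, you can strip them with a regular expression. Here is a minimal sketch on a stand-in DataFrame (the rest of this tutorial keeps the original names, so this step is optional):

```python
import re

import pandas as pd

# Stand-in for the Wikipedia table; the footnoted column names match
# the ones used later in the tutorial (e.g. in the dropna call)
df = pd.DataFrame({
    'Country': ['Albania', 'Algeria'],
    'kg/person (2002)[9][note 1]': [38.2, 18.3],
    'kg/person (2009)[10]': [float('nan'), 19.5],
})

# Remove bracketed footnote markers such as [9] and [note 1]
df.columns = [re.sub(r'\[.*?\]', '', col).strip() for col in df.columns]

print(df.columns.tolist())  # ['Country', 'kg/person (2002)', 'kg/person (2009)']
```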

Step 2: Merging the data with a world map

The next thing we want to do is map the data onto a world map that we can color.

This can be done by using geopandas.

import pandas as pd
import geopandas


# The URL we will read our data from
url = 'https://en.wikipedia.org/wiki/List_of_countries_by_meat_consumption'
# read_html returns a list of tables from the URL
tables = pd.read_html(url)

# The data is in the first table - this can change over time, as Wikipedia is updated constantly.
table = tables[0]

print(table.head())

# Read the geopandas dataset
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))

print(world.head())

Which results in the following output.

               Country  kg/person (2002)[9][note 1] kg/person (2009)[10]
0              Albania                         38.2                  NaN
1              Algeria                         18.3                 19.5
2       American Samoa                         24.9                 26.8
3               Angola                         19.0                 22.4
4  Antigua and Barbuda                         56.0                 84.3
     pop_est      continent                      name iso_a3  gdp_md_est                                           geometry
0     920938        Oceania                      Fiji    FJI      8374.0  MULTIPOLYGON (((180.00000 -16.06713, 180.00000...
1   53950935         Africa                  Tanzania    TZA    150600.0  POLYGON ((33.90371 -0.95000, 34.07262 -1.05982...
2     603253         Africa                 W. Sahara    ESH       906.5  POLYGON ((-8.66559 27.65643, -8.66512 27.58948...
3   35623680  North America                    Canada    CAN   1674000.0  MULTIPOLYGON (((-122.84000 49.00000, -122.9742...
4  326625791  North America  United States of America    USA  18560000.0  MULTIPOLYGON (((-122.84000 49.00000, -120.0000...

Here we can see that the column Country in the table DataFrame should be matched against the column name in the world DataFrame.
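Note that merging on country names is fragile: the Wikipedia table and the geopandas dataset do not always spell names identically (for example "United States of America" versus "United States"). One way to spot such mismatches before relying on the merge is the indicator option of merge. A small sketch on toy data (the names and numbers here are illustrative, not taken from the real tables):

```python
import pandas as pd

# Toy stand-ins for the world and table DataFrames
world = pd.DataFrame({'name': ['United States of America', 'Tanzania', 'Fiji']})
table = pd.DataFrame({'Country': ['United States', 'Tanzania', 'Fiji'],
                      'kg': [98.6, 6.8, 38.8]})

# indicator=True adds a _merge column showing where each row came from
merged = world.merge(table, how='left', left_on='name', right_on='Country',
                     indicator=True)

# Rows that found no match in the consumption table
unmatched = merged.loc[merged['_merge'] == 'left_only', 'name']
print(unmatched.tolist())  # ['United States of America']
```

Rows reported as left_only will end up with NaN values after the merge; you can either rename them to match or drop them later.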

Let’s do the merge on that.

import pandas as pd
import geopandas


# The URL we will read our data from
url = 'https://en.wikipedia.org/wiki/List_of_countries_by_meat_consumption'
# read_html returns a list of tables from the URL
tables = pd.read_html(url)

# The data is in the first table - this can change over time, as Wikipedia is updated constantly.
table = tables[0]

# Read the geopandas dataset
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))

# Merge the two DataFrames together
table = world.merge(table, how="left", left_on=['name'], right_on=['Country'])

print(table.head())

Which results in the following output.

     pop_est      continent  ... kg/person (2009)[10] kg/person (2017)[11]
0     920938        Oceania  ...                 38.8                  NaN
1   53950935         Africa  ...                  9.6                 6.82
2     603253         Africa  ...                  NaN                  NaN
3   35623680  North America  ...                 94.3                69.99
4  326625791  North America  ...                120.2                98.60

[5 rows x 10 columns]

Here we notice that some rows have no data from the table, resulting in NaN values. To get a clearer view, we will remove those rows.

import pandas as pd
import geopandas


# The URL we will read our data from
url = 'https://en.wikipedia.org/wiki/List_of_countries_by_meat_consumption'
# read_html returns a list of tables from the URL
tables = pd.read_html(url)

# The data is in the first table - this can change over time, as Wikipedia is updated constantly.
table = tables[0]

# Read the geopandas dataset
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))

# Merge the two DataFrames together
table = world.merge(table, how="left", left_on=['name'], right_on=['Country'])

# Clean data: remove rows with no data
table = table.dropna(subset=['kg/person (2002)[9][note 1]'])

The rows are removed using dropna, where the subset argument restricts the check to the given column.
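A minimal sketch of what dropna(subset=...) does, again on toy data:

```python
import pandas as pd

df = pd.DataFrame({'name': ['Fiji', 'W. Sahara', 'Canada'],
                   'kg': [38.8, float('nan'), 94.3]})

# Only rows where the 'kg' column is NaN are dropped
cleaned = df.dropna(subset=['kg'])
print(cleaned['name'].tolist())  # ['Fiji', 'Canada']
```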

Step 3: Add the data by colors on an interactive world map

Finally, you can use folium to create a Leaflet map.

import pandas as pd
import folium
import geopandas


# The URL we will read our data from
url = 'https://en.wikipedia.org/wiki/List_of_countries_by_meat_consumption'
# read_html returns a list of tables from the URL
tables = pd.read_html(url)

# The data is in the first table - this can change over time, as Wikipedia is updated constantly.
table = tables[0]

# Read the geopandas dataset
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))

# Merge the two DataFrames together
table = world.merge(table, how="left", left_on=['name'], right_on=['Country'])

# Clean data: remove rows with no data
table = table.dropna(subset=['kg/person (2002)[9][note 1]'])

# Create a map
my_map = folium.Map()

# Add the data
folium.Choropleth(
    geo_data=table,
    name='choropleth',
    data=table,
    columns=['Country', 'kg/person (2002)[9][note 1]'],
    key_on='feature.properties.name',
    fill_color='OrRd',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Meat consumption in kg/person'
).add_to(my_map)
my_map.save('meat.html')

This results in an HTML webpage like this one.

2 Replies to “Plot World Data to Map Using Python in 3 Easy Steps”

  1. Hi Rune. I want to flag two minor errors here, which I encountered while trying to run the code:
    First, the table should be the first table on the Wikipedia page, which is also what you use in your Youtube video:
    table=tables[0]

    Second, the “k” in “Kg/person” should be a lowercase letter, not a capital, as per the Wikipedia page.

    Also, I couldn’t get the map at the end. I got RuntimeError: b’no arguments in initialization list’ on my system. I tried to google it but couldn’t figure out how to solve it. Any ideas?

    1. Hi Himalya,

      Thank you for using my tutorial. I have made the changes you pointed out. Wikipedia is constantly updated, and the code then needs to be updated accordingly.

      I ran the code and got the map in meat.html. Notice that the code also needed an update from the capital K to a lowercase k.

      See if it works now. Otherwise, please ask again.

      Cheers,
      Rune
