    Visualize Inflation for 2019 using Pandas-datareader and GeoPandas

    What will we cover in this tutorial?

    In this tutorial we will visualize the inflation on a map. This will be done by getting the inflation data directly from World Bank using the Pandas-datareader. This data will be joined with data from GeoPandas, which provides a world map we can use to create a Choropleth map.

    The end result

    Step 1: Retrieve the inflation data from World Bank

    The Pandas-datareader has an interface to get data from World Bank. To find interesting data from World Bank you should explore data.worldbank.org, which contains various interesting indicators.

    Once you find one, such as Inflation, consumer prices (annual %), which we will use here, you will see that it can be downloaded as CSV, XML, or Excel. Instead, we will use the API directly, so that we get fresh data every time we run our program.

    To use the API, we need the indicator code, which you will find in the URL. In this case:

    https://data.worldbank.org/indicator/FP.CPI.TOTL.ZG
    

    Hence, the indicator code is FP.CPI.TOTL.ZG.
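
    As a small sketch of that step, the indicator code can be read off the URL as its last path segment. The helper name indicator_from_url is hypothetical and not part of any library; it only assumes the URL has the shape shown above.

```python
from urllib.parse import urlparse


def indicator_from_url(url):
    """Return the last path segment of a World Bank indicator URL."""
    path = urlparse(url).path  # e.g. '/indicator/FP.CPI.TOTL.ZG'
    return path.rstrip('/').split('/')[-1]


indicator = indicator_from_url('https://data.worldbank.org/indicator/FP.CPI.TOTL.ZG')
print(indicator)
```

    Using urlparse rather than splitting the raw string means a trailing slash or a query string after the code does not end up in the result.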

    Using the Pandas-datareader API you can get the data by running the following piece of code.

    from pandas_datareader import wb
    data = wb.download(indicator='FP.CPI.TOTL.ZG', country='all', start=2019, end=2019)
    print(data)
    

    If you inspect the output, you will see that it is structured a bit inconveniently.

                                                             FP.CPI.TOTL.ZG
    country                                            year                
    Arab World                                         2019        1.336016
    Caribbean small states                             2019             NaN
    Central Europe and the Baltics                     2019        2.664561
    Early-demographic dividend                         2019        3.030587
    East Asia & Pacific                                2019        1.773102
    East Asia & Pacific (excluding high income)        2019        2.779172
    East Asia & Pacific (IDA & IBRD countries)         2019        2.779172
    

    It has two index levels: country and year.

    We want to reset index level 1 (the year), which turns year into a column. Then, for convenience, we rename the columns.

    from pandas_datareader import wb
    data = wb.download(indicator='FP.CPI.TOTL.ZG', country='all', start=2019, end=2019)
    data = data.reset_index(1)
    data.columns = ['year', 'inflation']
    print(data)
    

    Resulting in the following.

                                                        year  inflation
    country                                                            
    Arab World                                          2019   1.336016
    Caribbean small states                              2019        NaN
    Central Europe and the Baltics                      2019   2.664561
    Early-demographic dividend                          2019   3.030587
    East Asia & Pacific                                 2019   1.773102
    East Asia & Pacific (excluding high income)         2019   2.779172
    East Asia & Pacific (IDA & IBRD countries)          2019   2.779172
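
    The same reshaping can be tried without a network connection. The sketch below builds a small frame by hand with the same two-level index (the values are copied from the output above; storing the year as text is an assumption) and resets the level by name rather than by position, which may read more clearly.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the World Bank result: a (country, year) index
# and one column named after the indicator.
index = pd.MultiIndex.from_tuples(
    [('Arab World', '2019'),
     ('Caribbean small states', '2019'),
     ('Central Europe and the Baltics', '2019')],
    names=['country', 'year'],
)
data = pd.DataFrame({'FP.CPI.TOTL.ZG': [1.336016, np.nan, 2.664561]}, index=index)

# Move the 'year' level out of the index, addressing it by name.
data = data.reset_index(level='year')
# Rename only the indicator column; 'year' keeps its name.
data = data.rename(columns={'FP.CPI.TOTL.ZG': 'inflation'})
print(data)
```

    Renaming through a dictionary instead of assigning data.columns does not depend on the column order, so it keeps working if more columns are added later.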
    

    Step 2: Retrieve the world map data

    The world map data is available from GeoPandas. At first glance everything is easy.

    import geopandas
    map = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
    map = map[map['name'] != 'Antarctica']
    print(map)
    

    Here I excluded Antarctica for visual purposes. Inspecting some of the output:

            pop_est                continent                      name iso_a3   gdp_md_est                                           geometry
    0        920938                  Oceania                      Fiji    FJI      8374.00  MULTIPOLYGON (((180.00000 -16.06713, 180.00000...
    1      53950935                   Africa                  Tanzania    TZA    150600.00  POLYGON ((33.90371 -0.95000, 34.07262 -1.05982...
    2        603253                   Africa                 W. Sahara    ESH       906.50  POLYGON ((-8.66559 27.65643, -8.66512 27.58948...
    3      35623680            North America                    Canada    CAN   1674000.00  MULTIPOLYGON (((-122.84000 49.00000, -122.9742...
    4     326625791            North America  United States of America    USA  18560000.00  MULTIPOLYGON (((-122.84000 49.00000, -120.0000...
    5      18556698                     Asia                Kazakhstan    KAZ    460700.00  POLYGON ((87.35997 49.21498, 86.59878 48.54918...
    6      29748859                     Asia                Uzbekistan    UZB    202300.00  POLYGON ((55.96819 41.30864, 55.92892 44.99586...
    

    It seems to be a good match to join the data on the name column.

    To make that easy, we can make the name column the index.

    import geopandas
    map = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
    map = map[map['name'] != 'Antarctica']
    map = map.set_index('name')
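
    A join on the index only behaves predictably if each name occurs once. As a hedged sketch on a hand-made table (not the real map data), pandas' set_index can check this through its verify_integrity argument.

```python
import pandas as pd

# Hand-made stand-in for the map table; only the 'name' column matters here.
countries = pd.DataFrame({
    'name': ['Fiji', 'Tanzania', 'Canada'],
    'continent': ['Oceania', 'Africa', 'North America'],
})

# verify_integrity=True raises ValueError if the new index has duplicates.
indexed = countries.set_index('name', verify_integrity=True)
print(indexed.index.is_unique)

# The same call on a table with a repeated name fails loudly.
with_duplicate = pd.concat([countries, countries.iloc[[0]]], ignore_index=True)
try:
    with_duplicate.set_index('name', verify_integrity=True)
    duplicate_detected = False
except ValueError:
    duplicate_detected = True
print(duplicate_detected)
```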
    

    Step 3: Joining the datasets

    This is the fun part of Data Science. Why? I am glad you asked. Well, that was ironic. The challenge will become apparent in a moment. There are various ways to deal with it, but in this tutorial we will use a simple approach.

    Let us do the join.

    from pandas_datareader import wb
    import geopandas
    import pandas as pd
    pd.set_option('display.width', 3000)
    pd.set_option('display.max_columns', 300)
    pd.set_option('display.max_rows', 500)
    map = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
    map = map[map['name'] != 'Antarctica']
    map = map.set_index('name')
    data = wb.download(indicator='FP.CPI.TOTL.ZG', country='all', start=2019, end=2019)
    data = data.reset_index(1)
    data.columns = ['year', 'inflation']
    map = map.join(data, how='outer')
    print(map)
    

    Here I use an outer join to make all the “challenges” visible.

    Russia                                              1.422575e+08                   Europe    RUS   3745000.00  MULTIPOLYGON (((178.72530 71.09880, 180.00000 ...   NaN        NaN
    Russian Federation                                           NaN                      NaN    NaN          NaN                                               None  2019   4.470367
    ...
    United States                                                NaN                      NaN    NaN          NaN                                               None  2019   1.812210
    United States of America                            3.266258e+08            North America    USA  18560000.00  MULTIPOLYGON (((-122.84000 49.00000, -120.0000...   NaN        NaN
    

    Here I only show two snippets. The key point is that the data from GeoPandas, containing the map, and the data from World Bank, containing the inflation rates we want to color the map with, do not line up: the same country appears on two separate rows under two different names.

    Hence, we need to match United States with United States of America, and Russia with Russian Federation.
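
    Rather than scanning the printed table by eye, the unmatched names can be listed directly. This is a minimal sketch on hand-made name lists, using pandas' merge with indicator=True, which adds a _merge column saying which side each row came from.

```python
import pandas as pd

# Hand-made stand-ins: names as the map spells them and as World Bank spells them.
map_names = pd.DataFrame({'name': ['Canada', 'Russia', 'United States of America']})
wb_names = pd.DataFrame({'name': ['Canada', 'Russian Federation', 'United States', 'Arab World']})

# The outer merge keeps every name; '_merge' records where each one was found.
merged = map_names.merge(wb_names, on='name', how='outer', indicator=True)

only_in_map = sorted(merged.loc[merged['_merge'] == 'left_only', 'name'])
only_in_wb = sorted(merged.loc[merged['_merge'] == 'right_only', 'name'])
print(only_in_map)
print(only_in_wb)
```

    Names that appear only on the World Bank side include aggregates such as Arab World, which have no counterpart on the map and can be ignored.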

    One approach would be to use a location service that maps countries to country codes, and then map each dataset's country names to country codes (note that GeoPandas already has 3-letter country codes, but some are missing, such as Norway's). This approach can still leave gaps, as some country names are not known to the mapping.
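
    A rough sketch of the code-based approach, on hand-made tables. Several things here are assumptions: that a missing code shows up as the placeholder '-99', that the inflation data carries a code column (called iso3c here), and the missing_codes lookup, which is written by hand. The Norway figure is a placeholder, not real data; the other two values are copied from the output above.

```python
import pandas as pd

# Hand-made stand-in for the map table, with one missing country code.
world = pd.DataFrame({
    'name': ['Russia', 'United States of America', 'Norway'],
    'iso_a3': ['RUS', 'USA', '-99'],  # '-99' marks a missing code (assumption)
})
# Hand-made stand-in for inflation data keyed by country code.
inflation = pd.DataFrame({
    'iso3c': ['RUS', 'USA', 'NOR'],
    'inflation': [4.470367, 1.812210, 2.0],  # 2.0 is a placeholder value
})

# Patch the codes the map lacks with a small hand-written lookup.
missing_codes = {'Norway': 'NOR'}
needs_patch = world['iso_a3'] == '-99'
world.loc[needs_patch, 'iso_a3'] = world.loc[needs_patch, 'name'].map(missing_codes)

# Join on the codes, so the differing country names no longer matter.
joined = world.merge(inflation, left_on='iso_a3', right_on='iso3c', how='left')
print(joined[['name', 'iso_a3', 'inflation']])
```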

    Another approach is to find all the names that did not match and rename them in one of the datasets. This can take some time, but I did most of them in the following.

    from pandas_datareader import wb
    import geopandas
    map = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
    map = map[map['name'] != 'Antarctica']
    map = map.set_index('name')
    index_change = {
        'United States of America': 'United States',
        'Yemen': 'Yemen, Rep.',
        'Venezuela': 'Venezuela, RB',
        'Syria': 'Syrian Arab Republic',
        'Solomon Is.': 'Solomon Islands',
        'Russia': 'Russian Federation',
        'Iran': 'Iran, Islamic Rep.',
        'Gambia': 'Gambia, The',
        'Kyrgyzstan': 'Kyrgyz Republic',
        'Egypt': 'Egypt, Arab Rep.'
    }
    map = map.rename(index=index_change)
    data = wb.download(indicator='FP.CPI.TOTL.ZG', country='all', start=2019, end=2019)
    data = data.reset_index(1)
    data.columns = ['year', 'inflation']
    map = map.join(data, how='outer')
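
    A hand-written rename dictionary like the one above is easy to get subtly wrong, for example by renaming a country that already matches to a different, similarly named one (Mauritania and Mauritius are two different countries). The sketch below is one possible sanity check; check_renames is a hypothetical helper and the name lists are hand-made.

```python
def check_renames(index_change, map_names, wb_names):
    """Return a list of problems found in a map-name -> World Bank-name dict."""
    map_names, wb_names = set(map_names), set(wb_names)
    problems = []
    for old, new in index_change.items():
        if old not in map_names:
            problems.append(f"{old!r} is not a name in the map data")
        if new not in wb_names:
            problems.append(f"{new!r} is not a name in the World Bank data")
        if old in wb_names:
            # The name matches as it is; renaming it would attach another
            # country's numbers to this shape.
            problems.append(f"{old!r} already matches; renaming it would break the match")
    return problems


# Hand-made stand-ins for the two sets of names.
map_names = ['Russia', 'Mauritania', 'Egypt']
wb_names = ['Russian Federation', 'Mauritania', 'Mauritius', 'Egypt, Arab Rep.']
index_change = {
    'Russia': 'Russian Federation',
    'Mauritania': 'Mauritius',
    'Egypt': 'Egypt, Arab Rep.',
}

problems = check_renames(index_change, map_names, wb_names)
for problem in problems:
    print(problem)
```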
    

    Step 4: Making a Choropleth map based on our dataset

    The simple plot of the data will not be very insightful. But let’s try that first.

    import matplotlib.pyplot as plt
    map.plot('inflation')
    plt.title("Inflation 2019")
    plt.show()
    

    Resulting in the following.

    The default result.

    A good way to get inspiration is to check out the documentation with examples.

    From the GeoPandas documentation

    There you see a cool color map with scheme='quantiles'. Let’s try that. Note that the scheme argument requires the mapclassify package to be installed.

    map.plot('inflation', cmap='OrRd', scheme='quantiles')
    plt.title("Inflation 2019")
    plt.show()
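
    To get a feel for what a quantile scheme does, here is a small sketch using pandas' qcut as a stand-in. It is not the classifier GeoPandas calls, so the exact bin edges may differ; the use of five classes is an assumption, and the values are made up purely to show the binning.

```python
import pandas as pd

# Made-up inflation values, with one extreme outlier at the end.
inflation = pd.Series([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 6.0, 10.0, 250.0])

# Five quantile bins: each bin receives the same number of values,
# so the outlier does not stretch the color scale.
bins = pd.qcut(inflation, q=5, labels=False)
counts = bins.value_counts().sort_index().tolist()
print(bins.tolist())
print(counts)
```

    With equal-width bins, nine of these ten values would land in the lowest class and look identical on the map. Quantile bins spread them evenly over the colors.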
    

    Resulting in the following.

    Closer

    Finally, we add a grey tone for countries with no data, add a legend, and set the figure size. Then we are done. The full source code is here.

    from pandas_datareader import wb
    import geopandas
    import pandas as pd
    import matplotlib.pyplot as plt
    map = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
    map = map[map['name'] != 'Antarctica']
    map = map.set_index('name')
    index_change = {
        'United States of America': 'United States',
        'Yemen': 'Yemen, Rep.',
        'Venezuela': 'Venezuela, RB',
        'Syria': 'Syrian Arab Republic',
        'Solomon Is.': 'Solomon Islands',
        'Russia': 'Russian Federation',
        'Iran': 'Iran, Islamic Rep.',
        'Gambia': 'Gambia, The',
        'Kyrgyzstan': 'Kyrgyz Republic',
        'Egypt': 'Egypt, Arab Rep.'
    }
    map = map.rename(index=index_change)
    data = wb.download(indicator='FP.CPI.TOTL.ZG', country='all', start=2019, end=2019)
    data = data.reset_index(1)
    data.columns = ['year', 'inflation']
    map = map.join(data, how='outer')
    map.plot('inflation', cmap='OrRd', scheme='quantiles', missing_kwds={"color": "lightgrey"}, legend=True, figsize=(14,5))
    plt.title("Inflation 2019")
    plt.show()
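
    The outer join in the script keeps World Bank rows that have no shape on the map, such as the Arab World aggregate. As a sketch of an alternative on hand-made tables (inflation values copied from the outputs above), a left join keeps exactly the rows of the map and makes it easy to count the countries that will be drawn grey.

```python
import pandas as pd

# Hand-made stand-ins, indexed by country name like the real tables.
world = pd.DataFrame(
    {'continent': ['Europe', 'North America', 'Africa']},
    index=pd.Index(['Russian Federation', 'United States', 'W. Sahara'], name='name'),
)
data = pd.DataFrame(
    {'inflation': [4.470367, 1.812210, 1.336016]},
    index=pd.Index(['Russian Federation', 'United States', 'Arab World'], name='country'),
)

outer = world.join(data, how='outer')  # also keeps 'Arab World', which has no shape
left = world.join(data, how='left')    # keeps exactly the rows of the map

print(len(outer), len(left))
# Countries on the map without an inflation value (drawn grey).
missing = int(left['inflation'].isna().sum())
print(missing)
```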
    

    Resulting in the following output.

    Inflation data from World Bank mapped on a Choropleth map using GeoPandas and MatPlotLib.
