CSV GroupBy Processing to Excel with Charts using pandas (Python)

How to create Excel charts from a CSV file in Python

You will learn how to read CSV data to Excel using Python. It will be a bit more, you will read the CSV data from GitHub, then group the data by unique values in a column and sum it. Then how to group and sum data on a monthly basis. Finally, how to export this into a multiple-sheet Excel document with the chart.

Step 1: Get and inspect the data

We can use pandas to read the CSV data (see more about CSV files here).

import pandas as pd
url = 'https://raw.githubusercontent.com/LearnPythonWithRune/LearnPython/main/files/SalesData.csv'
data = pd.read_csv(url, delimiter=';', parse_dates=True, index_col='Date')
print(data.head())

This will read our data directly from GitHub and show the first few lines.

            Sales rep        Item  Price  Quantity  Sale
Date                                                    
2020-05-31        Mia     Markers      4         1     4
2020-02-01        Mia  Desk chair    199         2   398
2020-09-21     Oliver       Table   1099         2  2198
2020-07-15  Charlotte    Desk pad      9         2    18
2020-05-27       Emma        Book     12         1    12

This data shows different sales representatives and a list of their sales in 2020.

Step 2: Use GroupBy to get sales of each to represent and monthly sales

It is easy to group data by columns. The below code will first group all the Sales reps and sum their sales. Second, it will group the data in months and sum it up.

repr_sales = data.groupby("Sales rep").sum()['Sale']
print(repr_sales)
monthly_sale = data.groupby(pd.Grouper(freq='M')).sum()['Sale']
monthly_sale.index = monthly_sale.index.month_name()
print(monthly_sale)

This gives.

Sales rep
Charlotte     74599
Emma          65867
Ethan         40970
Liam          66989
Mia           88199
Noah          78575
Oliver        89355
Sophia       103480
William       80400
Name: Sale, dtype: int64
Date
January      69990
February     51847
March        67500
April        58401
May          40319
June         59397
July         64251
August       51571
September    55666
October      50093
November     57458
December     61941
Name: Sale, dtype: int64

Step 3: Create a multiple-sheet Excel document with charts

Now for the export magic.

workbook = pd.ExcelWriter("SalesReport.xlsx")
repr_sales.to_excel(workbook, sheet_name='Sales per rep')
monthly_sale.to_excel(workbook, sheet_name='Monthly')
chart1 = workbook.book.add_chart({'type': 'column'})
# Configure the first series.
chart1.add_series({
    'name':       'Sales per rep',
    'categories': '=\'Sales per rep\'!$A$2:$A$10',
    'values':     '=\'Sales per rep\'!$B$2:$B$10',
})
workbook.sheets['Sales per rep'].insert_chart('D2', chart1)
chart1 = workbook.book.add_chart({'type': 'column'})
# Configure the first series.
chart1.add_series({
    'name':       'Monthly sales',
    'categories': '=Monthly!$A$2:$A$13',
    'values':     '=Monthly!$B$2:$B$13',
})
workbook.sheets['Monthly'].insert_chart('D2', chart1)
workbook.close()

This will create an Excel document called SalesReport.xlsx in your working directory.

To get a detailed explanation see the video at the top of the post.

Want to learn more?

Want to learn more Python, then this is part of an 8 hours FREE video course with full explanations, projects on each level, and guided solutions.

The course is structured with the following resources to improve your learning experience.

  • 17 video lessons teaching you everything you need to know to get started with Python.
  • 34 Jupyter Notebooks with lesson code and projects.
  • 2 FREE eBooks to support your Python learning.

See the full FREE course page here.

If you instead want to learn more about Machine Learning. Do not worry.

Then check out my Machine Learning with Python course.

  • 15 video lessons teaching you all aspects of Machine Learning
  • 30 JuPyter Notebooks with lesson code and projects
  • 10 hours of FREE video content to support your learning journey.

Go to the course page for details.

Learn Python

Learn Python A BEGINNERS GUIDE TO PYTHON

  • 70 pages to get you started on your journey to master Python.
  • How to install your setup with Anaconda.
  • Written description and introduction to all concepts.
  • Jupyter Notebooks prepared for 17 projects.

Python 101: A CRASH COURSE

  1. How to get started with this 8 hours Python 101: A CRASH COURSE.
  2. Best practices for learning Python.
  3. How to download the material to follow along and create projects.
  4. A chapter for each lesson with a descriptioncode snippets for easy reference, and links to a lesson video.

Expert Data Science Blueprint

Expert Data Science Blueprint

  • Master the Data Science Workflow for actionable data insights.
  • How to download the material to follow along and create projects.
  • A chapter to each lesson with a Description, Learning Objective, and link to the lesson video.

Machine Learning

Machine Learning – The Simple Path to Mastery

  • How to get started with Machine Learning.
  • How to download the material to follow along and make the projects.
  • One chapter for each lesson with a Description, Learning Objectives, and link to the lesson video.

3 thoughts on “CSV GroupBy Processing to Excel with Charts using pandas (Python)”

  1. in windows 10 i get error when trying the example above:
    AttributeError: ‘OpenpyxlWriter’ object has no attribute ‘add_chart’

    using pytho3.10 64bit

    Nevertheless, I like your style of presenting python, and the hard work put into all the tutorials. A big thank you! I have a long way to go, but I am enjoying it so far!

    Reply
  2. further to my earlier message where i noted this doesn’t work in win doze environ. I am glad to advise that it does work fine under macos big sur (arm64), also running python3.10 (but natively under arm64)

    Reply
    • Hi,
      There can be two things.
      Most likely it just needs to install the XlsxWriter.
      This is done as follows:

      pip install xlsxwriter

      This should be run in a terminal in the environment you use.

      If this does not solve the problem, try to update the pandas package

      pip install –upgrade pandas

      This will update it to the newest version.

      Hope it helps
      Rune

      Reply

Leave a Comment