What will we cover in this tutorial?
We will have some data in a Pandas DataFrame, which we want to export to an Excel sheet. Then we want to create a Scatter plot graph and fit that to a Excel trendline.
Step 1: Get the data
You might have some data already that you want to use. It can be from a HTML page (example) or CSV file.
For this purpose here we just generate some random data to use. We will use NumPy’s uniform function to generate it.
import pandas as pd import numpy as np # Generate some random increasing data data = pd.DataFrame( {'A': [np.random.uniform(0.1*i, 0.1*i + 1) for i in range(100)], 'B': [np.random.uniform(0.1*i, 0.1*i + 1) for i in range(100)]} ) print(data)
Which will generate some slightly increasing data, which is nice to fit a graph to.
The output could look something like this.
A B 0 0.039515 0.778077 1 0.451888 0.210705 2 0.992493 0.961428 3 0.317536 1.046444 4 1.220419 1.388086
Step 2: Create an Excel XlsxWriter engine
This step might require that you install the XlsxWriter library, which is needed from the Pandas library.
This can be done by the following command.
pip install xlsxwriter
Now we can create the engine in our code.
import pandas as pd import numpy as np # Generate some random increasing data data = pd.DataFrame( {'A': [np.random.uniform(0.1*i, 0.1*i + 1) for i in range(100)], 'B': [np.random.uniform(0.1*i, 0.1*i + 1) for i in range(100)]} ) # Create a Pandas Excel writer using XlsxWriter excel_file = 'output.xlsx' sheet_name = 'Data set' writer = pd.ExcelWriter(excel_file, engine='xlsxwriter')
This will setup a Excel writer engine and be ready to write to file output.xlsx.
Step 3: Write the data to Excel and create a scatter graph with a fitted Trendline
This can be done by the following code, which uses the add_series function to insert a graph.
import pandas as pd import numpy as np # Generate some random increasing data data = pd.DataFrame( {'A': [np.random.uniform(0.1*i, 0.1*i + 1) for i in range(100)], 'B': [np.random.uniform(0.1*i, 0.1*i + 1) for i in range(100)]} ) # Create a Pandas Excel writer using XlsxWriter excel_file = 'output.xlsx' sheet_name = 'Data set' writer = pd.ExcelWriter(excel_file, engine='xlsxwriter') data.to_excel(writer, sheet_name=sheet_name) # Access the XlsxWriter workbook and worksheet objects from the dataframe. workbook = writer.book worksheet = writer.sheets[sheet_name] # Create a scatter chart object. chart = workbook.add_chart({'type': 'scatter'}) # Get the number of rows and column index max_row = len(data) col_x = data.columns.get_loc('A') + 1 col_y = data.columns.get_loc('B') + 1 # Create the scatter plot, use a trendline to fit it chart.add_series({ 'name': "Samples", 'categories': [sheet_name, 1, col_x, max_row, col_x], 'values': [sheet_name, 1, col_y, max_row, col_y], 'marker': {'type': 'circle', 'size': 4}, 'trendline': {'type': 'linear'}, }) # Set name on axis chart.set_x_axis({'name': 'Concentration'}) chart.set_y_axis({'name': 'Measured', 'major_gridlines': {'visible': False}}) # Insert the chart into the worksheet in field D2 worksheet.insert_chart('D2', chart) # Close and save the Excel file writer.save()
Result
The result should be similar to this.

That is how it can be done.