Data Visualization with matplotlib: Build Bar Chart in 6 easy steps

Data visualization is the graphical representation of information and data. It helps in identifying trends, patterns, and outliers that might go unnoticed in a plain data table. Bar charts, line graphs, histograms, and pie charts are some popular ways to visualize data.

In Python, we use powerful libraries like Pandas and Matplotlib to:

  • Manipulate and process data easily.
  • Generate high-quality, visually appealing charts.

In the code we are analyzing, we will focus on using Pandas to aggregate sales and tips data and Matplotlib for data visualization with a bar chart.

Understanding the code step by step:

Step 1: Importing Necessary Libraries

import pandas as pd                    # for dataset manipulation
import matplotlib.pyplot as plt        # for data visualization
import numpy as np                     # for numeric arrays (bar positions)

The code begins by importing three essential libraries:

  • Pandas: A data manipulation library used to handle structured data, like reading and aggregating the CSV file.
  • Matplotlib: A plotting library used for generating graphs and visualizations.
  • Numpy: A numerical computing library. It provides tools such as arange(), which generates evenly spaced ranges of values (useful for bar positioning).

Step 2: Reading the Dataset

df = pd.read_csv('tips.csv')

Here, we load the dataset using pd.read_csv(). This function reads the CSV file and stores the data in a DataFrame, which is a table-like data structure in Pandas.
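
Before aggregating, it can help to confirm that the file loaded with the columns the later steps rely on. The sketch below is only an illustration: it reads a few made-up rows from an in-memory string instead of tips.csv, and it assumes the real file has columns named day, total_bill and tip, as the rest of this walkthrough does.

```python
import io

import pandas as pd

# Made-up sample rows standing in for tips.csv (an assumption for illustration).
sample_csv = io.StringIO(
    "total_bill,tip,day\n"
    "16.99,1.01,Sun\n"
    "10.34,1.66,Sun\n"
    "21.01,3.50,Sat\n"
    "23.68,3.31,Thur\n"
)

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(sample_csv)

print(df.head())   # first few rows
print(df.dtypes)   # column types inferred by pandas

# Check that the columns used in the later steps are present.
required = ("day", "total_bill", "tip")
missing = [col for col in required if col not in df.columns]
print("missing columns:", missing)
```

With the real dataset, replace sample_csv with the path 'tips.csv'.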

Step 3: Aggregating Data using Pandas

ndf = df.groupby('day').agg(
    total_sale=('total_bill', 'sum'),
    total_tip=('tip', 'sum')
).reset_index()
print(ndf)

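In this step, we group the data by the day column using the groupby() function. Then, we use the agg() function with named aggregation to:

  • Sum the total bills (total_bill) for each day, stored as total_sale.
  • Sum the tips (tip) for each day, stored as total_tip.

Finally, reset_index() turns day from the group index back into a regular column, so it can be used for the x-axis labels later.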
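
To see what this aggregation produces, here is a minimal sketch on a tiny hand-made DataFrame. The values are invented for illustration and are not taken from tips.csv; only the column names match the tutorial.

```python
import pandas as pd

# Tiny made-up dataset with the same column names as the tutorial.
df = pd.DataFrame({
    "day": ["Sun", "Sun", "Sat", "Thur"],
    "total_bill": [10.0, 20.0, 30.0, 40.0],
    "tip": [1.0, 2.0, 3.0, 4.0],
})

# Named aggregation: new_column=(source_column, aggregation_function).
ndf = df.groupby("day").agg(
    total_sale=("total_bill", "sum"),
    total_tip=("tip", "sum"),
).reset_index()  # move 'day' from the index back into a column

print(ndf)
```

Note that groupby() sorts the group keys by default, so the days come out in alphabetical order (Sat, Sun, Thur here), not in calendar order.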

Step 4: Setting up the Bar Chart with Matplotlib

x_pos = np.arange(len(ndf))
plt.title('Sale vs Tips on Days', color='red')

  1. Creating positions for bars:
    We use np.arange() to generate an array of integer positions (0, 1, 2, ...), one per row of ndf, for the bars on the x-axis. Because it is a NumPy array, we can later add an offset to every position at once.
  2. Adding a title:
    plt.title() sets the title of the chart. The color='red' argument makes the title appear in red.
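
The short sketch below shows what the position array looks like and why a NumPy array is convenient here. The list of day labels is illustrative and stands in for ndf['day'].

```python
import numpy as np

# Illustrative labels standing in for ndf['day'].
days = ["Fri", "Sat", "Sun", "Thur"]

# One integer position per label.
x_pos = np.arange(len(days))
print(x_pos)         # [0 1 2 3]

# Adding a scalar shifts every position at once (element-wise addition).
shifted = x_pos + 0.2
print(shifted)       # [0.2 1.2 2.2 3.2]
```

A plain Python list would not support this: adding 0.2 to a list raises a TypeError, which is one reason the positions are built with np.arange() instead of range().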

Step 5: Customizing the Chart for Better Insights

plt.bar(x_pos, ndf['total_sale'], width=0.5, label='sale', color='Green')
plt.bar(x_pos + 0.2, ndf['total_tip'], width=0.3, label='tips', color='orange')
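
  1. Creating two bar plots:
    We generate two sets of bars on the same axes: one for sales and another for tips.
    • The first plt.bar() plots the sales data.
    • The second plt.bar() plots the tips data, shifted slightly to the right (x_pos + 0.2) and drawn narrower (width=0.3), so each tips bar sits on top of the right-hand part of its sales bar. The two bars still overlap; the offset only keeps the tips bar visually distinct.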
  2. Customizing colors and widths:
    • The color parameter sets the bar colors.
    • The width parameter controls the width of the bars.
    • The label parameter provides a name for the legend.

These customizations improve the readability of the chart by clearly distinguishing between sales and tips.
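
If you would rather place the two bars side by side with no overlap at all, one common alternative (not what the tutorial code does) is to give both bars the same width and shift them by half a width in opposite directions. The sketch below uses made-up totals and the Axes-based interface; the names width, sale_bars and tip_bars are introduced here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up totals for illustration only.
days = ["Fri", "Sat", "Sun", "Thur"]
total_sale = np.array([300.0, 1700.0, 1600.0, 1100.0])
total_tip = np.array([50.0, 260.0, 250.0, 170.0])

x_pos = np.arange(len(days))
width = 0.35  # same width for both bar sets

fig, ax = plt.subplots()
# Bars are centered on their x value by default, so shifting by half a width
# in each direction makes the two bars touch without overlapping.
sale_bars = ax.bar(x_pos - width / 2, total_sale, width=width,
                   label="sale", color="green")
tip_bars = ax.bar(x_pos + width / 2, total_tip, width=width,
                  label="tips", color="orange")

ax.set_xticks(x_pos)        # ticks stay centered between each pair
ax.set_xticklabels(days)
ax.legend()
```

Whether overlapping or side-by-side bars read better depends on the data; with tips much smaller than sales, either layout keeps both values visible.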

Step 6: Displaying the Final Chart

plt.xlabel('Days', color='green')
plt.ylabel('Sale', color='green')
plt.xticks(x_pos, ndf['day'])
plt.legend()
plt.grid()
plt.show()

Adding axis labels:
plt.xlabel() and plt.ylabel() define the labels for the x and y axes, respectively. These labels help viewers understand what the chart represents.

Setting x-axis ticks:
plt.xticks() positions the day names directly under each bar. We use x_pos as the positions and ndf['day'] as the labels.

Adding a legend and grid:

  • plt.legend() displays a legend to explain what each bar color represents.
  • plt.grid() adds grid lines, making it easier to compare values visually.

Displaying the chart:
Finally, plt.show() renders the chart and displays it to the user.
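
When the script runs somewhere without a display (a server or a scheduled job, for example), saving the chart to a file is a practical alternative to plt.show(). The following is a minimal sketch under that assumption, using made-up totals; the output path and file name are hypothetical.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render to files only; no window is opened
import matplotlib.pyplot as plt
import numpy as np

# Made-up totals for illustration only.
days = ["Fri", "Sat", "Sun", "Thur"]
total_sale = [300.0, 1700.0, 1600.0, 1100.0]
total_tip = [50.0, 260.0, 250.0, 170.0]
x_pos = np.arange(len(days))

plt.figure()
plt.title("Sale vs Tips on Days", color="red")
plt.bar(x_pos, total_sale, width=0.5, label="sale", color="green")
plt.bar(x_pos + 0.2, total_tip, width=0.3, label="tips", color="orange")
plt.xlabel("Days", color="green")
plt.ylabel("Sale", color="green")
plt.xticks(x_pos, days)
plt.legend()
plt.grid(axis="y")  # horizontal grid lines only

# Hypothetical output location; any writable path works.
out_path = os.path.join(tempfile.gettempdir(), "sale_vs_tips.png")
plt.savefig(out_path, dpi=150, bbox_inches="tight")
plt.close()  # free the figure once it is saved
```

If you use both in an interactive session, call plt.savefig() before plt.show(), since the figure may no longer be available after the window is closed.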

Output and Insights

The plot generated by this code compares the total sales and total tips for each day in the dataset. Some key takeaways include:

  • Identifying Trends: You can quickly see which days bring in the highest and lowest sales.
  • Comparing Performance: It’s easy to compare sales and tips on a particular day.
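
Because tips are much smaller than sales, the comparison can also be expressed as numbers. The sketch below is an optional extra and not part of the original code: it uses a made-up aggregated table shaped like ndf, and the tip_pct column and best_day variable are names introduced here for illustration.

```python
import pandas as pd

# Made-up aggregated table with the same columns as ndf.
ndf = pd.DataFrame({
    "day": ["Fri", "Sat", "Sun"],
    "total_sale": [100.0, 400.0, 200.0],
    "total_tip": [20.0, 40.0, 30.0],
})

# Tips as a percentage of sales for each day.
ndf["tip_pct"] = (ndf["total_tip"] / ndf["total_sale"] * 100).round(1)

# Day with the highest total sales.
best_day = ndf.loc[ndf["total_sale"].idxmax(), "day"]

print(ndf)
print("Highest sales on:", best_day)
```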

The Code:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Load the dataset
df = pd.read_csv('tips.csv')

# Total sales and total tips per day
ndf = df.groupby('day').agg(
    total_sale=('total_bill', 'sum'),
    total_tip=('tip', 'sum')
).reset_index()
print(ndf)

# One x position per day
x_pos = np.arange(len(ndf))

# Draw the sales bars, then the narrower tips bars shifted to the right
plt.title('Sale vs Tips on Days', color='red')
plt.bar(x_pos, ndf['total_sale'], width=0.5, label='sale', color='Green')
plt.bar(x_pos + 0.2, ndf['total_tip'], width=0.3, label='tips', color='orange')

# Axis labels and day names on the x-axis
plt.xlabel('Days', color='green')
plt.ylabel('Sale', color='green')
plt.xticks(x_pos, ndf['day'])

plt.legend()
plt.grid()
plt.show()

If you’re ready to embark on a rewarding career as a data analyst in the data science field, consider enrolling in a comprehensive course that focuses on Python.

At ConsoleFlare, we offer tailored courses that provide hands-on experience and in-depth knowledge to help you master Python and excel in your data science journey. Join us and take the first step towards becoming a data science expert with Python at your fingertips.

Register yourself with ConsoleFlare for our free workshop on data science. In this workshop, you will get to know each tool and technology that is required for you to become a data analyst from scratch and also which will make you skillfully eligible for any other data science profile.

Why Console Flare?

  • ConsoleFlare was recognized as one of the Top 10 Most Promising Data Science Training Institutes of 2023.
  • Console Flare offers the opportunity to learn data science in Hindi, just like you speak daily.
  • Console Flare believes in the idea of “What to learn and what not to learn”, and this can be seen in its curriculum structure. The program is designed around what you need to learn for data science and nothing else.

Want more reasons? Register yourself and we will help you switch your career to Data Science in just 6 months.
