How To Perform IPL Analysis And Visualization With The Help Of 1 Library (pandas)

IPL Analysis:

IPL analysis plays a major role in running a team: it informs decisions such as the batting and bowling order. We will use a dataset from Kaggle and try to dig out insights from it. We will only be using pandas, so all you need to get started is the dataset itself.

Step 1: Importing Libraries

We will start by importing all the necessary libraries before analyzing the data.

import pandas as pd

Step 2: Download The Dataset

We will be working with IPL data from 2008 to 2020. Before anything else, download the dataset from here.

Step 3: Importing The Dataset

df = pd.read_csv('IPL Ball-by-Ball 2008-2020.csv')
df.head()

IPL Dataset Information:

df.info()
#   Column            Non-Null Count   Dtype 
---  ------            --------------   ----- 
 0   id                193468 non-null  int64 
 1   inning            193468 non-null  int64 
 2   over              193468 non-null  int64 
 3   ball              193468 non-null  int64 
 4   batsman           193468 non-null  object
 5   non_striker       193468 non-null  object
 6   bowler            193468 non-null  object
 7   batsman_runs      193468 non-null  int64 
 8   extra_runs        193468 non-null  int64 
 9   total_runs        193468 non-null  int64 
 10  non_boundary      193468 non-null  int64 
 11  is_wicket         193468 non-null  int64 
 12  dismissal_kind    9495 non-null    object
 13  player_dismissed  9495 non-null    object
 14  fielder           6784 non-null    object
 15  extras_type       10233 non-null   object
 16  batting_team      193468 non-null  object
 17  bowling_team      193277 non-null  object
dtypes: int64(9), object(9)
memory usage: 26.6+ MB
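The analyses that follow refer to columns such as season, match_id, striker and runs_off_bat, not all of which appear in the listing above, so it is worth confirming what your copy of the dataset contains before going further. Below is a minimal sketch of such a check. The helper name missing_columns and the sample frame are hypothetical, and the column list is simply collected from the snippets in this article:

```python
import pandas as pd

# Column names used by the snippets in this article (collected by hand).
REQUIRED_COLUMNS = [
    "match_id", "season", "venue", "batting_team", "bowling_team",
    "striker", "bowler", "runs_off_bat", "extras", "wicket_type",
]

def missing_columns(frame, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the frame."""
    return [name for name in required if name not in frame.columns]

# Hypothetical two-row frame that only has some of the columns.
sample = pd.DataFrame({"match_id": [1, 1], "season": [2008, 2008], "striker": ["A", "B"]})
missing = missing_columns(sample)
print(missing)
```

Running missing_columns(df) on the real DataFrame should return an empty list if every snippet below is expected to work unchanged.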

IPL Analysis 1: List Of Seasons

You can list all the seasons in the dataset by applying the unique() function to the season column, so that each season appears only once:

df.season.unique()
array([2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2019,
       2018, 2020, 2021], dtype=int64)
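Note that unique() returns values in order of first appearance rather than sorted, which is why 2019 comes before 2018 in the output above. Here is a small sketch on hypothetical data, using sorted() when chronological order is wanted:

```python
import pandas as pd

# Hypothetical season column in which 2019 rows appear before 2018 rows.
seasons = pd.Series([2008, 2008, 2009, 2019, 2018, 2018, 2020])

first_seen = seasons.unique().tolist()   # order of first appearance
in_order = sorted(first_seen)            # chronological order

print(first_seen)
print(in_order)
```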

IPL Analysis 2: IPL Matches Season Wise

The number of IPL matches played in each season can be determined from match_id.

df.groupby('season')['match_id'].nunique().plot(kind='bar')
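To see what this count does on a small scale, here is a sketch on a hypothetical six-delivery table. It counts distinct match_id values per season with nunique(), which gives the number of matches rather than the number of deliveries:

```python
import pandas as pd

# Hypothetical ball-by-ball rows: matches 1 and 2 in 2008, match 3 in 2009.
balls = pd.DataFrame({
    "match_id": [1, 1, 1, 2, 2, 3],
    "season":   [2008, 2008, 2008, 2008, 2008, 2009],
})

# One entry per season, holding the number of distinct matches.
matches_per_season = balls.groupby("season")["match_id"].nunique()
print(matches_per_season)
```

Appending .plot(kind='bar') to the result draws the same kind of chart as above.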

IPL Analysis 3: Most IPL Matches Played In a Stadium

According to our analysis, the most IPL matches have been played at the M. Chinnaswamy Stadium.

We group by venue and match_id to count how many matches were played at each stadium.

%matplotlib inline
df.groupby('venue')['match_id'].nunique().sort_values(ascending=False)[:10].plot(kind='bar')

IPL Analysis 4: Number of IPL Matches Played By Each Team

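# Count distinct matches per team; value_counts() alone would count deliveries bowled
df.groupby('bowling_team')['match_id'].nunique().sort_values(ascending=False).plot(kind='barh')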
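In a ball-by-ball table each row is one delivery, so counting rows per team measures deliveries, not matches. The following sketch, on hypothetical data with made-up team labels, contrasts the two counts. It assumes a team appears as the bowling team at least once in every match it plays:

```python
import pandas as pd

# Hypothetical deliveries: MI bowls in matches 1 and 2, CSK and RCB in one match each.
balls = pd.DataFrame({
    "match_id":     [1, 1, 1, 1, 2, 2],
    "bowling_team": ["MI", "MI", "MI", "CSK", "MI", "RCB"],
})

deliveries_per_team = balls["bowling_team"].value_counts()               # rows per team
matches_per_team = balls.groupby("bowling_team")["match_id"].nunique()   # matches per team

print(deliveries_per_team)
print(matches_per_team)
```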

IPL Analysis 5: Most Runs Scored by IPL Teams

We group the data by batting team and sum the runs scored by each team.

Unsurprisingly, Mumbai Indians top the list.

%matplotlib inline
df.groupby(['batting_team'])['run'].sum().sort_values(ascending=False).plot(kind='barh')
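The run column used here is not created anywhere in this article. One plausible reading, and it is only an assumption, is that it holds the total runs for each delivery, that is runs_off_bat plus extras. Under that assumption, a sketch on hypothetical data looks like this:

```python
import pandas as pd

# Hypothetical deliveries with made-up team labels.
balls = pd.DataFrame({
    "batting_team": ["MI", "MI", "CSK", "CSK"],
    "runs_off_bat": [4, 0, 6, 1],
    "extras":       [0, 1, 0, 0],
})

# Assumption: total runs per delivery = runs off the bat + extras.
balls["run"] = balls["runs_off_bat"] + balls["extras"]

team_runs = balls.groupby("batting_team")["run"].sum().sort_values(ascending=False)
print(team_runs)
```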

IPL Analysis 6: Most IPL Runs by a Batsman

We group by striker and sum the runs off the bat. Virat Kohli tops the list.

df.groupby(['striker'])['runs_off_bat'].sum().sort_values(ascending=False)[:10].plot(kind='bar')

Average Runs by Teams in the Powerplay

df[df['over']<6].groupby(['match_id','batting_team'])['run'].sum().groupby('batting_team').mean().sort_values(ascending=False)[2:].plot(kind='barh')
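The powerplay average is a two-stage aggregation: first total the powerplay runs for each team in each match, then average those totals per team. Here is a sketch on hypothetical data, assuming overs are numbered from 0 so that over < 6 selects the first six overs:

```python
import pandas as pd

# Hypothetical deliveries; overs 0-5 are treated as the powerplay (an assumption).
balls = pd.DataFrame({
    "match_id":     [1, 1, 1, 2, 2, 2, 1, 1],
    "batting_team": ["MI", "MI", "MI", "MI", "MI", "MI", "CSK", "CSK"],
    "over":         [0, 5, 6, 1, 2, 10, 0, 7],
    "run":          [4, 6, 1, 2, 2, 6, 1, 6],
})

powerplay = balls[balls["over"] < 6]

# Stage 1: powerplay total for each (match, team) pair.
per_match = powerplay.groupby(["match_id", "batting_team"])["run"].sum()

# Stage 2: average of those totals for each team.
average = per_match.groupby(level="batting_team").mean()
print(average)
```

If your copy of the dataset numbers overs from 1, the filter would need to be over <= 6 instead.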

Most IPL Centuries by a Player

runs = df.groupby(['striker','match_id'])['runs_off_bat'].sum()
runs[runs >= 100].droplevel(level=1).groupby('striker').count().sort_values(ascending=False)[:10].plot(kind='barh')

Most IPL Fifties by a Player

runs = df.groupby(['striker','start_date'])['runs_off_bat'].sum()
runs[runs >= 50].droplevel(level=1).groupby('striker').count().sort_values(ascending=False)[:10].plot(kind='barh')
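Centuries and fifties follow the same pattern: total each batsman's runs per match, keep the totals that reach a threshold, and count them per batsman. Note that a threshold of 50 also counts scores of 100 or more. The sketch below wraps the pattern in a hypothetical helper, count_milestones, and uses a low threshold so that a tiny made-up table is enough to show it:

```python
import pandas as pd

def count_milestones(frame, threshold):
    """Count, per striker, the matches in which their total reached the threshold."""
    innings = frame.groupby(["striker", "match_id"])["runs_off_bat"].sum()
    reached = innings[innings >= threshold]
    # One row per qualifying innings; count the rows for each striker.
    return reached.reset_index()["striker"].value_counts()

# Hypothetical deliveries: A scores 14 and 3, B scores 12 and 10.
balls = pd.DataFrame({
    "striker":      ["A", "A", "A", "A", "A", "B", "B", "B", "B"],
    "match_id":     [1, 1, 1, 2, 2, 1, 1, 2, 2],
    "runs_off_bat": [6, 4, 4, 1, 2, 6, 6, 4, 6],
})

tens = count_milestones(balls, 10)
print(tens)
```

On the real data, count_milestones(df, 100) and count_milestones(df, 50) would correspond to the two charts above.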

Most Sixes in an IPL Innings

df[df['runs_off_bat'] == 6].groupby(['start_date','striker']).count()['season'].sort_values(ascending=False).droplevel(level=0)[:10].plot(kind='barh')

Most Fours (4s) Hit by a Batsman

df[df['runs_off_bat'] == 4]['striker'].value_counts()[:10].plot(kind='bar')

Most Runs in an IPL Season by a Player

df.groupby(['striker','season'])['runs_off_bat'].sum().sort_values(ascending=False)[:10].plot(kind='bar')
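Grouping by two keys produces one total per (striker, season) pair, and the top entries can then be picked out. The sketch below runs on hypothetical data and uses nlargest() as an alternative to sorting and slicing:

```python
import pandas as pd

# Hypothetical deliveries for two batsmen across two seasons.
balls = pd.DataFrame({
    "striker":      ["A", "A", "B", "B", "A"],
    "season":       [2016, 2016, 2016, 2017, 2017],
    "runs_off_bat": [4, 6, 1, 6, 2],
})

# One total per (striker, season) pair.
season_runs = balls.groupby(["striker", "season"])["runs_off_bat"].sum()

# The two highest season totals, largest first.
top = season_runs.nlargest(2)
print(top)
```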

Number of Sixes in Each IPL Season

df[df['runs_off_bat'] == 6].groupby('season').count()['match_id'].sort_values(ascending=False).plot(kind='barh')

Highest Individual IPL Score

df.groupby(['striker','start_date'])['runs_off_bat'].sum().sort_values(ascending=False)[:10].plot(kind='barh')

Most Runs Conceded by a Bowler in an Innings

df.groupby(['bowler','start_date'])['run'].sum().droplevel(level=1).sort_values(ascending=False)[:10].plot(kind='barh')

Most IPL Wickets by a Bowler

# Dismissal types that are credited to the bowler (run outs, for example, are not)
bowler_wickets = ['caught', 'bowled', 'lbw', 'stumped', 'caught and bowled', 'hit wicket']
df[df['wicket_type'].isin(bowler_wickets)]['bowler'].value_counts()[:10].plot(kind='barh')
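Only some dismissal types are credited to the bowler, so the wicket count has to filter on wicket_type first. Here is a small sketch on hypothetical data, using isin() with a list of dismissal types. This also copes with deliveries where wicket_type is missing, since NaN is simply not in the list:

```python
import numpy as np
import pandas as pd

# Dismissal types credited to the bowler (same six as in the article).
BOWLER_WICKETS = ["caught", "bowled", "lbw", "stumped", "caught and bowled", "hit wicket"]

# Hypothetical deliveries: one has no wicket (NaN) and one is a run out.
balls = pd.DataFrame({
    "bowler":      ["X", "X", "Y", "Y", "X"],
    "wicket_type": ["caught", np.nan, "run out", "lbw", "bowled"],
})

credited = balls[balls["wicket_type"].isin(BOWLER_WICKETS)]
wickets = credited["bowler"].value_counts()
print(wickets)
```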

Most Dot Balls by a Bowler

df[df['run'] == 0].groupby('bowler').count()['match_id'].sort_values(ascending=False)[:10].plot(kind='barh')
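A dot ball here means a delivery on which no run of any kind was scored. That assumes run holds the total for the delivery (bat plus extras), which this article does not state explicitly. A sketch on hypothetical data:

```python
import pandas as pd

# Hypothetical deliveries: X bowls two dot balls, Y bowls one.
balls = pd.DataFrame({
    "bowler": ["X", "X", "X", "Y", "Y"],
    "run":    [0, 0, 1, 0, 4],
})

# Keep only deliveries with zero runs, then count them for each bowler.
dot_balls = balls[balls["run"] == 0]["bowler"].value_counts()
print(dot_balls)
```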

Most Wickets by an IPL Team

bowler_wickets = ['caught', 'bowled', 'lbw', 'stumped', 'caught and bowled', 'hit wicket']
df[df['wicket_type'].isin(bowler_wickets)]['bowling_team'].value_counts().plot(kind='barh')

# Total extras received by each batting team
df.groupby(['batting_team'])['extras'].agg('sum').sort_values(ascending=False).plot(kind='barh')

Most No Balls by an IPL Team

# No balls are bowled by the fielding side, so group by bowling_team
df.groupby(['bowling_team'])['noballs'].agg('sum').sort_values(ascending=False).plot(kind='bar')
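No balls are conceded by the side that is bowling, so which team column you group by changes the meaning of the result. The sketch below, on hypothetical data with made-up team labels, shows both groupings side by side:

```python
import pandas as pd

# Hypothetical deliveries: CSK bowl two no balls to MI, MI bowl none to CSK.
balls = pd.DataFrame({
    "batting_team": ["MI", "MI", "CSK", "CSK"],
    "bowling_team": ["CSK", "CSK", "MI", "MI"],
    "noballs":      [1, 1, 0, 0],
})

bowled = balls.groupby("bowling_team")["noballs"].sum()    # no balls bowled by each team
received = balls.groupby("batting_team")["noballs"].sum()  # no balls faced by each team

print(bowled)
print(received)
```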

As you have noticed, we have analyzed a lot using pandas and its built-in plotting, which relies on matplotlib. Analyses like these are enough to support some very important decisions. Imagine a data analyst doing a postmortem of the data and digging out insights far more complex than these.

This is what you do as a data analyst in any company: you improve the decision-making process by providing insights like these.


If you want to learn to analyze data and become a data scientist, we are offering our courses here.

Go through the courses and learn Data analysis to become a Data analyst in less than 7 months.

Follow our Insta Page for more info like this: Console Flare (@consoleflare) is on Instagram

Want to see IPL stats : IPLT20.com – Indian Premier League Official Website
