Pandas is a powerful and widely used Python library for data manipulation and analysis. It provides intuitive tools to work with DataFrames and Series.
When handling large datasets, filtering and querying are essential techniques for cleaning data, preparing reports, and extracting insights. Mastering these skills can significantly improve your ability to analyze and work with data effectively.
In this article, you’ll explore some useful tips and tricks for filtering and querying data with Pandas.
Basics of Data Filtering
You can easily extract rows and columns in Pandas using Boolean indexing.
Example:
python
CopyEdit
import pandas as pd
data = pd.DataFrame({
‘Name’: [‘Alice’, ‘Bob’, ‘Charlie’, ‘David’],
‘Age’: [25, 32, 30, 29],
‘Department’: [‘HR’, ‘IT’, ‘Finance’, ‘IT’]
})
# Filter rows where Department is IT
it_employees = data[data[‘Department’] == ‘IT’]
Conditional Filtering with Multiple Conditions
You can combine multiple conditions using logical operators:
- & → AND
- | → OR
- ~ → NOT
Example:
python
CopyEdit
# Employees from IT department and older than 40
it_senior = data[(data[‘Department’] == ‘IT’) & (data[‘Age’] > 40)]
Using .query() Method for SQL-like Filtering
The .query() method enables SQL-like syntax for filtering, which can make complex conditions easier to read.
Example:
python
CopyEdit
it_senior_query = data.query(“Department == ‘IT’ and Age > 30”)
Using .isin() for Multiple Values
The .isin() method allows filtering based on multiple values in a column — great for categorical data.
Example:
python
CopyEdit
# Select employees from HR or Finance departments
selected_dept = data[data[‘Department’].isin([‘HR’, ‘Finance’])]
Filtering with String Methods
String methods like .startswith(), .contains(), and .endswith() help when working with text-heavy datasets.
Example:
python
CopyEdit
# Filter names starting with ‘A’
a_names = data[data[‘Name’].str.startswith(‘A’)]
Filtering Missing Data
Handling missing values is an important part of data cleaning. Use .isnull() and .notnull() to identify and filter such data.
Example:
python
CopyEdit
missing_values = data[data[‘Department’].isnull()]
Filtering Numeric Ranges with .between()
To filter numeric values within a range, .between() provides a clean and readable solution.
Example:
python
CopyEdit
# Employees aged between 28 and 32
mid_aged = data[data[‘Age’].between(28, 32)]
This is especially useful for analyzing age groups, salary ranges, or time periods.
Using .loc[] for Conditional Filtering
The .loc[] method allows you to filter rows and select specific columns at the same time.
Example:
python
CopyEdit
# Retrieve names of employees in IT department
data.loc[data[‘Department’] == ‘IT’, ‘Name’]
Index-Based Filtering
You can use .iloc[] for position-based indexing, and .loc[] for label-based indexing.
Example:
python
CopyEdit
# First three rows of the dataset
first_rows = data.iloc[:3]
Date Filtering
When working with datetime data, convert columns using .to_datetime(), then filter using .dt accessor.
Example:
python
CopyEdit
# Convert column to datetime
sales[‘Date’] = pd.to_datetime(sales[‘Date’])
# Filter sales made in March 2024
march_sales = sales[sales[‘Date’].dt.month == 3]
# Filter sales from the year 2024
yearly_sales = sales[sales[‘Date’].dt.year == 2024]
Filtering with Custom Functions
For complex and dynamic filtering, you can use .apply() or .map() with a custom function.
Example:
python
CopyEdit
def is_young(age):
return age < 30
young_employees = data[data[‘Age’].apply(is_young)]
Filter by Row Number or Rank
You can perform rank-based filtering using .nlargest(), .nsmallest(), or .rank().
Example:
python
CopyEdit
# Top 4 oldest employees
top_age = data.nlargest(4, ‘Age’)
Dynamic Filters with Variables
Dynamic filtering lets you use parameters inside your .query() statements.
Example:
python
CopyEdit
min_age = 38
filtered_data = data.query(“Age > @min_age”)
Conclusion
Mastering data filtering and querying with Pandas will greatly enhance your data manipulation skills. You can filter by conditions, indexes, text patterns, or date ranges — all essential for effective data analysis.
If you want to become a successful data analyst, consider enrolling at ConsoleFlare. There, you’ll learn industry-relevant skills from experienced professionals — and their strong placement support will help you land a high-paying job.
For more such content and regular updates, follow us on Facebook, Instagram, LinkedIn