How to use Error Handling in Python 101: A Simple Guide for Future Data Scientists

In programming, whether you are a data scientist or a software developer, writing code that works is only one side of the coin. The other side is making sure that the code keeps running smoothly when something goes wrong, handling errors and unexpected situations gracefully. This is where error handling comes in, and it is an important skill for any aspiring data scientist.

In this blog, we will cover the concept of error handling in Python and explore techniques and best practices that will improve your code. By the end, you’ll have a solid understanding of exception handling and the proper use of try-except-else-finally blocks.

Before proceeding, let us first understand…

Why Error Handling in Data Science Matters!

  1. Data Integrity: When working with large datasets, errors in data processing can produce incorrect results or corrupted data, which in turn hampers an organization’s decision-making. Proper error handling helps maintain data integrity.
  2. Long-running Processes: Data science often involves complex and time-consuming calculations. With error handling in place, if something goes wrong you can recover gracefully or at least save intermediate results.
  3. Automation: Many data scientists automate the tasks they perform regularly. Automated jobs leave plenty of room for errors, so error handling is crucial for creating reliable, self-running scripts and pipelines.
  4. Debugging: Error handling encourages well-structured code in which problems are easier to detect and debug, which saves valuable time before deployment.

These are a few important points that show the importance of error handling in the day-to-day tasks of a data scientist.
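
Point 2 above can be made concrete with a small sketch. The names process_batches and square_all are hypothetical, chosen only for illustration: the loop handles each batch in turn, keeps the results of the batches that succeed, and records the ones that fail instead of losing everything.

```python
def square_all(batch):
    # Stand-in for an expensive computation (hypothetical helper).
    return [value ** 2 for value in batch]

def process_batches(batches):
    results = []   # intermediate results that survive a later failure
    failed = []    # (batch index, error message) for batches that failed
    for index, batch in enumerate(batches):
        try:
            results.append(square_all(batch))
        except TypeError as error:
            # Record the problem and carry on with the next batch.
            failed.append((index, str(error)))
    return results, failed

results, failed = process_batches([[1, 2], [3, "four"], [5]])
print(results)
print(f"{len(failed)} batch(es) failed")
```

In a real pipeline the results list would more likely be written to disk after each batch, but the idea is the same: one bad batch should not throw away the work already done.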

Now let’s have a look at an example to understand why and how error handling is important for us.

Let us create a function that calculates the average age of a group of friends.

def calculate_average_age(ages):
    total = sum(ages)
    average = total / len(ages)
    return average

friends_ages = [25, 30, 35, 40]
average_age = calculate_average_age(friends_ages)
print(f"The average age is: {average_age}")

This code works perfectly, but what if we accidentally pass an empty list?

no_friends = []
average_age = calculate_average_age(no_friends)

Here Python raises a ZeroDivisionError, because we are dividing by the length of the list, which is 0. Now we need to handle this error.

Introducing try-except for error handling

A try-except block acts like a safety net for your code. Here is how to use it:

def calculate_average_age(ages):
    try:
        total = sum(ages)
        average = total / len(ages)
        return average
    except ZeroDivisionError:
        return "Error: The list is empty!"

# Now let's try it again
no_friends = []
result = calculate_average_age(no_friends)
print(result)  

Now the error is caught: instead of crashing, the function returns a message and the program carries on with its remaining tasks.
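
One thing to keep in mind is that this function returns a number on success and a string on failure, so the caller has to check which one came back. A possible alternative, sketched here under the assumption that an empty list should count as invalid input, is to raise a ValueError and let the caller decide what to do. The name safe_average_age is hypothetical.

```python
def safe_average_age(ages):
    # Assumption: an empty list is treated as invalid input.
    try:
        return sum(ages) / len(ages)
    except ZeroDivisionError:
        # Re-raise as a more descriptive error; "from None" hides the
        # original ZeroDivisionError in the traceback.
        raise ValueError("Cannot average an empty list of ages") from None

try:
    safe_average_age([])
except ValueError as error:
    message = str(error)
    print(f"Could not compute the average: {message}")

print(safe_average_age([25, 30, 35, 40]))
```

Which style fits better depends on the project; returning a message is simpler for a beginner script, while raising keeps the return type consistent.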

Handling Different Types of Errors

Until now we have covered a single error, division by 0. Now let us look at other possibilities too.

def calculate_average_age(ages):
    try:
        total = sum(ages)
        average = total / len(ages)
        return average
    except ZeroDivisionError:
        return "Error: The list is empty!"
    except TypeError:
        return "Error: All ages must be numbers!"

# Let's test it
print(calculate_average_age([]))  
print(calculate_average_age([25, "thirty", 35]))  
print(calculate_average_age([25, 30, 35]))  

Now this code can handle empty lists and lists with non-numeric values.
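
When two errors should be handled in the same way, Python also lets a single except clause name several exception types as a tuple. Here is a minimal sketch, using the hypothetical name average_or_none, that returns None for either kind of bad input:

```python
def average_or_none(ages):
    try:
        return sum(ages) / len(ages)
    except (ZeroDivisionError, TypeError) as error:
        # One handler for both cases; the exception object says which occurred.
        print(f"Could not compute average ({type(error).__name__}): {error}")
        return None

print(average_or_none([]))
print(average_or_none([25, "thirty", 35]))
print(average_or_none([25, 30, 35]))
```

Separate except blocks, as in the version above, remain the better choice when each error needs its own message.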

Using else and finally:

There are some scenarios where we want to perform a task only if there are no errors, or perform something whether or not an error occurred. This is where we use else and finally.

def process_age_data(ages):
    try:
        # Compute the average directly so that any error reaches the except block
        average = sum(ages) / len(ages)
    except Exception as e:
        print(f"An error occurred: {e}")
    else:
        print(f"The average age is: {average}")
    finally:
        print("Data processing completed.")

# Let's try it out
process_age_data([25, 30, 35])  # This will work normally
process_age_data([])  # This will show our error message

The else block runs if there were no errors and the finally block always runs, error or not.
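
To see the order for yourself, here is a small sketch (the function name run_blocks is hypothetical) that records which blocks run for a valid list and for an empty one:

```python
def run_blocks(ages):
    steps = []  # records which blocks ran, in order
    try:
        steps.append("try")
        average = sum(ages) / len(ages)
    except ZeroDivisionError:
        steps.append("except")
    else:
        steps.append("else")
    finally:
        steps.append("finally")
    return steps

print(run_blocks([25, 30, 35]))  # ['try', 'else', 'finally']
print(run_blocks([]))            # ['try', 'except', 'finally']
```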

Raising your custom errors:

There are some scenarios where we want to raise errors of our own, according to our requirements.
For example, let’s say we do not want to accept an age below 0 or above 120.

def validate_age(age):
    if age < 0 or age > 120:
        raise ValueError(f"{age} is not a valid age!")
    return age

try:
    validate_age(150)
except ValueError as e:
    print(e)  # This will print: "150 is not a valid age!"

By raising our own errors, we can catch problems early and guide the flow of the code.
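
Python also allows you to define a dedicated exception class, which lets callers catch your error specifically. The sketch below is only an illustration: InvalidAgeError and check_age are hypothetical names, and the 0 to 120 range is carried over from the example above.

```python
class InvalidAgeError(ValueError):
    """Raised when an age falls outside the accepted range (hypothetical)."""

def check_age(age, lowest=0, highest=120):
    if not lowest <= age <= highest:
        raise InvalidAgeError(f"{age} is outside the range {lowest}-{highest}")
    return age

try:
    check_age(150)
except InvalidAgeError as error:
    print(error)
```

Because InvalidAgeError inherits from ValueError, existing code that catches ValueError will still catch it.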

A Real Data Science Example of Error Handling

Let’s put it all together with a simple data science task: calculating the average temperature from a weather dataset. The script below generates a sample dataset that deliberately contains some bad rows.

import csv
import random
from datetime import datetime, timedelta

def generate_weather_data(filename, num_rows=100):
    # Start date for our data
    start_date = datetime(2023, 1, 1)
    
    with open(filename, 'w', newline='') as file:
        writer = csv.writer(file)
        
        # Write the header
        writer.writerow(['Date', 'Temperature'])
        
        for i in range(num_rows):
            date = start_date + timedelta(days=i)
            
            # Introduce some errors in the data
            if i % 20 == 0:  # Every 20th row will have an error
                temperature = 'ERROR'
            elif i % 25 == 0:  # Every 25th row will have an extreme value
                temperature = random.uniform(-50, 150)
            else:
                # Generate a random temperature between -10 and 40 degrees
                temperature = round(random.uniform(-10, 40), 1)
            
            writer.writerow([date.strftime('%Y-%m-%d'), temperature])

    print(f"Data has been written to {filename}")

# Generate the data
generate_weather_data('weather_data.csv')

This example gives a glimpse of error handling in a real data science scenario: the generated file deliberately contains unreadable and extreme values, so any code that reads it has to handle file-related errors, data conversion errors, and empty data sets.
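
The script above only writes the file. A minimal sketch of the reading side, covering the three cases just mentioned, might look like this. The function name average_temperature and the low and high plausibility bounds are assumptions made for illustration, not part of the original example.

```python
import csv

def average_temperature(filename, low=-50.0, high=60.0):
    # low/high are assumed plausibility bounds, not taken from the article.
    try:
        with open(filename, newline='') as file:
            rows = list(csv.DictReader(file))
    except FileNotFoundError:
        print(f"Error: {filename} was not found.")
        return None

    temperatures = []
    skipped = 0
    for row in rows:
        try:
            value = float(row['Temperature'])
        except (KeyError, TypeError, ValueError):
            # Missing column, missing cell, or text such as 'ERROR'.
            skipped += 1
            continue
        if low <= value <= high:
            temperatures.append(value)
        else:
            skipped += 1  # extreme value outside the assumed bounds

    if not temperatures:
        print("Error: no valid temperature readings.")
        return None

    print(f"Used {len(temperatures)} readings, skipped {skipped}.")
    return sum(temperatures) / len(temperatures)

# A missing file is reported instead of crashing the script.
print(average_temperature('no_such_file.csv'))
```

After running the generator, calling average_temperature('weather_data.csv') would skip the 'ERROR' rows and any out-of-range values and average the rest.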

Conclusion: Your Next Steps in Data Science

You have now learned the basics of error handling in Python. This skill will help you write more reliable and more accurate code for data science projects, but it is just the beginning of your career.

To truly excel in the data science field, you need hands-on experience with:

  1. Python: a programming language
  2. Data Manipulation and Analytics: NumPy and Pandas
  3. Data Visualization: Matplotlib and Seaborn
  4. SQL
  5. Reporting with Power BI
  6. Big Data with PySpark and Databricks
  7. Machine Learning

Are you interested in building a career in the field of data science? Our Masters in Data Science course covers all these topics and more. You’ll learn from real-world examples, work on exciting projects, and gain the skills you need to become a proficient data scientist.

Don’t let errors in your code hold you back. Join our course today and take the first step towards becoming a data science expert!

Remember, every error you encounter is an opportunity to learn and improve.

Register yourself with ConsoleFlare for our free workshop on data science. In this workshop, you will get to know each tool and technology of data analysis from scratch, which will prepare you for any data science profile.

To join this workshop, register yourself on consoleflare and we will call you back.

Wondering why Console Flare?

  • Recently, ConsoleFlare has been recognized as one of the Top 10 Most Promising Data Science Training Institutes of 2023.
  • Console Flare offers the opportunity to learn Data Science in Hindi, just like how you speak daily.
  • Console Flare believes in the idea of “What to learn and what not to learn” and this can be seen in their curriculum structure. They have designed their program based on what you need to learn for data science and nothing else.

Happy coding, future data scientists!
