How to Check Data Type

In pandas, every column has a data type (dtype) that records whether it holds numbers, strings, or some other kind of value. Checking the dtype before an operation helps ensure that the operation suits the data in the column.


Checking Data Types with dtype

You can use the dtype attribute to inspect the data type of a column:

import pandas as pd

# Create a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}

df = pd.DataFrame(data)

# Inspect the dtype of a numeric column
print(df['Age'].dtype)
# Inspect the dtype of a text column (strings are traditionally stored as object in pandas)
print(df['Name'].dtype)

Output:

int64
object
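To inspect every column at once rather than one at a time, the DataFrame-level dtypes attribute returns a Series that maps each column name to its dtype. The following is a minimal sketch that reuses the same sample data; the dtype reported for text columns can differ between pandas versions, so only the numeric column is looked up by name here.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
})

# dtypes returns a Series indexed by column name
column_types = df.dtypes
print(column_types)

# Look up a single column's dtype by name
age_dtype = column_types['Age']
print(age_dtype)
```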

Using pd.api.types for Type Checking

Pandas provides utility functions in pd.api.types to check for specific data types:

from pandas.api.types import is_numeric_dtype, is_string_dtype

# Check if the 'Age' column is numeric
print(is_numeric_dtype(df['Age']))

# Check if the 'Name' column holds strings
print(is_string_dtype(df['Name']))

Output:

True
True
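One detail worth knowing is that is_numeric_dtype also returns True for boolean columns. The sketch below illustrates this with a hypothetical 'Active' column that is not part of the original example, and shows two ways to keep only integer and float columns: combining the check with is_bool_dtype, or using select_dtypes with include='number'.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype

# 'Active' is a hypothetical column added for illustration
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Active': [True, False, True]
})

# Boolean columns count as numeric for is_numeric_dtype
bool_is_numeric = is_numeric_dtype(df['Active'])
print(bool_is_numeric)

# Exclude booleans explicitly when only int/float columns are wanted
strictly_numeric = [
    col for col in df.columns
    if is_numeric_dtype(df[col]) and not is_bool_dtype(df[col])
]
print(strictly_numeric)

# select_dtypes filters columns by dtype; 'number' does not match bool
numeric_columns = df.select_dtypes(include='number').columns.tolist()
print(numeric_columns)
```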

Use Cases

  1. Data Validation: Ensure columns contain the expected data type before performing operations.

  2. Conditional Logic: Apply different logic based on the column's data type.

    Example:

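    def process_column(column):
        """Summarize a column according to its dtype."""
        if is_numeric_dtype(column):
            return column.mean()  # Calculate the mean for numeric columns
        elif is_string_dtype(column):
            return column.value_counts()  # Get value counts for string columns
        return None  # Other dtypes (for example, datetime) are not handled here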
    
    print(process_column(df['Age']))
    print(process_column(df['Name']))

    Output:

    30.0
    Alice      1
    Bob        1
    Charlie    1
    dtype: int64

    Checking column dtypes up front lets the same code handle numeric and string columns correctly, instead of failing partway through an operation that does not apply to the data.
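The data validation use case listed above can be sketched as a small helper that fails early when a column is not numeric. Note that validate_numeric is a hypothetical name chosen for this illustration and is not a pandas function; the error message format is likewise an assumption.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

def validate_numeric(df, column_name):
    """Raise TypeError if the named column is not numeric (hypothetical helper)."""
    if not is_numeric_dtype(df[column_name]):
        raise TypeError(
            f"Column {column_name!r} has dtype {df[column_name].dtype}, expected numeric"
        )

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
})

# A numeric column passes the check, so the calculation can proceed
validate_numeric(df, 'Age')
average_age = df['Age'].mean()
print(average_age)

# A text column is rejected before any calculation is attempted
try:
    validate_numeric(df, 'Name')
except TypeError as error:
    print(error)
```

Raising an exception is one design choice; returning a boolean or logging a warning would work equally well where a failed check should not stop the program.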
