How to Check Data Type
In Pandas, you can check the data type of a column to determine if it is numeric or string (or any other type). This is useful for ensuring that operations are performed on the correct data types.
Checking Data Types with dtype
You can use the dtype attribute to inspect the data type of a column:
import pandas as pd

# Create a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}
df = pd.DataFrame(data)

# Inspect the dtype of a numeric column
print(df['Age'].dtype)

# Inspect the dtype of a string column (object dtype by default in pandas 2.x and earlier)
print(df['Name'].dtype)
Output:
int64
object
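To inspect every column at once rather than one at a time, the DataFrame-level dtypes attribute can be used. The following is a minimal sketch that rebuilds the same sample data; the variable names all_dtypes and age_dtype are illustrative only.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

# df.dtypes returns a Series mapping each column name to its dtype
all_dtypes = df.dtypes
print(all_dtypes)

# A single entry can be looked up by column name
age_dtype = all_dtypes['Age']
print(age_dtype)
```

The dtype reported for the string columns may differ between pandas versions, so it is safer to rely on the type-checking helpers than to compare against a literal dtype name.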
Using pd.api.types for Type Checking
Pandas provides utility functions in pd.api.types to check for specific data types:
from pandas.api.types import is_numeric_dtype, is_string_dtype
# Check if the 'Age' column is numeric
print(is_numeric_dtype(df['Age']))
# Check if the 'Name' column is a string
print(is_string_dtype(df['Name']))
Output:
True
True
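When the goal is to pick out all columns of a given kind rather than test one column, DataFrame.select_dtypes offers a related approach. This is a small sketch on the same sample data; numeric_df and non_numeric_df are names chosen here for illustration.

```python
import pandas as pd

# Same sample data as in the earlier examples
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

# Keep only the numeric columns
numeric_df = df.select_dtypes(include='number')
print(list(numeric_df.columns))

# Keep everything that is not numeric
non_numeric_df = df.select_dtypes(exclude='number')
print(list(non_numeric_df.columns))
```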
Use Cases
Data Validation: Ensure columns contain the expected data type before performing operations.
Conditional Logic: Apply different logic based on the column's data type.
Example:
def process_column(column):
    if is_numeric_dtype(column):
        return column.mean()  # Calculate the mean for numeric columns
    elif is_string_dtype(column):
        return column.value_counts()  # Get value counts for string columns

print(process_column(df['Age']))
print(process_column(df['Name']))
Output:
30.0
Alice      1
Bob        1
Charlie    1
dtype: int64
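The data validation use case listed above can be sketched in a similar way. The helper validate_columns below is hypothetical (it is not part of pandas); it assumes the caller passes a mapping from column name to one of the pd.api.types check functions and wants a TypeError when a check fails.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype, is_string_dtype


def validate_columns(frame, expected):
    """Hypothetical helper: raise TypeError if a column fails its dtype check."""
    for name, check in expected.items():
        if not check(frame[name]):
            raise TypeError(
                f"Column {name!r} has unexpected dtype {frame[name].dtype}"
            )


df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

# Passes silently: both columns match the expected kind
validate_columns(df, {'Age': is_numeric_dtype, 'Name': is_string_dtype})

# Fails: 'Name' is not numeric, so a TypeError is raised and reported
try:
    validate_columns(df, {'Name': is_numeric_dtype})
    validation_failed = False
except TypeError as exc:
    validation_failed = True
    print(exc)
```

Running the check before any arithmetic keeps the failure close to its cause instead of surfacing later as a confusing error from an operation such as mean().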
By checking the data type of columns, you can write robust and flexible code that handles different types of data effectively.