Intro to Data Visualization

How to Check Data Type

In Pandas, every column has a data type (dtype) that you can inspect to tell whether it holds numbers, strings, or something else. Checking the dtype first helps ensure that an operation, such as computing a mean, is applied only to columns that support it.


Checking Data Types with dtype

You can use the dtype attribute to inspect the data type of a column:

import pandas as pd

# Create a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}

df = pd.DataFrame(data)

# Inspect the dtype of a numeric column
print(df['Age'].dtype)
# Inspect the dtype of a string column (stored as object by default)
print(df['Name'].dtype)

Output:

int64
object
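
To check every column in one step instead of one at a time, a minimal sketch might look like the following. It rebuilds the same example DataFrame so that it runs on its own, and relies on the dtypes attribute and the select_dtypes method; the variable name numeric_cols is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

# dtypes returns a Series that maps each column name to its dtype
print(df.dtypes)

# select_dtypes keeps only the columns whose dtype matches the filter
numeric_cols = df.select_dtypes(include='number').columns.tolist()
print(numeric_cols)  # ['Age']
```

The list of numeric column names can then be passed to later steps, such as plotting or aggregation, without hard-coding which columns are numeric.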

Using pd.api.types for Type Checking

Pandas provides utility functions in pd.api.types to check for specific data types:

from pandas.api.types import is_numeric_dtype, is_string_dtype

# Check if the 'Age' column is numeric
print(is_numeric_dtype(df['Age']))

# Check if the 'Name' column holds strings
print(is_string_dtype(df['Name']))

Output:

True
True
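
A common surprise is a column of numbers that was read in as text. The sketch below uses made-up data (the raw DataFrame and its Joined column are assumptions for illustration) to show that is_numeric_dtype reports False for such a column until it is converted with pd.to_numeric. It also shows is_datetime64_any_dtype, another checker from the same module.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype, is_datetime64_any_dtype

# Hypothetical data: ages that arrived as text, plus a date column
raw = pd.DataFrame({
    'Age': ['25', '30', '35'],
    'Joined': pd.to_datetime(['2021-01-05', '2022-03-17', '2023-07-01']),
})

# Digits stored as strings do not count as numeric
before = is_numeric_dtype(raw['Age'])
print(before)  # False

# A datetime column is recognised by its own checker
is_date = is_datetime64_any_dtype(raw['Joined'])
print(is_date)  # True

# After converting the text to numbers, the numeric check passes
raw['Age'] = pd.to_numeric(raw['Age'])
after = is_numeric_dtype(raw['Age'])
print(after)  # True
```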

Use Cases

  1. Data Validation: Ensure columns contain the expected data type before performing operations.

  2. Conditional Logic: Apply different logic based on the column's data type.

    Example:

    def process_column(column):
        if is_numeric_dtype(column):
            return column.mean()  # Calculate the mean for numeric columns
        elif is_string_dtype(column):
            return column.value_counts()  # Get value counts for string columns
        else:
            return None  # Other dtypes (e.g. datetime) are not handled here
    
    print(process_column(df['Age']))
    print(process_column(df['Name']))

    Output (pandas 2.x):

    30.0
    Name
    Alice      1
    Bob        1
    Charlie    1
    Name: count, dtype: int64

Checking each column's data type before operating on it lets the same code handle numeric and string columns correctly.
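
The data validation use case can be sketched in a similar way. The helper below, validate_columns, is a hypothetical function written for this example and not part of Pandas. It takes a dictionary that maps column names to dtype-checking functions and returns a list of problems it finds.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype, is_string_dtype

def validate_columns(df, checks):
    """Return a list of problems; an empty list means every check passed.

    `checks` maps a column name to a dtype-checking function.
    (Hypothetical helper for illustration, not a Pandas API.)
    """
    problems = []
    for name, check in checks.items():
        if name not in df.columns:
            problems.append(f"missing column: {name}")
        elif not check(df[name]):
            problems.append(f"unexpected dtype for {name}: {df[name].dtype}")
    return problems

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
})

# Every expectation holds, so no problems are reported
ok = validate_columns(df, {'Name': is_string_dtype, 'Age': is_numeric_dtype})
print(ok)  # []

# 'Name' is not numeric and 'Salary' does not exist, so two problems are reported
bad = validate_columns(df, {'Name': is_numeric_dtype, 'Salary': is_numeric_dtype})
print(bad)
```

Returning a list rather than raising on the first failure lets the caller see every mismatch at once, which tends to be more convenient when a dataset is first loaded.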

