Intro to Data Visualization

Data Types

Main Data Types

There are four main types of data:

1. Categorical Data

Categorical data represents classifications or labels. Pandas has a dedicated dtype, category, that can reduce memory usage and speed up operations when a column holds a small set of values that repeat often.
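As a rough illustration of the memory point, the sketch below converts a text column to the category dtype and compares the two footprints. The colors Series is made-up sample data, and the exact byte counts depend on your Pandas version, so none are shown.

```python
import pandas as pd

# Hypothetical column: three labels repeated many times
colors = pd.Series(['red', 'green', 'blue'] * 1000)

# Convert the text column to the category dtype
colors_cat = colors.astype('category')

print(colors_cat.dtype)           # category
print(colors_cat.cat.categories)  # the distinct labels stored once

# Compare memory usage in bytes (deep=True also counts the string contents)
text_bytes = colors.memory_usage(deep=True)
category_bytes = colors_cat.memory_usage(deep=True)
print(text_bytes, category_bytes)
```

The saving comes from storing each distinct label once and keeping only small integer codes per row, so the benefit shrinks as the number of distinct values approaches the number of rows.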

2. Numeric Data

Numeric data represents numerical values and is used for computations and analysis.

3. Temporal Data

Temporal data represents specific points in time or durations. In Pandas, points in time are typically stored as datetime64[ns] values and durations as timedelta64[ns] values.
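A minimal sketch of both kinds of temporal data: subtracting one datetime column from another yields a duration column. The start and end values are invented for illustration, and the time resolution Pandas reports (for example ns) can vary by version.

```python
import pandas as pd

# Hypothetical timestamps that arrive as text
start = pd.to_datetime(pd.Series(['2023-01-01 09:00', '2023-01-02 09:00']))
end = pd.to_datetime(pd.Series(['2023-01-01 17:30', '2023-01-03 09:00']))

# Subtracting two datetime columns gives a timedelta column
duration = end - start

print(start.dtype)     # a datetime64 dtype
print(duration.dtype)  # a timedelta64 dtype
print(duration)
```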

4. Geographic Data

Geographic data represents location-related information, such as coordinates or region names. While Pandas does not have a specific data type for geographic data, it can be represented as strings or numerical values.
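For example, a table of places might hold names as text and coordinates as floating-point numbers. The places DataFrame below is a made-up sample with approximate coordinates, included only to show which dtypes Pandas assigns.

```python
import pandas as pd

# Hypothetical sample: place names as text, approximate coordinates as floats
places = pd.DataFrame({
    'City': ['Paris', 'Tokyo', 'Lima'],
    'Latitude': [48.8566, 35.6762, -12.0464],
    'Longitude': [2.3522, 139.6503, -77.0428],
})

# The coordinates are ordinary float64 columns; the city names are text
print(places.dtypes)
```

Nothing in the dtypes marks these columns as geographic. That meaning comes from how you use them, for instance by passing the latitude and longitude columns to a map chart.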


Pandas Data Types

In Pandas, each column (Series) has a data type, or dtype. The most common ones are:

  1. Numeric:

    • int64: For integer numbers.

    • float64: For floating-point numbers.

    • complex128: For complex numbers (less common).

  2. String/Object:

    • object: Typically used for strings or for columns that mix types (for example, strings and numbers). It has long been the default dtype for text in Pandas, although newer releases may infer a dedicated string dtype instead.

  3. Boolean:

    • bool: Represents True and False values. In visualization, Boolean data is usually treated as categorical data.

  4. Datetime:

    • datetime64[ns]: For dates and times, with nanosecond precision.

  5. Timedelta:

    • timedelta64[ns]: For differences between datetime values.

  6. Categorical:

    • category: Represents categorical data, which can save memory and improve performance when working with repeated values.

Geographic data has no dedicated dtype in a Pandas DataFrame; it is stored using the string and numeric types listed above.

Example of Data Types in a DataFrame

import pandas as pd

# Create a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],       # String/Object
    'Age': [25, 30, 35],                      # int64
    'Height': [5.5, 6.0, 5.8],                # float64
    'IsStudent': [True, False, False],        # bool
    'JoinDate': ['2023-01-01', '2023-02-01', '2023-03-01']  # strings here; converted to datetime64[ns] below
}

df = pd.DataFrame(data)

# Convert 'JoinDate' to datetime
df['JoinDate'] = pd.to_datetime(df['JoinDate'])

# Display data types
print(df.dtypes)

Output

Name                 object
Age                   int64
Height              float64
IsStudent              bool
JoinDate     datetime64[ns]
dtype: object

These data types allow Pandas to perform optimized operations tailored to the type of data you are working with. If needed, you can use .astype() to convert columns to a specific type.
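As a small sketch of .astype(), the example below converts several columns in one call by passing a dictionary of column names to target dtypes. The raw DataFrame and its column names are hypothetical.

```python
import pandas as pd

# Hypothetical data where the types are not what we want yet
raw = pd.DataFrame({
    'Age': ['25', '30', '35'],   # numbers stored as text
    'Height': [5, 6, 6],         # integers we want as floats
    'Group': ['a', 'b', 'a'],    # repeated labels
})

# Convert several columns at once with a {column: dtype} mapping
converted = raw.astype({
    'Age': 'int64',
    'Height': 'float64',
    'Group': 'category',
})

print(converted.dtypes)
```

.astype() returns a new DataFrame and leaves the original unchanged. It raises an error if a value cannot be converted, for example when a text value does not look like a number.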



Last updated 3 months ago