Data Types
Main Data Types
There are four main types of data:
1. Categorical Data
Categorical data represents classifications or labels. Pandas has a special data type called category
to optimize memory usage and performance.
2. Numeric Data
Numeric data represents numerical values and is used for computations and analysis.
3. Temporal Data
Temporal data represents specific times or durations. In Pandas, points in time are typically stored as datetime values and durations as timedelta values.
4. Geographic Data
Geographic data represents location-related information, such as coordinates or region names. While Pandas does not have a specific data type for geographic data, it can be represented as strings or numerical values.
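As a minimal sketch of the point above, the snippet below stores location information using only ordinary columns: region or city names as text and coordinates as floats. The column names and sample values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data: city names and coordinates are illustrative only.
places = pd.DataFrame({
    "city": ["Paris", "Tokyo", "Nairobi"],
    "latitude": [48.8566, 35.6762, -1.2921],
    "longitude": [2.3522, 139.6503, 36.8219],
})

# Coordinates are plain floats, so ordinary numeric filtering applies.
northern = places[places["latitude"] > 0]

print(places.dtypes)
print(northern["city"].tolist())
```

Because the coordinates are regular numeric columns, Pandas treats them like any other numbers; nothing in the DataFrame marks them as geographic.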
Pandas Data Types
In Pandas, objects primarily have the following data types:
Numeric:
int64: For integer numbers.
float64: For floating-point numbers.
complex128: For complex numbers (less common).
String/Object:
object: Typically used for strings or for columns that mix types (for example, strings and numbers). It has traditionally been the default data type for text data in Pandas; a dedicated string data type is also available.
Boolean:
bool: Represents True and False values. In visualization, Boolean data is treated as categorical data.
Datetime:
datetime64[ns]: For dates and times, with nanosecond precision.
Timedelta:
timedelta64[ns]: For differences between datetime values.
Categorical:
category: Represents categorical data, which can save memory and improve performance when working with repeated values.
Geographic data does not have a dedicated data type in a Pandas DataFrame.
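The memory benefit of the category type listed above can be checked with a small experiment. This is a rough sketch that assumes a hypothetical column containing three labels repeated many times; the exact byte counts depend on your Pandas version and platform, so none are shown here.

```python
import pandas as pd

# Hypothetical column: a few distinct labels repeated many times.
labels = pd.Series(["red", "green", "blue"] * 10_000)
as_category = labels.astype("category")

# deep=True includes the memory used by the string values themselves.
plain_bytes = labels.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)

print("text column:    ", plain_bytes, "bytes")
print("category column:", category_bytes, "bytes")
```

The saving comes from storing each distinct label once and keeping only small integer codes per row, so it shrinks as the number of distinct values approaches the number of rows.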
Example of Data Types in a DataFrame
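The following is a minimal sketch of such a DataFrame, not the original example from this page. It builds one column for each type family listed above; the column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: one column per dtype family.
df = pd.DataFrame({
    "quantity": [1, 2, 3],              # integers
    "price": [9.99, 19.5, 4.25],        # floating-point numbers
    "product": ["pen", "book", "mug"],  # text
    "in_stock": [True, False, True],    # booleans
    "ordered_at": pd.to_datetime(["2024-01-05", "2024-02-10", "2024-03-15"]),
    "size": pd.Categorical(["S", "M", "S"]),
})

# Subtracting datetimes from a timestamp yields a timedelta column.
df["delivery_time"] = pd.Timestamp("2024-04-01") - df["ordered_at"]

# List the data type of every column.
print(df.dtypes)
```

Printing df.dtypes lists each column name next to its data type. The exact names shown for the text and datetime columns can vary between Pandas versions.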
Output
These data types allow Pandas to perform optimized operations tailored to the type of data you are working with. If needed, you can use .astype() to convert columns to a specific type.
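As an illustration of .astype(), the sketch below converts a hypothetical DataFrame in which every column was read in as text. The column names and values are assumptions made for the example.

```python
import pandas as pd

# Hypothetical raw data where every column arrived as text.
raw = pd.DataFrame({
    "quantity": ["1", "2", "3"],
    "price": ["9.99", "19.5", "4.25"],
    "size": ["S", "M", "S"],
})

# astype accepts a mapping from column name to target dtype and
# returns a new DataFrame; raw itself is left unchanged.
converted = raw.astype({
    "quantity": "int64",
    "price": "float64",
    "size": "category",
})

print(converted.dtypes)
```

Passing a dictionary converts several columns in one call. A single column can be converted the same way with, for example, raw["quantity"].astype("int64").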