Intro to Data Visualization

Exploring Categories

Importance of Visualizing Categorical Variables

Visualizing categorical data is an essential part of data analysis. It is critical to understand the patterns and relationships of categories within a dataset. Common visualization techniques for categorical data include bar charts, dot plots, and pie charts.

Bar charts are particularly effective for comparing categories: the length of each bar shows the frequency or proportion of a category. They can also show subcategories, either side by side (grouped) to compare subcategories across categories, or stacked to break each category down into its parts.

Dot plots are similar to scatter plots, except they have a categorical axis. Dot plots can be used to examine relationships between categorical and numerical data by encoding categories using color, shape, or size. These visualizations are especially useful for identifying trends, patterns, or outliers within categorical datasets.

Pie charts are suitable only for representing proportions of a whole. They work best when the dataset contains just a few distinct categories, which avoids visual clutter.

All these types of charts provide an immediate and clear depiction of the most and least prominent categories, helping analysts quickly grasp the structure of the data.

Plotly provides a simple and powerful way to create these charts. Its figures are interactive, offering features like hover-over tooltips, filtering, and zooming for deeper analysis.


Bar Charts

Simple Bar Chart

import pandas as pd
import plotly.express as px

df = px.data.gapminder().query("year == 2007")

fig = px.bar(df, x="continent", y="pop")

fig.show()

To change the order of the bars:

import plotly.express as px

df = px.data.gapminder().query("year == 2007")

fig = px.bar(df, x="continent", y="pop")

# Sort the categories by their total y value, largest first
fig.update_layout(xaxis={'categoryorder': 'total descending'})

fig.show()

Stacked Bar Chart

Stacked bar charts are useful for showing the contribution of different subcategories within each main category.

import pandas as pd
import plotly.express as px

df = px.data.gapminder().query("year == 2007")

fig = px.bar(df, x="continent", y="pop", color="country")

# Sort bars in descending order based on the y-axis values
fig.update_layout(xaxis={'categoryorder': 'total descending'})

fig.show()

Side-by-Side Bar Chart

Side-by-side bar charts (also known as grouped bar charts) are ideal for comparing multiple subcategories across categories.

import pandas as pd
import plotly.express as px

# Sample data with subcategories
data = {
    'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
    'Subcategory': ['X', 'Y', 'X', 'Y', 'X', 'Y'],
    'Values': [10, 5, 20, 10, 15, 10]
}

df = pd.DataFrame(data)

# Create a side-by-side bar chart: barmode='group' places the
# subcategory bars next to each other instead of stacking them
fig = px.bar(df, x='Category', y='Values', color='Subcategory', barmode='group', title='Side-by-Side Bar Chart')
fig.show()

Features of Plotly Bar Charts

  • Interactivity: Bar charts are interactive, allowing users to hover over bars to see details.

  • Customization: Easily adjust colors, labels, and titles to suit your needs.

  • Grouping and Stacking: Control the layout with the barmode parameter.

These examples demonstrate how to use Plotly to create visually appealing and insightful bar charts to analyze your data effectively.


Dot Plot

A dot plot is a scatterplot with a categorical axis.

Simple Dot Plot

import plotly.express as px
df = px.data.medals_long()

fig = px.scatter(df, y="nation", x="count", color="medal", symbol="medal")
fig.update_traces(marker_size=10)
fig.show()

Grouped or Side-by-Side Dot Plot

import plotly.express as px

df = px.data.medals_long()

fig = px.scatter(df, y="count", x="nation", color="medal")
fig.update_traces(marker_size=10)
fig.update_layout(scattermode="group", scattergap=0.75)
fig.show()
