Intro to Data Visualization

Exploring the Data

Let's start with the first use case: life expectancy. We want to understand the factors that influence it. Looking at the dataset's columns with this use case in mind, we can ask questions about:

  • Economics: What is the correlation between life expectancy and GDP per capita?

  • Time Series: Does life expectancy change (increase/decrease) over time? Are the patterns consistent by region?

  • Regions: Are there geographic / regional patterns in life expectancy?

Let's explore the first question together; you can explore the other two on your own.


Q1: Economics

Let's create some visualizations related to Life Expectancy and GDP.

Life expectancy and GDP

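import plotly.express as px

# Load the Gapminder dataset that ships with Plotly Express
df = px.data.gapminder()

# All years pooled: life expectancy (x) against GDP per capita (y)
fig = px.scatter(
    data_frame=df,
    x="lifeExp",
    y="gdpPercap",
)
fig.show()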

Output: scatter plot of GDP per capita (y) against life expectancy (x), all years pooled.

Let's add a title, color the points by continent, and look only at the year 2007.

import plotly.express as px

# Keep only the 2007 rows and color each country by its continent
fig = px.scatter(
    data_frame=df[df["year"] == 2007],
    x="lifeExp",
    y="gdpPercap",
    color="continent",
    title="Life Expectancy v GDP: 2007",
)
fig.show()

Output: scatter plot of GDP per capita (y) against life expectancy (x) for 2007, colored by continent.

Time Series

Several other charts of life expectancy and GDP per capita can show how this relationship changes over time:

  • What about a visualization with pooled data, not just 2007? What about different colors by year?

  • What about a visualization of the relationship for each continent?

  • Does it look different to include a regression plot?

Regions

  • What is the correlation between life expectancy and GDP per capita?

  • Time Series: Does life expectancy change (increase/decrease) over time? Are the patterns consistent by region?

  • Regions: Are there geographic / regional correlations?

