The Gapminder Dataset

Last updated 3 months ago

The Gapminder dataset is a well-known resource for data analysis and visualization. Its data source, Systema Globalis, contains data on income, life expectancy, and child mortality by country and is used by Gapminder. The dataset tracks global development trends through indicators such as population, life expectancy, and GDP per capita across countries and regions over time. This makes it particularly useful for exploring how socio-economic and health variables relate to one another and how they evolve.

Key Features of the Gapminder Dataset:

  • Country: The name of the country or region.

  • Year: The year for which the data is recorded.

  • Life Expectancy: The average lifespan of people in a given country.

  • Population: The total population of the country.

  • GDP per Capita: The gross domestic product divided by the population, reflecting the average economic output per person.

  • Region: Each country is assigned to a region under two groupings: region4 (four world regions) and region6 (six world regions).

Example of Gapminder Data:

Here is a subset of the dataset for 2007.

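| Country     | Year | Life Expectancy | Population | GDP per Capita |
|-------------|------|-----------------|------------|----------------|
| Afghanistan | 2007 | 43.828          | 31889923   | 974.5803384    |
| Albania     | 2007 | 76.423          | 3600523    | 5937.029526    |
| Algeria     | 2007 | 72.301          | 33333216   | 6223.367465    |
| Angola      | 2007 | 42.731          | 12420476   | 4797.231267    |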

Why Use the Gapminder Dataset?

The Gapminder dataset offers several benefits for teaching and practice:

  • Versatility: It allows for a variety of analyses, from simple univariate plots to complex multivariate visualizations.

  • Time Series Analysis: The dataset spans multiple years, making it ideal for studying trends and changes over time.

  • Cross-Disciplinary Applications: Useful for economics, public health, and social sciences.

Common Visualizations with Gapminder

  1. Scatter Plots: Show relationships between variables like GDP per capita and life expectancy.

  2. Line Charts: Track changes in a specific variable over time, such as population growth.

  3. Bubble Charts: Combine multiple dimensions, such as population size, GDP per capita, and life expectancy, for rich visual storytelling.

Here’s an example of creating a bubble chart using Plotly and the Gapminder dataset:

import plotly.express as px

df = px.data.gapminder()

# Create a bubble chart
fig = px.scatter(
    df[df['year'] == 2007],  # Filter for the year 2007
    x='gdpPercap', 
    y='lifeExp', 
    size='pop', 
    color='continent', 
    hover_name='country', 
    log_x=True,  # Log scale for better readability
    title='Gapminder Data: GDP vs Life Expectancy (2007)'
)
fig.show()

The Gapminder dataset is an excellent tool for teaching and practicing data visualization and analysis, providing insights into global development and its disparities over time.

Systema Globalis
Gapminder