Intro to Data Visualization

Exploring with Regression Plots

Understanding Regression Plots

Regression plots help uncover relationships and trends between variables. By fitting a line or curve through a scatter plot, they make the underlying pattern easier to see and show whether two variables are positively related, negatively related, or not clearly related at all. They are useful for spotting linear and non-linear relationships and for making rough predictions from observed data. Regression plots in some tools also draw confidence intervals around the fitted line to indicate how uncertain the fit is, though Plotly Express does not draw these bands by default. All of this makes regression plots a staple of exploratory data analysis: they show how variables interact and whether one appears to depend on another.

In Plotly Express, regression plots are created with the trendline parameter of functions such as px.scatter. Common values are 'ols' (ordinary least squares, a straight-line fit) and 'lowess' (locally weighted smoothing, which can follow curved patterns); recent Plotly versions also offer 'rolling', 'expanding', and 'ewm' for moving-window trendlines. Polynomial fits are not a built-in trendline type. The 'ols' and 'lowess' options rely on the statsmodels package, so it must be installed alongside Plotly. The trendline_options parameter adjusts how the fit is computed, and hovering over an OLS trendline shows the fitted equation and its R-squared value. Because Plotly figures are interactive, you can also zoom, pan, and hover over individual points to inspect specific parts of a relationship, which makes it easier to identify the main trends in a larger dataset.

Let's explore three different examples of plots with regression (trend) lines.


1: Scatter Plot with Trend Line

import pandas as pd
import plotly.express as px

# Sample data
data = {
    'X': [1, 2, 3, 4, 5],
    'Y': [2, 4.1, 6, 8.1, 10]
}

df = pd.DataFrame(data)

# Scatter plot with an OLS (linear) trend line; trendline='ols' requires statsmodels
fig = px.scatter(df, x='X', y='Y', trendline='ols', title='Scatter Plot with Trend Line')
fig.show()
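To make the fitted line less of a black box, here is a minimal, dependency-free sketch of the arithmetic behind an ordinary least squares straight-line fit, using the same sample values as above. The variable names are illustrative and not part of any Plotly API; the library performs the fit through statsmodels rather than with this exact code.

```python
# Illustrative OLS straight-line fit using only the standard library.
xs = [1, 2, 3, 4, 5]
ys = [2, 4.1, 6, 8.1, 10]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Slope = covariance(x, y) / variance(x); the intercept makes the line pass through the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# R-squared = 1 - (residual sum of squares / total sum of squares).
predicted = [slope * x + intercept for x in xs]
ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r_squared = 1 - ss_res / ss_tot

print(f"Y = {slope:.2f} * X + {intercept:.2f}, R^2 = {r_squared:.4f}")
# prints: Y = 2.00 * X + 0.04, R^2 = 0.9997
```

An R-squared this close to 1 indicates that a straight line explains almost all of the variation in Y for this small sample.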

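2: LOWESS Trend Line

# Scatter plot with a LOWESS (locally weighted smoothing) trend line.
# LOWESS is a non-parametric smoother that can follow curves; it is not a polynomial regression.
fig = px.scatter(df, x='X', y='Y', trendline='lowess', title='LOWESS Trend Line')
fig.show()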

3: Adding Trendline Details

Plotly Express lets you extract the fitted model for deeper analysis. In this example the trendline is an OLS regression, and px.get_trendline_results returns a DataFrame whose px_fit_results column holds the statsmodels results object for each trendline.

# Scatter plot with OLS trend line
fig = px.scatter(df, x='X', y='Y', trendline='ols', title='Scatter Plot with OLS Trend Line')
fig.show()

# Accessing trendline results: take the first fit and print its statsmodels summary
trendline_results = px.get_trendline_results(fig)
print(trendline_results.px_fit_results.iloc[0].summary())

These examples illustrate how to create regression plots with Plotly, whether using simple linear regression or more complex methods like LOWESS (Locally Weighted Scatterplot Smoothing).


Last updated 3 months ago