Build awareness and adoption for your software startup with Circuit.

Creating Interactive Analytics Dashboards Using Python’s Streamlit

This article gives a guide to building an interactive analytics dashboard with Streamlit in Python.

Superstore Sales Analytics Dashboard

Introduction

Imagine you’re a Data Scientist, Analyst, Machine Learning Engineer, or a Web Developer, and you want to transform your Python data script into a quick actionable insight. Streamlit is your answer. Streamlit is an open-source Python framework that allows you to effortlessly create web applications. With Streamlit, you can turn your data scripts into shareable web apps in minutes, not weeks, and it’s absolutely free to use.

In this article, you will learn how to build an analytics dashboard using the Streamlit Python package. We will guide you through the process step by step, from installing the necessary packages to creating interactive visualizations.

Python Packages

Before we get started, you’ll need to install the following packages using pip:

pip install streamlit numpy pandas streamlit_extras millify altair plotly

After installing the required packages, import them into your python script.

import streamlit as st # streamlit package
import numpy as np
import pandas as pd
from millify import millify # shortens values (10_000 ---> 10k)
from streamlit_extras.metric_cards import style_metric_cards # beautify metric card with css
import plotly.graph_objects as go
import altair as alt 

Load the dataset

For this tutorial, we’ll use the Superstore Sales dataset available on Superstore Sales 2023 — dataset by ehughes | data.world.

preview of the Superstore Sales dataset.

Before building a dashboard with your data, it’s essential to define the Key Performance Indicators (KPIs) you want to visualize on the dashboard.

Here are some KPIs to consider:

  • What are the total sales?
  • How much profit was made?
  • How many distinct orders did the store receive?
  • What are the top 10 products by sales?
  • What are the top 10 products by profit?
  • What is the average number of days it takes to ship an Order?
  • What are the sales trends for different product categories over the year?

These questions will give you a guide on what to plot in your dashboard.

Now, let’s load the dataset and prepare it for our dashboard:

# this function get the % change for any column by year and the specified aggregate
def get_per_year_change(col,df,metric):
    # Group by years and calculate the specified metric
    grp_years = df.groupby(''year'')[col].agg([metric])[metric]
    # Calculate the % change
    grp_years = grp_years.pct_change() * 100
    grp_years.fillna(0, inplace=True)
    grp_years = grp_years.apply(lambda x: f"{x:.1f}%" if pd.notnull(x) else ''NaN'')

    return grp_years
# Cache the dataset for better performance
@st.cache_data(ttl=3600)
def load_data():
    df = pd.read_excel(''Sample - Superstore.xls'',sheet_name=0)
    # Extract the year and store it as a new column
    df[''year''] = df[''Order Date''].dt.year
    # Calculate the difference between Shipped date and order date
    df[''days to ship''] = abs(df[''Ship Date'']- df[''Order Date'']).dt.days

    # Calculate the % change of sales, profit and orders over the years
    grp_years_sales = get_per_year_change(''Sales'',df,''sum'')
    grp_year_profit = get_per_year_change(''Profit'',df,''sum'')
    grp_year_orders = get_per_year_change(''Order ID'',df,''count'')

    return df, grp_years_sales, grp_year_profit,grp_year_orders

# load cached data
df_original ,grp_years_sales, grp_year_profit,grp_year_orders = load_data()

The load_data() function, get the superstore dataset, extracts the year, number of days to ship an item and creates DataFrames for percentage change from the the previous year for sales, profit and number of orders.

Metric Card Container

Let’s create a metric card to display the KPI metrics for total sales, total profit, and the count of distinct orders. We’ll also show their percentage change compared to the previous year.

Streamlit’s st.metric displays a metric, with an optional indicator of how the metric changed.

# creates the container for page title
dash_1 = st.container()

with dash_1:
    st.markdown("<h2 style=''text-align: center;''>Superstore Sales Dashboard</h2>", unsafe_allow_html=True)
    st.write("")


# creates the container for metric card
dash_2 = st.container()

with dash_2:
    # get kpi metrics
    total_sales = df[''Sales''].sum()
    total_profit = df[''Profit''].sum()
    total_orders = df[''Order ID''].nunique()

    col1, col2, col3 = st.columns(3)
    # create column span
    col1.metric(label="Sales", value= "$"+millify(total_sales, precision=2) , delta=sales_per_change)
    
    col2.metric(label="Profit", value= "$"+millify(total_profit, precision=2), delta=profit_per_change)
    
    col3.metric(label="Orders", value=total_orders, delta=order_count_per_change)
    
    # this is used to style the metric card
    style_metric_cards(border_left_color="#DBF227")

Metric Card Container Showing Sales, Profit, and Orders KPI.

Charts Container

Now, let’s create horizontal bar charts to display the metrics for the top 10 products by total sales and by total profit.

In the code below we first query the dataframe by grouping by sum of the sales and sum of profit and then get the top 10s.

Streamlit’s st.altair_chart is used to display chart using the Altair library.

# container for top 10 best selling and most profitable products
with dash_3:
    
    # create columna for both graph
    col1,col2 = st.columns(2)
    # get the top 10 best selling products
    top_product_sales = df.groupby(''Product Name'')[''Sales''].sum()
    top_product_sales = top_product_sales.nlargest(10)
    top_product_sales = pd.DataFrame(top_product_sales).reset_index()

    # get the top 10 most profitable products
    top_product_profit = df.groupby(''Product Name'')[''Profit''].sum()
    top_product_profit = top_product_profit.nlargest(10)
    top_product_profit = pd.DataFrame(top_product_profit).reset_index()
   
    
    # create the altair chart
    with col1:
        chart = alt.Chart(top_product_sales).mark_bar(opacity=0.9,color="#9FC131").encode(
                x=''sum(Sales):Q'',
                y=alt.Y(''Product Name:N'', sort=''-x'')   
            )
        chart = chart.properties(title="Top 10 Selling Products" )

        
        st.altair_chart(chart,use_container_width=True)

    # create the altair chart
    with col2:
        chart = alt.Chart(top_product_profit).mark_bar(opacity=0.9,color="#9FC131").encode(
                x=''sum(Profit):Q'',
                y=alt.Y(''Product Name:N'', sort=''-x'')
                
            )
        chart = chart.properties(title="Top 10 Most Profitable Products" )

        st.altair_chart(chart,use_container_width=True)

Top 10 selling and most profitable product Chart.

Now to finalize our dashboard we will create charts for displaying the metrics; average number of days it takes to ship an order and sales trends for different product categories over the years.

We use the plotly’s gauge plot to show the average shipping days, the min and max days as well. then we create a bar chart to show to total sales for different products categories over the years.

# container for avg shipping days and sales of different products categories over the years
with dash_4:

    col1,col2 = st.columns([1,2])

    with col1:
        value =int(np.round(df[''days to ship''].mean()))  # Example value

        fig = go.Figure(go.Indicator(
            mode="gauge+number",
            value=value,
            title={''text'': "Average Shipping Days"},
            gauge={''axis'': {''range'': [df[''days to ship''].min() , df[''days to ship''].max()]},
                ''bar'': {''color'': "#005C53"},
                }
        ))

        fig.update_layout(height=350) 

        st.plotly_chart(fig, use_container_width=True)
  
    
    with col2:
        custom_colors = {''Furniture'': ''#005C53'', ''Office Supplies'': ''#9FC131'', ''Technology'': ''#042940''}


        bars = alt.Chart(df).mark_bar().encode(
            y=alt.Y(''sum(Sales):Q'', stack=''zero'', axis=alt.Axis(format=''~s'') ),
            x=alt.X(''year:N''),
            #color=alt.Color(''Category'')
            color=alt.Color(''Category:N'', scale=alt.Scale(domain=list(custom_colors.keys()), range=list(custom_colors.values())))

        )

        text = alt.Chart(df).mark_text(dx=-15, dy=30, color=''white'').encode(
             y=alt.Y(''sum(Sales):Q'', stack=''zero'', axis=alt.Axis(format=''~s'') ),
            x=alt.X(''year:N''),
            detail=''Category:N'',
            text=alt.Text(''sum(Sales):Q'', format=''~s'')
          )

        chart = bars + text

        chart = chart.properties(title="Sales trends for Product Categories over the years" )

        st.altair_chart(chart,use_container_width=True)

Avg ship days and product categories sales chart.

Complete dashboard Putting everything together results in a modern and interactive dashboard that provides valuable insights at a glance.

Superstore Sales Dashboard in Streamlit (superstore-sales-app.streamlit.app)

Conclusion and Call to Action

Thank you for reading! In this article, you’ve learned how to create a dynamic analytics dashboard using Streamlit in Python. With Streamlit’s simplicity and flexibility, you can turn your data into actionable insights quickly.

Now, it’s your turn. Go ahead and build your real-time, live dashboards with Streamlit. You can find the code for this project on GitHub. Feel free to reach out to me at mathiasjr007@gmail.com or connect with me on LinkedIn.

Happy dashboarding! 😊




Continue Learning