Streaming Responses from Langchain’s ChatModels to Streamlit App

In this tutorial, we will create a Streamlit app that can stream responses from Langchain’s ChatModels to Streamlit’s components. The effect is similar to ChatGPT’s interface, which displays partial responses from the LLM as they become available.

With the rise of Large Language Models (LLMs), Streamlit has become an increasingly popular choice for data scientists and developers to quickly create interactive web applications. These applications can showcase the capabilities of LLMs in a user-friendly and visually appealing manner.

However, one challenge that arises when working with LLMs is the potentially long waiting time for responses, especially when the output is long. This delay can lead users to believe that the Streamlit app has crashed or become unresponsive, as nothing significant changes on the screen until the entire response is returned by the API.

In this tutorial, we address this issue by implementing a streaming effect similar to the ChatGPT interface, which displays partial responses from the LLM as they become available. This streaming approach enhances the user experience on the Streamlit app by providing a more responsive interface.

Here is the result

Prerequisites

This tutorial assumes that you already have:

Familiarity with Streamlit for creating web applications
Reasonable familiarity with langchain 🦜🔗

While you can still go through this tutorial by using the code provided, having a solid understanding of Streamlit and Langchain will help you grasp the concepts more effectively and enable you to customize the implementation according to your specific needs.

Packages Required

You can install the required Python packages using the following commands:

pip install streamlit
pip install langchain
pip install openai

Creating the Single Page Streamlit App

Now, let’s create a new Python file named app.py and start by importing the required libraries:

# app.py
import os
import streamlit as st

from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

Initialize the chat model with the required parameters. The crucial ones are streaming=True and callbacks=[StreamingStdOutCallbackHandler()], both of which must be present for the streaming later in this tutorial to work.

# app.py (continue)
model = ChatOpenAI(openai_api_key="<API_KEY>",
                   streaming=True,
                   callbacks=[StreamingStdOutCallbackHandler()],
                   verbose=True)

# Replace "<API_KEY>" above with your own API key
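
Hardcoding the key in app.py makes it easy to leak. As a minimal sketch of one alternative, the key could be read from an environment variable instead. The helper name load_api_key is hypothetical, and the variable name OPENAI_API_KEY is an assumption about how your environment is set up; neither appears in the original code.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None,
                 name: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored under `name`, or raise a clear error.

    `env` defaults to the process environment; passing a plain dict
    makes the helper easy to exercise without touching real settings.
    """
    source = os.environ if env is None else env
    key = source.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {name} environment variable before running the app."
        )
    return key
```

With this in place, the model could be created with openai_api_key=load_api_key() rather than the placeholder string.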

Next, we will create the main form for user input. To keep it simple, we will just create one text_area, followed by a submit_button.

Note that "input_text" is used as the key for the text_area component. This allows us to get the value entered into the text_area through st.session_state["input_text"] later.

For the submit button, this tutorial uses the callback approach; see the Streamlit documentation on button behavior, listed under Reference, for more details. With this approach, we define a specific function that runs when the button is clicked, which gives better control over the app’s behavior.

# app.py (continue)

# main form
with st.form(key='form_main'):
    user_input = st.text_area("Enter your text here", key='input_text')

    submit_button = st.form_submit_button(
                            label='Submit',
                            on_click=on_submit_button_click)
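
The callback approach depends on ordering: when the form is submitted, the widget value is stored under its key, the callback runs, and only then does the rest of the script rerun. The snippet below is a toy model of that order in plain Python, not Streamlit's implementation; session_state, events, and simulate_click are hypothetical stand-ins used only for illustration.

```python
from typing import Callable, Dict, List

# Toy stand-ins (hypothetical, not Streamlit objects)
session_state: Dict[str, str] = {}
events: List[str] = []


def on_submit() -> None:
    """Callback: reads the widget value through its key and stores a result."""
    events.append("callback")
    session_state["output_text"] = session_state["input_text"].upper()


def script_body() -> None:
    """The rest of the script: sees whatever the callback stored."""
    events.append("script")
    if "output_text" in session_state:
        events.append("show:" + session_state["output_text"])


def simulate_click(text: str,
                   callback: Callable[[], None],
                   script: Callable[[], None]) -> None:
    """Mimic one submit: store the widget value, run the callback, rerun the script."""
    session_state["input_text"] = text
    callback()
    script()


simulate_click("hello", on_submit, script_body)
print(events)
```

Because the callback finishes before the script body runs, anything it writes to the session state is already available when the display code executes.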

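This part of the code displays the output from the LLM: an st.empty placeholder that holds the partial response while it streams, followed by a block that shows the completed response stored in st.session_state.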

# app.py (continue)

# For showing the Streaming Output
streaming_box = st.empty()


# For showing the Completed Output
if 'output_text' in st.session_state:
    st.markdown('---')
    st.markdown('#### Response:')
    st.markdown(st.session_state['output_text'])
    st.markdown('---')

We’ve jumped the gun slightly. The code below should be inserted before the main_form that contains the text_area and submit_button.

Define a function to handle the form submission:

# app.py (continue)

def on_submit_button_click():
    st.toast("Processing... Please wait...", icon='⏳')

    # Prepare the system message
    message_system = SystemMessage(content="You are a helpful, "
                                           "talkative, and friendly assistant.")

    # Prepare the user message using the value from the `text_area`
    message_user = HumanMessage(content=st.session_state.input_text)

    full_response = []
    # Loop through the chunks streamed back from the API call
    for resp in model.stream([message_system, message_user]):
        wordstream = resp.content

        # Skip chunks whose content is None or empty
        if wordstream:
            full_response.append(wordstream)
            result = "".join(full_response).strip()
            # This streaming_box is a st.empty from the display
            with streaming_box.container():
                st.markdown('---')
                st.markdown('#### Response:')
                st.markdown(result)
                st.markdown('---')

    # Concatenate and store the streamed chunks to a full response
    st.session_state.output_text = "".join(full_response).strip()

    st.toast("Processing complete!", icon='✅')


# main form
# This section is repeated here to show where the function
# `on_submit_button_click` above should be placed.
# This section of code was introduced earlier.
with st.form(key='form_main'):
    user_input = st.text_area("Enter your text here", key='input_text')

    submit_button = st.form_submit_button(
                            label='Submit',
                            on_click=on_submit_button_click)
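
The accumulation logic inside the callback can be checked without Streamlit or an API call. The following is a minimal sketch under stated assumptions: FakeChunk is a hypothetical stand-in for a streamed message chunk, and collect_response is an illustrative helper that takes an on_update function where the app would write into streaming_box. Neither name comes from LangChain or Streamlit.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class FakeChunk:
    """Hypothetical stand-in for a streamed message chunk with a `content` field."""
    content: str


def collect_response(chunks: Iterable[FakeChunk],
                     on_update: Callable[[str], None]) -> str:
    """Accumulate chunk contents, reporting each partial result via `on_update`."""
    parts: List[str] = []
    for chunk in chunks:
        if chunk.content:  # skip empty chunks, as the loop above does
            parts.append(chunk.content)
            on_update("".join(parts).strip())
    return "".join(parts).strip()


# Record every partial response that would have been rendered
updates: List[str] = []
final = collect_response(
    [FakeChunk(""), FakeChunk("Hello"), FakeChunk(", "), FakeChunk("world!")],
    on_update=updates.append,
)
print(updates)
print(final)
```

Separating accumulation from rendering in this way keeps the streaming loop testable, while the Streamlit-specific drawing stays in a small function passed as on_update.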

Running the Streamlit Application

To run the Streamlit application, open a terminal and navigate to the project directory. Then, run the following command:

streamlit run app.py

The Complete Code

import streamlit as st
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler


model = ChatOpenAI(openai_api_key="<API_KEY>",
                   streaming=True,
                   callbacks=[StreamingStdOutCallbackHandler()],
                   verbose=True)
# Replace "<API_KEY>" above with your own API key


def on_submit_button_click():
    st.toast("Processing... Please wait...", icon='⏳')

    # Prepare the system message
    message_system = SystemMessage(content="You are a helpful, "
                                           "talkative, and friendly assistant.")

    # Prepare the user message using the value from the `text_area`
    message_user = HumanMessage(content=st.session_state.input_text)

    full_response = []
    # Loop through the chunks streamed back from the API call
    for resp in model.stream([message_system, message_user]):
        wordstream = resp.content

        # Skip chunks whose content is None or empty
        if wordstream:
            full_response.append(wordstream)
            result = "".join(full_response).strip()
            # This streaming_box is a st.empty from the display
            with streaming_box.container():
                st.markdown('---')
                st.markdown('#### Response:')
                st.markdown(result)
                st.markdown('---')

    # Concatenate and store the streamed chunks to a full response
    st.session_state.output_text = "".join(full_response).strip()

    st.toast("Processing complete!", icon='✅')


# Main Form
with st.form(key='form_main'):
    user_input = st.text_area("Enter your text here", key='input_text')
    submit_button = st.form_submit_button(label='Submit', on_click=on_submit_button_click)


# For Showing the Streaming Output
streaming_box = st.empty()

# For Showing the Completed Output
if 'output_text' in st.session_state:
    st.markdown('---')
    st.markdown('#### Response:')
    st.markdown(st.session_state['output_text'])
    st.markdown('---')

Reference

Langchain, Chat Model Streaming (2023), https://python.langchain.com/docs/modules/model_io/models/chat/streaming
Streamlit, Button behavior and examples (2023), https://docs.streamlit.io/library/advanced-features/button-behavior-and-examples#option-2-use-a-callback
