
Streaming of LLM responses in real time using FastAPI and Streamlit.



Local LLM Streaming


This project uses FastAPI and Streamlit to demonstrate how to stream LLM responses locally. It is split into a backend service, built with FastAPI, that handles interactions with the LLM, and a frontend service, built with Streamlit, that provides a user interface for submitting queries. Although this project uses ChatOpenAI from langchain_openai, the same approach can be extended to any LLM.
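The core idea can be sketched without any web framework: a producer yields text chunks as they become available, and a consumer renders the growing text after each chunk. The sketch below uses only the standard library; `fake_llm_stream` and `collect` are hypothetical names, and the "LLM" is a stand-in that echoes the prompt. It is not taken from this repository.

```python
import asyncio
from typing import AsyncIterator


async def fake_llm_stream(prompt: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for an LLM client that yields tokens one at a time."""
    for token in f"Echo: {prompt}".split(" "):
        await asyncio.sleep(0)  # hand control back to the event loop, as a network call would
        yield token + " "


async def collect(prompt: str) -> list[str]:
    """Consume the stream chunk by chunk and record the partial text after each one."""
    partial = ""
    snapshots = []
    async for chunk in fake_llm_stream(prompt):
        partial += chunk
        snapshots.append(partial)  # a UI would redraw with `partial` at this point
    return snapshots


if __name__ == "__main__":
    for text in asyncio.run(collect("hello streaming world")):
        print(text)
```

In a FastAPI backend, an async generator of this shape is typically handed to `StreamingResponse`, and in Streamlit the partial text can be written into a placeholder created with `st.empty()`. Whether this project wires things up exactly that way is an assumption.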

Project Demo

LLM local streaming demo


Make sure you have Docker installed on your machine if you want to use the Dockerfiles in this project; otherwise, you will have to run the application locally.

Getting Started

  1. Clone the repository:

    git clone
    cd LLM-Local-Streaming
  2. Create a .env file in the project root directory and add your OpenAI API key. The file sits at the root of the project tree:

     ├── backend/
     │   ├──
     │   └──
     ├── frontend/
     │   └──
     ├── docker-compose.yml
     ├── Makefile
     └── .env
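A quick way to check the .env file before starting the services is to parse it and fail early if the key is missing. The sketch below is a minimal standard-library parser, not the loader this project uses. It assumes the variable is named `OPENAI_API_KEY` (the name that langchain_openai reads from the environment by default), and `parse_env` and `require_key` are hypothetical helper names.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_key(env: dict[str, str], name: str = "OPENAI_API_KEY") -> str:
    """Return the key, or fail early with a readable message (name is an assumption)."""
    value = env.get(name, "")
    if not value:
        raise RuntimeError(f"{name} is missing or empty in .env")
    return value


if __name__ == "__main__":
    sample = "# local secrets\nOPENAI_API_KEY=sk-placeholder\n"
    print(require_key(parse_env(sample)))
```

To run the check against a real file, read it with `open(".env").read()` and pass the text to `parse_env`.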

Running the Application

Use the provided Makefile to build and run the Docker containers:

make build
make up

This will build the necessary Docker images and start the services.


  1. Access the Streamlit interface by navigating to http://localhost:8501 in your web browser.

  2. Enter your query in the input field and click submit.

  3. The backend service will process the query using the LLM, and the results will be displayed dynamically in the Streamlit interface.
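Displaying results dynamically means the frontend must turn raw byte chunks into text as they arrive. A chunk boundary can fall in the middle of a multi-byte UTF-8 character, so decoding each chunk on its own can fail. The sketch below shows one way to handle this with the standard library's incremental decoder. `decode_stream` is a hypothetical helper and is not taken from this repository.

```python
import codecs
from typing import Iterable, Iterator


def decode_stream(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield the accumulated text after each chunk, tolerating split UTF-8 characters."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    text = ""
    for chunk in chunks:
        text += decoder.decode(chunk)  # incomplete byte sequences are buffered
        yield text
    # Flush the decoder; raises UnicodeDecodeError if the stream ended mid-character.
    decoder.decode(b"", final=True)


if __name__ == "__main__":
    data = "café ok".encode("utf-8")
    # Split inside the two-byte "é" to simulate an awkward chunk boundary.
    for partial in decode_stream([data[:4], data[4:]]):
        print(partial)
```

In a Streamlit app, each yielded string could be written into a placeholder created with `st.empty()`, with the byte chunks coming from an HTTP client that supports streaming reads.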

  • backend/ FastAPI application handling LLM model interactions.
  • backend/ Helper classes and functions for streaming responses.
  • backend/Dockerfile: Dockerfile for building the backend docker image.
  • frontend/ Streamlit application for user interaction.
  • frontend/Dockerfile: Dockerfile for building the frontend docker image.
  • docker-compose.yml: Docker Compose configuration for services (frontend and backend).
  • Makefile: Makefile for simplifying common tasks.

Docker Compose

The project uses Docker Compose to manage the deployment of both the frontend and backend services. The docker-compose.yml file defines two services, frontend and backend, which are connected through a bridge network called app.
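On a user-defined bridge network, Docker resolves service names through its built-in DNS, so the frontend container can reach the backend at the hostname `backend` rather than `localhost`. The sketch below shows one way a frontend could build that URL. The environment variable names `BACKEND_HOST` and `BACKEND_PORT`, the port 8000, and the `/stream` path are all assumptions for illustration and are not confirmed by this repository.

```python
import os


def backend_url(path: str = "/") -> str:
    """Build the backend URL, defaulting to the Compose service name as the host."""
    host = os.environ.get("BACKEND_HOST", "backend")  # assumed variable name
    port = os.environ.get("BACKEND_PORT", "8000")     # assumed port
    return f"http://{host}:{port}/{path.lstrip('/')}"


if __name__ == "__main__":
    # "/stream" is a hypothetical endpoint path.
    print(backend_url("/stream"))
```

Overriding `BACKEND_HOST` with `localhost` would let the same code run outside Docker.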

Makefile Commands

  • make build: Build Docker images.
  • make up: Start Docker containers in detached mode.
  • make up-v: Start Docker containers in the foreground.
  • make down: Stop and remove Docker containers.
  • make down-v: Stop and remove Docker containers along with volumes.
  • make status: Show status of Docker containers.
  • make show-logs: Display logs of all Docker containers.
  • make server-logs: Display logs of the backend service.
  • make frontend-logs: Display logs of the frontend service.
  • make restart: Restart Docker containers.
  • make prune: Remove unused Docker resources.
  • make remove-images: Remove all Docker images.
  • make stop-container: Stop a specific Docker container.
  • make remove-container: Remove a specific Docker container.


  • Make sure to create the .env file and add your OpenAI API key as mentioned in the "Getting Started" section.

