This repo and these instructions are for running the Chatbot RAG App with Elastic Cloud using either Docker or Kubernetes.

Elastic's sample RAG-based chatbot application showcases how to use Elasticsearch with local data that has embeddings, enabling search to pull out the most contextually relevant information when a query comes in through a chatbot connected to an LLM of your choice. It's a great example of how to build out a RAG-based application with Elasticsearch.

This app is also now instrumented with EDOT (Elastic Distribution of OpenTelemetry), so you can visualize the chatbot's traces to OpenAI, as well as relevant logs and metrics from the application. By running the app as instructed in the GitHub repo with Docker, you can see these traces on a local stack. But how about running it against Serverless, Elastic Cloud, or even on Kubernetes?
A few prerequisites are needed before you can run this with Elastic Cloud on Kubernetes or Docker:

- An Elastic Cloud account (sign up now), and familiarity with Elastic's OpenTelemetry configuration. Serverless requires no particular version; regular Elastic Cloud requires at least 8.17.
- Git clone the RAG-based chatbot application and go through its tutorial to become familiar with bringing the application up using Docker.
- An account on OpenAI with API keys.
- A Kubernetes cluster: use Amazon EKS, Google GKE, or Microsoft Azure AKS.
- Familiarity with EDOT, to understand how logs, metrics, and traces are brought in from the application through the OTel Collector.
To set this up, you can follow the Observability-examples repo, which has the Kubernetes YAML files being used. These also point to Elastic Cloud.

You have two options:

- Run a prebuilt container for the chatbot app, which uses the image `ghcr.io/elastic/elasticsearch-labs/chatbot-rag-app:latest`
- Build your own image
To run with the prebuilt container:

1. Set up the Kubernetes cluster.
2. Get the appropriate ENV variables:
   - Find the `OTEL_EXPORTER_OTLP_ENDPOINT` and `OTEL_EXPORTER_OTLP_HEADERS` variables in your Elastic Cloud instance under Integrations --> APM.
   - Get your OpenAI key.
   - Get the Elasticsearch URL, username, and password.
3. Replace the variables and your image location in both `init-index-job.yaml` and `k8s-deployment-chatbot-rag-app.yaml`.
Here is what you replace in `k8s-deployment-chatbot-rag-app.yaml`:

```yaml
stringData:
  ELASTICSEARCH_URL: "https://yourelasticcloud.es.us-west-2.aws.found.io"
  ELASTICSEARCH_USER: "elastic"
  ELASTICSEARCH_PASSWORD: "elastic"
  OTEL_EXPORTER_OTLP_HEADERS: "Authorization=Bearer%20xxxx"
  OTEL_EXPORTER_OTLP_ENDPOINT: "https://12345.apm.us-west-2.aws.cloud.es.io:443"
  OPENAI_API_KEY: "YYYYYYYY"
```
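One detail worth calling out: the value of `OTEL_EXPORTER_OTLP_HEADERS` URL-encodes the space in `Bearer <token>` as `%20`, since the variable is parsed as a comma-separated list of `key=value` pairs. A minimal sketch, with `xxxx` standing in for your real APM secret token:

```shell
# Compose the OTLP headers value; "xxxx" is a placeholder for your APM
# secret token. Note the %20 instead of a literal space after "Bearer".
TOKEN="xxxx"
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer%20${TOKEN}"
echo "$OTEL_EXPORTER_OTLP_HEADERS"   # prints Authorization=Bearer%20xxxx
```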
4. Then run the following:

```shell
kubectl create -f k8s-deployment.yaml
kubectl create -f init-index-job.yaml
```

Here is what happens:

- `k8s-deployment.yaml` ensures the chatbot-rag-app pods are running.
- `k8s-deployment.yaml` also deploys a secret with your env variables (OpenAI key, Elasticsearch endpoints, OTel endpoint and headers, etc.).
- `init-index-job.yaml` runs a job that initializes Elasticsearch with the index for the app, using the secret created by `k8s-deployment.yaml`.
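To illustrate how the pods pick those values up: a Deployment or Job typically references the Secret with `envFrom`, which injects every key of the Secret as a container environment variable. This is a sketch only; the secret name and container name below are assumptions, not necessarily what the repo's manifests use.

```yaml
# Illustrative pod-template fragment: inject all keys of the Secret
# (ELASTICSEARCH_URL, OPENAI_API_KEY, ...) as container env variables.
# The secret and container names here are assumptions, not from the repo.
containers:
  - name: chatbot-rag-app
    image: ghcr.io/elastic/elasticsearch-labs/chatbot-rag-app:latest
    envFrom:
      - secretRef:
          name: chatbot-rag-app-secrets
```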
5. Once the job is complete and the chatbot-rag-app is running, get the load balancer URL by running:

```shell
kubectl get services
```

You should see something such as:

```
NAME                      TYPE           CLUSTER-IP      EXTERNAL-IP                                        PORT(S)        AGE
chatbot-regular-service   LoadBalancer   10.100.130.44   xxxxxxxxx-1515488226.us-west-2.elb.amazonaws.com   80:30748/TCP   6d23h
```

6. Open the URL and run the app, then log into Elastic Cloud and look for your service in APM.
When using your own image, all the steps above are valid except that:

- You need to create your own image. Build a Docker image using the Dockerfile from the repo, but use the following build command to ensure it will run on any K8s environment:

```shell
docker buildx build --platform linux/amd64 -t chatbot-rag-app .
```

- Push the image to your favorite container repo, e.g. `yourimagelocation:latest`.
- In addition to step 3 above, you will also have to replace the image in `k8s-deployment-chatbot-rag-app.yaml`: replace `ghcr.io/elastic/elasticsearch-labs/chatbot-rag-app:latest` with `yourimagelocation:latest`.
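That image swap can be done with a one-line `sed`. The pattern below is demonstrated on a one-line stand-in file rather than the real manifest, and `yourimagelocation:latest` is the same placeholder as above:

```shell
# Demonstrate the substitution on a one-line stand-in for the manifest.
printf 'image: ghcr.io/elastic/elasticsearch-labs/chatbot-rag-app:latest\n' > demo.yaml
# Swap the prebuilt image reference for your own pushed image.
sed -i 's|ghcr.io/elastic/elasticsearch-labs/chatbot-rag-app:latest|yourimagelocation:latest|' demo.yaml
cat demo.yaml   # prints: image: yourimagelocation:latest
```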
To run with Docker instead of Kubernetes:

1. Get the appropriate ENV variables:
   - Find the `OTEL_EXPORTER_OTLP_ENDPOINT` and `OTEL_EXPORTER_OTLP_HEADERS` variables in your Elastic Cloud instance under Integrations --> APM.
   - Get your OpenAI key.
   - Get the Elasticsearch URL, username, and password.
2. Replace the variables in a local copy of `env.example`. DO NOT FORGET to name the copy `.env`.
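As a sketch of what that `.env` file ends up looking like (the same placeholder values as in the Kubernetes secret earlier; the variable names mirror those used there, so check `env.example` in the repo for the full list — Docker Compose only auto-loads a file named exactly `.env`):

```shell
# Write a .env file with placeholder values; substitute your own before
# running docker compose. Names mirror the Kubernetes secret shown earlier.
cat > .env <<'EOF'
ELASTICSEARCH_URL=https://yourelasticcloud.es.us-west-2.aws.found.io
ELASTICSEARCH_USER=elastic
ELASTICSEARCH_PASSWORD=elastic
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer%20xxxx
OTEL_EXPORTER_OTLP_ENDPOINT=https://12345.apm.us-west-2.aws.cloud.es.io:443
OPENAI_API_KEY=YYYYYYYY
EOF
grep -c '=' .env   # 6 lines written
```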
3. Run:

```shell
docker compose up --pull always --force-recreate
```

4. Play with the app at `localhost:4000`.
5. Log into Elastic Cloud and look for your service in APM.