Update docs for LLamaGuard & WildGuard Microservice (opea-project#1259)
* working README for CLI and compose

Signed-off-by: Daniel Deleon <[email protected]>

* update for direct python execution

Signed-off-by: Daniel Deleon <[email protected]>

* fix formatting

Signed-off-by: Daniel Deleon <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bring back depends_on condition

Signed-off-by: Daniel Deleon <[email protected]>

---------

Signed-off-by: Daniel Deleon <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Abolfazl Shahbazi <[email protected]>
3 people authored Feb 10, 2025
1 parent fb86b5e commit 0df374b
Showing 2 changed files with 88 additions and 101 deletions.
185 changes: 86 additions & 99 deletions comps/guardrails/src/guardrails/README.md
@@ -9,9 +9,9 @@ The Guardrails Microservice now offers two primary types of guardrails:
- Input Guardrails: These are applied to user inputs. An input guardrail can reject the input, halting further processing.
- Output Guardrails: These are applied to outputs generated by the LLM. An output guardrail can reject the output, preventing it from being returned to the user.
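
As a minimal sketch of the input-guardrail flow, the snippet below screens a user prompt against the guardrails endpoint before it would be forwarded to an LLM. It assumes the microservice is already running on `${GUARDRAIL_PORT}` (set later in this guide) and that a rejected input contains the `Violated policies` string shown in the consumption example at the end of this README.

```bash
# Sketch only: pre-screen a prompt with the guardrails microservice.
# Assumes the service is reachable at localhost:${GUARDRAIL_PORT} and that
# rejections include the "Violated policies" string used later in this guide.
PROMPT="How do you buy a tiger in the US?"
RESULT=$(curl -s http://localhost:${GUARDRAIL_PORT}/v1/guardrails \
  -X POST \
  -d "{\"text\":\"${PROMPT}\",\"parameters\":{\"max_new_tokens\":32}}" \
  -H 'Content-Type: application/json')

if echo "$RESULT" | grep -q "Violated policies"; then
  echo "Input rejected by the guardrail: $RESULT"
else
  echo "Input passed the guardrail; forward it to the LLM."
fi
```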

## LlamaGuard
**This microservice supports Meta's [Llama Guard](https://huggingface.co/meta-llama/Meta-Llama-Guard-2-8B) and Allen Institute for AI's [WildGuard](https://huggingface.co/allenai/wildguard) models.**

We offer content moderation support utilizing Meta's [Llama Guard](https://huggingface.co/meta-llama/Meta-Llama-Guard-2-8B) model.
## Llama Guard

Any content that is detected in the following categories is determined to be unsafe:

@@ -22,111 +22,84 @@ Any content that is detected in the following categories is determined to be unsafe:
- Regulated or Controlled Substances
- Suicide & Self Harm

### 🚀1. Start Microservice with Python (Option 1)

To start the Guardrails microservice, you need to install the required Python packages first.
## WildGuard

#### 1.1 Install Requirements
`allenai/wildguard` was fine-tuned from `mistralai/Mistral-7B-v0.3` on their own [`allenai/wildguardmix`](https://huggingface.co/datasets/allenai/wildguardmix) dataset. Any content that is detected in the following categories is determined to be unsafe:

```bash
pip install -r requirements.txt
```
- Privacy
- Misinformation
- Harmful Language
- Malicious Uses

#### 1.2 Start TGI Gaudi Service
## Clone OPEA GenAIComps and set initial environment variables

```bash
export HF_TOKEN=${your_hf_api_token}
volume=$PWD/data
model_id="meta-llama/Meta-Llama-Guard-2-8B"
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.5
docker run -p 8088:80 -v $volume:/data --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host -e HTTPS_PROXY=$https_proxy -e HTTP_PROXY=$https_proxy -e HF_TOKEN=$HF_TOKEN ghcr.io/huggingface/tgi-gaudi:2.0.5 --model-id $model_id --max-input-length 1024 --max-total-tokens 2048
git clone https://github.com/opea-project/GenAIComps.git
export OPEA_GENAICOMPS_ROOT=$(pwd)/GenAIComps
export GUARDRAIL_PORT=9090
```

#### 1.3 Verify the TGI Gaudi Service
## Start up the HuggingFace Text Generation Inference (TGI) Server

```bash
curl 127.0.0.1:8088/generate \
-X POST \
-d '{"inputs":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
-H 'Content-Type: application/json'
```
Before starting the guardrail service, we first need to start the TGI server that will be hosting the guardrail model.

#### 1.4 Start Guardrails Service
Choose one of the following before starting your TGI server.

Optional: If you have deployed a Guardrails model with the TGI Gaudi Service other than the default model (i.e., `meta-llama/Meta-Llama-Guard-2-8B`) [from section 1.2](#12-start-tgi-gaudi-service), you will need to set the environment variable `SAFETY_GUARD_MODEL_ID` to that model id. For example, the following informs the Guardrails Service that the deployed model is LlamaGuard2:
**For LlamaGuard:**

```bash
export SAFETY_GUARD_MODEL_ID="meta-llama/Meta-Llama-Guard-2-8B"
export GUARDRAILS_COMPONENT_NAME=OPEA_LLAMA_GUARD
```

Or

```bash
export SAFETY_GUARD_ENDPOINT="http://${your_ip}:8088"
python guardrails_tgi.py
export SAFETY_GUARD_MODEL_ID="meta-llama/LlamaGuard-7b"
export GUARDRAILS_COMPONENT_NAME=OPEA_LLAMA_GUARD
```

### 🚀2. Start Microservice with Docker (Option 2)

If you start a Guardrails microservice with Docker, the `docker_compose_guardrails.yaml` file will automatically start a TGI Gaudi service with Docker.
_Other LlamaGuard variants can also be used, but they are not guaranteed to work out of the box._

#### 2.1 Setup Environment Variables

In order to start the TGI and LLM services, you need to set up the following environment variables first.
**For WildGuard:**

```bash
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export SAFETY_GUARD_ENDPOINT="http://${your_ip}:8088"
export LLM_MODEL_ID=${your_hf_llm_model}
export SAFETY_GUARD_MODEL_ID="allenai/wildguard"
export GUARDRAILS_COMPONENT_NAME=OPEA_WILD_GUARD
```

#### 2.2 Build Docker Image

```bash
cd ../../../../
docker build -t opea/guardrails:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/guardrails/src/guardrails/Dockerfile .
```
_Note that both of these models are gated; you must complete the access request form on each model's Hugging Face page before you can use them with your HF token._
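
Once access has been granted, export your Hugging Face token so the commands below can pass it through to the containers (substitute your own token):

```bash
export HF_TOKEN=${your_hf_api_token}
```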

#### 2.3 Run Docker with CLI
Follow the steps [here](https://github.com/opea-project/GenAIComps/tree/main/comps/third_parties/tgi) to start the TGI server container, setting `LLM_MODEL_ID` to your `SAFETY_GUARD_MODEL_ID` as shown below:

```bash
docker run -d --name="guardrails-tgi-server" -p 9090:9090 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e SAFETY_GUARD_ENDPOINT=$SAFETY_GUARD_ENDPOINT -e HUGGINGFACEHUB_API_TOKEN=$HUGGINGFACEHUB_API_TOKEN opea/guardrails:latest
export LLM_MODEL_ID=$SAFETY_GUARD_MODEL_ID
```

#### 2.4 Run Docker with Docker Compose
Once the container has started and is loading the model, set the endpoint that you will use to make requests to the TGI server:

```bash
cd deployment/docker_compose/
docker compose -f compose_llamaguard.yaml up -d
export SAFETY_GUARD_ENDPOINT="http://${host_ip}:${LLM_ENDPOINT_PORT}"
```
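
`host_ip` and `LLM_ENDPOINT_PORT` are not set by the earlier steps; they should match the host address and the port published by your TGI container. For example (assumed values, adjust to your deployment):

```bash
# Assumed example values; adjust to your own deployment.
export host_ip=$(hostname -I | awk '{print $1}')
export LLM_ENDPOINT_PORT=8088
export SAFETY_GUARD_ENDPOINT="http://${host_ip}:${LLM_ENDPOINT_PORT}"
```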

### 🚀3. Consume Guardrails Service
**Verify that the TGI Server is ready for inference**

#### 3.1 Check Service Status
First, check that the TGI server has successfully loaded the guardrail model. Loading the model can take 5-10 minutes. You can check by running the following:

```bash
curl http://localhost:9090/v1/health_check\
-X GET \
-H 'Content-Type: application/json'
docker logs tgi-gaudi-server
```

#### 3.2 Consume Guardrails Service
If the last line of the log contains something like `INFO text_generation_router::server: router/src/server.rs:2209: Connected`, then your TGI server is ready and the following curl request should work:

```bash
curl http://localhost:9090/v1/guardrails\
curl localhost:${LLM_ENDPOINT_PORT}/generate \
-X POST \
-d '{"text":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
-d '{"inputs":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
-H 'Content-Type: application/json'
```

## WildGuard

We also offer content moderation support utilizing Allen Institute for AI's [WildGuard](https://huggingface.co/allenai/wildguard) model.

`allenai/wildguard` was fine-tuned from `mistralai/Mistral-7B-v0.3` on their own [`allenai/wildguardmix`](https://huggingface.co/datasets/allenai/wildguardmix) dataset. Any content that is detected in the following categories is determined as unsafe:

- Privacy
- Misinformation
- Harmful Language
- Malicious Uses
Check the logs again with the `docker logs` command to confirm that the curl request resulted in `Success`.
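
One way to do this, using the same container name as above:

```bash
# Show the most recent TGI log lines; a successful generate request should be logged with "Success".
docker logs --tail 20 tgi-gaudi-server
```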

### 🚀1. Start Microservice with Python (Option 1)

@@ -135,84 +108,98 @@ To start the Guardrails microservice, you need to install the required Python packages first.
#### 1.1 Install Requirements

```bash
pip install $OPEA_GENAICOMPS_ROOT
cd $OPEA_GENAICOMPS_ROOT/comps/guardrails/src/guardrails
pip install -r requirements.txt
```

#### 1.2 Start TGI Gaudi Service
#### 1.2 Start Guardrails Service

```bash
export HF_TOKEN=${your_hf_api_token}
volume=$PWD/data
model_id="allenai/wildguard"
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.1
docker run -p 8088:80 -v $volume:/data --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host -e HTTPS_PROXY=$https_proxy -e HTTP_PROXY=$https_proxy -e HF_TOKEN=$HF_TOKEN ghcr.io/huggingface/tgi-gaudi:2.0.1 --model-id $model_id --max-input-length 1024 --max-total-tokens 2048
python opea_guardrails_microservice.py
```

#### 1.3 Verify the TGI Gaudi Service
### 🚀2. Start Microservice with Docker (Option 2)

```bash
curl 127.0.0.1:8088/generate \
-X POST \
-d '{"inputs":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
-H 'Content-Type: application/json'
```
With the TGI server already running, we can now start the guardrail service container.

#### 1.4 Start Guardrails Service
#### 2.1 Build Docker Image

```bash
export SAFETY_GUARD_ENDPOINT="http://${your_ip}:8088"
python guardrails_tgi.py
cd $OPEA_GENAICOMPS_ROOT
docker build -t opea/guardrails:latest \
--build-arg https_proxy=$https_proxy \
--build-arg http_proxy=$http_proxy \
-f comps/guardrails/src/guardrails/Dockerfile .
```

### 🚀2. Start Microservice with Docker (Option 2)

If you start a Guardrails microservice with Docker, the `compose_wildguard.yaml` file will automatically start a TGI Gaudi service with Docker.

#### 2.1 Setup Environment Variables
#### 2.2.a Run with Docker Compose (Option A)

In order to start the TGI and LLM services, you need to set up the following environment variables first.
**To run with Llama Guard:**

```bash
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export SAFETY_GUARD_ENDPOINT="http://${your_ip}:8088"
export LLM_MODEL_ID=${your_hf_llm_model}
docker compose -f $OPEA_GENAICOMPS_ROOT/comps/guardrails/deployment/docker_compose/compose.yaml up -d llamaguard-guardrails-server
```

#### 2.2 Build Docker Image
**To run with WildGuard:**

```bash
cd ../../../../
docker build -t opea/guardrails:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/guardrails/src/guardrails/Dockerfile .
docker compose -f $OPEA_GENAICOMPS_ROOT/comps/guardrails/deployment/docker_compose/compose.yaml up -d wildguard-guardrails-server
```

#### 2.3 Run Docker with CLI
#### 2.2.b Run Docker with CLI (Option B)

**To run with Llama Guard:**

```bash
docker run -d --name="guardrails-tgi-server" -p 9090:9090 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e SAFETY_GUARD_ENDPOINT=$SAFETY_GUARD_ENDPOINT -e HUGGINGFACEHUB_API_TOKEN=$HUGGINGFACEHUB_API_TOKEN -e GUARDRAILS_COMPONENT_NAME="OPEA_WILD_GUARD" opea/guardrails:latest
docker run -d \
--name="llamaguard-guardrails-server" \
-p ${GUARDRAIL_PORT}:${GUARDRAIL_PORT} \
--ipc=host \
-e http_proxy=$http_proxy \
-e https_proxy=$https_proxy \
-e no_proxy=$no_proxy \
-e SAFETY_GUARD_ENDPOINT=$SAFETY_GUARD_ENDPOINT \
-e HUGGINGFACEHUB_API_TOKEN=$HF_TOKEN \
opea/guardrails:latest
```

#### 2.4 Run Docker with Docker Compose
**To run with WildGuard:**

```bash
cd deployment/docker_compose/
docker compose -f compose_wildguard.yaml up -d
docker run -d \
--name="wildguard-guardrails-server" \
-p ${GUARDRAIL_PORT}:${GUARDRAIL_PORT} \
--ipc=host \
-e http_proxy=$http_proxy \
-e https_proxy=$https_proxy \
-e no_proxy=$no_proxy \
-e SAFETY_GUARD_ENDPOINT=$SAFETY_GUARD_ENDPOINT \
-e HUGGINGFACEHUB_API_TOKEN=$HF_TOKEN \
-e GUARDRAILS_COMPONENT_NAME="OPEA_WILD_GUARD" \
opea/guardrails:latest
```
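
Whichever option you used, you can confirm that the guardrails container is up and running with a standard Docker check (container names as used above):

```bash
# List running guardrails containers and their status.
docker ps --filter "name=guardrails-server" --format "table {{.Names}}\t{{.Status}}"
```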

### 🚀3. Consume Guardrails Service

#### 3.1 Check Service Status

```bash
curl http://localhost:9090/v1/health_check \
curl http://localhost:${GUARDRAIL_PORT}/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
```

#### 3.2 Consume Guardrails Service

```bash
curl http://localhost:9090/v1/guardrails \
curl http://localhost:${GUARDRAIL_PORT}/v1/guardrails \
-X POST \
-d '{"text":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
-H 'Content-Type: application/json'
```

This request should return text containing:
`"Violated policies: <category>, please check your input."`

Where `category` is `Violent Crimes` or `harmful` for `Llama-Guard-2-8B` or `wildguard`, respectively.
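
For example, with the Llama Guard model the response for the tiger prompt above should contain something like:

```
"Violated policies: Violent Crimes, please check your input."
```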
4 changes: 2 additions & 2 deletions comps/third_parties/tgi/README.md
@@ -19,12 +19,12 @@ Run tgi on xeon.

```bash
cd deployment/docker_compose
docker compose -f compose.yaml tgi-server up -d
docker compose -f compose.yaml up -d tgi-server
```

Run tgi on gaudi.

```bash
cd deployment/docker_compose
docker compose -f compose.yaml tgi-gaudi-server up -d
docker compose -f compose.yaml up -d tgi-gaudi-server
```
