Update docs for LLamaGuard & WildGuard Microservice (opea-project#1259)
* working README for CLI and compose

Signed-off-by: Daniel Deleon <[email protected]>

* update for direct python execution

Signed-off-by: Daniel Deleon <[email protected]>

* fix formatting

Signed-off-by: Daniel Deleon <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bring back depends_on condition

Signed-off-by: Daniel Deleon <[email protected]>

---------

Signed-off-by: Daniel Deleon <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Abolfazl Shahbazi <[email protected]>
3 people authored Feb 10, 2025
1 parent fb86b5e commit 0df374b
Showing 2 changed files with 88 additions and 101 deletions.
185 changes: 86 additions & 99 deletions comps/guardrails/src/guardrails/README.md
@@ -9,9 +9,9 @@ The Guardrails Microservice now offers two primary types of guardrails:
- Input Guardrails: These are applied to user inputs. An input guardrail can reject the input, halting further processing.
- Output Guardrails: These are applied to outputs generated by the LLM. An output guardrail can reject the output, preventing it from being returned to the user.
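
As a minimal sketch of the input-guardrail flow, the snippet below screens a user prompt against the guardrails endpoint before it would be forwarded to an LLM. It assumes the microservice is already running on `${GUARDRAIL_PORT}` (set later in this guide) and that a rejected input contains the `Violated policies` string shown in the consumption example at the end of this README.

```bash
# Sketch only: pre-screen a prompt with the guardrails microservice.
# Assumes the service is reachable at localhost:${GUARDRAIL_PORT} and that
# rejections include the "Violated policies" string used later in this guide.
PROMPT="How do you buy a tiger in the US?"
RESULT=$(curl -s http://localhost:${GUARDRAIL_PORT}/v1/guardrails \
  -X POST \
  -d "{\"text\":\"${PROMPT}\",\"parameters\":{\"max_new_tokens\":32}}" \
  -H 'Content-Type: application/json')

if echo "$RESULT" | grep -q "Violated policies"; then
  echo "Input rejected by the guardrail: $RESULT"
else
  echo "Input passed the guardrail; forward it to the LLM."
fi
```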

## LlamaGuard
**This microservice supports Meta's [Llama Guard](https://huggingface.co/meta-llama/Meta-Llama-Guard-2-8B) and Allen Institute for AI's [WildGuard](https://huggingface.co/allenai/wildguard) models.**

We offer content moderation support utilizing Meta's [Llama Guard](https://huggingface.co/meta-llama/Meta-Llama-Guard-2-8B) model.
## Llama Guard

Any content that is detected in the following categories is determined to be unsafe:

@@ -22,111 +22,84 @@ Any content that is detected in the following categories is determined to be unsafe:
- Regulated or Controlled Substances
- Suicide & Self Harm

### 🚀1. Start Microservice with Python (Option 1)

To start the Guardrails microservice, you need to install the required Python packages first.
## WildGuard

#### 1.1 Install Requirements
`allenai/wildguard` was fine-tuned from `mistralai/Mistral-7B-v0.3` on their own [`allenai/wildguardmix`](https://huggingface.co/datasets/allenai/wildguardmix) dataset. Any content that is detected in the following categories is determined to be unsafe:

```bash
pip install -r requirements.txt
```
- Privacy
- Misinformation
- Harmful Language
- Malicious Uses

#### 1.2 Start TGI Gaudi Service
## Clone OPEA GenAIComps and set initial environment variables

```bash
export HF_TOKEN=${your_hf_api_token}
volume=$PWD/data
model_id="meta-llama/Meta-Llama-Guard-2-8B"
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.5
docker run -p 8088:80 -v $volume:/data --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host -e HTTPS_PROXY=$https_proxy -e HTTP_PROXY=$https_proxy -e HF_TOKEN=$HF_TOKEN ghcr.io/huggingface/tgi-gaudi:2.0.5 --model-id $model_id --max-input-length 1024 --max-total-tokens 2048
git clone https://github.com/opea-project/GenAIComps.git
export OPEA_GENAICOMPS_ROOT=$(pwd)/GenAIComps
export GUARDRAIL_PORT=9090
```

#### 1.3 Verify the TGI Gaudi Service
## Start up the HuggingFace Text Generation Inference (TGI) Server

```bash
curl 127.0.0.1:8088/generate \
-X POST \
-d '{"inputs":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
-H 'Content-Type: application/json'
```
Before starting the guardrail service, we first need to start the TGI server that will be hosting the guardrail model.

#### 1.4 Start Guardrails Service
Choose one of the following before starting your TGI server.

Optional: If you have deployed a Guardrails model with the TGI Gaudi Service other than the default model (i.e., `meta-llama/Meta-Llama-Guard-2-8B`) [from section 1.2](#12-start-tgi-gaudi-service), you will need to set the environment variable `SAFETY_GUARD_MODEL_ID` to that model id. For example, the following informs the Guardrails Service that the deployed model is LlamaGuard2:
**For LlamaGuard:**

```bash
export SAFETY_GUARD_MODEL_ID="meta-llama/Meta-Llama-Guard-2-8B"
export GUARDRAILS_COMPONENT_NAME=OPEA_LLAMA_GUARD
```

Or

```bash
export SAFETY_GUARD_ENDPOINT="http://${your_ip}:8088"
python guardrails_tgi.py
export SAFETY_GUARD_MODEL_ID="meta-llama/LlamaGuard-7b"
export GUARDRAILS_COMPONENT_NAME=OPEA_LLAMA_GUARD
```

### 🚀2. Start Microservice with Docker (Option 2)

If you start a Guardrails microservice with Docker, the `docker_compose_guardrails.yaml` file will automatically start a TGI Gaudi service with Docker.
_Other LlamaGuard variants can also be used, but they are not guaranteed to work out of the box._

#### 2.1 Setup Environment Variables

In order to start the TGI and LLM services, you need to set up the following environment variables first.
**For WildGuard:**

```bash
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export SAFETY_GUARD_ENDPOINT="http://${your_ip}:8088"
export LLM_MODEL_ID=${your_hf_llm_model}
export SAFETY_GUARD_MODEL_ID="allenai/wildguard"
export GUARDRAILS_COMPONENT_NAME=OPEA_WILD_GUARD
```

#### 2.2 Build Docker Image

```bash
cd ../../../../
docker build -t opea/guardrails:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/guardrails/src/guardrails/Dockerfile .
```
_Note that both of these models are gated; you must complete the access request form on each model's Hugging Face page before you can use them with your HF token._
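
Once access has been granted, export your Hugging Face token so the commands below can pass it through to the containers (substitute your own token):

```bash
export HF_TOKEN=${your_hf_api_token}
```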

#### 2.3 Run Docker with CLI
Follow the steps [here](https://github.com/opea-project/GenAIComps/tree/main/comps/third_parties/tgi) to start the TGI server container, setting `LLM_MODEL_ID` to your `SAFETY_GUARD_MODEL_ID` as shown below:

```bash
docker run -d --name="guardrails-tgi-server" -p 9090:9090 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e SAFETY_GUARD_ENDPOINT=$SAFETY_GUARD_ENDPOINT -e HUGGINGFACEHUB_API_TOKEN=$HUGGINGFACEHUB_API_TOKEN opea/guardrails:latest
export LLM_MODEL_ID=$SAFETY_GUARD_MODEL_ID
```

#### 2.4 Run Docker with Docker Compose
Once the container has started and is loading the model, set the endpoint that you will use to make requests to the TGI server:

```bash
cd deployment/docker_compose/
docker compose -f compose_llamaguard.yaml up -d
export SAFETY_GUARD_ENDPOINT="http://${host_ip}:${LLM_ENDPOINT_PORT}"
```
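
`host_ip` and `LLM_ENDPOINT_PORT` are not set by the earlier steps; they should match the host address and the port published by your TGI container. For example (assumed values, adjust to your deployment):

```bash
# Assumed example values; adjust to your own deployment.
export host_ip=$(hostname -I | awk '{print $1}')
export LLM_ENDPOINT_PORT=8088
export SAFETY_GUARD_ENDPOINT="http://${host_ip}:${LLM_ENDPOINT_PORT}"
```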

### 🚀3. Consume Guardrails Service
**Verify that the TGI Server is ready for inference**

#### 3.1 Check Service Status
First, check that the TGI server has successfully loaded the guardrail model. Loading the model can take 5-10 minutes. You can check by running the following:

```bash
curl http://localhost:9090/v1/health_check\
-X GET \
-H 'Content-Type: application/json'
docker logs tgi-gaudi-server
```

#### 3.2 Consume Guardrails Service
If the last line of the log contains something like `INFO text_generation_router::server: router/src/server.rs:2209: Connected`, then your TGI server is ready and the following curl request should work:

```bash
curl http://localhost:9090/v1/guardrails\
curl localhost:${LLM_ENDPOINT_PORT}/generate \
-X POST \
-d '{"text":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
-d '{"inputs":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
-H 'Content-Type: application/json'
```

## WildGuard

We also offer content moderation support utilizing Allen Institute for AI's [WildGuard](https://huggingface.co/allenai/wildguard) model.

`allenai/wildguard` was fine-tuned from `mistralai/Mistral-7B-v0.3` on their own [`allenai/wildguardmix`](https://huggingface.co/datasets/allenai/wildguardmix) dataset. Any content that is detected in the following categories is determined as unsafe:

- Privacy
- Misinformation
- Harmful Language
- Malicious Uses
Check the logs again with the `docker logs` command to confirm that the curl request resulted in `Success`.
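
One way to do this, using the same container name as above:

```bash
# Show the most recent TGI log lines; a successful generate request should be logged with "Success".
docker logs --tail 20 tgi-gaudi-server
```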

### 🚀1. Start Microservice with Python (Option 1)

@@ -135,84 +108,98 @@ To start the Guardrails microservice, you need to install the required Python packages first.
#### 1.1 Install Requirements

```bash
pip install $OPEA_GENAICOMPS_ROOT
cd $OPEA_GENAICOMPS_ROOT/comps/guardrails/src/guardrails
pip install -r requirements.txt
```

#### 1.2 Start TGI Gaudi Service
#### 1.2 Start Guardrails Service

```bash
export HF_TOKEN=${your_hf_api_token}
volume=$PWD/data
model_id="allenai/wildguard"
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.1
docker run -p 8088:80 -v $volume:/data --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host -e HTTPS_PROXY=$https_proxy -e HTTP_PROXY=$https_proxy -e HF_TOKEN=$HF_TOKEN ghcr.io/huggingface/tgi-gaudi:2.0.1 --model-id $model_id --max-input-length 1024 --max-total-tokens 2048
python opea_guardrails_microservice.py
```

#### 1.3 Verify the TGI Gaudi Service
### 🚀2. Start Microservice with Docker (Option 2)

```bash
curl 127.0.0.1:8088/generate \
-X POST \
-d '{"inputs":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
-H 'Content-Type: application/json'
```
With the TGI server already running, we can now start the guardrail service container.

#### 1.4 Start Guardrails Service
#### 2.1 Build Docker Image

```bash
export SAFETY_GUARD_ENDPOINT="http://${your_ip}:8088"
python guardrails_tgi.py
cd $OPEA_GENAICOMPS_ROOT
docker build -t opea/guardrails:latest \
--build-arg https_proxy=$https_proxy \
--build-arg http_proxy=$http_proxy \
-f comps/guardrails/src/guardrails/Dockerfile .
```

### 🚀2. Start Microservice with Docker (Option 2)

If you start a Guardrails microservice with Docker, the `compose_wildguard.yaml` file will automatically start a TGI Gaudi service with Docker.

#### 2.1 Setup Environment Variables
#### 2.2.a Run with Docker Compose (Option A)

In order to start the TGI and LLM services, you need to set up the following environment variables first.
**To run with Llama Guard:**

```bash
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export SAFETY_GUARD_ENDPOINT="http://${your_ip}:8088"
export LLM_MODEL_ID=${your_hf_llm_model}
docker compose -f $OPEA_GENAICOMPS_ROOT/comps/guardrails/deployment/docker_compose/compose.yaml up -d llamaguard-guardrails-server
```

#### 2.2 Build Docker Image
**To run with WildGuard:**

```bash
cd ../../../../
docker build -t opea/guardrails:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/guardrails/src/guardrails/Dockerfile .
docker compose -f $OPEA_GENAICOMPS_ROOT/comps/guardrails/deployment/docker_compose/compose.yaml up -d wildguard-guardrails-server
```

#### 2.3 Run Docker with CLI
#### 2.2.b Run Docker with CLI (Option B)

**To run with Llama Guard:**

```bash
docker run -d --name="guardrails-tgi-server" -p 9090:9090 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e SAFETY_GUARD_ENDPOINT=$SAFETY_GUARD_ENDPOINT -e HUGGINGFACEHUB_API_TOKEN=$HUGGINGFACEHUB_API_TOKEN -e GUARDRAILS_COMPONENT_NAME="OPEA_WILD_GUARD" opea/guardrails:latest
docker run -d \
--name="llamaguard-guardrails-server" \
-p ${GUARDRAIL_PORT}:${GUARDRAIL_PORT} \
--ipc=host \
-e http_proxy=$http_proxy \
-e https_proxy=$https_proxy \
-e no_proxy=$no_proxy \
-e SAFETY_GUARD_ENDPOINT=$SAFETY_GUARD_ENDPOINT \
-e HUGGINGFACEHUB_API_TOKEN=$HF_TOKEN \
opea/guardrails:latest
```

#### 2.4 Run Docker with Docker Compose
**To run with WildGuard:**

```bash
cd deployment/docker_compose/
docker compose -f compose_wildguard.yaml up -d
docker run -d \
--name="wildguard-guardrails-server" \
-p ${GUARDRAIL_PORT}:${GUARDRAIL_PORT} \
--ipc=host \
-e http_proxy=$http_proxy \
-e https_proxy=$https_proxy \
-e no_proxy=$no_proxy \
-e SAFETY_GUARD_ENDPOINT=$SAFETY_GUARD_ENDPOINT \
-e HUGGINGFACEHUB_API_TOKEN=$HF_TOKEN \
-e GUARDRAILS_COMPONENT_NAME="OPEA_WILD_GUARD" \
opea/guardrails:latest
```
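
Whichever option you used, you can confirm that the guardrails container is up and running with a standard Docker check (container names as used above):

```bash
# List running guardrails containers and their status.
docker ps --filter "name=guardrails-server" --format "table {{.Names}}\t{{.Status}}"
```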

### 🚀3. Consume Guardrails Service

#### 3.1 Check Service Status

```bash
curl http://localhost:9090/v1/health_check \
curl http://localhost:${GUARDRAIL_PORT}/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
```

#### 3.2 Consume Guardrails Service

```bash
curl http://localhost:9090/v1/guardrails \
curl http://localhost:${GUARDRAIL_PORT}/v1/guardrails \
-X POST \
-d '{"text":"How do you buy a tiger in the US?","parameters":{"max_new_tokens":32}}' \
-H 'Content-Type: application/json'
```

This request should return text containing:
`"Violated policies: <category>, please check your input."`

Where `category` is `Violent Crimes` or `harmful` for `Llama-Guard-2-8B` or `wildguard`, respectively.
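
For example, with the Llama Guard model the response for the tiger prompt above should contain something like:

```
"Violated policies: Violent Crimes, please check your input."
```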
4 changes: 2 additions & 2 deletions comps/third_parties/tgi/README.md
@@ -19,12 +19,12 @@ Run tgi on xeon.

```bash
cd deployment/docker_compose
docker compose -f compose.yaml tgi-server up -d
docker compose -f compose.yaml up -d tgi-server
```

Run tgi on gaudi.

```bash
cd deployment/docker_compose
docker compose -f compose.yaml tgi-gaudi-server up -d
docker compose -f compose.yaml up -d tgi-gaudi-server
```
