textgen ollama code refactor. (opea-project#1158)
Remove the Ollama folder, since the default OpenAI API can consume the Ollama service; modify the Ollama README and add a UT.
opea-project#998
Signed-off-by: Ye, Xinyu <[email protected]>
XinyuYe-Intel authored and smguggen committed Jan 23, 2025
1 parent a7f08f2 commit b2fb42d
Showing 11 changed files with 83 additions and 118 deletions.
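The refactor rests on Ollama exposing an OpenAI-compatible endpoint that the generic textgen service can consume directly. A minimal sketch of that direct call (not part of this commit; it assumes a local Ollama server on the default port 11434 with the `llama3` model already pulled):

```bash
curl http://localhost:11434/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "llama3", "messages": [{"role": "user", "content": "What is Deep Learning?"}]}'
```

Since this is the same request shape the textgen microservice already speaks, the dedicated Ollama wrapper under `comps/llms/text-generation/ollama/langchain` becomes redundant.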
4 changes: 0 additions & 4 deletions .github/workflows/docker/compose/llms-compose.yaml
@@ -11,10 +11,6 @@ services:
build:
dockerfile: comps/llms/src/text-generation/Dockerfile.intel_hpu
image: ${REGISTRY:-opea}/llm-textgen-gaudi:${TAG:-latest}
llm-ollama:
build:
dockerfile: comps/llms/text-generation/ollama/langchain/Dockerfile
image: ${REGISTRY:-opea}/llm-ollama:${TAG:-latest}
llm-docsum:
build:
dockerfile: comps/llms/src/doc-summarization/Dockerfile
2 changes: 1 addition & 1 deletion comps/finetuning/src/README.md
@@ -244,7 +244,7 @@ curl http://${your_ip}:8015/v1/finetune/list_checkpoints -X POST -H "Content-Typ

### 3.4 Leverage fine-tuned model

After the fine-tuning job is done, the fine-tuned model can be chosen from the listed checkpoints and then used in other microservices. For example, a fine-tuned reranking model can be used in the [reranks](../../rerankings/src/README.md) microservice by assigning its path to the environment variable `RERANK_MODEL_ID`, a fine-tuned embedding model can be used in the [embeddings](../../embeddings/src/README.md) microservice by assigning its path to the environment variable `model`, and LLMs after instruction tuning can be used in the [llms](../../llms/text-generation/README.md) microservice by assigning their path to the environment variable `your_hf_llm_model`.
After the fine-tuning job is done, the fine-tuned model can be chosen from the listed checkpoints and then used in other microservices. For example, a fine-tuned reranking model can be used in the [reranks](../../rerankings/src/README.md) microservice by assigning its path to the environment variable `RERANK_MODEL_ID`, a fine-tuned embedding model can be used in the [embeddings](../../embeddings/src/README.md) microservice by assigning its path to the environment variable `model`, and LLMs after instruction tuning can be used in the [llms](../../llms/src/text-generation/README.md) microservice by assigning their path to the environment variable `your_hf_llm_model`.

## 🚀4. Descriptions for Finetuning parameters

@@ -57,18 +57,18 @@ curl --noproxy "*" http://localhost:11434/api/generate -d '{
## Build Docker Image

```bash
cd GenAIComps/
docker build --no-cache -t opea/llm-ollama:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/ollama/langchain/Dockerfile .
cd ../../../../
docker build -t opea/llm-textgen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/src/text-generation/Dockerfile .
```

## Run the Ollama Microservice

```bash
docker run --network host -e http_proxy=$http_proxy -e https_proxy=$https_proxy opea/llm-ollama:latest
docker run --network host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e LLM_ENDPOINT="http://localhost:11434" -e LLM_MODEL_ID="llama3" opea/llm-textgen:latest
```
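Not part of the diff: the run command above assumes an Ollama server is already listening on `http://localhost:11434`. If one is not running natively, the backend can be started in a container first, mirroring what the new test script in this commit does (a sketch; the container name is illustrative):

```bash
docker run -d --name ollama -p 11434:11434 ollama/ollama
docker exec ollama ollama pull llama3
```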

## Consume the Ollama Microservice

```bash
curl http://127.0.0.1:9000/v1/chat/completions -X POST -d '{"model": "llama3", "query":"What is Deep Learning?","max_tokens":32,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"stream":true}' -H 'Content-Type: application/json'
curl http://127.0.0.1:9000/v1/chat/completions -X POST -d '{"messages": [{"role": "user", "content": "What is Deep Learning?"}]}' -H 'Content-Type: application/json'
```
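Not part of the diff: the removed request body carried `"stream":true`. With the OpenAI-style payload, streaming can still be requested via the standard `stream` field (a sketch, assuming the service forwards it unchanged):

```bash
curl http://127.0.0.1:9000/v1/chat/completions \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"messages": [{"role": "user", "content": "What is Deep Learning?"}], "stream": true}'
```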
File renamed without changes.
26 changes: 0 additions & 26 deletions comps/llms/text-generation/ollama/langchain/Dockerfile

This file was deleted.

2 changes: 0 additions & 2 deletions comps/llms/text-generation/ollama/langchain/__init__.py

This file was deleted.

8 changes: 0 additions & 8 deletions comps/llms/text-generation/ollama/langchain/entrypoint.sh

This file was deleted.

60 changes: 0 additions & 60 deletions comps/llms/text-generation/ollama/langchain/llm.py

This file was deleted.

This file was deleted.

12 changes: 0 additions & 12 deletions comps/llms/text-generation/ollama/langchain/requirements.txt

This file was deleted.

78 changes: 78 additions & 0 deletions tests/llms/test_llms_text-generation_service_ollama.sh
@@ -0,0 +1,78 @@
#!/bin/bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

set -x

WORKPATH=$(dirname "$PWD")
LOG_PATH="$WORKPATH/tests"
ip_address=$(hostname -I | awk '{print $1}')
ollama_endpoint_port=11435
llm_port=9000

function build_docker_images() {
    cd $WORKPATH
    docker build --no-cache --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -t opea/llm:comps -f comps/llms/src/text-generation/Dockerfile .
    if [ $? -ne 0 ]; then
        echo "opea/llm build failed"
        exit 1
    else
        echo "opea/llm built successfully"
    fi
}

function start_service() {
    export llm_model=$1
    # Start the Ollama backend and expose it on the test port.
    docker run -d --name="test-comps-llm-ollama-endpoint" -e https_proxy=$https_proxy -p $ollama_endpoint_port:11434 ollama/ollama
    export LLM_ENDPOINT="http://${ip_address}:${ollama_endpoint_port}"

    sleep 5s
    docker exec test-comps-llm-ollama-endpoint ollama pull $llm_model
    sleep 20s

    # Start the textgen microservice pointing at the Ollama endpoint.
    unset http_proxy
    docker run -d --name="test-comps-llm-ollama-server" -p $llm_port:9000 --ipc=host -e LOGFLAG=True -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e LLM_ENDPOINT=$LLM_ENDPOINT -e LLM_MODEL_ID=$llm_model opea/llm:comps
    sleep 20s
}

function validate_microservice() {
    result=$(http_proxy="" curl http://${ip_address}:${llm_port}/v1/chat/completions \
        -X POST \
        -d '{"messages": [{"role": "user", "content": "What is Deep Learning?"}]}' \
        -H 'Content-Type: application/json')
    if [[ $result == *"content"* ]]; then
        echo "Result correct."
    else
        echo "Result wrong. Received was $result"
        docker logs test-comps-llm-ollama-endpoint >> ${LOG_PATH}/llm-ollama.log
        docker logs test-comps-llm-ollama-server >> ${LOG_PATH}/llm-server.log
        exit 1
    fi
}

function stop_docker() {
    cid=$(docker ps -aq --filter "name=test-comps-llm-ollama*")
    if [[ ! -z "$cid" ]]; then docker stop $cid && docker rm $cid && sleep 1s; fi
}

function main() {

    stop_docker
    build_docker_images

    pip install --no-cache-dir openai

    llm_models=(
        llama3.2:1b
    )
    for model in "${llm_models[@]}"; do
        start_service "${model}"
        validate_microservice
        stop_docker
    done

    echo y | docker system prune

}

main
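Not part of the file itself: a sketch of a local invocation. The script derives `WORKPATH` from the parent of the current directory and writes logs to `$WORKPATH/tests`, so it is assumed to be launched from the `tests/` directory of a GenAIComps checkout:

```bash
cd GenAIComps/tests
bash llms/test_llms_text-generation_service_ollama.sh
```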
