Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GraphRAG QA Example #1036

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions GraphRAG/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
.env
**/__pycache__/
**/.venv/
180 changes: 180 additions & 0 deletions GraphRAG/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,180 @@
# GraphRAG for Question Answering

## Overview

A modified example of AgentQnA using GraphRAG tools.

This example showcases a hierarchical multi-agent system for question-answering applications. The architecture diagram is shown below. The supervisor agent interfaces with the user and dispatch tasks to the worker agent and other tools to gather information and come up with answers. The worker agent uses the retrieval tool to generate answers to the queries posted by the supervisor agent. Other tools used by the supervisor agent may include APIs to interface knowledge graphs, SQL databases, external knowledge bases, etc.
![Architecture Overview](assets/agent_qna_arch.png)

The AgentQnA example is implemented using the component-level microservices defined in [GenAIComps](https://github.com/opea-project/GenAIComps). The flow chart below shows the information flow between different microservices for this example.

```mermaid
---
config:
flowchart:
nodeSpacing: 400
rankSpacing: 100
curve: linear
themeVariables:
fontSize: 50px
---
flowchart LR
%% Colors %%
classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef invisible fill:transparent,stroke:transparent;

%% Subgraphs %%
subgraph DocIndexRetriever-MegaService["DocIndexRetriever MegaService "]
direction LR
EM([Embedding MicroService]):::blue
RET([Retrieval MicroService]):::blue
RER([Rerank MicroService]):::blue
end
subgraph UserInput[" User Input "]
direction LR
a([User Input Query]):::orchid
Ingest([Ingest data]):::orchid
end
AG_REACT([Agent MicroService - react]):::blue
AG_RAG([Agent MicroService - rag]):::blue
LLM_gen{{LLM Service <br>}}
DP([Data Preparation MicroService]):::blue
TEI_RER{{Reranking service<br>}}
TEI_EM{{Embedding service <br>}}
VDB{{Vector DB<br><br>}}
R_RET{{Retriever service <br>}}



%% Questions interaction
direction LR
a[User Input Query] --> AG_REACT
AG_REACT --> AG_RAG
AG_RAG --> DocIndexRetriever-MegaService
EM ==> RET
RET ==> RER
Ingest[Ingest data] --> DP

%% Embedding service flow
direction LR
AG_RAG <-.-> LLM_gen
AG_REACT <-.-> LLM_gen
EM <-.-> TEI_EM
RET <-.-> R_RET
RER <-.-> TEI_RER

direction TB
%% Vector DB interaction
R_RET <-.-> VDB
DP <-.-> VDB


```

### Why Agent for question answering?

1. Improve relevancy of retrieved context.
Agent can rephrase user queries, decompose user queries, and iterate to get the most relevant context for answering user's questions. Compared to conventional RAG, RAG agent can significantly improve the correctness and relevancy of the answer.
2. Use tools to get additional knowledge.
For example, knowledge graphs and SQL databases can be exposed as APIs for Agents to gather knowledge that may be missing in the retrieval vector database.
3. Hierarchical agent can further improve performance.
Expert worker agents, such as retrieval agent, knowledge graph agent, SQL agent, etc., can provide high-quality output for different aspects of a complex query, and the supervisor agent can aggregate the information together to provide a comprehensive answer.

### Roadmap

- v0.9: Worker agent uses open-source websearch tool (duckduckgo), agents use OpenAI GPT-4o-mini as llm backend.
- v1.0: Worker agent uses OPEA retrieval megaservice as tool.
- v1.0 or later: agents use open-source llm backend.
- v1.1 or later: add safeguards

## Getting started - standalone

For rapid prototyping, the `standalone` directory contains a monolithic front-end/back-end example
and some exploratory python notebooks.

## Getting started - microservices

1. Build agent docker image </br>
First, clone the opea GenAIComps repo

```
export WORKDIR=<your-work-directory>
cd $WORKDIR
git clone https://github.com/opea-project/GenAIComps.git
```

Then build the agent docker image. Both the supervisor agent and the worker agent will use the same docker image, but when we launch the two agents we will specify different strategies and register different tools.

```
cd GenAIComps
docker build -t opea/agent-langchain:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/agent/langchain/Dockerfile .
```

2. Launch tool services </br>
In this example, we will use some of the mock APIs provided in the Meta CRAG KDD Challenge to demonstrate the benefits of gaining additional context from mock knowledge graphs.

```
docker run -d -p=8080:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
```

3. Set up environment for this example </br>
First, clone this repo

```
cd $WORKDIR
git clone https://github.com/opea-project/GenAIExamples.git
```

Second, set up env vars

```
export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
# optional: OPANAI_API_KEY
export OPENAI_API_KEY=<your-openai-key>
```

4. Launch agent services</br>
The configurations of the supervisor agent and the worker agent are defined in the docker-compose yaml file. We currently use openAI GPT-4o-mini as LLM, and we plan to add support for llama3.1-70B-instruct (served by TGI-Gaudi) in a subsequent release.
To use openai llm, run command below.

```
cd docker_compose/intel/cpu/xeon
bash launch_agent_service_openai.sh
```

## Validate services

First look at logs of the agent docker containers:

```
docker logs docgrader-agent-endpoint
```

```
docker logs react-agent-endpoint
```

You should see something like "HTTP server setup successful" if the docker containers are started successfully.</p>

Second, validate worker agent:

```
curl http://${ip_address}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
}'
```

Third, validate supervisor agent:

```
curl http://${ip_address}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
}'
```

## How to register your own tools with agent

You can take a look at the tools yaml and python files in this example. For more details, please refer to the "Provide your own tools" section in the instructions [here](https://github.com/opea-project/GenAIComps/tree/main/comps/agent/langchain/README.md#5-customize-agent-strategy).
Binary file added GraphRAG/assets/agent_qna_arch.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
65 changes: 65 additions & 0 deletions GraphRAG/docker_compose/intel/cpu/xeon/compose_openai.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
worker-docgrader-agent:
image: opea/agent-langchain:latest
container_name: docgrader-agent-endpoint
volumes:
- ${WORKDIR}/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
- ${TOOLSET_PATH}:/home/user/tools/
ports:
- "9095:9095"
ipc: host
environment:
ip_address: ${ip_address}
strategy: rag_agent
recursion_limit: ${recursion_limit_worker}
llm_engine: openai
OPENAI_API_KEY: ${OPENAI_API_KEY}
model: ${model}
temperature: ${temperature}
max_new_tokens: ${max_new_tokens}
streaming: false
tools: /home/user/tools/worker_agent_tools.yaml
require_human_feedback: false
RETRIEVAL_TOOL_URL: ${RETRIEVAL_TOOL_URL}
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY}
LANGCHAIN_TRACING_V2: ${LANGCHAIN_TRACING_V2}
LANGCHAIN_PROJECT: "opea-worker-agent-service"
port: 9095


supervisor-react-agent:
image: opea/agent-langchain:latest
container_name: react-agent-endpoint
volumes:
- ${WORKDIR}/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
- ${TOOLSET_PATH}:/home/user/tools/
ports:
- "9090:9090"
ipc: host
environment:
ip_address: ${ip_address}
strategy: react_langgraph
recursion_limit: ${recursion_limit_supervisor}
llm_engine: openai
OPENAI_API_KEY: ${OPENAI_API_KEY}
model: ${model}
temperature: ${temperature}
max_new_tokens: ${max_new_tokens}
streaming: false
tools: /home/user/tools/supervisor_agent_tools.yaml
require_human_feedback: false
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY}
LANGCHAIN_TRACING_V2: ${LANGCHAIN_TRACING_V2}
LANGCHAIN_PROJECT: "opea-supervisor-agent-service"
CRAG_SERVER: $CRAG_SERVER
WORKER_AGENT_URL: $WORKER_AGENT_URL
port: 9090
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
export ip_address=$(hostname -I | awk '{print $1}')
export recursion_limit_worker=12
export recursion_limit_supervisor=10
export model="gpt-4o-mini-2024-07-18"
export temperature=0
export max_new_tokens=512
export OPENAI_API_KEY=${OPENAI_API_KEY}
export WORKER_AGENT_URL="http://${ip_address}:9095/v1/chat/completions"
export RETRIEVAL_TOOL_URL="http://${ip_address}:8889/v1/retrievaltool"
export CRAG_SERVER=http://${ip_address}:8080

docker compose -f compose_openai.yaml up -d
98 changes: 98 additions & 0 deletions GraphRAG/docker_compose/intel/hpu/gaudi/compose.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
tgi-server:
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
container_name: tgi-server
ports:
- "8085:80"
volumes:
- ${HF_CACHE_DIR}:/data
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
HUGGING_FACE_HUB_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
HF_HUB_DISABLE_PROGRESS_BARS: 1
HF_HUB_ENABLE_HF_TRANSFER: 0
HABANA_VISIBLE_DEVICES: all
OMPI_MCA_btl_vader_single_copy_mechanism: none
PT_HPU_ENABLE_LAZY_COLLECTIVES: true
ENABLE_HPU_GRAPH: true
LIMIT_HPU_GRAPH: true
USE_FLASH_ATTENTION: true
FLASH_ATTENTION_RECOMPUTE: true
runtime: habana
cap_add:
- SYS_NICE
ipc: host
command: --model-id ${LLM_MODEL_ID} --max-input-length 4096 --max-total-tokens 8192 --sharded true --num-shard ${NUM_SHARDS}
worker-docgrader-agent:
image: opea/agent-langchain:latest
container_name: docgrader-agent-endpoint
depends_on:
- tgi-server
volumes:
# - ${WORKDIR}/GenAIExamples/AgentQnA/docker_image_build/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
- ${TOOLSET_PATH}:/home/user/tools/
ports:
- "9095:9095"
ipc: host
environment:
ip_address: ${ip_address}
strategy: rag_agent
recursion_limit: ${recursion_limit_worker}
llm_engine: tgi
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
llm_endpoint_url: ${LLM_ENDPOINT_URL}
model: ${LLM_MODEL_ID}
temperature: ${temperature}
max_new_tokens: ${max_new_tokens}
streaming: false
tools: /home/user/tools/worker_agent_tools.yaml
require_human_feedback: false
RETRIEVAL_TOOL_URL: ${RETRIEVAL_TOOL_URL}
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY}
LANGCHAIN_TRACING_V2: ${LANGCHAIN_TRACING_V2}
LANGCHAIN_PROJECT: "opea-worker-agent-service"
port: 9095


supervisor-react-agent:
image: opea/agent-langchain:latest
container_name: react-agent-endpoint
depends_on:
- tgi-server
- worker-docgrader-agent
volumes:
# - ${WORKDIR}/GenAIExamples/AgentQnA/docker_image_build/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
- ${TOOLSET_PATH}:/home/user/tools/
ports:
- "9090:9090"
ipc: host
environment:
ip_address: ${ip_address}
strategy: react_langgraph
recursion_limit: ${recursion_limit_supervisor}
llm_engine: tgi
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
llm_endpoint_url: ${LLM_ENDPOINT_URL}
model: ${LLM_MODEL_ID}
temperature: ${temperature}
max_new_tokens: ${max_new_tokens}
streaming: false
tools: /home/user/tools/supervisor_agent_tools.yaml
require_human_feedback: false
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY}
LANGCHAIN_TRACING_V2: ${LANGCHAIN_TRACING_V2}
LANGCHAIN_PROJECT: "opea-supervisor-agent-service"
CRAG_SERVER: $CRAG_SERVER
WORKER_AGENT_URL: $WORKER_AGENT_URL
port: 9090
Loading
Loading