During the knowledge graph construction phase, we organize raw documents into a structured graph form. As shown in the figure:

During the retrieval phase, we first obtain subgraphs and corresponding chunks from the graph based on the Query, and then use the LLM to get the answer.

The tools used in this document are as follows:
- Sample raw document: A segment from Zhou Shuren's "Dawn Blossoms Plucked at Dusk" which can be replaced with any other document.
- Graph storage: TuGraph, an open-source graph database (think of it as a mysql-server).
- LLM: This document uses the Qwen2.5-7B-Instruct provided by SiliconCloud as an example. Users can switch to any PyPI
openai
interface, regardless of whether the model comes from SFT or a remote API.
-
Install TuGraph. TuGraph Official supports deployment via docker/online service/binary files, and here we use the docker method.
My server version is 4.5.0
# Pull the image docker pull tugraph/tugraph-runtime-centos7 # Run docker run -d -p 7070:7070 -p 7687:7687 -p 9090:9090 -v /root/tugraph/data:/var/lib/lgraph/data -v /root/tugraph/log:/var/log/lgraph_log --name tugraph_demo ${REPOSITORY}:${VERSION} # ${REPOSITORY} is the image address, ${VERSION} is the version number. # 7070 is the default http port, used for accessing tugraph-db-browser. # 7687 is the bolt port, used for bolt client access. # 9090 is the default rpc port, used for rpc client access. # /var/lib/lgraph/data is the default data directory inside the container, /var/log/lgraph_log is the default log directory inside the container. # The command mounts the data and log directories to the host machine's /root/tugraph/ for persistence, which you can modify according to your actual situation.
After successful installation, open port 7070 in the browser to see the TuGraph UI interface. The default account is admin, and the default password is 73@TuGraph.
-
HuixiangDou2 Dependencies. Simply use
pip install
.python3 -m pip install -r requirements.txt
We also support cpu-only
# CPU only python3 -m pip install -r requirements/cpu.txt
-
Download the embedding model. HuixiangDou2 supports bce/bge text and image-text models. For example, using bce embedding and reranker, assume the models are downloaded to the following two locations on your machine:
/home/data/share/bce-embedding-base_v1
/home/data/share/bce-reranker-base_v1
-
LLM Key. We use the SiliconCloud free LLM API.
- Click API Key to obtain the sk.
- The models used in this tutorial are all
Qwen/Qwen2.5-7B-Instruct
.
Tip 1: New users can register with this link to receive additional tokens on top of the free quota: https://cloud.siliconflow.cn/s/tpoisonooo
Tip 2: You can also deploy your own model using
vllm
. Refer to the commandvllm serve /path/to/Qwen2.5-7B-Instruct --enable-prefix-caching --served-model-name Qwen2.5-7B-Instruct --port 8000 --tensor-parallel-size 1
-
Configure
config.ini
. Copyconfig.ini.example
toconfig.ini
and fill in the SiliconCloud SK. Here is the complete configuration guide.cp config.ini.example config.ini
There are two documents under tests/data
, copy them to repodir
to build the knowledge base.
cp -rf tests/data repodir
python3 -m huixiangdou.pipeline.store
After successful creation, multiple feature directories will appear in workdir
; at the same time, the TuGraph graph project will show a graph named HuixiangDou2
.

Running main.py
will execute a query example:
python3 -m huixiangdou.main
+------------------+---------+---------------------------------+---------------+
| Query | State | Response | References |
+==================+=========+=================================+===============+
| What is in the Hundred Grass Garden? | success | The Hundred Grass Garden has various plants and insects, including green vegetable plots, tall soapberry trees, purple mulberries, raspberries that look like small coral beads, and insects such as cicadas, hornets, skylarks, crickets, centipedes, and blister beetles. In addition, there are twining vines of Polygonum multiflorum and climbing fig, as well as natural features like stone wells and broken bricks. | baicaoyuan.md |
+------------------+---------+---------------------------------+---------------+
We also implement Gradio inside it
python3 -m huixiangdou.gradio_ui
Open port 7860 in web browser:

Also support Swagger API
Start server
python3 -m huixiangdou.server --port 23334
Test with client
python3 huixiangdou/client.py
Simply delete the entities and relationships in workdir
and TuGraph.
# Delete features
rm -rf workdir
# After execution, enter Y to confirm the deletion of entity relationships
# The graph name is configured in `config.ini`, default is `HuixiangDou2`
python3 -m huixiangdou.service.graph_store drop