This repository contains a chatbot implementation leveraging large language models (LLMs) for retrieval-augmented question answering (QA). The system integrates document chunking, embedding generation, and FAISS-based vector search to create a high-performance, context-aware chatbot.
- Document Ingestion: Load and process `.txt` files for retrieval.
- Chunking: Split long documents into manageable pieces using `RecursiveCharacterTextSplitter`.
- Embeddings: Generate text embeddings using Ollama's embedding model.
- Vector Search: Perform similarity-based search using FAISS.
- LLM QA: Retrieve relevant context and answer user queries using the `ChatOllama` LLM.
- Retrieval-Augmented Generation (RAG): Combine retrieved context with a question-answering template to enhance LLM outputs.
- Python 3.8+
- Pip
1. Clone this repository:

   ```bash
   git clone <repository-url>
   cd <repository-name>
   ```

2. Create and activate a virtual environment:

   ```bash
   python -m venv chatbot

   # On Windows
   Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
   .\chatbot\Scripts\Activate.ps1

   # On macOS/Linux
   source chatbot/bin/activate
   ```

3. Install the required packages:

   ```bash
   python -m pip install --upgrade pip
   pip install llama-index-llms-openai
   pip install llama-index-llms-huggingface
   pip install llama-index-llms-huggingface-api
   pip install "transformers[torch]" "huggingface_hub[inference]"
   ```
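Equivalently, the same dependencies from step 3 could be listed in the repository's `requirements.txt` and installed with `pip install -r requirements.txt` (versions unpinned here; pin as needed):

```text
llama-index-llms-openai
llama-index-llms-huggingface
llama-index-llms-huggingface-api
transformers[torch]
huggingface_hub[inference]
```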
1. Environment Setup:
   - Update environment variables as needed in the `.env` file.
   - Suppress warnings with built-in functionality.

2. Prepare Data:
   - Place `.txt` files in the `data/` directory.

3. Run the Chatbot:

   ```bash
   python chatbot.py
   ```

   - Type your questions when prompted.
   - Enter `exit` to terminate the session.
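The session loop behind `chatbot.py` can be sketched as follows. The function and variable names here are illustrative, not taken from the actual script, and the `answer` callback stands in for the retrieval-plus-`ChatOllama` pipeline:

```python
def chat_loop(get_input=input, answer=lambda q: f"(answer to: {q})"):
    # Read questions until the user types "exit" (case-insensitive).
    # `answer` is a placeholder for the real retrieval + LLM pipeline.
    replies = []
    while True:
        question = get_input("You: ").strip()
        if question.lower() == "exit":
            break
        reply = answer(question)
        print("Bot:", reply)
        replies.append(reply)
    return replies

if __name__ == "__main__":
    chat_loop()
```

Injecting `get_input` and `answer` keeps the loop testable without a terminal or a running model.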
```
<repository-name>/
├── data/              # Directory for .txt files
├── chatbot.py         # Main script
├── requirements.txt   # Additional dependencies
├── README.md          # Project documentation
└── .env               # Environment variables
```
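Ingesting the `.txt` files from `data/` amounts to something like the sketch below (the actual script may use a dedicated document loader instead; `load_documents` is an illustrative name):

```python
from pathlib import Path

def load_documents(data_dir: str = "data") -> dict:
    # Map each .txt filename in data_dir to its text contents.
    # Non-.txt files are ignored, matching the README's instructions.
    docs = {}
    for path in sorted(Path(data_dir).glob("*.txt")):
        docs[path.name] = path.read_text(encoding="utf-8")
    return docs
```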
Modify the following components as needed:
- Document Chunking: Adjust chunk size and overlap in the `chunk_documents` function.
- Embedding Model: Update the `OllamaEmbeddings` initialization to use a different model or base URL.
- Prompt Template: Customize the QA prompt in the `ChatPromptTemplate` object.
- Retriever Parameters: Tune retriever parameters such as `k`, `fetch_k`, and `lambda_mult` for performance.
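To make the chunking knobs concrete, here is a simplified stand-in for `chunk_documents` (the real function wraps `RecursiveCharacterTextSplitter`, which splits on separators rather than raw character offsets; the size and overlap values are illustrative):

```python
def chunk_documents(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list:
    # Slide a fixed-size window over the text; consecutive chunks share
    # `chunk_overlap` characters so context isn't cut mid-thought.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

Larger `chunk_size` gives the LLM more context per retrieved chunk; larger `chunk_overlap` reduces the chance that an answer-bearing sentence is split across a boundary. The retriever parameters (`k`, `fetch_k`, `lambda_mult`) are tuned separately from chunking.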
Contributions are welcome! Feel free to open issues or submit pull requests to improve the project.
This project is licensed under the MIT License. See the LICENSE file for details.
- Inspired by the LangChain community and Ollama's LLM embeddings.
- Thanks to Hugging Face for their open-source tools.