This repository contains a chatbot implementation leveraging large language models (LLMs) for retrieval-augmented question answering (QA). The system integrates document chunking, embedding generation, and FAISS-based vector search to create a high-performance, context-aware chatbot.
- Document Ingestion: Load and process `.txt` files for retrieval.
- Chunking: Split long documents into manageable pieces using `RecursiveCharacterTextSplitter`.
- Embeddings: Generate text embeddings using Ollama's embedding model.
- Vector Search: Perform similarity-based search using FAISS.
- LLM QA: Retrieve relevant context and answer user queries using the `ChatOllama` LLM.
- Retrieval-Augmented Generation (RAG): Combine retrieved context with a question-answering template to enhance LLM outputs.
- Python 3.8+
- Pip
1. Clone this repository:

   ```bash
   git clone <repository-url>
   cd <repository-name>
   ```

2. Create and activate a virtual environment:

   ```bash
   python -m venv chatbot

   # On Windows
   Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
   .\chatbot\Scripts\Activate.ps1

   # On macOS/Linux
   source chatbot/bin/activate
   ```

3. Install the required packages:

   ```bash
   python -m pip install --upgrade pip
   pip install llama-index-llms-openai
   pip install llama-index-llms-huggingface
   pip install llama-index-llms-huggingface-api
   pip install "transformers[torch]" "huggingface_hub[inference]"
   ```
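Equivalently, the same dependencies from step 3 could be listed in the repository's `requirements.txt` and installed with `pip install -r requirements.txt` (versions unpinned here; pin as needed):

```text
llama-index-llms-openai
llama-index-llms-huggingface
llama-index-llms-huggingface-api
transformers[torch]
huggingface_hub[inference]
```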
1. Environment Setup:
   - Update environment variables as needed in the `.env` file.
   - Suppress warnings with built-in functionality.

2. Prepare Data:
   - Place `.txt` files in the `data/` directory.

3. Run the Chatbot:

   ```bash
   python chatbot.py
   ```

   - Type your questions when prompted.
   - Enter `exit` to terminate the session.
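The session loop behind `chatbot.py` can be sketched as follows. The function and variable names here are illustrative, not taken from the actual script, and the `answer` callback stands in for the retrieval-plus-`ChatOllama` pipeline:

```python
def chat_loop(get_input=input, answer=lambda q: f"(answer to: {q})"):
    # Read questions until the user types "exit" (case-insensitive).
    # `answer` is a placeholder for the real retrieval + LLM pipeline.
    replies = []
    while True:
        question = get_input("You: ").strip()
        if question.lower() == "exit":
            break
        reply = answer(question)
        print("Bot:", reply)
        replies.append(reply)
    return replies

if __name__ == "__main__":
    chat_loop()
```

Injecting `get_input` and `answer` keeps the loop testable without a terminal or a running model.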
```
<repository-name>/
├── data/              # Directory for .txt files
├── chatbot.py         # Main script
├── requirements.txt   # Additional dependencies
├── README.md          # Project documentation
└── .env               # Environment variables
```
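Ingesting the `.txt` files from `data/` amounts to something like the sketch below (the actual script may use a dedicated document loader instead; `load_documents` is an illustrative name):

```python
from pathlib import Path

def load_documents(data_dir: str = "data") -> dict:
    # Map each .txt filename in data_dir to its text contents.
    # Non-.txt files are ignored, matching the README's instructions.
    docs = {}
    for path in sorted(Path(data_dir).glob("*.txt")):
        docs[path.name] = path.read_text(encoding="utf-8")
    return docs
```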
Modify the following components as needed:
- Document Chunking: Adjust chunk size and overlap in the `chunk_documents` function.
- Embedding Model: Update the `OllamaEmbeddings` initialization to use a different model or base URL.
- Prompt Template: Customize the QA prompt in the `ChatPromptTemplate` object.
- Retriever Parameters: Tune retriever parameters such as `k`, `fetch_k`, and `lambda_mult` for performance.
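To make the chunking knobs concrete, here is a simplified stand-in for `chunk_documents` (the real function wraps `RecursiveCharacterTextSplitter`, which splits on separators rather than raw character offsets; the size and overlap values are illustrative):

```python
def chunk_documents(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list:
    # Slide a fixed-size window over the text; consecutive chunks share
    # `chunk_overlap` characters so context isn't cut mid-thought.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

Larger `chunk_size` gives the LLM more context per retrieved chunk; larger `chunk_overlap` reduces the chance that an answer-bearing sentence is split across a boundary. The retriever parameters (`k`, `fetch_k`, `lambda_mult`) are tuned separately from chunking.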
Contributions are welcome! Feel free to open issues or submit pull requests to improve the project.
This project is licensed under the MIT License. See the LICENSE file for details.
- Inspired by the LangChain community and Ollama's LLM embeddings.
- Thanks to Hugging Face for their open-source tools.