RAG (Retrieval Augmented Generation) System

Video Tutorial (Portuguese)

This repository is part of Bix Tech's "Semana de Dados" (Data Week). For a fuller explanation of what RAG is and a walkthrough of this tutorial, watch the video below:

🎥 Watch the tutorial on YouTube

A Question-Answering system built using LangChain and ChromaDB that allows users to query their documents using natural language. The system uses OpenAI's language models to provide context-aware answers based on the content of the indexed documents.

Overview

This system allows you to:

Index text documents
Create and manage a vector store using ChromaDB
Interact with your documents through a chat interface
Get answers with source references
View and manage the document store
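
To make the retrieval step behind these features concrete, here is a minimal, dependency-free sketch. It is not the code in core.py: it swaps the real embedding model and ChromaDB for a toy bag-of-words vector and an in-memory dictionary, so it only illustrates the idea of ranking documents by similarity to a question and returning their sources. The names embed, cosine, and retrieve are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, docs: Dict[str, str], k: int = 2) -> List[Tuple[str, float]]:
    """Return the k (source, score) pairs most similar to the question."""
    query = embed(question)
    scored = [(source, cosine(query, embed(text))) for source, text in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Two tiny in-memory "documents", keyed by a made-up source file name.
docs = {
    "cats.txt": "Cats sleep for most of the day.",
    "python.txt": "Python is a programming language.",
}
top = retrieve("What is Python?", docs, k=1)
```

In a full RAG pipeline, the retrieved text would then be passed to the language model as context, and the source names would be shown as references.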

Repository Structure

rag_system/
├── documents/         # Place your text files here
├── logs/              # System logs are stored here
├── db/                # Vector store database (created automatically)
├── core.py            # Core RAG system implementation
├── interface.py       # Interactive command-line interface
├── requirements.txt   # Project dependencies
└── .env               # Environment variables configuration

Setup Instructions

1. Create a Virtual Environment

For Windows:

# Create a virtual environment
python -m venv .venv

# Activate the virtual environment
.venv\Scripts\activate

For Linux/Mac:

# Create a virtual environment
python -m venv .venv

# Activate the virtual environment
source .venv/bin/activate

2. Install Dependencies

# Install all required packages
pip install -r requirements.txt

3. Environment Configuration

Create a .env file in the root directory with the following content:

OPENAI_API_KEY=your-api-key-here
MODEL_NAME=gpt-3.5-turbo
COLLECTION_NAME=my_documents
PERSIST_DIRECTORY=db

Replace your-api-key-here with your actual OpenAI API key.
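
As an illustration of how these four variables might be consumed, the sketch below reads them from an environment-style mapping and fails fast when the API key is missing or still the placeholder. It assumes the values are already in the process environment (for example, after calling load_dotenv() from python-dotenv). The Settings type, the load_settings function, and the fallback defaults are hypothetical and may differ from what core.py does.

```python
import os
from typing import Mapping, NamedTuple


class Settings(NamedTuple):
    openai_api_key: str
    model_name: str
    collection_name: str
    persist_directory: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment-style variables, failing fast on a missing key."""
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key or api_key == "your-api-key-here":
        raise ValueError("OPENAI_API_KEY is missing or still set to the placeholder")
    # The defaults below simply mirror the example .env values (an assumption).
    return Settings(
        openai_api_key=api_key,
        model_name=env.get("MODEL_NAME", "gpt-3.5-turbo"),
        collection_name=env.get("COLLECTION_NAME", "my_documents"),
        persist_directory=env.get("PERSIST_DIRECTORY", "db"),
    )
```

Passing a plain dictionary instead of os.environ makes the function easy to try without touching your real configuration.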

4. Prepare Documents

Create a documents directory if it doesn't exist:

mkdir documents

Place your text files (.txt) in the documents directory. These are the documents that will be indexed and used for answering questions.
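
To check which files would be picked up, a small sketch like the following lists the .txt files in a directory and reads them as UTF-8. It is only an assumption about how loading could work, not the loader used by core.py, and read_text_documents is a hypothetical name.

```python
import tempfile
from pathlib import Path
from typing import Dict


def read_text_documents(directory: str = "documents") -> Dict[str, str]:
    """Map each .txt file name in `directory` to its UTF-8 decoded content."""
    folder = Path(directory)
    if not folder.is_dir():
        return {}
    return {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(folder.glob("*.txt"))
    }


# Usage example against a throwaway directory (not the real documents folder).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "notes.txt").write_text("Olá, mundo", encoding="utf-8")
    Path(tmp, "image.png").write_bytes(b"\x89PNG")
    demo = read_text_documents(tmp)
```

Files with other extensions are ignored, and a file that is not valid UTF-8 raises UnicodeDecodeError, which makes encoding problems visible early.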

5. Running the Application

# Run the interactive interface
python interface.py

Using the System

Once running, the system provides the following options:

Index documents: Processes and indexes all text files in the documents directory
Check total number of documents: Shows how many documents are currently indexed
Delete document store: Removes all indexed documents
Start RAG chat: Begins an interactive Q&A session
Exit: Closes the application

Chat Commands

When in chat mode:

Type your questions normally and press Enter
Type 'sources' to see detailed source documents for the last answer
Type 'quit', 'exit', or 'q' to return to the main menu
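
The command handling described above can be sketched as a small classifier for each line of input. This is a hypothetical helper, not the logic in interface.py. Treating commands as case-insensitive and ignoring surrounding whitespace are assumptions.

```python
from typing import Tuple

# Commands that return to the main menu, as listed above.
QUIT_COMMANDS = {"quit", "exit", "q"}


def parse_chat_input(raw: str) -> Tuple[str, str]:
    """Classify one line of chat input.

    Returns (kind, text), where kind is 'empty', 'quit', 'sources', or 'question'.
    """
    text = raw.strip()
    command = text.lower()
    if not text:
        return ("empty", "")
    if command in QUIT_COMMANDS:
        return ("quit", "")
    if command == "sources":
        return ("sources", "")
    return ("question", text)
```

A chat loop would call this on every line, leave the loop on 'quit', print the stored source documents on 'sources', and send anything classified as 'question' to the RAG chain.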

Requirements

Python 3.8 or higher
OpenAI API key
Sufficient disk space for document storage
Internet connection for API access

Dependencies

Main libraries used:

langchain
langchain-openai
langchain-community
langchain-chroma
chromadb
python-dotenv
rich

Troubleshooting

If you encounter any issues:

Check the logs in the logs directory
Ensure your OpenAI API key is valid
Verify that your documents are text files (.txt)
Make sure all required directories exist
Check your internet connection
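
Some of these checks can be automated before launching the interface. The sketch below is a hypothetical preflight helper that is not part of this repository. It only verifies that an API key is set and that the documents directory exists and contains .txt files. It cannot tell whether the key is valid or whether the network is reachable.

```python
import os
from pathlib import Path
from typing import List, Mapping


def preflight(base_dir: str = ".", env: Mapping[str, str] = os.environ) -> List[str]:
    """Return a list of human-readable problems; an empty list means the basics look fine."""
    problems: List[str] = []
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    docs = Path(base_dir) / "documents"
    if not docs.is_dir():
        problems.append("documents directory is missing")
    elif not any(docs.glob("*.txt")):
        problems.append("documents directory contains no .txt files")
    return problems
```

Running preflight() from the project root and printing the returned list gives a quick first diagnosis before digging into the logs.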

Additional Notes

The system creates necessary directories automatically
Logs are timestamped and stored in the logs directory
The vector store is persistent and stored in the db directory
All text files should use UTF-8 or a compatible encoding
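
As one possible reading of the logging note above, the sketch below writes each run to a log file whose name carries a timestamp and creates the logs directory when it is missing. The file-name pattern, the logger name rag_system, and the helper names are assumptions, and the actual logging setup in this repository may differ.

```python
import logging
from datetime import datetime
from pathlib import Path


def log_file_path(log_dir: str, now: datetime, name: str = "rag_system") -> Path:
    """Build a log file path such as logs/rag_system_20240102_030405.log."""
    return Path(log_dir) / f"{name}_{now:%Y%m%d_%H%M%S}.log"


def make_logger(log_dir: str = "logs", name: str = "rag_system") -> logging.Logger:
    """Create the log directory if needed and log to a new timestamped file."""
    Path(log_dir).mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_file_path(log_dir, datetime.now(), name), encoding="utf-8")
    # Each entry is also prefixed with the time it was written.
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

Calling make_logger() once at startup and then logger.info("...") produces one file per run, which keeps sessions separate when you inspect the logs directory.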

For any other issues or questions, please refer to the logs or create an issue in the repository.
