- Problem Statement
- System Architecture
- Reproducibility
- Evaluation of RAG Flow
- Demo
- Future Improvements
Many of us come across long, informative YouTube videos that we'd like to explore but lack the time to watch in full. That problem inspired Chat-With-Video.
Chat-With-Video is an application that lets users ask questions about a YouTube video without watching it in full. It combines a Large Language Model with retrieval over the video transcript to return quick, relevant answers.
The primary features of our app are:
- Real-time question answering about YouTube video content
- Efficient retrieval of relevant information from video transcripts
- Responses generated by the GPT-4o-mini language model, grounded in the retrieved transcript segments
- Continuous evaluation and improvement of system performance
- User feedback collection for ongoing refinement
- Comprehensive monitoring and analytics
Our system utilizes a Retrieval-Augmented Generation (RAG) approach, combining efficient information retrieval with powerful language models. The key components include:
- TF-IDF Retriever: For fast and efficient retrieval of relevant transcript segments
- GPT-4o-mini Language Model: To generate accurate and contextually appropriate responses
- Streamlit Frontend: For an intuitive user interface
- PostgreSQL Database: To store conversation history and evaluation metrics
- Airflow: For scheduling and running evaluation tasks
- Grafana: For monitoring system performance and user interactions
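The retrieval step can be illustrated with a minimal sketch. This is a pure-Python TF-IDF index written for illustration only; the class name `TfidfRetriever`, the tokenizer, and the smoothing formula are assumptions, and the repository's actual retriever may be implemented differently.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TfidfRetriever:
    """Tiny TF-IDF index over transcript segments (illustrative, hypothetical)."""

    def __init__(self, segments):
        self.segments = list(segments)
        tokenized = [tokenize(s) for s in self.segments]
        n_docs = len(tokenized)
        # Document frequency: number of segments that contain each term.
        df = Counter(term for tokens in tokenized for term in set(tokens))
        # Smoothed IDF, so terms found in every segment keep a positive weight.
        self.idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
        self.vectors = [self._vectorize(tokens) for tokens in tokenized]

    def _vectorize(self, tokens):
        """Build an L2-normalized sparse TF-IDF vector; unknown terms are dropped."""
        counts = Counter(t for t in tokens if t in self.idf)
        vec = {t: c * self.idf[t] for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {t: w / norm for t, w in vec.items()} if norm else {}

    def search(self, query, top_k=3):
        """Return up to top_k segments ranked by cosine similarity to the query."""
        q = self._vectorize(tokenize(query))
        scored = []
        for i, vec in enumerate(self.vectors):
            score = sum(w * vec.get(t, 0.0) for t, w in q.items())
            if score > 0:
                scored.append((score, i))
        # Highest score first; ties broken by original segment order.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [self.segments[i] for _, i in scored[:top_k]]
```

In a flow like the one described above, the segments returned by `search` would be passed to the language model as context for the user's question.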
To set up and run the Chat-With-Video system, follow these steps:
- Clone the repository and navigate to the project directory.
  - Rename `env.env` to `.env` and update the API variables.
- Set up the repo: `make setup`
- Download the required CSV files from Drive and place them in the `data` folder.
- Start the Docker containers: `make compose-up` (or `make compose-up-force`)
  - The Streamlit UI is available at http://localhost:8501
- To simulate user traffic, run `make attach-app`, then `python src/simulate_traffic.py`.
- Access the monitoring dashboard:
  - Grafana: http://localhost:3000
  - Default credentials: admin/admin (change on first login)
- View and manage Airflow DAGs:
  - Airflow webserver: http://localhost:8080
  - Default credentials: admin/admin
- Manual evaluation:
  - You can manually trigger the evaluation DAG in Airflow.
  - This process retrieves unevaluated conversations from the database, uses the LLM to assess answer and context relevance, and inserts the results into the evaluation table.
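The evaluation pass described in the last step can be sketched roughly as follows. Everything here is an assumption made for illustration: the table and column names (`conversations`, `evaluations`, `relevance`) are hypothetical, an in-memory `sqlite3` database stands in for PostgreSQL, and `judge_relevance` is a word-overlap placeholder for the LLM judge that only looks at the answer, not the retrieved context.

```python
import sqlite3


def judge_relevance(question, answer):
    """Placeholder for the LLM judge; a real version would call the model."""
    overlap = set(question.lower().split()) & set(answer.lower().split())
    return "RELEVANT" if overlap else "NON_RELEVANT"


def evaluate_pending(conn, judge=judge_relevance):
    """Score every conversation that has no row in the evaluations table yet."""
    pending = conn.execute(
        """
        SELECT c.id, c.question, c.answer
        FROM conversations AS c
        LEFT JOIN evaluations AS e ON e.conversation_id = c.id
        WHERE e.conversation_id IS NULL
        ORDER BY c.id
        """
    ).fetchall()
    for conv_id, question, answer in pending:
        conn.execute(
            "INSERT INTO evaluations (conversation_id, relevance) VALUES (?, ?)",
            (conv_id, judge(question, answer)),
        )
    conn.commit()
    return len(pending)


def make_demo_db():
    """Create a throwaway in-memory database with two unevaluated conversations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE conversations (id INTEGER PRIMARY KEY, question TEXT, answer TEXT);
        CREATE TABLE evaluations (conversation_id INTEGER PRIMARY KEY, relevance TEXT);
        INSERT INTO conversations (question, answer) VALUES
            ('what is tf-idf', 'tf-idf weighs rare terms more heavily'),
            ('who is the speaker', 'no idea');
        """
    )
    return conn
```

Because only conversations without an evaluation row are selected, running `evaluate_pending` a second time on the same data finds nothing left to score, which is the behavior a scheduled task would need.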
We ran experiments to optimize our RAG flow. The evaluation covered several aspects of the system:
- Retrieval method performance
- Language model selection
- Context window size optimization
For detailed information about our evaluation process, results, and conclusions, please refer to ./evaluation/README.md.
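This README does not name the metrics used to compare retrieval methods. As an assumption, the sketch below shows two metrics commonly used for that purpose, hit rate and mean reciprocal rank (MRR); the function names and input layout are hypothetical, and the metrics actually used are described in ./evaluation/README.md.

```python
def hit_rate(ranked_ids_per_query, relevant_ids):
    """Fraction of queries whose relevant document appears in the ranked results."""
    hits = sum(
        1 for ranked, relevant in zip(ranked_ids_per_query, relevant_ids)
        if relevant in ranked
    )
    return hits / len(relevant_ids)


def mean_reciprocal_rank(ranked_ids_per_query, relevant_ids):
    """Average of 1/rank of the relevant document; contributes 0 when it is missing."""
    total = 0.0
    for ranked, relevant in zip(ranked_ids_per_query, relevant_ids):
        if relevant in ranked:
            total += 1.0 / (ranked.index(relevant) + 1)
    return total / len(relevant_ids)
```

Each entry of `ranked_ids_per_query` is the ordered list of document ids a retriever returned for one query, and the matching entry of `relevant_ids` is the id that should have been found.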
Key findings:
- TF-IDF was selected as our retrieval method due to its balance of accuracy and efficiency.
- GPT-4o-mini outperformed other tested language models in our specific use case.
- Providing the top 3 retrieved documents as context yielded the best results.
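To make the last finding concrete, here is a minimal sketch of how the top 3 retrieved segments might be assembled into a prompt. The function name `build_prompt` and the prompt wording are assumptions for illustration, not the prompt the application actually uses.

```python
def build_prompt(question, ranked_segments, top_k=3):
    """Assemble an LLM prompt from the top_k highest-ranked transcript segments."""
    context = "\n\n".join(
        f"[{i}] {segment}"
        for i, segment in enumerate(ranked_segments[:top_k], start=1)
    )
    return (
        "Answer the question using only the transcript excerpts below.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Keeping `top_k` as a parameter makes it easy to rerun a comparison with a different context size.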
For visual demonstrations of Chat-With-Video in action, please see our DEMO.md file. This includes screenshots of:
- The main chat interface
- Example conversations
- Grafana monitoring dashboard
- Airflow DAG view
These screenshots provide a quick overview of the system's functionality and user interface.
While Chat-With-Video currently performs well, there are several areas for enhancement:
- Implement more advanced retrieval methods that maintain TF-IDF's efficiency while improving accuracy.
- Add transcript chunking techniques and support for multi-language transcript formats.
- Add query rewriting.
- Make the code scalable to handle multiple videos and concurrent users.
- Explore cloud deployment options and develop a Telegram bot interface. This will make the service more accessible to users across different platforms and devices.
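As one illustration of the chunking idea listed above, a transcript could be split into overlapping word windows before indexing. This is a sketch under assumed parameters (`chunk_size`, `overlap`), not a description of planned or existing code.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into chunks of chunk_size words, each sharing overlap words with the previous one."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("chunk_size must be positive and overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of indexing some words twice.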
Claude Sonnet 3.5 was used to generate this README.md file.
This app was created as a course project for the LLM Zoomcamp course organized by DataTalksClub.