GitHub - shettysach/IMDb-Knowledge-Graph-RAG: Simple implementation of Graph RAG using IMDb Top 250, Neo4j Graph database and Gemini API.

A Neo4j database stores the network graph.
- There are three node types: Movie, Genre and Person
- A Person can be an Actor, a Director, a Writer or any combination of these
- The relationships are:
  - Person - [ WROTE / ACTED_IN / DIRECTED ] -> Movie
  - Movie - [ BELONGS_TO ] -> Genre
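
The schema above can be sketched as a small helper that builds the Cypher statements for one movie. This is an illustrative sketch, not the code in Knowledge_Graph.ipynb: the function name `movie_statements` and the property names `title` and `name` are assumptions, and only the labels and relationship types come from the schema described here.

```python
from typing import Dict, List, Tuple

# Relationship types from the schema above (Person -> Movie).
PERSON_RELATIONSHIPS = ("WROTE", "ACTED_IN", "DIRECTED")

Statement = Tuple[str, Dict[str, str]]


def movie_statements(
    title: str,
    genres: List[str],
    people: List[Tuple[str, str]],
) -> List[Statement]:
    """Build (cypher, params) pairs for one movie.

    `people` is a list of (person_name, relationship_type) pairs.
    The property names `title` and `name` are assumed for illustration.
    """
    statements: List[Statement] = [
        ("MERGE (m:Movie {title: $title})", {"title": title})
    ]
    for genre in genres:
        statements.append((
            "MATCH (m:Movie {title: $title}) "
            "MERGE (g:Genre {name: $genre}) "
            "MERGE (m)-[:BELONGS_TO]->(g)",
            {"title": title, "genre": genre},
        ))
    for name, rel in people:
        # The relationship type is interpolated into the query text,
        # so it is checked against the known types first.
        if rel not in PERSON_RELATIONSHIPS:
            raise ValueError(f"unknown relationship type: {rel}")
        statements.append((
            "MATCH (m:Movie {title: $title}) "
            "MERGE (p:Person {name: $name}) "
            f"MERGE (p)-[:{rel}]->(m)",
            {"title": title, "name": name},
        ))
    return statements
```

Using MERGE rather than CREATE means a Person who both wrote and directed a movie ends up as a single node with two relationships, which matches the "combination of these" case above. Each pair could then be passed to a Neo4j session, for example `session.run(cypher, params)` with the official Python driver.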
LangChain and Gemini are used for the pipeline, which:
- Processes the natural language prompt
- Generates a Cypher query
- Queries the Neo4j database with the generated query and gets the result back as JSON
- Parses the JSON and responds in natural language
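
The four pipeline steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the repo's LangChain code: `llm` and `run_cypher` are hypothetical callables standing in for the Gemini model and the Neo4j query call, and the prompt wording is invented for the example.

```python
import json
from typing import Any, Callable, Dict, List

# Schema summary given to the model so it can write a valid query.
SCHEMA = (
    "Nodes: Movie, Genre, Person. "
    "Relationships: (Person)-[:WROTE|ACTED_IN|DIRECTED]->(Movie), "
    "(Movie)-[:BELONGS_TO]->(Genre)."
)


def answer_question(
    question: str,
    llm: Callable[[str], str],
    run_cypher: Callable[[str], List[Dict[str, Any]]],
) -> str:
    """Run the prompt -> Cypher -> JSON -> answer loop once."""
    # Steps 1 and 2: turn the natural language prompt into a Cypher query.
    cypher = llm(
        f"Graph schema: {SCHEMA}\n"
        f"Write one Cypher query that answers: {question}\n"
        "Return only the query."
    ).strip()

    # Step 3: query the database and serialise the rows as JSON.
    rows = run_cypher(cypher)
    context = json.dumps(rows)

    # Step 4: ask the model to phrase the JSON result as an answer.
    return llm(
        "Answer the question using only this JSON result.\n"
        f"Question: {question}\n"
        f"Result: {context}"
    )
```

Passing the model and the database call in as arguments keeps the sketch runnable without credentials; in practice they would wrap the Gemini client and a Neo4j session.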

Clone the repo and navigate to the directory.
Download the dataset, rename it to imdb.csv and move it to the /data directory.
Create a virtual environment with Python 3.10.14 and install the requirements from requirements.txt. For Conda:
```
$ conda create -c conda-forge --name <env> --file requirements.txt
```
Recommended: create a new Neo4j database (for Community edition).
Start the Neo4j server.
Fill in the Neo4j credentials and Gemini API key in .env_template and rename the file to .env.
Create the network graph first by running the Jupyter Notebook ./src/Knowledge_Graph.ipynb.
Then run the Jupyter Notebook ./src/Graph_RAG.ipynb.
Start the Streamlit web app with:
```
streamlit run ./src/App.py
```
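
Before running the notebooks or the app, it can help to confirm that .env is actually filled in. The sketch below parses simple KEY=VALUE lines and reports empty or missing entries. The key names in the usage note are hypothetical; check .env_template for the names the project actually uses.

```python
from typing import Dict, Iterable, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]
```

For example, `missing_keys(parse_env(open(".env").read()), ["NEO4J_URI", "GEMINI_API_KEY"])` returns the entries that still need a value, with both key names being placeholders for illustration.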
