new genai rag guide #21703

Open
wants to merge 3 commits into
base: main

Conversation

@Expeto Expeto commented Dec 31, 2024

Description

Added a new guide about RAG with GenAI that also explains how RAG actually works.

Reviews

  • Technical review
  • Editorial review
  • Product review

Signed-off-by: Your Name <[email protected]>

netlify bot commented Dec 31, 2024

Deploy Preview for docsdocker ready!

Name Link
🔨 Latest commit f3b1e44
🔍 Latest deploy log https://app.netlify.com/sites/docsdocker/deploys/6782b4a42d23a0000873fbe8
😎 Deploy Preview https://deploy-preview-21703--docsdocker.netlify.app

@dvdksn dvdksn self-requested a review January 9, 2025 09:37
Contributor

@dvdksn dvdksn left a comment

Thank you for your contribution! I appreciate your effort; RAG is an important
topic, and expanding our documentation on it can add significant value for our
readers. However, I think this guide could benefit from more focus and
refinement to align with the needs of our audience and complement our existing
materials. Here are my thoughts:

  • The guide's purpose is unclear. It's important to define what this guide
    offers that the existing RAG guide does not. It seems like you want to focus
    on RAG with graph databases and Neo4j in particular. If the goal of the guide
    is to explore and demonstrate the benefits of this technology in particular,
    consider a way of doing that by example. E.g., by comparing this approach to
    using a vector database.
  • The structure and flow of the article feel fragmented and don't follow a
    logical progression. For instance, introducing graph databases before RAG
    seems out of place, especially for readers unfamiliar with the primary topic.
    Start with RAG and its benefits, then explain how graph databases fit into
    the picture.
  • The "Case Study" section is disconnected from the rest of the content. I
    would suggest removing it. Or if you want to keep it, integrate it into a
    narrative that supports the guide.
  • Several sections lack detail and direction. The technical instructions for
    setting up the stack are too sparse to be genuinely useful. I'm missing
    detailed, hands-on workflows and concrete examples that readers can follow
    and apply.

Author

Expeto commented Jan 13, 2025

@dvdksn I made the changes. Can you take another look? Thanks.

Contributor

@craig-osterhout craig-osterhout left a comment

Thanks @Expeto.
Added some style suggestions. Overall, it's looking good. The last section is hard to make sense of, so I suggest elaborating more, or replacing it with a summary of what was covered.

* Configure a GenAI stack with Docker, incorporating Neo4j and an AI model.
* Analyze a real-world case study that highlights the effectiveness of this approach for handling specialized queries.

## Understanding RAG (Retrieval-Augmented Generation)

Already defined in the intro, so not necessary to define again.

Suggested change
- ## Understanding RAG (Retrieval-Augmented Generation)
+ ## Understanding RAG


## Understanding RAG (Retrieval-Augmented Generation)

RAG (Retrieval-Augmented Generation) is a hybrid framework that enhances the capabilities of large language models by integrating information retrieval. It combines three core components:

Already defined in the intro, so not necessary to define again.

Suggested change
- RAG (Retrieval-Augmented Generation) is a hybrid framework that enhances the capabilities of large language models by integrating information retrieval. It combines three core components:
+ RAG is a hybrid framework that enhances the capabilities of large language models by integrating information retrieval. It combines three core components:

- **Large Language Model (LLM)** for generating responses
- **Vector embeddings** to enable semantic search

In a Retrieval-Augmented Generation (RAG) system, vector embeddings are used to represent the semantic meaning of text in a way that a machine can understand and process. For instance, the words "dog" and "puppy" will have similar embeddings because they share similar meanings. By integrating these embeddings into the RAG framework, the system can combine the generative power of large language models with the ability to pull in highly relevant, contextually-aware data from external sources.

Suggested change
- In a Retrieval-Augmented Generation (RAG) system, vector embeddings are used to represent the semantic meaning of text in a way that a machine can understand and process. For instance, the words "dog" and "puppy" will have similar embeddings because they share similar meanings. By integrating these embeddings into the RAG framework, the system can combine the generative power of large language models with the ability to pull in highly relevant, contextually-aware data from external sources.
+ In a RAG system, vector embeddings are used to represent the semantic meaning of text in a way that a machine can understand and process. For instance, the words "dog" and "puppy" will have similar embeddings because they share similar meanings. By integrating these embeddings into the RAG framework, the system can combine the generative power of large language models with the ability to pull in highly relevant, contextually-aware data from external sources.
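
As a rough illustration of the "dog" and "puppy" point in the paragraph under review, here is a minimal Python sketch of how semantic search compares embeddings. The three-dimensional vectors are hand-picked, hypothetical values (real embedding models produce far more dimensions), and `cosine_similarity` and `most_similar` are illustrative helpers, not part of the GenAI stack or any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, chosen by hand for illustration only.
embeddings = {
    "dog": [0.90, 0.80, 0.10],
    "puppy": [0.85, 0.75, 0.20],
    "invoice": [0.10, 0.20, 0.95],
}


def most_similar(word, vectors):
    """Return the other word whose vector is closest to `word`."""
    return max(
        (other for other in vectors if other != word),
        key=lambda other: cosine_similarity(vectors[word], vectors[other]),
    )


print(most_similar("dog", embeddings))  # prints "puppy"
```

In a RAG pipeline the same comparison would run between the embedding of the user's question and the embeddings of stored documents, and the closest documents would be passed to the LLM as context.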


To hold this vector information in an efficient manner, we need a special type of database.

## Introduction to Graph Databases

Suggested change
- ## Introduction to Graph Databases
+ ## Introduction to Graph databases


Another key difference lies in schema flexibility. SQL databases operate on a rigid schema, meaning any changes to the data structure, such as adding new columns or altering relationships, typically require careful planning and migration processes. Graph databases, however, are schema-optional, allowing for much greater flexibility. New nodes, edges, or properties can be introduced without disrupting existing data, enabling faster adaptation to changing requirements.
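
The schema-optional behavior described in that paragraph can be sketched with a toy in-memory graph in Python. This is not the Neo4j API: the `Graph` class, the `Question`, `Tag`, and `User` labels, the relationship names, and the property values are all hypothetical, chosen only to show that new node types and properties can be added without touching existing data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node_id, label, **properties):
        # Any label and any set of properties is accepted; no schema is declared.
        self.nodes[node_id] = Node(label, dict(properties))

    def add_edge(self, source, relation, target, **properties):
        self.edges.append((source, relation, target, dict(properties)))

    def neighbors(self, source, relation):
        """Targets reachable from `source` over edges named `relation`."""
        return [t for s, r, t, _ in self.edges if s == source and r == relation]


graph = Graph()
graph.add_node("q1", "Question", title="Example question")
graph.add_node("t1", "Tag", name="apache-nifi")
graph.add_edge("q1", "TAGGED", "t1")

# A later requirement introduces a new label, a new property, and a new
# relationship type. Existing nodes and edges are left untouched.
graph.add_node("u1", "User", reputation=120)
graph.add_edge("u1", "ASKED", "q1", year=2024)

print(graph.neighbors("q1", "TAGGED"))  # prints ['t1']
```

In a relational model, the equivalent change would typically mean a new table or an `ALTER TABLE` migration before any `User` rows could be stored.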

## Practical Implementation: Testing RAG Effectiveness

Suggested change
- ## Practical Implementation: Testing RAG Effectiveness
+ ## Practical implementation: testing RAG effectiveness


The first startup may take some time because the system needs to download a large language model (LLM).

### Monitoring Progress

Suggested change
- ### Monitoring Progress
+ ### Monitoring progress


As we can see, AI does not know anything about this subject because it did not exist during the time of its training, also known as the information cutoff point.

Now it's time to teach the AI some new tricks. First, connect to [http://localhost:8502/](http://localhost:8502/). Instead of using the "neo4j" tag, change it to the "apache-nifi" tag, then click the **Import** button.

Style nit

Suggested change
- Now it's time to teach the AI some new tricks. First, connect to [http://localhost:8502/](http://localhost:8502/). Instead of using the "neo4j" tag, change it to the "apache-nifi" tag, then click the **Import** button.
+ Now it's time to teach the AI some new tricks. First, connect to [http://localhost:8502/](http://localhost:8502/). Instead of using the "neo4j" tag, change it to the "apache-nifi" tag, then select the **Import** button.


![alt text](image-2.png)

Results will appera below. What we are seeing here is the information system downloaded from Stack Overflow and saved in the graph database. RAG will utilize this information to enhance its responses.

Suggested change
- Results will appera below. What we are seeing here is the information system downloaded from Stack Overflow and saved in the graph database. RAG will utilize this information to enhance its responses.
+ Results will appear below. What we are seeing here is the information system downloaded from Stack Overflow and saved in the graph database. RAG will utilize this information to enhance its responses.

ORDER BY Count DESC;
```

To execute this query, write in the box on the top and click the blue run button.

Style nit

Suggested change
- To execute this query, write in the box on the top and click the blue run button.
+ To execute this query, write in the box on the top and select the blue run button.

For optimal results, choose a tag that the LLM is not familiar with.


### When RAG is Effective

Suggest elaborating in this section or removing the section.

3 participants