Implementing the MMR algorithm for OLAP vector storage #30033

Merged
merged 5 commits into from
Feb 28, 2025

fkzhao
Contributor

@fkzhao fkzhao commented Feb 28, 2025

Thank you for contributing to LangChain!

  • Implementing the MMR (maximal marginal relevance) algorithm for OLAP vector storage:

    • Apache Doris
    • StarRocks
    • Dependencies: any dependencies required for this change
    • Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out!
  • Add tests and docs:

    • Example: vectorstore.as_retriever(search_type="mmr", search_kwargs={"k": 10})
  • Lint and test: Run make format, make lint and make test from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:

  • Make sure optional dependencies are imported within a function.
  • Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests.
  • Most PRs should not touch more than one package.
  • Changes should be backwards compatible.
  • If you are adding something to community, do not re-import it in langchain.

If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

@dosubot dosubot bot added the size:L (this PR changes 100-499 lines, ignoring generated files), community (related to langchain-community), and Ɑ: vector store (related to vector store module) labels Feb 28, 2025
Comment on lines +332 to +333
{self.config.column_map["embedding"]}) as dist,
{self.config.column_map["embedding"]} as embedding
Collaborator

These two are the same?

Contributor Author

It is not a duplicate. Both lines use the same field: the first computes the distance from it, and the second reads the embedding itself, which is needed for the MMR calculation.
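
To illustrate why an MMR query would want both a relevance score and the raw embedding for each row, here is a minimal NumPy sketch of MMR selection. It is not the code from this PR: the names cosine_sim and mmr_select, the use of cosine similarity, and the default values are assumptions made for the example.

```python
import numpy as np


def cosine_sim(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def mmr_select(query, candidates, k=4, lambda_mult=0.5):
    """Return indices of up to k candidates picked by maximal marginal relevance.

    Hypothetical helper for illustration only (not the PR's implementation).
    lambda_mult=1.0 ranks purely by relevance; lower values favor diversity.
    """
    query = np.asarray(query, dtype=float).reshape(1, -1)
    candidates = np.asarray(candidates, dtype=float)
    if candidates.size == 0 or k <= 0:
        return []

    # Relevance to the query: the role played by a per-row score such as `dist`.
    relevance = cosine_sim(candidates, query)[:, 0]
    # Candidate-to-candidate similarity: this is why the raw embeddings are needed.
    pairwise = cosine_sim(candidates, candidates)

    selected = [int(np.argmax(relevance))]
    while len(selected) < min(k, len(candidates)):
        # Redundancy = similarity to the closest already-selected candidate.
        redundancy = pairwise[:, selected].max(axis=1)
        scores = lambda_mult * relevance - (1 - lambda_mult) * redundancy
        scores[selected] = -np.inf  # never pick the same candidate twice
        selected.append(int(np.argmax(scores)))
    return selected


if __name__ == "__main__":
    docs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]
    print(mmr_select([1.0, 0.0], docs, k=2, lambda_mult=0.3))
```

In this sketch the relevance term alone could come from a distance column, but the redundancy term compares candidates with each other, which is only possible if the embeddings are returned alongside the score.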

@dosubot dosubot bot added the lgtm (PR looks good; use to confirm that a PR is ready for merging) label Feb 28, 2025
@ccurme ccurme merged commit f07338d into langchain-ai:master Feb 28, 2025
18 checks passed