Feat: add multimodal eval support #1559
Conversation
hey @Yunnglin I had a quick look at this and it's great - thanks a lot for contributing it ❤️ testing it on my end too and will merge it in. I also see there are a couple of type check errors, will you be tackling them or should I help you (happy to 🙂) |
Hello, I have corrected these errors. Could you please recheck them? |
Hey @Yunnglin this seems great. We could improve the method for calculating faithfulness later on if required. It would be great if you could add these two to the docs as well. They would go under the RAG section - https://docs.ragas.io/en/stable/concepts/metrics/available_metrics/ |
I have added the relevant documents. Could you please take a look and see if any modifications are needed? |
LGTM
thanks a lot @Yunnglin for the PR - made a couple of small tweaks to merge it in but looks great ❤️ btw we have a form for goodies do check it out 🙂 https://docs.google.com/forms/d/e/1FAIpQLSdM9FrrZrnpByG4XxuTbcAB-zn-Z7i_a7CsMkgBVOWQjRJckg/viewform |
Hi, nice work! Is it possible to add base64 image support as well (to mirror how anthropic/openai-compatible models accept images)? |
base64 would be very useful @jjmachan any examples of how I can evaluate multimodal retrieval? |
hey @simjak I will take a look at this and let you know :) btw are you in discord? |
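As context for the base64 request above, this is a minimal sketch of packaging a local image as a base64 data URI, the inline-image format that OpenAI/Anthropic-compatible chat APIs accept. It is not part of this PR (which currently takes image references in the contexts); the helper name is hypothetical:

```python
import base64

def to_data_uri(image_bytes: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a data: URI suitable for a multimodal context list.

    Hypothetical helper, not part of the PR; shown only to illustrate the
    base64 format the commenters are asking for.
    """
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

# Tiny in-memory bytes stand in for a real image file here:
uri = to_data_uri(b"\x89PNG\r\n\x1a\n")
print(uri[:22])  # → data:image/png;base64,
```

A metric that supports this format would only need to detect the `data:` prefix and decode the payload before handing the image to the model.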
I am a developer from ModelScope. This framework is great and I would like to add some new features. Multi-modal RAG evaluation is important, as mentioned in #1030.
This PR adds support for image-text context RAG evaluation. It currently provides preliminary support for MultiModalFaithfulness and MultiModalRelevance, following LlamaIndex's faithfulness and relevancy evaluators. The evaluation metrics are still quite basic and can be improved further in the future.
The usage is as follows:
Input example:
Output example:
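The input and output examples above did not survive the page extraction. As a rough sketch, a multimodal sample might look like the following; the field names mirror the ragas single-turn sample schema and are assumptions, not the PR's exact code:

```python
# Hypothetical input sample for the multimodal metrics described in this PR.
# Field names (user_input, retrieved_contexts, response) are assumed from the
# ragas single-turn schema; the image path is illustrative only.
sample = {
    "user_input": "Which Tesla model is shown in the retrieved image?",
    # retrieved_contexts may mix plain text and image references:
    "retrieved_contexts": [
        "The Tesla Model X is a mid-size all-electric luxury SUV.",
        "images/tesla_model_x.jpg",  # hypothetical local image path
    ],
    "response": "The image shows a Tesla Model X.",
}

# Scoring sketch (commented out: needs ragas with this PR merged plus an
# LLM backend; the metric names come from the PR description):
#
# from ragas.metrics import MultiModalFaithfulness
# score = await MultiModalFaithfulness().single_turn_ascore(sample)
#
# The output would be a faithfulness score, e.g. a value in [0, 1].
print(sorted(sample))
```

Again, this is only a sketch under the assumptions stated above; the PR's own examples are authoritative.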