[Feature] Improve AI Photo Captions with Optional Context Input or Metadata #7

Scout2022 · 2024-11-22T19:35:40Z

Would it be possible to add an optional text box to the AI photo recognition process? Users could type in extra details, such as location names or words related to the photo's content, to help the AI understand the image better. This would make the captions more accurate, especially for ambiguous images. The text box could be offered either before or after the AI processes the photo, so users can provide context upfront or correct mistakes afterward by resending the photo to the AI.

Alternatively, the plugin could automatically extract location information from the photo's metadata, if available. This would provide context to the AI without requiring manual input from the user.

manuelcapellari · 2024-12-21T08:03:30Z

As an alternative to an optional text box, it would be useful to transmit existing metadata (existing keywords, geoinformation, etc.), since it is not possible, at least in the case of Gemini, to extract this information directly from the image.

bmachek · 2024-12-21T14:03:20Z

> As an alternative to an optional text box, it would be useful to transmit existing metadata (existing keywords, geoinformation, etc.), since it is not possible, at least in the case of Gemini, to extract this information directly from the image.

This sounds like a better plan, since it would require less user interaction while the plugin is running. And most importantly, it would improve my personal workflow. ;-)
I will give it a try soon... :-)

bmachek · 2024-12-21T15:22:24Z

As it turned out, I tried it immediately... ;-)

The referenced git branch contains a candidate that transmits the GPS coordinates of the analyzed photo to Gemini. Results vary from astonishingly specific and correct to very general. If you are able to check out a branch, you might want to give it a try.

To be continued.

manuelcapellari · 2024-12-21T15:55:16Z

> The referenced git branch contains a candidate that transmits the GPS coordinates of the analyzed photo to Gemini. Results vary from astonishingly specific and correct to very general. If you are able to check out a branch, you might want to give it a try.

Incredible!

I have been following the development of LLMs for a long time. I believe time is on our side as far as the quality of the results is concerned; development has accelerated massively over the last two years.

bmachek · 2024-12-22T07:49:59Z

Thanks. I added support for sending pre-existing keywords with the request. It doesn't affect the results much.
Happy holidays! :-)
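
For readers following the thread, here is a minimal sketch of the idea discussed above: appending optional GPS coordinates and pre-existing keywords to the text prompt that accompanies the image. It is not the plugin's actual code. The function name `build_caption_prompt`, its parameters, and the prompt wording are all assumptions made for illustration, and the step that sends the prompt to Gemini is deliberately left out.

```python
from typing import Optional, Sequence, Tuple


def build_caption_prompt(
    base_prompt: str,
    gps: Optional[Tuple[float, float]] = None,
    keywords: Optional[Sequence[str]] = None,
) -> str:
    """Append optional photo metadata to a captioning prompt.

    Hypothetical helper: the name, signature, and wording are
    illustrative assumptions, not taken from the plugin.
    """
    parts = [base_prompt.strip()]

    if gps is not None:
        lat, lon = gps
        # Only include coordinates that fall inside the valid ranges.
        if -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0:
            parts.append(
                f"The photo was taken at GPS coordinates {lat:.6f}, {lon:.6f}."
            )

    # Drop empty or whitespace-only keywords before joining.
    cleaned = [k.strip() for k in (keywords or []) if k.strip()]
    if cleaned:
        parts.append("Existing keywords: " + ", ".join(cleaned) + ".")

    return "\n".join(parts)


if __name__ == "__main__":
    print(
        build_caption_prompt(
            "Describe this photo in one sentence.",
            gps=(48.8584, 2.2945),
            keywords=["tower", "Paris"],
        )
    )
```

Because both metadata arguments are optional, a photo without GPS data or keywords simply falls back to the base prompt, which matches the "if available" behaviour requested in the original post.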

bmachek self-assigned this Dec 21, 2024

bmachek added the "enhancement" (New feature or request) label Dec 21, 2024

bmachek linked a pull request Jan 1, 2025 that will close this issue

7 feature improve ai photo captions with optional context input or metadata #16

Merged

bmachek closed this as completed in #16 Jan 1, 2025
