Anthropic image format fix #1273

jk10001 · 2025-03-09T00:55:55Z

Why are these changes needed?

Messages containing images are reformatted from the OpenAI format to the Claude format.
System message handling is also changed so that content given as a list can be processed, as well as content given as a string.
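For readers unfamiliar with the two formats, here is a minimal sketch of the kind of conversion described above. It is not the code from this PR: the helper name `openai_image_item_to_claude` is hypothetical, and the sketch assumes the image arrives as a base64 data URI inside an OpenAI-style `image_url` content item.

```python
import re

# Hypothetical helper for illustration; not taken from autogen/oai/anthropic.py.
# Assumption: images are passed as base64 data URIs ("data:<media type>;base64,<data>").
_DATA_URI = re.compile(
    r"^data:(?P<media_type>[\w.+-]+/[\w.+-]+);base64,(?P<data>.+)$", re.DOTALL
)


def openai_image_item_to_claude(item: dict) -> dict:
    """Convert one OpenAI-style content item to a Claude-style content block."""
    if item.get("type") != "image_url":
        return item  # text and other items pass through unchanged

    match = _DATA_URI.match(item["image_url"]["url"])
    if match is None:
        raise ValueError("this sketch only handles base64 data URIs")

    # Claude expects an "image" block with an explicit base64 source.
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": match.group("media_type"),
            "data": match.group("data"),
        },
    }


openai_content = [
    {"type": "text", "text": "What's the breed of this dog?"},
    {"type": "image_url", "image_url": {"url": "data:image/png;base64,iVBORw0KGgo="}},
]
claude_content = [openai_image_item_to_claude(item) for item in openai_content]
```

Text items are returned untouched, so the same list comprehension can be applied to mixed text-and-image content.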

Related issue number

Checks

I've included any doc changes needed for https://docs.ag2.ai/. See https://docs.ag2.ai/docs/contributor-guide/documentation to build and test documentation locally.
I've added tests (if relevant) corresponding to the changes introduced in this PR.
I've made sure all auto checks have passed.

Reformat OpenAI image content item to Claude format. Also fix an issue processing the system message when the system message content is a list instead of a string.
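The system message issue can be illustrated with a small sketch. This is an assumption about the general shape of the fix, not the PR's actual code: the function name `system_content_to_text` is hypothetical, and joining text items with a newline is one possible choice among several.

```python
# Hypothetical helper for illustration; the PR's real handling may differ.
def system_content_to_text(content) -> str:
    """Flatten system message content (a string or a list of items) into one string."""
    if isinstance(content, str):
        return content

    parts = []
    for item in content:
        if isinstance(item, str):
            parts.append(item)
        elif isinstance(item, dict) and item.get("type") == "text":
            # OpenAI-style text item: {"type": "text", "text": "..."}
            parts.append(item.get("text", ""))
        # Non-text items (e.g. images) are skipped in this sketch.
    return "\n".join(parts)


as_string = system_content_to_text("You are helpful.")
as_list = system_content_to_text(
    [
        {"type": "text", "text": "You are helpful."},
        {"type": "text", "text": "Be brief."},
    ]
)
```

Handling both shapes in one place means callers do not need to know whether the system message was built as plain text or as a list of content items.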

CLAassistant · 2025-03-09T00:56:00Z

All committers have signed the CLA.

davorrunje · 2025-03-10T09:15:06Z

@jk10001 please sign the CLA: https://cla-assistant.io/ag2ai/ag2?pullRequest=1273

marklysze · 2025-03-12T01:56:40Z

@jk10001 thanks so much for creating this, would you be able to add some tests?
https://github.com/ag2ai/ag2/blob/main/test/oai/test_anthropic.py

Tests updated for conversion of image messages to Anthropic format.

jk10001 · 2025-03-15T07:12:55Z

@marklysze I've added some additional tests into test_anthropic.py.

marklysze · 2025-03-15T20:06:34Z

Great, thanks for the tests, @jk10001.

I tried this with the MultimodalConversableAgent with some tweaks in img_utils and it worked well for OCR. Would you have a simple example of how you use this? The reason I ask is it would be good to add to the Anthropic Model page in the docs and for testing.

jk10001 · 2025-03-18T21:31:51Z

Hi @marklysze, here's a simple example adapting the code in the notebook https://docs.ag2.ai/docs/use-cases/notebooks/notebooks/agentchat_lmm_gpt-4v
I haven't updated the docs before, so it might be a while before I can get to that.

import os

import autogen
from autogen.agentchat.contrib.multimodal_conversable_agent import MultimodalConversableAgent
from dotenv import load_dotenv

# Load ANTHROPIC_API_KEY from a local .env file.
load_dotenv()

config_claude_sonnet_37 = [
    {
        "model": "claude-3-7-sonnet-latest",
        "api_key": os.getenv("ANTHROPIC_API_KEY"),
        "api_type": "anthropic",
    },
]

# Agent that answers questions about images using the Anthropic model.
image_agent = MultimodalConversableAgent(
    name="image-explainer",
    max_consecutive_auto_reply=10,
    llm_config={"config_list": config_claude_sonnet_37, "temperature": 0.5, "max_tokens": 300},
)

# Proxy that sends the question and never asks for human input.
# Set use_docker=True if Docker is available; running generated code in Docker
# is safer than running it directly.
user_proxy = autogen.UserProxyAgent(
    name="User_proxy",
    system_message="A human admin.",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=0,
    code_execution_config={"use_docker": False},
)

# Ask the question with an image
user_proxy.initiate_chat(
    image_agent,
    message="""What's the breed of this dog?
<img https://th.bing.com/th/id/OIP.29Mi2kJmcHHyQVGe_0NG7QHaEo?pid=ImgDet&rs=1>.""",
)

marklysze · 2025-03-18T22:42:11Z

@jk10001 that's handy, thank you! We'll add that example into the docs separately.

I'm good to move this out of draft for a review if you are.

jk10001 · 2025-03-19T04:04:01Z

@marklysze good with me, thanks!

marklysze

Nice work @jk10001, appreciate the enhancement

codecov · 2025-03-20T00:29:27Z

Codecov Report

Attention: Patch coverage is 7.31707% with 38 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
autogen/oai/anthropic.py	7.31%	38 Missing ⚠️
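The figures in this report are consistent with each other, which a short calculation shows. The sketch assumes patch coverage is computed as covered changed lines divided by total changed lines; the variable names are illustrative only.

```python
# Figures quoted in the report above.
missing_lines = 38
patch_coverage_pct = 7.31707

# missing = total * (1 - coverage), so total = missing / (1 - coverage).
total_lines = round(missing_lines / (1 - patch_coverage_pct / 100))
covered_lines = total_lines - missing_lines

# Recompute the percentage from the whole-number line counts.
recomputed_pct = covered_lines / total_lines * 100
```

Under that assumption, the patch touched 41 lines, of which 3 were covered by the uploaded reports.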

❗ There is a different number of reports uploaded between BASE (b2e367c) and HEAD (593a41e).

HEAD has 1265 uploads less than BASE

Flag	BASE (b2e367c)	HEAD (593a41e)
3.9	82	0
ubuntu-latest	145	1
commsagent-discord	9	0
optional-deps	141	0
core-without-llm	14	1
3.13	85	0
macos-latest	104	0
browser-use	7	0
3.11	64	1
3.12	36	0
commsagent-slack	9	0
3.10	96	0
windows-latest	114	0
commsagent-telegram	9	0
twilio	9	0
retrievechat-pgvector	10	0
interop-langchain	9	0
graph-rag-falkor-db	6	0
jupyter-executor	9	0
retrievechat	15	0
retrievechat-mongodb	10	0
interop	13	0
retrievechat-qdrant	14	0
crawl4ai	13	0
websockets	9	0
docs	6	0
interop-pydantic-ai	9	0
interop-crewai	9	0
cerebras	15	0
agent-eval	1	0
mistral	14	0
teachable	4	0
gpt-assistant-agent	3	0
lmm	4	0
together	14	0
long-context	3	0
retrievechat-couchbase	3	0
llama-index-agent	3	0
websurfer	15	0
gemini	15	0
anthropic	16	0
swarm	14	0
groq	14	0
ollama	15	0
cohere	15	0
bedrock	15	0
integration	12	0
core-llm	8	0
openai-realtime	1	0
captainagent	1	0
autobuild	1	0
deepseek	1	0
neo4j	2	0
falkordb	2	0
gemini-realtime	1	0

Files with missing lines	Coverage Δ
autogen/oai/anthropic.py	`20.89% <7.31%> (-57.22%)`	⬇️

... and 63 files with indirect coverage changes

jk10001 added 3 commits March 5, 2025 22:26

Update anthropic.py to allow images to be included in messages

e29820d

Reformat OpenAI image content item to Claude format. Also fix an issue processing the system message when the system message content is a list instead of a string.

Merge remote-tracking branch 'upstream/main' into anthropic_image_fix

4a08768

Merge remote-tracking branch 'upstream/main' into anthropic_image_fix

494d012

davorrunje self-assigned this Mar 10, 2025

marklysze and others added 3 commits March 12, 2025 12:56

Merge branch 'main' into anthropic_image_fix

9e74e2b

Update test_anthropic.py

696c933

Tests updated for conversion of image messages to Anthropic format.

Merge remote-tracking branch 'upstream/main' into anthropic_image_fix

8702bce

marklysze and others added 9 commits March 16, 2025 07:17

Pre-commit fixes, type updates

76a195e

Pre-commit tidy on tests

3e43e60

Merge branch 'main' into anthropic_image_fix

537d75c

Merge branch 'ag2ai:main' into anthropic_image_fix

56023bc

Merge branch 'main' into anthropic_image_fix

461f0bf

Merge branch 'ag2ai:main' into anthropic_image_fix

c118817

Merge remote-tracking branch 'upstream/main' into anthropic_image_fix

3acc8f5

Update anthropic.py

bba755b

Merge branch 'anthropic_image_fix' of https://github.com/jk10001/ag2 …

16f544b

…into anthropic_image_fix

Merge branch 'main' into anthropic_image_fix

76f505f

jk10001 marked this pull request as ready for review March 19, 2025 21:48

Update anthropic.py

593a41e

marklysze approved these changes Mar 20, 2025

marklysze added this pull request to the merge queue Mar 20, 2025

Merged via the queue into ag2ai:main with commit db2754a Mar 20, 2025
12 checks passed
