[Bug]: MCPProxy Github OpenAPI Spec Parsing Error #1655

Open
marufaytekin opened this issue Apr 16, 2025 · 3 comments
Assignees
Labels
bug Something isn't working

Comments

@marufaytekin
Collaborator

Describe the bug

MCPProxy.create fails with a parse error (Black cannot parse the generated code) when creating an MCP server from the GitHub OpenAPI spec:

cc: @davorrunje @sternakt

Steps to reproduce

Run MCPProxy.create to generate an MCP server for the following GitHub API spec:

import shutil
from pathlib import Path

from autogen.mcp.mcp_proxy import MCPProxy

# Start from a clean output directory for the generated client code.
tmp_path = Path("tmp") / "mcp_github"
shutil.rmtree(tmp_path, ignore_errors=True)
tmp_path.mkdir(parents=True, exist_ok=True)

MCPProxy.create(
    openapi_url="https://raw.githubusercontent.com/github/rest-api-description/refs/heads/main/descriptions/api.github.com/api.github.com.json",
    client_source_path=tmp_path,
    servers=[{"url": "https://api.github.com"}],
)

Error:

  File "/Users/maruf/PycharmProjects/ag2/notebook/mcp/github_mcp_server.py", line 20, in <module>
    MCPProxy.create(
  File "/Users/maruf/PycharmProjects/ag2/autogen/mcp/mcp_proxy/mcp_proxy.py", line 334, in create
    main_name = cls.generate_code(  # noqa F841
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/PycharmProjects/ag2/autogen/mcp/mcp_proxy/mcp_proxy.py", line 266, in generate_code
    generate_code(
  File "/Users/maruf/PycharmProjects/ag2/autogen/mcp/mcp_proxy/patch_fastapi_code_generator.py", line 77, in patched_generate_code
    return org_generate_code(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/fastapi_code_generator/__main__.py", line 205, in generate_code
    results[relative_path] = code_formatter.format_code(result)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/datamodel_code_generator/format.py", line 205, in format_code
    code = self.apply_black(code)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/datamodel_code_generator/format.py", line 219, in apply_black
    return black.format_str(
           ^^^^^^^^^^^^^^^^^
  File "src/black/__init__.py", line 1208, in format_str
  File "src/black/__init__.py", line 1222, in _format_str_once
  File "src/black/parsing.py", line 98, in lib2to3_parse
black.parsing.InvalidInput: Cannot parse for target version Python 3.9: 725:498: By default, all responses will exclude advisories for malware, because malware are not standard vulnerabilities. To list advisories for malware, you must include the `type` parameter in your request, with the value `malware`. For more information about the different types of security advisories, see "[About the GitHub Advisory database](https://docs.github.com/code-security/security-advisories/global-security-advisories/about-the-github-advisory-database#about-types-of-security-advisories).""""

Model Used

No response

Expected Behavior

No response

Screenshots and logs

No response

Additional Information

No response

@marufaytekin marufaytekin added the bug Something isn't working label Apr 16, 2025
@marufaytekin
Collaborator Author

The original description ends with \", which caused this bug:

"description": "Lists all global security advisories that match the specified parameters. 
......
advisories, see \"[About the GitHub Advisory database](https://docs.github.com/code-security/security-advisories/global-security-advisories/about-the-github-advisory-database#about-types-of-security-advisories).\"",

After the Jinja2 template is rendered, the trailing \" leaves four quotes in a row at the end of the description: ...about-types-of-security-advisories).""""
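The failure can be reproduced without the generator. The sketch below is only an illustration: `render_description` is a hypothetical stand-in for the template line, the description text is shortened, and `URL` is a placeholder. It shows that a value ending in a double quote cannot be wrapped in triple quotes as-is, while padding it with a space parses.

```python
import ast


def render_description(description: str, pad: str = "") -> str:
    # Hypothetical stand-in for the template line that wraps the
    # description in triple quotes, optionally padded on both sides.
    return f'description = """{pad}{description}{pad}"""'


def parses(source: str) -> bool:
    # True if Python can parse the generated source line.
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# Shortened stand-in for the GitHub description; note the trailing quote.
description = 'For more information, see "[About the GitHub Advisory database](URL)."'

unpadded = render_description(description)
padded = render_description(description, pad=" ")

print(unpadded.endswith('""""'), parses(unpadded))
print(parses(padded))
```

The first three of the four trailing quotes close the string, so the fourth opens a new, unterminated literal. That is consistent with Black rejecting the generated file.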

Adding a space or a dot at the end of the description in the template fixes the issue. Let me know whether you want me to push this fix or apply a different one:

templates/main.jinja2:

{% if operation.description %}
    , description="""{{operation.description}}"""

updated as:

{% if operation.description %}
    , description="""<space>{{operation.description}}<space>"""
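One possible alternative to padding, sketched under assumptions: emit the description as a Python literal produced by `repr()`, which escapes quotes and backslashes and leaves the value itself unchanged. The helper name `to_python_literal` is hypothetical, and how it would be wired into the template (for example as a custom filter) is not shown and would need to be checked against the generator.

```python
import ast
import warnings


def to_python_literal(text: str) -> str:
    # repr() returns source code for a string literal equal to `text`.
    return repr(text)


samples = [
    'ends with a quote: "quoted."',
    "contains a backslash: per\\_page",
    'contains triple quotes: """',
    "spans\ntwo lines",
]

for sample in samples:
    source = f"description = {to_python_literal(sample)}"
    with warnings.catch_warnings():
        # Treat any warning raised while parsing (such as an invalid
        # escape sequence) as an error so it cannot pass silently.
        warnings.simplefilter("error")
        module = ast.parse(source)
    # The literal must evaluate back to exactly the original text.
    assert ast.literal_eval(module.body[0].value) == sample

print("all samples round-trip")
```

Unlike the padding fix, this keeps the description byte-for-byte identical, at the cost of the generated code containing escaped, single-line literals instead of readable triple-quoted blocks.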

@marufaytekin marufaytekin self-assigned this Apr 16, 2025
@marufaytekin
Collaborator Author

marufaytekin commented Apr 16, 2025

@davorrunje @sternakt Getting this error after the parse fix:

/Users/maruf/PycharmProjects/ag2/notebook/mcp/tmp/mcp_github/main.py:12785: SyntaxWarning: invalid escape sequence '\_'
  description=""" Returns all community profile metrics for a repository. The repository cannot be a fork.
/Users/maruf/PycharmProjects/ag2/notebook/mcp/tmp/mcp_github/main.py:18457: SyntaxWarning: invalid escape sequence '\_'
  description=""" Find topics via various criteria. Results are sorted by best match. This method returns up to 100 results [per page](https://docs.github.com/rest/guides/using-pagination-in-the-rest-api). See "[Searching topics](https://docs.github.com/articles/searching-topics/)" for a detailed list of qualifiers.
/Users/maruf/PycharmProjects/ag2/notebook/mcp/tmp/mcp_github/main.py:20389: SyntaxWarning: invalid escape sequence '\_'
  description=""" Fetches the URL to download the migration archive as a `tar.gz` file. Depending on the resources your repository uses, the migration archive can contain JSON files with data for these objects:
Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/plugins/python-ce/helpers/pydev/pydevd.py", line 1570, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Applications/PyCharm.app/Contents/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/maruf/PycharmProjects/ag2/notebook/mcp/github_mcp_server.py", line 20, in <module>
    MCPProxy.create(
  File "/Users/maruf/PycharmProjects/ag2/autogen/mcp/mcp_proxy/mcp_proxy.py", line 341, in create
    main = importlib.import_module(main_name, package=td.name)  # nosemgrep
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/3.12.0/lib/python3.12/importlib/__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1381, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1354, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1325, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 929, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 994, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "/Users/maruf/PycharmProjects/ag2/notebook/mcp/tmp/mcp_github/main.py", line 18, in <module>
    from models import (
  File "/Users/maruf/PycharmProjects/ag2/notebook/mcp/tmp/mcp_github/models.py", line 44942, in <module>
    class ReposOwnerRepoContentsPathGetResponse(
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_model_construction.py", line 237, in __new__
    complete_model_class(
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_model_construction.py", line 597, in complete_model_class
    schema = gen_schema.generate_schema(cls)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_generate_schema.py", line 706, in generate_schema
    schema = self._generate_schema_inner(obj)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_generate_schema.py", line 984, in _generate_schema_inner
    return self._model_schema(obj)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_generate_schema.py", line 802, in _model_schema
    root_field = self._common_field_schema('root', fields['root'], decorators)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_generate_schema.py", line 1348, in _common_field_schema
    schema = self._apply_annotations(
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_generate_schema.py", line 2264, in _apply_annotations
    schema = get_inner_schema(source_type)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_schema_generation_shared.py", line 83, in __call__
    schema = self._handler(source_type)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_generate_schema.py", line 2253, in inner_handler
    return transform_inner_schema(schema)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_generate_schema.py", line 1338, in set_discriminator
    schema = self._apply_discriminator_to_union(schema, field_info.discriminator)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_generate_schema.py", line 650, in _apply_discriminator_to_union
    return _discriminated_union.apply_discriminator(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_discriminated_union.py", line 70, in apply_discriminator
    return _ApplyInferredDiscriminator(discriminator, definitions or {}).apply(schema)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_discriminated_union.py", line 164, in apply
    schema = self._apply_to_root(schema)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_discriminated_union.py", line 200, in _apply_to_root
    self._handle_choice(choice)
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_discriminated_union.py", line 278, in _handle_choice
    inferred_discriminator_values = self._infer_discriminator_values_for_choice(choice, source_name=None)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_discriminated_union.py", line 354, in _infer_discriminator_values_for_choice
    return self._infer_discriminator_values_for_choice(self.definitions[schema_ref], source_name=source_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_discriminated_union.py", line 336, in _infer_discriminator_values_for_choice
    return self._infer_discriminator_values_for_choice(choice['schema'], source_name=choice['cls'].__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/maruf/.pyenv/versions/ag2-venv/lib/python3.12/site-packages/pydantic/_internal/_discriminated_union.py", line 362, in _infer_discriminator_values_for_choice
    raise TypeError(err_str)
TypeError: The core schema type 'list' is not a valid discriminated union variant. If you are making use of a list of union types, make sure the discriminator is applied to the union type and not the list (e.g. `list[Annotated[<T> | <U>, Field(discriminator=...)]]`).
python-BaseException

@sternakt sternakt self-assigned this Apr 17, 2025
@sternakt sternakt added this to ag2 Apr 17, 2025
@sternakt sternakt moved this to In Progress in ag2 Apr 17, 2025
@sternakt
Collaborator

Hi @marufaytekin,
I have opened a branch for this issue and pushed the space fix.

The next bug you are encountering is related to the way pydantic handles discriminators and to the definition of Content Directory in the GitHub schema.

As far as I understand the OpenAPI specification, Content Directory cannot be used in a discriminator because it is an array; only objects that have the property being mapped can be used. Here is a minimal example that also fails generation:

openapi: 3.0.3
info:
  version: 1.1.4
  title: GitHub v3 REST API
servers:
  - url: https://api.github.com
components:
  schemas:
    content-directory:
      title: Content Directory
      description: A list of directory items
      type: array
      items:
        type: object
        properties:
          type:
            type: string
            enum:
              - dir
              - file
          size:
            type: integer
          name:
            type: string
    content-file:
      title: Content File
      description: Content File
      type: object
      properties:
        type:
          type: string
          enum:
            - file
        encoding:
          type: string
        size:
          type: integer
        name:
          type: string
paths:
  /repos/:
    get:
      summary: Get repository content
      operationId: repos_get_content
      responses:
        '200':
          description: Response
          content:
            application/json:
              schema:
                oneOf:
                  - $ref: '#/components/schemas/content-directory'
                  - $ref: '#/components/schemas/content-file'
                discriminator:
                  propertyName: type
                  mapping:
                    array: '#/components/schemas/content-directory'
                    file: '#/components/schemas/content-file'

For now, I have managed to parse the GitHub API by manually changing the definition of Content Directory in the JSON to this:

"content-directory": {
  "title": "Content Directory",
  "description": "A list of directory items",
  "type": "object",
  "properties": {
    "type": {
      "type": "string",
      "enum": [
        "dir"
      ]
    },
    "items": {
      "type": "array",
      "properties": {
        "type": {
          "type": "string",
          "enum": [
            "dir",
            "file",
            "submodule",
            "symlink"
          ]
        },
        "size": {
          "type": "integer"
        },
        "name": {
          "type": "string"
        },
        "path": {
          "type": "string"
        },
        "content": {
          "type": "string"
        },
        "sha": {
          "type": "string"
        },
        "url": {
          "type": "string",
          "format": "uri"
        },
        "git_url": {
          "type": "string",
          "format": "uri",
          "nullable": true
        },
        "html_url": {
          "type": "string",
          "format": "uri",
          "nullable": true
        },
        "download_url": {
          "type": "string",
          "format": "uri",
          "nullable": true
        },
        "_links": {
          "type": "object",
          "properties": {
            "git": {
              "type": "string",
              "format": "uri",
              "nullable": true
            },
            "html": {
              "type": "string",
              "format": "uri",
              "nullable": true
            },
            "self": {
              "type": "string",
              "format": "uri"
            }
          },
          "required": [
            "git",
            "html",
            "self"
          ]
        }
      },
      "required": [
        "_links",
        "git_url",
        "html_url",
        "download_url",
        "name",
        "path",
        "sha",
        "size",
        "type",
        "url"
      ]
    }
  }
}

But this should actually be fixed in the datamodel-code-generator so that it supports these cases.
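Until that happens, a different interim workaround might be to patch the spec programmatically instead of editing it by hand. The sketch below is an untested assumption, not the fix used in this thread: it removes the discriminator from any oneOf/anyOf that has a non-object variant, on the expectation that the generator then emits a plain union. The helper names are hypothetical and only local `#/...` references are resolved.

```python
import copy
from typing import Any


def resolve_ref(spec: dict[str, Any], ref: str) -> dict[str, Any]:
    # Follow a local '#/a/b/c' reference inside the spec.
    node: Any = spec
    for part in ref.removeprefix("#/").split("/"):
        node = node[part]
    return node


def is_object_schema(spec: dict[str, Any], schema: dict[str, Any]) -> bool:
    ref = schema.get("$ref")
    target = resolve_ref(spec, ref) if ref else schema
    return target.get("type") == "object"


def drop_unusable_discriminators(spec: dict[str, Any]) -> tuple[dict[str, Any], int]:
    # Return a patched copy of the spec and the number of discriminators removed.
    patched = copy.deepcopy(spec)
    removed = 0

    def visit(node: Any) -> None:
        nonlocal removed
        if isinstance(node, dict):
            discriminator = node.get("discriminator")
            if isinstance(discriminator, dict) and "propertyName" in discriminator:
                variants = node.get("oneOf", []) + node.get("anyOf", [])
                if any(not is_object_schema(patched, v) for v in variants):
                    del node["discriminator"]
                    removed += 1
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(patched)
    return patched, removed


response_schema = {
    "oneOf": [
        {"$ref": "#/components/schemas/content-directory"},
        {"$ref": "#/components/schemas/content-file"},
    ],
    "discriminator": {"propertyName": "type"},
}
spec = {
    "components": {
        "schemas": {
            "content-directory": {"type": "array", "items": {"type": "object"}},
            "content-file": {
                "type": "object",
                "properties": {"type": {"type": "string"}},
            },
        }
    },
    "paths": {"/repos/": {"get": {"responses": {"200": {"schema": response_schema}}}}},
}

patched, removed = drop_unusable_discriminators(spec)
print(removed)
```

The original spec is left untouched, so the patched copy can be written to a separate file and compared against the upstream definition. Whether the generated models then import cleanly would still need to be verified.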

Projects
Status: In Progress
2 participants