Support more output formats #10816
Comments
Turns out there is a standard for this sort of thing: Static Analysis Results Interchange Format (SARIF). It seems preferable to adopt SARIF instead of another tool-specific format. |
That looks like one of those extremely long specifications that supports everything but is so complicated nobody can be bothered to support it. Look at Pyright's specification - it's like 3 paragraphs. But it does have a VSCode extension that looks decent so it might be worth the effort. |
All of the current mypy integrations parse the cli output already, which only has a very small amount of information:
But even just having this in a JSON format would already make mypy much more compatible with IDEs and other tools. If that sounds like a good idea, I'd be willing to make that work this weekend. |
I like Pylint's approach for this: We can keep the default JSON output pretty simple, but add a custom Ref: |
SARIF format would help to improve integration of mypy within MegaLinter, it would be great :) |
CodeClimate, which is used in GitLab, also needs JSON output: https://docs.gitlab.com/ee/user/project/merge_requests/code_quality.html#implementing-a-custom-tool Here's an old codeclimate plugin to parse and output JSON, for example: https://github.com/mercos/codeclimate-pylint |
FYI: action-pyright uses rdjson converted from the original JSON. |
It's quite a useful feature and can benefit the community! However, mypy doesn't have full-time maintainers currently, and we can't guarantee a clear plan for this. Anyone is welcome to contribute if interested! |
@97littleleaf11 How would one go about such a task? Is a plugin the preferred approach? Any chance you know anything that could serve as some reference on handling of reports? |
#11396 already seems like a reasonable implementation of this. I think at this point it just needs a maintainer to review it. |
As a maintainer of another linter (ansible-lint), I faced two JSON formats being added: one was the CodeClimate one and, more recently, SARIF. SARIF is more complex, but it was designed for exchanging information between linters, so I would recommend its use instead of a custom one. |
SARIF was mentioned before but have you seen the spec? I would say just go with the current implementation which is small and simple, and if someone wants to spend a week implementing |
Adding both into this PR is incredibly easy actually :) |
@sobolevn @hauntsaninja if any decisions can be taken on this they'll probably have to be by the core team. |
Can you give an example of what it would look like? I.e. instead of this
What would the SARIF version be? |
@Timmmm taken from @intgr's reply on my PR:
{
"version": "2.1.0",
"$schema": "https://schemastore.azurewebsites.net/schemas/json/sarif-2.1.0.json",
"runs": [
{
"tool": {
"driver": {
"name": "mypy",
"version": "0.910"
}
},
"results": [
{
"ruleId": "assignment",
"level": "error",
"message": {
"text": "Incompatible types in assignment (expression has type \"str\", variable has type \"int\")"
},
"locations": [
{
"physicalLocation": {
"artifactLocation": {
"uri": "mytest.py"
},
"region": {
"startLine": 2,
"startColumn": 4
}
}
}
]
},
{
"ruleId": "name-defined",
"level": "error",
"message": {
"text": "Name \"z\" is not defined"
},
"locations": [
{
"physicalLocation": {
"artifactLocation": {
"uri": "mytest.py"
},
"region": {
"startLine": 3,
"startColumn": 4
}
}
}
]
}
]
}
]
} |
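The mapping between the simple one-error-per-line JSON and the SARIF document above is fairly mechanical. A minimal sketch, assuming the JSON keys from the PR demo (`file`, `line`, `column`, `severity`, `message`, `code`); the function name `mypy_json_to_sarif` is hypothetical and not part of mypy, and line/column values are copied verbatim, as in the example above, without adjusting for any indexing convention:

```python
import json

# Schema URL copied from the SARIF example in this thread.
SARIF_SCHEMA = "https://schemastore.azurewebsites.net/schemas/json/sarif-2.1.0.json"


def mypy_json_to_sarif(json_lines, tool_version):
    """Build a SARIF 2.1.0 document from JSON lines (one error object per line).

    Hypothetical helper for illustration only; not mypy's implementation.
    """
    results = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        err = json.loads(line)
        results.append({
            "ruleId": err["code"],
            "level": err["severity"],
            "message": {"text": err["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": err["file"]},
                    "region": {
                        "startLine": err["line"],
                        "startColumn": err["column"],
                    },
                }
            }],
        })
    return {
        "version": "2.1.0",
        "$schema": SARIF_SCHEMA,
        "runs": [{
            "tool": {"driver": {"name": "mypy", "version": tool_version}},
            "results": results,
        }],
    }


sample = [
    r'{"file": "mytest.py", "line": 3, "column": 4, "severity": "error", '
    r'"message": "Name \"z\" is not defined", "code": "name-defined"}'
]
sarif_doc = mypy_json_to_sarif(sample, "0.910")
print(json.dumps(sarif_doc, indent=2))
```

Seen this way, the extra nesting in SARIF is mostly a fixed wrapper around the same six fields, which is why supporting both formats from one internal error representation looks cheap.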
Well it's not totally awful but I know which I would rather parse! I feel like having both options would be good, so the people that don't need SARIF don't have to deal with all of that extra complexity. Anyway sorry I should have read all the PR comments first. Looks like that is already the consensus there. 👍 |
SARIF is not for humans but for machines ^^ |
Yes we know. But for systems that don't already support SARIF it's nice to have a format that is simple to parse. |
I had to read some of the SARIF spec and it was not a pleasure; it reminded me of XML and design committees. Still, after a while I realised that you do not have to support all that crap and can get something relatively simple. We do not need both from the start. |
I took a stab at this issue. The main difficulty with adding the capability is the base that mypy provides.
Anyway, there are people who created JSON reports based on formatting the string output, such as: Therefore, if we want a more reliable method that will allow us to add more report schemas easily, this issue will be composed of 2 parts:
@ssbarnea I would appreciate your input: from your experience with SARIF and linters, is parsing the output string a reliable enough method for conforming to a schema like SARIF? Edit: |
@ErezAmihud I think it would be better to avoid parsing the output in order to produce SARIF; better to do it directly from internal data captured by mypy. Still, I do not know mypy internals, so it might not be so easy to do. Parsing stdout is not very reliable and could easily break. |
### Description

Resolves #10816

The changes this PR makes are relatively small. It currently:

- Adds an `--output` option to mypy CLI
- Adds an `ErrorFormatter` abstract base class, which can be subclassed to create new output formats
- Adds a `MypyError` class that represents the external format of a mypy error.
- Adds a check for `--output` being `'json'`, in which case the `JSONFormatter` is used to produce the reported output.

#### Demo:

```console
$ mypy mytest.py
mytest.py:2: error: Incompatible types in assignment (expression has type "str", variable has type "int")
mytest.py:3: error: Name "z" is not defined
Found 2 errors in 1 file (checked 1 source file)

$ mypy mytest.py --output=json
{"file": "mytest.py", "line": 2, "column": 4, "severity": "error", "message": "Incompatible types in assignment (expression has type \"str\", variable has type \"int\")", "code": "assignment"}
{"file": "mytest.py", "line": 3, "column": 4, "severity": "error", "message": "Name \"z\" is not defined", "code": "name-defined"}
```

---

A few notes regarding the changes:

- I chose to re-use the intermediate `ErrorTuple`s created during error reporting, instead of using the more general `ErrorInfo` class, because a lot of machinery already exists in mypy for sorting and removing duplicate error reports, which produces `ErrorTuple`s at the end. The error sorting and duplicate removal logic could perhaps be separated out from the rest of the code, to be able to use `ErrorInfo` objects more freely.
- `ErrorFormatter` doesn't really need to be an abstract class, but I think it would be better this way. If there's a different method that would be preferred, I'd be happy to know.
- The `--output` CLI option is, most probably, not added in the correct place. Any help in how to do it properly would be appreciated, the mypy option parsing code seems very complex.
- The ability to add custom output formats can be simply added by subclassing the `ErrorFormatter` class inside a mypy plugin, and adding a `name` field to the formatters. The mypy runtime can then check through the `__subclasses__` of the formatter and determine if such a formatter is present. The "checking for the `name` field" part of this code might be appropriate to add within this PR itself, instead of hard-coding `JSONFormatter`. Does that sound like a good idea?

---------

Co-authored-by: Tushar Sadhwani <[email protected]>
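The formatter design described in this PR can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: the class names `MypyError`, `ErrorFormatter` and `JSONFormatter` and the `name` field come from the description, while the field layout (taken from the demo output), the `report_error` method name, and the `find_formatter` helper are hypothetical.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class MypyError:
    """External representation of one error (fields assumed from the demo output)."""
    file: str
    line: int
    column: int
    severity: str
    message: str
    code: Optional[str]


class ErrorFormatter(ABC):
    """Base class for output formats; subclasses set `name` and render errors."""
    name: str = ""

    @abstractmethod
    def report_error(self, error: MypyError) -> str:
        """Return the formatted text for a single error."""


class JSONFormatter(ErrorFormatter):
    name = "json"

    def report_error(self, error: MypyError) -> str:
        # One JSON object per line, keys in the same order as the demo.
        return json.dumps({
            "file": error.file,
            "line": error.line,
            "column": error.column,
            "severity": error.severity,
            "message": error.message,
            "code": error.code,
        })


def find_formatter(name: str) -> Optional[ErrorFormatter]:
    """Look up a formatter by `name` among the direct subclasses (hypothetical helper)."""
    for cls in ErrorFormatter.__subclasses__():
        if cls.name == name:
            return cls()
    return None


formatter = find_formatter("json")
error = MypyError("mytest.py", 3, 4, "error", 'Name "z" is not defined', "name-defined")
print(formatter.report_error(error))
```

Note that `__subclasses__()` only returns direct subclasses that have already been defined or imported, so a plugin-provided formatter would need to be loaded before the lookup runs.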
Feature
Mypy should support JSON output to stdout. Pyright has this already via `--outputjson`, so it might be a decent idea to copy their schema as much as possible.
Pitch
Programmatic access to the errors is needed for:
Parsing the current output is a bad idea because:
For example consider the hacky regex used here for Mypy vs the simple JSON parsing used here for Pyright.
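To make the contrast concrete, here is a minimal sketch of both approaches. The sample lines are taken from the PR demo quoted elsewhere in this thread; the regex is a hypothetical one written for illustration (not the one linked above), and the function names are made up.

```python
import json
import re

# Sample lines in the two formats (from the PR demo in this thread).
TEXT_LINE = ('mytest.py:2: error: Incompatible types in assignment '
             '(expression has type "str", variable has type "int")')
JSON_LINE = (r'{"file": "mytest.py", "line": 3, "column": 4, "severity": "error", '
             r'"message": "Name \"z\" is not defined", "code": "name-defined"}')

# Hypothetical pattern for "file:line[:column]: severity: message".
TEXT_PATTERN = re.compile(
    r"^(?P<file>[^:]+):(?P<line>\d+):(?:(?P<column>\d+):)? "
    r"(?P<severity>\w+): (?P<message>.*)$"
)


def parse_text_line(line):
    """Regex-based parsing: returns None for any line the pattern did not anticipate."""
    match = TEXT_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    if fields["column"] is not None:
        fields["column"] = int(fields["column"])
    return fields


def parse_json_line(line):
    """JSON-based parsing: no format assumptions beyond valid JSON."""
    return json.loads(line)


text_result = parse_text_line(TEXT_LINE)
json_result = parse_json_line(JSON_LINE)
# A Windows-style path contains an extra colon, which this pattern cannot handle.
windows_result = parse_text_line(r"C:\proj\mytest.py:2: error: Something")

print(text_result)
print(json_result)
print(windows_result)
```

The regex version silently drops the Windows-style line, and it has no way to recover the error code, whereas the JSON version carries every field explicitly.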