Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

De-duplicate identical issue annotations from problemMatchers #504

Closed
ammaraskar opened this issue May 25, 2020 · 21 comments
Closed

De-duplicate identical issue annotations from problemMatchers #504

ammaraskar opened this issue May 25, 2020 · 21 comments
Labels
Service Bug Bug fix scope to the pipelines service and launch app

Comments

@ammaraskar
Copy link

In CPython we're trying to add a issue matcher for the MSVC compiler but it outputs the same warning multiple times (once per file and once in the summary usually). This results in identical annotations appearing for the same matcher, same message and line number which looks a bit cluttered:

image

It would be great if these identical check annotations could be avoided at the matcher level.

For reference, this is the commit in vscode that added de-duplication for problemMatchers: microsoft/vscode@b08de2e

@TingluoHuang
Copy link
Member

@andymckay Can someone from experience side take a look? Why the customer gets same annotation twice? 😄
thanks.

@TingluoHuang TingluoHuang removed the enhancement New feature or request label Jun 8, 2020
@ammaraskar
Copy link
Author

Just for reference, this is a check run with duplicated annotations: https://github.com/python/cpython/actions/runs/122021377

The one that says

'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\a\cpython\cpython\PCbuild\pythoncore.vcxproj]

is printed twice and gets two different annotations.

@xt0rted
Copy link

xt0rted commented Jun 8, 2020

This sounds like the same issue as #264 and actions/toolkit#395

@ammaraskar
Copy link
Author

ammaraskar commented Jun 8, 2020

Similar, but in this case this isn't done over different jobs. The same warning is generated multiple times in one run and hence it should be de-duplicatable at the runner level. Albeit if it were handled at the annotation level then no runner-specific code would be needed.

@TingluoHuang TingluoHuang added Service Bug Bug fix scope to the pipelines service and launch app and removed bug Something isn't working service labels Jun 8, 2020
@peternewman
Copy link

Similar, but in this case this isn't done over different jobs. The same warning is generated multiple times in one run and hence it should be de-duplicatable at the runner level. Albeit if it were handled at the annotation level then no runner-specific code would be needed.

Although if it's only fixed at the runner level, won't that miss the issue where someone has a PR run and a push run and the PR is from within the same repo and therefore causes similar duplication?

IMHO the annotation level is the place to do it, if some specific message is already annotated against a line number, we don't need to see it again, regardless of which action run generated it. Is there a repo for the annotation stuff we should be raising this issue in instead?

@peternewman
Copy link

In CPython we're trying to add a issue matcher for the MSVC compiler but it outputs the same warning multiple times (once per file and once in the summary usually).

I've no idea about your build process @ammaraskar , but I'd have thought the solution here would be to disable the summary on CI builds:
https://docs.microsoft.com/en-us/visualstudio/msbuild/msbuild-command-line-reference?view=vs-2019#switches

Suggests this would do the trick:
-consoleloggerparameters:NoSummary

@bryanmacfarlane
Copy link
Member

Note that an instance of the runner scoped to a job. So if you want to de-dupe at the run level, then it needs to be done at the service level (the service that writes the annotations)

@JustinGrote
Copy link

@thboop
Here is an extremely simple reproduction where I am getting duplicate annotations using problem matchers, it appears to be a bug in the matcher logic itself where it outputs both the ##error annotation and the one generated by the problem matcher.
image

https://github.com/JustinGrote/Super-Duper-Linter/runs/797950085?check_suite_focus=true#step:3:18
JustinGrote/Super-Duper-Linter@a73c863#annotation_261920524

You closed the issue previously but didn't reference where and which team it was directed to. Since there's been no progress in six months, who do we directly talk to about this? Is the annotation problemMatcher code available publicly?

@TingluoHuang
Copy link
Member

There was a recent regression in the server-side code that causes every annotation to get created twice, we are rolling out a fix, etc 1-2 days to every deployment rings.

@thboop
Copy link
Collaborator

thboop commented Jun 23, 2020

@JustinGrote, thanks for the repro!

There's a few different ways duplicate annotations can present themself:

1 annotation created in 1 job: Your repro is a great example. Currently, we have a bug where annotations are getting duplicated even if there is only a single annotation created on a single job. We are working on rolling out a fix for this now.

2+ identical annotations created in 1 job: The original post in this issue appears to have two identical annotations being created by the problem matcher in a single job. We can look at this as a runner issue, as we could solve it in the runner.

1 annotation created in 2+ jobs: If the same annotation is created in multiple jobs, we show it multiple times. This ticket was originally about this case. This can be valuable in some areas, for example knowing exactly what versions of a dependency your build failed on for a matrix build. It can also be detrimental, as it does create some noise for users. If you have thoughts on that, please post them in the community forum, our engineers monitor that forum and we'd love to discuss it more!

@JustinGrote
Copy link

@thboop, thanks for letting me know first scenario is a known issue and is being fixed.

For second scenario, I can "hold back" the output and effectively deduplicate it prior to sending it to the "console" within my script, so I have a workaround for that case.

@peternewman
Copy link

1 annotation created in 2+ jobs: If the same annotation is created in multiple jobs, we show it multiple times. This ticket was originally about this case. This can be valuable in some areas, for example knowing exactly what versions of a dependency your build failed on for a matrix build. It can also be detrimental, as it does create some noise for users. If you have thoughts on that, please post them in the community forum, our engineers monitor that forum and we'd love to discuss it more!

@silverwind suggested a presumably pretty trivial way to fix this problem (from a display perspective) which allows you to see "exactly what versions of a dependency your build failed on for a matrix build" without the downside of "noise for users":
#264 (comment)

I think there's also a third area of duplication too @thboop:
1 annotation created in 1 job in 2+ runs: If your linter runs once per run, but you are configured to do runs against all branches and PRs and you do a PR from a feature branch to master, that results in two runs and therefore two annotations, which I think double up, although I don't have an example case currently due to the other oddity of unchanged files meaning some annotations only showing up in some circumstances.

@JustinGrote
Copy link

JustinGrote commented Jun 24, 2020

FYI Scenario 1 appears to be fixed at least for me per #504 (comment), annotations are no longer displaying twice for the same annotation line.

@jbytheway
Copy link

Just to add another situation where this can arise: I'm experimenting with a problem matcher for clang-tidy (C++ static analysis tool) and if there is an issue in a header file it triggers an annotation for every time that header file is included, which can be dozens of times. That's an example of the second scenario discussed above (multiple identical annotations in a single job).

@peternewman
Copy link

if there is an issue in a header file it triggers an annotation for every time that header file is included

For something like this, assuming the log lines are standalone and not multi-line, you could do something fairly quick and dirty like piping the output into sort and then uniq to avoid the duplicates, and then annotating from the results, but similar behaviour could be achieved in the runner as mentioned which would fix it for everyone without producing such confusing log lines.

@jbytheway
Copy link

The full error messages are multiline, although I'm only matching a single-line part of them, so I could tee the output somewhere and then grep, sort, uniq that other thing. That could work, but would leave fragmented errors at the end of the output log, which is not ideal.

@peternewman
Copy link

The full error messages are multiline, although I'm only matching a single-line part of them, so I could tee the output somewhere and then grep, sort, uniq that other thing. That could work, but would leave fragmented errors at the end of the output log, which is not ideal.

Perhaps the trick is to tee, then from that grep only the lines that match the matcher and do the other processing. Which avoids the fragmented errors bit. Or better yet, you could do some sed/awk/perl magic to unwrap those lines before you feed them in, which also means the full error is in the annotation.

It still feels like a workaround for the runner doing it though.

@kobalicek
Copy link

kobalicek commented Nov 7, 2020

Any ETA for this feature?

The current support is not enough for serious projects, because:

  • Imagine you have a C++ project with 2000 .cpp files, and there is a problem in a header file, included by all of these .cpp files - you would get 2000 annotations that is just a single issue in the end.
  • Imagine you have a big build matrix, like 50 builds testing various OS and compiler combinations, in a big project where there is 2000 .cpp files - you would get 100000 annotations instead of just 1.

@Plasma
Copy link

Plasma commented May 9, 2022

I'm still seeing this issue

@ruvceskistefan
Copy link
Contributor

Hey all,

Since this issue does not seem to be a problem with the runner application, it concerns the GitHub actions platform more generally. Could you please post your feedback on the GitHub Community Support Forum which is actively monitored. Using the forum ensures that we route your problem to the correct team. 😃

@piegamesde
Copy link

There already is such a post in the forum FWIW: https://github.com/orgs/community/discussions/16661

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Service Bug Bug fix scope to the pipelines service and launch app
Projects
None yet
Development

No branches or pull requests