Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

S2755 Improve performance: XmlExternalEntityShouldNotBeParsed #4483

Open
pfeigl opened this issue May 28, 2021 · 5 comments
Open

S2755 Improve performance: XmlExternalEntityShouldNotBeParsed #4483

pfeigl opened this issue May 28, 2021 · 5 comments
Labels
Area: C# C# rules related issues. Type: Performance It takes too long.

Comments

@pfeigl
Copy link

pfeigl commented May 28, 2021

Description

We have a single assembly which takes very long when beeing analyzed. /p:reportanalyzer=true shows the following result

Total analyzer execution time: 15420.471 seconds. (TaskId:263)
NOTE: Elapsed time may be less than analyzer execution time because analyzers can run concurrently. (TaskId:263)
Time (s)    %   Analyzer (TaskId:263)
15420.471  100   SonarAnalyzer.CSharp, Version=8.23.0.0, Culture=neutral, PublicKeyToken=c5b62af9de6d7244 (TaskId:263)
8089.488   52      SonarAnalyzer.Rules.CSharp.XmlExternalEntityShouldNotBeParsed (TaskId:263)
6958.701   45      SonarAnalyzer.Rules.CSharp.PrivateFieldUsedAsLocalVariable (TaskId:263)
355.733    2      SonarAnalyzer.Rules.CSharp.HardcodedIpAddress (TaskId:263)
2.563   <1      SonarAnalyzer.Rules.SymbolicExecution.SymbolicExecutionRunner (TaskId:263)
1.689   <1      SonarAnalyzer.Rules.CSharp.CreatingHashAlgorithms (TaskId:263)
...

Repro steps

For now I'm unable to share the source code :-( Some things that might stand out and give you some hints:

  • The project contains XML and XSD files as embedded resources
  • However we actually exclude them from sonar completely by using <Property Name="sonar.exclusions">**/*.xml,**/*.xsd</Property>
  • One of the files has a very big Dictionary (around 4k lines), so we suspected a connection to Reduce the number of symbols retrieved by SymbolReferenceAnalyzer #4220. While this might also have been a problem (we already applied 8.23 with the respective fix), it doesn't seem to be the root problem

Known workarounds

We didn't find one (except ofcourse excluding the whole project, but that shouldn't be an option ;-) )

Related information

  • 8.23 (but we started off with 8.19)
  • Visual Studio version: Only msbuild tools, no VS installation on build machines
  • MSBuild / dotnet version: Microsoft (R) Build Engine version 16.8.2+25e4d540b for .NET Framework
  • SonarScanner for .NET version (if used): 5.21
  • Operating System: Windows Server 2016
@costin-zaharia-sonarsource
Copy link
Member

Hi @pfeigl,

thanks for your feedback.

As far as I can tell this issue and #4220 have the same root cause: we get symbols for all the identifiers and this is expensive. We can certainty improve in this regard.

Depends on the fix but most probably, fixing #4220 will not fix this issue with XmlExternalEntityShouldNotBeParsed or PrivateFieldUsedAsLocalVariable so I will let this issue open.

Regarding the fix done in 8.23, it is only related to the utility analyzers and this explains why the problem is still reproducible. Once question though, did the analysis time improved for your project between previous version and 8.23?

@pavel-mikula-sonarsource
Copy link
Contributor

Hi @pfeigl,

Can you provide a small sample of the file with many dictionary entries? With just one or two lines of the dictionary and total number of lines? You can change all type names, all constants to 42, string literals to "something" and so on. We just need to see anonymized form of the syntax you use, while keeping all syntax elements (number of keywords, braces, ...).

@pfeigl
Copy link
Author

pfeigl commented May 31, 2021

Thanks for the feedback. We also suspected that the dictionary might still be somehow connected.

We meanwhile changed the huge dictionary to IList<string> and Lazy-converted this into a static dictionary field. Not the best of solutions, because we now have runtime code were we previously had a compiled dictionary, but well - with ~4k entries and not many iterations on our side, not too big of a deal.
This also fixed the build problems.

To answer the open question: Yes, integrating 8.23 actually brought the build times down from ~3h to ~2h.

Also find a anaonymized and shortened version of the file in question below (in reality its 4200 lines):

using System.Collections.Generic;
using System.Linq;

public class InformationContainer
{
    public InformationContainer(string identifier, int x, int y)
    {
        Identifier = identifier;
        X = x;
        Y = y;
    }

    public string Identifier { get; }
    public int X { get; }
    public int Y { get; }
}

public static class LongDictionaryContainer
{
    public static IDictionary<string, IList<InformationContainer>> LongDictionary = new Dictionary<string, IList<InformationContainer>>
    {
        {"1", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"2", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"3", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"4", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"5", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,1) } },
        {"6", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"7", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,1), new InformationContainer("Lorem Ipsum", 0,0) } },
        {"8", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"9", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"10", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,1), new InformationContainer("Lorem Ipsum", 0,0) } },
        {"11", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"12", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"13", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,1), new InformationContainer("Lorem Ipsum", 0,0) } },
        {"14", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"15", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"16", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"17", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"18", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"19", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"20", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,1), new InformationContainer("Lorem Ipsum", 0,0) } },
        {"21", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"22", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"23", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,1), new InformationContainer("Lorem Ipsum", 0,0) } },
        {"24", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,1), new InformationContainer("Lorem Ipsum", 0,0) } },
        {"25", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"26", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } },
        {"27", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,1), new InformationContainer("Lorem Ipsum", 0,0) } },
        {"28", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,1), new InformationContainer("Lorem Ipsum", 0,0) } },
        {"29", new List<InformationContainer> { new InformationContainer("Lorem Ipsum", 0,0) } }
    };
}

Thanks for a great product and the rapid help!

One final question from our side: We actually excluded the whole file using sonar.exclusions and this didn't help at all. Does this mean, that analyzers even run for excluded files and that they are just not reported?

If yes, is there a way to exclude certain analyzers for a specific file?

@costin-zaharia-sonarsource
Copy link
Member

costin-zaharia-sonarsource commented May 31, 2021

One final question from our side: We actually excluded the whole file using sonar.excludes and this didn't help at all. Does this mean, that analyzers even run for excluded files and that they are just not reported?

If yes, is there a way to exclude certain analyzers for a specific file?

Currently sonar.exclusions parameter is applied after the analysis is done. This should explain why the time does not decrease.

We are currently considering different options for applying the exclusions at the analysis time but all seem to have different downsides. One of them being a performance hit for checking if the file name of the syntax tree matches the exclusions pattern each time a Roslyn action is triggered :)

The specification is currently in progress.

One work around for this is to use .editorconfig for configuring the analyzers like here: https://github.com/dotnet/roslyn/blob/23d354127e9293a2a02d59bae81a0b309a02ea4a/.editorconfig#L275
but this is cumbersome since the configuration is done per rule (unless you are using the latest version of visual studio).

@costin-zaharia-sonarsource costin-zaharia-sonarsource added Area: C# C# rules related issues. Type: Performance It takes too long. labels Jun 7, 2021
@pavel-mikula-sonarsource pavel-mikula-sonarsource changed the title Analyzer taking very long (not TokenTypeAnalyzer) S2755 Improve performance: XmlExternalEntityShouldNotBeParsed Jun 21, 2021
@pavel-mikula-sonarsource
Copy link
Contributor

I've renamed this to address S1450 XmlExternalEntityShouldNotBeParsed and created #4589 to address S1450 PrivateFieldUsedAsLocalVariable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: C# C# rules related issues. Type: Performance It takes too long.
Projects
None yet
Development

No branches or pull requests

3 participants