Performance issues when loading include workflows #1680
After further investigation, it turns out the cost is borne out of an interaction between partial loading of include workflows and the creation of multiple serializers.
The reason why multiple serializers are created is that Type registration is batched so that all types in the workflow are collected at once, but because include workflow types are only discovered when the included files are actually loaded, each batch of newly discovered types forces the creation of an additional serializer. This effect will be exacerbated in workflows containing multiple diverse included workflows, each referencing very different types, e.g. one workflow with 100 includes of the same file will create only 1 extra serializer, whereas 100 includes of 50 different files can potentially create up to 50 extra serializers (assuming there is at least one non-overlapping type across the 50 different files).

Possible solutions

Pre-loading of include workflows during type discovery

Each loaded workflow file is pre-scanned for extension types. Care might be required to avoid excessive recursive inspection of workflows when actually loading the included workflows (which will themselves be scanned recursively). One option might be to use memoization and store a table of already inspected workflows, invalidated at the end of the load call (see the sketch below). However, if the main bottleneck is file IO this might not be very important, as the file system will already cache file contents for repeated read access.

An entirely different approach would be to have all include workflows loaded recursively together with the main workflow, but this is likely to require significant refactoring of the current infrastructure. Benchmarks should be taken into consideration to decide between the different options.

Explicit pre-loading of types to XmlSerializer

Another potentially more straightforward alternative would be to expose a new API to pre-load extension types into the XmlSerializer. On the other hand, it might have the significant downside of ramping up IDE initialization latency, and it would also not resolve initialization lag when running workflows from the CLI, where no explicit scanning for types is performed.
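To make the pre-scanning and pre-loading ideas above more concrete, here is a minimal sketch, assuming include nodes and extension types can be recognized directly in the workflow XML. The element and attribute names (IncludeWorkflow, Path, xsi:type) and the IncludeTypeScanner helper are illustrative assumptions, not the actual schema or implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;
using System.Xml.Serialization;

static class IncludeTypeScanner
{
    // Memoization table of already inspected workflow files,
    // invalidated at the end of each top-level load call.
    static readonly Dictionary<string, HashSet<string>> inspected =
        new Dictionary<string, HashSet<string>>(StringComparer.OrdinalIgnoreCase);

    public static HashSet<string> ScanExtensionTypes(string path)
    {
        if (inspected.TryGetValue(path, out var cached)) return cached;

        var types = new HashSet<string>();
        inspected[path] = types; // guards against include cycles
        var document = XDocument.Load(path);
        foreach (var element in document.Descendants())
        {
            // Assumption: extension types are recorded in xsi:type attributes.
            var typeAttribute = element.Attribute(
                XName.Get("type", "http://www.w3.org/2001/XMLSchema-instance"));
            if (typeAttribute != null) types.Add(typeAttribute.Value);

            // Assumption: include nodes reference other workflow files by path.
            if (element.Name.LocalName == "IncludeWorkflow")
            {
                var includePath = (string)element.Attribute("Path");
                if (includePath != null) types.UnionWith(ScanExtensionTypes(includePath));
            }
        }
        return types;
    }

    // Hypothetical pre-loading step: resolve the collected names and build a
    // single serializer up front, instead of one per batch of late-discovered types.
    public static XmlSerializer CreateSerializer(Type rootType, string workflowPath)
    {
        var extraTypes = new List<Type>();
        foreach (var name in ScanExtensionTypes(workflowPath))
        {
            var type = Type.GetType(name, throwOnError: false);
            if (type != null) extraTypes.Add(type);
        }
        return new XmlSerializer(rootType, extraTypes.ToArray());
    }

    public static void Invalidate() => inspected.Clear();
}
```

In practice the collected names would need to be resolved against the discovered extension assemblies rather than with Type.GetType, but the overall shape is the same: gather all types reachable through includes once, pass them as extra types to a single serializer, and clear the memoization table when the load call completes.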
It turns out the above is only half of the story. There is another performance bottleneck in include workflows related to serialization of the values in externalized properties. Specifically, because we separately serialize all the values of externalized properties as XML elements inside the include workflow itself, we need to create additional serializers for these property values.
Worse, because the root of the XML element for each property depends on the property name, it is not enough to have a different serializer per value type: a new serializer is needed for every combination of property name and value type. There are two (possibly complementary) approaches to removing this bottleneck.
The first approach (normalizing the root element name so that a single serializer per value type can be reused) seems more accessible right now, since all properties are already serialized independently and there is always a single XML element wrapping each property value whose name can be rewritten before deserialization:

element.Name = XName.Get(SerializerPropertyName, element.Name.NamespaceName);
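For that first approach, a minimal sketch of a per-type serializer cache might look like the following. The fixed root name, the cache keyed on value type and namespace, and the PropertySerializerCache helper are illustrative assumptions rather than the existing implementation.

```csharp
using System;
using System.Collections.Concurrent;
using System.Xml.Linq;
using System.Xml.Serialization;

static class PropertySerializerCache
{
    const string SerializerPropertyName = "PropertyValue"; // assumed fixed root name

    // One serializer per (value type, namespace) pair, independent of property name.
    static readonly ConcurrentDictionary<(Type, string), XmlSerializer> serializers =
        new ConcurrentDictionary<(Type, string), XmlSerializer>();

    static XmlSerializer GetSerializer(Type valueType, string ns)
    {
        return serializers.GetOrAdd((valueType, ns), key =>
            new XmlSerializer(key.Item1,
                new XmlRootAttribute(SerializerPropertyName) { Namespace = key.Item2 }));
    }

    public static object Deserialize(XElement element, Type valueType)
    {
        // Copy the element and rename its property-specific root to the fixed
        // name expected by the cached serializer before deserializing.
        element = new XElement(element);
        element.Name = XName.Get(SerializerPropertyName, element.Name.NamespaceName);
        using (var reader = element.CreateReader())
        {
            return GetSerializer(valueType, element.Name.NamespaceName).Deserialize(reader);
        }
    }
}
```

With the root element renamed before deserialization, the number of serializers grows with the number of distinct value types rather than with the number of externalized property names.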
Preliminary benchmarks implementing both serializer caching approaches above (eager loading of include workflow types and removing combinatorial explosion of property serializers) indicate a possibly more than 10x performance increase in reading a complex workflow, from 30 seconds down to less than 1 second.
A growing issue as we use more and more embedded resources is a performance penalty when accessing the first embedded resource in a newly loaded assembly. This penalty seems to be paid only once when the resource is first loaded, but is paid for every single assembly with embedded resources.
The first step here would be to investigate the performance penalty further with benchmarks on both .NET Framework and .NET Core, and with different types of resources and different numbers of assemblies and resources loaded (e.g. does an assembly with 1 embedded resource pay a smaller cost than an assembly with 10 embedded resources?).
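As a starting point for that investigation, here is a rough sketch that times the first and second access to an embedded resource in each freshly loaded assembly; the assembly paths are placeholders, and the same program would be compiled and run against both .NET Framework and .NET Core to compare.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

class ResourceLoadBenchmark
{
    static void Main(string[] args)
    {
        // args: paths to assemblies with embedded resources (placeholders).
        foreach (var path in args)
        {
            var assembly = Assembly.LoadFrom(path);
            var names = assembly.GetManifestResourceNames();
            if (names.Length == 0) continue;

            // First access: expected to pay the one-time penalty.
            var first = Time(() => assembly.GetManifestResourceStream(names[0])?.Dispose());

            // Subsequent access: expected to be cheap.
            var second = Time(() => assembly.GetManifestResourceStream(names[0])?.Dispose());

            Console.WriteLine($"{path}: {names.Length} resources, first={first}ms, second={second}ms");
        }
    }

    static long Time(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        return stopwatch.ElapsedMilliseconds;
    }
}
```

Varying the number of embedded resources per assembly and the number of assemblies loaded would then show whether the penalty scales with either, or is a flat per-assembly cost.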