-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inject RegexGeneratorAttribute from the generator #64229
Conversation
In .NET 7 we added the RegexGeneratorAttribute that the new source generator keys off of. This is a bit problematic, though, as it is with our other source generators, as it makes the attribute available even when the generator isn't in use, turning the attribute into a nop. For example, the attribute shows up for consumption in both F# and Visual Basic, but is meaningless there, and we can't run analyzers in all of the places it might possibly be an issue. Even in C# it's possible to disable the generator, again at which point the attribute is meaningless and potentially misleading. An alternative solution is to have the generator inject the attribute into the compilation. That way, the attribute by definition is only available when the generator is being used. There were three problems previously identified with this approach. First, it increases the size of the assembly, but a) that's the case for other attributes the compiler itself injects into every compilation, e.g. for nullable, and b) as long as the attribute goes unreferenced it can be removed by the timmer. Second, it needs to be referenced in order to apply it to members for the generator to see, and thus inhibits trimming, but that's addressable by making the attribute Conditional on something no one will ever use unless they really want it to stick around. And third, there's an issue of InternalsVisibleTo, where two assemblies each using the generator and one with IVT to the other can hit CS0436 about duplicate type definitions, but a) there's an open proposal in the language to fix that, and b) until such time the warning can be suppressed with a diagnostic suppressor built into the generator. This PR does that.
Note regarding the This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change. |
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions Issue DetailsIn .NET 7 we added the RegexGeneratorAttribute that the new source generator keys off of. This is problematic, though, as it is with our other source generators, as it makes the attribute available even when the generator isn't in use, turning the attribute into a nop. For example, the attribute shows up for consumption in both F# and Visual Basic, but is meaningless there, and we can't run analyzers in all of the places it might possibly be an issue. Even in C# it's possible to disable the generator, again at which point the attribute is meaningless and potentially misleading. An alternative solution is to have the generator inject the attribute into the compilation. That way, the attribute by definition is only available when the generator is being used. There were three problems previously identified with this approach. First, it increases the size of the assembly, but a) that's the case for other attributes the compiler itself injects into every compilation, e.g. for nullable, and b) as long as the attribute goes unreferenced it can be removed by the timmer. Second, it needs to be referenced in order to apply it to members for the generator to see, and thus inhibits trimming, but that's addressable by making the attribute Conditional on something no one will ever use unless they really want it to stick around. And third, there's an issue of InternalsVisibleTo, where two assemblies each using the generator and one with IVT to the other can hit CS0436 about duplicate type definitions, but a) there's an open proposal in the language to fix that, and b) until such time the warning can be suppressed with a diagnostic suppressor built into the generator. This PR does that. @ericstj, @eerhardt, I'd like your opinions here. While we can't change how we did this for the generators shipped in .NET 6, my hope is we could use this model moving forward, including for GeneratedDllImport (cc: @jkoritzinsky, @AaronRobinsonMSFT).
|
.... Does this mean, for my work in #62325, that I can remove the modifications to |
No, this doesn't affect the referencing of the generator. This is purely about where RegexGeneratorAttribute comes from. |
IIRC we considered it a non-starter because the user would have to make a change in their own code. It sounds like you're saying that's not the case if you use a diagnostic suppressor. I don't think this was suggested previously. It does seem like that would solve our concern with this if it's transparent to the user. It's not fool-proof: suppose assembly A uses the source generator, B tries to but doesn't include it. A has IVT to B. B can still see the attribute without having the generator, so it would still be better if we had a way to hide the internal attribute definition completely, but I think this approach is better than exposing in the reference assemblies.
We could consider a breaking change if we feel strongly about this being a better solution. |
I'm not a fan of having the attribute injected by the generator. It makes discovery very difficult in my opinion. I get the other language issue but I think we have that in a few other types as well - I've also not been happy with the IntelliSense experience around this so. This is largely back to discoverability, but I personally don't like it. I don't see why the linker can't trim these out of the way dynamically if they are costly or extra metadata. Not saying we shouldn't do this for some generators, but the DllImport source generator is about replacing a built-in concept so I'd argue it should also be built-in. I guess this is predicated on how one is shipping the generator though. /cc @elinor-fung |
How so? From what I've seen, the experience is identical from an IntelliSense perspective, assuming you do the injection correctly. (And if it's not, that's something to be fixed by Roslyn.) This is with it injected in RegisterPostInitializationOutput: Only difference I see is when I click Go To Definition, it takes me to the injected definition: |
It's one thing for something to not be usable because it's not relevant. It's much different to reach for something, expect it to work, and it doesn't.
There are orders of magnitude more uses of DllImport than there are of UnmanagedCallersOnly. If someone tried applying DllImport in F# or VB and it was simply a nop, that would be a stop-the-presses bug. |
I agree. But built-in doesn't mean "it must be in corelib". The implementation is supplied by a particular component and the attribute is worse than meaningless without it: why shouldn't the definition of the attribute, which is only meaningful with that component, also be supplied by that component? If the generator is part of the SDK, then the attribute is as well, supplied by the generator in the SDK. I don't understand the "built-in" argument here. |
I think having the DllImport generator inject its attribute like this would be reasonable - @AaronRobinsonMSFT, @jkoritzinsky, and I chatted offline and are in agreement. Main questions would be around process/convention. How would we want to document APIs injected by source generators like this? Do we create some standard note for APIs are not actually in runtime libraries and generated only for certain languages or do we have some new section for source-generated APIs? We also wouldn't have the existing triple-slash xml to auto-generate mechanism. For API proposals, should there be some convention like the source-generator tag (or another one) indicates the API will be injected and not part of the runtime libraries or just have it called out in description? |
Cool
Good question. We should figure out some approach to it (but not block on it). @terrajobst, @jeffhandley, ideas?
I think we should review these just as API proposals. They still need to be of the same level of quality, they still have the same compatibility rules. Etc. The ship vehicle is just a bit different. |
{ | ||
private const string AttributeName = "System.Text.RegularExpressions.RegexGeneratorAttribute"; | ||
private const string SuppressedId = "CS0436"; | ||
private const string Id = "RGDS0001"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this Id need to be "registered" anywhere (like we do for analyzer IDs)? Or documented at all? Does it ever show up to users?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose we expect to fix the dotnet/csharplang#5528 issue before 7.0 at which point this suppressor won't be required any longer, so that potentially means we don't need to document it, but that is just what I'm understanding.
{ | ||
/// <summary>Instructs the System.Text.RegularExpressions source generator to generate an implementation of the specified regular expression.</summary> | ||
/// <remarks>The generator associated with this attribute only supports C#. It only supplies an implementation when applied to static, partial, parameterless, non-generic methods that are typed to return <see cref=""Regex""/>.</remarks> | ||
[Conditional(""INCLUDE_SYSTEM_TEXT_REGULAREXPRESSIONS_REGEXGENERATORATTRIBUTE"")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If someone wanted to keep the attributes in the assembly, for say auditing purposes, the intention is they would set this define in their .csproj, right?
|
||
private const string RegexGeneratorAttributeSource = | ||
@"#nullable enable | ||
using global::System.Diagnostics; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: how come we are using global::
to qualify usings but not the rest of the System.* types below? I suppose we would need to also use global bellow in case people are for whatever reason declaring types that clash with some of ours?
/// <remarks>The generator associated with this attribute only supports C#. It only supplies an implementation when applied to static, partial, parameterless, non-generic methods that are typed to return <see cref=""Regex""/>.</remarks> | ||
[Conditional(""INCLUDE_SYSTEM_TEXT_REGULAREXPRESSIONS_REGEXGENERATORATTRIBUTE"")] | ||
[AttributeUsage(AttributeTargets.Method, AllowMultiple = false, Inherited = false)] | ||
internal sealed class RegexGeneratorAttribute : Attribute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose this is a completely separate discussion, so this PR shouldn't block by it, but what are our thoughts around the documentation of the APIs that ship as part of an analyzer? For example, have we thought how to ensure the APIs are documented in docs.microsoft.com? cc. @carlossanlop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a small comment but this looks good otherwise. I suppose that there is absolutely no good way of adding tests for the code itself anymore? I really can't think of any other than perhaps keeping an internal copy of that string-code as part of the test assembly in order to run tests against it and we update that internal copy every time we change the string, but that is not maintainable at all and the existing test was really only checking ctor roundtriping.
Having the generator generate the attribute if it doesn't exist in the BCL seems fine, but this raises the question what kicks off the generator then? Normally I would have thought that the user needs to apply this attribute somewhere for the generator to kick in. I don't think we want to always generate the attribute if the generator is part of the compilation (because that pattern doesn't work well for auto-referenced generators). |
Can you elaborate on the concern? You'll end up with at most one internal attribute per assembly, and it'll be trivially trimmable by the trimmer. It's very similar in that vein to the attributes the compiler injects, e.g. NullableAttribute, NullableContextAttribute, etc.
If the generator is used, e.g. if it's referenced by default as part of the .NET SDK, Roslyn gives every generator a chance to inject code upon initialization of the generator, before there's any code to inspect... that's the RegisterPostInitializationOutput mentioned earlier. |
My understanding is that the compiler only adds the attribute when nullability is enabled. It seems bad to emit attributes in user assemblies for features they don't use. If we do this for all features that require attributes over time this might result in dozens of attributes. Now imagine someone does Hello World and opens their assembly in ILSpy. It looks messy/noisy. As a platform, we want to communicate that we sweat the small stuff. From that perspective, I think aesthetics/first impressions matter. |
I can understand that. The converse is true, though, with a similar impact on first impressions: "Look, I can add this attribute to my method and it'll generate all the P/Invoke marshaling for me. Awesome. Wait, why is my code nop'ing in production... oh, I see, they added this attribute to the core libraries and it simply nops with no notice in my F# code." |
But this would be only true for people targeting older version of the platform, right? I think our general stance is that we don't consider this the happy path. |
No. Every F# and VB user in .NET 7 (though if it came to that for VB at least we could write an analyzer to tell them they can't use the shiny goodness). |
Clearly I'm missing context. Re-reading more of the above. |
Source generator ShinyGenerator keys off of a ShinyAttribute, whose only job in life is to give data to ShinyGenerator for it to provide an implementation of the method ShinyAttribute adorns. If ShinyGenerator doesn't exist but ShinyAttribute does, it's like providing a header file with no implementation (or nop implementations); you can apply the attribute, but there's nothing to fill in the implementation. And ShinyGenerator is only available for C#. My argument is that ShinyAttribute is logically part of the generator, so it should ship as part of the generator; if you don't get the generator, you don't get the attribute. (If we had source-only attributes, that would solve the concern, as it would entirely evaporate without the help of the trimmer. As it stands today, though, C# does not provide such support. The RegexGeneratorAttribute in this PR is at least adorned with a [Conditional(...)] attribute so that unless a magic incantation is uttered, the use of the attribute will evaporate at compile-time, and the linker can easily remove the unused attribute from the assembly.) |
Caught up. That's a problem. I see two other options:
The counter example for (1) is that this requires additional configuration outside the code, but I don't think this would be disastrously bad because it's a well-established pattern for other things (e.g. nullable). And most people don't need to use these features. The problem of (2) is that it requires language support to turn something off. This isn't ideal because third parties might still experience false positives, but arguably doing VB and F# would handle the vast amount of non-C# code. I have a slight preference for (2) because it allows us to ship & doc platform-provided attributes like usual. |
This is really a [RequiresCSharpAttribute]. F# could add support for source generators, and it wouldn't help unless RegexGenerator added an F# implementation, DllImportGenerator added an F# implementation, JsonSourceGenerator added an F# implementation, LoggingGenerator added an F# implementation, etc. |
In that case property in MSBuild then? If there is a concern around this being an additional hoop, we could default them to |
Is that different from just disabling the generator, which there are already gestures for doing? (Though if memory serves they're not simple properties in a csproj... maybe that could be made easier if it's not already) |
See dotnet/roslyn#55518. Today it isn't a simple MSBuild property. You'd need to make a |
(Moving back to no-merge while we continue discussing) |
My thoughts ... As long as we are using an attribute to trigger the generators then the attribute has to exist somewhere. Either in the core libraries or as source in the project being compiled. Ideas like having them appear in source and removed if they're unused is decently involved feature (and end results is the solution likely ends up with more attributes in core libraries only usable by C#). If we are going to use attributes as triggers then I think we need to accept that they will be available even if unused.
I'm pretty against any use of MSBuild properties here. Build is complicated enough already with users having dozens of properties / switches they can use to turn the same compiler features on and off. It's already a significant source of customer confusion that we have to deal with. It's also an engineering headache because we have to ensure that every layer of the build stack properly flow the property (spoiler, this breaks a lot). I do not think we want to go this route. It's going to trade what today is an extra attribute for non-trivial build complexity.
I'm not sure this is the best route. I do understand what this problem is trying to solve. Overall attributes have a number of deficiencies as generator trigger points and this is only addressing one of them. If we're going to start "fixing" attributes I wonder if we should just start considering solutions that solve more of the problems. Quick list
(1) may be unfairly top of mind for me right now due to some investigations I'm involved with. Overall though it does make me wonder if we should pick a different trigger point. One idea is what if we use Note: This does require a small change to c# to treat |
But introduces others. Where do we put the other arguments, like options and timeout? How are intellisense provided for those? How do we convey that the string is a regex for VS regex coloring to apply? Do we now need to parse all trivia? Etc. |
Indeed these are legitimate problems to consider
This happens already |
"we" as in the generator author |
My 2 cents here, even though I understand the concern of this feature only work on C#, I think overall the value of keeping this as part of the framework outweighs the cons in my mind (of course I'm a C# developer so I'm biased). Apart from what has already been discussed on the thread, keeping the source as part of the framework adds infrastructure benefits like being able to test the attribute, being able to document it as we document other APIs, and also being able to run APICompat on it to ensure there are no breaking changes (which granted might be rare in an attribute) and to view it as part of sites like apisof.net or docs.microsoft.com |
I'm going to close this and we'll stick with including the attribute in the libraries. I don't actually think that's the right solution; I still 100% believe the right answer is that the attribute is part of the generator, such that you only can only even utter the attribute's name if the generator is actively being used. But we lack the ability to create a full-fidelity solution here based on injecting the attribute, e.g. leaving behind a copy of the attribute in every single untrimmed .NET assembly is untenable. |
In .NET 7 we added the RegexGeneratorAttribute that the new source generator keys off of. This is problematic, though, as it is with our other source generators, as it makes the attribute available even when the generator isn't in use, turning the attribute into a nop. For example, the attribute shows up for consumption in both F# and Visual Basic, but is meaningless there, and we can't run analyzers in all of the places it might possibly be an issue. Even in C# it's possible to disable the generator, again at which point the attribute is meaningless and potentially misleading.
An alternative solution is to have the generator inject the attribute into the compilation. That way, the attribute by definition is only available when the generator is being used. There were three problems previously identified with this approach. First, it increases the size of the assembly, but a) that's the case for other attributes the compiler itself injects into every compilation, e.g. for nullable, and b) as long as the attribute goes unreferenced it can be removed by the trimmer. Second, it needs to be referenced in order to apply it to members for the generator to see, and thus inhibits trimming, but that's addressable by making the attribute Conditional on something no one will ever use unless they really want it to stick around. And third, there's an issue of InternalsVisibleTo, where two assemblies each using the generator and one with IVT to the other can hit CS0436 about duplicate type definitions, but a) there's an open proposal in the language to fix that, and b) until such time the warning can be suppressed with a diagnostic suppressor built into the generator.
This PR does that.
@ericstj, @eerhardt, I'd like your opinions here. While we can't change how we did this for the generators shipped in .NET 6, my hope is we could use this model moving forward, including for GeneratedDllImport (cc: @jkoritzinsky, @AaronRobinsonMSFT).
cc: @jaredpar
Fixes #63551