-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement reading from null safe dereferences #21239
Conversation
+ "return a.missing_length", true)); | ||
|
||
// Writes, all unsupported at this point | ||
// assertEquals(null, exec("org.elasticsearch.painless.FeatureTest a = null; return a?.x")); // Read field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I toyed with this but it is fairly complicated to implement and I don't think it is worth doing in this go. I worked fairly hard on writing all these test cases so I kind of wanted to keep them around for a bit. If we decide later on there is no chance we'll implement writing to null safe dereferences then we can just pitch these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely nice to add this at a later time since even with the docs I bet people will run into this and be confused. But, yes, for another PR :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The types make it fairly confusing!
@@ -30,6 +30,9 @@ | |||
public final class AnalyzerCaster { | |||
|
|||
public static Cast getLegalCast(Location location, Type actual, Type expected, boolean explicit, boolean internal) { | |||
if (actual == null || expected == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this because I hit an NPE deep in here and I figured I'd explicitly check it so we could give a nicer warning. Seeing this bug while compiling is always a bug on our part but at least we'll have more information than "NPE".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a test case that triggers this? because it's definitely worth investigating. Since it's an unexpected case maybe Objects.requireNonNull makes more sense to keep the code a bit cleaner? I could go either way since you're trying to add a better error message, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figure if I end up scratching my head for a while on an NPE it is worth being specific about the message.
The tests that I added about writing to a map triggered this. It is why I have a guard on the cast
call here. I'm spent a while digging but it looks like it is intentional that expected
can be null in some cases and I have to not try and cast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, makes sense to me.
@@ -156,11 +156,11 @@ postdot | |||
; | |||
|
|||
callinvoke | |||
: DOT DOTID arguments | |||
: nullSafe=COND? DOT DOTID arguments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about doing this in the lexer instead but went this way because I figured they were about equal. If you prefer lexer I'm certainly fine moving it.
@jdconrad I've made a gift for you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nik9000 Really like this change! Thanks for working on this. Left a few comments, I think all of which are fairly minor.
+ "return a.missing_length", true)); | ||
|
||
// Writes, all unsupported at this point | ||
// assertEquals(null, exec("org.elasticsearch.painless.FeatureTest a = null; return a?.x")); // Read field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely nice to add this at a later time since even with the docs I bet people will run into this and be confused. But, yes, for another PR :)
@@ -156,11 +156,11 @@ postdot | |||
; | |||
|
|||
callinvoke | |||
: DOT DOTID arguments | |||
: nullSafe=COND? DOT DOTID arguments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way we've been doing this is simply to check to see if the Token is null, so the nullSafe is necessary. You can just do ctx.COND() != null
in the Walker. This keeps the grammar a bit cleaner IMO (unless I missed something?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did it this way to make it clear that it was a thing I was going to check and because it looks like COND()
would invoke getToken
which looks surprisingly non-trivial. I figured this way antlr saves a copy for me and I can just check it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I strongly prefer a comment over a new variable. This keeps consistency with everything else in the Walker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll switch it to ctx.COND() != null
then
; | ||
|
||
fieldaccess | ||
: DOT ( DOTID | DOTINTEGER ) | ||
: nullSafe=COND? DOT ( DOTID | DOTINTEGER ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above.
@@ -30,6 +30,9 @@ | |||
public final class AnalyzerCaster { | |||
|
|||
public static Cast getLegalCast(Location location, Type actual, Type expected, boolean explicit, boolean internal) { | |||
if (actual == null || expected == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a test case that triggers this? because it's definitely worth investigating. Since it's an unexpected case maybe Objects.requireNonNull makes more sense to keep the code a bit cleaner? I could go either way since you're trying to add a better error message, though.
@@ -56,7 +56,11 @@ void analyze(Locals locals) { | |||
throw createError(new IllegalArgumentException("Cannot write to read-only field [length] for an array.")); | |||
} | |||
|
|||
actual = Definition.INT_TYPE; | |||
if (internal) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... when does this happen? Did I miss a test when I browsed through them quickly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind. I see this supports specifically the elvis operator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I haven't got elvis yet, I'll work on that after this is in the next time I get a chance. I use this to support array?.length
which can be null so the types have to agree.
It does lead me to wonder if, after we have the elvis operator it'd be worth optimizing that "chain of ?. into ?:" case so we don't need the boxing and casting. But I think that is a thing for another time.
|
||
@Override | ||
void analyze(Locals locals) { | ||
guarded.expected = expected; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking about this a bit more in depth, I'm not 100% sure this is going to work as expected. Let me get back to you on casting issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is certainly a bit fidgety and you know it better than I do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still need to think about this some more, but let me try to explain a bit better the implications of each variable being used here...
expected -- This can definitely be null. I tried to write an explanation in the package-info. The gist of it is that it will be null in cases where promotion is required since we can't determine what expected is until after knowing both the right-hand and left-hand types (I think we want this to always be null here actually, explained later)
explicit -- This is set by an explicit cast and indicates that certain def cases cannot be optimized if it's true. (Since expected will always be null this doesn't need to be set)
internal -- This should not be set here as it's really only used for arguments being passed into a function call where arguments may need to be boxed beyond the user's ability to do so. Take a look at PSubCall or PSubDefCall for an example of this. (Auto boxing will only work if expected and actual are set to the appropriate types which we cannot necessarily know.)
actual -- The type returned from the guarded node. (Set correctly after analysis)
I don't think you ever want to cast here because of the way this node is a wrapper or middle man, so you are sort of double casting since the parent node would do the same cast in theory. I hope that kind of makes sense. For the case of boxing, I think this needs to be specialized for this node, so in analysis if you see the actual type is a primitive you want to make actual the boxed type, but track that you did that with a boolean in the node. Then in writing make sure if you did box, you have to do it there too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I should:
- Stop using
internal
entirely. - Manually check and promote the type during analysis.
- Emit the appropriate boxing code manually, probably just a call to
writer.box
in the right spot.
Is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure you got it. Happy to zoom if necessary. Sorry for the confusion here. I myself am trying to ensure I have this correct by doing some examples on good old paper.
I will try to beef up the documentation as I see the variable explanations just aren't thorough enough when I'm back working on improvements to Painless again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try and put together a fix in a bit and we can go from there. I've got a few things to work through before I get to it so it'll probably be tomorrow before I look at it again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, sounds good.
assertNull( exec("String a = null; return a?.toString()")); // Call | ||
assertNull( exec("String a = null; return a?.length()")); // Call and box | ||
assertEquals("foo", exec("String a = 'foo'; return a?.toString()")); // Call | ||
assertEquals(Integer.valueOf(3), exec("String a = 'foo'; return a?.length()")); // Call and box |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I swear I'm missing something. I thought that top-level variables wouldn't be working with this based on the grammar changes I see...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That internal flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh never mind. I misread the grammar. The elvis operator applies to the call, not the var. Sorry, trying to think about too many casts at once :)
@Override | ||
void setup(MethodWriter writer, Globals globals) { | ||
guarded.setup(writer, globals); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since setup is only used during writing and writing is illegal, this can also be the same create error as store.
} | ||
|
||
@Override | ||
void load(MethodWriter writer, Globals globals) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as for setup.
} | ||
|
||
@Override | ||
void analyze(Locals locals) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a lot closer to what I would think this node does.
'write' doesn't need to be set here since the default is false and you have the check for it
'read' is correct
'internal' should be false here for the same reason as explained above, but it's false by default so need to set it
'expected' should be null here always as set by default
'explicit' should be false here always as set by default
'actual' is correct again
I think the same strategy as explained in the previous comment is required here. You do the box in write, but analysis won't be correct here because it needs to know that type has been boxed for the rest of the chain to be correct.
|
||
|
||
@Override | ||
int accessElementCount() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as setup.
} | ||
|
||
@Override | ||
boolean isDefOptimized() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as setup.
} | ||
|
||
@Override | ||
void updateActual(Type actual) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as setup.
@nik9000 Sorry for the confusion related to the AExpression variables once again. I know the casting for variable/field/methods is kind of wonky, but it was difficult to come up with a model that was both compatible for natural expressions like binary operators and variables. It's especially difficult to think through all the uses of a wrapper node in a case like this. |
@jdconrad I pushed some commits that I think address all of your comments. |
@nik9000 I might be missing something, but it looks like the analysis methods in PSubNullSafeField and PSubNullSafeCallInvoke didn't change? |
I reread this comment and made more of the changes you asked for. I'd internalized the "setting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nik9000 This is good. I'll continue to think about it because I'm not sure I've thought through all the scenarios, but if there're bugs we can fix them as they come up. Thanks!
Null safe dereferences make handling null or missing values shorter. Compare without: ``` if (ctx._source.missing != null && ctx._source.missing.foo != null) { ctx._source.foo_length = ctx.source.missing.foo.length() } ``` To with: ``` Integer length = ctx._source.missing?.foo?.length(); if (length != null) { ctx._source.foo_length = length } ``` Combining this with the as of yet unimplemented elvis operator allows for very concise defaults for nulls: ``` ctx._source.foo_length = ctx._source.missing?.foo?.length() ?: 0; ``` Since you have to start somewhere, we started with null safe dereferenes. Anyway, this is a feature borrowed from groovy. Groovy allows writing to null values like: ``` def v = null v?.field = 'cat' ``` And the writes are simply ignored. Painless doesn't support this at this point because it'd be complex to implement and maybe not all that useful. There is no runtime cost for this feature if it is not used. When it is used we implement it fairly efficiently, adding a jump rather than a temporary variable. This should also work fairly well with doc values.
bebb61b
to
d03b8e4
Compare
Null safe dereferences make handling null or missing values shorter.
Compare without:
To with:
Combining this with the as of yet unimplemented elvis operator allows
for very concise defaults for nulls:
Since you have to start somewhere, we started with null safe dereferenes.
Anyway, this is a feature borrowed from groovy. Groovy allows writing to
null values like:
And the writes are simply ignored. Painless doesn't support this at this
point because it'd be complex to implement and maybe not all that useful.
There is no runtime cost for this feature if it is not used. When it is
used we implement it fairly efficiently, adding a jump rather than a
temporary variable.
This should also work fairly well with doc values.