[Draft] Add support for nullable types in GDScript #76843

Draft: wants to merge 1 commit into base: master

joao-pedro-braz
Contributor

@joao-pedro-braz joao-pedro-braz commented May 8, 2023

Nullable Types in GDScript

This PR aims to add a feature-set to GDScript tailored towards the safe usage of potentially-null expressions.
It was initially based off godotengine/godot-proposals#162 and through several discussions over RocketChat it grew to be what it is currently. Further feedback is wholeheartedly appreciated!

TODOs

  • Make the analyzer ternary-aware
  • Catch methods returning the nullable result of other methods
  • Convert nullable errors to UNSAFE_* warnings (Won't consider for now)
  • Rewrite the way nullables are handled in the VM
  • Fix an off-by-two error when setting an OPCODE_JUMP_IF_NULL for nullable assignments
  • Add the PROPERTY_USAGE_NULLABLE to the PropertyUsageFlags
  • Merge https://github.com/godotengine/godot-cpp/pull/1110
  • Re-enable "Build godot-cpp test extension" before merging

What it is

This PR is:

  • The addition of the nullable qualifier, an optional type qualifier that indicates whether a given type is nullable (which can be interpreted as a union type of the type itself + null, like: String?, which would be a union type of String | Null). Usages of the nullable qualifier are:

    • During a class member declaration:
      @tool
      extends EditorScript
      
      var class_member: String? = null
      
      func _run():
          class_member = "123"
    • During a variable declaration:
      @tool
      extends EditorScript
      
      func _run():
          var nullable_variable: String? = "123"
    • During a function parameter declaration:
      func a_fun_function(nullable_parameter: String?):
          pass
    • During a function return declaration:
      func another_function() -> String?:
          return null
    • As the inner type of a Typed Array:
      func _run():
          var arr: Array[int?] = [1, null]
  • The inclusion of nullity safety guards on several expressions to ensure nullable types are handled in a safe manner. These safety guards were added to the following expressions:

    • Binary operators:
      func _run():
          var nullable_str: String? = null
          print(nullable_str + "123") # Unsafe, will cause a parse error (triggered by the analyzer)
          var nullable_int: int? = null
          print(nullable_int - 123) # Unsafe, will cause a parse error (triggered by the analyzer)
    • Subscripts:
      class A:
          var cool_value := "cool!"
      func _run():
          var nullable: A? = null
          print(nullable.cool_value) # Unsafe, will cause a parse error (triggered by the analyzer)
    • Calls:
      class A:
          func cool_function():
              pass
      func _run():
          var nullable: A? = null
          print(nullable.cool_function()) # Unsafe, will cause a parse error (triggered by the analyzer)
    • Assignments:
      func _run():
          var nullable_str: String? = null
          var non_nullable: String = nullable_str # Unsafe, will cause a parse error (triggered by the analyzer)
    • Unary Operators:
      func _run():
          var nullable: int? = null
          print(-nullable) # Unsafe, will cause a parse error (triggered by the analyzer)
    • For-in ranges:
      func _run():
          var nullable: int? = null
          for i in nullable: # Unsafe, will cause a parse error (triggered by the analyzer)
              pass
    • Returns:
      func a_non_null_function() -> String:
          var nullable: String? = null
          return nullable # Unsafe, will cause a parse error (triggered by the analyzer)
      func another() -> String:
          return nullable() # Also unsafe and will cause a parse error (triggered by the analyzer)
      func nullable() -> String?:
          return null
  • The implicit narrowing of nullable identifiers, which ensures that nullable identifiers can be safely handled (without safety errors) if properly checked beforehand. This currently applies to the following expressions:

    • If/Elif/Else statements:
      var foo: String? = null
      if foo:
          print(foo.begins_with("")) # Safe
      elif foo != null:
          print(foo.begins_with("")) # Safe
      elif foo == "123":
          print(foo.begins_with("")) # Safe
      elif is_same(foo, "123"):
          print(foo.begins_with("")) # Safe
      elif is_instance_of(foo, TYPE_STRING):
          print(foo.begins_with("")) # Safe
      else:
          print(foo.begins_with("")) # Not Safe
    • While statements:
      var foo: int? = null
      while foo:
          foo -= 1 # Safe
    • Match statements:
      var foo: String? = null
      match foo:
         "1", "12", "123":
             print(foo.begins_with("1")) # Safe
         null:
             print(foo.begins_with("1")) # Not Safe
         var bar:
             print(bar.begins_with("1")) # Not Safe
    • Ternaries:
      var foo: int? = null
      (foo if foo else 1) + 1 # Safe
      (foo if foo else null) + 1 # Not Safe
  • The addition of the nullable access operator (?. and ?[]), which ensures subscript accesses are safe even on null values by only accessing the right-hand attribute if the left-hand base is not null, otherwise by returning null:

    • Attribute access:
      func _run():
          var nullable_str: String? = null
          print(nullable_str?.begins_with("")) # This is safe, "begins_with" will only be called/accessed if "nullable_str" is not null 
    • Index access:
      func _run():
          var nullable_str: String? = null
          print(nullable_str?[0]) # This is safe, "nullable_str" will only be indexed if it's not null
  • The inclusion of the nullable coalescing operator (??), which allows a nullable expression to be coalesced into another expression, essentially allowing nullable expressions to have default values. Both operands must have the same type (but not necessarily the same nullity) and Resource Types (Objects) are also supported:

    class A:
        var cool_property := "123"
    func _run():
        var nullable_str: String? = null
        print((nullable_str ?? "123").begins_with("123")) # This is safe, calling "begins_with" is safe because the expression "(nullable_str ?? "123")" will always return a non-nullable value
        var nullable_instance: A? = null
        print(nullable_instance ?? A.new()) # Also safe
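
For readers who know TypeScript, the nullable access and coalescing operators described above have close counterparts there. The following is only an analogy, not the PR's implementation: it assumes `string | null` plays the role of the proposed `String?`, and the function names `beginsWith123` and `firstChar` are made up for illustration.

```typescript
// Assumption: `string | null` stands in for the proposed GDScript `String?`.

// Mirrors `(nullable_str ?? "123").begins_with("123")`.
function beginsWith123(nullableStr: string | null): boolean {
  // After `??` the expression has type `string`, so using it is safe.
  return (nullableStr ?? "123").indexOf("123") === 0;
}

// Mirrors `nullable_str?[0]`.
function firstChar(nullableStr: string | null): string | null {
  // `?.` skips the call when the base is null. TypeScript yields undefined
  // in that case, so it is normalised to null to match the description above.
  return nullableStr?.charAt(0) ?? null;
}
```

In both functions the compiler accepts the code without an explicit null check, which is the same ergonomic goal the operators in this PR aim for.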

What it is not

This PR is not about making the core null-aware or even updating the API docs to reflect instances where null might be a valid argument/return.
This PR is also not about making Reference Types (Like Objects) no longer implicitly null by default. Their behavior remains unchanged.

Next steps

Next steps for improvement include:

  • The addition of an Explicit Nullable Narrowing Syntax (ENNS for short), which would explicitly allow a nullable identifier to be narrowed into a non-nullable one. Proposed syntaxes so far include:
    • The if bar ?= foo syntax:
      var foo: int? = null
      if bar ?= foo:
          print(bar) # Bar will only be printed if foo is not null and it will be non-nullable (int)
    • The for bar in foo:
      var foo: int? = null
      for bar in foo:
          print(bar) # Bar will only be printed if foo is not null and it will be non-nullable (int)
  • Converting the nullable related errors to UNSAFE_* warnings as suggested by @dalexeev. (Will do next)
  • Even though over 50 test cases were added, there is room for more, specifically those ones covering edge cases.
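
As a rough picture of what explicit narrowing buys, here is a sketch in TypeScript. It is not the proposed syntax: the helper `ifPresent` is a hypothetical stand-in for `if bar ?= foo:`, and `number | null` is assumed to play the role of `int?`.

```typescript
// Hypothetical helper standing in for `if bar ?= foo:`; the name is made up.
function ifPresent<T, R>(value: T | null, body: (narrowed: T) => R): R | null {
  if (value === null) {
    return null; // the body is skipped entirely, like the `if` block not running
  }
  return body(value); // inside `body`, the parameter is non-nullable
}

const doubled = ifPresent<number, number>(21, (bar) => bar * 2);
const skipped = ifPresent<number, number>(null, (bar) => bar * 2);
const zeroRuns = ifPresent<number, number>(0, (bar) => bar + 1);
```

Because the check is against null only, a value such as 0 still reaches the body, which a plain truthiness test would not allow.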

Closes godotengine/godot-proposals#162
Closes godotengine/godot-proposals#1321

@joao-pedro-braz joao-pedro-braz requested review from a team as code owners May 8, 2023 13:23
@joao-pedro-braz joao-pedro-braz force-pushed the implement_nullable_types branch 3 times, most recently from 118ab45 to 7a9bbfd Compare May 8, 2023 13:39
@akien-mga akien-mga added this to the 4.x milestone May 8, 2023
@joao-pedro-braz
Contributor Author

Codespell keeps complaining about:
image
but "Remaning" doesn't seem to even be present in the code base at all. Suggestions?

@akien-mga
Member

> Codespell keeps complaining about: image but "Remaning" doesn't seem to even be present in the code base at all. Suggestions?

Will be fixed by #76842 in a few minutes (rebasing this PR now on master should also fix it, but if you wait for that PR it's cleaner).

@adamscott adamscott marked this pull request as draft May 8, 2023 13:45
@adamscott
Member

I set the PR to draft, as discussed in the chat!

@RandomShaper
Member

This is amazing.

Could you clarify what's the result of a nullable access operator when the variable is null? Null, I guess, but just in case.

@joao-pedro-braz
Contributor Author

joao-pedro-braz commented May 8, 2023

> This is amazing.
>
> Could you clarify what's the result of a nullable access operator when the variable is null? Null, I guess, but just in case.

@RandomShaper Yep, null!
Which allows it to be combined with the nullable coalesce operator for a basically infallible attribute access:

class B:
    var cool_property := "inner"
class A:
    var b: B?
func _run():
    var a: A? = null
    # Try to get "cool_property" or return a default value otherwise
    print(a?.b?.cool_property ?? "default")

Actually running it:
image

@joao-pedro-braz joao-pedro-braz force-pushed the implement_nullable_types branch from 7a9bbfd to 7fa3481 Compare May 8, 2023 14:54
@anvilfolk
Contributor

anvilfolk commented May 8, 2023

Absolutely amazing PR description like I said in the chat! 🔥🔥🔥

I haven't been able to follow the dev chat fully lately, so this might have already been discussed there. I do think it bears having that discussion in this more public way though, so here we go! :)

As I've said in the chat, I personally generally dislike the implicit narrowing that happens as a side-effect of using control flow statements like if and while. I find that to be a little too subtle for a language that is meant to be straightforward and user-friendly :) I'd much rather have explicit narrowing, as in the handling of union types in functional programming languages (which could help set the stage for union type handling in GDScript later down the line). So something like:

var x : String? = null
case x:
  null: # Return / do error handling / perform default behavior?
  String: # use x as if it was a non-nullable string

One-liner variants of this could also work, e.g. case x: # assume x is not null or casen x: # assume x is null. This approach has the advantage of keeping nullability statements and control flow statements clearly separated.

The if foo syntax feels particularly unsafe due to many non-nullable variants having implicit boolean conversions:

	var x : String? = ""
	
	if x: # Cannot distinguish between x == "" or x == null
		print("not empty")
	else:
		print("empty")
	
	var y : int? = 10
	while y: # Cannot distinguish from y == null or y == 0
		y = y - 1
		# might do something here that results in y == null
		print(y)

And then you have to have expression analysis be aware that the expression, depending on context, may contain implicit narrowing syntax. For example, what happens with the example below? Somewhere in the middle of it there's a lone y or y != null that needs to be parsed and understood as implicit narrowing syntax in the middle of what is otherwise a normal boolean expression.

	var y : int? = 10
	while y > 0 and y*x < x + 4 and y != null and other_weird_function(y):
		do_stuff_with_y_that_must_be_non_null(y)

Doesn't that result in code that's harder to maintain compared to having a statement that explicitly handles nullability (and nothing else like boolean/booleanable expressions)?

I'm worried that newer folks coming into Godot/programming for the first time are gonna get either bit or just very confused by these subtle side-effects of program control flow statements.

Maybe it's just me though! And if the consensus is that there are no issues with user friendliness and code-maintainability, then I'm happy :)

@joao-pedro-braz
Contributor Author

joao-pedro-braz commented May 8, 2023

Definitely a valid concern @anvilfolk!

I think there are two points I feel strongly about on the implicit narrowing subject:

  1. Other languages do it that way:
    Of course this by itself isn't a good enough reason for anything and I acknowledge that. Having said that... it goes to show how that isn't necessarily "confusion material" for newcomers. Languages like Typescript make use of implicit narrowing pretty much in the same way as this PR does:
    image

  2. At what point does it stop being our fault and become the user's?
    Kinda weird to say that, but so far the only examples of why the implicit narrowing might be confusing involve either functions with side effects or asynchronous functions with side effects. Both of which are prone to problems by themselves (side effects = bad if not purposeful). Shouldn't we then rather focus on letting the user know their function has side effects (by adding an UNSAFE_* warning in another PR, for instance) and therefore might behave unexpectedly?

Just my two cents.
It's also safe to add that explicit and implicit narrowing are complementary; we could have both if we agree on a particular syntax.

@joao-pedro-braz
Contributor Author

@anvilfolk Forgot to address the if x: # Cannot distinguish between x == "" or x == null
but I think that's a no brainer, if the user cares for all non-null values of x they could just do if x != null:.
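
The difference between the two checks discussed in this exchange can be seen in TypeScript, which narrows in a similar way. This is an analogy only: `string | null` is assumed to stand in for `String?`, and both function names are invented for the example.

```typescript
// Assumption: `string | null` stands in for `String?`.

// Truthiness check: narrows, but cannot tell "" apart from null.
function describeTruthy(x: string | null): string {
  if (x) {
    return "non-empty string"; // x is narrowed to string here
  }
  return "empty or null"; // "" and null both land here
}

// Explicit null check: every non-null value, including "", is narrowed.
function describeExplicit(x: string | null): string {
  if (x !== null) {
    return x.length === 0 ? "empty string" : "non-empty string";
  }
  return "null";
}
```

Both forms narrow safely; they simply answer different questions, which is why the explicit comparison is the one to reach for when the empty value matters.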

@AThousandShips
Member

You need to sync GDExtension with these changes I believe:

GDEXTENSION_VARIANT_OP_IN,

@joao-pedro-braz
Contributor Author

GDEXTENSION_VARIANT_OP_IN,

@AThousandShips I knew I was missing something, even asked on #gdextension, thank you!

@joao-pedro-braz
Contributor Author

@AThousandShips
Member

AThousandShips commented May 8, 2023

Oh sorry missed that, now that you mention it I did see it, then this must be due to lack of sync in godot-cpp, got myself confused looking at the main thread and forgot

@adamscott
Member

Where the PR would truly shine is with dictionaries. dict_1?.dict_2?.dict_3?.value

Unfortunately, dict_1.dict_2 triggers an error if the key dict_2 doesn't exist in dict_1. This is counterintuitive if we can't use the ?. operator on dictionaries. The operator should check whether a key exists and otherwise return null without triggering any error.
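
The behaviour requested here can be sketched as follows. This is a hypothetical TypeScript model, not something the PR implements: nested dictionaries are represented as plain objects, and `getPath` and `isDict` are invented names.

```typescript
// Hypothetical sketch: nested dictionaries modelled as plain objects.
type Dict = { [key: string]: unknown };

function isDict(value: unknown): value is Dict {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Walks `path` through nested dictionaries. A missing key or a link that is
// not a dictionary yields null instead of an error.
function getPath(root: unknown, path: string[]): unknown {
  let current: unknown = root;
  for (const key of path) {
    if (!isDict(current) || !Object.prototype.hasOwnProperty.call(current, key)) {
      return null;
    }
    current = current[key];
  }
  return current;
}

// Counterpart of dict_1?.dict_2?.dict_3?.value
const dict1: Dict = { dict_2: { dict_3: { value: 7 } } };
const found = getPath(dict1, ["dict_2", "dict_3", "value"]);
const missing = getPath(dict1, ["nope", "dict_3", "value"]);
```

The key-existence test is what distinguishes this from a plain null check on the base: the base dictionary is not null, yet the lookup still has to degrade to null.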

@joao-pedro-braz
Contributor Author

> Oh sorry missed that, now that you mention it I did see it, then this must be due to lack of sync in godot-cpp, got myself confused looking at the main thread and forgot

So in that case I would need to have a PR adding the coalesce operator to godot-cpp merged before merging this one?

@AThousandShips
Member

That's my confusion. It does seem a very strange situation, as neither would work without the other. I might be missing something elsewhere; that's why I got confused, as I assumed it must be possible to make changes that don't immediately work with godot-cpp.

@joao-pedro-braz
Contributor Author

Indeed...
@akien-mga is this something you've encountered before?

@joao-pedro-braz
Contributor Author

Looking at how the godot-cpp test is executed I think I really need to get the new operator merged upstream on godot-cpp before this PR can be merged:

      # Dump GDExtension interface and API
      - name: Dump GDExtension interface and API for godot-cpp build
        if: ${{ matrix.godot-cpp-test }}
        run: |
          ${{ matrix.bin }} --headless --dump-gdextension-interface --dump-extension-api
          cp -f gdextension_interface.h godot-cpp/gdextension/
          cp -f extension_api.json godot-cpp/gdextension/

@AThousandShips
Member

@ronsiv8 it's a draft, it's not done or ready, please have patience, people work as best they can on their contributions, and people can't always put in the time they want, or there's something they haven't solved yet

@ronsiv8

ronsiv8 commented Apr 24, 2024

oh i didnt notice the TODO

@Macksaur
Contributor

Macksaur commented Apr 29, 2024

This PR looks amazing. I have three question-suggestions.


  1. Does it support the https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/operators/null-forgiving operator?
class A:
    var cool_value := "cool!"

func _run():
    var nullable: A? = null
    print(nullable!.cool_value) # Unsafe, but won't cause a parse error

  2. As for ENNS, I would like to raise ?= as a potential null-safe if-assignment operator, as I am wary of using the existing := here (which may one day be a valid extension to GDScript):
var foo: int? = null
if var bar ?= foo:
    print(bar) # Bar will only be printed if foo is not null and it will be non-nullable (int)

This way you both take advantage of if var being an existing parse error and permit the := syntax to be safely used in other possible future extensions of GDScript.

For clarity, var x ?= y returns true when used in an expression that results in the safe extraction of a nullable value, and false otherwise. It otherwise behaves exactly like := and could perhaps be typed if desired, as in var x:int ?= y.


  3. Does this PR support ?? as well as ??=?
data = data ?? 123
data ??= 123 # equivalent to the above
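
For comparison, TypeScript has both forms, and the equivalence asked about looks like this there. It is an analogy only, with `number | null` assumed to stand in for `int?`; whether this PR supports `??=` is exactly the open question above.

```typescript
// Assumption: `number | null` stands in for `int?`.
function withDefaultLong(data: number | null): number {
  data = data ?? 123;
  return data;
}

function withDefaultShort(data: number | null): number {
  data ??= 123; // assigns only when data is null (or undefined)
  return data;
}
```

Unlike a truthiness-based default, neither form replaces a legitimate falsy value such as 0.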

Thank you, great work!


// Special case for container types, forbid assigning a nullable typed array to a non-nullable one (Like assigning an "Array[int?]" to an "Array[int]")
if (p_origin->get_datatype().has_container_element_type() && p_assigned_value->get_datatype().has_container_element_type()) {
GDScriptParser::DataType identifier_data_type = p_origin->get_datatype().get_container_element_type();
Contributor

@Macksaur Macksaur Jun 10, 2024

is this the correct source for the datatype, p_origin->get_datatype()? this block only seems to catch:

var a:Array[int?] = [1]
var b:Array[int] = a # analyzer error

and not the statement:

b = a # runtime error, no analyzer error

I changed all the p_origin->get_datatype() references to identifier->get_datatype() in this block and it sorta works?
image
It catches the other error instead now lol.

if (p_match_pattern == nullptr) {
return;
}

#define SET_NULLABLE(value) \
Contributor

I know Godot's style guide stance on lambdas but if ever there was a case for one...

GDScriptParser::DataType non_nullable_left_type = left_type;
non_nullable_left_type.is_nullable = false;
if (!is_type_compatible(non_nullable_left_type, right_type, false, p_coalesce_op)) {
push_error(vformat(R"(Invalid operands "%s" and "%s" for "??" operator.)", left_type.to_string(), right_type.to_string()), p_coalesce_op);
Contributor

This should be configurable rather than a hard error since Variant catches everything that gets through the type system. With it being configurable different projects can utilize whatever style suits their productivity.

I set the datatype to GDScriptParser::DataType::VARIANT here and commented out the error for testing and it feels good to use rather than being told off.

@@ -1041,6 +1041,8 @@ void Variant::_register_variant_operators() {

register_op<OperatorEvaluatorObjectHasPropertyString>(Variant::OP_IN, Variant::STRING, Variant::OBJECT);
register_op<OperatorEvaluatorObjectHasPropertyStringName>(Variant::OP_IN, Variant::STRING_NAME, Variant::OBJECT);

register_op<OperatorEvaluatorObjectHasPropertyStringName>(Variant::OP_IN, Variant::STRING_NAME, Variant::OBJECT);
Contributor

is this a merge error? I don't understand the duplication.

@joao-pedro-braz
Contributor Author

@Macksaur
Just wanted to thank you for keeping this PR alive and code-reviewing it.
I'll try to merge-in your suggestions ASAP, but it might take a while.

@Macksaur
Contributor

Macksaur commented Jun 10, 2024

I'm giving it a spin on a live project trying to simplify the codebase with it. It's a fun PR, you've done great! However I find myself wanting undefined next. Scope creep! 😂

class A:
  var prop1:int
  #var prop2:int

var obj:Variant = some_obj_a()
var which = obj.prop2 ?? 123 # undefined ?? 123 => 123

I will try to add any other comments on things as I encounter them during use of the PR. Take your time, thank you!

@Macksaur
Contributor

Macksaur commented Jun 16, 2024

This one might be out of scope and off-topic for the issue, but I view this distinction as arbitrary (even though it follows suit of C#) and would like to field some thoughts.

// Nullable access ("?." syntax) is not allowed for assignments

func _on_area_entered(area:Area2D) -> void:
	(area.owner as Entity)?.on_long_grass += 1

Error: The left-hand side of an assignment expression may not contain a nullable access operator ("?.").

However if I rewrote it as:

func _on_area_entered(area:Area2D) -> void:
	(area.owner as Entity)?.set_on_long_grass(1)

...it would be ok? Given that operators are syntactic sugar for function calls I'd expect to be able to nullable?.operator+=(value). I might look into locally adding this one on my end. I want to be able to write quicker (and correct!) code using nullability wherever I can.


edit: I made this modification locally and it's very pleasant to use. I also combined it with $?x as sugar for get_node_or_null(x) and I can delete a lot of noisy code from implementing reusable boiler-plate in my game:

before:
image
after:
image

(These functions are attached to the basic object type in my game and can be overridden/customized in derived classes or left with good (sane) defaults most of the time.)

should_skip = true;
} else if (src->get_type() == Variant::OBJECT) {
Object *obj = src->get_validated_object_with_check(should_skip);
should_skip |= obj == nullptr;
Contributor

@Macksaur Macksaur Jun 17, 2024

Does this simplify to should_skip = (src->get_validated_object() == nullptr);? OPCODE_EXIT_IF_NULL is the same.

@Macksaur
Contributor

Can you clarify the following error?

image

As far as I understand it the null-conditional is a short-circuiting operator and the expression should terminate early with null but I am seeing the above.

Taking a page from C# https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/operators/member-access-operators#null-conditional-operators--and-:

image

Am I misunderstanding the implemented behaviour or have I made an error in my rebase?

@joao-pedro-braz
Contributor Author

> Can you clarify the following error?
>
> image
>
> As far as I understand it the null-conditional is a short-circuiting operator and the expression should terminate early with null but I am seeing the above.
>
> Taking a page from C# https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/operators/member-access-operators#null-conditional-operators--and-:
>
> image
>
> Am I misunderstanding the implemented behaviour or have I made an error in my rebase?

The nullable access operator in this PR was inspired by JavaScript's, where it behaves as if you were conditionally accessing an inner property, based on whether the "parent" is or isn't falsy and no short-circuit is assumed. This could be changed however... it should be possible to implement the short-circuiting behavior C#'s null conditional operator has.
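
The two behaviours being contrasted can be modelled in a few lines. The sketch below is a hypothetical interpreter for an access chain, written in TypeScript purely for illustration; it does not claim to reproduce the PR's VM or any particular language's exact rules, and the names `Step`, `evalPerStep`, and `evalShortCircuit` are made up.

```typescript
// Hypothetical model of an access chain such as `a?.b.c`:
// each step names a property and records whether it was written with `?.`.
type Step = { name: string; optional: boolean };
type Obj = { [key: string]: unknown };

// Per-step semantics: `?.` only protects the single access it is attached to.
function evalPerStep(base: unknown, steps: Step[]): unknown {
  let current: unknown = base;
  for (const step of steps) {
    if (current === null || current === undefined) {
      if (!step.optional) {
        throw new Error("cannot access '" + step.name + "' on null");
      }
      current = null; // this access is skipped, the chain keeps going
      continue;
    }
    current = (current as Obj)[step.name];
  }
  return current;
}

// Short-circuiting semantics: the first `?.` that sees null ends the whole chain.
function evalShortCircuit(base: unknown, steps: Step[]): unknown {
  let current: unknown = base;
  for (const step of steps) {
    if (current === null || current === undefined) {
      if (step.optional) {
        return null; // nothing to the right is evaluated
      }
      throw new Error("cannot access '" + step.name + "' on null");
    }
    current = (current as Obj)[step.name];
  }
  return current;
}

// The chain `a?.b.c`
const chain: Step[] = [
  { name: "b", optional: true },
  { name: "c", optional: false },
];
```

With a null base, `evalShortCircuit(null, chain)` returns null, while `evalPerStep(null, chain)` fails at `.c` unless that step is also written with `?.`. The short-circuiting variant also performs one null check per chain instead of one per step.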

@joao-pedro-braz
Contributor Author

joao-pedro-braz commented Jun 19, 2024

> This one might be out of scope and off-topic for the issue, but I view this distinction as arbitrary (even though it follows suit of C#) and would like to field some thoughts.
>
> // Nullable access ("?." syntax) is not allowed for assignments
>
> func _on_area_entered(area:Area2D) -> void:
> 	(area.owner as Entity)?.on_long_grass += 1
>
> Error: The left-hand side of an assignment expression may not contain a nullable access operator ("?.").
>
> However if I rewrote it as:
>
> func _on_area_entered(area:Area2D) -> void:
> 	(area.owner as Entity)?.set_on_long_grass(1)
>
> ...it would be ok? Given that operators are syntactic sugar for function calls I'd expect to be able to nullable?.operator+=(value). I might look into locally adding this one on my end. I want to be able to write quicker (and correct!) code using nullability wherever I can.
>
> edit: I made this modification locally and it's very pleasant to use. I also combined it with $?x as sugar for get_node_or_null(x) and I can delete a lot of noisy code from implementing reusable boiler-plate in my game:
>
> before: image after: image
>
> (These functions are attached to the basic object type in my game and can be overridden/customized in derived classes or left with good (sane) defaults most of the time.)

Yeah in hindsight I don't see a good reason to disallow assignment to a nullable access. I'll do some tests and get back on it

@Macksaur
Contributor

Huh! I didn't know that JS didn't short-circuit on chains! I admire their basic coalescing behaviours for truthy-values otherwise but I think not having short-circuiting here is a missed opportunity to express simpler (and faster) code.

Since the GDScript VM isn't jitted/compiled to any machine code, it is perhaps disadvantageous to inflict multiple redundant null checks on users in the general syntax.

Is there an advantage to the behaviour producing the following errors in JS that would be worth the distinction on every chain?

image

@Macksaur
Contributor

> Yeah in hindsight I don't see a good reason to disallow assignment to a nullable access. I'll do some tests and get back on it

I've pushed my hacky WIP branch master...Macksaur:godot:nullable_types, where I've had a go at implementing some of these things to play with. You're welcome to yank any of it.

I've included the weird %? operator since it's fun to play with the nullable types.

@Lking03x

Lking03x commented Jul 17, 2024

Hi, could it be possible to change the position of the ?, like this

var foo: ?Vector2 = null

It helps, among other things, with spotting the nullable arguments when reading documentation

@Macksaur
Contributor

This has been raised above, you should use the reactions on that particular reply to indicate your support or not for that syntax. I will however say that it is a non-traditional syntax and doesn't make sense in the way that you're suggesting. Type? is not an operator, it is a type-qualifier, it makes the type more-specific and is usually kept at the end of a type so that they appear together when compared.

If you have other languages to compare syntaxes with, I'd like to read about them if you could link examples.

@twigleg2

I just wanted to leave my support and gratitude for your work. I'm very excited for when this becomes a standard feature.
Thank you, thank you!

}

if (p_should_be_false) {
return { static_cast<GDScriptParser::IdentifierNode *>(unary_op->operand) };
Contributor

@Macksaur Macksaur Aug 26, 2024

This crashes, as there's no guarantee that the operand of not x is always an identifier; it could be not self?.some_fn(). This could be fixed by delegating to another call, i.e. return deduce_nullable_narrowing(unary_op->operand).
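
To illustrate the suggested fix in isolation, here is a heavily simplified TypeScript sketch of a recursive narrowing pass. Everything in it is an assumption made for the example: the `Expr` shape is invented, it is not the PR's C++ code, and the meaning given to `shouldBeFalse` (the expression is known to have evaluated to false) is only one guess at what `p_should_be_false` means.

```typescript
// Hypothetical, heavily simplified expression tree; not the PR's C++ types.
type Expr =
  | { kind: "identifier"; name: string }
  | { kind: "not"; operand: Expr }
  | { kind: "and"; left: Expr; right: Expr }
  | { kind: "call"; callee: string };

// Returns the identifiers known to be non-null when `node` evaluates to
// true (or to false, when `shouldBeFalse` is set).
function deduceNarrowing(node: Expr, shouldBeFalse: boolean = false): string[] {
  if (node.kind === "identifier") {
    // A truthy identifier cannot be null; a falsy one tells us nothing.
    return shouldBeFalse ? [] : [node.name];
  }
  if (node.kind === "not") {
    // Recurse instead of assuming the operand is an identifier.
    return deduceNarrowing(node.operand, !shouldBeFalse);
  }
  if (node.kind === "and") {
    // Both sides are known to be true only when the conjunction itself is true.
    if (shouldBeFalse) {
      return [];
    }
    return deduceNarrowing(node.left).concat(deduceNarrowing(node.right));
  }
  // Calls and anything else narrow nothing.
  return [];
}
```

Because the `not` case recurses, an operand such as a call simply contributes no narrowed identifiers instead of being cast to a node type it does not have.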

return false;
}

Vector<GDScriptParser::IdentifierNode *> GDScriptAnalyzer::deduce_nullable_narrowing(GDScriptParser::ExpressionNode *p_node, bool p_should_be_false) {
Contributor

What does p_should_be_false mean? I took a stab in the dark and renamed it to p_current_op_is_wide, which is a double-negative-ish of p_current_op_is_not_narrowed. Is this the correct understanding?

Should it be named and simplified? i.e. p_should_be_false -> p_current_op_is_narrowed?

@Maran23
Contributor

Maran23 commented Sep 11, 2024

Might be worth testing with the new typed Dictionary and adding some test cases for it, e.g. [String, int?].

@unfallible

unfallible commented Nov 2, 2024

This looks like an exciting PR! I have a question about what “This PR is also not about making Reference Types (Like Objects) no longer implicitly null by default. Their behavior remains unchanged.” means. Does it mean that the declarations var my_node: Node2D = $Child and var my_node: Node2D? = $Child will be equivalent? If so, is appending the question mark to the object type annotations purely stylistic? Does the comment mean the nullability annotations affect how the type checker treats local variables but not member variables?

Even if there were no nullable annotations at all, the new syntactic sugar for dealing with nulls would be plenty exciting on its own. I would like to float the idea of adding a unary postfix operator which generalizes the ?. operator to work with all function arguments by short circuiting its parent expression if its argument is null. For the purpose of unpacking that word salad, I’ll call this operator the ? operator.

C# has all of the syntactic sugar described in this PR (null coalescing operators, null access operators, and ENNS syntax). These features all exist to safely mitigate the so-called “pyramid of doom,” where you have to write many lines of code to check if a single operation can be called safely. In my experience, these features are great but frequently still insufficient to avoid the problem. After all these features are implemented, by far the most common null-related pain point for me is functions with non-nullable parameters.

Imagine I’m designing a strategy game with the following class:

class Unit:
    func attack(target:Unit, weapon:Weapon)->void:
        pass # do something

Notice that neither the target nor weapon parameters are nullable here. We can see that making them non-nullable is clearly correct for my use case. It doesn’t make sense to attack nothing or to attack without a weapon, so if I did make these parameters nullable, my options would basically be to throw an error at runtime, or else quietly do nothing. Avoiding these sorts of confusing behaviors is precisely what non-nullable annotations are supposed to fix (or, if they're not enforced, at least draw our attention to).

However, suppose I want to call this function in a context where I can't know for sure that I will have a valid target (for example, if the target dodged my attack by moving to another space, or if another attack defeated the target earlier in the turn) or even if I’ll have a weapon equipped. In that case, depending on the implementation of ENNS (here represented by ?=), my function call will require three pretty lines of code:

if enemy ?= get_unit_at(target_position):
    if weapon ?= selected_weapon:
        unit.attack(enemy,weapon)

or two ugly lines:

if enemy ?= get_unit_at(target_position) && weapon ?= selected_weapon:
    unit.attack(enemy, weapon)

In my experience, this sort of boilerplate is ubiquitous and annoying in C#, and it creates a temptation to avoid declaring functions with non-nullable parameters even when they would be most appropriate.

A ? operator would be handy because it would let us rewrite the above in the simple, readable line unit.attack(get_unit_at(target_position)?, selected_weapon?). To understand how I envision this working, imagine the tree representation of the expression:

7. attack(Unit, Unit, Weapon) -> void
  1. unit -> Unit
  4. ?(Unit?) -> Unit {or jump to 8}
    3. get_unit_at(Vector2) -> Unit?
      2. target_position -> Vector2
  6. ?(Weapon?) -> Weapon {or jump to 8}
    5. selected_weapon -> Weapon?
8. next statement

In this picture, each line is numbered according to the order in which the expressions will evaluate. The expected type of each expression's parameters, if any, is written in parentheses after the expression's name. The return type of the expression follows the -> symbol on each line. Finally, each sub-expression’s arguments are displayed beneath the parent expression (with an extra indent). For each branch in the tree, the child expressions must return a subtype of the corresponding argument of the parent type. This tree representation is helpful here for a couple reasons.

First, it makes it clear how the ? operator will behave. When its argument is non-null, it will simply evaluate to that value. When its argument is null, the program will skip past its parent expression. In this example, if 3. or 5. evaluated to null, then 4. or 6. would cause execution to jump past 7. Because the parent expression is skipped when the child expression is null, the parent expression is effectively guaranteed to receive a non-null argument.
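For comparison, here is roughly what that one-liner would have to expand to in GDScript as it exists today. This is only a sketch of the semantics I have in mind, not anything from the PR: `Weapon`, `units_by_cell`, `get_unit_at` and `try_attack` are hypothetical names, the `bool` return value is added purely so the skip is observable, and plain `Unit`/`Weapon` parameters (which can already hold null) stand in for `Unit?` and `Weapon?`.

```gdscript
class Weapon:
    var damage: int = 1

class Unit:
    var health: int = 10

    func attack(target: Unit, weapon: Weapon) -> void:
        target.health -= weapon.damage

# Hypothetical board lookup; returns null when the cell is empty.
var units_by_cell: Dictionary = {}

func get_unit_at(cell: Vector2) -> Unit:
    return units_by_cell.get(cell)

# Hand-written expansion of
#     unit.attack(get_unit_at(target_position)?, selected_weapon?)
# Returns true if the attack ran, false if a `?` would have skipped it.
func try_attack(unit: Unit, target_position: Vector2, selected_weapon: Weapon) -> bool:
    var target := get_unit_at(target_position)  # steps 2-3
    if target == null:                          # step 4: jump to 8
        return false
    if selected_weapon == null:                 # steps 5-6: jump to 8
        return false
    unit.attack(target, selected_weapon)        # step 7
    return true
```

The two early returns correspond to the two "or jump to 8" edges in the tree; everything else is the ordinary left-to-right evaluation of the arguments.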

The tree representation also shows what I mean when I say that ? is actually a generalization of ?.. Note that in the expression tree, the first argument of attack is unit, i.e. the expression on the left-hand side of the . operator. Suppose we wrote the statement (a?).foo(b). How would it behave? Well, the program would ensure the function call is safe when a is null by “only accessing the right-hand attribute if the left-hand base is not null, otherwise by returning null.” That is to say, it would behave identically to (a)?.foo(b). I’m no expert in parsers, but if the ? operator’s binding power were strong enough, the expression a?.foo(b) might even work without explicitly defining the ?. operator.

As with ?., the ? operator would chain nicely with itself and ??. Sticking with the attack example, suppose that the target position was optionally specified, and when it’s unspecified, we don’t want to try attacking. We could write this as unit.attack(get_unit_at(optional_position?)?, selected_weapon?). Now imagine that we never want attacks to be skipped, but we want to minimize the amount of time players spend in menus. When one of our arguments is null, we instead want to default to our most recent valid value. To do this, we just write unit.attack(get_unit_at(optional_position?) ?? previous_target, selected_weapon ?? previous_weapon).
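To make the chaining concrete, here is a rough sketch of how the first argument, get_unit_at(optional_position?) ?? previous_target, could be spelled out in current GDScript. The helper names (`units_by_cell`, `coalesce`, `resolve_target`) are made up for illustration, `Variant` stands in for the nullable types, and I am assuming that a skipped parent expression yields null so that ?? can pick it up.

```gdscript
# Hypothetical board lookup; returns null for an empty cell.
var units_by_cell: Dictionary = {}

func get_unit_at(cell: Vector2) -> Variant:
    return units_by_cell.get(cell)

# Stand-in for `value ?? fallback`. Unlike a real operator, the fallback
# is evaluated eagerly because it is an ordinary function argument.
func coalesce(value: Variant, fallback: Variant) -> Variant:
    return fallback if value == null else value

# Expansion of `get_unit_at(optional_position?) ?? previous_target`.
func resolve_target(optional_position: Variant, previous_target: Variant) -> Variant:
    var found: Variant = null
    if optional_position != null:  # inner `?`: skip get_unit_at() when null
        found = get_unit_at(optional_position)
    return coalesce(found, previous_target)
```

The weapon argument is the simpler case: `coalesce(selected_weapon, previous_weapon)` with no skipping involved.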

I’ve never seen another language with a feature like the ? operator, and maybe there’s a good reason for that. I can imagine a couple objections people might raise to a feature like this, although the ones I’ve thought of myself don’t feel particularly forceful.

First, it effectively precludes the possibility of implementing the ?. operator with C#’s short-circuiting behavior. To me, this trade-off would be a no-brainer, but maybe others will disagree. I would much rather have to add a question mark here or there than a potentially nested if block. Losing the short circuit would potentially hurt performance too, since you would have to check each function call for null. That said, GDScript is a scripting language and IMO that means putting major usability improvements over micro-optimizations. If the performance hit was considered too big for some reason, the compiler could be specially optimized to skip to the end of a series of null checks, including coalescing operators.

Another potential objection to ? is that it doesn't improve usability at all, and that its behavior is too confusing and would make code less readable. As I just stated, ? does not seem to be a common language feature, and consequently, it may be slightly harder for people to initially understand. Also, whereas the ?. operator can be understood as an operator whose return value is solely dependent on its arguments, the ? operator is weirder in a couple of ways. First, by skipping over the execution of its parent expression, ? violates some intuitions about control flow and statements not being able to jump around outside of the "block" in which they're written (a similar problem to goto and break statements, although the latter have plenty of good uses). Second, ? potentially alters the return type (if any) of its parent expression. In both of these respects, ? looks similar to exception handling, which I imagine will strike many as a red flag. However, I'm not convinced that either of these quirks is a significant problem in this particular context, because the ? operator can only impact a single line of code at a time. Thus ? will never produce spaghetti code the way that goto statements do. Similarly, because its effects are localized to a single expression, the ? operator’s impact will never cascade to distant parts of the code the way that exceptions do. If there’s a concern that the operator’s parent expression won’t always be obvious, it may also be possible for the IDE to highlight the parent expression when ? is clicked, similar to how it highlights matching parentheses.

I can imagine places where people might use ? to create really gnarly one-liners, but this is true of many common language features (ternary if operators, anyone?). What matters is that there are extremely common use cases where ? is much clearer and easier to write than the alternative, and the elegance of a line like a?.foo(b?, c?, d?, e?, f?) just makes ? look very appealing to me.

@idbrii
Contributor

idbrii commented Dec 31, 2024

The addition of the nullable access and nullable coalescing operators is exciting! Are they something that could be added separately from this PR? They'd work on any values that could be null so even API functions could benefit: print(move_and_collide()?.get_collider()), right?
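For reference, this is roughly what that call has to look like today. It is a minimal sketch: `collider_or_null` is a made-up helper name, and it relies on `move_and_collide()` returning null when nothing was hit.

```gdscript
# Manual equivalent of `collision?.get_collider()`.
func collider_or_null(collision: KinematicCollision2D) -> Object:
    if collision == null:
        return null
    return collision.get_collider()
```

Inside a CharacterBody2D script this would be used as `print(collider_or_null(move_and_collide(velocity * delta)))`.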

I'm with @unfallible in being uncertain how these lines differ:

var my_node: Node2D = $Child
var my_node: Node2D? = $Child

I assume "inclusion of nullity safety guards" means that only variables declared as nullable (the second line) will cause parse errors when they're not null-checked. (That safety check sounds awesome!)

Are there other differences? Could I pass both of them to a function expecting f(n: Node2D) or g(n: Node2D?)? What about passing Array[int] to h(n: Array[int?])?

@Macksaur
Contributor

Macksaur commented Dec 31, 2024

nullable access and nullable coalescing operators are exciting! Are they something that could be added separate from this PR?

I personally hadn't considered separating the two ideas but this is definitely a proposal worth considering! The operators don't rely on type information to work, they just check the variant for null values. Which means it should work fine without the typing!


[I'm] uncertain how these lines differ

In semantics there is only var my_node: Node2D? = $Child, the other form is ignored for types deriving from Object. But you are correct about more strict error checking on the second line.

What that means is that null is a valid and expected value for those types no matter what. Ideally you'd support both but this is a compromise to deal with the pre-existing expectations of null being the default value for Object-derived types.


What about passing Array[int] to h(n: Array[int?])?

This works. int fits inside an int|null type.

The other way around however... that will produce lots of errors at compile/parse time! int? is int|null and those values do not fit in an Array[int].

@unfallible

nullable access and nullable coalescing operators are exciting! Are they something that could be added separate from this PR?

I personally hadn't considered separating the two ideas but this is definitely a proposal worth considering! The operators don't rely on type information to work, they just check the variant for null values. Which means it should work fine without the typing!

[I'm] uncertain how these lines differ

In semantics there is only var my_node: Node2D? = $Child, the other form is ignored for types deriving from Object. But you are correct about more strict error checking on the second line.

What that means is that null is a valid and expected value for those types no matter what. Ideally you'd support both but this is a compromise to deal with the pre-existing expectations of null being the default value for Object-derived types.

What about passing Array[int] to h(n: Array[int?])?

This works. int fits inside an int|null type.

The other way around however... that will produce lots of errors at compile/parse time! int? is int|null and those values do not fit in an Array[int].

Because arrays are passed by reference and mutable, passing Array[int] to h(n: Array[int?]) should not be allowed.
Consider the following code:

func h(n: Array[int?]):
    n.append(null)
    return

func main():
    var a: Array[int] = [1,2,3]
    h(a) #shouldn't be allowed
    var sum: int = 0
    for x in a:
        sum += x  #oops, can't add null

Also, there was a much smaller discussion in proposal #11054, and the consensus there seemed to be that until Object? syntax can be consistently applied in Godot 5, using ? to denote nullable value types (e.g. int?, float?, Array? etc.) and ! to denote non-nullable reference types (e.g. Variant!, Object!, Node!, etc.) would be preferable to using Object? as an alias for Object. What are people's thoughts on that here?

@Macksaur
Contributor

Macksaur commented Jan 3, 2025

This will definitely need lots of test cases like the above 😆 If nullable makes it into arrays hopefully we can figure out a clean way to cast/copy that's not too verbose, otherwise nullable types end up poisoning the call graph rather than extending type support.
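As a point of reference for how verbose the copy is without any language support, here is a minimal sketch in current GDScript. An untyped `Array` stands in for `Array[int?]` (which does not exist yet), and `without_nulls` is a hypothetical helper name.

```gdscript
# Copies the int elements of `values` into a new Array[int],
# dropping nulls (and anything else that is not an int).
func without_nulls(values: Array) -> Array[int]:
    var result: Array[int] = []
    for value in values:
        if typeof(value) == TYPE_INT:
            result.append(value)
    return result
```

Because the result is a fresh array, appending null to the original afterwards cannot break code that holds the `Array[int]` copy, which sidesteps the aliasing problem in the example above.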

Development

Successfully merging this pull request may close these issues.

Implement a null-coalescing operator (??)
Add support for nullable static types in GDScript