Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Semantics of Expr(:new) underspecified #26764

Closed
Keno opened this issue Apr 10, 2018 · 15 comments · Fixed by #52169
Closed

Semantics of Expr(:new) underspecified #26764

Keno opened this issue Apr 10, 2018 · 15 comments · Fixed by #52169

Comments

@Keno
Copy link
Member

Keno commented Apr 10, 2018

The semantics of Expr(:new) with missing parameters for immutables are underspecified.
In particular, we need to decide whether the value is undefined or merely possibly uninitialized.
The primary difference between the two is how we represent this to LLVM. In particular undefined values may change on every access. As an illustrative example, consider:

struct Foo
    x::UInt8
    Foo() = new()
    Foo(x::UInt8) = new(x)
end

function bar(i...)
    f = Foo(i...)
    y = (f.x + UInt8(128))
    if f.x >= UInt8(128)
        println("World")
    end
    if y >= UInt8(128)
        println("Hello")
    end
    return nothing
end

What happens when you call bar() depends on LLVM versions and other considerations, but on my local copy, it prints nothing. However, any actual UInt8 value would print either "Hello" or "World". Unfortunately, there isn't really a way to say to llvm "I don't care what this value is, but it always has to be the same". If we decide we do not like the above behavior, we'll have to initialize all (stack? - will have to check what assumptions LLVM makes) allocations visible to LLVM.

@JeffBezanson
Copy link
Member

We should move towards initializing to zero, unless the cost is prohibitive. I suspect in these cases it's not.

@StefanKarpinski
Copy link
Member

Or just disallow incomplete construction of immutable structs.

@KristofferC
Copy link
Member

KristofferC commented Apr 10, 2018

At least for the isbits fields, see #24943. I have hit that twice when I have forgotten updating my new call when adding new isbits field and silently gotten uninitialized data in that field.

@JeffBezanson
Copy link
Member

It would be very difficult to define a type like Nullable if uninitialized fields are disallowed.

@KristofferC
Copy link
Member

KristofferC commented Apr 10, 2018

Opt in to it by enforcing putting undef in that argument in new :trollface:. Safe by default.

@vtjnash
Copy link
Member

vtjnash commented Apr 10, 2018

If only we had thought to add optimizations to make it more performant to explicitly mark undef by describing the field as having a Union of multiple possible types, rather than to define a type like Nullable with possibly-undef fields. And then had deprecated Nullable in preparation for v0.7 when that work was largely done.

@JeffBezanson
Copy link
Member

explicitly mark undef by describing the field as having a Union of multiple possible types

Not the same as undef, since the value can propagate.

@StefanKarpinski
Copy link
Member

It is exactly the same behavior as the undef that is passed to array constructors provides for arrays: #undef for pointer types and some value for bits types.

@JeffBezanson
Copy link
Member

We might be talking about different things. I'm talking about undefined references that throw errors on access, as opposed to Union{T,Nothing}, which is how I interpreted Jameson's comment.

It would be ok to have some syntax inside new to indicate that a field is undefined (in the current sense). But we couldn't use the first class value undef for that, since that means using that value itself as the value of the field. Unlike the Array constructor, where the first argument can mean whatever we want.

@StefanKarpinski
Copy link
Member

I was talking about what @KristofferC proposed. But I actually don't like that either since if new(undef) means what new() means now, then how do you write what new(undef) means currently?

@KristofferC
Copy link
Member

KristofferC commented Apr 10, 2018

then how do you write what new(undef) means currently?

You don't since that is a useless construct :). But yes, undef would have to be "lifted" to be something special and not just a normal singleton.
But there could of course also be some other special syntax, only valid inside new?

@JeffBezanson
Copy link
Member

We could use @undef, expanding to a special expression that can only appear inside new.

@StefanKarpinski
Copy link
Member

That would be ok, but I'd be rather worried that we have #undef, undef and @undef.

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Apr 10, 2018

Not to mention Nothing, nothing, Cvoid, C_NULL, Union{}, Missing and missing.

Keno added a commit that referenced this issue Jun 3, 2018
For now, just bail out in this situation. There's a number of better
things we could do here. However, I want to avoid making #26764 worse
for now.

Fixes #27365
Keno added a commit that referenced this issue Jun 8, 2018
For now, just bail out in this situation. There's a number of better
things we could do here. However, I want to avoid making #26764 worse
for now.

Fixes #27365
Keno added a commit that referenced this issue Jun 8, 2018
For now, just bail out in this situation. There's a number of better
things we could do here. However, I want to avoid making #26764 worse
for now.

Fixes #27365
@vtjnash
Copy link
Member

vtjnash commented Jun 3, 2021

Related to #9147 xref

vtjnash added a commit that referenced this issue Nov 14, 2023
Since the creation of issue #26764, there now is a way to 'say to llvm
"I don't care what this value is, but it always has to be the same"' so
we can use that to instruct LLVM to not give us undefined behavior when
users are using uninitialized memory.

Fixes #26764
vtjnash added a commit that referenced this issue Nov 14, 2023
In the time since the creation of issue #26764, there _is_ now 'a way to
say to llvm "I don't care what this value is, but it always has to be
the same"', so we can use that to instruct LLVM to not give us undefined
behavior when users are using uninitialized memory. There should not be
an impact if users were already avoiding this paradigm and are fully
initializing their structs.

Fixes #26764
vtjnash added a commit that referenced this issue Nov 17, 2023
In the time since the creation of issue #26764, there _is_ now 'a way to
say to llvm "I don't care what this value is, but it always has to be
the same"' using the `freeze` instruction, so we can use that to
instruct LLVM to not give us undefined behavior when users are using
uninitialized memory. There should not be an impact if users were
already avoiding this paradigm and are fully initializing their structs.

Fixes #26764
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants