Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compiler: make sure zero-sized array types are as small as possible #2854

Merged
merged 1 commit into from
May 19, 2022

Conversation

skabbes
Copy link
Contributor

@skabbes skabbes commented May 18, 2022

As part of an #2667 - an effort to use protobufs with tinygo (particularly with WASM):

Fixes #2806

Context
The protobuf lib at google.golang.org/protobuf/runtime/protoimpl includes some code includes some code that acts as a "non-comparable" static check. I have simplified the code to its minimal repro:

package main

import (
	"unsafe"
)

// DoNotCompare can be embedded in a struct to prevent comparability.
type DoNotCompare [0]func()

type MessageState struct {
	DoNotCompare
	data *int32
}

// Static check that MessageState does not exceed the size of a pointer.
const _ = unsafe.Sizeof(unsafe.Pointer(nil)) - unsafe.Sizeof((MessageState{}))

func main() {
	println("protoc minimal example")
}

Problem

alignof([0]func()) was 2 words large, should have only been 1.

@skabbes
Copy link
Contributor Author

skabbes commented May 18, 2022

Needs #2853 to be merged to pass windows build.

@deadprogram would you be the right person to review this? I think it is pretty straightforward.

@deadprogram
Copy link
Member

@skabbes can you please rebase against dev now to see if CI passing? Thanks!

@skabbes
Copy link
Contributor Author

skabbes commented May 18, 2022

@deadprogram all green!

@deadprogram
Copy link
Member

@skabbes it would have been better to rebase without that last merge commit. I usually use the command line for this e.g.

git checkout dev
git pull --rebase origin dev
git checkout my-feature-branch
git rebase dev
git push myfork my-feature-branch -f

@skabbes
Copy link
Contributor Author

skabbes commented May 18, 2022

Ya, some projects prefer merge - some rebase. I'll squash them and force-push if that's how you prefer in tinygo.

@skabbes skabbes force-pushed the alignof_zero_func_array branch from 8d1675d to 178cec8 Compare May 18, 2022 15:00
@deadprogram
Copy link
Member

I see we need to clarify that on https://tinygo.org/docs/guides/contributing/

@skabbes
Copy link
Contributor Author

skabbes commented May 18, 2022

Happy to do it however the team finds best for them - I probably could have figured that out by looking at a few other PRs. But appreciate you letting me know. I'll rebase off dev from now on - and already did it for this branch.

If you want me to squash these too, just say the word!

skabbes added a commit to skabbes/tinygo-site that referenced this pull request May 18, 2022
@skabbes
Copy link
Contributor Author

skabbes commented May 18, 2022

I see we need to clarify that on https://tinygo.org/docs/guides/contributing/

Added a PR to clarify:
tinygo-org/tinygo-site#267

deadprogram pushed a commit to tinygo-org/tinygo-site that referenced this pull request May 18, 2022
compiler/sizes.go Outdated Show resolved Hide resolved
@deadprogram
Copy link
Member

I think you should perhaps remove commit d0e10e8 now since it was basically replaced by 8a4eecf

@skabbes
Copy link
Contributor Author

skabbes commented May 18, 2022

I think you should perhaps remove commit d0e10e8 now since it was basically replaced by 8a4eecf

OK, so I assume that you do no rewrite the commits and squash them when PRs are merged (as many other project do). You would like me to have a perfectly clean commit history, merged as-is into dev.

I'll just squash them both together in that case, since there is no point in having so many commits in dev. But there is value in having them here in this PR, so everyone review can see my thought process as I went through the changes.

In the go protobuf code, a pattern is used to statically prevent
comparable structs by embedding:

```
type DoNotCompare [0]func()

type message struct {
  DoNotCompare
  data *uint32
}
```

Previously, sizezof(message{}) is 2 words large, but it only needs to be
1 byte.  Making it be 1 byte allows protobufs to compile slightly more
(though not all the way).
@skabbes skabbes force-pushed the alignof_zero_func_array branch from 8a4eecf to 0131b81 Compare May 18, 2022 17:00
@deadprogram
Copy link
Member

@skabbes you totally get what we're trying to do, and thank you for the contribution and also for being a cool collaborator!

@skabbes skabbes requested a review from aykevl May 18, 2022 18:01
@skabbes
Copy link
Contributor Author

skabbes commented May 19, 2022

@deadprogram thanks! do you mind reviewing the code?

@deadprogram
Copy link
Member

Thanks for making the requested changes @skabbes now merging.

@deadprogram deadprogram merged commit 4c7449e into tinygo-org:dev May 19, 2022
@skabbes skabbes deleted the alignof_zero_func_array branch May 19, 2022 12:29
@dgryski
Copy link
Member

dgryski commented May 19, 2022

Seems reasonable. I know in the past there have been issues with trailing 0-byte elements in structs. Is there any danger of that happening again here?

@skabbes
Copy link
Contributor Author

skabbes commented May 19, 2022

I unfortunately do not have the context to answer that, but it as least seems that the previous intent of the code was that 0-sized structs have alignof = 1. I don't see a reason why 0-sized array types should be any different.

tinygo/compiler/sizes.go

Lines 55 to 59 in 4c7449e

a := s.Sizeof(T) // may be 0
// spec: "For a variable x of any type: unsafe.Alignof(x) is at least 1."
if a < 1 {
return 1
}

@aykevl
Copy link
Member

aykevl commented May 21, 2022

After some testing, I found that this PR isn't entirely correct either. Proof:

package main

import "unsafe"

func main() {
        v := struct {
                a byte
                b [0]func()
                c byte
        }{}
        println("unsafe.Alignof: ", int(unsafe.Alignof([0]func(){})))
        println("unsafe.Offsetof:", int(unsafe.Offsetof(v.c)))
        println("actual offset:  ", int(uintptr(unsafe.Pointer(&v.c))-uintptr(unsafe.Pointer(&v))))
}

The unsafe.Offsetof and actual offset values must match, but they do not on 32-bit architectures (on the latest dev branch with this PR included). For example, for wasm:

unsafe.Alignof:  1
unsafe.Offsetof: 1
actual offset:   4

I think this is the proper fix:

diff --git a/compiler/sizes.go b/compiler/sizes.go
index 57108323..36b8d1ab 100644
--- a/compiler/sizes.go
+++ b/compiler/sizes.go
@@ -51,6 +51,8 @@ func (s *stdSizes) Alignof(T types.Type) int64 {
                if t.Info()&types.IsString != 0 {
                        return s.PtrSize
                }
+       case *types.Signature:
+               return s.PtrSize
        }
        a := s.Sizeof(T) // may be 0
        // spec: "For a variable x of any type: unsafe.Alignof(x) is at least 1."

This is a bit different from upstream Go because func values in TinyGo consist of two pointers instead of one.

@aykevl
Copy link
Member

aykevl commented May 21, 2022

Also, to add: TinyGo does not necessarily match Go when it comes to size and alignment. Always check these things by checking against Clang or by comparing pointer values/indices.
The unsafe package can't be used either because it is computed using the values from compiler/sizes.go, and must adhere to what LLVM will make of it (not the other way around).

@deadprogram
Copy link
Member

Seems I merged this a little too quickly, then. @skabbes are you able to submit another PR with this change?

@skabbes
Copy link
Contributor Author

skabbes commented May 21, 2022

@aykevl thank you - your comment really clarified a mental model for me. The unsafe package simply reports what they layout was - it does not determine it. compiler/sizes.go kind sort of "configures" LLVM, but at the end of the day whatever memory layout clang generates is the source of truth.

@deadprogram I'll give this another shot - add some tests that check the invariants Ayke laid out above.

skabbes added a commit to skabbes/tinygo that referenced this pull request May 21, 2022
The expected behavior of this was not obvious at first glance,
so freeze aykevl's knowledge into a test:

Context in This PR: tinygo-org#2854
skabbes added a commit to skabbes/tinygo that referenced this pull request May 21, 2022
The expected behavior of this was not obvious at first glance,
so freeze aykevl's knowledge into a test:

Context in This PR: tinygo-org#2854

wip
skabbes added a commit to skabbes/tinygo that referenced this pull request May 21, 2022
The expected behavior of this was not obvious at first glance,
so freeze aykevl's knowledge into a test:

Context in This PR: tinygo-org#2854
@aykevl
Copy link
Member

aykevl commented May 21, 2022

@aykevl thank you - your comment really clarified a mental model for me. The unsafe package simply reports what they layout was - it does not determine it. compiler/sizes.go kind sort of "configures" LLVM, but at the end of the day whatever memory layout clang generates is the source of truth.

Yes, basically. Except this bit:

compiler/sizes.go kind sort of "configures" LLVM,

It doesn't. LLVM does its own thing. The only thing compiler/sizes.go is used for, is unsafe.Sizeof/unsafe.Alignof/unsafe.Offsetof. Of these, unsafe.Sizeof and unsafe.Alignof are used by the reflect package for things like Type.Size(), Type.Align(), and Value.Field(i). So essentially it has to match whatever LLVM will do to avoid tricky bugs.

@skabbes
Copy link
Contributor Author

skabbes commented May 21, 2022

Perfect, that plus a little extra reading of compiler.go - I think I got it.

I put your suggested changes up for PR #2862, along with to some tests for offsetof alignof and the actual offset.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants