Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

function pointer address space handling (for AVR) is wrong #106367

Closed
RalfJung opened this issue Jan 2, 2023 · 7 comments · Fixed by #107248
Closed

function pointer address space handling (for AVR) is wrong #106367

RalfJung opened this issue Jan 2, 2023 · 7 comments · Fixed by #107248
Labels
A-codegen Area: Code generation O-AVR Target: AVR processors (ATtiny, ATmega, etc.) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@RalfJung
Copy link
Member

RalfJung commented Jan 2, 2023

@eddyb points out that the handling of address space for function pointers (introduced by #73270) is fundamentally wrong: PointeeInfo can be used for optimization hints only and also only becomes relevant way too late during compilation. To properly account for function pointers having a different address space, they need to be added in the Pointer variant of the Primitive ABI type. That is the only way to fix types like Result<&i32, fn()>, which are currently going horribly wrong for Harvard architectures because the enum layout algorithm will think that &i32 and fn() are exactly the same fundamental type, and put them into one and the same field (resulting LLVM layout: { i64, ptr }).

Not sure whom to ping here, so I will pick some people that seem to be active in an around AVR: @dylanmckay @Patryk27.
Also Cc @rust-lang/compiler since this is related to some pretty fundamental compiler infrastructure

@RalfJung RalfJung changed the title function pointer address space handling is wrong function pointer address space handling (for AVR) is wrong Jan 2, 2023
@bjorn3 bjorn3 added A-codegen Area: Code generation T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. O-AVR Target: AVR processors (ATtiny, ATmega, etc.) labels Jan 2, 2023
@Patryk27
Copy link
Contributor

Patryk27 commented Jan 2, 2023

Ouch, I see - is it something that would be fixable by a person with, uhm, moderate knowledge on rustc (such as me) or rather you'd say it's something pretty big?

In any case, I can surely offer my help in testing out patches on AVR simulators and actual hardware.

@RalfJung
Copy link
Member Author

RalfJung commented Jan 2, 2023

I'm not entirely sure -- I have worked a bunch with Primitive and its surrounding types, but never tried to change them, so it is hard to say how big the fallout of that would be. @oli-obk will probably have a better idea.

@oli-obk
Copy link
Contributor

oli-obk commented Jan 9, 2023

It's going to be moderately annoying to keep rebasing, but nothing complex. Afaict there should be no issues like performance or accidental regressions/bugs coming out of this. As an MVP you could change https://doc.rust-lang.org/nightly/nightly-rustc/rustc_target/abi/enum.Primitive.html#variant.Pointer to have a PointerKind field that is ignored everywhere and only has a single variant for Memory (name bikesheddable). Once that lands, adding a Function variant should be trivial, then we just need to figure out how to use it in the llvm backend properly, which should be very self-contained.

@RalfJung
Copy link
Member Author

RalfJung commented Jan 9, 2023

That would however also affect all other targets, and e.g. Result<&u32, fn()> would not get the optimized representation any more on x86_64.

I think the Pointer variant should have an address space field, which will be 0 almost all the time except for function pointers on AVR. That matches what we do with size and alignment: all the target-specific information is explicit in Scalar.

@oli-obk
Copy link
Contributor

oli-obk commented Jan 9, 2023

I think the Pointer variant should have an address space field, which will be 0 almost all the time except for function pointers on AVR.

Hmm... I was thinking we'd make this decision in the backend, but you're right, the Primitive should contain all the relevant information and the burden of picking the right information for the target is on the producer of the Primitive, not the consumer

@eddyb
Copy link
Member

eddyb commented Jan 10, 2023

to have a PointerKind field that is ignored everywhere and only has a single variant for Memory (name bikesheddable)

It's all memory, so that would be inappropriate (at least here) - the usual split is "data" vs "code".

However, I agree with @RalfJung - we already have a type for it, even:

/// An identifier that specifies the address space that some operation
/// should operate on. Special address spaces have an effect on code generation,
/// depending on the target and the address spaces it implements.
#[derive(Copy, Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct AddressSpace(pub u32);
impl AddressSpace {
/// The default address space, corresponding to data space.
pub const DATA: Self = AddressSpace(0);
}
/// Describes how values of the type are passed by target ABIs,
/// in terms of categories of C types there are ABI rules for.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
#[cfg_attr(feature = "nightly", derive(HashStable_Generic))]

(though I think that can be changed from u32 to u8? not that it matters that much)

@erikdesjardins
Copy link
Contributor

erikdesjardins commented Jan 23, 2023

I'm working on this. It's natural to add AddressSpace to Primitive::Pointer.

Something that keeps coming up while doing this is code that looks "obviously wrong" because it assumes all pointers are the same size as Pointer(AddressSpace::DATA)...maybe I should fix that too. (Some targets actually do this in practice, e.g. UEFI, although it's not currently possible to create, e.g. ptr addrspace(272) [64 bit pointer on 32 bit UEFI target] via Rust code.)

though I think that can be changed from u32 to u8?

Technically LLVM allows up to 23 bits (why 23?) for the address space. We need at least u16 to parse 272 in UEFI datalayout strings; I'm not sure if any target goes beyond u16::MAX.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 25, 2023
abi: add AddressSpace field to Primitive::Pointer

...and remove it from `PointeeInfo`, which isn't meant for this.

There are still various places (marked with FIXMEs) that assume all pointers
have the same size and alignment. Fixing this requires parsing non-default
address spaces in the data layout string (and various other changes),
which will be done in a followup.
(That is, if it's actually worth it to support multiple different pointer sizes.
There is a lot of code that would be affected by that.)

Fixes rust-lang#106367

r? `@oli-obk`
cc `@Patryk27`
@bors bors closed this as completed in a8b5e5d Jan 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation O-AVR Target: AVR processors (ATtiny, ATmega, etc.) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants