StorageLive/StorageDead not always emitted in Debug mode in latest upstream. #301

Closed
ptersilie opened this issue Mar 24, 2021 · 13 comments

@ptersilie
Contributor

The latest rustc upstream (rust-lang/rust#78360) added a change that avoids creating StorageLive/StorageDead instructions if they are not required for codegen. Our previous assumption was that all locals (except arguments, which live for the entirety of the function) have StorageLive and StorageDead instructions. This breaks our computation of the live locals needed for stopgap interpreter initialisation.

For example, for the following function

    fn guard(i: u8) -> u8 {
        if i != 3 {
            9
        } else {
            10
        }
    }

previously, bb0 looked like this:

    bb0: {
        StorageLive(_2);
        _2 = _1; 
        switchInt(move _2) -> [3_u8: bb2, otherwise: bb1];
    }

In the latest upstream, it looks like this:

    bb0: {
        _2 = _1; 
        switchInt(move _2) -> [3_u8: bb2, otherwise: bb1];
    }

Since at this time we don't want to write our own liveness analysis (we all know how that turned out last time), a quick and dumb solution is to emit a StorageLive for every local at the beginning of a function and a StorageDead at the end of it (unless a local already has those). Ideally, this would be done during SIR lowering so there is no runtime cost.
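
A minimal sketch of that transformation, using hypothetical stand-in types (`Stmt`, `Body`) rather than the real SIR data structures, and assuming locals are numbered the MIR way (local 0 is the return place, 1..=num_args are arguments):

    // Hypothetical stand-in types; the real SIR data structures differ.
    enum Stmt {
        StorageLive(usize),
        StorageDead(usize),
        Other, // any other SIR statement
    }

    struct Body {
        num_args: usize,           // locals 1..=num_args are arguments
        num_locals: usize,         // total locals; local 0 is the return place
        blocks: Vec<Vec<Stmt>>,    // blocks[0] is the entry block
        return_blocks: Vec<usize>, // indices of blocks that end in a return
    }

    /// The "quick and dumb" fix: any non-argument local with no explicit
    /// storage markers gets a StorageLive at the start of the entry block
    /// and a StorageDead at the end of every return block.
    fn add_default_storage_markers(body: &mut Body) {
        for local in (body.num_args + 1)..body.num_locals {
            let has_markers = body.blocks.iter().flatten().any(|s| {
                matches!(s, Stmt::StorageLive(l) | Stmt::StorageDead(l) if *l == local)
            });
            if has_markers {
                continue;
            }
            body.blocks[0].insert(0, Stmt::StorageLive(local));
            for &b in &body.return_blocks {
                body.blocks[b].push(Stmt::StorageDead(local));
            }
        }
    }

Doing this during lowering keeps the extra statements out of the interpreter's hot path, at the cost of the over-long live ranges discussed further down the thread.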

@vext01 vext01 self-assigned this Mar 24, 2021
@bjorn3
Collaborator

bjorn3 commented Mar 24, 2021

Maybe you could use the MaybeRequiresStorage dataflow analysis? In the absence of StorageDead it never kills locals though, I think.

@vext01
Contributor

vext01 commented Mar 24, 2021

In absence of StorageDead it never kills locals though I think.

Can you expand on this? Doesn't this pass compute the positions of the Lives and Deads itself (so we wouldn't expect any Deads to exist beforehand, would we)?

@bjorn3
Collaborator

bjorn3 commented Mar 24, 2021

If there are no StorageDead statements, it will never consider the locals as dead. It is not possible to compute StorageDead after the fact; it has to happen during MIR building, because StorageDead itself defines the moment a local is no longer accessible. Imagine you have a raw pointer (not a reference) to a non-drop local. When does this pointer become invalid? The moment the StorageDead for that local is executed.
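
A small surface-Rust illustration of that rule (my own example; the MIR comments are assumptions about how the snippet is lowered, not compiler output):

    fn main() {
        let p: *const u8;
        {
            let x: u8 = 42;          // MIR: StorageLive for x; x = 42;
            p = &x as *const u8;     // raw pointer to a non-drop local
            // Fine: x's storage is still live here.
            unsafe { assert_eq!(*p, 42); }
        }                            // MIR: StorageDead for x -- p is dangling from here on.
        // Dereferencing p now would be undefined behaviour, even though
        // nothing else has reused the stack slot yet.
        let _ = p;
    }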

@ptersilie
Contributor Author

Do you mean that when MIR doesn't emit StorageDeads in debug mode, it assumes the local is alive forever, even across the function boundary? So if we go ahead and emit StorageDeads at the end of a function, we may cause UB, since MIR might have intended to use those values later on?

@vext01
Contributor

vext01 commented Mar 25, 2021

I'd like to ask a more fundamental question: what does MaybeRequiresStorage do?

Its docstring:

  /// Dataflow analysis that determines whether each local requires storage at a
  /// given location; i.e. whether its storage can go away without being observed.

I had first assumed that it takes MIR without any live/dead annotations and adds them. That would indeed be useful to us. But looking at its implementation, it looks like it tries to remove local variables that may or may not already have live ranges annotated with StorageLive/StorageDead. So it's more like an unused-variable killer for MIR.

If I'm right above, I'm not sure of the utility of it in helping us to add live ranges. Have I missed something?

@bjorn3
Collaborator

bjorn3 commented Mar 25, 2021

@ptersilie

Do you mean, that when MIR doesn't emit StorageDeads in debug mode, it assumes the local to be alive forever, even across the function boundary?

No, until the end of the function.

@vext01

I'd like to ask a more fundamental question: what does MaybeRequiresStorage do?

One of the uses of that dataflow pass is to determine which variables are live across yield points in a generator and thus need to be stored in the generator state.
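
As an illustration (my own, not taken from rustc): an async fn desugars to a generator-like state machine, and only the locals that are live across the `.await` have to end up in that state:

    use std::future::Future;

    async fn example(read: impl Future<Output = usize>) -> usize {
        let tmp: u64 = 40 + 2;            // no use after the await and no destructor:
        println!("starting with {}", tmp); // its storage need not survive the yield
        let buf = [0u8; 16];              // read after the await: must be kept
        let n = read.await;               // in the state machine's state
        buf.len() + n
    }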

If I'm right above, I'm not sure of the utility of it in helping us to add live ranges. Have I missed something?

Yeah, it doesn't help with determining when variables become dead, but it does help a bit determining when variables become live.

@ptersilie
Contributor Author

No, until the end of the function.

Right, then I don't really see an issue with generating deads at the end of a function. Or can you think of an example where doing that would make things go wrong?

@bjorn3
Collaborator

bjorn3 commented Mar 25, 2021

Right, then I don't really see an issue with generating deads at the end of a function.

There is no functional issue with that, but I believe it will have a performance issue with the current regalloc of Yorick.

@vext01
Contributor

vext01 commented Mar 25, 2021

Agreed, because a local that lives longer than it needs to unnecessarily deprives the system of a register, potentially causing spills.

Another option (which I haven't looked at yet): we could disable the code in rustc that eliminates the StorageLives/StorageDeads.

@bjorn3
Collaborator

bjorn3 commented Mar 25, 2021

There are two kinds of locals that matter here. First, locals that have their address taken: those need to be on the stack anyway, and losing the ability to collapse stack slots whose live ranges don't overlap in debug mode isn't a huge deal (Cranelift doesn't collapse them either). Second, locals that don't have their address taken: for those it is possible to get live ranges as precise as with StorageLive/StorageDead by using a dataflow analysis to find the first and last usage of every local.
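
For the second kind, here is a minimal sketch of such a backward liveness dataflow, over a deliberately simplified IR (the `Stmt`/`Block` types are illustrative, not Yorick's SIR or rustc's MIR):

    use std::collections::HashSet;

    // Each statement defines and/or uses some locals (identified by index);
    // terminators are modelled as ordinary statements plus successor edges.
    struct Stmt { defs: Vec<usize>, uses: Vec<usize> }
    struct Block { stmts: Vec<Stmt>, succs: Vec<usize> }

    /// Classic backward liveness, iterated to a fixed point:
    /// live_in(b) = uses(b) ∪ (live_out(b) − defs(b)),
    /// live_out(b) = ∪ live_in(s) over successors s of b.
    fn liveness(blocks: &[Block]) -> Vec<HashSet<usize>> {
        let mut live_in: Vec<HashSet<usize>> = vec![HashSet::new(); blocks.len()];
        let mut changed = true;
        while changed {
            changed = false;
            for (i, block) in blocks.iter().enumerate().rev() {
                // live_out = union of live_in of all successors.
                let mut live: HashSet<usize> = HashSet::new();
                for &s in &block.succs {
                    live.extend(live_in[s].iter().copied());
                }
                // Walk the block backwards, killing defs and adding uses.
                for stmt in block.stmts.iter().rev() {
                    for d in &stmt.defs { live.remove(d); }
                    for u in &stmt.uses { live.insert(*u); }
                }
                if live != live_in[i] {
                    live_in[i] = live;
                    changed = true;
                }
            }
        }
        live_in
    }

Extending the backward walk to record the live set at every statement gives the first and last use of each local, which is where the precise StorageLive/StorageDead markers would go.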

@ptersilie
Contributor Author

There is no functional issue with that, but I believe it will have a performance issue with the current regalloc of Yorick.

Of course, but I think @vext01 and I had already agreed that as a temporary solution we can live with that. Our register allocator is pretty dumb anyway at the moment. But if @vext01 wants to have another go at dataflow analysis I won't stand in the way (on the contrary I will probably be standing very, very far away 😉 ).

@vext01
Contributor

vext01 commented Mar 25, 2021

It's actually not that hard and there are lots of existing algorithms.

The mistake we made last time was that we did it on the TIR trace.

I suggest we do it in two phases:

  • the dumb but easy way
  • the better, but more involved way.

I don't mind doing this work either. It's been a while since I got my teeth into something non-build-system-or-ci-related.

@vext01
Contributor

vext01 commented Jun 24, 2021

Closing, as this is for the old yk design.

@vext01 vext01 closed this as completed Jun 24, 2021