Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-design constant pools #1385

Open
abrown opened this issue Mar 23, 2020 · 3 comments
Open

Re-design constant pools #1385

abrown opened this issue Mar 23, 2020 · 3 comments

Comments

@abrown
Copy link
Contributor

abrown commented Mar 23, 2020

This is an open-ended issue for re-designing Cranelift's constant pools. In #1377 I added the ability to calculate the address of constants using const_addr and to declare constants in function preambles (e..g const42 = [0 1 2...]). Constant pools are still implemented at the function level, though, so there is no coalescing of constant values across functions.

I added the function-level ConstantPool implementation in order to support SIMD constants (which don't fit in the immediate fields of Cranelift's IR). This is not a pressing issue yet, but before other components start using constants, it might be good to discuss:

  • should we implement constant pools at the global level and what would this look like?
  • what should the relocation hooks look like for a more general constant pool implementation? Currently the interface looks like RelocSink::reloc_constant(&mut self, CodeOffset, Reloc, ConstantOffset) and this may need to change.
@abrown
Copy link
Contributor Author

abrown commented Oct 26, 2020

@cfallin, now that I think I understand the new backend better, I think this issue can be moved a bit further:

How things currently work in the new backend:

  • at the emission level, MachBuffer::pending_constants is the vector containing the constant values; eventually these get emitted into the buffer and the MachLabels are fixed up
  • MachBuffer::defer_constant(label, align, data, max_distance) is the user-accessible way to add a constant during emission (e.g. see uses of sink.defer_constant... in emit.rs)
  • Inst::gen_constant is the user-accessible way to create constant-generating instructions during lowering, but it only accepts values up to u64 in size
  • so there are problems:
    1. Inst::gen_constant cannot be used to generate vector values
    2. MachBuffer::defer_constant will duplicate constant values
    3. const_addr (a way to retrieve the address of constants for read-only table lookups) does not know which MachLabel to use in MachBuffer; to implement this instruction, we would have to search through MachBuffer::pending_constants to find a value that matched the ConstantData behind its const_addr's Constant handle

To simplify the creation of vector constants (all constants, really):

  • change the signature of MachInst::gen_constant to accept a DataValue instead of a u64 (involves fixing up 9 uses of this method)
  • search through lower.rs to see other places that should be using gen_constant but are not (e.g. using PXOR directly)
  • add a way for gen_constant to generate vectors of all 0s (PXOR/XORPS/XORPD) and all 1s (CMPPS/PCMPEQ)
  • modify Inst::XmmLoadConstSeq to become XmmLoadConst--instead of emitting the vector inline with the function code it would use sink.defer_constant
  • all of this suggests some related refactoring:
    1. remove LowerCtx::get_constant: LowerCtx::get_constant and LowerCtx::get_immediate are doing the same thing except that get_constant returns a u64 instead of the larger DataValue
    2. this affects LowerCtx::get_input, which returns a LowerInput with a constant: Option<u64> field--this field could go away (12 uses), replaced by ctx.get_immediate(...)

To keep ConstantPool's de-duplicated values AND allow const_addr to retrieve the address of a constant:

  • MachBuffer::defer_constant should keep a mapping of constant values to MachLabels (e.g. a HashMap?) and only return the MachLabel for a given value, de-duplicating them
  • this is redundant work since ConstantPool has already done this, but it is necessary work because lowering could introduce new constant values
  • I struggle with how to approach this inefficiency (suggestions?): this approach would mean that entire values would have to hashed/compared every time they are used (unlike ConstantPool, which uses Constant(u32) handles)... but perhaps this could be optimized in the future
  • perhaps we can even genericize the ConstantPool and reuse it?

When the old backend goes away:

  • ConstantPool can be simplified: ConstantPool::get_offset, ConstantPool::set_offset, binemit::const_disp4 can all go away since they are only necessary to work around old backend quirks.

@cfallin
Copy link
Member

cfallin commented Oct 26, 2020

Hi @abrown -- thanks for the detailed look into this!

MachBuffer::defer_constant is indeed the way forward here; the infra is built to emit constant pools, but just isn't wired up yet.

The API expects the client (so, the lowering code) to allocate its own labels, and specify the label for each constant. So the most straightforward thing to do here would be, for constants in the ConstantPool, make an initial pass over them and allocate labels, keeping them in a table. Then the lookup from constant-pool handle to MachLabel is just an array access. Or, slightly better, we lazily emit deferred constants, and the lookup table holds Option<MachLabel>.

I think we should probably build that first before deduplication at the MachInst-lowering level; I'm not fully convinced that the overhead of a hashmap lookup on every new constant generation is worth the potential code-size reduction (but we should measure).

So I would say we should:

  • Clean up the constant-related APIs on x64::MachInst and LowerCtx, as you suggest
  • Add a table to the LowerCtx tracking labels for every constant, emitting on first request. Then we have something like LowerCtx::constant_pool_label(...).
  • Eventually, deduplicate inside LowerCtx if we find it to be worthwhile.

@abrown
Copy link
Contributor Author

abrown commented Oct 26, 2020

Yeah, I like this better. I'll try it out and see what happens!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants