Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Epic] CEK machine optimizations #5955

Open
effectfully opened this issue May 4, 2024 · 1 comment
Open

[Epic] CEK machine optimizations #5955

effectfully opened this issue May 4, 2024 · 1 comment

Comments

@effectfully
Copy link
Contributor

effectfully commented May 4, 2024

This issue is for dumping all the plans regarding optimization of the CEK machine in a largely unstructured way.

@effectfully
Copy link
Contributor Author

effectfully commented May 4, 2024

  • we should try specializing the CEK machine at the restricting ExBudgetMode as it's the one that is used in production and having that code specialized may impact performance significantly
  • do we really need to do costing in ST? Can we just subtract from the current budget purely or whatever?
  • do we really need slippage in production? Can we just keep subtracting from the main budget and that's it?
  • Data.RandomAccessList.Class.indexOne returns a Maybe, the strict EvaluationResult instead would be more efficient. Or we could get rid of the Maybe entirely. PR: [Evaluation] [Performance] Tweak 'safeIndexOne' #6663 (-12%), [Evaluation] [Names] Define all lookups in terms of 'contIndexZero' #6702 (-1%)
  • We should statically unroll itraverseCounter_ in order not do it at runtime. PR: [Evaluation] Statically unroll 'itraverseCounter_' #5265 (-2.5%)
  • GHC Core for UntypedPlutusCore.Evaluation.Machine.Cek.ExBudgetMode contains unnecessary packing and unpacking, for example 0# -> jump $j_sGXo r#_iGvO (I# r#_iGvO). PR: [Evaluation] [Performance] Strictify 'spend' #6705 (-1.8%)
  • UntypedPlutusCore.Evaluation.Machine.Cek.Internal for the Vector-based Case-expressions is unoptimized: it contains things like int2Word# followed by a word2Int# (i.e. a non-free no-op) and what looks like an unnecessary call to integerFromWord64#
  • Is it possible to support join points in the CEK machine? The "Compiling without continuations" paper seems to say “yes”. If we had join points, we could’ve better implemented case-of-case (as per the paper) and possibly CSE
  • currently we implement the stack in the CEK machine as a pure data structure, but it is conceivable that a mutable one would be faster, so we should try it out. Doesn’t seem to be too complicated, given that the CEK machine runs in the ST monad anyway.
  • currently we use Vector in FrameCases. Would it be more efficient to use Array/SmallArray/custom data/array-primops/manual loop unrolling?
  • we should make the CEK machine parameterized by the type of names (using a LookupName name env class or whatever), so that we don't need to juggle those performance-hurting textual names -- as we should've done originally anyway (extricated from Variable name size can impact evaluation time? Or just drop the names #5834). PR (very unexpected results): [Evaluation] [Performance] Ditch 'NamedDeBruijn' in favor of 'DeBruijn' #6706
  • discharging-related bookkeeping costs us some, we should try dropping discharging at least to see how much it costs us, we're now supposed to return () anyway
  • we should return a dedicated CekResult value from the CEK machine, not just tuples. Would be a tad more efficient a whole lot less potentially problematic
  • Note [Term constructor ordering and numbers] explains how ordering of constructors has an impact of efficiency, but it seems like the Case and Constr constructors don't align with that reasoning. Also the Note says there's 8 constructors, but there's 10 now
  • Quoting Kenneth: I'm also wondering about the other implementations of RandomAccessList here. I don't remember the details, but I think we probably tried a number of different implementations in order to see which was the best. I ran index-envs-bench and looked at the html report and it looks as if the different versions win and lose in different places. In particular SkewBinarySlab looks as if it performs quite similarly. It mght be worth (a) refactoring SkewBinarySlab in a smilar way to SkewBinary (having similar implementations might help with maintainability anyway), and (b) running all of the benchmarks with CEK machines modified to use each of the different implementations in order to check that SekwBinary is still the best thing to use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant