Skip to content

Add flag to dump PIR ASTs for certifier #6797

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

jaccokrijnen
Copy link
Contributor

Context

For my certifier in Coq, I need to get my hands on all intermediate PIR ASTs. I've added a plugin flag dump-cert-trace, that dumps all PIR ASTs to a single text file during a compilation.

Dumping to file

To prevent keeping all intermediate ASTs in memory, I'm dumping ASTs directly after a pass has run (rather than collecting a pure list of ASTs and dumping them at the end of the pipeline). To that end, I've added a parameter dumpCert :: Text -> m () to runPass (in the pass abstraction), which can be used to append text to the dump file. It needs to be of type Text -> m () rather than Text -> IO (), since the monad stack m does not necessarily have IO in it (the pass abstraction does not require it). The concrete monad stack in Compiler.hs does have IO, so it can be lifted into m.

As a consequence, this new argument also shows up in functions like PIR.compileProgram, PIR.compileToReadable, PIR.compileReadableToPlc. This feels very clumsy, but I couldn't find a nice way to do it. Here are two ideas:

  • The dump function could go in CompilationCtx as a Text -> IO (), but that requires a MonadIO constraint in Compiling type and the runPass function in order to actually run it. I'm not sure if that is desired.
  • Similar to how logging works (which uses traceM), use unsafePerformIO and add a "pure" function Monad m => Text -> m () in the CompilationCtx

Any other suggestions?

Pretty printing ASTs

The dumping format is textual, and easily parseable in the proof assistant. It's almost Show, but without record syntax and uses Text instead of String for performance. I've added a new typeclass SimpleShow in module Text.SimpleShow, which has some basic instances and the possibility for deriving using GHC generics. I've also written/derived instances for all PIR AST types.

I've added the SimpleShow constraints to the Compiling type/constraint synonym, this required me turning on the following extensions in some places:

  • QuantifiedConstraints (because of the constraint forall t. SimpleShow (uni t))
  • ImpredicativeTypes, to add the constraint forall t. SimpleShow (uni t) to the Compiling type synonym.

Pretty printing pass information

In addition to ASTs I'm also dumping information about which pass was run. Here I've added the PassId type, which is used as an additional field for every basic Pass constructor.

@Unisay Unisay requested a review from a team January 21, 2025 14:07
Copy link
Contributor

@effectfully effectfully left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like SimpleShow. I don't think we need field selectors in Show output, so let's just get rid of them. Which ones do we have?

If for some reason we do need field selectors, then let's use TextShow.Generic instead of rolling out our own inefficient way of doing the same. So we can always do deriving Show Blah via AsTextNoFields Blah for a custom AsTextNoFields newtype-wrapper whose Show instance uses TextShow.Generic.

instance (forall a. SimpleShow (uni a)) => SimpleShow (SomeTypeIn uni) where
simpleShow (SomeTypeIn x) = parens True (simpleShow x)

instance
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please move these two instance to Text.SimpleShow instead? This module is intended to be as Plutus-unspecific as possible. It'll probably become its own library eventually like it happened with prettyprinter-configurable.

{-# LANGUAGE TupleSections #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE ImpredicativeTypes #-}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the pragma.

@@ -1,5 +1,6 @@
-- editorconfig-checker-disable-file
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE ImpredicativeTypes #-}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here. And everywhere where it's not needed.

simpleShow' U1 = ""

instance (SimpleShow' f, SimpleShow' g) => SimpleShow' (f :*: g) where
simpleShow' (x :*: y) = simpleShow' x <> " " <> simpleShow' y
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I bet Text.pack . show is more efficient than this entire machinery, because <> is linear (for both Text and String) and the regular Show is difference-list-based to account for that, unlike your approach.

But why optimize Show at all? You're gonna dump all that stuff on the disk, IO is gonna be your bottleneck, not list creation.

And if you do want to optimize Show, we have configurable pretty-printing specifically to be able to pretty-print things differently, with Doc having O(1) <>.

So SimpleShow is a lot of additional complexity for what's likely negative benefit.

@jaccokrijnen
Copy link
Contributor Author

Thanks @effectfully , I think it's a good idea and I've tried it out. Types that needed a custom Show instances are : Unique, Name, TyName, VarDecl and TyVarDecl.

A different problem I'm running into is that the Show instance for Text is not exactly suitable for my purposes in Coq, which parses string literals in a different way (e.g. \ is not treated in a special way). So I'd prefer to produce a custom format that basically prints the UTF-8 byte sequence. For other parts of the AST I'd just reuse Show instances. Do you have any recommendations on how to do this?

@effectfully
Copy link
Contributor

effectfully commented Jan 30, 2025

A different problem I'm running into is that the Show instance for Text is not exactly suitable for my purposes in Coq, which parses string literals in a different way (e.g. \ is not treated in a special way). So I'd prefer to produce a custom format that basically prints the UTF-8 byte sequence. For other parts of the AST I'd just reuse Show instances. Do you have any recommendations on how to do this?

Well, that does sound like custom pretty-printing then, given that we can't change the Show instance of Text. So given that you wouldn't really benefit from prettyprinter-configurable, you can do something like this:

newtype AsCoq a = AsCoq a
newtype AsCoq1 f a = AsCoq1 (f a)

deriving newtype instance Show (AsCoq Integer)
instance Show (AsCoq Text) where <...>

instance Show (AsCoq (Term Name DefaultUni DefaultFun ann)) where
    showsPrec pr (AsCoq term) = showsPrec (coerce term :: Term Name (AsCoq1 DefaultUni) DefaultFun ann)

deriving anyclass instance Pretty (AsCoq (Term Name DefaultUni DefaultFun ann)) 

Hm, except propagating AsCoq1 in there looks complicated... OK, I'll take a look.

@effectfully
Copy link
Contributor

effectfully commented Jan 30, 2025

You can take the code from #6815 or fix the Show instances at the bottom of UntypedPlutusCore.Core.Instance.CoqShow and we'll merge my PR first and then yours.

It's pretty horrible, but you don't need to understand it, just fix the straightforward instances at the bottom.

An example using the custom Text instance:

example :: Term Name DefaultUni DefaultFun ()
example = Constant () (Some (ValueOf DefaultUniString (Text.pack "abc\n")))

-- >>> showAsCoq example
-- "Constant () (Some (ValueOf DefaultUniString abc\n))"

vs

-- >>> show example
-- "Constant () (Some (ValueOf DefaultUniString \"abc\\n\"))"

@effectfully
Copy link
Contributor

Damn I implemented UPLC instead of PIR. Anyway, adding support for PIR should be a matter of adding a few straightforward instances. I'll do it tomorrow.

@jaccokrijnen
Copy link
Contributor Author

Thanks, that's quite an interesting technique! It's the first time I see how the flexibility of Term's type parameters can be useful. I'll use the code and adapt it to PIR and what I need for Text.

jaccokrijnen and others added 3 commits January 30, 2025 15:12
This adds a plugin flag dump-cert-trace, which if enabled dumps all PIR ASTs
to a .cert_trace file, which can be processed by the WIP coq certifier
(plutus-cert).
@effectfully
Copy link
Contributor

It's the first time I see how the flexibility of Term's type parameters can be useful.

The fun parameter is very useful for tests and costing calibrations and I believe it was totally worth introducing it, but arguably the uni parameter doesn't have a high utility-to-cost ratio.

I'll use the code and adapt it to PIR

You can just move the CoqShow module it to PIR without keeping it on the UPLC side, since I assume we aren't exporting UPLC into Coq quite yet. Please do keep the mapUni functions in UPLC and TPLC though, these can be useful for others things.

@bezirg
Copy link
Contributor

bezirg commented Feb 4, 2025

Any other suggestions?

Maybe you can reuse the MonadWriter CoverageIndex, changing it to MonadWriter (CertTrace,CoverageIndex)?

Another possibility is to pass around the dumpCert function as an ImplicitParam.

NamedPass n pass -> NamedPass n (hoistPass f pass)
NoOpPass -> NoOpPass
hoistPass f = \case
Pass p mainPass pre post -> Pass p (f . mainPass) pre post
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestin: use a more descriptive variable name, like passId or pId

@@ -550,19 +553,27 @@ runCompiler moduleName opts expr = do
-- GHC.Core -> Pir translation.
pirT <- original <$> (PIR.runDefT annMayInline $ compileExprWithDefs expr)
let pirP = PIR.Program noProvenance plcVersion pirT
when (opts ^. posDumpPir) . liftIO $
when (opts ^. posDumpPir) . liftIO $ do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: unnecessary change

spirP <- flip runReaderT pirCtx $ PIR.compileToReadable pirP
when (opts ^. posDumpPir) . liftIO $
spirP <- flip runReaderT pirCtx $ PIR.compileToReadable (liftIO . dumpCert) pirP
when (opts ^. posDumpPir) . liftIO $ do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also I think unnecessary do

when (opts ^. posDumpPlc) . liftIO $
dumpFlat (void plcP) "typed PLC program" (moduleName ++ ".tplc-flat")

-- For now, we only dump the PIR -> PLC ASTs for the certifier,
-- so the handle can be closed.
liftIO closeCertHandle
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not extremely important, but it is better to use the withHandle-idiom here to make sure that the handle is closed in case of an exception thrown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants