Add array built-in functions to the spec #7054

Unisay · 2025-04-22T15:25:36Z

No description provided.

Unisay · 2025-04-22T15:27:06Z

doc/plutus-core-spec/cardano/builtins6.tex

+      & $[] \mapsto []$
+        \newline 
+        $[x_1] \mapsto [x_1]$
+        \newline 
+        $[x_1,\ldots,x_n] \mapsto [x_1,\ldots,x_n]$ 


I don't know a better denotation. Maybe we should use a different syntax to denotate arrays? @effectfully please suggest, what would you do?

This is how the whole table looks when rendered:

This looks good, although I think you could miss out the two [x₁] ↦ ... cases since they're covered by [x₁, ..., xₙ] with n=1 (and arguably the [] case is covered by n=0, but that might be a bit confusing).

Also,

[x₁, ..., xₙ] ↦ [x₁, ..., xₙ]

might look a bit weird,but on page 33 we've already got (x,y) ↦ (x,y) for the denotation of mkPairData, so its maybe not such a big deal.

I think you could miss out the two [x₁] ↦ ... cases since they're covered by [x₁, ..., xₙ] with n=1 (and arguably the [] case is covered by n=0, but that might be a bit confusing).

Yeah, I wasn't sure if [x₁, ..., xₙ] ↦ [x₁, ..., xₙ] subsumes [x₁] ↦ [x₁] or even [] ↦ [] so I went for more explicit notation. Happy to simplify it if you think that there is an ambiguity.

kwxm

Well done for getting to grips with the TeX! This looks pretty good, although I think the flat encoding may be wrong.

kwxm · 2025-04-22T21:55:43Z

doc/plutus-core-spec/cardano/builtins6.tex

There should probably be a paragraph at the start of the section saying that we've added a new type array(t) and saying what its denotation is: see Tables 4.7 and 4.8 on page 29. The question is what should ⟦array(t)⟧ be? Making it ⟦t⟧*, the set of all tuples of elements of the denotation of t is perfectly sensible even though that's the same as the denotation of list(t), since the only observable difference is in the operations we provide and their complexity. We could introduce some extra notation to distinguish tuples representing arrays from those representing lists, but I don't know if that's really necessary. Maybe an explanatory sentence when introducing the array type operator would be just as good.

Added a Batch 6 sub-sub-section for the array type operator and its concrete syntax.

kwxm · 2025-04-22T22:25:16Z

doc/plutus-core-spec/flat-serialisation.tex

@@ -528,7 +531,8 @@ \subsection{Constants}
  & \Econstant{\ty{bool}}(s, \texttt{False})       &&= s \cdot \bits{0}\\
  & \Econstant{\ty{bool}}(s, \texttt{True})        &&= s \cdot \bits{1}\\
  & \Econstant{\listOf{\tn}}(s,l)                  &&= \Elist^{\tn}_{\mathsf{constant}}(s, l) \\
-  & \Econstant{\pairOf{\tn_1}{\tn_2}}(s,(c_1,c_2))  &&= \Econstant{\tn_2}(\Econstant{\tn_1}(s, c_1), c_2)\\
+  & \Econstant{\arrayOf{\tn}}(s,l)                 &&= \Earray^{\tn}_{\mathsf{constant}}(s, l) \\


I think that flat encodes arrays in a different way from lists. I have a program here that'll take a flat file and decode it, describing what the bits of the encoding mean. I wrote that to help me understand the flat encoding so that I could write a proper specifcation of the parts that we use. I don't quite remember why, but I added the array operations a few weeks ago and had to write some code to decode arrays. After a bit of trial and error, it seemed that arrays are encoded by emitting a natural number that says how many elements it has, then emitting the encodings of the individual elements, and then for some reason emitting 8 zero bits afterwards. I didn't actually check the implementation though, so maybe you should have a look and see if that's correct.

I had a look but I couldn''t work out how the flat encoding for arrays is determined. I checked again and arrays are definitely encoded differently from lists.

Right, this is how vectors/arrays are "flat"-coded:

https://hackage.haskell.org/package/flat-0.6/docs/Flat-Instances-Vector.html

https://hackage.haskell.org/package/flat-0.6/docs/Flat-Instances-Array.html

The haddock for the Vector instance says

Vectors are encoded as arrays.

but that seems to be very misleading. The Haskell Array type is more like some kind of association list and you can see that in the examples for the Array instance: it's encoded as a sequence of (index,element) pairs. That's definitely not what it's doing with UPLC arrays. It seems to be doing something much more like what you'd do with standard arrays. In the source for the flat library it has

instance Flat a => Flat (V.Vector a) where size = sizeSequence encode = encodeSequence decode = decodeSequence

and in the the file that defines `encodeSequence it says

In practice, this means that the sequence is encoded as a sequence of blocks of up to 255 elements, with every block preceded by the count of the elements in the block and a final 0-length block.

This looks much more like what I observed, and my flat debugger fails when I give it an array with more than 255 elements because I assumed that there was only one block. I think (but haven't checked in detail) that it's encoding an array as a sequence of blocks, each one consisting of a size (eight bits long?) followed by that many elements, and then the 00000000 at the end is an empty block that tells you thhat you've finished. Presumably that means that you can serialise an array in many different ways depending on how you break it into blocks. I guess we''ll have work out the details and put it all into the specification. I think something similar happens in CBOR and that's described in our specification, so maybe we could steal that, or at least a modified version.

You're right, my experiments show that Strict.Vectors are encoded as sequences:

#!/usr/bin/env runhaskell import Data.Bits (testBit) import Data.ByteString (ByteString, unpack) import Data.Vector.Strict (Vector) import Data.Vector.Strict qualified as StrictVector import Data.Word (Word8) import Flat (DecodeException, Flat (..), flat, unflat) import Flat.Instances.Test (tstBits) import Flat.Instances.Vector () main :: IO () main = do let words :: Vector Word8 words = StrictVector.fromList [1, 2, 3, 4, 5] bytes :: ByteString bytes = flat words print $ tstBits words -- (True,56,"00000101 00000001 00000010 00000011 00000100 00000101 00000000") putStrLn $ renderBits bytes -- 00000101 00000001 00000010 00000011 00000100 00000101 00000000 00000001 let words' :: Either DecodeException (Vector Word8) words' = unflat bytes print words' -- Right [1,2,3,4,5] renderBits :: ByteString -> String renderBits bs = unwords [byteToBits w | w <- unpack bs] where byteToBits :: Word8 -> String byteToBits byte = [if testBit byte i then '1' else '0' | i <- [7, 6 .. 0]] -- from module Data.Vector.Orphans () in the Plutus repo instance (Flat a) => Flat (Vector a) where size = size . StrictVector.toLazy -- Strict to Lazy is O(1) encode = encode . StrictVector.toLazy decode = StrictVector.fromLazy <$> decode -- Strict from Lazy is O(1)

I interpret the encoded bits as:

00000101: size = 5 00000001: 1 00000010: 2 00000011: 3 00000100: 4 00000101: 5 00000000: end of the vector 00000001: padding

Unisay self-assigned this Apr 22, 2025

Unisay temporarily deployed to github-pages April 22, 2025 15:25 — with GitHub Actions Inactive

Unisay requested a review from effectfully April 22, 2025 15:25

Unisay commented Apr 22, 2025

View reviewed changes

kwxm requested changes Apr 22, 2025

View reviewed changes

Unisay added 2 commits April 23, 2025 16:20

Add array built-in functions to the spec

3406244

Add array type operator and its concrete syntax to Batch 6

b9499e4

Unisay force-pushed the yura/plc-spec-builtin-array branch from a52b9f8 to b9499e4 Compare April 23, 2025 14:54

Unisay temporarily deployed to github-pages April 23, 2025 14:54 — with GitHub Actions Inactive

Unisay added the No Changelog Required Add this to skip the Changelog Check label Apr 23, 2025

spec: flat serialisation of Arrays

9cb273f

Unisay temporarily deployed to github-pages April 25, 2025 09:27 — with GitHub Actions Inactive

Unisay requested a review from kwxm April 25, 2025 09:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add array built-in functions to the spec #7054

Add array built-in functions to the spec #7054

Unisay commented Apr 22, 2025

Unisay Apr 22, 2025

Unisay Apr 22, 2025

kwxm Apr 22, 2025

Unisay Apr 23, 2025

kwxm left a comment

kwxm Apr 22, 2025

Unisay Apr 23, 2025 •

edited

Loading

kwxm Apr 22, 2025

kwxm Apr 22, 2025

Unisay Apr 23, 2025 •

edited

Loading

kwxm Apr 23, 2025

Unisay Apr 24, 2025

Unisay Apr 25, 2025

Add array built-in functions to the spec #7054

Are you sure you want to change the base?

Add array built-in functions to the spec #7054

Conversation

Unisay commented Apr 22, 2025

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kwxm left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Unisay Apr 23, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Unisay Apr 23, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Unisay Apr 23, 2025 •

edited

Loading

Unisay Apr 23, 2025 •

edited

Loading