Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Strings in the Anoma backend #2789

Merged
merged 1 commit into from
May 28, 2024
Merged

Add support for Strings in the Anoma backend #2789

merged 1 commit into from
May 28, 2024

Conversation

paulcadman
Copy link
Collaborator

@paulcadman paulcadman commented May 24, 2024

This PR adds support for the String type, String literals and string concatenation to the Nockma backend. Support for the builtins show and intToString is not supported.

Example

test079.juvix

module test079;

import Stdlib.Prelude open;

main (s : String) : String :=
  s ++str " " ++str "✨ héllo" ++str " " ++str "world ✨";

args.nockma

[quote "Juvix!"]
$ juvix compile anoma test079.juvix
$ juvix dev nockma run test079.pretty.nockma --args args.nockma
"Juvix! ✨ héllo world ✨"

String representation

A String is a sequence of UTF-8 encoded bytes. We interpret these bytes as a sequence of bits to represent the string as an integer atom in nockma.

For example:

The string "a" is UTF-8 encoded as 97 which is 0b1100001 in bits.

The string "ab" is UTF-8 encoded at the pair of bytes: 97 98 which is 0b1100001 0b1100010.

When we combine the bytes into a single sequence of bits we must take care to pad each binary representation with zeros to each byte boundary.

So the binary representation of "ab" as an atom is 0b110000101100010 or 24930 as an integer atom.

String concatenation

We use the cat function in the Anoma stdlib to concatenate the bytes representing two strings.

We need to use the block parameter 3 in the Anoma call because we want to treat the atoms representing the strings as sequences of bytes (= 2^3 bits).

To find the relevant Nock code to call cat with block parameter 3 we use the urbit dojo as follows:

 =>  anoma  !=(~(cat block 3))
[8 [9 10 0 7] 9 4 10 [6 7 [0 3] 1 3] 0 2]

Stdlib intercept in Evaluator

The evaluator has support for strings using AtomHints, so strings can be printed and traced. The stdlib cat call is also intercepted because evaluating the unjetted hoon version is slow.

String support in pretty nockma

In a pretty nockma file or nock quasi-quote you can write double quoted string literals, e.g "abc". These are automatically translated to UTF-8 integer atoms as in the previous section.

@paulcadman paulcadman self-assigned this May 24, 2024
@paulcadman paulcadman force-pushed the anoma-string branch 2 times, most recently from afcd497 to 16748d8 Compare May 24, 2024 17:39
@paulcadman paulcadman added this to the 0.6.2 milestone May 24, 2024
@paulcadman paulcadman marked this pull request as ready for review May 28, 2024 08:46
@paulcadman paulcadman merged commit 9faa88d into main May 28, 2024
6 of 8 checks passed
@paulcadman paulcadman deleted the anoma-string branch May 28, 2024 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants