Zeolite is a statically-typed, general-purpose programming language. The type system revolves around defining objects and their usage patterns.
Zeolite prioritizes making it easy to write maintainable and understandable code. This is done by rethinking standard language idioms and limiting flexibility in some places while increasing it in others. In particular, emphasis is placed on the user's experience when troubleshooting code that is incorrect.
The design of the type system and the language itself is influenced by positive and negative experiences with Java, C++, Haskell, Python, Ruby, and Go, with collaborative development, and with various systems of code-quality enforcement.
Due to the way GitHub renders embedded HTML, the colors might not show up in the syntax-highlighted code in this document. If you use the Chrome browser, you can view the intended formatting using the Markdown Viewer extension to view the raw version of this document.
- Project Status
- Language Overview
- Quick Start
- Writing Programs
- Layout and Dependencies
- Unit Testing
- Compiler Pragmas and Macros
- Known Language Limitations
Zeolite is still evolving in all areas (syntax, build system, etc.), and it still lacks a lot of standard library functionality. That said, it was designed with practical applications in mind. It does not prioritize having impressive toy examples (e.g., merge-sort or "Hello World" in one line); the real value is seen in programs with higher relational complexity.
This section discusses some of the features that make Zeolite unique. It does not go into detail about all of the language's features; see Writing Programs and the full examples for more specific language information.
Zeolite currently uses both procedural and object-oriented programming paradigms. It shares many features with Java, but it also has additional features and restrictions meant to simplify code maintenance.
The initial motivation for Zeolite was a type system that allows implicit conversions between different parameterizations of parameterized types. A parameterized type is a type with type "place-holders", e.g., `template`s in C++ and generics in Java.

Java and C++ do not allow you to safely convert between different parameterizations. For example, you cannot safely convert a `List<String>` into a `List<Object>` in Java. This is primarily because `List` uses its type parameter for both input and output.
Zeolite, on the other hand, uses declaration-site variance for each parameter. (C# also does this to a lesser extent.) This allows the language to support very powerful recursive type conversions for parameterized types. Zeolite also allows use-site variance declarations, like Java uses.
Building variance into the core of the type system also allows Zeolite to have a special meta-type that interfaces can use to require that implementations return a value of their own type rather than the type of the interface. This is particularly useful for defining interfaces for iterators and builders, whose methods often perform an update and return a value of the same type.
Zeolite treats type parameters both as type place-holders (like in C++ and Java) and as type variables that you can call functions on. This further allows Zeolite to have interfaces that declare functions that operate on types in addition to interfaces that declare functions that operate on values. (This would be like having abstract `static` methods in Java.)

This helps solve a few separate problems:

- Operations like `equals` comparisons in Java are always dispatched to the left object, which could lead to inconsistent results if the objects are swapped: `foo.equals(bar)` is not the same as `bar.equals(foo)`. This dispatching asymmetry can be eliminated by making `equals` a type function (e.g., `MyType.equals(foo, bar)`), and further creating an interface that requires implementations to support such calls.

- Factory patterns can be abstracted out into interfaces. For example, you could create a factory interface that requires an implementation to parse a new object from a `String`, without needing to instantiate the factory object itself. You could just implement the factory function directly in `MyType`, without needing a separate `MyTypeFactory`.
The major advantage of statically-typed programming languages is their compile-time detection of code that should not be allowed. On the other hand, there is a major testability gap when it comes to ensuring that your statically-typed code disallows what you expect it to.
Zeolite has a special source-file extension for unit tests, and a built-in compiler mode to run them.
- Tests can check for runtime success, compilation success, compilation failure, and even crashes. Normally you would need a third-party test runner to check for required compilation failures and crashes.

- The test mode includes a command-line option to collect code-coverage data, which can be critical for determining test efficacy.
Nearly all of the integration testing of the Zeolite language itself is done using this feature, but it is also supported for general use with Zeolite projects.
The Zeolite compiler supports a module system that can incrementally compile projects without the user needing to create build scripts or `Makefile`s.
- Modules are configured via a simple config file.
- File-level and symbol-level imports and includes are not necessary, allowing module authors to freely rearrange file structure.
- Type dependencies are automatically resolved during linking so that output binaries contain only the code that is relevant.
- Module authors can back Zeolite code with C++.
- The module system is integrated with the compiler's built-in testing mode.
This means that the programmer can focus on code rather than on build rules, and module authors can avoid writing verbose build instructions for the users of their modules.
The overall design of Zeolite revolves around data encapsulation:
- No default construction or copying. This means that objects can only be created by explicit factory functions. (A very common mistake in C++ code is forgetting to disallow or override default construction or copying.) This also means that accidental deep copying is not even possible in Zeolite.

- Only abstract interfaces can be inherited. Types that define procedures or contain data members cannot be further extended. This encourages the programmer to think more about usage patterns and less about data representation when designing interactions between types.

- No "privileged" data-member access. No object has direct access to the data members of any other object; not even other objects of the same type. This forces the programmer to also think about usage patterns when dealing with other objects of the same type.

- No `null` values. Every variable must be explicitly initialized. This obviates questions of whether or not a variable actually contains a value. When combined with the elimination of default construction, this means that a variable can only hold values that a type author has specifically allowed.

  Zeolite has an `optional` storage modifier for use in a variable's type, but it creates a new and incompatible type, whose value cannot be used without first extracting it. This is in contrast to Java allowing any variable to be `null`. (It is more like `Optional` in Java 8, but as built-in syntax, and with nesting disallowed.)

- No reflection or down-casting. The only thing that has access to the "real" type of a value is the value itself. This means that the reflection and down-casting tricks available in Java to circumvent the compiler are not available in Zeolite. (User-defined types can selectively allow "down-casting like" semantics by reasoning about parameters as variables, however.)

- Implementation details are kept separate. In Zeolite, only public inheritance and public functions show up where an object type is declared, to discourage relying on implementation details of the type.

  C++ and Java allow (and in some cases require) implementation details (data members, function definitions, etc.) to show up in the same place as the user-accessible parts of a `class`. The result is that the user of the `class` will often rely on knowledge of how it works internally.
Although all of these limitations preclude a lot of design decisions allowed in languages like Java, C++, and Python, they also drastically reduce the possible complexity of inter-object interactions. Additionally, they generally do not require ugly work-arounds; see the full examples.
Requirements:
- A POSIX-compliant operating system. Zeolite has been tested on Linux and FreeBSD, and to a limited extent on MacOS. It probably won't work on Windows, due to how it interacts with the filesystem and subprocesses.
- A Haskell compiler such as `ghc` that can install packages using `cabal`, as well as the `cabal` installer.
- A C++ compiler such as `clang++` or `g++` and the standard `ar` archiver present on most Unix-like operating systems.

If you use a modern Linux distribution, most of the above can be installed using the package manager that comes with your distribution. On MacOS, you can install Xcode for a C++ compiler and `brew install cabal-install` for `cabal`.
Once you meet all of those requirements, follow the installation instructions for the `zeolite-lang` package on Hackage. Please take a look at the issues page if you run into problems.

The entire process will probably look like this, once you have `cabal` and a C++ compiler installed:
$ cabal update
# Also add --overwrite-policy=always if you're upgrading to a specific version.
$ cabal install zeolite-lang
$ zeolite-setup -j8
# Follow interactive prompts...
You might also need to add `$HOME/.cabal/bin` or `$HOME/.local/bin` to your `$PATH`.

For syntax highlighting in Visual Studio Code, see "VS Code Support" in the Zeolite releases and download the `.vsix` file. If you happen to use the `kate` text editor, you can use the syntax highlighting in `zeolite.xml`.
It's the any% of programming.
```
// hello-world.0rx

concrete HelloWorld {
  @type run () -> ()
}

define HelloWorld {
  run () {
    \ BasicOutput.stderr().writeNow("Hello World\n")
  }
}
```
# Compile.
zeolite -I lib/util --fast HelloWorld hello-world.0rx
# Execute.
./HelloWorld
Also see some full examples for more complete feature usage.
This section breaks down the separate parts of a Zeolite program. See the full examples for a more integrated language overview.
Zeolite programs use object-oriented and procedural programming paradigms.
Type categories are used to define object types, much like `class`es in Java and C++. They are not called "classes", just to avoid confusion about semantic differences with Java and C++.
All type-category names start with an uppercase letter and contain only letters and digits.
All procedures and data live inside `concrete` type categories. Every program must have at least one `concrete` category with the procedure to be executed when the program is run.

`concrete` categories are split into a declaration and a definition. Code for both should be in files ending with `.0rx`. (The `.0rp` file type contains only declarations, and will be discussed later.)
```
// myprogram/myprogram.0rx

// This declares the type.
concrete MyProgram {
  // The entry point must be a () -> () function. This means that it takes no
  // arguments and returns no arguments. (@type will be discussed later.)
  @type run () -> ()
}

// This defines the type.
define MyProgram {
  run () {
    // ...
  }
}
```
IMPORTANT: All programs or modules must be in their own directory so that `zeolite` is able to cache information about the build. Unlike some other compilers, you do not specify all command-line options every time you recompile a binary or module.
# Create a new .zeolite-module config. (Only once!)
zeolite -m MyProgram myprogram
# Recompile the module and binary. (After any config or code updates.)
# All sources in myprogram will be compiled. -m MyProgram selects the entry
# point. The default output name for the binary here is myprogram/MyProgram.
zeolite -r myprogram
# Execute.
myprogram/MyProgram
# An alternative, if you only have one .0rx and want to quickly iterate.
zeolite --fast MyProgram myprogram/myprogram.0rx
A function declaration specifies the scope of the function and its argument and return types. (And optionally type parameters and parameter filters, to be discussed later.) The declaration simply indicates the existence of a function, without specifying its behavior.
All function names start with a lowercase letter and contain only letters and digits.
```
concrete MyCategory {
  // @value indicates that this function requires a value of type MyCategory.
  // This function takes 2x Int and returns 2x Int.
  @value minMax (Int, Int) -> (Int, Int)

  // @type indicates that this function operates on MyCategory itself. This is
  // like a static function in C++.
  // This function takes no arguments and returns MyCategory.
  @type create () -> (MyCategory)

  // @category indicates that this function operates on MyCategory itself. This
  // is like a static function in Java. (The semantics of @category are similar
  // to those of @type unless there are type parameters.)
  @category copy (MyCategory) -> (MyCategory)
}
```
In many cases, the choice between using a `@category` function or a `@type` function is arbitrary, but there are pros and cons of each (also see the sketch after this list):

- `@category` functions do not inherit any of the category's parameters, or their filters. This can be useful in a few situations:

  - You want to impose additional restrictions on what parameters can be used. For example, `Vector:createSize` (`lib/container`) requires that param `#y` `defines Default` so that it can populate the `Vector` with default values.

  - The caller will pass arguments that can be used to infer the category's parameters. For example, `Vector:duplicateSize` (`lib/container`) takes a single `#y`. Since `#y` is a function param, it can be inferred from the argument that gets passed, e.g., `Vector:duplicateSize(0, 25)`.

  If neither of these situations apply, a `@type` function might be better.

- `@type` functions do inherit the category's parameters and their filters, which means that they do not need to be specified again, and they do not need to be passed again when calling from another `@type` or `@value` function. This is more efficient to maintain and execute.

  `@type interface`s are another advantage of `@type` functions.
Functions are defined in the category definition. They do not need to repeat the function declaration; however, they can do so in order to refine the argument and return types for internal use.
All function names start with a lowercase letter and contain only letters and digits.
The category definition can also declare additional functions that are not visible externally.
```
concrete MyCategory {
  @type minMax (Int, Int) -> (Int, Int)
}

define MyCategory {
  // minMax is defined here.
  minMax (x, y) {
    if (superfluousCheck(x, y)) {
      return x, y
    } else {
      return y, x
    }
  }

  // superfluousCheck is only available inside of MyCategory.
  @type superfluousCheck (Int, Int) -> (Bool)

  superfluousCheck (x, y) {
    return x < y
  }
}
```
All arguments must either have a unique name or be ignored with `_`.

`@value` functions have access to a special constant `self`, which refers to the object against which the function was called.

Variables are assigned with `<-` to indicate the direction of assignment. Every variable must be initialized; there are no `null` values in Zeolite. (However, see `optional` later on.)

All variable names start with a lowercase letter and contain only letters and digits.

When a location is needed for assignment (e.g., handling a function return, taking a function argument), you can use `_` in place of a variable name to ignore the value.
```
// Initialize with a literal.
Int value <- 0

// Initialize with a function result.
Int value <- getValue()
```
Unlike other languages, Zeolite does not allow variable masking. For example, if there is already a variable named `x` available, you cannot create a new `x` variable even in a smaller scope.
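A minimal sketch of what this rules out:

```
Int x <- 1

if (x > 0) {
  // Not allowed, because x already exists in the outer scope:
  // Int x <- 2

  // Reassigning the existing variable is fine.
  x <- 2
}
```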
All variables are shared and their values are not scoped like they are in C++. You should not count on knowing the lifetime of any given value.
As of compiler version `0.24.0.0`, you can also swap the values of two variables that have the same type, as long as both are writable. This is more efficient than "manually" swapping using a temp variable.
```
Int foo <- 123
Int bar <- 456
foo <-> bar
```
Return values from function calls must always be explicitly handled by assigning them to a variable, passing them to another function or ignoring them. (This is required even if the function does not return anything, primarily to simplify parsing.)
```
// Utilize the return.
Int value <- getValue()

// Explicitly ignore a single value.
_ <- getValue()

// Ignore all aspects of the return.
// (Prior to compiler version 0.3.0.0, ~ was used instead of \.)
\ printHelp()
```
- Calling a function with `@value` scope requires a value of the correct type, and uses `.` notation, e.g., `foo.getValue()`.
- Calling a function with `@type` scope requires the type with parameter substitution (if applicable), and uses `.` notation, e.g., `MyCategory<Int>.create()`. (Prior to compiler version `0.9.0.0`, `$` was used instead of `.`.)
- Calling a function with `@category` scope requires the category itself, and uses the `:` notation, e.g., `MyCategory:foo()`. (Prior to compiler version `0.9.0.0`, `$$` was used instead of `:`.)
- You can skip qualifying function calls (e.g., in the example above) if the function being called is in the same scope or higher. For example, you can call a `@type` function from the procedure for a `@value` function in the same category. (Also see the sketch after this list.)
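Putting the three notations together, a minimal sketch using a hypothetical `Logger` category:

```
concrete Logger {
  @category version () -> (String)
  @type new () -> (Logger)
  @value log (String) -> ()
}

// ...

String version <- Logger:version()  // @category call, using : notation.
Logger logger <- Logger.new()       // @type call, using . notation.
\ logger.log("started")             // @value call, using . notation.
```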
Functions cannot be overloaded like in Java and C++. Every function must have a unique name. Functions inherited from different places can be explicitly merged, however. This can be useful if you want interfaces to have overlapping functionality without having an explicit parent for the overlap.
@value interface ForwardIterator<|#x> { next () -> (optional #self) get () -> (#x) } @value interface ReverseIterator<|#x> { prev () -> (optional #self) get () -> (#x) } concrete Iterator<|#x> { refines ForwardIterator<#x> refines ReverseIterator<#x> // An explicit override is required in order to merge get from both parents. get () -> (#x) }
Zeolite allows some functions to be used as operators. This allows users to avoid excessive parentheses when using named mathematical functions.
Functions with two arguments can use infix notation. The operator precedence is always between comparisons (e.g., `==`) and logical (e.g., `&&`).
Functions with one argument can use prefix notation. These are evaluated strictly before all infix operators.
concrete Math { @type plus (Int, Int) -> (Int) @type neg (Int) -> (Int) } // ... // Math.plus is evaluated first. Int x <- 1 `Math.plus` 2 * 5 // Math.neg is evaluated first. Int y <- `Math.neg` x `Math.plus` 2
Unlike Java and C++, there is no "default construction" in Zeolite. In addition, Zeolite also lacks the concept of "copy construction" that C++ has. This means that new values can only be created using a factory function. In combination with required variable initialization, this ensures that the programmer never needs to worry about unexpected missing or uninitialized values.
Data members are never externally visible; they only exist in the category definition. Any access outside of the category must be done using explicitly-defined functions.
```
concrete MyCategory {
  @type create () -> (MyCategory)
}

define MyCategory {
  // A data member unique to each MyCategory value.
  @value Int value

  create () {
    // Initialization is done with direct assignment.
    return MyCategory{ 0 }
  }
}

// ...

// Create a new value in some other procedure.
MyCategory myValue <- MyCategory.create()
```
There is no syntax for accessing a data member from another object, not even from objects of the same type. This effectively makes all variables internal rather than just `private` like in Java and C++. As long as parameter variance is respected, you can provide access to an individual member with getters and setters.
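For example, a minimal sketch of a category that exposes a single member through a getter and a setter (hypothetical; not from the examples above):

```
concrete Counter {
  @type new () -> (Counter)
  @value get () -> (Int)
  @value set (Int) -> ()
}

define Counter {
  @value Int count

  new () {
    return Counter{ 0 }
  }

  get () {
    return count
  }

  set (newCount) {
    count <- newCount
  }
}
```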
As of compiler version `0.14.0.0`, you can use `#self` in place of the full type when you are creating a value of the same type from a `@type` or `@value` function.
```
concrete MyCategory {
  @type create () -> (MyCategory)
}

define MyCategory {
  @value Int value

  create () {
    return #self{ 0 }
  }
}
```
A category can also have `@category` members, but not `@type` members. (The latter is so that the runtime implementation can clean up unused `@type`s without introducing ambiguities regarding member lifespan.)
```
concrete MyCategory {
  @type global () -> (MyCategory)
}

define MyCategory {
  @value Int value

  // @category members use inline initialization.
  @category MyCategory singleton <- MyCategory{ 0 }

  global () {
    // @category members are accessible from all functions in the category.
    return singleton
  }
}
```
Zeolite uses the `if`/`elif`/`else` conditional construct. The `elif` and `else` clauses are always optional.
```
if (x) {
  // something
} elif (y) {
  // something
} else {
  // something
}
```
Variables can be scoped to specific blocks of code. Additionally, you can provide a cleanup procedure to be executed upon exit from the block of code. This is useful if you want to free resources without needing to explicitly do so for every `return` statement.
```
// Simple scoping during evaluation.
scoped {
  Int x <- getValue()
} in if (x < 0) {
  // ...
} elif (x > 0) {
  // ...
} else {
  // ...
}

// Simple scoping during assignment.
scoped {
  Int x <- getValue1()
  Int y <- getValue2()
} in Int z <- x+y

// Scoping with cleanup.
scoped {
  // ...
} cleanup {
  // ...
} in {
  // ...
}

// Cleanup without scoping.
cleanup {
  i <- i+1  // Post-increment behavior.
} in return i
```
The `cleanup` block is executed at every `return`, `break`, and `continue` in the respective `in` block, and right after the `in` block. For this reason, you cannot use `return`, `break`, or `continue` within a `cleanup` block. Additionally, you cannot overwrite named returns. You can use `fail`, however, since that just ends program execution.

When `cleanup` is executed at a `return` statement in the `in` block, the returns from the `return` statement are "locked in", then `cleanup` is executed, then those locked-in return values are returned. (This is what allows the post-increment example above to work.)
Zeolite supports two loop types:
-
while
loops, which are the traditional repetition of a procedure while a predicate holds.// With break and continue. while (true) { if (true) { break } else { continue } } // With an update after each iteration. while (true) { // ... } update { // ... }
-
traverse
loops (as of compiler version0.16.0.0
), which automatically iterate over the#x
values in anoptional Order<#x>
. This is similar tofor (int i : container) { ... }
in C++ andfor i in container: ...
in Python.traverse (orderedStrings -> String s) { // executed once per String s in orderedStrings // you can also use break and continue } // With an update after each iteration. traverse (orderedStrings -> String s) { // ... } update { // ... }
Since the
Order
is optional,empty
can be used to iterate zero times.IMPORTANT: Most containers are not iterable by
traverse
as-is; you will need to call a@value
function to get theOrder
. Some categoriesrefine DefaultOrder<#x>
(such asString
, andVector
inlib/container
), which allows you to use itsdefaultOrder()
. Other categories provide multiple ways toOrder
the container, such asSearchTree
inlib/container
.traverse ("hello".defaultOrder() -> Char c) { // executed once per Char c in "hello" }
`for` loops (e.g., `for (int i = 0; i < foo; ++i) { ... }` in C++) are not supported, since such syntax is too restrictive to scale, and they can be replaced with `traverse` or `scoped`+`while` in nearly all situations.
```
// Combine while with scoped to create a for loop.
scoped {
  Int i <- 0
  Int limit <- 10
} in while (i < limit) {
  // ...
} update {
  i <- i+1
}
```
A procedure definition has two options for returning multiple values:
- Return all values. (Prior to compiler version `0.3.0.0`, multiple returns were enclosed in `{}`, e.g., `return { x, y }`.)

  ```
  define MyCategory {
    minMax (x, y) {
      if (x < y) {
        return x, y
      } else {
        return y, x
      }
    }
  }
  ```

- Naming the return values and assigning them individually. This can be useful (and less error-prone) if the values are determined at different times. The compiler uses static analysis to ensure that all named variables are guaranteed to be set via all possible control paths.

  ```
  define MyCategory {
    // Returns are named on the first line.
    minMax (x, y) (min, max) {
      // Returns are optionally initialized up front.
      min <- y
      max <- x
      if (x < y) {
        // Returns are overwritten.
        min <- x
        max <- y
      }
      // Implicit return makes sure that all returns are assigned. Optionally,
      // you can use return _.
    }
  }
  ```

- To return early when using named returns or when the function has no returns, use `return _`. You will get an error if a named return might not be set.
The caller of a function with multiple returns also has a few options:
- Assign the returns to a set of variables. You can ignore a position by using `_` in that position. (Prior to compiler version `0.3.0.0`, multiple assignments were enclosed in `{}`, e.g., `{ Int min, _ } <- minMax(4,3)`.)

  ```
  Int min, _ <- minMax(4, 3)
  ```

- Pass them directly to a function that requires the same number of compatible arguments. (Note that you cannot concatenate the returns of multiple functions.)

  ```
  Int delta <- diff(minMax(4, 3))
  ```

- If you need to immediately perform an operation on just one of the returned values while ignoring the others, you can select just that return inline. (As of compiler version `0.21.0.0`.)

  ```
  // Select return 0 from minMax.
  return minMax(4, 3){0}
  ```

  Note that the position must be an integer literal so that the compiler can validate both the position and the return type.
Zeolite requires that all variables be initialized; however, it provides the `optional` storage modifier to allow a specific variable to be `empty`. This is not the same as `null` in Java because `optional` variables need to be `require`d before use.
```
// empty is a special value for use with optional.
optional Int value <- empty

// Non-optional values automatically convert to optional.
value <- 1

// present returns true iff the value is not empty.
if (present(value)) {
  // Use require to convert the value to something usable.
  \ foo(require(value))
}
```
As of compiler version `0.24.0.0`, you can use `<-|` to conditionally overwrite an `optional` variable if it's currently `empty`.
```
optional Int value <- empty
value <-| 123    // Assigned, because value was empty.
value <-| 456    // Not assigned, because value wasn't empty.
value <-| foo()  // foo() isn't called unless value is empty.
```
Note that if the right side isn't optional then you can use the result as non-optional.
```
optional Int value <- empty
Int value2 <- (value <-| 123)
```
As of compiler version `0.24.0.0`, you can conditionally call a function on an `optional` value if it's non-`empty` using `&.`.
```
optional Int value <- 123

// All returned values will be optional.
optional Formatted formatted <- value&.formatted()

// foo() won't be called unless the readAt call is going to be made.
optional Char char <- formatted&.readAt(foo())
```
As of compiler version `0.24.1.0`, you can use `x <|| y` to use `y` if `x` is empty. Note that `x` must have an `optional` type, and the resulting type of the entire expression is the type union of the types of `x` and `y`.
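A small sketch of the fallback behavior, assuming a hypothetical `foo()` that returns `optional Int`:

```
optional Int value <- empty

// 123, since value is empty; the result is non-optional because the right side
// (123) is non-optional.
Int result <- value <|| 123

// foo() is only evaluated if value is empty, and the result stays optional
// because foo() returns optional Int.
optional Int result2 <- value <|| foo()
```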
`weak` values allow your program to access a value if it is available, without holding up that value's cleanup if nothing else needs it. This can be used to let threads clean themselves up (example below) or to handle cycles in references between objects.
concrete MyRoutine { @type createAndRun () -> (MyRoutine) @value waitCompletion () -> () } define MyRoutine { refines Routine // (See lib/thread.) @value weak Thread thread createAndRun () { // Create a new MyRoutine and then start the thread. return MyRoutine{ empty }.start() } run () { // routine } waitCompletion () { scoped { // Use strong to turn weak into optional. If the return is non-empty, the // value is guaranteed to remain valid while using thread2. optional Thread thread2 <- strong(thread) } in if (present(thread2)) { \ require(thread2).join() } } @value start () -> (#self) start () { // ProcessThread holds a reference to itself only while the Routine is // running. Making thread weak means that the ProcessThread can clean itself // up once the Routine terminates. thread <- ProcessThread.from(self).start() return self } }
In some situations, a variable's value depends on conditional logic, and there is no low-cost default value. In such situations, you can use the `defer` keyword to allow a variable to be temporarily uninitialized. (As of compiler version `0.20.0.0`.)
```
LargeObject object <- defer

if (debug) {
  object <- LargeObject.newDebug()
} else {
  object <- LargeObject.new()
}

\ object.execute()
```
In this example, `object` is declared without an initializer, and is then initialized in both the `if` and `else` clauses.

- A variable initialized with `defer` must be initialized via all possible control paths prior to its use. This is checked at compile time.
- An existing variable can also be marked as `defer`red. This will not change its value, but will instead require that it be assigned a new value before it gets used again.
- If you never read the variable in a particular control branch then you do not need to initialize it; initialization is only checked where necessary.
There are two ways to terminate the program immediately.
- The `fail` builtin can be used to immediately terminate the program with a stack trace. This is not considered a function call since it cannot return; therefore, do not precede it with `\`.

  ```
  define MyProgram {
    run () {
      fail("MyProgram does nothing")
    }
  }
  ```

  The value passed to `fail` must implement the `Formatted` builtin `@value interface`.

  The output to `stderr` will look something like this:

  ```
  ./MyProgram: Failed condition: MyProgram does nothing
    From MyProgram.run at line 7 column 5 of myprogram.0rx
    From main
  Terminated
  ```

- The `exit` builtin can be used to immediately terminate the program with a traditional `Int` exit code. (0 conventionally means program success.) This is not considered a function call since it cannot return; therefore, do not precede it with `\`.

  ```
  define MyProgram {
    run () {
      exit(0)
    }
  }
  ```

  The value passed to `exit` must be an `Int`.
As of compiler version `0.24.0.0`, you can delegate function calls using the `delegate` keyword. This has the effect of forwarding all of the arguments passed to the enclosing function call to the handler specified. (The call is actually rewritten using a substitution during compilation.)
```
new (value1,value2) {
  // Same as Value{ value1, value2 }.
  \ delegate -> Value

  // Same as foo(value1,value2).
  \ delegate -> `foo`

  // Same as something(123).bar(value1,value2).
  \ delegate -> `something(123).bar`
}
```
IMPORTANT: If the enclosing function specifies argument labels then those will be used in the forwarded call.
```
@type new (String name:, Int) -> (Value)

new (value1,value2) {
  // Same as foo(name: value1, value2).
  return delegate -> `foo`
}
```
IMPORTANT: Delegation will fail to compile if:

- One or more function arguments is ignored with `_`, e.g., `call(_) { ... }`.
- One or more function arguments is hidden with `$Hidden[]$`, e.g., `$Hidden[someArg]$`.

This is primarily as a sanity check, since all of the above imply that a given argument should not be used.
As of compiler version `0.24.0.0`, function declarations in Zeolite can optionally have labels for any individual argument. Note that this is a label and not an argument name.

All labels start with a lowercase letter and contain only letters and digits, and end with `:`.
- The syntax for labeling an argument in the function declaration is to specify it after the type.

  ```
  concrete Value {
    // The first argument requires start: as a label.
    @type new (Int start:) -> (Value)
  }
  ```

- The syntax for labeling an argument in the function call is to precede the argument with the label.

  ```
  // The first argument is labeled with start:.
  Value foo <- Value.new(start: 123)
  ```
IMPORTANT: If the function declaration specifies a label, the label must always be used when calling that function. Additionally, arguments must still be passed in the same order; labels don't allow you to reorder arguments.
When defining a function, the name you give to an argument should match the label, but that isn't a requirement. Also note that labels can be reused, e.g., `@value swapRows (Int row:, Int row:) -> ()`. This allows the label to be descriptive rather than just an identifier.
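For example, a call to the `swapRows` declaration above might look like this, with `matrix` being a hypothetical value of the declaring category:

```
// Both arguments use the row: label, in the declared order.
\ matrix.swapRows(row: 1, row: 2)
```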
All `concrete` categories and all `interface`s can have type parameters. Each parameter can have a variance rule assigned to it. This allows the compiler to do type conversions between different parameterizations.

Parameter names must start with `#` and a lowercase letter, and can only contain letters and digits.
Parameters are never repeated in the category or function definitions. (Doing so would just create more opportunity for unnecessary compile-time errors.)
// #x is covariant (indicated by being to the right of |), which means that it // can only be used for output purposes. @value interface Reader<|#x> { read () -> (#x) } // #x is contravariant (indicated by being to the left of |), which means that // it can only be used for input purposes. @value interface Writer<#x|> { write (#x) -> () } // #x is for output and #y is for input, from the caller's perspective. @value interface Function<#x|#y> { call (#x) -> (#y) } // By default, parameters are invariant, i.e., cannot be converted. You can also // explicitly specify invariance with <|#x|>. This allows all three variance // types to be present. concrete List<#x> { @value append (#x) -> () @value head () -> (#x) } // Use , to separate multiple parameters that have the same variance. concrete KeyValue<#k, #v> { @type new (#k, #v) -> (#self) @value key () -> (#k) @value value () -> (#v) }
- Specifying parameter variance allows the compiler to automatically convert between different types. This is done recursively in terms of parameter substitution.
// Covariance allows conversion upward. Reader<MyValue> reader <- // ... Reader<MyBase> reader2 <- reader // Contravariance allows conversion downward. Writer<MyBase> writer <- // ... Writer<MyValue> writer2 <- writer // Conversion is also recursive. Writer<Reader<MyBase>> readerWriter <- // ... Writer<Reader<MyValue>> readerWriter2 <- readerWriter // Invariance does not allow conversions. List<MyValue> list <- // ... List<MyBase> list2 <- // ...
-
You can apply filters to type parameters to require that the parameters meet certain requirements. This allows you to call
interface
functions on parameters and their values in procedure definitions.concrete Helper { @type format<#x> // Ensures that #x -> Formatted. (Like T extends Foo in Java.) // Example: String f <- Helper.format(123) #x requires Formatted (#x) -> (String) @type get<#x> // Ensures that #x <- String. (Like T super Foo in Java.) // Example: AsBool v <- Helper.get<AsBool>() #x allows String () -> (#x) @type create<#x> // Ensures that #x defines the Default @type interface. // Example: Int v <- Helper.create<Int>() #x defines Default () -> (#x) } define Helper { format (x) { // #x -> Formatted means x has formatted(). return x.formatted() } get () { // #x <- String means we can return a String here. return "message" } create () { // #x defines Default means #x has default(). return #x.default() } }
  Filters on category params must be specified after all `refines`/`defines` and before any function declarations.

  IMPORTANT: As of compiler version `0.16.0.0`, you can no longer use parameter filters in `@value interface`s and `@type interface`s.
Zeolite has `@value interface`s that are similar to Java `interface`s, which declare functions that implementations must define. In addition, Zeolite also has `@type interface`s that declare `@type` functions that must be defined. (This would be like having abstract `static` functions in Java.)
// @value indicates that the interface declares @value functions. @value interface Printable { // @value is not allowed in the declaration. print () -> () } // @type indicates that the interface declares @type functions. @type interface Diffable<#x> { // @type is not allowed in the declaration. diff (#x, #x) -> (#x) }
| Type | Param Variance | Param Filters | Can Inherit | `@category` Funcs | `@type` Funcs | `@value` Funcs | Define Procedures |
|---|---|---|---|---|---|---|---|
| `concrete` | ✓ | ✓ | `@value interface`, `@type interface` | ✓ | ✓ | ✓ | ✓ |
| `@value interface` | ✓ | | `@value interface` | | | ✓ | |
| `@type interface` | ✓ | | -- | | ✓ | | |
-
@value interface
s can be inherited by other@value interface
s andconcrete
categories usingrefines
.concrete MyValue { refines Printable @type new (Int) -> (MyValue) // The functions of Printable do not need to be declared again, but you can do // so to refine the argument and return types. } define MyValue { @value Int value new (value) { return MyValue{ value } } // Define Printable.print like any other MyValue function. print () { \ BasicOutput.writeNow(value) } }
-
@type interface
s can only be inherited byconcrete
categories.concrete MyValue { defines Diffable<MyValue> @type new (Int) -> (MyValue) // The functions of Diffable do not need to be declared again, but you can do // so to refine the argument and return types. } define MyValue { @value Int value new (value) { return MyValue{ value } } // Define Diffable.diff like any other MyValue function. diff (x, y) { return MyValue{ x.get() - y.get() } } // A getter is needed to access the value outside of the object that owns it. @value get () -> (Int) get () { return value } }
-
You can also specify
refines
anddefines
when defining aconcrete
category. This allows the inheritance to be private.concrete MyValue { @type create () -> (Formatted) } define MyValue { // Formatted is not a visible parent outside of MyValue. refines Formatted create () { return MyValue{ } } // Inherited from Formatted. formatted () { return "MyValue" } }
You can modify `interface` and `concrete` with `immutable` at the very top of the declaration. (As of compiler version `0.20.0.0`.) This creates two requirements for `@value` members:

- They are marked as read-only, and cannot be overwritten with `<-`.
- They must have a type that is also `immutable`.

(`@category` members are not affected.)

Note that this applies to the entire implementation; not just to the implementations of functions required by the `immutable` `interface`. `immutable` is therefore intended for objects that cannot be modified, rather than as a way to define a read-only view (e.g., `const` in C++) of an object.
@value interface Foo { immutable call () -> () } concrete Bar { refines Foo @type new () -> (Bar) @value mutate () -> () } define Bar { @value Int value new () { return Bar{ 0 } } call () { // call cannot overwrite value } mutate () { // mutate also cannot overwrite value, even though mutate isn't in Foo. } }
For members that use a parameter as a type, you can use `immutable` as a filter if the other filters do not otherwise imply it. Note that this will prevent substituting in a non-`immutable` type when calling `@type` functions.
```
concrete Type<#x> {
  immutable

  #x immutable
}

define Type {
  // #x is allowed as a member type because of the immutable filter.
  @value #x value
}
```
Every category has an implicit covariant parameter `#self`. (As of compiler version `0.14.0.0`.) It always means the type of the current category, even when inherited. (`#self` is covariant because it needs to be convertible to a parent of the current category.)

For example:
@value interface Iterator<|#x> { next () -> (#self) get () -> (#x) } concrete CharIterator { refines Iterator<Char> // next must return CharIterator because #self = CharIterator here. }
The primary purpose of this is to support combining multiple interfaces with iterator or builder semantics into composite types without getting backed into a corner when calling functions from a single interface.
@value interface ForwardIterator<|#x> { next () -> (#self) get () -> (#x) } @value interface ReverseIterator<|#x> { prev () -> (#self) get () -> (#x) } concrete CharIterator { refines ForwardIterator<Char> refines ReverseIterator<Char> get () -> (Char) // (Remember that merging needs to be done explicitly.) } concrete Parser { // trimWhitespace can call next and still return the original type. In // contrast, if next returned ForwardIterator<#x> then trimWhitespace would // need to return ForwardIterator<Char> to the caller instead of #i. @type trimWhitespace<#i> #i requires ForwardIterator<Char> (#i) -> (#i) }
`#self` can also be used to generalize a factory pattern:
@type interface ParseFactory { fromString (String) -> (#self) } concrete FileParser { @type parseFromFile<#x> #x defines ParseFactory (String) -> (#x) } define FileParser { parseFromFile (filename) { String content <- FileHelper.readAll(filename) // Notice that ParseFactory doesn't need a type parameter to indicate what // type is going to be parsed in fromString; it's sufficient to know that #x // implements ParseFactory and that fromString returns #self. return #x.fromString(content) } } concrete Value { defines ParseFactory } define Value { fromString (string) { if (string == "Value") { return Value{ } } else { fail("could not parse input") } } }
`#self` is nothing magical; this could all be done by explicitly adding a covariant `#self` parameter to every type, with the appropriate `requires` and `defines` filters.
Starting with compiler version `0.7.0.0`, Zeolite supports optional inference of specific function parameters by using `?`. This must be at the top level (no nesting), and it cannot be used outside of the parameters of the function.
The type-inference system is intentionally "just clever enough" to do things that the programmer can easily guess. More sophisticated inference is feasible in theory (like Haskell uses); however, type errors with such systems can draw a significant amount of attention away from the task at hand. (For example, a common issue with Haskell is not knowing which line of code contains the actual mistake causing a type error.)
concrete Value<#x> { @category create1<#x> (#x) -> (Value<#x>) @type create2 (#x) -> (Value<#x>) } // ... // This is fine. Value<Int> value1 <- Value:create1<?>(10) // These uses of ? are not allowed: // Value<Int> value2 <- Value<?>.create2(10) // Value<?> value2 <- Value<Int>.create2(10)
Only the function arguments and the parameter filters are used to infer the type substitution; return types are ignored. If inference fails, you will see a compiler error and will need to explicitly write out the type.
As of compiler version `0.21.0.0`, if you want to infer all params, you can skip `<...>` entirely. If you only want to infer some of the params, you must specify all params, using `?` for those that should be inferred.
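Continuing the `Value:create1` example above, a sketch of what skipping `<...>` looks like:

```
// Equivalent to Value:create1<?>(10); all params are inferred because <...> is
// omitted entirely.
Value<Int> value3 <- Value:create1(10)
```

With multiple params, you would still list all of them and use `?` only for the ones to infer, e.g., a hypothetical `Helper:call<?, Int>(x)`.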
Type inference will only succeed if:
- There is a valid pattern match between the expected argument types and the types of the passed arguments.

- There is exactly one type that matches best:

  - For params only used in covariant positions, the lower bound of the type is unambiguous.
  - For params only used in contravariant positions, the upper bound of the type is unambiguous.
  - For all other situations, the upper and lower bounds are unambiguous and equal to each other.
Type inference in the context of parameterized types is specifically disallowed in order to limit the amount of code the reader needs to search to figure out what types are being used. Forcing explicit specification of types for local variables is more work for the programmer, but it makes the code easier to reason about later on.
This section discusses language features that are less frequently used.
Zeolite provides two meta types that allow unnamed combinations of other types.
-
A value with an intersection type
[A & B]
can be assigned from something that is bothA
andB
, and can be assigned to either anA
orB
. There is a special empty intersection namedany
that can be assigned from any value but cannot be assigned to any other type.Intersections can be useful for requiring multiple interfaces without creating a new category that refines all of those interfaces. An intersection
[Foo & Bar]
in Zeolite is semantically similar to the existential typeforall a. (Foo a, Bar a) => a
in Haskell and? extends Foo & Bar
in Java, except that in Zeolite[Foo & Bar]
can be used as a first-class type.@value interface Reader { } @value interface Writer { } concrete Data { refines Reader refines Writer @type new () -> (Data) } // ... [Reader & Writer] val <- Data.new() Reader val2 <- val Writer val3 <- val
-
A value with a union type
[A | B]
can be assigned from eitherA
orB
, but can only be assigned to something that bothA
andB
can be assigned to. There is a special empty union namedall
that cannot ever be assigned a value but that can be assigned to everything. (empty
is actually of typeoptional all
.)Unions can be useful if you want to artificially limit what types can be used in a particular context. This can be useful for disallowing use of "unknown" implementations in critical or risky functions, etc.
-
When used with
requires
, a union can limit the allowed types without losing the original type.concrete Helper { @type describe<#x> #x requires Formatted #x requires [String | Int] // <- limits allowed types (#x) -> (String) } define Helper { describe (value) { return String.builder() .append(typename<#x>()) // <- original type is still available .append(": ") .append(value) .build() } } // ... \ Helper.describe(123) // Fine. \ Helper.describe("message") // Fine. \ Helper.describe(123.0) // Error!
-
When used as a variable type, the original type is lost, but you can still convert to a common parent type.
[String | Int] value <- 123 // You can only convert to types that _all_ constituent types convert to. Formatted formatted <- value // Fine. Int number <- value // Error!
-
Intersection and union types also come up in type inference.
-
If there are two or more incompatible guesses for an inferred type used only for input, the union of those types will be used.
-
If there are two or more incompatible guesses for an inferred type used only for output, the intersection of those types will be used.
// (Just for creating an output parameter.) concrete Writer<#x|> { @type new () -> (#self) } concrete Helper { // #x is only used for input to the function. @type inferInput<#x> (#x, #x) -> (String) // #x is only used for output from the function. // (This is due to contravariance of #x in Writer.) @type inferOutput<#x> (Writer<#x>, Writer<#x>) -> (String) } define Helper { inferInput (_, _) { return typename<#x>().formatted() } inferOutput (_, _) { return typename<#x>().formatted() } } define Writer { new () { return #self{ } } } // ... // Returns "[Int | String]". \ Helper.inferInput(123, "message") // Returns "[Int & String]". \ Helper.inferOutput(Writer<Int>.new(), Writer<String>.new())
In this context, unions/intersections are the most restrictive valid types that will work for the substution. (They are respectively the coproduct/product of the provided types under implicit type conversion.)
In some situations, you might want to perform an explicit type conversion on a `@value`. The syntax for such conversions is `value?Type`, where `value` is any `@value` and `Type` is any type, including params and meta types.
-
With values that have a union type (e.g.,
[A | B]
), you might need an explicit type conversion when making a function call.@value interface Object<|#x> { get () -> (#x) } concrete IntObject { refines Object<Int> @type new (Int) -> (IntObject) } concrete StringObject { refines Object<String> @type new (String) -> (StringObject) } // ... [IntObject | StringObject] value <- StringObject.new("message") // Convert to Object<Formatted> before calling get(). Formatted formatted <- value?Object<Formatted>.get() // Should get() return Int or String here? // Formatted formatted <- value.get()
-
Type conversions of function arguments can be used for influencing type inference.
concrete Helper { @category call<#x> (#x) -> () } // ... Int value <- 1 // #x will be inferred as Formatted rather than as Int here. \ Helper:call(value?Formatted)
-
You can also explicitly convert
optional
andweak
values, although they will still retain their original storage modifier.optional Int value <- 1 // Passed as optional Formatted. \ call(value?Formatted) // Not allowed, since value.Formatted is still optional. // String string <- value?Formatted.formatted()
The `reduce` builtin function enables very limited runtime reasoning about type conversion.

- The call `reduce<Foo, Bar>(value)` will return `value` with type `optional Bar` iff `Foo` can be converted to `Bar`. Note that `value` must itself be convertible to `optional Foo`.
- When type params are used, the types that are assigned at the point of execution are checked. For example, the result of `reduce<#x, #y>(value)` will depend on the specific types assigned to `#x` and `#y` upon execution.
Here are a few motivating use-cases:

- Allowing creation of a container that can hold objects of different types while still being able to access the objects with their original types. (Also see `TypeMap` in `lib/container`, which was actually the original target use-case for the initial version of Zeolite.)
- Enabling optional functionality for a parameter without using a filter. For example, printing info about `value` if available using `reduce<#x, Formatted>(value)` during debugging.
@value interface AnyObject { getAs<#y> () -> (optional #y) } concrete Object<#x> { @category create<#x> (#x) -> (AnyObject) } define Object { refines AnyObject @value #x value create (value) { return Object<#x>{ value } } getAs () { return reduce<#x, #y>(value) } }
AnyObject value <- Object:create<?>("message") // This will be empty because String does not convert to Int. optional Int value1 <- value.getAs<Int>() // This will be "message" as Formatted because String converts to Formatted. optional Formatted value2 <- value.getAs<Formatted>()
`reduce` cannot be used to "downcast" a value (e.g., converting a `Formatted` to a `Float`) since the argument has the same type as the first parameter. For example, `reduce<#x, #y>(value)` checks `#x` → `#y`, and since `value` must be `optional #x`, `value` can only be converted upward. In other words, it only allows conversions that would otherwise be allowed, returning `empty` for all other conversions.

The `AnyObject` example above works because `Object` stores the original type passed to `create` as `#x`, which it then has available for the `reduce` call. The type variables `#x` and `#y` are the primary inputs to `reduce`; there is absolutely no examination of the "real" type of `value` at runtime.
```
// Here we explicitly set #x = Formatted when calling create.
AnyObject value <- Object:create<Formatted>("message")

// This will be empty even though the actual value is a String because getAs
// uses #x = Formatted in the reduce call.
optional String value1 <- value.getAs<String>()
```
As of compiler version `0.24.0.0`, you can get a value that identifies a specific `@value` instance using the `identify` builtin. This can be useful for creating identifiers that don't otherwise have a unique member.
String value <- "value" Identifier<String> valueId <- identify(value)
- You can use comparison operators (e.g., `==`, `<`) between `Identifier`s of any two types.

- The `Identifier` remains valid even if the original `@value` is deallocated.

- `identify` can also be used for `optional` types, but not for `weak` types.

- Type conversions have no effect on the resulting `Identifier`, other than the type used for compile-time checking.

  ```
  String value <- "value"

  // The following are equivalent:
  Identifier<Formatted> id1 <- identify(value)
  Identifier<Formatted> id2 <- identify(value?Formatted)
  ```

- `Identifier` uniqueness isn't reliable for unboxed types (e.g., `Int`) because the values are stored without being contained in a Zeolite object. (Also see Builtin Types.) For example, `identify(2) == identify(2)`, whereas `identify("value") != identify("value")`, due to storage differences.
As of compiler version `0.24.0.0`, you can restrict where `@value` and `@type` functions can be called from with the `visibility` keyword.
```
concrete Value {
  // This applies to everything below.
  visibility Factory

  // This can only be called from Factory.
  @type new () -> (Value)

  // This resets the visibility to the default.
  visibility _

  // This can be called from anywhere.
  @value call () -> ()
}
```
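As a sketch of the calling side, with a hypothetical `Factory` definition that matches the `visibility Factory` restriction above:

```
concrete Factory {
  @type make () -> (Value)
}

define Factory {
  make () {
    // Allowed: Value.new has visibility Factory.
    return Value.new()
  }
}

// Elsewhere, outside of Factory and Value:
Value value <- Factory.make()   // Fine.
\ value.call()                  // Fine: call uses the default visibility.
// Value value2 <- Value.new()  // Error: new is only visible to Factory.
```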
- You can specify multiple types separated by `,`.
- You will get a compiler error if you declare a `@category` function when the visibility is other than the default.
- Functions with restricted visibility can't be called from `@category` functions because the full type where the call originates isn't defined.
- The category's own definition and `unittest`s are exempt from visibility requirements.
- `visibility all` will prevent functions from being called outside of the category itself and `unittest`.
- `visibility any` will prevent functions from getting called in `@category` functions but not anywhere else.
- For `@value` functions with limited visibility, sometimes an explicit type conversion of the value will be needed prior to making the function call, e.g., `value?Parent.call()`. The effect is that the allowed argument and return types could be more limited.
#self
@category
@type
@value
_
all
allows
any
break
cleanup
concrete
continue
defer
define
defines
delegate
elif
else
empty
exit
fail
false
identify
if
immutable
in
interface
optional
present
reduce
refines
require
requires
return
scoped
self
strong
testcase
traverse
true
typename
unittest
update
visibility
weak
while
See `builtin.0rp` and `testing.0rp` for more details about builtin types. (For your locally-installed version, which might differ, see `$(zeolite --get-path)/base/builtin.0rp`.)
Builtin `concrete` types:

- `Bool` [unboxed]: Either `true` or `false`.
- `Char` [unboxed]: Use single quotes, e.g., `'a'`. Use literal characters, standard escapes (e.g., `'\n'`), 2-digit hex (e.g., `\x0B`), or 3-digit octal (e.g., `'\012'`). At the moment this only supports ASCII; see Issue #22.
- `CharBuffer`: Mutable, fixed-size buffer of `Char`. This type has no literals.
- `Float` [unboxed]: Use decimal notation, e.g., `0.0` or `1.0E1`. You must have digits on both sides of the `.`. As of compiler version `0.24.0.0`, you can also escape hex with `\x`, octal with `\o`, and binary with `\b`.
- `Identifier<#x>` [unboxed]: An opaque identifier used to compare underlying `@value` instances.
- `Int` [unboxed]: Use decimal (e.g., `1234`), hex (e.g., `\xABCD`), octal (e.g., `\o0123`), or binary (e.g., `\b0100`).
- `Pointer<#x>` [unboxed]: An opaque pointer type for use in C++ extensions. Only C++ extensions can create and access `Pointer` contents, but Zeolite code can still store and pass them around.
- `String`: Use double quotes to sequence `Char` literals, e.g., `"hello\012"`. You can build a string efficiently using `String.builder()`, e.g., `String foo <- String.builder().append("bar").append("baz").build()`.
Builtin `@value interface`s:

- `Append<#x>`: Supports appending `#x`.
- `AsBool`: Convert a value to `Bool` using `asBool()`.
- `AsChar`: Convert a value to `Char` using `asChar()`.
- `AsFloat`: Convert a value to `Float` using `asFloat()`.
- `AsInt`: Convert a value to `Int` using `asInt()`.
- `Build<#x>`: Build a `#x` from the current state.
- `Container`: Contains multiple values.
- `DefaultOrder<#x>`: The container provides a default `Order<#x>` for use with `traverse`.
- `Duplicate`: Duplicate the value with `duplicate()`.
- `Formatted`: Format the value as a `String` using `formatted()`.
- `Hashed`: Hash the value as an `Int` using `hashed()`.
- `Order<#x>`: An ordering of `#x` values, for use with the `traverse` loop.
- `ReadAt<#x>`: Random access reads from a container with values of type `#x`.
- `SubSequence`: Extract a subsequence from the object.
- `WriteAt<#x>`: Random access writes to a container with values of type `#x`.
Builtin `@type interface`s:

- `Default`: Get the default `@value` with `default()`.
- `Equals<#x>`: Compare values using `equals(x,y)`.
- `LessThan<#x>`: Compare values using `lessThan(x,y)`.
- `Testcase`: For use in `testcase`.
Builtin meta-types:

- `#self`: The type of the category where it is used. See `#self`.
- `any`: Value type that can be assigned a value of any type. (This is the terminal object in the category of Zeolite types.)
- `all`: Value type that can be assigned to all other types. (This is the initial object in the category of Zeolite types.)

Builtin constants:

- `empty` (`optional all`): A missing `optional` value.
- `false` (`Bool`): Obvious.
- `self` (`#self`): The value being operated on in `@value` functions.
- `true` (`Bool`): Obvious.

Builtin functions:

- `identify`: Returns an `Identifier<#x>` for the value.
- `present`: Check `optional` for `empty`.
- `reduce<#x, #y>(value)`: See Runtime Type Reduction.
- `require`: Convert `optional` to non-`optional`.
- `strong`: Convert `weak` to `optional`.
- `typename<#x>()`: Formats the real type of `#x` as a `Formatted` value.
| Operators | Semantics | Example | Input Types | Result Type | Notes |
|---|---|---|---|---|---|
| `+`, `-`, `*`, `/` | arithmetic | `x + y` | `Int`, `Float` | original type | |
| `%` | arithmetic | `x % y` | `Int` | `Int` | |
| `-` | arithmetic | `x - y` | `Char` | `Int` | |
| `^`, `\|`, `&`, `<<`, `>>`, `~` | bit operations | `x ^ y` | `Int` | `Int` | |
| `+` | concatenation | `x + y` | `String` | `String` | |
| `^`, `!`, `\|\|`, `&&` | logical | `x && y` | `Bool` | `Bool` | |
| `<`, `>`, `<=`, `==`, `>=`, `!=` | comparison | `x < y` | built-in unboxed, `String` | `Bool` | not available for `Pointer` |
| `.` | function call | `x.foo()` | value | function return type(s) | |
| `.` | function call | `T.foo()` | type instance | function return type(s) | |
| `:` | function call | `T:foo()` | category | function return type(s) | |
| `&.` | conditional function call | `x&.foo()` | `optional` value | function return type(s) converted to `optional` | skips evaluation of args if call is skipped |
| `?` | type conversion | `x?T` | left: value; right: type instance | right type with optionality of left | can also be used with `optional` values |
| `<-` | assignment | `x <- y` | left: variable; right: expression | right type | |
| `<-\|` | conditional assignment | `x <-\| y` | left: `optional` variable; right: non-`weak` expression | left type with optionality of right | skips evaluation of right if left is present |
| `<\|\|` | fallback value | `x <\|\| y` | left: `optional` expression; right: non-`weak` expression | union of left and right types with optionality of right | skips evaluation of right if left is present |
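To illustrate a few of the less-common operators, here is a hedged sketch; it assumes `String` refines `Formatted`:

```
optional String title <- empty

// <-| assigns only if the left side is currently empty.
title <-| "untitled"

// <|| evaluates and returns the right side only if the left side is empty.
String shown <- title <|| "(no title)"

// &. skips the call (and its arguments) if the value is empty.
optional String copy <- title&.formatted()

// ? explicitly converts to another type; here String is viewed as Formatted.
Formatted asFormatted <- shown?Formatted
```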
You can create public `.0rp` source files to declare `concrete` categories and `interface`s that are available for use in other sources. This is the only way to share code between different source files. `.0rp` cannot contain `define`s for `concrete` categories.
During compilation, all `.0rp` files in the project directory are loaded up front. This is then used as the set of public symbols available when each `.0rx` is separately compiled.
The standard library is currently temporary and lacks a lot of functionality. See the public `.0rp` sources in `lib`. Documentation will eventually follow.
You can depend on another module using `-i lib/util` for a public dependency and `-I lib/util` for a private dependency when calling `zeolite`. (A private dependency is not visible to modules that depend on your module.)

Dependency paths are first checked relative to the module depending on them. If the dependency is not found there, the compiler then checks the global location specified by `zeolite --get-path`.
Public `.0rp` source files are loaded from all dependencies during compilation, and so their symbols are available to all source files in the module. There is currently no language syntax for explicitly importing or including modules or other symbols.
If you are interested in backing a `concrete` category with C++, you will need to write a custom `.zeolite-module` file. Better documentation will eventually follow, but for now:
- Create a `.0rp` with declarations of all of the `concrete` categories you intend to define in C++ code.
- Run `zeolite` in `--templates` mode to generate `.cpp` templates for all `concrete` categories that lack a definition in your module.
- Run `zeolite` in `-c` mode to get a basic `.zeolite-module`. After this, always use recompile mode (`-r`) to use your `.zeolite-module`.
- Take a look at `.zeolite-module` in `lib/file` to get an idea of how to tell the compiler where your category definitions are.
- Add your code to the generated `.cpp` files. `lib/file` is also a reasonable example for this.
- If you need to depend on external libraries, fill in the `include_paths` and `link_flags` sections of `.zeolite-module`.
IMPORTANT: `@value` functions for `immutable` categories will be marked as `const` in C++ extensions. `immutable` also requires that `@value` members have `immutable` types, but there is no reasonable way to enforce this in C++. You will need to separately ensure that the implementation only stores other `immutable` types, just for consistency with categories implemented in Zeolite.
Unit testing is a built-in capability of Zeolite. Unit tests use `.0rt` source files, which are like `.0rx` source files with `testcase` metadata. The test files go in the same directory as the rest of your source files. (Elsewhere in this project these tests are referred to as "integration tests" because this testing mode is used to ensure that the `zeolite` compiler operates properly end-to-end.)
IMPORTANT: Prior to compiler version `0.10.0.0`, the `testcase` syntax was slightly different, and `unittest` was not available.
```
// myprogram/tests.0rt

// Each testcase starts with a header specifying a name for the group of tests.
// This provides common setup code for a group of unit tests.
testcase "passing tests" {
  // All unittest are expected to execute without any issues.
  success
}

// Everything after the testcase (up until the next testcase) is like a .0rx.

// At least one unittest must be defined when success is expected. Each unittest
// must have a distinct name within the testcase. Each unittest is run in a
// separate process, making it safe to alter global state.

unittest myTest1 {
  // The test content goes here. It has access to anything within the testcase
  // besides other unittest.
}

unittest myTest2 {
  \ empty
}

// A new testcase header indicates the end of the previous test.
testcase "missing function" {
  // The test is expected to have a compilation error. Note that this cannot be
  // used to check for parser failures!
  //
  // Any testcase can specify require and exclude regex patterns for checking
  // test output. Each pattern can optionally be qualified with one of compiler,
  // stderr, or stdout, to specify the source of the output.
  error
  require compiler "run"  // The compiler error should include "run".
  exclude compiler "foo"  // The compiler error should not include "foo".
}

// You can include unittest when an error is expected; however, they will not be
// run even if compilation succeeds.

define MyType {
  // Error! MyType does not have a definition for run.
}

concrete MyType {
  @type run () -> ()
}

testcase "intentional failure" {
  // The test is expected to fail.
  failure
  require stderr "message"  // stderr should include "message".
}

// Exactly one unittest must be defined when a failure is expected.
unittest myTest {
  // Use the fail built-in to cause a test failure.
  fail("message")
}

testcase "compilation tests" {
  // Use compiles to check only Zeolite compilation, with no C++ compilation or
  // execution of tests.
  compiles
}

unittest myTest {
  // unittest is optional in this mode, but can still be used if the test does
  // not require any new types.
}
```
Unit tests have access to all public and `$ModuleOnly$` symbols in the module. You can run all tests for module `myprogram` using `zeolite -t myprogram`.
Specific things to keep in mind with `testcase`:

- All individual `unittest` have a default timeout of 30 seconds. This is to prevent automated test runners from hanging indefinitely if there is a deadlock or an infinite loop. This can be changed with `timeout t`, where `t` is specified in seconds. Specifying `timeout 0` disables the time limit.
- To simplify parsing, there are a few limitations with how you order the fields in a `testcase` (sketched below):
  - The expected outcome (`success`, `failure`, `error`, `compiles`) must be at the top. (Prior to compiler version `0.24.0.0`, `crash` was used instead of `failure`.)
  - All `require` and `exclude` patterns must be grouped together. (Put another way, if a field other than `require` or `exclude` follows one of those two, the parser will move on.)
As of compiler version `0.16.0.0`, you can get a log of all lines of Zeolite code (from `.0rx` or `.0rt` sources) with the `--log-traces [filename]` option when running tests with `zeolite -t`.

- If `[filename]` is not an absolute path, it will be created relative to the path specified with `-p` if used. The file will be overwritten, and will contain all traces from a single call to `zeolite -t`.

- The current format is `.csv` with the following columns (includes a header):

  - `"microseconds"`: The call time from a monotonic microseconds timer, with an unspecified starting point.
  - `"pid"`: A unique ID for the current process. This is not the real process ID from the system, since that will often not be unique.
  - `"function"`: The name of the function where the line was executed.
  - `"context"`: The source-code context (e.g., file, line) of the call. The context has the same format as stack traces for crashes.

- If a procedure uses the `$NoTrace$` pragma, there will be no trace information in the log for that procedure. This is because the logging uses the same tooling that is used for stack traces. Similarly, if a specific `unittest` uses the `$DisableCoverage$` pragma, no coverage will be recorded as a result of that `unittest`.

- Nothing will be logged for `testcase` that use `compiles` or `error`, since those modes do not actually execute any compiled code.

- Keep in mind that the simple act of processing text for logging can obscure race conditions in a program; therefore, `--log-traces` should be skipped when troubleshooting race conditions.

- Check the size of the log file before attempting to open it in a desktop application. In many cases, it will be too large to display.
As of compiler version `0.20.0.0`, `zeolite -r` will cache information about the possible `.0rx` lines that can show up in `--log-traces` mode.

- You can access this information using `zeolite --show-traces [module path]`. This can then be compared to the `"context"` field in the output `.csv` to determine if any code was missed by the executed tests.

- Alternatively, you can have `zeolite` compute the missed lines using `zeolite --missed-lines [filename] [module path]`, where `[filename]` is the file written by `--log-traces`. If a single `zeolite -t` command executed tests for multiple modules, you can pass all of those modules to a single `zeolite --missed-lines` call. For example:

  ```
  # Run the tests.
  zeolite -t --log-traces traces.csv your/module1 your/module2

  # Output the line numbers missed by the tests.
  zeolite --missed-lines traces.csv your/module1 your/module2
  ```

  Note that `--missed-lines` does not account for which module's test was responsible for covering a line during the tests. For example, if you run `zeolite -t --log-traces traces.csv your/module1 your/module2` and `your/module2` depends on `your/module1`, tests from `your/module2` that execute `your/module1` code will contribute to `your/module1`'s coverage.
(As of compiler version `0.5.0.0`.)
Pragmas allow compiler-specific directives within source files that do not otherwise need to be a part of the language syntax. Macros have the same format, and are used to insert code after parsing but before compilation.
The syntax for both is `$SomePragma$` (no options) or `$AnotherPragma[OPTIONS]$` (uses pragma-specific options). The syntax for `OPTIONS` depends on the pragma being used. Pragmas are specific to the context they are used in.
These must be at the top of the source file, before declaring or defining categories or `testcase`s.

- `$ModuleOnly$`. This can only be used in `.0rp` files. It takes an otherwise-public source file and limits visibility to the module. (This is similar to package-private in Java.)

- `$TestsOnly$`. This can be used in `.0rp` and `.0rx` files. When used, the file is only visible to other sources that use it, as well as `.0rt` sources. `.0rp` sources still remain public unless `$ModuleOnly$` is used. The transitive effect of `$TestsOnly$` is preventing the use of particular categories in output binaries.
These must occur at the very top of a function definition.

- `$NoTrace$`. (As of compiler version `0.6.0.0`.) Disables stack-tracing within this procedure. This is useful for recursive functions, so that trace information does not take up stack space. This does not affect tracing for functions that are called from within the procedure.

- `$TraceCreation$`. (As of compiler version `0.6.0.0`.) Includes a trace of the value's creation when the given `@value` function is called. If multiple functions in a call stack use `$TraceCreation$`, only the trace from the bottom-most function will be included. `$TraceCreation$` is useful when the context that the value was created in is relevant when debugging crashes. The added execution cost for the function is trivial; however, it increases the memory size of the value by a few bytes per call currently on the stack at the time it gets created.

These must be at the top of a category `define`, immediately following `{`.

- `$FlatCleanup[memberName]$`. (As of compiler version `0.21.0.0`.) Clear the `@value` member `memberName` before actually cleaning up the `@value` itself to avoid recursive cleanup. Use this when recursive cleanup might otherwise result in a stack overflow, e.g., with linked lists. Only one member can be specified in this pragma so that the implementation does not need to use a dynamically-sized cleanup queue to handle branching.

- `$ReadOnly[var1, var2, ...]$`. (As of compiler version `0.16.0.0`.) See Local Variable Rules. Only applies to `@category` and `@value` members.

- `$Hidden[var1, var2, ...]$`. (As of compiler version `0.16.0.0`.) See Local Variable Rules. Only applies to `@category` and `@value` members.

- `$ReadOnlyExcept[var1, var2, ...]$`. (As of compiler version `0.22.1.0`.) Marks all `@category` and `@value` members as read-only except those listed.

  - If multiple `ReadOnlyExcept` are used, they are unioned rather than intersected. For example, `$ReadOnlyExcept[foo]$` and `$ReadOnlyExcept[bar]$` together make `$ReadOnlyExcept[foo, bar]$` and not `$ReadOnlyExcept[]$`.
  - If a variable is listed in both `ReadOnly` and `ReadOnlyExcept`, the variable is marked as read-only.

These must be at the top of a `unittest`, immediately following `{`.

- `$DisableCoverage$`. (As of compiler version `0.20.0.0`.) Disables all collection of code coverage when `--log-traces` is used. This is useful if a particular unit test causes a massive number of lines to be executed.
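A hedged sketch of where the `define`-level and `unittest`-level pragmas go; the category, members, and test below are hypothetical, and the matching `concrete` declaration is omitted:

```
define Counter {
  // Category pragmas go immediately after the opening brace of the define.
  $ReadOnlyExcept[count]$

  @category Int limit <- 100
  @category Int count <- 0

  // Function pragmas go at the very top of the procedure definition.
  increment () {
    $NoTrace$
    count <- count+1
  }
}

unittest coverageHeavyTest {
  // unittest pragmas also go immediately after the opening brace.
  $DisableCoverage$
}
```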
These pragmas alter how variables are dealt with locally:

- `$ReadOnly[var1, var2, ...]$`. (As of compiler version `0.13.0.0`.) Marks `var1`, `var2`, etc. as read-only for the remainder of the statements in this context. Note that this only prevents assignment to the variable; it does not prevent making calls to functions that change the state of the underlying value. It also does not prevent other functions from modifying the variable.

  ```
  scoped {
    Int i <- 0
  } in while (i < 100) {
    // i can still be overwritten here
    $ReadOnly[i]$
    // i can not be overwritten below here within the while block
  } update {
    // this is fine because it's in a different scope
    i <- i+1
  }
  ```

  This can be used for any variable name visible in the current scope, including `@value` and `@category` members and argument and return variables.

- `$Hidden[var1, var2, ...]$`. (As of compiler version `0.13.0.0`.) This works the same way as `ReadOnly` except that it also makes the variables inaccessible for reading. Note that this does not allow you to reuse a variable name; the variable name remains reserved.

- As of compiler version `0.16.0.0`, both `$ReadOnly[...]$` and `$Hidden[...]$` can be used at the top of a `define` for a `concrete` category to protect member variables.

  ```
  define Type {
    $Hidden[foo]$
    $ReadOnly[bar]$

    @category Int foo <- 1

    // foo can still be read here for the purposes of initialization.
    @category Int bar <- foo
  }
  ```

Zeolite uses pragmas instead of something like `final` in Java for a few reasons:

- In practice, a large percentage of the variables that should be treated as read-only require some sort of iterative or conditional setup. Marking the variable as `final` at declaration time would require creating a helper to initialize it.

  ```
  Int nextPow2 <- 1
  while (nextPow2 < n) {
    nextPow2 <- 2*nextPow2
  }
  $ReadOnly[nextPow2]$
  ```

- A frequent source of errors in any language is multiple variables of the same type in the same scope. Marking some of them as read-only (or as hidden) within a certain context helps prevent errors caused by inadvertently using the wrong variable.

- If your code works correctly with variables marked as read-only or as hidden, then it will also work correctly without such markings; the markings are only there to cause compile-time errors. This means that they add limited value to interpreting the code, and can therefore be kept separate from the respective variable's definition without loss of clarity.
These can be used in place of language expressions.

- `$SourceContext$`. (As of compiler version `0.7.1.0`.) Inserts a `String` literal with information about the macro's location within the source file. Note that if this is used within an expression macro in `.zeolite-module` (see `ExprLookup` below), the context will be within the `.zeolite-module` file itself. (Remember that macro substitution is not a preprocessor stage, unlike the C preprocessor.)

- `$CallTrace$`. (As of compiler version `0.24.0.0`.) Inserts an `optional Order<Formatted>` containing the current call trace. This is only available in `.0rt` test files and in `.0rx` files with `$TestsOnly$`. It cannot be used in non-test code, just so that logic can't depend on where the call originated from.

- `$ExprLookup[MACRO_NAME]$`. (As of compiler version `0.6.0.0`.) This directly substitutes in a language expression, as if it was parsed from that exact code location. `MACRO_NAME` is the key used to look up the expression. Symbols will be resolved in the context that the substitution happens in.

  - `MODULE_PATH` is always defined. It is a `String` literal containing the absolute path to the module owning the source file. This can be useful for locating data directories within your module independently of `$PWD`. (See the sketch after this list.)

  - Custom macros can be included in the `.zeolite-module` for your module. This can be useful if your module requires different parameters from one system to another.

    ```
    // my-module/.zeolite-module

    // (Standard part of .zeolite-module.)
    path: "."

    // Define your macros here.
    expression_map: [
      expression_macro {
        name: USE_DATA_VERSION
        // Access using $ExprLookup[USE_DATA_VERSION]$.
        expression: "2020-05-12"  // Substituted in as a Zeolite expression.
      }
      expression_macro {
        name: RECURSION_LIMIT
        expression: 100000
      }
      expression_macro {
        name: SHOW_LIMIT
        // All Zeolite expressions are allowed.
        expression: "limit: " + $ExprLookup[RECURSION_LIMIT]$.formatted()
      }
    ]

    // (Standard part of .zeolite-module.)
    mode: incremental { }
    ```

    The `name:` must only contain uppercase letters, numbers, and `_`, and the `expression:` must parse as a valid Zeolite expression. This is similar to C++ macros, except that the substitution must be independently parsable as a valid expression, and it can only be used where expressions are otherwise allowed.
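For instance, a hedged sketch of using the always-available `MODULE_PATH` macro (the file name is hypothetical):

```
// Locate a data file relative to the module rather than relative to $PWD.
String configPath <- $ExprLookup[MODULE_PATH]$ + "/data/settings.txt"
```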
- Zeolite currently uses reference counting rather than a garbage-collection system that might otherwise search for unreferenced objects in the background. While this simplifies the implementation, it is possible to have a reference cycle that prevents cleanup of the involved objects, thereby causing a memory leak. This can be mitigated by using `weak` references in categories where a cycle is probable or guaranteed. For example, `LinkedNode` in `lib/container` is a doubly-linked list, which would create a reference cycle if both forward and reverse references were non-`weak`.
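As a rough, hedged sketch of that mitigation (the `Node` category below is hypothetical and its `concrete` declaration is omitted; see `LinkedNode` in `lib/container` for a real implementation):

```
define Node {
  @value optional Node next  // forward reference keeps the next node alive
  @value weak Node prev      // weak back-reference avoids a counted cycle

  setPrev (node) {
    prev <- node
  }

  getPrev () {
    // A weak reference must be converted back with strong() before use;
    // it yields empty if the target has already been cleaned up.
    return strong(prev)
  }
}
```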