Proposal: Lazy error set types lazyerror{A, ?B} #16147

Closed
rohlem opened this issue Jun 22, 2023 · 5 comments

@rohlem
Contributor

rohlem commented Jun 22, 2023

(Another update: It just came to my mind that if #2647 got implemented first, then lazily-unreachable errors could be modeled by assigning noreturn payloads to errors. This would already unblock the usage patterns this proposal allows, making the rest of it strictly-speaking a quality-of-life improvement to cut down on repetition to make code more concise.)

Update: The Gist

This should have been my introductory statement all along, taken from my comment below.
If I'm not being clear here, please ask for clarification (and maybe save yourself reading the rest for now).

const file = openFile("abc.txt") catch |e| switch(e) {
  error.WindowsOnlyError => {},
  error.LinuxOnlyError => {},
};

When deciding on a function's error set in status-quo, there are mainly two approaches:

  • Merge all possible error sets of a function (an approach I named SQ-1 below, std does this).
    • In this case, every call site has to handle (or propagate) all errors possible under all comptime conditions involved in the function instantiation (here: on all platforms).
    • We then have to hope that optimization passes determine some of them unreachable; otherwise our executable contains the code of these dead branches.
  • Have a conditional error set (explicit E!T (named SQ-2 below) or deduced !T (named SQ-3 below)).
    • In that case, the switch in this example (that could be used for all conditions/platforms) does not compile in status-quo, because the compiler errors for clauses present that are outside the error set.
      You are instead forced to duplicate the switch, once with only error.WindowsOnlyError, once with only error.LinuxOnlyError, and branch to the right one.

That is maybe the main practical reason for this proposal. If you have a solution for this in status-quo, please let me know.
There are other benefits to this proposal, but this might be the primary motivator.

Basically, my proposed idea boils down to telling the compiler that WindowsOnlyError and LinuxOnlyError are lazily-reachable in this error set.
Therefore it's okay (not a compile error) if they appear in the switch as clauses - but they only have to appear if the compiler determines they are actually reachable (the same way it determines the error set in !T).

If this doesn't sound too crazy yet, then here's the full proposal:

Problem statement / Scenario:

Imagine a function f with differing error conditions (=> a differing error set) depending on comptime conditions. For example:

  • error.A can always happen under any comptime configuration
  • error.L can happen under comptime condition cond_l.
    (For example when targeting Linux, but the condition may be arbitrarily complex.)
  • error.W can happen under different comptime condition cond_w - same arbitrary complexity here.

Status-quo modeling options

  • (SQ-1) merge all error sets (currently done by std).
    Problems:
    • The compiler thinks that paths for error.L and error.W are reachable even if they aren't.
    • Callers have to handle all errors, f.e. in exhaustive switches or when propagating them
      (unless they use unreachable etc. to filter their logic based on cond_l and cond_w,
      which is complex and bloats code).
      (Note: applies to callers within your module, but also users downstream.)
  • (SQ-2) construct error set conditionally (error{A} || (if(cond_l) error{L} else error{}) || ...)
    Problems:
    • less readable / harder to understand
    • less maintainable: may go out of sync (the compiler rejects a declared set that is too narrow, but stays silent about one that is too wide)
    • Callers now HAVE to filter their own error handling with unreachable etc. based on the error set
      (or the underlying conditions), since writing f.e. switch clauses for errors outside of an error value's set
      is a compile error.
      (Note: applies to callers within your module, but also users downstream.)
  • (SQ-3) use deduced error sets (in deduced error unions !T)
    Problems:
    • less readable unless you comment the actual error set
      (in which case it's less maintainable, because not checked by the compiler)
    • Callers still HAVE to filter their own error handling just as in SQ-2.
      (Note: applies to callers within your module, but also users downstream.)
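To make the difference between SQ-1 and SQ-2 concrete, here is a minimal status-quo sketch. The constants cond_l and cond_w and the functions fMerged and fNarrow are hypothetical stand-ins introduced only for this illustration; they are not taken from std or from any real code base.

```zig
const std = @import("std");

// Hypothetical stand-ins for the comptime conditions cond_l and cond_w.
const cond_l = true;
const cond_w = false;

// SQ-1: one merged set that ignores the conditions.
const MergedError = error{ A, L, W };

// SQ-2: the narrowest set, rebuilt by hand from the same conditions.
const NarrowError = error{A} ||
    (if (cond_l) error{L} else error{}) ||
    (if (cond_w) error{W} else error{});

fn fMerged(fail: bool) MergedError!void {
    if (fail) return error.A;
}

fn fNarrow(fail: bool) NarrowError!void {
    if (fail) return error.A;
}

test "SQ-1: the caller covers the whole merged set" {
    fMerged(true) catch |e| switch (e) {
        error.A => {},
        error.L => {}, // required even while cond_l is false
        error.W => {}, // required even while cond_w is false
    };
}

test "SQ-2: the caller matches the narrow set exactly" {
    fNarrow(true) catch |e| switch (e) {
        error.A => {},
        error.L => {}, // has to be deleted again once cond_l becomes false
        // Adding `error.W => {},` is a compile error while cond_w is false.
    };
}
```

Flipping cond_l or cond_w leaves the SQ-1 caller untouched but forces an edit of the SQ-2 caller, which is the friction described in the list above.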

None of these are optimal. Zig strives for optimal code, hence this proposal.

Proposal: A lazy-error-set type-placeholder syntax lazyerror{A, ?B}, and deduced-lazy-error-set types.

The semantics are as follows:

  • errors listed without a ? are assumed-reachable.
    (If no elements have a preceding ?, the semantics are the same as for a status-quo error set.)
  • errors listed with a ? are lazily-reachable. Their actual reachability is deduced by the compiler.

Short example showcase:

// status-quo functions with explicit error sets in error unions
fn checkA() error{A}!void {}
fn checkL() error{L}!void {}
fn checkW() error{W}!void {}

// proposed stuff
const E = lazyerror{A, ?L, ?W}; //lazy-error-set type-placeholder
fn f(comptime check_l: bool, comptime check_w: bool) E!void {
  try checkA();
  if(check_l) try checkL();
  if(check_w) try checkW();
}
fn g() E!void {
  try checkA();
}

test "lazy-error-sets naturally allow more optimal, checked caller code" {
  // here e's reachable error set is {A}, so L and W don't need clauses
  f(false, false) catch |e| switch(e){
    error.A => {},
  };
  // but if we want to write clauses for lazily-unreachable errors,
  // so our code covers more ground, that's allowed  (unlike SQ-2, SQ-3).
  const l_enabled = false; //changeable without changing the `switch` below
  const w_enabled = false; //changeable without changing the `switch` below
  f(l_enabled, w_enabled) catch |e| switch(e){
    error.A, error.L, error.W => {},
  };
  // here e's reachable error set is {A, L, W} => the compiler would error if we forgot a clause
  f(true, true) catch |e| switch(e) {
    error.A, error.L, error.W => {},
  };
  // merging works sensibly (natural promotion lazy to assumed-reachable, error to lazyerror)
  const E2 = (lazyerror{?A, ?L} || lazyerror{A, ?W} || error{A});
  try std.testing.expect(E == E2);
}

Details

Note that a lazy-error-set type-placeholder lazyerror{A, ?L, ?W}!void isn't exactly a type, just as !void isn't either. The actual result type is deduced by the compiler, but will always contain all assumed-reachable elements, and never contain an error outside of the given set.

However, unlike with !T, lazy-error-set type-placeholders carry more information, therefore I do think it would be helpful to promote them to actual comptime values:

  • storing and exposing in const-s
    (they're technically not types, so just making a new special category lazyerrorset would be the easiest.)
  • merging them (same operator || as for status-quo error sets)

Note that a deduced-lazy-error-set type must hold more information than a status-quo error set: In addition to the set of reachable errors (assumed-reachable + lazy errors that the compiler determined reachable), it also needs to hold the set of remaining lazy errors for the compiler to check whether a switch clause or value comparison should be a compile error (e.g. to catch mistakes leading to always-dead branches in code).
Deduced-lazy-error-set types would also make sense as comptime values, to allow merging them.

Additional considerations

  • We could reuse keyword error instead of introducing lazyerror in type-placeholder syntax.
    This is how I typed it initially, then I changed it. Now I want to change it back.
    • Con: error{(...)} would then yield either a type or a lazyerrorset, depending on whether there is a ? element.
    • Pro: error{A} and lazyerror{A} are semantically supposed to be equivalent - if we use the same keyword they are syntactically equivalent.
  • I haven't thought of a syntax for deduced-lazy-error-set types yet. I don't anticipate using it in code, but we probably want one for compiler messages.
    • I think using ? here as well would get confusing.
    • Maybe something as simple as error{A}lazilyunreachable{L,W} is good enough.

@N00byEdge
Contributor

Isn't

const E = lazyerror{A, ?L, ?W}; //lazy-error-set type-placeholder
fn f(comptime check_l: bool, comptime check_w: bool) E!void {

just equivalent to

fn E(comptime check_l: bool, comptime check_w: bool) type {
  return error{A} || if(check_l) error{L} else error{} || if(check_w) error{W} else error{};
}
fn f(comptime check_l: bool, comptime check_w: bool) E(check_l, check_w)!void {

?
That seems much more explicit, flexible and simple to me even if it is a little bit of boilerplate.

@rohlem
Contributor Author

rohlem commented Jun 23, 2023

@N00byEdge They are not equivalent; your code snippet implements the option I listed as SQ-2 via a helper function.
The issue with that solution is that caller code has to cover the error set in their switch exactly, otherwise they get a compile error:

f(false, true) catch |e| switch(e) {
  error.A, error.W => {},
  error.L => {}, //triggers a compile error because L is not a part of the error set
};

A single call site using switch therefore cannot call f in ways that result in differing error sets.
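As a rough sketch of what a caller ends up writing in status-quo if it wants an exhaustive switch under every configuration: one switch per combination, selected by comptime branches. The helper E mirrors the SQ-2 construction function; handle and its counter are hypothetical names used only for this illustration. It relies on Zig analyzing only the taken branch of an if whose condition is comptime-known.

```zig
const std = @import("std");

// Same shape as the SQ-2 construction function.
fn E(comptime check_l: bool, comptime check_w: bool) type {
    return error{A} ||
        (if (check_l) error{L} else error{}) ||
        (if (check_w) error{W} else error{});
}

// Always fails with error.A so that the caller's switch is exercised.
fn f(comptime check_l: bool, comptime check_w: bool) E(check_l, check_w)!void {
    return error.A;
}

// Hypothetical caller: one exhaustive switch per configuration.
fn handle(comptime check_l: bool, comptime check_w: bool) u8 {
    var handled: u8 = 0;
    f(check_l, check_w) catch |e| {
        if (check_l and check_w) {
            switch (e) {
                error.A, error.L, error.W => handled += 1,
            }
        } else if (check_l) {
            switch (e) {
                error.A, error.L => handled += 1,
            }
        } else if (check_w) {
            switch (e) {
                error.A, error.W => handled += 1,
            }
        } else {
            switch (e) {
                error.A => handled += 1,
            }
        }
    };
    return handled;
}

test "every configuration compiles, at the cost of four switches" {
    try std.testing.expectEqual(@as(u8, 1), handle(false, false));
    try std.testing.expectEqual(@as(u8, 1), handle(false, true));
    try std.testing.expectEqual(@as(u8, 1), handle(true, false));
    try std.testing.expectEqual(@as(u8, 1), handle(true, true));
}
```

With n independent conditions this needs up to 2^n switches, and the comptime branching repeats the logic that already lives in E.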

Full runnable status-quo example:

// status-quo modelling via narrowest error set (SQ-2)
fn E(comptime check_l: bool, comptime check_w: bool) type {
  //note: the parentheses are required here; without them (as in your comment) the expression resolves to a different, wrong set.
  return error{A} || (if(check_l) error{L} else error{}) || (if(check_w) error{W} else error{});
}
fn f(comptime check_l: bool, comptime check_w: bool) E(check_l, check_w)!void {}

// usage code:
fn usesF(comptime check_l: bool, comptime check_w: bool) void {
  f(check_l, check_w) catch |e| switch(e) { 
    error.A => {}, //correctly always required
    error.L => {}, // providing this is a compile error if !check_l
    // not providing error.W is a compile error if check_w
  };
}
test usesF {
  //usesF(false, false); //compile error
  //usesF(false, true); //compile error
  usesF(true, false); //works
  //usesF(true, true); //compile error
}

// more error-prone, less readable variant, but allowed by the compiler currently:
fn genericallyUsesF(comptime check_l: bool, comptime check_w: bool) void {
  f(check_l, check_w) catch |e| {
    // When writing the proposal I thought these would be compile errors,
    // but apparently == works with errors outside of the set
    // (for now - not sure that it should).
    if(e == error.A) {}
    if(e == error.L) {}
    if(e == error.W) {}
    // Problem: The compiler cannot tell you if a branch is actually always dead code
    if(e == error.Other) {}
    // The simplest way to check for exhaustiveness (to make sure your errors are in sync)
    // I found is to re-construct the error set.
    comptime @import("std").debug.assert(@TypeOf(e) == E(check_l, check_w)); //error set of `f` out-of-sync
    // Though in this example because `f` and `E` are updated at the same time we still won't know;
    // we would actually need to call an "OldE" variant that is left in and only updated after all call sites
    // (or one for each call site).
  };
}
test genericallyUsesF { //these all work
  genericallyUsesF(false, false);
  genericallyUsesF(false, true);
  genericallyUsesF(true, false);
  genericallyUsesF(true, true);
}

I believe (i.e. in my experience) this doesn't scale well:

  • For every narrowest error set you need a construction function like E.
  • Every caller has to construct the exact set for the exhaustiveness check workaround
    (genericallyUsesF in the full example), and for the compiler to check this, this construction
    needs to be independently-updated code.
    • For this to be viable, every caller that modifies the set (removes an error by catching it,
      merges sets by calling several functions) needs their own error construction function as well.

My main issue isn't that I have to write the boilerplate once, it's that

  • the code becomes harder to read (i.e. parsing A, W, L from the definition of E)
  • callers are burdened by restricting their use of switch if they want to be compatible with several configurations' error sets
  • on error set changes the code has to be updated by hand. This is less maintainable than a solution with compiler support.

So to recap, the main benefits of the proposal:

  • Allow callers to write a single exhaustive switch statement that can cover multiple error sets (deduced from the same lazy-error-set type-placeholder). They are guaranteed to get a compile error if they miss a case that is reachable by the call.
  • No repeat of the comptime logic behind error sets between error set definition, function control flow and callers:
    The reachable errors (of lazily-reachable ?-entries) in a function instantiation are automatically determined from its control flow by the compiler.

@matu3ba
Contributor

matu3ba commented Jun 24, 2023

Allow callers to write a single exhaustive switch statement that can cover multiple error sets (deduced from the same lazy-error-set type-placeholder). They are guaranteed to get a compile error if they miss a case that is reachable by the call.

  1. Do you have a real life use case for this yet, where the amount of boilerplate can not be reduced by already established methods (most likely due to a significant amount of optional errors based on the target for multiple error classes)? Please provide some code.
  2. To me this feels like hiding the error handling from the user, which makes working without tooling harder, because figuring out the origin of error sets becomes harder due to indirection. Resolving indirection via functions is much easier to search for.
  3. With tooling/the compiler server, it would not make much of a difference, because the tooling (for example zls or the compiler) can show us the potential error set and the accurate error set. Unless you can show a significant amount of boilerplate, even in more common cases.

The reachable errors (of lazily-reachable ?-entries) in a function instantiation are automatically determined from its control flow by the compiler.

I don't see the difference to implementing "checking all errors for all targets", but I can't find the issue now. This feels to me merely like a programmer hint on what is target-defined for the documented targets.
And this smells like a local optimum, because we don't enforce that the error set is correct for the documented targets.

@rohlem
Contributor Author

rohlem commented Jun 24, 2023

@matu3ba

  1. Do you have a real life use case for this yet ...?
I do - I wanted to keep the OP general, but here are some details:

I'm currently writing a wrapper around SDL.
(It's kind-of-usable but I wanted to wait until I can use Zig's package manager for the entire setup before making it publicly available, since it's a bit tedious to set up right now.)
SDL itself has many rough edges. F.e. something being returned as u32 in one place, but for input only being accepted as u8 in another.

Now if I assert (via @intCast) the input u32 -> u8 that means all my users would by default be vulnerable to a crash for bad data => an error would be better.
But say a user doesn't need the API surface that only accepts u8 - maybe they prefer to keep the full u32 type. In that case the range error becomes unreachable.
Therefore I want to make all types configurable, because I think that provides the best possible interface for every use case.
This means many conditional errors, which need to be propagated through many internal functions.

Now here is a real section of the code.

It is not self-contained (won't build due to missing things), and I already know I'll be laughed at for my long identifiers, but anyway.
I added "//<-" comments so you can Ctrl+F to the imo relevant parts:

// relevant helper functions
const helpers = opaque {
    pub fn SingleErrorSet(comptime @"error": anytype) type { //<- turns error.ABC into error{ABC}
        comptime assertIsErrorValue(@"error");
        return @Type(.{ .ErrorSet = &.{@typeInfo(@TypeOf(@"error")).Error} });
    }
    pub fn intNarrowerThan(comptime A: type, comptime B: type) bool {
        if (comptime (std.math.minInt(A) > std.math.minInt(B))) return true;
        if (comptime (std.math.maxInt(A) < std.math.maxInt(B))) return true;
        return false; // A covers the whole range of B
    }
    pub fn intCastCustomError(comptime Result: type, int_value: anytype, comptime @"error": anytype) IntCastCustomErrorSet(Result, @TypeOf(int_value), @"error")!Result {
        comptime std.debug.assert(@typeInfo(@TypeOf(int_value)) == .Int);
        return std.math.cast(Result, int_value) orelse return @"error";
    }
    pub fn IntCastCustomErrorSet( //<- returns either a 1- or 0-member error set, depending on whether the cast is narrowing
        comptime Result: type,
        comptime IntValue: type,
        comptime @"error": anytype,
    ) type {
        return if (intNarrowerThan(Result, IntValue)) SingleErrorSet(@"error") else error{};
    }
};

// internal, tagged-union type for SDL DisplayEvent subtype of SDL Event (which is a C union) - there are around 20 like this one
pub const DisplayEventConfig = struct {
    DisplayIndex: type,
};
pub const DisplayEventSubEvent = union(enum) {
    pub const OrientationChanged = struct {
        new_orientation: internal.DisplayOrientation,
    };
    orientation_changed: OrientationChanged,
    connected: void,
    disconnected: void,
    moved: void,
};
pub fn DisplayEvent(comptime display_event_config: DisplayEventConfig) type {
    return struct {
        pub const config = display_event_config;
        pub const DisplayIndex = config.DisplayIndex;

        pub const SubEvent = DisplayEventSubEvent;

        display_index: DisplayIndex,
        sub_event: SubEvent,

        pub const FromRawConditionalErrorSet = //<- errors depending on config
            helpers.IntCastCustomErrorSet(DisplayIndex, @TypeOf(@as(c.SDL_DisplayEvent, undefined).display), error.DisplayIndexTypeTooSmallForValue);
        pub const FromRawErrorSetWithoutTopLevelSdlEventTypeValueErrors = //<- errors that the next level (PeripheryEvent) uses for its error set
            error{ UnknownSdlDisplayEventType, UnknownSdlDisplayOrientationValue } || FromRawConditionalErrorSet;
        pub const FromRawErrorSet = //<- the error set of the local fromRaw method
            error{ UnknownSdlEventTypeValue, SdlEventTypeValueNotDisplayEventTypeValue } || FromRawErrorSetWithoutTopLevelSdlEventTypeValueErrors;
        pub fn fromRaw(raw: c.SDL_DisplayEvent) FromRawErrorSet!@This() {
            return switch (try internal.EventType.fromRawInt(raw.type)) {
                else => return error.SdlEventTypeValueNotDisplayEventTypeValue,
                .display_event => .{
                    .display_index = try helpers.intCastCustomError(DisplayIndex, raw.display, error.DisplayIndexTypeTooSmallForValue), //<- conditional error propagated here
                    .sub_event = switch (raw.event) {
                        else => return error.UnknownSdlDisplayEventType,
                        c.SDL_DISPLAYEVENT_ORIENTATION => .{ .orientation_changed = .{ .new_orientation = (try internal.DisplayOrientation.fromRawInt(raw.data1)).? } },
                        c.SDL_DISPLAYEVENT_CONNECTED => .connected,
                        c.SDL_DISPLAYEVENT_DISCONNECTED => .disconnected,
                        c.SDL_DISPLAYEVENT_MOVED => .moved,
                    },
                },
            };
        }
    };
}

// tagged-union grouping for periphery-related SDL events, including SDL DisplayEvent
pub const PeripheryEventConfig = struct {
    DisplayIndex: type,
    JoyDeviceIndex: type,
    AudioDeviceIndex: type,
};
pub fn PeripheryEvent(comptime periphery_event_config: PeripheryEventConfig) type {
    return union(enum) {
        pub const config = periphery_event_config;

        pub const Display = DisplayEvent(.{ .DisplayIndex = config.DisplayIndex });
        pub const Keyboard = KeyboardPeripheryEvent;
        pub const JoyDevice = JoyDevicePeripheryEvent(config.JoyDeviceIndex);
        pub const GameController = GameControllerPeripheryEvent(config.JoyDeviceIndex);
        pub const AudioDevice = AudioDevicePeripheryEvent(config.AudioDeviceIndex);

        display: Display,
        keyboard: Keyboard,
        joy_device: JoyDevice,
        game_controller: GameController,
        audio_device: AudioDevice,

        pub const FromRawErrorSetWithoutTopLevelSdlEventTypeValueErrors = Display.FromRawErrorSetWithoutTopLevelSdlEventTypeValueErrors || JoyDevice.FromRawErrorSetWithoutTopLevelSdlEventTypeValueErrors || GameController.FromRawErrorSetWithoutTopLevelSdlEventTypeValueErrors || AudioDevice.FromRawErrorSetWithoutTopLevelSdlEventTypeValueErrors;
        pub const FromRawErrorSet = error{ UnknownSdlEventTypeValue, SdlEventTypeValueNotPeripheryEventTypeValue } || FromRawErrorSetWithoutTopLevelSdlEventTypeValueErrors;
        pub fn fromRaw(raw: c.SDL_Event) FromRawErrorSet!@This() {
            return switch (try internal.EventType.fromRawInt(raw.type)) {
                else => return error.SdlEventTypeValueNotPeripheryEventTypeValue,
                inline .display_event => |t| .{
                    .display = Display.fromRaw(t.extractDataAssertEventType(raw)) catch |err| switch (err) { //t.extractDataAssertEventType(raw) is safe because t comes from raw
                        error.UnknownSdlEventTypeValue => unreachable, //checked above
                        error.SdlEventTypeValueNotDisplayEventTypeValue => unreachable, //checked above
                        error.DisplayIndexTypeTooSmallForValue => |e| return e, //<- Here is the issue: This is required if the conditional error is part of the set, but becomes a compile error otherwise. An "else" clause also becomes a compile error when all other errors are handled. So there seems no way for me to write this as a `switch`.
                    },
                },
                inline .keyboard_keymap_changed => |t| .{
                    .keyboard = Keyboard.fromRaw(t.extractDataAssertEventType(raw)) catch |err| switch (err) { //t.extractDataAssertEventType(raw) is safe because t comes from raw
                        error.UnknownSdlEventTypeValue => unreachable, //checked above
                        error.SdlEventTypeValueNotKeyboardPeripheryEventTypeValue => unreachable, //checked above
                    },
                },
                .joy_device_connected,
                .joy_device_disconnected,
                .joy_device_remaining_energy_state_changed,
                => .{
                    .joy_device = JoyDevice.fromRaw(raw) catch |err| switch (err) {
                        error.UnknownSdlEventTypeValue => unreachable, //checked above
                        error.SdlEventTypeValueNotJoyDevicePeripheryEventValue => unreachable, //checked above
                        error.JoyDeviceIndexTypeTooSmallForValue => |e| return e, //<- same here - impossible to write as switch
                    },
                },
                inline .game_controller_connected,
                .game_controller_disconnected,
                .game_controller_remapped,
                => |t| .{
                    .game_controller = GameController.fromRaw(t.extractDataAssertEventType(raw)) catch |err| switch (err) { //t.extractDataAssertEventType(raw) is safe because t comes from raw
                        error.UnknownSdlEventTypeValue => unreachable, //checked above
                        error.SdlEventTypeValueNotGameControllerPeripheryEventTypeValue => unreachable, //checked above
                        error.JoyDeviceIndexTypeTooSmallForValue => |e| return e, //<- same here - impossible to write as switch
                    },
                },
                inline .audio_device_connected,
                .audio_device_disconnected,
                => |t| .{
                    .audio_device = AudioDevice.fromRaw(t.extractDataAssertEventType(raw)) catch |err| switch (err) { //t.extractDataAssertEventType(raw) is safe because t comes from raw
                        error.UnknownSdlEventTypeValue => unreachable, //checked above
                        error.SdlEventTypeValueNotAudioDevicePeripheryEventTypeValue => unreachable, //checked above
                        error.AudioDeviceIndexTypeTooSmallForValue => |e| return e, //<- same here - impossible to write as switch
                    },
                },
            };
        }
    };
}

Here's a much simpler example that demonstrates the same thing though (EDIT: now also placed at the beginning of the original post):

const file = openFile("abc.txt") catch |e| switch(e) {
  error.WindowsOnlyError => {},
  error.LinuxOnlyError => {},
};

In status-quo, either you merge all error sets (SQ-1, like std does) - then programs do not compile unless they handle (or propagate) all errors under all comptime conditions (here: on all platforms).
We then have to hope that optimization passes determine some of them unreachable; otherwise our executable contains the code of these dead branches.

Or you have a conditional error set (explicit = SQ-2, or deduced !T = SQ-3) - in that case, the switch in this example (that could be used for all conditions/platforms) does not compile in status-quo.


  2. To me this feels like hiding the error handling from the user, ..., because figuring out the origin of error sets becomes harder due to indirection. Resolving indirection via functions is much easier to search for.

If I understand you correctly, this is a concern that applies just the same to status-quo deduced error sets using !T, right?
In my current understanding (and please correct me on this), there are actually two concerns present here which (in my experience) don't align naturally:

  • (C1) Finding out what errors are in an error set.
    • The easiest form to read is a written-out error set: error{A, B, C}.
    • However, in practice errors on one level come from the next level of implementation, f.e. DownloadError = NotFoundError || ConnectionError.
      This is the logical way to structure code, but it means it takes longer to find the actual errors.
    • Because conditional error sets need logic/branching, they tend to obfuscate the actual error names by the control flow of choosing/merging them.
      Compare this to a lazy-error-set placeholder: error{A, ?B, ?C}. IMO this is concise and immediately readable, thereby preferable.
  • (C2) Finding out why an error is in an error set.
    • The reason for an error condition is implied by the code that triggers it. Decoupling the logic behind error sets from a function's control flow (SQ-2, f.e. via error set construction function) means error sets can get out-of-sync.
      I recognize this as an issue, but don't think lazy error sets would make a big difference compared to status-quo (where you can already declare an error that actually doesn't occur in code and the compiler doesn't complain). AFAIK the only way to find these unused errors is by Ctrl+F -ing error names throughout the code. Error discoverability (C1) helps with this process.
    • IMO the best approach to documenting error conditions for error-sets is by inline comments next to the error set definition.
      For this I believe concise syntax (of error sets or lazy error sets) to be more readable than between comptime branches that construct error sets.
Example:

// status-quo
/// errors only on Linux
const LinuxErrors = error{LinuxError}
  // we use old API before this version, which may lead to OldLinuxApiError
  || ( if (builtin.os.version.isAtLeast(.{.major = 3, .minor = 2, .patch = 0})) error{} else error{OldLinuxApiError} );
/// errors only on Windows
const WindowsErrors = error{WindowsError};
const ErrorSQ = error{NotFound} || switch(builtin.os.tag) {
  .linux => LinuxErrors,
  .windows => WindowsErrors,
  else => error{}
};

// compare with lazy-error-set placeholder:
const ErrorLE = lazyerror{
  NotFound,
  //only on Linux
  ?LinuxError,
  //we use old API on Linux before version 3.2.0 , which may lead to OldLinuxApiError
  ?OldLinuxApiError,
  //only on Windows
  ?WindowsError,
};

I don't see lazy-error-set types or placeholders having a direct negative effect on reading code without tooling - if you do, please let me know how/why.
(Please also keep in mind we currently allow !T. I believe lazy-error-set type-placeholders being used instead of this to be a uniform improvement - let me know if I'm wrong / you disagree with this.)


  3. [The compiler / ZLS already knows the actual error set.] Unless you can show a significant amount of boilerplate, even in more common cases.

With approach (SQ-1) of merging all error sets, we actually rob the compiler of this information: Function boundaries (in current semantics, unless you inline) mark runtime code interactions. That means if you declare an error set on a function, the compiler will (at least initially) have to assume that all errors are reachable at the call site. Only during optimizations could it discover that certain paths are actually dead code.

For differing error sets by logic (SQ-2 and SQ-3) there is still the main issue of the compiler currently limiting use of switch regarding errors being in the set - I think I've posted it in 3 examples now, but please let me know if that is still unclear (should have probably been my introductory point looking back).

Generally speaking, I believe the conditional logic behind error sets explodes / grows exponentially:

  • when combining functions
    • try a(); try b(); => you are merging their error sets
    • catch only one of the errors => you are removing it from the error set
  • when you combine more complex error conditions, i.e. not just one error set per platform, but based on comptime arguments to a function

This requires error set generation code that should be kept in-sync with the logic of actual functions that can return those errors - but this currently must be done by hand. With lazy-error-set types, the lazy-error portion of the error set will instead be deduced by the compiler.
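A small status-quo sketch of the two operations listed above, with explicit error sets so the hand-maintained part is visible. The functions a, b, callBoth and callBothHandleA are hypothetical and exist only for this illustration.

```zig
const std = @import("std");

fn a(fail: bool) error{A}!void {
    if (fail) return error.A;
}

fn b(fail: bool) error{B}!void {
    if (fail) return error.B;
}

// `try` on both calls: the declared set has to be the merge of both sets.
fn callBoth(fail_a: bool, fail_b: bool) (error{A} || error{B})!void {
    try a(fail_a);
    try b(fail_b);
}

// error.A is handled locally, so only error.B remains in the declared set.
fn callBothHandleA(fail_a: bool, fail_b: bool) error{B}!void {
    a(fail_a) catch |e| switch (e) {
        error.A => {},
    };
    try b(fail_b);
}

test "merging and removing errors by hand" {
    try std.testing.expectError(error.A, callBoth(true, false));
    try std.testing.expectError(error.B, callBoth(false, true));
    try callBothHandleA(true, false); // error.A was swallowed
    try std.testing.expectError(error.B, callBothHandleA(true, true));
}
```

Each declared return set here repeats what the function body already implies; once a and b themselves have conditional sets, those declarations need the same comptime logic again.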

Maybe there is some exact threshold or definition of "too much boilerplate" for you, or maybe you could qualify a "common case" somehow - let me know and I'm happy to construct / provide more examples.


The reachable errors (of lazily-reachable ?-entries) in a function instantiation are automatically determined from its control flow by the compiler.

I don't see the difference to implementing "checking all errors for all targets", but I can't find the issue now. This feels to me merely like a programmer hint on what is target-defined for the documented targets.

With "the issue", I assume you're referring to a GitHub issue about checking for each compilation target that the code handles the error sets correctly?
The bulk of (my side of the) discussion here is generally concerned with comptime conditions. These can be arguments to functions where a program contains any number of instantiations, they do not have to be dependent on the target platform at all (though that is one possible condition).

And this smells like a local optimum, because we don't enforce that the error set is correct for the documented targets.

With "enforcing correct error sets", do you mean a compile error for when an error set is too wide? Or do you mean making sure all reachable errors are handled?
In either regard I don't see how lazy-error-sets really change the picture at all, please clarify if I'm missing something here.

@rohlem
Contributor Author

rohlem commented Jun 24, 2023

For reference, I've now updated the OP with the maybe-main-point about switch being stated up-front.
This means the detailed explanations repeat this point, but at least now this part of the fundamental problem should be more obvious.
Thank you to everyone reading it (or trying to read it) before now regardless.


4 participants