-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow returning a value with an error #2647
Comments
I see potential in that. A world where error sets are just regular unions, but given all the syntax-level amenities of today's errors. // a regular-ass type
const InvalidChar = struct {
pos: usize,
};
// an error set containing different types
const ParseError = error {
InvalidChar: InvalidChar,
Overflow, // void
};
// merge like ya would today
const Error = ParseError || error{OutOfMemory};
fn f() void {
parse(something) catch |err| switch (err) {
.InvalidChar => |e| warn("bad character at {}", e.pos),
.Overflow => warn("overflow"),
.OutOfMemory => warn("out of memory"),
};
} Taking it further, perhaps all today's good stuff about errors could be applied to any type, not just unions. Maybe the const SomeError1 = error struct {
val: usize,
reason: []const u8,
};
const SomeError2 = error union(enum) {
OutOfOrder,
OutOfBounds,
OutOfIdeas,
};
// today's, which is sugar for above
const SomeError3 = error {
ResourceExhausted,
DeadlineExceeded,
}; Because you could now "bloat" an error set with types of larger size, this might affect how strongly use of the global error set is discouraged. |
I remember seeing this proposed before but I can't find the issue for it. Maybe it was only on IRC? |
Thank you @CurtisFenner for a well written proposal |
This is just a tagged union. And as they seem so useful, maybe we can add anonymous structs, so we can just use tagged unions instead of multiple return values. Don't worry about the optimizations here. The compiler can handle that. |
There's a previous issue here #572 (just for the record) |
because errors are assigned a unique value, how about allowing for tagged unions to use errors as the tag value? this would avoid adding new syntax to language and making this feature consistent with other constructs in the language. this tangentially relies on #1945. /// stolen from above
const ParseError = union(error) {
InvalidChar: usize,
Overflow, // equivalent to `Overflow: void`
};
pub fn parseU64(buf: []const u8, radix: u8) ParseError!u64 {
// ......
if (digit >= radix) {
return error{ .InvalidChar = index };
}
// ......
}
test "parseU64" {
if (parseU64(str, 10)) |number| {
// ......
} else |err| switch (err) {
error.Overflow => {
// ......
},
error.InvalidChar => |index| {
// ......
}
}
} |
Agreeing with @emoki I'd like some syntactic sugar for multiple arguments to an error switch, if the type is defined in the same tagged union:
|
I think what @emekoi suggested is excellent, as it removes the need for extra syntax and sidesteps the issues of increasing the size of |
I assume this should be: return ParseError{ .InvalidChar = index }; Otherwise I love the idea! |
that's what i wasn't sure about. would you still have to explicitly name the error even when using an inferred error set? or would you just use |
Not a proposal, but something possible currently: here's a variation on OP's "Workaround 2" (the out parameter). A struct member instead of an "out" parameter. It's still not perfect, but this or Workaround 2 is still the most flexible as they make it possible to allocate memory for the error value (e.g. a formatted error message). const Thing = struct {
const ErrorInfo = struct {
message: []u8,
};
error_info: ?ErrorInfo,
// `allocator` could also be a parameter of an init function
fn doSomething(self: *Thing, allocator: ...) !void {
if (bad thing 1) {
self.error_info = ErrorInfo {
.message = try ...allocate a string...,
};
return error.ThingError;
} else if (bad thing 2) {
self.error_info = ErrorInfo {
.message = try ...allocate a different string...,
};
return error.ThingError;
} else {
// happy
}
}
};
fn caller() void {
var thing = Thing.init();
defer thing.deinit(); // free allocated stuff in error_info if present
thing.doSomething(some_allocator) catch |err| {
switch (err) {
error.ThingError => {
// this `.?` is the smelliest part of this idea
std.debug.warn("error: {}\n", thing.error_info.?.message);
},
else => {
// e.g. an OOM error from when we tried to alloc for the error message
std.debug.warn("some other error\n");
},
}
return;
}
std.debug.warn("success\n");
} This might be a solution for std InStream and OutStream which currently have that annoying generic error parameter? Also, for parsers and line numbers specifically, you don't need to include the line number in the error value itself. Just maintain it in a struct member and the caller can pull it out when catching. If these struct members aren't exclusive to failed states, then there's no smell at all here. const Parser = struct {
...
line_index: usize,
parse(self: *Parser) !?Token {
// continually update line_index, return a regular zig error if something goes wrong
}
}; |
I like @emekoi's suggestion here, but I'll note that I'd like to be able to have |
I guess it would actually be |
in your example |
I think the issue here can be summarized by noting that zig has 2 concepts that are tied together that probably don't need to be.
Zig has some nice constructs that make error control flow easy to work with ( Maybe Zig should be able to infer an error set that includes any type, not just error codes? fn foo() !void{
if (...)
return error SomeStruct.init(...);
if (...)
return error.SomeErrorCode;
} |
this c++ Proposal is so cool with this zig Proposal, so maybe a consider.l |
I find that the boost library |
I additionally propose that coercing a |
Is this not typically a job for some kind of interfaces, i.e. allow anything that implements |
Here's a pattern that I would consider to be an alternative to this proposal: Lines 717 to 826 in 5709737
I think it's quite reasonable. Edit: here's usage example at the callsite: zig/src-self-hosted/stage2.zig Lines 677 to 708 in 5709737
|
The problem with returning a value with an error, is that it is the same as returning a polymorphic type, and defining that return type inside the function, instead of in the function signature. While I think we still need inferred return types (#447) for some things, this is an advanced feature, and fancier patterns, like the example, should be required in order to utilize these features. We also need inferred types as a way of sneaking in multiple return values (through the anonymous structs we already have), which LLVM supports, but in C requires an ugly one-use-but-defined struct (and where the C ABI layout of that struct is ignored by the optimization pass). |
I think it's reasonable, but I think it could be better by making minimal changes to the language. Specifically, I think Zig is already expressive enough to more tightly reflect the function's interface in its type signature; we just need to apply the right already-existing features. My two main complaints with what you can currently achieve:
An error set like This (modified) proposal is to optionally transform the "error set" concept into an "error union" concept -- noting that a tagged union with all void field types is essentially the same thing as a enum; ie, we have just strengthened an existing concept (error sets) to another existing concept (tagged unions). I don't think it's necessary to automatically generate the union, as emekoi suggested earlier -- we just use The example would look something like pub const DiagnosticsErr = union(error) {
UnknownCpuFeature: ?[]const u8,
MissingArchitecture: void,
// same as UnknownCpu: void
UnknownCpu,
}
pub fn parse(args: ParseOptions) !Target { // infer DiagnosticsErr!Target
......
// equivalent to `return DiagnosticsErr{.MissingArchitecture={}};`
return error.MissingArchitecture;
......
return DiagnosticsErr{.UnknownCpuFeature = feature_name};
}
var result = Target.parse(.......) catch |err| switch (err) {
error.UnknownCpuFeature => |unknown_feature_name| {
......
},
else => |e| return e,
} I think this is a relatively small change to the language (since it re-uses existing concepts and is totally backwards compatible) to make some situations much clearer without any additional runtime/compiletime cost. |
related: #786 |
I think this has already been captured, but I'm dealing with error values that have resources that must be freed. I'm not sure how this would work with Specifically, I've written a driver for duckdb. When failing to execute a query, duckdb provides a detailed error message. That message is valid until the underlying prepared statement is freed. Whether I dupe the error message or keep a reference to the prepared statement, I need to free something. I'm using a Result / tagged union, but it's a bit cumbersome to consume. |
I agree that something to this affect would be great, particularly in the instance of failed parsing, writing to file, etc... I have also experience similar opaque error messages when parsing json with no hint of where in the json file the error occurred |
I was thinking along the lines of @marler8997's suggestions, but from a perspective that this approach can actually solve the problem of merging conflicting error sets (which is an unsolved problem in the OP's proposal, and I find it a bit difficult to accept it to be unsolved). So, once we allow other types to be used as members of error sets, there is no problem with a conflict, since each type is globally unique in a natural way. The syntax could look e.g. like follows (I do not insist on this specific form, which is based on @marler8997's one, this is just an example) const MyErrorSet = error {
SomeErrorId,
AnotherErrorId,
error ErrorIdWithPayload,
};
const ErrorIdWithPayload = struct {
payload_field1: []const u8,
payload_field2: usize,
};
fn fail() MyErrorSet!void {
return error ErrorIdWithPayload{ .payload_field1 = "", .payload_field2 = 0 };
}
fn func() void {
fail() catch |err| switch(err) {
error.SomeErrorId, error.AnotherErrorId => {},
error ErrorIdWithPayload => |payload| {
std.debug.print("{s} {}\n", .{ payload.payload_field1, payload.payload_field2 });
}
};
} Notice that, differently from const fail_example = @import("fail_example.zig");
fn func() void {
fail_example.fail() catch |err| switch(err) {
error.SomeErrorId, error.AnotherErrorId => {},
error fail_example.ErrorIdWithPayload => |payload| {
std.debug.print("{s} {}\n", .{ payload.payload_field1, payload.payload_field2 });
}
};
} |
Currently i'm doing it the Rust way. It kinda fills what I need from it but it would be better to better builtin to the language as this is, in my opinion, not totally readable and it makes 2-way of managing errors: const std = @import("std");
fn Result(comptime O: type, comptime E: type) type {
return union(enum) {
Ok: O,
Err: E,
};
}
const CreateFolderErr = union(enum) { WriteError: []const u8 };
fn createFolders(dir: std.fs.Dir, paths: []const []const u8) Result(void, CreateFolderErr) {
for (paths) |path| {
dir.makeDir(path) catch {
return .{ .Err = .{ .WriteError = path } };
};
}
return .Ok;
}
pub fn main() !void {
const paths = &[_][]const u8{
"folder1",
"folder1/folder1.1",
"folder2/folder2.1",
};
const dir = try std.fs.cwd().openDir("tmp", .{});
// Should print "Could not write: folder2/folder.2.1", as "folder2" does not exists
switch (createFolders(dir, paths)) {
.Err => |err| switch (err) {
.WriteError => |path| std.debug.print("Could not write: {s}\n", .{path}),
},
.Ok => {},
}
} It would cleary be more readable and more the zig way to do something like this: const std = @import("std");
const CreateFolderErr = union(error) {
WriteError: []const u8
};
fn createFolders(dir: std.fs.Dir, paths: []const []const u8) CreateFolderErr!void {
for (paths) |path| {
dir.makeDir(path) catch {
return .{ .WriteError = path };
// Or more explicit: return error { .WriteError = path };
};
}
return;
}
pub fn main() !void {
const paths = &[_][]const u8{
"folder1",
"folder1/folder1.1",
"folder2/folder2.1",
};
const dir = try std.fs.cwd().openDir("tmp", .{});
// Should print "Could not write: folder2/folder.2.1", as "folder2" does not exists
createFolders(dir, paths) catch |err| switch (err) {
.WriteError => |path| std.debug.print("Could not write: {s}\n", .{path}),
};
} |
I just started with Zig and as far as I can see now, the error handling is absolutely perfect and extremely elegant. (I tried Rust for a while, where error handling is insanely complicated and hardly usable, in my opinion). For more complicated error handling, where feedback (which can be a lot) is needed, maybe it is better to leave this to the application or library programmers to handle in a creative way. |
Perhaps liberror can be a source of inspiration? it's one alternative to additionally returning an arbitrary type - just let the callee do the heavy lifting of setting errstr |
Error codes are for control flow. |
I agree that error codes are good for control flow, but it is super common to not care what the actual error was until much higher in the stack, and there is no idiomatic way to attach that state. For example, consider this API:
Which will save the message in a buffer and write it once the buffer is full to a file with the current date as the name. There is no idiomatic way to do that, which is problematic. |
I've discovered this pattern that I use when I both want to have error control flow (i.e. trigger errdefers) but would also like a "side-channel" for extra error information. Consider this example, foo() catch {
std.log.err("foo failed", .{});
}; We want to add a side-channel to foo. The general pattern is to define a var err: FooError = undefined;
foo(&err) catch {
std.log.info("foo failed with {}", .{err});
};
var err: FooError = undefined;
const result = foo(args..., &err) catch |err| switch (err) {
error.Foo => {
// we know err is initialized because we got error.Foo
defer err.deinit(); // the err could have a deinit function to cleanup resources
std.log.info("err is {}", .{err}); // the error could have a format function for printing
},
// if the error wasn't error.Foo, then err wouldn't be populated
else => |e| return e,
}; And here's an example of what // TIP: sometimes it's useful to make this a union(enum) type
const FooError = struct {
side_channel_data: SideChannelData,
// you could define a deinit function here if you have resources to clean up
pub fn deinit(self: *FooError) void { ... }
// the set function allows the `foo` implementation to both set the error and return
// the proper zig error code in 1 expression
pub fn set(self: *FooError, val: FooError) error{Foo} {
self.* = val;
return error.Foo;
}
// sometimes you might want multiple set* functions
// if this is a more general kind of "Error" shared multiple functions, you
// might want to defined a format function
pub fn format(
self: FooError ,
comptime fmt: []const u8,
options: std.fmt.FormatOptions,
writer: anytype,
) !void {
try writer.print(...);
}
}
fn foo(args..., err: *FooError) !ReturnValue {
if (something_failed) {
// this `set` function will initialize `err` and return error.Foo
return err.set(my_side_channel_info);
}
try bar(); // still ok to use 'try' with other functions
try foo2(err); // we pass `err` to foo2 which could also return error.Foo
} @ayende, In your example with the const std = @import("std");
pub fn main() void {
var err: PersistError = undefined;
persist("my message", &err) catch {
std.log.err("persist failed trying to write file '{s}'", .{err.getFilename()});
};
}
pub const PersistError = struct {
filename_buf: [200]u8,
filename_len: usize,
pub fn getFilename(self: *PersistError) []const u8 {
return self.filename_buf[0..self.filename_len];
}
pub fn setFilename(self: *PersistError, filename: []const u8) error{PersistError} {
self.filename_len = @min(self.filename_buf.len, filename.len);
@memcpy(self.filename_buf[0..self.filename_len], filename);
return error.PersistError;
}
};
pub fn persist(msg: []const u8, err: *PersistError) error{PersistError}!void {
_ = msg;
const something_failed = true;
if (something_failed) {
return err.setFilename("my-filename-with-the-date-in-it");
}
} Here's a couple real-world examples: direct2d-zig: https://github.com/marler8997/direct2d-zig/blob/6c6597a3a80203ee144bd97efc269e33c0653864/ddui.zig#L134 |
I ended up doing something like that, sure. But the key point isn't that this is possible. |
there isn't one and that's part of the freedom of Zig. unless you limit error return values to only be primitives then you have to worry about the allocation strategy of the error return and Zig is intentionally not prescriptive about that in its design. |
I find myself writing a lot of this: var foo_error_data: FooErr = undefined;
foo(&foo_error_data) catch |err| {
error_data = .{.foo = foo_error_data};
return err;
};
var bar_error_data: BarErr = undefined;
bar(&bar_error_data) catch |err| {
error_data = .{.bar = bar_error_data};
return err;
};
var quux_error_data: QuuxErr = undefined;
quux(&quux_error_data) catch |err| {
error_data = .{.quux = quux_error_data};
return err;
}; When I would have liked to write: try foo();
try bar();
try quux(); Is there a nicer way to compose error values with this pattern? |
Good question, yes there is:
I find your example a little unrealistic, because I don't see why you would have 3 different error types. By making them the same, then you can introduce a |
Because I'm calling three different apis that all use this error pattern and have different values associated with their errors. The values are actually useful too, so I don't want to just turn everything into a string error message. In your examples you have AnalysisFail, InnerError, LinkFailure etc. If you have some zig code that you want to analyze, codegen and link, then what error value do you return? Is it always just an opaque string? |
Considering how often we work with errors and how useful it is to have detailed information on the errors is, I don't think the fact that we can work around it is a valid reason to not make it a better experience |
Maybe I missed it somewhere, but is there a technical reason why we seem opposed to enabling errors with payloads? Is it |
The biggest reason why not is lifetime issues.
In the case of
Now what happens? The problem here is that this just doesn't compose. |
Sometimes when a function fails, there is extra information that you have on hand that may help the caller respond to the problem or produce a diagnostic. For example, in the
parseU64
example by andrewrk here,it would be useful for the function could return the position of the invalid character so that the caller could produce a diagnostic message.
Because Zig treats
error
types specially, when using errors you get a bunch of nice features, such as!
error-set inference,try
/catch
, anderrdefer
; you currently lose these features if you want to return extra diagnostic information since that information is no longer an error type.While something like index-of-bad-character is less useful for parsing an integer, getting "bad character" with no location when parsing a 2KiB JSON blob is very frustrating! -- this is the current state of the standard library's JSON parser.
There are currently two workarounds possible today to let this extra information get out, neither of which are very ergonomic and which work against Zig's error types:
Workaround 1: Return a tagged union
You could explicitly return a tagged union that has the extra information:
This is unfortunate in a number of ways. First, because
InvalidChar
is no longer anerror
, you cannot propagate/handle the failure withtry
/catch
. Second, because theInvalidChar
case is no longer anerror
, you cannot useerrdefer
to cleanup partially constructed state in the parser. Finally, calling the function is made messy because it can fail in two separate ways -- either in the error union, or in the explicitly returned union. This means calls that distinguish different errors (as opposed to just propagating withtry
) need nestedswitch
es.Workaround 2: Write to an out parameter
You could also leave the error set alone, and instead expand the contract of
parseU64
to write to an out parameter whenever it returns aInvalidChar
error:However, this makes the function's interface much messier: it now includes mutation, and it makes it impossible to indicate that it's being called in such a way that it cannot fail, since the pointer parameter is required (where previously a
catch unreachable
could handle). Also, it won't be immediately obvious which out parameters are associated with which errors, especially if inferred error sets are being used. In particular, it gives libraries writes the opportunity to sometimes re-use out parameters (in order to prevent function signatures from growing out of hand) and sometimes not (they at least cannot when the types aren't the same).Proposal: Associate each error with a type
EDIT: Scroll down to a comment for a refreshed proposal. It looks essentially the same as here but with a bit more detail. The primary difference is not associating errors with value types, but an error within a particular error-set with a type. This means no changes to the
anyerror
type are necessary.I propose allowing a type to be associated with each error:
The value returned would be available in
switch
s:This allows a function which can fail in multiple ways to associate different value types with different kinds of failures, or just return some plain errors that worked how they did before.
With this proposal, the caller can use inferred error sets to automatically propagate extra information, and the callsite isn't made messy with extra out-parameters/an extra non-error failure handling switch. In addition, all of the features special to errors, like
errdefer
andtry
/catch
, continue to work.Errors in the global set would now be associated with a type, so that the same error name assigned two different types would be given different error numbers.
I'm not sure what happens when you have an error set with the same name twice with different types. This could possibly be a limited case where "overloading" a single name is OK, since instantiating an error is always zero-cost, but I'll ask what others think.
I'm fairly new to Zig, so some of the details may not be quite right, but hopefully the overall concept and proposal makes sense and isn't unfixably broken.
The text was updated successfully, but these errors were encountered: