Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft: Have Reader/Writer be a runtime instead of comptime interface #13808

Closed
wants to merge 1 commit into from

Conversation

Hejsil
Copy link
Contributor

@Hejsil Hejsil commented Dec 7, 2022

Current version of Reader and Writer takes 3 comptime arguments to specify the type. A Context, Error and read/write function. The drawbacks to this approuch are:

  • No Reader/Writer types are actually the same. A FixedBufferReader.Reader is no the same as a File.Reader
    • This means you have to take anytype everywhere and every different reader/writer will generate a copy of the function taking them.
  • Type names are very long, as they include the arguments in the names. This is made even worse by the fact that readers and writers tend to be composed. This affects:
    • Error messages. They can become very hard to read and understand
    • C backend. Type names become incredibly long, which is one of the contributing factors to why the output is big

This commit changes Reader and Writer to be more like Allocator. They are now a pointer + a function pointer to the function that implementions the read/write. This reduces the arguments taken by Reader and Writer from 3 to 1, which is the error set. We cannot eliminate this. Benifits:

  • Now, at least some Reader/Writer types can be the same. When they share the same error set, they will be the same type.

    • We could also add some functionality to promote a Reader/Writer to use a super set of its own error set. For example, we could have (this is vaporware):
      const a: Reader(error{}) = ...;
      const b: Reader(error{A}) = ...;
      const readers = &[_]Reader(anyerror){
          a.promote(anyerror),
          b.promote(anyerror),
      };
  • Type names are a lot shorter and nesting Readers and Writers will not produce names that include the child reader/writer.

    • Error messages are better:

      const std = @import("std");
      
      fn a(b: usize) void {
          _ = b;
      }
      
      test {
          var counting_writer = std.io.countingWriter(std.io.null_writer);
          // Before: expected type 'usize', found 'io.writer.Writer(*io.counting_writer.CountingWriter(io.writer.Writer(void,error{},(function 'dummyWrite'))),error{},(function 'write'))'
          // After:  expected type 'usize', found 'io.writer.Writer(error{})'
          a(counting_writer.writer());
      }
    • C backend produces less code:

        108.368.735 zig2.after.c
        117.846.808 zig2.before.c
      

Drawbacks:

  • You can no longer access the inner writer through writer.context
    • Seemed a bit like an anti pattern though. You should probably take the concrete reader/writer if you need to access its state
  • You can no longer obtain a Writer for ArrayListUnmanaged(u8). You need to get a managed ArrayList, get the Writer, write what you want and then move the result back with moveToUnmanaged.

I've done the work of getting the compiler compiling but many tests probably wont. I wanna know if this is something that would make sense

Current version of Reader and Writer takes 3 comptime arguments to
specify the type. A Context, Error and read/write function. The
drawbacks to this approuch are:

- No Reader/Writer types are actually the same. A
  FixedBufferReader.Reader is no the same as a File.Reader
  - This means you have to take `anytype` everywhere and every different
    reader/writer will generate a copy of the function taking them.
- Type names are very long, as they include the arguments in the names.
  This is made even worse by the fact that readers and writers tend to
  be composed. This affects:
  - Error messages. They can become very hard to read and understand
  - C backend. Type names become incredibly long, which is one of the
    contributing factors to why the output is big

This commit changes Reader and Writer to be more like Allocator. They
are now a pointer + a function pointer to the function that
implementions the read/write. This reduces the arguments taken by Reader
and Writer from 3 to 1, which is the error set. We cannot eliminate
this. Benifits:

- Now, at least some Reader/Writer types can be the same. When they
  share the same error set, they will be the same type.
  - We could also add some functionality to promote a Reader/Writer to
    use a super set of its own error set. For example, we could have:

      const a: Reader(error{}) = ...;
      const b: Reader(error{A}) = ...;
      const readers = &[_]Reader(anyerror){
          a.promote(anyerror),
          b.promote(anyerror),
      };
- Type names are a lot shorter and nesting Readers and Writers will not
  produce names that include the child reader/writer.
  - Error messages are better:

      const std = @import("std");

      fn a(b: usize) void {
          _ = b;
      }

      test {
          var counting_writer = std.io.countingWriter(std.io.null_writer);
          // Before: expected type 'usize', found 'io.writer.Writer(*io.counting_writer.CountingWriter(io.writer.Writer(void,error{},(function 'dummyWrite'))),error{},(function 'write'))'
          // After:  expected type 'usize', found 'io.writer.Writer(error{})'
          a(counting_writer.writer());
      }

  - C backend produces less code:

      108.368.735 zig2.after.c
      117.846.808 zig2.before.c
@silversquirl
Copy link
Contributor

I'm slightly confused as to why this is a PR rather than a proposal, I feel like this needs a bit more discussion. I'm not sure long type names are really that much of an issue; and the problems you describe that are caused by them can be fixed in other ways (eg. C backend output could use hashes instead of names to make output smaller; errors could show a diff instead of writing the full name multiple times, etc).

The bigger issue imo is the first one you bring up: the fact that different readers/writers are different types. This causes issues for tooling as it's not really possible to provide completions for anytype. However ,this PR doesn't even solve this issue because reader/writer are still comptime-generic because they need an error set.

I'm also worried about the effects this PR might have on runtime performance. Comptime allocators make a lot of sense for readers and writers because the code can be specialized and optimized for the specific usecases - eg. when reading from memory with a FixedBufferAllocator, it's much faster to inline the read function since it's just a memcpy, but when reading from a compressed file through GzipReader it might be better not to inline read so all the gzip code isn't duplicated.
Runtime interfaces not only disallow this level of optimization due to hiding the code from the compiler, but prevent inlining entirely and introduce extra indirection, which can drastically hurt performance in tight loops, which is often where IO-heavy code is.

In summary, this PR doesn't solve the most important issue brought up in the justification, and the issues it does solve could be addressed in other ways that don't impact runtime performance.

@Hejsil
Copy link
Contributor Author

Hejsil commented Dec 7, 2022

I'm slightly confused as to why this is a PR rather than a proposal, I feel like this needs a bit more discussion.

Well, I was mostly just exploring seeing the effects of this kind of change. Since I already had the code I though we might as well discuss around code that exists, compiles and works rather that guessing. Also, as I mentioned, this is not a complete PR. I've intentionally waited doing a lot of grunt work in case this change is unwanted.

However ,this PR doesn't even solve this issue because reader/writer are still comptime-generic because they need an error set.

To some extend, yes this does not solve that issue, but it does move towards some things that could not be done before. For example, the code I posted with an array of Reader(anyerror). There was no way to do this before (ofc, we would need to implement this promote function, but I'm pretty sure that is trivial):

const a: Reader(error{}) = ...;
const b: Reader(error{A}) = ...;
const readers = &[_]Reader(anyerror){
    a.promote(anyerror),
    b.promote(anyerror),
};

Also, with this, you can actually specify that your function takes a Reader:

fn take(reader: Reader(error{NoSpaceLeft})) !void { ... }

This can take any reader that can return this error set. You could even take a Reader that returns no error:

const reader: Reader(error{}) = ...;
take(reader.promote(error{NoSpaceLeft});

And this will avoid generating two versions of the same function. A place where this could reduce the amount of code generated is allocPrint. It will generate two calls to format, each with different writers (CountingWriter and FixedBufferWriter). You could promote the CountingWriter to have the same error set as the FixedBufferWriter, thereby only generating format. Calls to format is one of the big bloaters of zig binaries.

I'm also worried about the effects this PR might have on runtime performance.

There is a point here, but there are a few things to keep in mind:

  • Code size also affects runtime performance. The more code you have in your binary, the more likely you are to jump away from the instructions that are currently in cache
  • Optimizer are quite good at figuring out and inlining vtable calls when it can figure out the concrete type from inlining. This is because this is the way C++ works and llvm has done a lot of work to make this good. I've explored that here (links might need some updating)

I do agree that this is something to be aware of. This is just something that is quite hard to test unless we have some real world program who spends a lot of time in these interfaces and are hurt by this change.

errors could show a diff instead of writing the full name multiple times

Maybe, but that is vaporware at the moment and we have no idea how well that kind change would actually perform.

C backend output could use hashes instead of names to make output smaller

Well, depends if we want the C backend to produce obfuscated or readable code.

@Hejsil Hejsil closed this Dec 7, 2022
@Hejsil Hejsil reopened this Dec 7, 2022
@Hejsil
Copy link
Contributor Author

Hejsil commented Dec 7, 2022

Ops, clicked that close button by mistake

@travisstaloch
Copy link
Contributor

another use case i think this would solve was discussed in this discord issue. that is allowing wrapping readers/writers recursively. with status quo, the following results in an endless semantic analysis.

// test.zig
const std = @import("std");
const B = struct {
    b: *B,

    pub const Error = error{ Overflow, EndOfStream };
    fn deserialize(self: *B, reader: anytype) Error!void {
        var limreader = std.io.limitedReader(reader, 1);
        try self.b.deserialize(limreader.reader());
    }
};

test {
    var b: B = undefined;
    try b.deserialize(std.io.getStdIn().reader());
}
$ zig test /tmp/test.zig
Semantic Analysis [8411] reader... ^C

@Vexu
Copy link
Member

Vexu commented Dec 7, 2022

You could further reduce redundant generic instantiations by implementing base reader and writer types and then using @errSetCast and semantic inlining:

pub const AnyReader = struct {
    context: *anyopaque,
    readFn: *const fn (context: *anyopaque, buffer: []u8) anyerror!usize,

    pub fn readAll(impl: AnyReader, buffer: []u8) anyerror!usize {
        var index: usize = 0;
        while (index != buffer.len) {
            const amt = try impl.readFn(impl.context, buffer[index..]);
            if (amt == 0) return index;
            index += amt;
        }
        return index;
    }
};
pub fn Reader(comptime ReadError: type) type {
    return struct {
        pub const Error = ReadError;
        impl: AnyReader,

        const Self = @This();

        pub inline fn readAll(self: Self, buffer: []u8) Error!usize {
            return self.impl.readAll(buffer) catch |err| @errSetCast(Error, err);
        }
    };
}

This would be much nicer if @errSetCast worked on error unions.

@silversquirl
Copy link
Contributor

Also, with this, you can actually specify that your function takes a Reader:

Can you give some examples of actual code (from existing projects, or the standard library) that could use this? I can't think of any practical cases where the reader is generic but the error set is known.

And this will avoid generating two versions of the same function.

This is a good point and potentially an issue, but in general I'd prefer performance over code size (though the two are often related as you point out).
For the specific cases where it causes issues (like fmt) there may be other options - allocPrint specifically could use a custom writer that either counts or outputs depending on a runtime flag, which would prevent generating duplicate code.

Code size also affects runtime performance

Yes, however in this particular case I doubt it'll have an impact. Code using different types of readers is generally separated in time, so the code is probably already gone from cache next time it's needed anyway. It's much better to have good optimizations in very common tight loops than to optimize for a few cache misses in unusual code.

Optimizer are quite good at figuring out and inlining vtable calls

I'm interested to see how well LLVM can pick up on Zig's vtables since they're not classes and it potentially has less type info about them?

Either way all this performance discussion is purely hypothetical which is bad. We should get some actual numbers on this :)


wrapping readers/writers recursively

Struggling to see a real-world application for this. Generally you'd only want at max one of each type of reader/writer in the stack. Perhaps you could give an example of an actual problem that is helped by this usecase?

@silversquirl
Copy link
Contributor

silversquirl commented Dec 7, 2022

@Vexu that's cool! But still doesn't help when you don't know the error set, which is most of the time in my experience :)

EDIT: well, it helps with code duplication, but not with tooling issues like completions

@Hejsil
Copy link
Contributor Author

Hejsil commented Dec 7, 2022

Can you give some examples of actual code (from existing projects, or the standard library) that could use this? I can't think of any practical cases where the reader is generic but the error set is known.

Yea, this was mostly a "I made it up" kinda deal. I haven't seen any code that could use this, except maybe if they chose to use anyerror

@Vexu
Copy link
Member

Vexu commented Dec 7, 2022

I'm interested to see how well LLVM can pick up on Zig's vtables since they're not classes and it potentially has less type info about them?

Either way all this performance discussion is purely hypothetical which is bad. We should get some actual numbers on this :)

You can find benchmarks from when the Allocator interface was changed in the original pull request here #10055

Vexu that's cool! But still doesn't help when you don't know the error set, which is most of the time in my experience :)

If you don't care about knowing what errors exactly are produced you could make your function take an Any{Reader,Writer} making it non-generic and then use @errSetCast to get the actual possible set of errors at a point where it matters which would again be nicer if it worked on error unions.

// Turning functions like this:
fn func(stuff: Things, writer: anytype) @TypeOf(writer).Error!void
// into this:
fn func(stuff: Things, writer: AnyWriter) anyerror!void
// with the possibility of this
inline fn func(stuff: Things, writer: Writer(Err)) Err!void {
    return funcExtra(stuff, writer.base) catch |err| @errSetCast(err);
}
fn funcExtra(stuff: Things, writer: AnyWriter) anyerror!void

@silversquirl
Copy link
Contributor

You can find benchmarks from when the Allocator interface was changed

Not really sure how this helps, the allocator change was completely different than this one (fieldParentPtr -> *anyopaque, not comptime interface -> runtime)

@travisstaloch
Copy link
Contributor

travisstaloch commented Dec 7, 2022

wrapping readers/writers recursively

Struggling to see a real-world application for this. Generally you'd only want at max one of each type of reader/writer in the stack. Perhaps you could give an example of an actual problem that is helped by this usecase?

@silversquirl I encountered this issue while working on a protobuf library which often used wrapped readers/writers in its generated (de)serialization methods. Some co-recursive generated code wouldn't compile and couldn't figure out why until SpecsGuy pointed out the issue of infinite nested readers in the discord issue linked above.

see #13380

@Vexu
Copy link
Member

Vexu commented Dec 7, 2022

Not really sure how this helps, the allocator change was completely different than this one (fieldParentPtr -> *anyopaque, not comptime interface -> runtime)

It wasn't supposed to be about the performance changes this would have but about LLVM being able to pick up the vtable pattern.

@InKryption
Copy link
Contributor

Might actually be beneficial in other ways, e.g. code autocompletions, given some code like

fn writeStuff(comptime ErrSet: type, writer: std.io.Writer(ErrSet)) ErrSet!void {
    writer.writeAll("foo"); // <- ZLS could actually figure out code completion here, and namespace lookup.
}

Obviously not enough for this to be considered by itself, but it's certainly worth noting.

@Hejsil
Copy link
Contributor Author

Hejsil commented Dec 7, 2022

One thing I just remembered, which might be a really good reason not to do this, is async code. Using function pointers, there is not much way for the compiler to infer frame sizes and all the fun stuff. Bummer

@Jarred-Sumner
Copy link
Contributor

the type names & error messages involving Reader/Writer get really hard to read currently
image

@InKryption
Copy link
Contributor

Alternatively we could provide an "erased reader" which basically does this but using status quo, e.g.

fn ErasedReader(comptime ErrSet: type) type {
    return struct {
        const Self = @This();
        ptr: *anyopaque,
        readFn: *const fn(*anyopaque, []u8) ErrSet!usize,

        pub fn init(inner: anytype) Self {
            const alignment = @alignOf(@TypeOf(inner));
            const gen = struct {
                fn read(ptr: *anyopaque, bytes: []u8) ErrSet!usize {
                    const casted = @ptrCast(@TypeOf(inner), @alignCast(alignment, ptr));
                    return casted.read(bytes);
                }
            };

            return .{
                .ptr = inner,
                .readFn = gen.read,
            };
        }
        pub const Reader = std.io.Reader(Self, ErrSet, read);
        pub fn reader(self: Self) Reader {
            return .{ .context = self };
        }

        fn read(self: Self, bytes: []u8) ErrSet!usize {
            return self.readFn(self.ptr, bytes);
        }
    };
}

@Hejsil
Copy link
Contributor Author

Hejsil commented Dec 8, 2022

I guess that is an option, but I have my doubts that it will be used if it is something opt in. If we only care about making the type names nicer for Reader and Writer without dynamic dispatch, then we could just erase the child type, but not the readFn/writeFn

pub fn Reader(comptime ReadError: type, comptime readFn: fn(*anyopaque, []u8) ReadError!usize) type {
    return struct {
        pub const Error = ReadError;

        context: *anyopaque,
    };
}

This will make it a minor implementer pain, but in exchange, the type names will not include the child Reader/Writer. I think this also solves the nested Reader/Writer in recursive problem.

@Hejsil
Copy link
Contributor Author

Hejsil commented Dec 8, 2022

Yea, this doesn't seem to be the way. Maybe an ErasedReader could have its uses but that is a separate thing. That async does not work with this makes the PR fall apart. Closing

@andrewrk
Copy link
Member

andrewrk commented Oct 3, 2023

To folks who are interested in this thread, please have a look at #17344 and, if it piques your interest, take it for a spin and let me know what you think.

  • Does it affect bloat of your project?
  • Does it affect performance of your project?
  • Does the new, non-generic std.io.AnyReader API help you solve a problem?

I also didn't implement this for Writer yet, so far it is only Reader.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants