- Start Date: 2014-12-07
- RFC PR: rust-lang/rfcs#517
- Rust Issue: rust-lang/rust#21070
This RFC proposes a significant redesign of the std::io and std::os modules in preparation for API stabilization. The specific problems addressed by the redesign are given in the Problems section below, and the key ideas of the design are given in Vision for IO.
This RFC was originally posted as a single monolithic file, which made it difficult to discuss different parts separately.
It has now been split into a skeleton that covers (1) the problem statement, (2) the overall vision and organization, and (3) the std::os module.
Other parts of the RFC are marked with (stub) and will be filed as follow-up PRs against this RFC.
- Summary
- Table of contents
- Problems
- Detailed design
- Vision for IO
- Revising Reader and Writer
- String handling (stub)
- Deadlines (stub)
- Splitting streams and cancellation (stub)
- Modules
- core::io
- Adapters
- Free functions
- [Void]
- Seeking
- Buffering
- Cursor
- The std::io facade
- std::env (stub)
- std::fs (stub)
- std::net (stub)
- std::process (stub)
- std::os
- Odds and ends
- Drawbacks
- Alternatives
- Unresolved questions
The io and os modules are the last large API surfaces of std that need to be stabilized. While the basic functionality offered in these modules is largely traditional, many problems with the APIs have emerged over time. The RFC discusses the most significant problems below.
This section only covers specific problems with the current library; see Vision for IO for a higher-level view.
One of the most pressing -- but also most subtle -- problems with std::io is the lack of atomicity in its Reader and Writer traits.
For example, the Reader trait offers a read_to_end method:
fn read_to_end(&mut self) -> IoResult<Vec<u8>>
Executing this method may involve many calls to the underlying read method. It is possible that the first several calls succeed and then a call returns an Err -- which, like TimedOut, could represent a transient problem. Unfortunately, given the above signature, there is no choice but to throw away the data read so far.
The Writer trait suffers from a more fundamental problem, since its primary method, write, may actually involve several calls to the underlying system -- and if a failure occurs, there is no indication of how much was written.
Existing blocking APIs all have to deal with this problem, and Rust can and should follow the existing tradition here. See Revising Reader and Writer for the proposed solution.
The std::io module supports "timeouts" on virtually all IO objects via a set_timeout method. In this design, every IO object (file, socket, etc.) has an optional timeout associated with it, and set_timeout mutates the associated timeout. All subsequent blocking operations are implicitly subject to this timeout.
This API choice suffers from two problems, one cosmetic and the other deeper:
- The "timeout" is actually a deadline and should be named accordingly.
- The stateful API has poor composability: when passing a mutable reference of an IO object to another function, it's possible that the deadline has been changed. In other words, users of the API can easily interfere with each other by accident.
See Deadlines for the proposed solution.
The current io and os modules were originally designed when librustuv was providing IO support, and to some extent they reflect the capabilities and conventions of libuv -- which in turn are loosely based on Posix.
As such, the modules are not always ideal from a cross-platform standpoint, both in terms of forcing Windows programming into a Posix mold, and also of offering APIs that are not actually usable on all platforms.
The modules have historically also provided no platform-specific APIs.
Part of the goal of this RFC is to set out a clear and extensible story for both cross-platform and platform-specific APIs in std. See Design principles for the details.
Rust has followed the UTF-8 everywhere approach to its strings. However, at the borders to platform APIs, it is revealed that the world is not, in fact, UTF-8 (or even Unicode) everywhere.
Currently our story for platform APIs is that we either assume they can take or return Unicode strings (suitably encoded) or an uninterpreted byte sequence. Sadly, this approach does not actually cover all platform needs, and is also not highly ergonomic as presently implemented. (Consider os::getenv, which introduces replacement characters (!), versus os::getenv_as_bytes, which yields a Vec<u8>; neither is ideal.)
This topic was covered in some detail in the Path Reform RFC, but this RFC gives a more general account in String handling.
The stdio module provides access to readers/writers for stdin, stdout and stderr, which is essential functionality. However, it also provides a means of changing e.g. "stdout" -- but there is no connection between these two! In particular, set_stdout affects only the writer that println! and friends use, while set_stderr affects panic!.
This module needs to be clarified. See The std::io facade and [Functionality moved elsewhere] for the detailed design.
There are a few places where io provides high-level abstractions over system services without also providing more direct access to the service as-is. For example:
- The Writer trait's write method -- a cornerstone of IO -- actually corresponds to an unbounded number of invocations of writes to the underlying IO object. This RFC changes write to follow more standard, lower-level practice; see Revising Reader and Writer.
- Objects like TcpStream are Clone, which involves a fair amount of supporting infrastructure. This RFC tackles the problems that Clone was trying to solve more directly; see Splitting streams and cancellation.
The motivation for going lower-level is described in Design principles below.
The std::io module is somewhat unusual in that most of the functionality it provides is used through a few key traits (like Reader) and these traits are in turn "lifted" over IoResult:
impl<R: Reader> Reader for IoResult<R> { ... }
This lifting and others make it possible to chain IO operations that might produce errors, without any explicit mention of error handling:
File::open(some_path).read_to_end()
                      ^~~~~~~~~~~ can produce an error
      ^~~~ can produce an error
The result of such a chain is either Ok of the outcome, or Err of the first error.
While this pattern is highly ergonomic, it does not fit particularly well into our evolving error story (interoperation or try blocks), and it is the only module in std to follow this pattern.
Eventually, we would like to write
File::open(some_path)?.read_to_end()
to take advantage of the FromError infrastructure, hook into error handling control flow, and to provide good chaining ergonomics throughout all Rust APIs -- all while keeping this handling a bit more explicit via the ? operator. (See rust-lang#243 for the rough direction).
In the meantime, this RFC proposes to phase out the use of impls for IoResult. This will require use of try! for the time being.
(Note: this may put some additional pressure on at least landing the basic use of ? instead of today's try! before 1.0 final.)
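As an illustration of the explicit style, here is a minimal sketch written against current stable Rust, where the ? operator is available. The helper name read_file is hypothetical, and the code uses today's std::fs and std::io rather than the 2014 API discussed above.

```rust
use std::fs::File;
use std::io::{self, Read};

/// Hypothetical helper: reads a whole file, surfacing the first error
/// from either step instead of relying on impls lifted over the result type.
fn read_file(path: &str) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    // Both `open` and `read_to_end` can fail; each `?` returns early on error.
    File::open(path)?.read_to_end(&mut buf)?;
    Ok(buf)
}
```

Each fallible step is marked at the call site, so the chain stays short while the error path remains visible.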
There's a lot of material here, so the RFC starts with high-level goals, principles, and organization, and then works its way through the various modules involved.
Rust's IO story has undergone significant evolution, starting from a libuv-style pure green-threaded model to a dual green/native model and now to a pure native model. Given that history, it's worthwhile to set out explicitly what is, and is not, in scope for std::io.
For Rust 1.0, the aim is to:
- Provide a blocking API based directly on the services provided by the native OS for native threads.
  These APIs should cover the basics (files, basic networking, basic process management, etc) and suffice to write servers following the classic Apache thread-per-connection model. They should impose essentially zero cost over the underlying OS services; the core APIs should map down to a single syscall unless more are needed for cross-platform compatibility.
- Provide basic blocking abstractions and building blocks (various stream and buffer types and adapters) based on traditional blocking IO models but adapted to fit well within Rust.
- Provide hooks for integrating with low-level and/or platform-specific APIs.
- Ensure reasonable forwards-compatibility with future async IO models.
It is explicitly not a goal at this time to support asynchronous programming models or nonblocking IO, nor is it a goal for the blocking APIs to eventually be used in a nonblocking "mode" or style.
Rather, the hope is that the basic abstractions of files, paths, sockets, and so on will eventually be usable directly within an async IO programming model and/or with nonblocking APIs. This is the case for most existing languages, which offer multiple interoperating IO models.
The long term intent is certainly to support async IO in some form, but doing so will require new research and experimentation.
Now that the scope has been clarified, it's important to lay out some broad principles for the io and os modules. Many of these principles are already being followed to some extent, but this RFC makes them more explicit and applies them more uniformly.
Historically, Rust's std has always been "cross-platform", but as discussed in Posix and libuv bias this hasn't always played out perfectly. The proposed policy is below. With these policies, the APIs should largely feel like part of "Rust" rather than part of any legacy, and they should enable truly portable code.
Except for an explicit opt-in (see Platform-specific opt-in below), all APIs in std should be cross-platform:
- The APIs should only expose a service or a configuration if it is supported on all platforms, and if the semantics on those platforms is or can be made loosely equivalent. (The latter requires exercising some judgment). Platform-specific functionality can be handled separately (Platform-specific opt-in) and interoperate with normal std abstractions.
  This policy rules out functions like chown which have a clear meaning on Unix and no clear interpretation on Windows; the ownership and permissions models are very different.
- The APIs should follow Rust's conventions, including their naming, which should be platform-neutral.
  This policy rules out names like fstat that are the legacy of a particular platform family.
- The APIs should never directly expose the representation of underlying platform types, even if they happen to coincide on the currently-supported platforms. Cross-platform types in std should be newtyped.
  This policy rules out exposing e.g. error numbers directly as an integer type.
The next subsection gives detail on what these APIs should look like in relation to system services.
How should Rust APIs map into system services? This question breaks down along several axes which are in tension with one another:
- Guarantees. The APIs provided in the mainline io modules should be predominantly safe, aside from the occasional unsafe function. In particular, the representation should be sufficiently hidden that most use cases are safe by construction. Beyond memory safety, though, the APIs should strive to provide a clear multithreaded semantics (using the Send/Sync kinds), and should use Rust's type system to rule out various kinds of bugs when it is reasonably ergonomic to do so (following the usual Rust conventions).
- Ergonomics. The APIs should present a Rust view of things, making use of the trait system, newtypes, and so on to make system services fit well with the rest of Rust.
- Abstraction/cost. On the other hand, the abstractions introduced in std must not induce significant costs over the system services -- or at least, there must be a way to safely access the services directly without incurring this penalty. When useful abstractions would impose an extra cost, they must be pay-as-you-go.
Putting the above bullets together, the abstractions must be safe, and they should be as high-level as possible without imposing a tax.
- Coverage. Finally, the std APIs should over time strive for full coverage of non-niche, cross-platform capabilities.
Rust is a systems language, and as such it should expose seamless, no/low-cost access to system services. In many cases, however, this cannot be done in a cross-platform way, either because a given service is only available on some platforms, or because providing a cross-platform abstraction over it would be costly.
This RFC proposes platform-specific opt-in: submodules of os that are named by platform, and made available via #[cfg] switches. For example, os::unix can provide APIs only available on Unix systems, and os::linux can drill further down into Linux-only APIs. (You could even imagine subdividing by OS versions.) This is "opt-in" in the sense that, like the unsafe keyword, it is very easy to audit for potential platform-specificity: just search for os::anyplatform. Moreover, by separating out subsets like linux, it's clear exactly how specific the platform dependency is.
The APIs in these submodules are intended to have the same flavor as other io APIs and should interoperate seamlessly with cross-platform types, but:
- They should be named according to the underlying system services when there is a close correspondence.
- They may reveal the underlying OS type if there is nothing to be gained by hiding it behind an abstraction.
For example, the os::unix module could provide a stat function that takes a standard Path and yields a custom struct. More interestingly, os::linux might include an epoll function that could operate directly on many io types (e.g. various socket types), without any explicit conversion to a file descriptor; that's what "seamless" means.
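The opt-in can be sketched with the extension-trait style that today's standard library uses in std::os::unix. The function name raw_fd_of_stdin is hypothetical; AsRawFd is the existing trait in std::os::unix::io, and the sketch is only an illustration of the auditing property, not the design this RFC leaves open.

```rust
/// Hypothetical helper, Unix variant: the `std::os::unix` import is the
/// visible, searchable marker that this code is platform-specific.
#[cfg(unix)]
fn raw_fd_of_stdin() -> Option<i32> {
    use std::os::unix::io::AsRawFd;
    // The extension trait works directly on a cross-platform std type.
    Some(std::io::stdin().as_raw_fd())
}

/// Fallback for every other platform: no file descriptor concept is exposed.
#[cfg(not(unix))]
fn raw_fd_of_stdin() -> Option<i32> {
    None
}
```

Searching a code base for os::unix finds every such dependency, which is the auditing property described above.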
Each of the platform modules will offer a custom prelude submodule, intended for glob import, that includes all of the extension traits applied to standard IO objects.
The precise design of these modules is in the very early stages and will likely remain #[unstable] for some time.
The io module is currently the biggest in std, with an entire hierarchy nested underneath; it mixes general abstractions/tools with specific IO objects. The os module is currently a bit of a dumping ground for facilities that don't fit into the io category.
This RFC proposes to revamp the organization by flattening out the hierarchy and clarifying the role of each module:
std
  env           environment manipulation
  fs            file system
  io            core io abstractions/adapters
    prelude     the io prelude
  net           networking
  os
    unix        platform-specific APIs
    linux       ..
    windows     ..
  os_str        platform-sensitive string handling
  process       process management
In particular:
- The contents of os will largely move to env, a new module for inspecting and updating the "environment" (including environment variables, CPU counts, arguments to main, and so on).
- The io module will include things like Reader and BufferedWriter -- cross-cutting abstractions that are needed throughout IO.
  The prelude submodule will export all of the traits and most of the types for IO-related APIs; a single glob import should suffice to set you up for working with IO. (Note: this goes hand-in-hand with removing the bits of io currently in the prelude, as recently proposed.)
- The root os module is used purely to house the platform submodules discussed above.
- The os_str module is part of the solution to the Unicode problem; see String handling below.
- The process module over time will grow to include querying/manipulating already-running processes, not just spawning them.
The Reader and Writer traits are the backbone of IO, representing the ability to (respectively) pull bytes from and push bytes to an IO object. The core operations provided by these traits follow a very long tradition for blocking IO, but they are still surprisingly subtle -- and they need to be revised.
- Atomicity and data loss. As discussed above, the Reader and Writer traits currently expose methods that involve multiple actual reads or writes, and data is lost when an error occurs after some (but not all) operations have completed.
  The proposed strategy for Reader operations is to (1) separate out various deserialization methods into a distinct framework, (2) never have the internal read implementations loop on errors, (3) cut down on the number of non-atomic read operations and (4) adjust the remaining operations to provide more flexibility when possible.
  For writers, the main change is to make write only perform a single underlying write (returning the number of bytes written on success), and provide a separate write_all method.
- Parsing/serialization. The Reader and Writer traits currently provide a large number of default methods for (de)serialization of various integer types to bytes with a given endianness. Unfortunately, these operations pose atomicity problems as well (e.g., a read could fail after reading two of the bytes needed for a u32 value).
  Rather than complicate the signatures of these methods, the (de)serialization infrastructure is removed entirely -- in favor of instead eventually introducing a much richer parsing/formatting/(de)serialization framework that works seamlessly with Reader and Writer.
  Such a framework is out of scope for this RFC, but the endian-sensitive functionality will be provided elsewhere (likely out of tree).
With those general points out of the way, let's look at the details.
The updated Reader trait (and its extension) is as follows:
trait Read {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize, Error>;
    fn read_to_end(&mut self, buf: &mut Vec<u8>) -> Result<(), Error> { ... }
    fn read_to_string(&mut self, buf: &mut String) -> Result<(), Error> { ... }
}
// extension trait needed for object safety
trait ReadExt: Read {
    fn bytes(&mut self) -> Bytes<Self> { ... }
    ... // more to come later in the RFC
}
impl<R: Read> ReadExt for R {}
Following the trait naming conventions, the trait is renamed to Read reflecting the clear primary method it provides.
The read method should not involve internal looping (even over errors like EINTR). It is intended to faithfully represent a single call to an underlying system API.
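The division of labour can be sketched as follows, using today's std::io, where an interrupted call surfaces as ErrorKind::Interrupted. FlakyReader and read_retrying are hypothetical names invented for this illustration; the reader simulates one interrupted call so that the retry policy is visibly the caller's choice.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical reader: fails once with `Interrupted`, then yields its data.
struct FlakyReader<'a> {
    data: &'a [u8],
    interrupted: bool,
}

impl<'a> Read for FlakyReader<'a> {
    // A single call, no internal looping: the interruption is reported as-is.
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if !self.interrupted {
            self.interrupted = true;
            return Err(io::Error::new(ErrorKind::Interrupted, "simulated EINTR"));
        }
        self.data.read(buf) // `&[u8]` implements `Read` and advances the slice
    }
}

/// Hypothetical helper: one logical read that retries on interruption.
fn read_retrying<R: Read>(r: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    loop {
        match r.read(buf) {
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            other => return other,
        }
    }
}
```

Code that uses interruption deliberately calls read directly and sees the error; code that does not care wraps the call in a loop like read_retrying.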
The read_to_end and read_to_string methods now take explicit buffers as input. This has multiple benefits:
- Performance. When it is known that reading will involve some large number of bytes, the buffer can be preallocated in advance.
- "Atomicity" concerns. For read_to_end, it's possible to use this API to retain data collected so far even when a read fails in the middle. For read_to_string, this is not the case, because UTF-8 validity cannot be ensured in such cases; but if intermediate results are wanted, one can use read_to_end and convert to a String only at the end.
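Both benefits can be seen in a small sketch against today's std::io, whose read_to_end also appends to a caller-supplied Vec. FailsAfter and read_keeping_partial are hypothetical names; the reader delivers its data and then fails, standing in for a transient error such as a timeout.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical reader: yields `data`, then fails on every later call.
struct FailsAfter<'a> {
    data: &'a [u8],
}

impl<'a> Read for FailsAfter<'a> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.data.is_empty() {
            return Err(io::Error::new(ErrorKind::TimedOut, "simulated timeout"));
        }
        self.data.read(buf)
    }
}

/// Hypothetical helper: reports whether the read failed, together with
/// whatever bytes arrived before the failure.
fn read_keeping_partial<R: Read>(r: &mut R) -> (bool, Vec<u8>) {
    let mut buf = Vec::with_capacity(64); // the caller chooses the preallocation
    let failed = r.read_to_end(&mut buf).is_err();
    (failed, buf) // the buffer outlives the error, so nothing is thrown away
}
```

With the old signature the Vec lived inside the method and was dropped on error; here the caller still owns it.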
Convenience methods like these will retry on EINTR. This is partly under the assumption that in practice, EINTR will most often arise when interfacing with other code that changes a signal handler. Due to the global nature of these interactions, such a change can suddenly cause your own code to get an error irrelevant to it, and the code should probably just retry in those cases. In the case where you are using EINTR explicitly, read and write will be available to handle it (and you can always build your own abstractions on top).
The proposed Read trait is much slimmer than today's Reader. The vast majority of removed methods are parsing/deserialization, which were discussed above.
The remaining methods (read_exact, read_at_least, push, push_at_least) were removed for various reasons:
- read_exact, read_at_least: these are somewhat more obscure conveniences that are not particularly robust due to lack of atomicity.
- push, push_at_least: these are special-cases for working with Vec, which this RFC proposes to replace with a more general mechanism described next.
To provide some of this functionality in a more compositional way, extend Vec<T> with an unsafe method:
unsafe fn with_extra(&mut self, n: usize) -> &mut [T];
This method is equivalent to calling reserve(n) and then providing a slice to the memory starting just after len() entries. Using this method, clients of Read can easily recover the push method.
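Since with_extra is only proposed here and is not part of the standard library, the following sketch recovers push with safe code instead: it zero-fills the extra tail before reading into it, which costs an initialisation pass that the unsafe method would avoid. The function name push mirrors the removed method but is otherwise a hypothetical helper written against today's std::io.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for the removed `push`: reads up to `n` more bytes
/// from `r` and appends them to `buf`, returning how many were appended.
fn push<R: Read>(r: &mut R, n: usize, buf: &mut Vec<u8>) -> io::Result<usize> {
    let start = buf.len();
    // Safe substitute for `with_extra(n)`: grow the vector with zeroed bytes.
    buf.resize(start + n, 0);
    let result = r.read(&mut buf[start..]);
    // Keep only the bytes actually read (none at all on error).
    let read = result.as_ref().map(|&k| k).unwrap_or(0);
    buf.truncate(start + read);
    result
}
```

On error the vector is restored to its original length, so earlier contents are never disturbed.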
The Writer trait is cut down to even smaller size:
trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize, Error>;
    fn flush(&mut self) -> Result<(), Error>;
    fn write_all(&mut self, buf: &[u8]) -> Result<(), Error> { .. }
    fn write_fmt(&mut self, fmt: &fmt::Arguments) -> Result<(), Error> { .. }
}
The biggest change here is to the semantics of write. Instead of repeatedly writing to the underlying IO object until all of buf is written, it attempts a single write and on success returns the number of bytes written. This follows the long tradition of blocking IO, and is a more fundamental building block than the looping write we currently have. Like read, it will propagate EINTR.
For convenience, write_all recovers the behavior of today's write, looping until either the entire buffer is written or an error occurs. To meaningfully recover from an intermediate error and keep writing, code should work with write directly. Like the Read conveniences, EINTR results in a retry.
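The difference between the two methods shows up with a writer that accepts only part of each buffer. ShortWriter is a hypothetical type for this sketch, implemented against today's std::io::Write, whose write and write_all follow the semantics described above.

```rust
use std::io::{self, Write};

/// Hypothetical sink that accepts at most `max` bytes per `write` call,
/// much as a socket or pipe may do.
struct ShortWriter {
    out: Vec<u8>,
    max: usize,
}

impl Write for ShortWriter {
    // A single underlying write: report how much was actually accepted.
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = buf.len().min(self.max);
        self.out.extend_from_slice(&buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

Calling write on such a writer returns a short count and leaves the rest to the caller, while the provided write_all keeps calling write until the whole buffer has been accepted.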
The write_fmt method, like write_all, will loop until its entire input is written or an error occurs.
The other methods include endian conversions (covered by serialization) and a few conveniences like write_str for other basic types. The latter, at least, is already uniformly (and extensibly) covered via the write! macro. The other helpers, as with Read, should migrate into a more general (de)serialization library.
To be added in a follow-up PR.
To be added in a follow-up PR.
To be added in a follow-up PR.
Now that we've covered the core principles and techniques used throughout IO, we can go on to explore the modules in detail.
Ideally, the io module will be split into the parts that can live in libcore (most of it) and the parts that are added in the std::io facade. This part of the organization is non-normative, since it requires changes to today's IoError (which currently references String); if these changes cannot be performed, everything here will live in std::io.
The current std::io::util module offers a number of Reader and Writer "adapters". This RFC refactors the design to more closely follow std::iter. Along the way, it generalizes the by_ref adapter:
trait ReadExt: Read {
    // ... eliding the methods already described above
    // Postfix version of `(&mut self)`
    fn by_ref(&mut self) -> &mut Self { ... }
    // Read everything from `self`, then read from `next`
    fn chain<R: Read>(self, next: R) -> Chain<Self, R> { ... }
    // Adapt `self` to yield only the first `limit` bytes
    fn take(self, limit: u64) -> Take<Self> { ... }
    // Whenever reading from `self`, push the bytes read to `out`
    #[unstable] // uncertain semantics of errors "halfway through the operation"
    fn tee<W: Write>(self, out: W) -> Tee<Self, W> { ... }
}
trait WriteExt: Write {
    // Postfix version of `(&mut self)`
    fn by_ref<'a>(&'a mut self) -> &mut Self { ... }
    // Whenever bytes are written to `self`, write them to `other` as well
    #[unstable] // uncertain semantics of errors "halfway through the operation"
    fn broadcast<W: Write>(self, other: W) -> Broadcast<Self, W> { ... }
}
// An adaptor converting an `Iterator<u8>` to `Read`.
pub struct IterReader<T> { ... }
As with std::iter, these adapters are object unsafe and hence placed in an extension trait with a blanket impl.
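A short sketch of how by_ref, take and chain compose, written against today's std::io, where these adapters are provided methods on Read itself rather than on a separate extension trait. The function split_header and its 4-byte header layout are hypothetical.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads a 4-byte header, then everything that is left
/// in `input` followed by `trailer`.
fn split_header(mut input: &[u8], trailer: &[u8]) -> io::Result<(Vec<u8>, Vec<u8>)> {
    let mut header = Vec::new();
    // `by_ref` lends the reader, so `take` does not consume `input`.
    input.by_ref().take(4).read_to_end(&mut header)?;

    let mut rest = Vec::new();
    // `chain` drains what remains of `input`, then continues with `trailer`.
    input.chain(trailer).read_to_end(&mut rest)?;
    Ok((header, rest))
}
```

As with iterator adapters, each step wraps a reader in another reader, and by_ref is what lets the original reader be reused after a consuming adapter.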
The current std::io::util module also includes a number of primitive readers and writers, as well as copy. These are updated as follows:
// A reader that yields no bytes
fn empty() -> Empty; // in theory just returns `impl Read`
impl Read for Empty { ... }
// A reader that yields `byte` repeatedly (generalizes today's ZeroReader)
fn repeat(byte: u8) -> Repeat;
impl Read for Repeat { ... }
// A writer that ignores the bytes written to it (/dev/null)
fn sink() -> Sink;
impl Write for Sink { ... }
// Copies all data from a `Read` to a `Write`, returning the amount of data
// copied.
pub fn copy<R, W>(r: &mut R, w: &mut W) -> Result<u64, Error>
Like write_all, the copy function will discard the amount of data already written on any error and also discard any partially read data on a write error. This function is intended to be a convenience and write should be used directly if this is not desirable.
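The free functions can be combined as in the following sketch, which uses the functions of the same names in today's std::io. The wrapper name demo is hypothetical.

```rust
use std::io::{self, Read};

/// Hypothetical demo: fills a Vec with `n` copies of `byte`, and separately
/// copies an empty reader into a sink.
fn demo(byte: u8, n: u64) -> io::Result<(Vec<u8>, u64)> {
    let mut out = Vec::new();
    // `repeat` never ends, so bound it with `take` before copying.
    io::copy(&mut io::repeat(byte).take(n), &mut out)?;
    // Copying from `empty` into `sink` moves zero bytes.
    let none = io::copy(&mut io::empty(), &mut io::sink())?;
    Ok((out, none))
}
```

The count returned by copy is only available on success, which is the limitation noted above.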
The seeking infrastructure is largely the same as today's, except that tell is removed and the seek signature is refactored with more precise types:
pub trait Seek {
    // returns the new position after seeking
    fn seek(&mut self, pos: SeekFrom) -> Result<u64, Error>;
}
pub enum SeekFrom {
    Start(u64),
    End(i64),
    Current(i64),
}
The old tell function can be regained via seek(SeekFrom::Current(0)).
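A minimal sketch of that recovery, using today's std::io::Seek and an in-memory Cursor as the seekable object. The helper names tell and positions are hypothetical.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};

/// Hypothetical helper: the old `tell`, expressed as a zero-length relative seek.
fn tell<S: Seek>(s: &mut S) -> io::Result<u64> {
    s.seek(SeekFrom::Current(0))
}

/// Hypothetical demo: positions at the start, after reading two bytes, and
/// after seeking to one byte before the end.
fn positions(data: Vec<u8>) -> io::Result<(u64, u64, u64)> {
    let mut cur = Cursor::new(data);
    let start = tell(&mut cur)?;
    let mut two = [0u8; 2];
    cur.read_exact(&mut two)?;
    let after_read = tell(&mut cur)?;
    // `End` takes a signed offset; the result is an absolute, unsigned position.
    let near_end = cur.seek(SeekFrom::End(-1))?;
    Ok((start, after_read, near_end))
}
```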
The current Buffer trait will be renamed to BufRead for clarity (and to open the door to BufWrite at some later point):
pub trait BufRead: Read {
    fn fill_buf(&mut self) -> Result<&[u8], Error>;
    fn consume(&mut self, amt: usize);
    fn read_until(&mut self, byte: u8, buf: &mut Vec<u8>) -> Result<(), Error> { ... }
    fn read_line(&mut self, buf: &mut String) -> Result<(), Error> { ... }
}
pub trait BufReadExt: BufRead {
    // Split is an iterator over Result<Vec<u8>, Error>
    fn split(&mut self, byte: u8) -> Split<Self> { ... }
    // Lines is an iterator over Result<String, Error>
    fn lines(&mut self) -> Lines<Self> { ... }
    // Chars is an iterator over Result<char, Error>
    fn chars(&mut self) -> Chars<Self> { ... }
}
The read_until and read_line methods are changed to take explicit, mutable buffers, for similar reasons to read_to_end. (Note that buffer reuse is particularly common for read_line). These functions include the delimiters in the strings they produce, both for easy cross-platform compatibility (in the case of read_line) and for ease in copying data without loss (in particular, distinguishing whether the last line included a final delimiter).
The split and lines methods provide iterator-based versions of read_until and read_line, and do not include the delimiter in their output. This matches conventions elsewhere (like split on strings) and is usually what you want when working with iterators.
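The two conventions can be compared side by side. This sketch is written against today's std::io::BufRead, where read_line returns the number of bytes read rather than the unit result shown in the signature above, and where lines takes the reader by value; both differences are assumptions relative to this RFC's text. The function name both_ways is hypothetical.

```rust
use std::io::{self, BufRead};

/// Hypothetical demo: collects the lines of `text` twice, once with
/// `read_line` (delimiters kept) and once with `lines` (delimiters stripped).
fn both_ways(text: &str) -> io::Result<(Vec<String>, Vec<String>)> {
    let mut kept = Vec::new();
    let mut reader = text.as_bytes(); // `&[u8]` implements `BufRead`
    let mut line = String::new(); // one buffer, reused for every line
    while reader.read_line(&mut line)? > 0 {
        kept.push(line.clone());
        line.clear();
    }
    let stripped = text.as_bytes().lines().collect::<io::Result<Vec<_>>>()?;
    Ok((kept, stripped))
}
```

For the input "a\nb", the read_line results show that the last line had no trailing newline, while the lines results do not carry that information.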
The BufReader, BufWriter and BufStream types stay essentially as they are today, except that for streams and writers the into_inner method yields the structure back in the case of a flush error:
// If flushing fails, you get the unflushed data back
fn into_inner(self) -> Result<W, IntoInnerError<Self>>;
pub struct IntoInnerError<W>(W, Error);
impl<W> IntoInnerError<W> {
    pub fn error(&self) -> &Error { ... }
    pub fn into_inner(self) -> W { ... }
}
impl<W> FromError<IntoInnerError<W>> for Error { ... }
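The recovery path can be sketched with today's std::io::BufWriter, whose into_inner returns an IntoInnerError with the same two accessors. Broken and unwrap_buffered are hypothetical names; the writer fails every write so that the flush inside into_inner is guaranteed to fail.

```rust
use std::io::{self, BufWriter, ErrorKind, Write};

/// Hypothetical writer whose every write fails, to force a flush error.
struct Broken;

impl Write for Broken {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(ErrorKind::Other, "simulated failure"))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Hypothetical demo: buffers a small write, then reports the error kind
/// seen when trying to unwrap the buffered writer.
fn unwrap_buffered() -> Option<ErrorKind> {
    let mut w = BufWriter::new(Broken);
    w.write_all(b"pending").unwrap(); // fits in the buffer, so nothing fails yet
    match w.into_inner() {
        Ok(_) => None,
        Err(e) => {
            let kind = e.error().kind();
            // The error hands the buffered writer back, unflushed data included.
            let _recovered: BufWriter<Broken> = e.into_inner();
            Some(kind)
        }
    }
}
```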
Many applications want to view in-memory data as either an implementor of Read or Write. This is often useful when composing streams or creating test cases. This functionality primarily comes from the following implementations:
impl<'a> Read for &'a [u8] { ... }
impl<'a> Write for &'a mut [u8] { ... }
impl Write for Vec<u8> { ... }
While efficient, none of these implementations support seeking (via an implementation of the Seek trait). The implementations of Read and Write for these types are not quite as efficient when Seek needs to be used, so the Seek-ability will be opted-in to with a new Cursor structure with the following API:
pub struct Cursor<T> {
    pos: u64,
    inner: T,
}
impl<T> Cursor<T> {
    pub fn new(inner: T) -> Cursor<T>;
    pub fn into_inner(self) -> T;
    pub fn get_ref(&self) -> &T;
}
// Error indicating that a negative offset was seeked to.
pub struct NegativeOffset;
impl Seek for Cursor<Vec<u8>> { ... }
impl<'a> Seek for Cursor<&'a [u8]> { ... }
impl<'a> Seek for Cursor<&'a mut [u8]> { ... }
impl Read for Cursor<Vec<u8>> { ... }
impl<'a> Read for Cursor<&'a [u8]> { ... }
impl<'a> Read for Cursor<&'a mut [u8]> { ... }
impl BufRead for Cursor<Vec<u8>> { ... }
impl<'a> BufRead for Cursor<&'a [u8]> { ... }
impl<'a> BufRead for Cursor<&'a mut [u8]> { ... }
impl<'a> Write for Cursor<&'a mut [u8]> { ... }
impl Write for Cursor<Vec<u8>> { ... }
A sample implementation can be found in a gist. Using one Cursor structure emphasizes that the only ability added is an implementation of Seek while still allowing all possible I/O operations for various types of buffers.
It is not currently proposed to unify these implementations via a trait. For example a Cursor<Rc<[u8]>> is a reasonable instance to have, but it will not have an implementation listed in the standard library to start out. It is considered a backwards-compatible addition to unify these various impl blocks with a trait.
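A brief sketch of the combined Read, Write and Seek behaviour, using the Cursor type that exists in today's std::io. The function name round_trip is hypothetical.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom, Write};

/// Hypothetical demo: writes into an in-memory buffer, rewinds, overwrites
/// a prefix in place, rewinds again, and reads everything back.
fn round_trip() -> io::Result<String> {
    let mut cur = Cursor::new(Vec::new());
    cur.write_all(b"hello world")?;
    cur.seek(SeekFrom::Start(0))?; // the ability that `Cursor` adds
    cur.write_all(b"HELLO")?; // overwrites the first five bytes
    cur.seek(SeekFrom::Start(0))?;
    let mut s = String::new();
    cur.read_to_string(&mut s)?;
    Ok(s)
}
```

A plain Vec<u8> could only append here; the overwrite and the read-back both depend on the position tracked by the cursor.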
The following types will be removed from the standard library and replaced as follows:
- MemReader -> Cursor<Vec<u8>>
- MemWriter -> Cursor<Vec<u8>>
- BufReader -> Cursor<&[u8]> or Cursor<&mut [u8]>
- BufWriter -> Cursor<&mut [u8]>
The std::io module will largely be a facade over core::io, but it will add some functionality that can live only in std.
The IoError type will be renamed to std::io::Error, following our non-prefixing convention. It will remain largely as it is today, but its fields will be made private. It may eventually grow a field to track the underlying OS error code.
The std::io::IoErrorKind type will become std::io::ErrorKind, and ShortWrite will be dropped (it is no longer needed with the new Write semantics), which should decrease its footprint. The OtherIoError variant will become Other now that enums are namespaced. Other variants may be added over time, such as Interrupted, as more errors are classified from the system.
The EndOfFile variant will be removed in favor of returning Ok(0) from read on end of file (or write on an empty slice for example). This approach clarifies the meaning of the return value of read, matches Posix APIs, and makes it easier to use try! in the case that a "real" error should be bubbled out. (The main downside is that higher-level operations that might use Result<T, IoError> with some T != usize may need to wrap IoError in a further enum if they wish to forward unexpected EOF.)
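The resulting read loop can be sketched as follows against today's std::io, which follows the same Ok(0) convention. The function name count_bytes is hypothetical, and ? plays the role that try! has in the text above.

```rust
use std::io::{self, Read};

/// Hypothetical helper: counts the bytes of a reader by calling `read`
/// until it reports end of file with `Ok(0)`.
fn count_bytes<R: Read>(mut r: R) -> io::Result<usize> {
    let mut chunk = [0u8; 4];
    let mut total = 0;
    loop {
        // Only "real" errors propagate; end of file is an ordinary value.
        match r.read(&mut chunk)? {
            0 => return Ok(total),
            n => total += n,
        }
    }
}
```

Because end of file is not an error, the error path needs no special case to distinguish it from genuine failures.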
The ChanReader and ChanWriter adapters will be left as they are today, and they will remain #[unstable]. The channel adapters currently suffer from a few problems, some of which are inherent to the design:
- Construction is somewhat unergonomic. First an mpsc channel pair must be created and then each half of the reader/writer needs to be created.
- Each call to write involves moving memory onto the heap to be sent, which isn't necessarily efficient.
- The design of std::sync::mpsc allows for growing more channels in the future, but it's unclear if we'll want to continue to provide a reader/writer adapter for each channel we add to std::sync.
These types generally feel as if they're from a different era of Rust (which they are!) and may take some time to fit into the current standard library. They can be reconsidered for stabilization after the dust settles from the I/O redesign as well as the recent std::sync redesign. At this time, however, this RFC recommends they remain unstable.
To be added in a follow-up PR.
To be added in a follow-up PR.
To be added in a follow-up PR.
To be added in a follow-up PR.
To be added in a follow-up PR.
Initially, this module will be empty except for the platform-specific unix and windows modules. It is expected to grow additional, more specific platform submodules (like linux, macos) over time.
To be expanded in a follow-up PR.
The prelude submodule will contain most of the traits, types, and modules discussed in this RFC; it is meant to provide maximal convenience when working with IO of any kind. The exact contents of the module are left as an open question.
This RFC is largely about cleanup, normalization, and stabilization of our IO libraries -- work that needs to be done, but that also represents nontrivial churn.
However, the actual implementation work involved is estimated to be reasonably contained, since all of the functionality is already in place in some form (including os_str, due to @SimonSapin's WTF-8 implementation).
The main alternative design would be to continue staying with the Posix tradition in terms of naming and functionality (for which there is precedent in some other languages). However, Rust is already well-known for its strong cross-platform compatibility in std, and making the library more Windows-friendly will only increase its appeal.
More radically different designs (in terms of different design principles or visions) are outside the scope of this RFC.
To be expanded in a follow-up PR.