add Rust for Linux material
fw-immunant committed Feb 3, 2025
1 parent 6136169 commit fa5cb0b
Showing 29 changed files with 995 additions and 0 deletions.
32 changes: 32 additions & 0 deletions src/SUMMARY.md
@@ -407,6 +407,38 @@
- [Broadcast Chat Application](concurrency/async-exercises/chat-app.md)
- [Solutions](concurrency/async-exercises/solutions.md)

# Rust for Linux

---

- [Welcome](rust-for-linux/welcome.md)
- [Interoperation Requirements](rust-for-linux/basic-requirements.md)
- [Building Kernel Modules](rust-for-linux/modules.md)
- [Type Mapping](rust-for-linux/types.md)
- [Bindings and Safe Interfaces](rust-for-linux/bindings-interfaces.md)
- [Avoiding Bloat](rust-for-linux/bloat.md)
- [Hands-on With Kernel Rust](rust-for-linux/hands-on.md)
- [Rust for Linux](rust-for-linux/rust-for-linux.md)
- [`rust-analyzer` Setup](rust-for-linux/rust-analyzer.md)
- [Macros](rust-for-linux/macros.md)
- [A Rust Kernel Module](rust-for-linux/kernel-module.md)
- [The `module!` Macro](rust-for-linux/modules/module-macro.md)
- [Module Setup and Teardown](rust-for-linux/modules/setup-and-teardown.md)
- [Module Parameters](rust-for-linux/modules/parameters.md)
- [Using Abstractions](rust-for-linux/using-abstractions.md)
- [Complications and Conflicts](rust-for-linux/complications.md)
- [`Pin` and Self-Reference](rust-for-linux/complications/pin.md)
- [The Kernel Rust Safety Model](rust-for-linux/complications/safety.md)
- [Atomic/Task Contexts and Sleep](rust-for-linux/complications/sleeping.md)
- [Memory Models](rust-for-linux/complications/memory-models.md)
- [Separate Compilation and Linking](rust-for-linux/complications/separate-compilation.md)
- [Fallible Allocation](rust-for-linux/complications/fallible-allocation.md)
- [Code Size](rust-for-linux/complications/code-size.md)
- [Documentation](rust-for-linux/complications/kernel-doc.md)
- [Security Mitigations](rust-for-linux/complications/mitigations.md)
- [Async](rust-for-linux/complications/async.md)
- [Next Steps](rust-for-linux/next-steps.md)

# Final Words

---
53 changes: 53 additions & 0 deletions src/rust-for-linux/basic-requirements.md
@@ -0,0 +1,53 @@
# Interoperation Requirements

To use Rust code in Linux, we can start by comparing this situation with C/Rust
interop in userspace.

## Building

In userspace, the most common setup is to use Cargo to compile our Rust and
later integrate into a C build system if needed.
Meanwhile, the Linux Kernel compiles its C code with its custom Kbuild build
system.
In Rust for Linux, the kernel build system invokes the Rust compiler directly,
without Cargo.

## No `libstd`

Typical userspace Rust makes use of the Rust standard library through the `std`
crate. Rust in the kernel does not run atop an operating system, so kernel Rust
will have to eschew the standard library.
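
As a rough illustration of what working without `std` looks like, the sketch below formats text into a fixed-size buffer using only items from `core`. The `FixedBuf` type is hypothetical and exists only for this example; it is not part of the kernel's Rust support code. In a real freestanding crate the file would also start with `#![no_std]`.

```rust
use core::fmt::{self, Write};

/// Hypothetical fixed-capacity text buffer that needs no heap allocation.
pub struct FixedBuf {
    data: [u8; 32],
    len: usize,
}

impl FixedBuf {
    pub fn new() -> Self {
        FixedBuf { data: [0; 32], len: 0 }
    }

    /// Returns the text written so far.
    pub fn as_str(&self) -> &str {
        // Only whole `&str` values are appended, so the contents stay valid UTF-8.
        core::str::from_utf8(&self.data[..self.len]).unwrap_or("")
    }
}

impl Write for FixedBuf {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let bytes = s.as_bytes();
        let end = self.len.checked_add(bytes.len()).ok_or(fmt::Error)?;
        if end > self.data.len() {
            // Out of space: report an error instead of allocating more memory.
            return Err(fmt::Error);
        }
        self.data[self.len..end].copy_from_slice(bytes);
        self.len = end;
        Ok(())
    }
}
```

With this in place, `write!(buf, "irq {}", 7)` works on a `FixedBuf` without touching an allocator or the operating system.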

## Module Support

Much code in the kernel is compiled into kernel modules rather than as part of
the core kernel.
To write kernel modules in Rust we'll need to be able to match the ABI of kernel
modules.

## Safe Wrappers

To reap the benefits of Rust, we want to be able to write as much code as
possible in safe Rust.
This means that we want safe wrappers for as much kernel functionality as
possible.

## Mapping Types

When writing these wrappers, we'll need to refer to the data types of values
passed to and from existing kernel functions in C.
Unlike userspace C, the kernel uses its own set of primitive types rather than
those provided by the C standard.
We'll have to map back and forth between those kernel types and compatible Rust
ones when doing foreign calls.
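
A minimal sketch of what such a mapping can look like on the Rust side follows. The struct `HypotheticalRegion` and both functions are invented for illustration and do not correspond to any real kernel type; the only assumption about the C side is the layout described in the comments.

```rust
use core::ffi::c_int;

/// Rust view of a hypothetical C struct:
/// `struct hypothetical_region { u64 base; size_t len; u32 flags; };`
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct HypotheticalRegion {
    pub base: u64,  // kernel `u64` maps to Rust `u64`
    pub len: usize, // C `size_t` maps to Rust `usize`
    pub flags: u32, // kernel `u32` maps to Rust `u32`
}

/// Returns the first address past the region, or `None` on overflow.
pub fn region_end(region: &HypotheticalRegion) -> Option<u64> {
    region.base.checked_add(region.len as u64)
}

/// Converts a C-style `int` status (0 on success, negative errno on failure)
/// into a Rust `Result`.
pub fn status_to_result(status: c_int) -> Result<(), c_int> {
    if status < 0 {
        Err(status)
    } else {
        Ok(())
    }
}
```

`#[repr(C)]` asks the compiler to lay out the fields the way a C compiler would, which is what makes it reasonable to pass such a value across the language boundary.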

## Keeping the Kernel Lean

Finally, even the core Rust library assumes a basic level of functionality that
includes some costly operations (e.g. Unicode processing) for which the kernel
does not want to pay implementation costs.
To use Rust in the kernel we'll need a way to disable this functionality.

# Outline

{{%segment outline}}
Binary file added src/rust-for-linux/bindgen-mapping.png
71 changes: 71 additions & 0 deletions src/rust-for-linux/bindings-interfaces.md
@@ -0,0 +1,71 @@
---
minutes: 18
---

# Bindings and Safe Interfaces

`bindgen` is used to generate low-level, unsafe bindings for C interfaces.

But to reap the benefits of Rust, we want to use safe, foolproof interfaces to unsafe functionality.

Subsystems are expected to implement safe interfaces on top of the low-level generated bindings.
These safe interfaces are exposed as top-level modules within the [`kernel` crate](https://rust.docs.kernel.org/kernel/).
The top-level `bindings` module holds the unsafe `bindgen`-generated bindings,
which are generated from the C headers included by `rust/bindings/bindings_helper.h`.

In Rust for Linux, unsafe `bindgen`-generated bindings should not be used outside the `kernel` crate.
Drivers and other subsystems will make use of the safe abstractions from this crate.

Only a subset of Linux subsystems currently have such abstractions.

It's worth browsing the [list of modules](https://rust.docs.kernel.org/kernel/#modules)
exposed by the `kernel` crate to see what exists currently.
Many of these subsystems have only partial bindings based on the needs of consumers so far.

## Adding a Module

To add a module for some subsystem, first its header must be added to `bindings_helper.h`.
It may be necessary to write some custom code to wrap macros or `inline` functions
that are not automatically handled by `bindgen`; this code lives in the `rust/helpers/` directory.

Then we need to write a safe abstraction that uses these bindings and exposes the functionality to the rest of kernel Rust.
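
To give a feel for the shape of such an abstraction, here is a userspace sketch. The `bindings` module below is a stand-in written in Rust so the example can run; real `bindgen` output would instead declare `extern "C"` functions. The names `hypothetical_dev_read`, `dev_read`, and `Errno` are all invented for this illustration.

```rust
/// Stand-in for `bindgen` output: a raw, unsafe, C-shaped function.
mod bindings {
    /// Hypothetical C function: fills up to `len` bytes of `buf` and returns
    /// the number of bytes written, or a negative errno on failure.
    pub unsafe fn hypothetical_dev_read(buf: *mut u8, len: usize) -> isize {
        if buf.is_null() {
            return -14; // -EFAULT
        }
        let n = len.min(4);
        for i in 0..n {
            // SAFETY: the caller promises `buf` is valid for `len >= n` bytes.
            unsafe { buf.add(i).write(0xAB) };
        }
        n as isize
    }
}

/// Error code reported by the underlying C-style call.
#[derive(Debug, PartialEq)]
pub struct Errno(pub i32);

/// Safe wrapper: the slice type guarantees a valid pointer and length,
/// and the C-style return value becomes a `Result`.
pub fn dev_read(buf: &mut [u8]) -> Result<usize, Errno> {
    // SAFETY: `buf` is an exclusively borrowed slice valid for `buf.len()` bytes.
    let ret = unsafe { bindings::hypothetical_dev_read(buf.as_mut_ptr(), buf.len()) };
    if ret < 0 {
        Err(Errno(ret as i32))
    } else {
        Ok(ret as usize)
    }
}
```

Callers of `dev_read` never write `unsafe`; the one unsafe call is confined to the wrapper, next to a comment justifying it.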

Some commits from work-in-progress bindings and abstractions
can provide an idea of what it looks like to expose new kernel functionality:

- GPIO Consumer: [fecb4bd73f06bb2cac8e16aca7ef0e2f1b6acb50](https://github.com/Fabo/linux/commit/fecb4bd73f06bb2cac8e16aca7ef0e2f1b6acb50)
- Regmap: [ec0b740ac5ab299e4c86011a0002919e5bbe5c2d](https://github.com/Fabo/linux/commit/ec0b740ac5ab299e4c86011a0002919e5bbe5c2d)
- I2C: [70ed30fcdf8ec62fa91485c3c0a161a9d0194668](https://github.com/Fabo/linux/commit/70ed30fcdf8ec62fa91485c3c0a161a9d0194668)

## Guidelines for Abstractions

Abstractions may not be perfectly safe, but should try to be as safe as possible.
Unsafe functionality exposed should have its safety conditions documented
so that users have guidance on how to use the functionality and justify such use.

Abstractions should also attempt to present relatively idiomatic Rust in their interfaces:
- Follow Rust naming/capitalization conventions while remaining unsurprising to kernel developers.
- Use RAII instead of manual resource management where possible.
- Avoid raw pointers to bound kernel objects in favor of safer, more limited interfaces.

When exposing types from generated bindings, code should make use of the
[`Opaque<T>`](https://rust.docs.kernel.org/kernel/types/struct.Opaque.html) type
along with native Rust references and, for types that are inherently reference-counted, the
[`ARef<T>`](https://rust.docs.kernel.org/kernel/types/struct.ARef.html) type.
`ARef<T>` links a type's built-in reference count operations to the `Clone` and `Drop` traits.
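
The idea of tying an object's own reference count to `Clone` and `Drop` can be sketched in userspace as follows. This is not the kernel's `ARef<T>` implementation: `RawObject`, `Counted`, `CountedRef`, and `new_object` are hypothetical names, and the count is a plain `Cell` rather than an atomic.

```rust
use std::cell::Cell;
use std::ops::Deref;
use std::ptr::NonNull;

/// Stand-in for a C object that carries its own reference count.
pub struct RawObject {
    refcount: Cell<u32>,
    value: u32,
}

/// Hypothetical trait for types whose reference count lives inside the object.
///
/// # Safety
///
/// Implementers must keep the object alive while its count is non-zero.
pub unsafe trait Counted {
    fn inc_ref(&self);

    /// # Safety
    ///
    /// The caller must own one reference to `obj` and gives it up with this call.
    unsafe fn dec_ref(obj: NonNull<Self>);
}

unsafe impl Counted for RawObject {
    fn inc_ref(&self) {
        self.refcount.set(self.refcount.get() + 1);
    }

    unsafe fn dec_ref(obj: NonNull<Self>) {
        let remaining = {
            // SAFETY: the caller still owns a reference, so the object is alive.
            let r = unsafe { obj.as_ref() };
            r.refcount.set(r.refcount.get() - 1);
            r.refcount.get()
        };
        if remaining == 0 {
            // SAFETY: the last reference is gone; free the box leaked in `new_object`.
            drop(unsafe { Box::from_raw(obj.as_ptr()) });
        }
    }
}

/// Owned reference: cloning increments the count, dropping decrements it.
pub struct CountedRef<T: Counted> {
    ptr: NonNull<T>,
}

impl<T: Counted> Clone for CountedRef<T> {
    fn clone(&self) -> Self {
        // SAFETY: `self` holds a reference, so the object is alive.
        unsafe { self.ptr.as_ref() }.inc_ref();
        CountedRef { ptr: self.ptr }
    }
}

impl<T: Counted> Drop for CountedRef<T> {
    fn drop(&mut self) {
        // SAFETY: this handle owns exactly one reference, released here.
        unsafe { T::dec_ref(self.ptr) }
    }
}

impl<T: Counted> Deref for CountedRef<T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: the object stays alive for as long as this handle exists.
        unsafe { self.ptr.as_ref() }
    }
}

/// Allocates an object with an initial count of one.
pub fn new_object(value: u32) -> CountedRef<RawObject> {
    let boxed = Box::new(RawObject { refcount: Cell::new(1), value });
    CountedRef { ptr: NonNull::from(Box::leak(boxed)) }
}
```

Users of `CountedRef` get ordinary Rust ownership semantics, while the object keeps managing its lifetime through its own counter.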

## Submitting the cyclic dependency

We already know that drivers should not use unsafe bindings directly.
However, subsystem maintainers may balk at patches that add Rust abstractions without motivation or consumers.
At the same time, drivers and subsystem abstractions may have to be submitted separately to different maintainers
due to the distributed nature of Linux development.

So how should a developer submit a driver that requires bindings/abstractions for a subsystem not yet exposed to Rust?

There are two main approaches[^1]:

1. Submit the driver as an RFC before submitting the abstractions it relies upon while referencing the RFC as a potential consumer.
2. Submit a stub driver and fill out non-stub functionality as subsystem abstractions land.

[^1]: <https://rust-for-linux.zulipchat.com/#narrow/channel/288089-General/topic/Upstreaming.20a.20driver.20with.20unsave.20C.20API.20calls.3F/near/471677707>
20 changes: 20 additions & 0 deletions src/rust-for-linux/bloat.md
@@ -0,0 +1,20 @@
---
minutes: 5
---

# Avoiding Bloat

Rust for Linux makes use of `libcore` to avoid reimplementing all functionality of the Rust standard library.
But even `libcore` has some functionality built-in that is not portable to all targets the kernel
would like to support or that is not necessary for the kernel while occupying valuable code space.

This includes[^1]:

- Support for math with 128-bit integers
- String formatting for floating-point numbers
- Unicode support for strings

Work is ongoing to make these features optional.
In the meantime, the `libcore` used by Rust for Linux is larger and less portable than it could be.

[^1]: <https://github.com/Rust-for-Linux/linux/issues/514>
39 changes: 39 additions & 0 deletions src/rust-for-linux/complications.md
@@ -0,0 +1,39 @@
---
minutes: 5
---

# Complications and Conflicts

{{%segment outline}}

There are a number of subtleties and unresolved conflicts between the Rust paradigm and the kernel one.
These must be resolved to ship Rust code in the kernel.

Some issues are deeper problems that require additional research and development
before Rust for Linux is ready for prime time;
others merely require some additional learning and attention
on the part of aspiring Rust for Linux developers.

##

Resolving these conflicts involves changes on both sides of the collaboration.
On the Rust side, new features land first in the nightly channel of the compiler
before being stabilized.

To avoid waiting for stabilization, the kernel uses an
[escape hatch](https://rustc-dev-guide.rust-lang.org/building/bootstrapping/what-bootstrapping-does.html#complications-of-bootstrapping)
(the `RUSTC_BOOTSTRAP` environment variable) to access unstable features even in stable releases of the compiler.
This assists in the goal of eventually deploying Rust for Linux in Linux
distributions that ship only a stable version of the Rust toolchain.

Nonetheless, being able to build Rust for Linux using only stable Rust features
is a significant goal;
the issues blocking this are tracked specifically by both the Rust for Linux
project[^1] and the Rust developers themselves[^2].

In the next slides we'll explore the most significant sources of friction between
Rust and Linux kernel development to be aware of challenges we are likely to encounter
when trying to implement kernel functionality in Rust.

[^1]: <https://github.com/Rust-for-Linux/linux/issues/2>
[^2]: <https://github.com/rust-lang/rust-project-goals/issues/116>
35 changes: 35 additions & 0 deletions src/rust-for-linux/complications/async.md
@@ -0,0 +1,35 @@
---
minutes: 8
---

# Async

The kernel performs many operations concurrently and involves significant amounts of interaction
between CPU cores and other devices.
For this reason, it would be no surprise if async Rust were a fundamental requirement
for using Rust in the kernel.
But the kernel is the central arbiter of most synchronization and is currently written in regular, synchronous C.

Rust code making use of `async` mostly exists to write composable code that will run atop event loops,
but the Linux kernel is not really organized as an event loop:
user tasks call directly into the kernel; control flow for interrupts is handled by hardware.

As such, `async` support is not critical for most kernel programming tasks.
However, it is possible to view some components of the kernel as async executors,
and some work has been done in this direction.
Wedson Almeida Filho implemented both workqueue-based[^1] and single-threaded async executors as proofs of concept.
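
To make the term "single-threaded async executor" concrete, here is a userspace sketch using only the standard library. It is not the kernel proof of concept referenced above; `NoopWake` and `block_on` are names chosen for this illustration, and the executor simply re-polls in a loop rather than sleeping until woken.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing: this executor re-polls instead of waiting for wakeups.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Drives a single future to completion on the current thread.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    // Pin the future on the stack so it can be polled.
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        // Not ready yet: let other threads run, then poll again.
        std::thread::yield_now();
    }
}
```

A production executor would park until the waker fires instead of spinning; in the kernel that role could be played by an existing mechanism such as a workqueue, as in the proof of concept mentioned above.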

There is no fundamental incompatibility between Rust for Linux and Rust `async`;
the situation is similar to embedded Rust programming, where `async` has proven usable
(e.g. the Embassy project).

Nonetheless, no killer application of `async` in Rust for Linux has made it a priority.

<details>

[^1]: <https://github.com/Rust-for-Linux/linux/tree/rust/rust/kernel/kasync>

An example of an async server using the kernel async executor may be found
[here](https://github.com/Rust-for-Linux/linux/blob/rust/samples/rust/rust_echo_server.rs).

</details>
48 changes: 48 additions & 0 deletions src/rust-for-linux/complications/code-size.md
@@ -0,0 +1,48 @@
---
minutes: 10
---

# Code Size

One pitfall when writing Rust code can be the multiplicative increase in generated machine code when using generics.

For the Linux kernel, which must be suitable for space-limited embedded environments,
keeping code size low is a significant concern.

Experiments with Rust in the kernel so far have shown that Rust code can be of similar code size to C,
but may also be larger in some cases[^1].

## Assessing Bloat

Tools exist to help analyze how much each part of the source code contributes to the size of the compiled code,
such as [`cargo-bloat`](https://github.com/RazrFalcon/cargo-bloat).

## Shrinking Code Size

The reasons for code bloat vary and are not generally specific to Linux kernel usage of Rust.
The most common causes for code bloat are excessive use of generics and forced inlining.
In general, generics should be preferred over trait objects when writing abstractions
that are expected to "compile out" or where generating separate code for different types is critical
for performance (e.g. inner loops or arithmetic on values of a generic type).

In other situations, trait objects should be preferred to allow reusing definitions
without machine-code duplication, which may more closely mirror patterns that would be most natural in C.
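
The trade-off can be seen in a small sketch. The trait `Checksum` and the types `Fixed` and `Doubled` are hypothetical and exist only for this comparison: `sum_generic` is compiled once per concrete type it is used with, while `sum_dyn` is compiled once and dispatches through a vtable.

```rust
/// Hypothetical trait used only to compare the two styles.
pub trait Checksum {
    fn byte(&self) -> u8;
}

pub struct Fixed(pub u8);
pub struct Doubled(pub u8);

impl Checksum for Fixed {
    fn byte(&self) -> u8 {
        self.0
    }
}

impl Checksum for Doubled {
    fn byte(&self) -> u8 {
        self.0.wrapping_mul(2)
    }
}

/// Generic version: one copy of machine code per `T` it is instantiated with.
pub fn sum_generic<T: Checksum>(items: &[T]) -> u32 {
    items.iter().map(|item| u32::from(item.byte())).sum()
}

/// Trait-object version: a single copy of machine code, with dynamic dispatch.
pub fn sum_dyn(items: &[&dyn Checksum]) -> u32 {
    items.iter().map(|item| u32::from(item.byte())).sum()
}
```

The trait-object version also accepts a mix of types in one slice, which the generic version cannot do.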

When accepting generic parameters that get converted to a concrete type before use,
follow the pattern of defining an inner monomorphic function that can be shared[^2]:

```rust
pub fn read_to_string<P: AsRef<Path>>(path: P) -> io::Result<String> {
    fn inner(path: &Path) -> io::Result<String> {
        let mut file = File::open(path)?;
        let size = file.metadata().map(|m| m.len() as usize).ok();
        let mut string = String::with_capacity(size.unwrap_or(0));
        io::default_read_to_string(&mut file, &mut string, size)?;
        Ok(string)
    }
    inner(path.as_ref())
}
```
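
The excerpt above relies on standard-library internals, so it cannot be compiled on its own. A self-contained sketch of the same pattern, using a hypothetical `parse_level` function, might look like this:

```rust
/// Accepts anything string-like, but does the real work in a single
/// non-generic inner function so that its body is compiled only once.
pub fn parse_level<S: AsRef<str>>(text: S) -> Option<u8> {
    fn inner(text: &str) -> Option<u8> {
        text.trim().parse().ok()
    }
    // Only this thin conversion shim is duplicated per caller type.
    inner(text.as_ref())
}
```

Calling `parse_level` with both `&str` and `String` arguments creates two instances of the outer shim, but only one copy of `inner`.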

[^1]: <https://www.usenix.org/system/files/atc24-li-hongyu.pdf>
[^2]: <https://github.com/rust-lang/rust/blob/ae612bedcbfc7098d1711eb35bc7ca994eb17a4c/library/std/src/fs.rs#L295-L304>
57 changes: 57 additions & 0 deletions src/rust-for-linux/complications/fallible-allocation.md
@@ -0,0 +1,57 @@
---
minutes: 13
---

# Fallible Allocation

Standard Rust allocation APIs are assumed to be infallible; on failure they abort rather than return an error:

```rust
let x = Box::new(5);
```

In the Linux kernel, memory allocation is much more complex.

```C
void *kmalloc(size_t size, gfp_t flags);
```

`flags` is typically one of `GFP_KERNEL`, `GFP_NOWAIT`, `GFP_ATOMIC`, etc.[^1]
The return value must be checked against `NULL` to see whether allocation succeeded.

In Rust for Linux, rather than using the infallible allocation APIs provided by `liballoc`,
the kernel library has its own allocation interfaces:

## `KBox`

```rust
let b = KBox::new(24_u64, GFP_KERNEL)?;
assert_eq!(*b, 24_u64);
```

[`KBox::new`](https://rust.docs.kernel.org/kernel/alloc/kbox/struct.Box.html#tymethod.new)
returns a `Result<Self, AllocError>`.
Here we propagate this error with the `?` operator.

## `KVec`

[`KVec`](https://rust.docs.kernel.org/kernel/alloc/kvec/type.KVec.html)
presents an API similar to the standard `Vec`, except that operations that may allocate
take a flags parameter:

```rust
let mut v = KVec::new();
v.push(1, GFP_KERNEL)?;
assert_eq!(&v, &[1]);
```
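
The `KBox` and `KVec` snippets only compile inside the kernel tree. For experimenting in userspace, a rough analogue of a fallible `push` can be built on the standard `Vec::try_reserve`. The helper `try_push` below is a hypothetical name, and unlike the kernel API it takes no allocation flags.

```rust
use std::collections::TryReserveError;

/// Pushes `item`, reporting allocation failure as an error instead of aborting.
pub fn try_push<T>(v: &mut Vec<T>, item: T) -> Result<(), TryReserveError> {
    // Make room first; this is the only step that can fail.
    v.try_reserve(1)?;
    // Capacity is now guaranteed, so this push does not allocate.
    v.push(item);
    Ok(())
}
```

As with the kernel types, the caller decides what to do on failure, typically by propagating the error with `?`.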

## `FromIterator`

The standard [`FromIterator`](https://doc.rust-lang.org/std/iter/trait.FromIterator.html) trait builds new collections,
which usually involves memory allocation, and it has no way to report allocation failure.
For this reason the `.collect()` method on iterators is not available in Rust for Linux in its original form.
Work is ongoing to design an equivalent API[^2], but for now we do without its convenience.
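
Until such an API exists, the usual workaround is an explicit loop with a fallible push. A userspace sketch of that shape, using the standard `Vec::try_reserve` in place of the kernel's flag-taking methods, is shown below; `try_collect_vec` is a hypothetical helper, not the design under discussion.

```rust
use std::collections::TryReserveError;

/// Collects an iterator into a `Vec`, surfacing allocation failure as an error.
pub fn try_collect_vec<I: IntoIterator>(iter: I) -> Result<Vec<I::Item>, TryReserveError> {
    let mut out = Vec::new();
    for item in iter {
        // Reserve before each push so that failure is returned, not fatal.
        out.try_reserve(1)?;
        out.push(item);
    }
    Ok(out)
}
```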

[^1]: <https://docs.kernel.org/core-api/memory-allocation.html>
[^2]: <https://rust-for-linux.zulipchat.com/#narrow/channel/288089-General/topic/flat_map.20collecting.20with.20Kvec>
23 changes: 23 additions & 0 deletions src/rust-for-linux/complications/kernel-doc.md
@@ -0,0 +1,23 @@
---
minutes: 3
---

# Documentation

Documentation in Rust for Linux is built with the `rustdoc` tool just like for regular Rust code.
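
As a small illustration of the kind of input `rustdoc` consumes, here is a hypothetical helper (not part of the `kernel` crate) documented with ordinary doc comments, including an `# Examples` section:

```rust
/// Rounds `len` up to the next multiple of `align`.
///
/// Returns `None` if `align` is not a power of two or if the rounded
/// value would overflow `usize`.
///
/// # Examples
///
/// ```
/// assert_eq!(round_up(13, 8), Some(16));
/// ```
pub fn round_up(len: usize, align: usize) -> Option<usize> {
    if !align.is_power_of_two() {
        return None;
    }
    let mask = align - 1;
    // Add the mask, then clear the low bits to land on a multiple of `align`.
    len.checked_add(mask).map(|v| v & !mask)
}
```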

Running rustdoc on the kernel is done with the `rustdoc` Make target:

```sh
$ make LLVM=1 rustdoc
```

after which generated docs can be viewed by opening `Documentation/output/rust/rustdoc/kernel/index.html`.

Pre-generated documentation for the current kernel release is available at:

<https://rust.docs.kernel.org/kernel/>

## More information

<https://docs.kernel.org/rust/general-information.html#code-documentation>