Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New printing support #120

Merged
merged 1 commit into from
Apr 11, 2021
Merged

New printing support #120

merged 1 commit into from
Apr 11, 2021

Conversation

ojeda
Copy link
Member

@ojeda ojeda commented Mar 18, 2021

Features:

  • Adds macros like info! (with newline) and pr_info! (without
    newline) for the usual levels that the kernel supports.

  • Includes the "continuation" level.

  • Includes automatically prefix with the module name they are called from.

  • Includes, hopefully, pretty detailed docs.

  • Includes aliases to make it easier to find the equivalent to
    println! for newcomers, since we don't have that one anymore.

Implementation:

  • Stack usage should be 1 KiB at most, regardless of the number of
    invocations of the macro in a single function. This should make
    it possible to use even when optimizations are not enabled.
    See code comments for the details.

  • The formatting string is compile-time, shared as a static,
    and always the same. We could perhaps pre-generate them
    per-module to avoid another %.*s in printk(), but I don't
    think it is worth the complexity at the moment, and it would
    increase memory usage anyway. This uses the nightly
    const_panic feature -- added to Rust unstable features needed for the kernel #2.

Naming:

  • The pr_* names are the same as the kernel ones, and behave
    similarly, i.e. no newline at the end.

  • The non-pr_* ones are the same but without the prefix.
    They look better (shorter, no underscore), have the newline like
    println!, and they are what people should almost always use except
    if they need continuation. And they look pretty similar to the Rust
    log crate ones, too!

  • I think we can afford these short names (in the prelude, even!) since
    this is new code and nobody should be defining macros in the kernel
    with such short names, ever!

Future:

  • I will open an issue with a few things that can be added on top,
    similar to the kernel parameters feature. Done: Complete printing support #122.

  • When we have tidy support (or custom clippy lints), we can add
    one to find the pattern:

    pr_*("...\n", ...);
    

    and suggest the non-prefixed, non-newline version instead:

    *("...", "...");
    

Signed-off-by: Miguel Ojeda [email protected]

rust/kernel/print.rs Outdated Show resolved Hide resolved
rust/kernel/print.rs Outdated Show resolved Hide resolved
rust/kernel/print.rs Outdated Show resolved Hide resolved
This was referenced Mar 18, 2021
rust/kernel/prelude.rs Outdated Show resolved Hide resolved
rust/module.rs Outdated Show resolved Hide resolved
rust/kernel/print.rs Show resolved Hide resolved
rust/kernel/print.rs Outdated Show resolved Hide resolved
rust/kernel/print.rs Outdated Show resolved Hide resolved
@ojeda ojeda force-pushed the rust-printk branch 2 times, most recently from eaa8885 to ca19455 Compare April 11, 2021 04:31
@ojeda
Copy link
Member Author

ojeda commented Apr 11, 2021

  • Suggestions from code review.
  • Factorize the generate()s by using the same length for the CONT macro too.
  • More comprehensive tests to ensure we test the different macro cases.
  • Added rust_print.rs sample, updated CI (now we don't need to sed the module parameters tests since the information we need is printed now thanks to the module name being prefixed), etc.
  • Removed the devel! macros because there is no easy way that I know of to set a cfg option within Rust code itself (i.e. the C macro is intended for quick debugging where you can write #define DEBUG temporarily).

Features:

  - Adds macros like `info!` (with newline) and `pr_info!` (without
    newline) for the usual levels that the kernel support.

  - Includes the "continuation" level.

  - Includes automatically the module name in the string.

  - Includes, hopefully, pretty detailed logs.

  - Includes aliases to make it even easier to find the equivalent to
    `println!` for newcomers.

Implementation:

  - Stack usage should be 1 KiB at most, regardless of the number of
    invocations of the macro in a single function. This should make
    it possible to use even when optimizations are not enabled.
    See code comments for the details.

  - The formatting string is compile-time, shared as a `static`,
    and always the same. We could perhaps pre-generate them
    per-module to avoid another `%.*s` in `printk()`, but I don't
    think it is worth the complexity at the moment, and it would
    increase memory usage anyway.

Naming:

  - The `pr_*` names are the same as the kernel ones, and behave
    similarly, i.e. no newline at the end.

  - The non-`pr_*` ones are the same but without the prefix.
    They look better (shorter, no underscore), have the newline like
    `println!`, and they are what people should almost always use except
    if they need continuation. And they look pretty similar to the Rust
    `log` crate ones, too!

  - I think we can afford these short names (in the prelude, even!) since
    this is new code and nobody should be defining macros in the kernel
    with such short names, ever!

Future:

  - I will open an issue with a few things that can be added on top,
    similar to the kernel parameters feature.

  - When we have tidy support (or custom clippy lints), we can add
    one to find the pattern:

        pr_*("...\n", ...);

    and suggest the non-prefixed, non-newline version instead:

        *("...", "...");

Signed-off-by: Miguel Ojeda <[email protected]>
@ojeda
Copy link
Member Author

ojeda commented Apr 11, 2021

A few trivial doc improvements.

@ojeda ojeda merged commit 02c8ac6 into Rust-for-Linux:rust Apr 11, 2021
@ojeda ojeda deleted the rust-printk branch April 11, 2021 14:14
ojeda pushed a commit that referenced this pull request Jan 16, 2023
The inline assembly for arm64's cmpxchg_double*() implementations use a
+Q constraint to hazard against other accesses to the memory location
being exchanged. However, the pointer passed to the constraint is a
pointer to unsigned long, and thus the hazard only applies to the first
8 bytes of the location.

GCC can take advantage of this, assuming that other portions of the
location are unchanged, leading to a number of potential problems.

This is similar to what we fixed back in commit:

  fee960b ("arm64: xchg: hazard against entire exchange variable")

... but we forgot to adjust cmpxchg_double*() similarly at the same
time.

The same problem applies, as demonstrated with the following test:

| struct big {
|         u64 lo, hi;
| } __aligned(128);
|
| unsigned long foo(struct big *b)
| {
|         u64 hi_old, hi_new;
|
|         hi_old = b->hi;
|         cmpxchg_double_local(&b->lo, &b->hi, 0x12, 0x34, 0x56, 0x78);
|         hi_new = b->hi;
|
|         return hi_old ^ hi_new;
| }

... which GCC 12.1.0 compiles as:

| 0000000000000000 <foo>:
|    0:   d503233f        paciasp
|    4:   aa0003e4        mov     x4, x0
|    8:   1400000e        b       40 <foo+0x40>
|    c:   d2800240        mov     x0, #0x12                       // #18
|   10:   d2800681        mov     x1, #0x34                       // #52
|   14:   aa0003e5        mov     x5, x0
|   18:   aa0103e6        mov     x6, x1
|   1c:   d2800ac2        mov     x2, #0x56                       // #86
|   20:   d2800f03        mov     x3, #0x78                       // #120
|   24:   48207c82        casp    x0, x1, x2, x3, [x4]
|   28:   ca050000        eor     x0, x0, x5
|   2c:   ca060021        eor     x1, x1, x6
|   30:   aa010000        orr     x0, x0, x1
|   34:   d2800000        mov     x0, #0x0                        // #0    <--- BANG
|   38:   d50323bf        autiasp
|   3c:   d65f03c0        ret
|   40:   d2800240        mov     x0, #0x12                       // #18
|   44:   d2800681        mov     x1, #0x34                       // #52
|   48:   d2800ac2        mov     x2, #0x56                       // #86
|   4c:   d2800f03        mov     x3, #0x78                       // #120
|   50:   f9800091        prfm    pstl1strm, [x4]
|   54:   c87f1885        ldxp    x5, x6, [x4]
|   58:   ca0000a5        eor     x5, x5, x0
|   5c:   ca0100c6        eor     x6, x6, x1
|   60:   aa0600a6        orr     x6, x5, x6
|   64:   b5000066        cbnz    x6, 70 <foo+0x70>
|   68:   c8250c82        stxp    w5, x2, x3, [x4]
|   6c:   35ffff45        cbnz    w5, 54 <foo+0x54>
|   70:   d2800000        mov     x0, #0x0                        // #0     <--- BANG
|   74:   d50323bf        autiasp
|   78:   d65f03c0        ret

Notice that at the lines with "BANG" comments, GCC has assumed that the
higher 8 bytes are unchanged by the cmpxchg_double() call, and that
`hi_old ^ hi_new` can be reduced to a constant zero, for both LSE and
LL/SC versions of cmpxchg_double().

This patch fixes the issue by passing a pointer to __uint128_t into the
+Q constraint, ensuring that the compiler hazards against the entire 16
bytes being modified.

With this change, GCC 12.1.0 compiles the above test as:

| 0000000000000000 <foo>:
|    0:   f9400407        ldr     x7, [x0, #8]
|    4:   d503233f        paciasp
|    8:   aa0003e4        mov     x4, x0
|    c:   1400000f        b       48 <foo+0x48>
|   10:   d2800240        mov     x0, #0x12                       // #18
|   14:   d2800681        mov     x1, #0x34                       // #52
|   18:   aa0003e5        mov     x5, x0
|   1c:   aa0103e6        mov     x6, x1
|   20:   d2800ac2        mov     x2, #0x56                       // #86
|   24:   d2800f03        mov     x3, #0x78                       // #120
|   28:   48207c82        casp    x0, x1, x2, x3, [x4]
|   2c:   ca050000        eor     x0, x0, x5
|   30:   ca060021        eor     x1, x1, x6
|   34:   aa010000        orr     x0, x0, x1
|   38:   f9400480        ldr     x0, [x4, #8]
|   3c:   d50323bf        autiasp
|   40:   ca0000e0        eor     x0, x7, x0
|   44:   d65f03c0        ret
|   48:   d2800240        mov     x0, #0x12                       // #18
|   4c:   d2800681        mov     x1, #0x34                       // #52
|   50:   d2800ac2        mov     x2, #0x56                       // #86
|   54:   d2800f03        mov     x3, #0x78                       // #120
|   58:   f9800091        prfm    pstl1strm, [x4]
|   5c:   c87f1885        ldxp    x5, x6, [x4]
|   60:   ca0000a5        eor     x5, x5, x0
|   64:   ca0100c6        eor     x6, x6, x1
|   68:   aa0600a6        orr     x6, x5, x6
|   6c:   b5000066        cbnz    x6, 78 <foo+0x78>
|   70:   c8250c82        stxp    w5, x2, x3, [x4]
|   74:   35ffff45        cbnz    w5, 5c <foo+0x5c>
|   78:   f9400480        ldr     x0, [x4, #8]
|   7c:   d50323bf        autiasp
|   80:   ca0000e0        eor     x0, x7, x0
|   84:   d65f03c0        ret

... sampling the high 8 bytes before and after the cmpxchg, and
performing an EOR, as we'd expect.

For backporting, I've tested this atop linux-4.9.y with GCC 5.5.0. Note
that linux-4.9.y is oldest currently supported stable release, and
mandates GCC 5.1+. Unfortunately I couldn't get a GCC 5.1 binary to run
on my machines due to library incompatibilities.

I've also used a standalone test to check that we can use a __uint128_t
pointer in a +Q constraint at least as far back as GCC 4.8.5 and LLVM
3.9.1.

Fixes: 5284e1b ("arm64: xchg: Implement cmpxchg_double")
Fixes: e9a4b79 ("arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU")
Reported-by: Boqun Feng <[email protected]>
Link: https://lore.kernel.org/lkml/Y6DEfQXymYVgL3oJ@boqun-archlinux/
Reported-by: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Mark Rutland <[email protected]>
Cc: [email protected]
Cc: Arnd Bergmann <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Steve Capper <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
ojeda pushed a commit that referenced this pull request Jan 22, 2024
Like commit 1cf3bfc ("bpf: Support 64-bit pointers to kfuncs")
for s390x, add support for 64-bit pointers to kfuncs for LoongArch.
Since the infrastructure is already implemented in BPF core, the only
thing need to be done is to override bpf_jit_supports_far_kfunc_call().

Before this change, several test_verifier tests failed:

  # ./test_verifier | grep # | grep FAIL
  #119/p calls: invalid kfunc call: ptr_to_mem to struct with non-scalar FAIL
  #120/p calls: invalid kfunc call: ptr_to_mem to struct with nesting depth > 4 FAIL
  #121/p calls: invalid kfunc call: ptr_to_mem to struct with FAM FAIL
  #122/p calls: invalid kfunc call: reg->type != PTR_TO_CTX FAIL
  #123/p calls: invalid kfunc call: void * not allowed in func proto without mem size arg FAIL
  #124/p calls: trigger reg2btf_ids[reg->type] for reg->type > __BPF_REG_TYPE_MAX FAIL
  #125/p calls: invalid kfunc call: reg->off must be zero when passed to release kfunc FAIL
  #126/p calls: invalid kfunc call: don't match first member type when passed to release kfunc FAIL
  #127/p calls: invalid kfunc call: PTR_TO_BTF_ID with negative offset FAIL
  #128/p calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset FAIL
  #129/p calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL
  #130/p calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL
  #486/p map_kptr: ref: reference state created and released on xchg FAIL

This is because the kfuncs in the loaded module are far away from
__bpf_call_base:

  ffff800002009440 t bpf_kfunc_call_test_fail1    [bpf_testmod]
  9000000002e128d8 T __bpf_call_base

The offset relative to __bpf_call_base does NOT fit in s32, which breaks
the assumption in BPF core. Enable bpf_jit_supports_far_kfunc_call() lifts
this limit.

Note that to reproduce the above result, tools/testing/selftests/bpf/config
should be applied, and run the test with JIT enabled, unpriv BPF enabled.

With this change, the test_verifier tests now all passed:

  # ./test_verifier
  ...
  Summary: 777 PASSED, 0 SKIPPED, 0 FAILED

Tested-by: Tiezhu Yang <[email protected]>
Signed-off-by: Hengqi Chen <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants