Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nullness regression ci #1138

Closed
wants to merge 20 commits into from
Closed

Conversation

danobi
Copy link
Contributor

@danobi danobi commented Feb 1, 2025

No description provided.

Tengda Wu and others added 20 commits January 29, 2025 18:42
There are two bpf_link__destroy(freplace_link) calls in
test_tailcall_bpf2bpf_freplace(). After the first bpf_link__destroy()
is called, if the following bpf_map_{update,delete}_elem() throws an
exception, it will jump to the "out" label and call bpf_link__destroy()
again, causing double free and eventually leading to a segfault.

Fix it by directly resetting freplace_link to NULL after the first
bpf_link__destroy() call.

Fixes: 021611d ("selftests/bpf: Add test to verify tailcall and freplace restrictions")
Signed-off-by: Tengda Wu <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Reviewed-by: Leon Hwang <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
Update btf_ext_parse_info() to ensure the core_relo header is present
before reading its fields. This avoids a potential buffer read overflow
reported by the OSS Fuzz project.

Fixes: cf57916 ("libbpf: Support BTF.ext loading and output in either endianness")
Signed-off-by: Tony Ambardar <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://issues.oss-fuzz.com/issues/388905046
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
The runqslower binary from a cross-endian build currently fails to run
because the included skeleton has host endianness. Fix this by passing the
target BPF endianness to the runqslower sub-make.

Fixes: 5a63c33 ("selftests/bpf: Support cross-endian building")
Signed-off-by: Tony Ambardar <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
The `readlink(path, buf, sizeof(buf))` call reads at most sizeof(buf)
bytes and *does not* append null-terminator to buf. With respect to
that, fix two pieces in get_fd_type:

1. Change the truncation check to contain sizeof(buf) rather than
   sizeof(path).
2. Append null-terminator to buf.

Reported by Coverity.

Signed-off-by: Viktor Malik <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Reviewed-by: Quentin Monnet <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
…nction

In commit 1611603 ("bpf: Create argument information for nullable arguments."),
it introduced a "__nullable" tagging at the argument name of a
stub function. Some background on the commit:
it requires to tag the stub function instead of directly tagging
the "ops" of a struct. This is because the btf func_proto of the "ops"
does not have the argument name and the "__nullable" is tagged at
the argument name.

To find the stub function of a "ops", it currently relies on a naming
convention on the stub function "st_ops__ops_name".
e.g. tcp_congestion_ops__ssthresh. However, the new kernel
sub system implementing bpf_struct_ops have missed this and
have been surprised that the "__nullable" and the to-be-landed
"__ref" tagging was not effective.

One option would be to give a warning whenever the stub function does
not follow the naming convention, regardless if it requires arg tagging
or not.

Instead, this patch uses the kallsyms_lookup approach and removes
the requirement on the naming convention. The st_ops->cfi_stubs has
all the stub function kernel addresses. kallsyms_lookup() is used to
lookup the function name. With the function name, BTF can be used to
find the BTF func_proto. The existing "__nullable" arg name searching
logic will then fall through.

One notable change is,
if it failed in kallsyms_lookup or it failed in looking up the stub
function name from the BTF, the bpf_struct_ops registration will fail.
This is different from the previous behavior that it silently ignored
the "st_ops__ops_name" function not found error.

The "tcp_congestion_ops", "sched_ext_ops", and "hid_bpf_ops" can still be
registered successfully after this patch. There is struct_ops_maybe_null
selftest to cover the "__nullable" tagging.

Other minor changes:
1. Removed the "%s__%s" format from the pr_warn because the naming
   convention is removed.
2. The existing bpf_struct_ops_supported() is also moved earlier
   because prepare_arg_info needs to use it to decide if the
   stub function is NULL before calling the prepare_arg_info.

Cc: Tejun Heo <[email protected]>
Cc: Benjamin Tissoires <[email protected]>
Cc: Yonghong Song <[email protected]>
Cc: Amery Hung <[email protected]>
Signed-off-by: Martin KaFai Lau <[email protected]>
Reviewed-by: Amery Hung <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
Some tests can't be run in parallel because they use same namespace
names or veth names.

Create an helper that appends the thread ID to a given string. 8
characters are used for it (7 digits + '\0')

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
Signed-off-by: Martin KaFai Lau <[email protected]>
Link: https://patch.msgid.link/[email protected]
IP_CMD_MAX_LEN and NS_SUFFIX_LEN aren't used anywhere.

Remove these unused defines

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
Signed-off-by: Martin KaFai Lau <[email protected]>
Link: https://patch.msgid.link/[email protected]
check_ping() directly returns a SYS_NOFAIL without any previous
treatment. It's called only once in the file and hardcodes the used
namespace and ip address.

Replace check_ping() with a direct call of SYS_NOFAIL in the test.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
Signed-off-by: Martin KaFai Lau <[email protected]>
Link: https://patch.msgid.link/[email protected]
In the struct veth_configuration, the next_veth string is used to tell
the next virtual interface to which packets must be redirected to. So it
has to match the local_veth string of an other veth_configuration.

Change next_veth type to int to avoid handling two identical strings.
This integer is used as an offset in the network configuration table.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
Signed-off-by: Martin KaFai Lau <[email protected]>
Link: https://patch.msgid.link/[email protected]
configure_network() does two things : it first creates the network
topology and then configures the BPF maps to fit the test needs. This
isn't convenient if we want to re-use the same network topology for
different test cases.

Rename configure_network() create_network().
Move the BPF configuration to the test itself.
Split the test description in two parts, first the description of the
network topology, then the description of the test case.
Remove the veth indexes from the ASCII art as dynamic ones are used

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
Signed-off-by: Martin KaFai Lau <[email protected]>
Acked-by: Stanislav Fomichev <[email protected]>
Link: https://patch.msgid.link/[email protected]
The network topology is held by the config[] table. This 'config' name
is a bit too generic if we want to add other configuration variables.

Rename config[] to net_config[].

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
Signed-off-by: Martin KaFai Lau <[email protected]>
Acked-by: Stanislav Fomichev <[email protected]>
Link: https://patch.msgid.link/[email protected]
The BPF program attached to each veth is hardcoded through the
use of the struct skeletons. It prevents from re-using the initialization
code in new test cases.

Replace the struct skeletons by a bpf_object table.
Add a struct prog_configuration that holds the name of BPF program to
load on a given veth pair.
Use bpf_object__find_program_by_name() / bpf_xdp_attach() API instead of
bpf_program__attach_xdp() to retrieve the BPF programs from their names.
Detach BPF progs in the cleanup() as it's not automatically done by this
API.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
Signed-off-by: Martin KaFai Lau <[email protected]>
Acked-by: Stanislav Fomichev <[email protected]>
Link: https://patch.msgid.link/[email protected]
XDP flags are hardcoded to 0 at attachment.

Add flags attributes to the struct prog_configuration to allow flag
modifications for each test case.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
Signed-off-by: Martin KaFai Lau <[email protected]>
Acked-by: Stanislav Fomichev <[email protected]>
Link: https://patch.msgid.link/[email protected]
The network namespaces and the veth used by the tests have hardcoded
names that can conflict with other tests during parallel runs.

Use the append_tid() helper to ensure the uniqueness of these names.
Use the static network configuration table as a template on which
thread IDs are appended in each test.
Set a fixed size to remote_addr field so the struct veth_configuration
can also have a fixed size.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
Signed-off-by: Martin KaFai Lau <[email protected]>
Link: https://patch.msgid.link/[email protected]
The XDP redirection is tested without any flag provided to the
xdp_attach() function.

Add two subtests that check the correct behaviour with
XDP_FLAGS_{DRV/SKB}_MODE flags

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
Signed-off-by: Martin KaFai Lau <[email protected]>
Acked-by: Stanislav Fomichev <[email protected]>
Link: https://patch.msgid.link/[email protected]
…t_progs'

Bastien Curutchet says:

====================
This patch series continues the work to migrate the *.sh tests into
prog_tests framework.

test_xdp_redirect_multi.sh tests the XDP redirections done through
bpf_redirect_map().

This is already partly covered by test_xdp_veth.c that already tests
map redirections at XDP level. What isn't covered yet by test_xdp_veth is
the use of the broadcast flags (BPF_F_BROADCAST or BPF_F_EXCLUDE_INGRESS)
and XDP egress programs.

This series is the prep work that will be followed up adding test cases
to eventually cover the tests done in test_xdp_redirect_multi.sh:
 - PATCH 1 Add an helper to generate unique names
 - PATCH 2 to 9 rework test_xdp_veth to make it more generic and allow to
   configure different test cases
 - PATCH 10 adds test cases for 'classic' bpf_redirect_map()
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Martin KaFai Lau <[email protected]>
Previously, we were trying to extract constant map keys for all
bpf_map_lookup_elem(), regardless of map type. This is an issue if the
map has a u64 key and the value is very high, as it can be interpreted
as a negative signed value. This in turn is treated as an error value by
check_func_arg() which causes a valid program to be incorrectly
rejected.

Fix by only extracting constant map keys for relevant maps. See next
commit for an example via selftest.

Reported-by: Marc Hartmayer <[email protected]>
Reported-by: Ilya Leoshkevich <[email protected]>
Tested-by: Marc Hartmayer <[email protected]>
Test that very high constant map keys are not interpreted as an error
value by the verifier. This would previously fail.
Refactor get_constant_map_key() to disambiguate the constant key
value from potential error values. In the case that the key is
negative, it could be confused for an error.

It's not currently an issue, as the verifier seems to track s32 spills
as u32. So even if the program wrongly uses a negative value for an
arraymap key, the verifier just thinks it's an impossibly high value
which gets correctly discarded.

Refactor anyways to make things cleaner and prevent potential future
issues.
@danobi danobi closed this Feb 1, 2025
@danobi danobi deleted the nullness_regression-ci branch February 1, 2025 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants