Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

include: util: Add memeq and bool_memcmp #84159

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

theob-pro
Copy link
Contributor

Those two functions are used to compare memory area. bool_memcmp is basically memcmp but return a bool instead of an int. memeq compare the length of the memory area in addition to the bytes themselves.

@pdgendt
Copy link
Collaborator

pdgendt commented Jan 17, 2025

I don't see the benefit of adding these, sorry.

@theob-pro
Copy link
Contributor Author

I don't see the benefit of adding these, sorry.

No worries, it may not be useful for everyone 😀
More context on the discussion I had with @Thalley on this PR: #84018

@pdgendt
Copy link
Collaborator

pdgendt commented Jan 17, 2025

No worries, it may not be useful for everyone 😀 More context on the discussion I had with @Thalley on this PR: #84018

What I mean was that I don't see any added value of introducing what basically is an alias...

if (memeq(buf1, size1, buf2, size2)) { /* ... */ }
/* vs */
if (size1 == size2 && memcmp(buf1, buf2, size1) == 0) { /* ... */ }

@andyross
Copy link
Contributor

Tend to agree with @pdgendt, we don't want to be pushing APIs that compete with well-established controlling idioms (like the ISO C standard memcmp() in this case, but we take stuff from Linux too).

@theob-pro
Copy link
Contributor Author

What I mean was that I don't see any added value of introducing what basically is an alias...

Maybe I am missing something but that file is full of aliases... I don't see the problem with adding new ones

Tend to agree with @pdgendt, we don't want to be pushing APIs that compete with well-established controlling idioms (like the ISO C standard memcmp() in this case, but we take stuff from Linux too).

I struggle to understand the reasoning about APIs that compete, sorry, could you develop? And those function are just more convenient to use. Nobody is forced to use them 😅

@pdgendt
Copy link
Collaborator

pdgendt commented Jan 17, 2025

Maybe I am missing something but that file is full of aliases... I don't see the problem with adding new ones

I disagree. The file, in my opinion, provides common functions people tend to get wrong or inefficient implementations of.

I struggle to understand the reasoning about APIs that compete, sorry, could you develop? And those function are just more convenient to use. Nobody is forced to use them 😅

Nobody is forced to use them, but suddenly memeq is almost globally defined as zephyr/sys/util.h will likely be included in some way or another.

@theob-pro
Copy link
Contributor Author

I disagree. The file, in my opinion, provides common functions people tend to get wrong or inefficient implementations of.

Sorry but I must insist, there is an alias for is_power_of_two, macro for IN_RANGE, MIN and MAX, a macro for casting a pointer to an unsigned integer. Those functions and macro are there for quality of life, for making code more readable in the end.

Nobody is forced to use them, but suddenly memeq is almost globally defined as zephyr/sys/util.h will likely be included in some way or another.

Sorry, I don't understand, what is the problem here? It will not be included in people code if they don't use it

@Thalley
Copy link
Collaborator

Thalley commented Jan 20, 2025

I disagree. The file, in my opinion, provides common functions people tend to get wrong or inefficient implementations of.

Sorry but I must insist, there is an alias for is_power_of_two, macro for IN_RANGE, MIN and MAX, a macro for casting a pointer to an unsigned integer. Those functions and macro are there for quality of life, for making code more readable in the end.

Nobody is forced to use them, but suddenly memeq is almost globally defined as zephyr/sys/util.h will likely be included in some way or another.

Sorry, I don't understand, what is the problem here? It will not be included in people code if they don't use it

I agree with @theob-pro here.

What I mean was that I don't see any added value of introducing what basically is an alias...

If you apply that argument to the existing util.h or util_macro.h, then basically none of the macros or functions should be there. Their whole point is being a semantic alias for common operations that makes it easier for the reader to read the code. Even the macro you added yourself (#66744) wouldn't have been added if you applied that argument.

*
* @returns true if the @p n first bytes of @p m1 and @p m2 are the same, else false
*/
static inline bool bool_memcmp(const void *m1, const void *m2, size_t n)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike the naming of this function.

IMO memeq should be memcmp(m1, m2, n) == 0; and the function that implements the size check too, that should call this function, should have the longer name. Maybe memeqlen?

We could also apply longer names to avoid clashes or confusion with the standard libraries.

How about mem_is_equal and mem_is_equal_with_len?

Copy link
Collaborator

@alwa-nordic alwa-nordic Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be the other way around. I think the default function for comparing two memory slices should allow slices of differing lengths and include the length check, just like it is defined in "most" programming languages (usually called ==). That is: It determines two memory slices to be equal if and only if both their lengths and contents are the same. It determines two memory slices of different lengths as different always, and two slices of zero length as equal always. This is true for C++ (vector ==), Python (bytearray ==), Go (slices.Equal), Rust (Slice ==) and pretty much everything else.

Looking at the man page for memcmp, I conclude that memcmp is defined as taking two memory slices that are known to be of equal length. So as an optimization, it takes only one length argument. (The result of memcmp is the ordering of the two input slices.) So, I argue, memcmp taking one length argument is the special case. The programmer needs to make sure the slices are of equal lengths.

Copy link
Collaborator

@alwa-nordic alwa-nordic Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To illustrate what I mean by saying memcmp taking one length argument is a special case: We can extend the notion of ordering for memory so that it matches strcmp, the shorter memory slice should be ordered first. So the general comparison can be defined as full_memcmp(s1, l1, s2, l2) := l2 - l1 || memcmp(s1, s2, l1).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the man page for memcmp, I conclude that memcmp is defined as taking two memory slices that are known to be of equal length.

The manpage of memcmp specifically states that it compares the first n octets; n here is not a reference to either of the inputs' sizes. This is also the reason why I think a single argument is better since it effectively calls memcmp, as it should only compare the first n octets where n may be equal to or smaller than either of the sizes.

For the Python, C++, etc. reference those are reference to objects which are semantically different than "just memory".

As pointed out earlier, the default case is comparing memory of equal size, as evident by the Zephyr code base. If 9/10 times are only comparing the first n octets, and not the sizes, then that should be the default case. Comparing the size before calling memcmp is the atypical case.

If we were discussing a specific implementation, e.g. a net_buf where we would have a net_buf_equal(const struct net_buf *a, const struct net_buf *b) then I agree that it would make sense to check the length fields first by default.

In any case, this is not a hill I'll die on, so I'm not going to "fight" for one more than the other, but I do think that the simpler and more common case should be the "default".

We could also do something like mem_eq with just a single length field and then obj_eq that takes the size of each "object".

@pdgendt
Copy link
Collaborator

pdgendt commented Jan 20, 2025

Sorry, I don't understand, what is the problem here? It will not be included in people code if they don't use it

I agree with @theob-pro here.

This maybe is less of an issue, but still valid to point out that if anyone uses memeq as a name in downstream code, this will now trigger a compiler warning/error if -Wshadow is active.

What I mean was that I don't see any added value of introducing what basically is an alias...

If you apply that argument to the existing util.h or util_macro.h, then basically none of the macros or functions should be there. Their whole point is being a semantic alias for common operations that makes it easier for the reader to read the code. Even the macro you added yourself (#66744) wouldn't have been added if you applied that argument.

Fair point, but there is a readability difference between the following IMO, especially if wrapped in some bigger macro declaration.

IF_DISABLED(CONFIG_SOME_OPTION, (bool my_option;))
/* vs */
COND_CODE_1(CONFIG_SOME_OPTION, (), (bool my_option;))

@Thalley
Copy link
Collaborator

Thalley commented Jan 20, 2025

This maybe is less of an issue, but still valid to point out that if anyone uses memeq as a name in downstream code, this will now trigger a compiler warning/error if -Wshadow is active.

I agree. I also suggest some naming that is less similar to stdlib in #84159 (comment). I think it's risky to name is memeq even if that is a good name (which is why it is also risky, as that name has been suggested to be added to stdlib and/or specific compilers over the years).

IMO all Zephyr-specific util functions should have a unique prefix that avoid clashes with other commonly named util functions/macros, e.g. with z_memeq or util_memeq, but that's an entirely different discussion :D

@theob-pro theob-pro force-pushed the add-memeq branch 2 times, most recently from cdace90 to ce79867 Compare January 20, 2025 12:06
alwa-nordic
alwa-nordic previously approved these changes Jan 20, 2025
Thalley
Thalley previously approved these changes Jan 20, 2025
Copy link
Collaborator

@Thalley Thalley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to include at least one use of these functions, i.e. replace one of the memcmp(..) == 0 cases with this as an illustration of the purposes of them

* @returns true if the @p n first bytes of @p m1 and @p m2 are the same, else
* false
*/
static inline bool z_util_memeq(const void *m1, const void *m2, size_t n)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

z_ shall not be used for any public APIs, it is a intended to be used for internal/private APIs only.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that documented anywhere?

never made it in the docs, https://github.com/zephyrproject-rtos/zephyr/pull/62198/files. z_ was added long time ago to replace _ usage in the tree, but unfortuantly it was seen by many as the zephyr designated prefix for APIs and we ended up with public APIs using z_....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something that must be documented. It seems that there is quite a bunch of public APIs now using the z_, is there good reasons to continue to have that hidden rule?

To me it seems that without proper documentation some more will definitely be added and then I don't see the point of blocking it here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nashif do you have alternative suggestions for naming?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just util_memeq?

@theob-pro
Copy link
Contributor Author

It would be nice to include at least one use of these functions, i.e. replace one of the memcmp(..) == 0 cases with this as an illustration of the purposes of them

I added two example usage in the Bluetooth subsystem.

@theob-pro theob-pro force-pushed the add-memeq branch 2 times, most recently from 72b1bff to 270b6f4 Compare January 21, 2025 09:12
@theob-pro theob-pro requested a review from nashif January 27, 2025 06:28
`z_util_eq` compare two memory areas and their length. It returns true
if they are equal, else false.

`z_util_memeq` the first n bytes of two memory areas. It returns true if
they are equal, else false.

Signed-off-by: Théo Battrel <[email protected]>
Add two unit tests for the newly added functions `z_util_eq` and
`z_util_memeq`.

Signed-off-by: Théo Battrel <[email protected]>
@theob-pro
Copy link
Contributor Author

Rebased on main to resolve conflict.

Update `bt_irk_eq` to use `z_util_memeq` instead of `memcmp` and the
"disconnect" BabbleSim test to use `z_util_eq` instead of a first
assertion on the size followed by a `memcmp`.

This is done as an example usage of the two new functions.

Signed-off-by: Théo Battrel <[email protected]>
* @returns true if the @p n first bytes of @p m1 and @p m2 are the same, else
* false
*/
static inline bool z_util_memeq(const void *m1, const void *m2, size_t n)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just util_memeq?

*/
static inline bool z_util_eq(const void *m1, size_t len1, const void *m2, size_t len2)
{
return len1 == len2 && z_util_memeq(m1, m2, len1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I recommend shorting the call to z_util_memeq with m1 == m2 || z_util_memeq(m1, m2, len)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Base OS Base OS Library (lib/os) area: Bluetooth Host Bluetooth Host (excluding BR/EDR) area: Bluetooth area: Testsuite Testsuite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants