-
Notifications
You must be signed in to change notification settings - Fork 354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Addr::unchecked a const fn #869
Conversation
while i < input_bytes.len() { | ||
data[i] = input_bytes[i]; | ||
i += 1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while i < input_bytes.len() { | |
data[i] = input_bytes[i]; | |
i += 1; | |
} | |
data[..input_bytes.len()].clone_from_slice(input.as_bytes()); |
I think we can do this here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we are very limited in what is available because it must be const fn
compatible. We can't even rewrite the while
loop to a for i in 0...input_bytes.len()
loop. clone_from_slice
is not a const fn
, so it cannot be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about optimizing it, I looked at the implementation of eg. https://doc.rust-lang.org/core/ptr/fn.copy_nonoverlapping.html but that just point to rust intrinsics, so I guess this is the best we can do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow – seems like we can use copy_nonoverlapping
directly because it became const (rust-lang/rust#79684). For some reason this is not shown in the docs yet. Will try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting approach. This does obviate the need for AddrRef, but at the same time creates a rather large array that must be copied every time we pass it through a function.
I am unsure of the benefits of this over using the AddrRef
implementation I had, making AddrRef::unchecked(&str)
a const fn
and "promoting it" via Addr::from(addrRef)
. That would make the clone (copy data to heap) explicit and work well with the lifetimes we have in storage-plus.
We could then use AddrRef::unchecked("name")
in all the test code and just name.into()
when needed (no more unchecked). Can you explain the benefits of this approach?
#[derive(Serialize, Deserialize, Clone, Debug, PartialEq, Eq, Hash, JsonSchema)] | ||
pub struct Addr(String); | ||
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash, JsonSchema)] | ||
pub struct Addr(#[schemars(with = "String")] AddrInner); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two level nesting here is for schemars output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eaxtly. If we find a better way to ensure it gets a "string" schema, we can simplify at any time because all the internals are private.
where | ||
E: de::Error, | ||
{ | ||
if v.len() > ADDR_MAX_LENGTH { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice to assert this here.
Should we assert this in unchecked
as well? Or would that violate the const fn
(you could just panic there, but it would ensure nothing gets chopped off by accident)
let mut data = [0u8; ADDR_MAX_LENGTH]; | ||
let mut i = 0; | ||
while i < input_bytes.len() { | ||
data[i] = input_bytes[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can buffer overflow if we don't have a length assertion for input
@@ -93,25 +166,25 @@ impl PartialEq<Addr> for String { | |||
|
|||
impl From<Addr> for String { | |||
fn from(addr: Addr) -> Self { | |||
addr.0 | |||
addr.as_str().to_owned() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also call this to_string
? Sometimes it is nice to have explicit names (as_str
, to_string
) and not just from
and into
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to_string
is already implemented via the Display
trait implementation. This implies a ToString
implementation.
} | ||
} | ||
|
||
impl From<&Addr> for String { | ||
fn from(addr: &Addr) -> Self { | ||
addr.0.clone() | ||
addr.as_str().to_owned() | ||
} | ||
} | ||
|
||
impl From<Addr> for HumanAddr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we remove these - let's not support HumanAddr with the new code (and if someone wants that can always do addr.as_str().into()
.
/// The maximum allowed size for bech32 (https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki#bech32) | ||
const ADDR_MAX_LENGTH: usize = 90; | ||
|
||
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to reflect on the performance changes of this. We should add some guidance on usage.
- Passing
Addr
as an argument copies 94 bytes, rather than a 4 byte pointer (like.clone()
of the past version, but no heap allocations). - Passing
&Addr
is cheap and can be restored toAddr
viacopy()
orclone()
. Addr
/&Addr.as_str()
is cheapAddr.to_string()
requires a heap allocation (not just consuming the original Addr).
With this in mind, I think we can recommend returning Addr
and passing in &Addr
to functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if we pass Addr
references to functions (which is the recommended approach in general), and only return Addr
, I think this would be OK, performance/memory-wise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to play around with this a bit next week. The size of Addr
is also my major (and only) concern here. But since we do not have huge amounts of addresses in memory at the same time and we have a well known size limit, I think it is fair to store the addresses on the stack.
We can always push them to heap by using Box<Addr>
. Then we get what we had before, just without the mutability of a String implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the problem you are trying to solve?
I feel this will just make my life harder and not solve the problem that motivated AddrRef
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the problem you are trying to solve?
Make using owned Addr easier, especially literals. Then in many places you suddenly get an Addr->&str conversion instead of the opposite direction.
I'm not claiming this solves all issues and it is well possible that it does not replace what AddrRef
does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make using owned Addr easier, especially literals.
I didn't have any issues there. And am afraid I will have to think way too much about unneeded copies of this struct if it happens. See #872 for another solution that provide simple literals for tests
while i < input_bytes.len() { | ||
data[i] = input_bytes[i]; | ||
i += 1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about optimizing it, I looked at the implementation of eg. https://doc.rust-lang.org/core/ptr/fn.copy_nonoverlapping.html but that just point to rust intrinsics, so I guess this is the best we can do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work.
We don't do that |
This allows you to use
Addr::unchecked("foo")
in all the places where you can use "foo", speciallyconst VARIABLES
. This uses the facts that:Addr
can have a constant size. The memory layout can safely be copied.Addr
is immutable. So it does not matter if we move or copy.