Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add function_arguments_typed to return alloy Vec<DynSolType> #4

Closed
wants to merge 8 commits into from

Conversation

wtdcode
Copy link
Contributor

@wtdcode wtdcode commented Jul 4, 2024

As title, this adds a new function function_arguments_typed to return Vec<DynSolType>, which makes it much easier to use because parsing string returned by function_arguments is really a pain.

I ran the benchmark locally, and I didn't notice obvious performance regression. I compared the new implementation against evmole-py and no errors are found.

dataset largest1k (1000 contracts, 24427 functions), evmole-rs:
  time: 1.5s
  bad:  0 functions 0.00%
  good: 24427 functions (100.00%)
  errors:
dataset random50k (50000 contracts, 1171102 functions), evmole-rs:
  time: 51.7s
  bad:  0 functions 0.00%
  good: 1171102 functions (100.00%)
  errors:
dataset vyper (780 contracts, 21244 functions), evmole-rs:
  time: 1.2s
  bad:  0 functions 0.00%
  good: 21244 functions (100.00%)
  errors:

Note the comparison is based on string so this even doesn't break compatibility.

This PR also includes a few fixes.

@cdump
Copy link
Owner

cdump commented Jul 6, 2024

Thanks for PR, I like the idea and will take a look after the EthCC week (after 15 July)

@@ -368,9 +368,27 @@ where
Err(UnsupportedOpError { op }.into())
} else {
let mut data: Vec<u8> = vec![0; size];
let code_len = self.code.len();

let (from_code_length, _) = if src_off + size < code_len {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Why are you returning two values from this if, but always ignoring the second one? Could it be simplified to just the 1st value?
  2. I think a small change to the original code fits better here:
let code_len = self.code.len();
if src_off < code_len {
    let n = std::cmp::min(size, code_len - src_off);
    data[0..n].copy_from_slice(&self.code[src_off..src_off + n]);
}

wdyt?

Copy link
Contributor Author

@wtdcode wtdcode Jul 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I can't remember exactly the reason because I was fixing this in a midnight to solve evmole crashing with a random unverified contract. Anyway, you are right, it can be simplified.
  2. This one is shorter but harder to understand and maintain. I would like to leave it clearly and explicitly as current implementation so that the thing in point 1 above won't happen next time =). I can also leave a few more comments about the rationale of CODECOPY.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's review it again after (1) simplification; it will probably really be much easier to read and understand

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed if src_off < code_len { to the master branch just to make master work while we are here. Feel free to rewrite it according to your proposal if you want.

function_arguments_typed(code, selector, gas_limit)
.into_iter()
.map(|t| t.sol_type_name().to_string())
.join(",")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to keep external deps as minimal as possible now, so I suggest using this variant instead of "itertools":

  .map(|t| t.sol_type_name().to_string())
  .collect::<Vec<String>>()
  .join(",")

We'll have additional allocation, but we can accept that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, though I think rustc should be smart enough to optimize this.

rust/Cargo.toml Outdated
@@ -11,7 +11,10 @@ repository = "https://github.com/cdump/evmole"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
ruint = "1.11.1"
alloy = {version = "0.1", features = ["full"]}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. We need only alloy-dyn-abi = {version = "0.7.7" }, not the full alloy package
  2. itertools is not required, see the review comment below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, just my bad habits when playing with the code.

args.mark_not_bool(&path, offset);
}

StepResult{op: op::AND, fa: Some(Element{label: Some(Label::Arg(Val{offset, path, add_val, and_mask: None})), ..}), sa: Some(ot), ..}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I find that the old format for match looks better, but I was unable to configure rust-fmt to keep only that part untouched. So it's okay; I'll probably revert this part by hand after merging or think about another way to configure rust-fmt properly.

selector: &Selector,
gas_limit: u32,
) -> Vec<DynSolType> {
trace!(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this change.
These log messages are not intended to be used by library users; they're just for me (and other developers who need to debug evmole internals).
Logging from libraries is quite a controversial issue in itself, which I think should be used very carefully and with at least providing more context about the log message origin. But in this case, I'm sure it will be better to stick with the old feature=trace and println!.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I can understand it's different taste and I'm neutral with it. Will revert to previous trace feature.

rust/src/lib.rs Outdated
@@ -3,6 +3,7 @@
//! Accuracy and speed comparison with other tools, as well as Python and JavaScript implementations, are available on [GitHub](https://github.com/cdump/evmole/tree/master#benchmark).

pub use arguments::function_arguments;
pub use arguments::function_arguments_typed;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about function_arguments_alloy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are okay with both variants, let me think for some time about _typed vs _alloy, just want to make sure that we will choose the best option for public API

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think function_arguments_alloy is better

};

match self.tinfo {
Some(InfoVal::Array(_)) => format!("{}[]", c),
Some(InfoVal::Array(_)) => {
if q.len() == 1 {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe

                vec![DynSolType::Array(Box::new(
                    if q.len() == 1 {
                        q[0].clone()
                    } else {
                        DynSolType::Tuple(q)
                    }
                ))]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay.

})
.flatten()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From rust clippy:

warning: called `map(..).flatten()` on `Iterator`
  --> src/arguments.rs:89:14
   |
89 |               .map(|k| {
   |  ______________^
90 | |                 self.children
91 | |                     .get(&k)
92 | |                     .map_or(vec![DynSolType::Uint(256)], |val| val.to_alloy_type(false))
93 | |                     .into_iter()
94 | |             })
95 | |             .flatten()
   | |______________________^
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#map_flatten
   = note: `#[warn(clippy::map_flatten)]` on by default
help: try replacing `map` with `flat_map` and remove the `.flatten()`
   |
89 ~             .flat_map(|k| {
90 +                 self.children
91 +                     .get(&k)
92 +                     .map_or(vec![DynSolType::Uint(256)], |val| val.to_alloy_type(false))
93 +                     .into_iter()
94 +             })

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't notice this, will fix.

@cdump
Copy link
Owner

cdump commented Aug 24, 2024

parsing string returned by function_arguments is really a pain

Wrapping the result in () and extracting Vec<DynSolType> from results looks not so ugly:

use  alloy_dyn_abi::DynSolType;

fn main() {
    let code = hex::decode("6080604052348015600e575f80fd5b50600436106030575f3560e01c80632125b65b146034578063b69ef8a8146044575b5f80fd5b6044603f3660046046565b505050565b005b5f805f606084860312156057575f80fd5b833563ffffffff811681146069575f80fd5b925060208401356001600160a01b03811681146083575f80fd5b915060408401356001600160e01b0381168114609d575f80fd5b80915050925092509256").unwrap();

    let a = evmole::function_arguments(&code, &[0x21, 0x25, 0xb6, 0x5b], 0);
    println!("{}", a);

    if let Ok(DynSolType::Tuple(v)) = DynSolType::parse(&format!("({})", &a)) {
        println!("{:?}", v); // type = Vec<DynSolType>
    }
}
uint32,address,uint224
[Uint(32), Address, Uint(224)]

@wtdcode
Copy link
Contributor Author

wtdcode commented Aug 24, 2024

parsing string returned by function_arguments is really a pain

Wrapping the result in () and extracting Vec<DynSolType> from results looks not so ugly:

use  alloy_dyn_abi::DynSolType;

fn main() {
    let code = hex::decode("6080604052348015600e575f80fd5b50600436106030575f3560e01c80632125b65b146034578063b69ef8a8146044575b5f80fd5b6044603f3660046046565b505050565b005b5f805f606084860312156057575f80fd5b833563ffffffff811681146069575f80fd5b925060208401356001600160a01b03811681146083575f80fd5b915060408401356001600160e01b0381168114609d575f80fd5b80915050925092509256").unwrap();

    let a = evmole::function_arguments(&code, &[0x21, 0x25, 0xb6, 0x5b], 0);
    println!("{}", a);

    if let Ok(DynSolType::Tuple(v)) = DynSolType::parse(&format!("({})", &a)) {
        println!("{:?}", v); // type = Vec<DynSolType>
    }
}
uint32,address,uint224
[Uint(32), Address, Uint(224)]

This doesn’t work for certain types string returned by evmole.

@cdump
Copy link
Owner

cdump commented Aug 24, 2024

Do you have any examples? This is probably a bug inside evmole that I need to fix independently of this PR

@wtdcode
Copy link
Contributor Author

wtdcode commented Aug 24, 2024

Do you have any examples? This is probably a bug inside evmole that I need to fix independently of this PR

This is generally related to the nested types, where the string returned by evmole is not accepted by alloy, though the abi is correct by my manual checking. I don't have exact example for that because this happens pretty frequently when I decoded more than 50M onchain contracts across various evm-compatible chains.

@wtdcode
Copy link
Contributor Author

wtdcode commented Aug 24, 2024

I have resolved the conflicts and updated according to reviews. I just forgot this PR =).

@wtdcode
Copy link
Contributor Author

wtdcode commented Aug 24, 2024

Also add a testcase for previous crashing bytecode.

cdump pushed a commit that referenced this pull request Aug 24, 2024
@cdump
Copy link
Owner

cdump commented Aug 24, 2024

the string returned by evmole is not accepted by alloy

I've tested it on all mainnet contracts and didn't find any errors; that's strange.

@cdump
Copy link
Owner

cdump commented Aug 24, 2024

Merged to master with reverting some code-format changes: 2ee8495

@cdump cdump closed this Aug 24, 2024
@wtdcode
Copy link
Contributor Author

wtdcode commented Aug 24, 2024

the string returned by evmole is not accepted by alloy

I've tested it on all mainnet contracts and didn't find any errors; that's strange.

I would assume it's due to some updates, either alloy or emvole? I notice this breakage after one cargo update. But anyway, thanks for merging this =).

@wtdcode
Copy link
Contributor Author

wtdcode commented Aug 24, 2024

Merged to master with reverting some code-format changes: 2ee8495

You may want a custom rustfmt style? I just run cargo fmt automatically.

@cdump
Copy link
Owner

cdump commented Aug 24, 2024

due to some updates, either alloy or emvole

maybe, but anyway function returning Vec is better

want a custom rustfmt style

I just don't like rustfmt's style for match cases; I find the current version more readable. For other parts, rustfmt is okay. Contributors should use it by default, and I will either roll back the match-case style, find another way to configure it, or convince myself that rustfmt's style is also acceptable.

Thanks for PR :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants