Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added experimental ELF/Mach-O/PE compatibility using gimli/object. #74

Merged
merged 17 commits into from
Jul 10, 2018

Conversation

data-pup
Copy link
Member

@data-pup data-pup commented Jul 2, 2018

This PR aims to introduce the ability to parse subroutines from an ELF/Mach-O binary, using the gimli/object crates.

Fixes #69.

@data-pup data-pup requested a review from fitzgen July 2, 2018 20:46
@data-pup
Copy link
Member Author

data-pup commented Jul 2, 2018

Travis failed on the nightly's wasm job, with this error:

error: Could not compile `miniz-sys`.
warning: build failed, waiting for other jobs to finish...
error: build failed

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for diving into this @data-pup! 🎉

I think we can fix the wasm compilation issues with:

  • making a "dwarf" cargo feature for this DWARF/etc support

  • enabling the "dwarf" feature for the twiggy CLI, but disabling it for twiggy-wasm-api

A bunch of inline questions and nitpicks below, but overall this is looking super sharp.

Thanks again @data-pup !

twiggy-traits = { version = "=0.2.0", path = "../traits" }
typed-arena = "1.3.0"
twiggy-ir = { version = "0.2.0", path = "../ir" }
twiggy-traits = { version = "0.2.0", path = "../traits" }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a reason to remove the = bound on the version here? That helps us ensure that we are exactly matching twiggy-* crates together.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops! This was definitely accidental. I ran into some conflicts while rebasing this work before opening this PR, good eye! I'll fix that :)

parity-wasm = "0.28.0"
twiggy-ir = { version = "=0.2.0", path = "../ir" }
twiggy-traits = { version = "=0.2.0", path = "../traits" }
typed-arena = "1.3.0"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do typed-arena = "1.4" here

// Use the abbreviations to create an entries cursor, and move it to the root.
let abbrevs = self
.abbreviations(&debug_abbrev)
.expect("Could not find abbreviations");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should propagate errors with ? here, and maybe use failure::ResultExt::context if we want to provide an extra context message.

.abbreviations(&debug_abbrev)
.expect("Could not find abbreviations");
let mut die_cursor = self.entries(&abbrevs);
assert!(die_cursor.next_dfs().unwrap().is_some());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should assert! this -- if we get malformed input, that should result in an error returned and formatted to the user, not a panic.

ir/ir.rs Outdated
Scope(Scope),

/// Subroutine item.
Subroutine(Subroutine),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this intended to be different from Code? I would have expected parsing a DW_TAG_subroutine to create a Code IR item kind.

gimli::DW_TAG_member => None,
// Variant entries. (Section 5.7.10)
gimli::DW_TAG_variant => None,
gimli::DW_TAG_variant_part => None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should do an exclusive match here, because we explicitly don't care about most of these things. It would be easier to do:

match die.tag() {
    // Match things we care about, and do what we will do with them.
    gimli::DW_TAG_something_we_care_about => do_stuff(die),

    // And then just skip everything else.
    _ => return None,
}

/// This struct represents the extra items required by the Parse trait's
/// `parse_items` method. This is constructed by the compilation unit's
/// own implementation of `parse_items`.
pub struct DIEItemsExtra<'unit, R>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Rust style is to only capitalize the first letter of an acronym inside an identifier: DieItemsExtra

.entity_size(addr_size, dwarf_version, rnglists)?
.map(|size| ir::Item::new(id, name, size as u32, kind))
}
_ => None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we early return Ok(()) here, then item doesn't need to be an Option and we can avoid the if let Some below. This would be a nice little code clean up.


// Load the sections of the file containing debugging information.
let debug_abbrev: gimli::DebugAbbrev<_> = load_section(&arena, self, endian);
let _debug_aranges: gimli::DebugAranges<_> = load_section(&arena, self, endian);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we aren't using the section, then I don't think it makes sense to load it. We can just drop this line.

};

// Load the sections of the file containing debugging information.
let debug_abbrev: gimli::DebugAbbrev<_> = load_section(&arena, self, endian);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idle thought: it may eventually make sense to make all the sections be Extra parameters for this parse implementation, and then that way we wouldn't need to duplicate the loads of each section (which potentially means decompressing sections multiple times).

Definitely not something we need to worry about yet, however!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried taking a swing at that, and ended up running into some errors regarding unconstrained type parameters. I'm sure this is doable, it's just a little outside of my previous Rust knowledge. Would happily create an issue and follow up on that once the rest of the things here are taken care of, if you think that makes sense :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow up issue is fine :)

@data-pup
Copy link
Member Author

data-pup commented Jul 3, 2018

Thanks for the feedback! Figured I'd answer some of the questions above in a single comment.

Re: ir::ItemKind variants, the Type kind is definitely superfluous. That's a holdover from some earlier iterations of this work, I can take that out. Same goes for Scope, I can remove that as well. Your point about the Subroutine variant is also a good one, I can adjust the code so that this creates a Code item.

Thanks for the other points, I think those all make a lot of sense! I'll make some edits to take care of those issues, and push a new commit once that's ready. Thanks for the feeback 😄

@data-pup
Copy link
Member Author

data-pup commented Jul 6, 2018

Pushed some commits to address the various notes above. The point about features is still a little new to me however. I can read up on those tomorrow and see if I can take care of that detail. Would that be a feature in the parser/Cargo.toml file specifically?

Update: Took a swing at creating a "dwarf" feature, but it seemed to stall the build 😢 If you have any advice on what I may be doing wrong here, let me know! Haven't used Cargo's features functionality before.

ir::Item::new(id, name, size as u32, kind)
} else {
return Ok(());
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added early returns here, I think it helped clean things up a little. It's worth noting that the entity_size method here returns an Option because (if I understand correctly) not all DW_AT_subprogram DIEs are describing a definition. These DIEs don't include the same location attributes, in which case this method returns None.


[features]
default = ["dwarf"]
dwarf = ["fallible-iterator", "gimli", "object", "typed-arena"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this looks correct. Not sure why builds are hanging now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like maybe this was just a Travis CI infra issue -- I've restarted the job to see.

@@ -1,10 +1,12 @@
#[cfg(feature = "dwarf")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be easier to define a macro like:

macro_rules! if_dwarf {
    ( $( $x:item )* ) => {
        $(
            #[cfg(feature = "dwarf")]
            $x
        )*
    }
}

if_dwarf! {
    // ... definitions and things for when dwarf is enabled go in here ...
}

and then you could use it throughout the rest of the crate rather than adding the #[cfg(...)]s manually on every single definition.

@data-pup data-pup force-pushed the binary-compat branch 2 times, most recently from d98c96d to 8c9216f Compare July 6, 2018 17:57
@data-pup
Copy link
Member Author

data-pup commented Jul 10, 2018

I think I ended up finding a cleaner way around this. Because nothing in parser/object_parse would ever be used unless we've enabled the 'dwarf' feature, we can use the cfg attribute on that mod statement specifically. This keeps the various #[cfg(feature = "dwarf")] statements localized to parser.rs, and eliminates the need to have these scattered about in the object file parsing code.

Edit: One minor detail I did notice, is that because this work introduces a dependency on gimli to the traits crate, it'd be nice to only include that dependency if we're using the dwarf feature. Not totally sure how to do that atm, but I can cycle back on that once this does build successfully :)

@fitzgen
Copy link
Member

fitzgen commented Jul 10, 2018

Edit: One minor detail I did notice, is that because this work introduces a dependency on gimli to the traits crate, it'd be nice to only include that dependency if we're using the dwarf feature. Not totally sure how to do that atm, but I can cycle back on that once this does build successfully :)

The way to do this is to make gimli and object be optional dependencies, and have the dwarf feature enable them:

[features]
dwarf = ["gimli", "object"]

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great! Just one tiny nitpick left before we merge :)

@@ -13,6 +13,7 @@ path = "./traits.rs"

[dependencies]
failure = "0.1.1"
gimli = "0.16.0"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This dependency should be optional, and only enabled when the parser has the dwarf feature enabled.

traits/traits.rs Outdated
@@ -57,6 +58,15 @@ enum ErrorInner {

#[fail(display = "Regex error: {}", _0)]
Regex(#[cause] regex::Error),

#[fail(display = "Gimli error: {}", _0)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this variant should be cfged for when gimli is enabled.

traits/traits.rs Outdated
@@ -99,6 +110,14 @@ impl From<regex::Error> for Error {
}
}

impl From<gimli::Error> for Error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This is going to need the cfg too -- sorry I forgot to point it out earlier!)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem! Pardon, I should've seen that too. Fixing that now!

@fitzgen fitzgen merged commit 19076f9 into rustwasm:master Jul 10, 2018
@fitzgen
Copy link
Member

fitzgen commented Jul 10, 2018

🎉 🎈 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants