-
Notifications
You must be signed in to change notification settings - Fork 201
Added initial Rust codegen-meta implementation. #403
Added initial Rust codegen-meta implementation. #403
Conversation
lib/codegen-meta/src/base/mod.rs
Outdated
@@ -0,0 +1,3 @@ | |||
//! Definitions for the base Cretonne language. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noticed a few instances of the name "Cretonne" snuck through, I can rectify these.
process::exit(1); | ||
}); | ||
|
||
if let Err(err) = meta::gen_types::generate("new_types.rs", &out_dir) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This specifically would be the line that changes when we want to have the codegen-meta crate emit a types.rs
file instead, and replace the version currently being emitted by the Python code in codegen/meta/
😄
Build isn't quite done yet, but I noticed these problems in the 1.25 build. I can fix these today, should be fairly straightforward.
As for the nightly job failing, I'm much less sure how to fix that:
Any other feedback on this PR in general would be very appreciated though :) |
lib/codegen-meta/src/base/types.rs
Outdated
} | ||
|
||
pub struct IntIterator { | ||
index: usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could just be an u8.
lib/codegen-meta/src/base/types.rs
Outdated
3 => Some(Int::I64), | ||
_ => None, | ||
}; | ||
self.index += 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about release mode overflow? (Call .next()
usize::MAX
times and you get I8
again) You could replace the _ => None,
with _ => return None,
to prevent this.
Thanks @bjorn3! Didn't know about that early return trick for iterators, but that makes a lot of sense. That's cleaned up now 👍 Added a few other fixes, including fmt::Debug implementations for the different ValueType variants, to give the same output as the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! Here are some comments from an initial read through the code.
lib/codegen-meta/Cargo.toml
Outdated
authors = ["The Cranelift Project Developers"] | ||
version = "0.15.0" | ||
description = "DSL for cranelift-codegen code generator library" | ||
license = "Apache-2.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cranelift's license is now "Apache-2.0 WITH LLVM-exception".
lib/codegen-meta/src/cdsl/types.rs
Outdated
type Item = LaneType; | ||
fn next(&mut self) -> Option<Self::Item> { | ||
if let b @ Some(_) = self.bool_iter.next() { | ||
b.map(LaneType::from) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be written as
if let Some(b) = self.bool_iter.next() {
LaneType::from(b)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the method returns an option, this could also rewritten as:
if let Some(b) = self.bool_iter.next() {
Some(LaneType::from(b))
I can write it that way, so it isn't quite as opaque :)
lib/codegen-meta/src/cdsl/mod.rs
Outdated
/// Check if `x` is a power of two. | ||
fn _is_power_of_two(x: u8) -> bool { | ||
x > 0 && x & (x - 1) == 0 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know the Python code has this, but for Rust we should use the standard library's is_power_of_two
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, should I just go ahead and remove both this and the next_power_of_two
functions from this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
lib/codegen-meta/src/cdsl/mod.rs
Outdated
} | ||
|
||
res + 1 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for next_power_of_two
.
lib/codegen-meta/Cargo.toml
Outdated
name = "cranelift-codegen-meta" | ||
authors = ["The Cranelift Project Developers"] | ||
version = "0.15.0" | ||
description = "DSL for cranelift-codegen code generator library" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a minor point, I'm thinking I'd like to move away from "DSL" terminology in general. It feels like part of the problem with the Python code is the DSL approach, which leans a little toward being its own specialized world and a little away from being idiomatic Python code. This makes working within the system more streamlined, but it also makes debugging the system and changing how the system works harder.
So how about "Metaprogram for cranelift-codegen code generator library"?
lib/codegen-meta/src/cdsl/types.rs
Outdated
} | ||
} | ||
|
||
/// Return the name of this type for other Rust source files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Rust" is a little less unambiguous now that the metaprogram is Rust too :-). I think it's enough to say "generated Rust source files" in the comment here.
lib/codegen-meta/src/cdsl/types.rs
Outdated
|
||
/// Get the name of this vector type. | ||
pub fn name(&self) -> String { | ||
format!("{}X{}", self.base.name(), self.lanes,) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a lower-case 'x' in the python code, which I think is a little easier to read.
lib/codegen-meta/src/cdsl/types.rs
Outdated
pub fn number(&self) -> u8 { | ||
let b = f64::from(self.base.number()); | ||
let l = (self.lanes as f64).log2(); | ||
let num = 16_f64 * l + b; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do this computation without using floating point? I guess it's a little more work because we need a helper:
fn floor_log2(x: u64) -> u32 {
63 - x.leading_zeros()
}
but it would mean that we won't need to add an exception for the float_arithmetic
clippy lint that we enable in some configurations.
Also, we should give the "16" constant a name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this a little bit. After re-reading this comment I think I had an idea:
// Vector types are encoded with the lane type in the low 4 bits and log2(lanes)
// in the high 4 bits, giving a range of 2-256 lanes.
The reason we are calculating 16 * log_2(number_of_lanes)
is to shift that value into the top four bits. Maybe it would be more legible if we used <<
instead? It's probably also worth reiterating that numbering comment above the VectorType's number method too.
lib/codegen-meta/src/gen_types.rs
Outdated
//! `lib/codegen/ir/types.rs`. The file provides constant definitions for the | ||
//! most commonly used types, including all of the scalar types. | ||
//! | ||
//! This ensures that Python and Rust use the same type numbering. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ensures that the metaprogram and the generated program see the same type numbering :-).
/// Given a multi-line string, split it into a sequence of lines after | ||
/// stripping a common indentation. This is useful for strings defined with | ||
/// doc strings. | ||
fn parse_multiline(s: &str) -> Vec<String> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was initially concerned that this is too Python-specific, but looking at the way Rust handles multi-line literals, we may indeed want to write Rust code for this in much the way we write Python:
fmt.doc_comment("
An instruction format
Every opcode has a corresponding instruction format
which is represented by both the `InstructionFormat`
and the `InstructionData` enums.
")
which would then want the same logic of stripping common leading whitespace, and so on. Does that sound right?
lib/codegen-meta/Cargo.toml
Outdated
authors = ["The Cranelift Project Developers"] | ||
version = "0.15.0" | ||
description = "DSL for cranelift-codegen code generator library" | ||
license = "Apache-2.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, please add a copy of the LICENSE file under lib/codegen-meta (this is a recent change).
Added a series of commits to address the various points in your notes :) I wasn't sure about the (Update: Checked through the travis failures, I can fix the 1.25 issues today.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, with just one stylistic comment below. Also, there's a minor merge conflict with a version number change in one of the Cargo.toml files.
I'm thinking once that's squared away, I'd like to merge this. I expect we'll stick with new_types.rs
for a little while yet, so we can get some experience with build-time crates in various settings, but if things go smoothly, then we can think about flipping that switch too :-).
lib/codegen-meta/src/error.rs
Outdated
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { | ||
match self { | ||
&ErrorInner::Msg(ref s) => write!(f, "{}", s), | ||
&ErrorInner::IoError(ref e) => write!(f, "{}", e), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and elsewhere, you can change this to match *self
, and then you don't need a &
on the front of each match arm. This also makes it more consistent with the rest of the code in Cranelift.
5663f1e
to
46d4bb1
Compare
Took care of the style issue mentioned :) Sounds good! I'll leave the generated file as 'new_types.rs`. |
Thanks! I'm excited about where we'll go with this :-) |
Me too! It was a lot of fun to work on this. If you have any ideas for next steps, I'd love to try and help out with some of that as well! We can talk about that back in #342 😄 |
This PR contains the first steps towards implementing the DSL in Rust, as previously discussed in #342. This intends to replace the
gen_types.py
file, and will emit a file namednew_types.rs
in the same directory as the existingtypes.rs
file, when thecodegen
crate is built.Thanks for the mentoring @sunfishcode! Had a lot of fun putting this together.