Added initial Rust codegen-meta implementation. #403

data-pup · 2018-07-16T19:08:22Z

This PR contains the first steps towards implementing the DSL in Rust, as previously discussed in #342. It is intended to replace the gen_types.py file. When the codegen crate is built, it emits a file named new_types.rs in the same directory as the existing types.rs file.

Thanks for the mentoring @sunfishcode! Had a lot of fun putting this together.

data-pup · 2018-07-16T19:10:02Z

lib/codegen-meta/src/base/mod.rs

@@ -0,0 +1,3 @@
+//! Definitions for the base Cretonne language.


Just noticed that a few instances of the name "Cretonne" snuck through. I can rectify these.

data-pup · 2018-07-16T19:12:51Z

lib/codegen/build.rs

+            process::exit(1);
+        });
+
+    if let Err(err) = meta::gen_types::generate("new_types.rs", &out_dir) {


This specifically would be the line that changes when we want to have the codegen-meta crate emit a types.rs file instead, and replace the version currently being emitted by the Python code in codegen/meta/ 😄

data-pup · 2018-07-16T19:38:33Z

Build isn't quite done yet, but I noticed these problems in the 1.25 build. I can fix these today, should be fairly straightforward.

error[E0658]: non-reference pattern used to match a reference (see issue #42640)

no method named try_for_each found for type ...

As for the nightly job failing, I'm much less sure how to fix that:

LLVM ERROR: IO failure on output stream: No space left on device
error: Could not compile `syntex_syntax`.

Any other feedback on this PR in general would be very appreciated though :)
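For the try_for_each error above, one option that compiles on older toolchains is a plain for loop with the ? operator. This is only a sketch: write_lines is a hypothetical helper, not code from this PR, and it assumes the failing call was iterating over lines and writing each one.

```rust
use std::fmt::{self, Write};

/// Hypothetical helper: write each line to `out`, stopping at the first error.
/// The `for` loop with `?` stands in for `Iterator::try_for_each`.
fn write_lines(out: &mut String, lines: &[&str]) -> fmt::Result {
    for line in lines {
        // `?` returns early with the error, just as try_for_each would.
        writeln!(out, "{}", line)?;
    }
    Ok(())
}
```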

bjorn3 · 2018-07-17T08:53:22Z

lib/codegen-meta/src/base/types.rs

+}
+
+pub struct IntIterator {
+    index: usize,


This could just be a u8.

bjorn3 · 2018-07-17T08:54:44Z

lib/codegen-meta/src/base/types.rs

+            3 => Some(Int::I64),
+            _ => None,
+        };
+        self.index += 1;


What about release mode overflow? (Call .next() usize::MAX times and you get I8 again) You could replace the _ => None, with _ => return None, to prevent this.
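A minimal sketch of the early-return suggestion, assuming a four-variant Int enum and a u8 index. Only I8 and I64 are named in this thread, so the I16 and I32 variants and the new constructor are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Int {
    I8,
    I16, // assumed variant
    I32, // assumed variant
    I64,
}

pub struct IntIterator {
    index: u8,
}

impl IntIterator {
    /// Hypothetical constructor.
    pub fn new() -> Self {
        IntIterator { index: 0 }
    }
}

impl Iterator for IntIterator {
    type Item = Int;

    fn next(&mut self) -> Option<Int> {
        let res = match self.index {
            0 => Some(Int::I8),
            1 => Some(Int::I16),
            2 => Some(Int::I32),
            3 => Some(Int::I64),
            // Return before the increment below, so the index stops at 4
            // and can never wrap around to 0 in release mode.
            _ => return None,
        };
        self.index += 1;
        res
    }
}
```

Because the exhausted arm returns before the increment, calling next again after the end keeps yielding None no matter how many times it is called.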

data-pup · 2018-07-17T13:08:16Z

Thanks @bjorn3! Didn't know about that early return trick for iterators, but that makes a lot of sense. That's cleaned up now 👍

Added a few other fixes, including fmt::Debug implementations for the different ValueType variants, to give the same output as the __repr__ methods in the original Python, as well as making changes for compatibility with v1.25.

sunfishcode

Thanks for working on this! Here are some comments from an initial read through the code.

sunfishcode · 2018-07-17T16:11:39Z

lib/codegen-meta/Cargo.toml

+authors = ["The Cranelift Project Developers"]
+version = "0.15.0"
+description = "DSL for cranelift-codegen code generator library"
+license = "Apache-2.0"


Cranelift's license is now "Apache-2.0 WITH LLVM-exception".

sunfishcode · 2018-07-17T16:21:26Z

lib/codegen-meta/src/cdsl/types.rs

+    type Item = LaneType;
+    fn next(&mut self) -> Option<Self::Item> {
+        if let b @ Some(_) = self.bool_iter.next() {
+            b.map(LaneType::from)


Could this be written as

if let Some(b) = self.bool_iter.next() { LaneType::from(b)

?

Since the method returns an Option, this could also be rewritten as:

if let Some(b) = self.bool_iter.next() { Some(LaneType::from(b))

I can write it that way, so it isn't quite as opaque :)
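A sketch of the less opaque form, chaining two inner iterators. The Bool and Int enums, their variants, the int_iter field, and the new constructor are placeholders invented here. Only bool_iter, LaneType, and LaneType::from appear in the diff.

```rust
use std::vec::IntoIter;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Bool {
    B1, // assumed variant
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Int {
    I8,  // assumed variant
    I16, // assumed variant
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum LaneType {
    BoolType(Bool),
    IntType(Int),
}

impl From<Bool> for LaneType {
    fn from(b: Bool) -> Self {
        LaneType::BoolType(b)
    }
}

impl From<Int> for LaneType {
    fn from(i: Int) -> Self {
        LaneType::IntType(i)
    }
}

struct LaneTypeIterator {
    bool_iter: IntoIter<Bool>,
    int_iter: IntoIter<Int>,
}

impl LaneTypeIterator {
    /// Hypothetical constructor over fixed lists of lane types.
    fn new() -> Self {
        LaneTypeIterator {
            bool_iter: vec![Bool::B1].into_iter(),
            int_iter: vec![Int::I8, Int::I16].into_iter(),
        }
    }
}

impl Iterator for LaneTypeIterator {
    type Item = LaneType;

    fn next(&mut self) -> Option<LaneType> {
        // Bind the inner value directly and wrap the converted result,
        // instead of binding the whole Option with `b @ Some(_)`.
        if let Some(b) = self.bool_iter.next() {
            Some(LaneType::from(b))
        } else if let Some(i) = self.int_iter.next() {
            Some(LaneType::from(i))
        } else {
            None
        }
    }
}
```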

sunfishcode · 2018-07-17T16:27:48Z

lib/codegen-meta/src/cdsl/mod.rs

+/// Check if `x` is a power of two.
+fn _is_power_of_two(x: u8) -> bool {
+    x > 0 && x & (x - 1) == 0
+}


I know the Python code has this, but for Rust we should use the standard library's is_power_of_two.

In that case, should I just go ahead and remove both this and the next_power_of_two functions from this file?

sunfishcode · 2018-07-17T16:28:12Z

lib/codegen-meta/src/cdsl/mod.rs

+    }
+
+    res + 1
+}


Same for next_power_of_two.
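For reference, a small sketch comparing the hand-written check from the diff with the standard library methods on unsigned integers. The wrapper names hand_rolled_is_power_of_two and round_up are made up for this example.

```rust
/// The hand-written check from the diff, kept only for comparison.
fn hand_rolled_is_power_of_two(x: u8) -> bool {
    x > 0 && x & (x - 1) == 0
}

/// Hypothetical wrapper: round up to the next power of two using the
/// standard library, returning None if the result does not fit in a u8.
fn round_up(x: u8) -> Option<u8> {
    x.checked_next_power_of_two()
}
```

The standard u8::is_power_of_two agrees with the hand-written version for every u8 value, so the local helper can be dropped.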

sunfishcode · 2018-07-17T16:48:08Z

lib/codegen-meta/Cargo.toml

+name = "cranelift-codegen-meta"
+authors = ["The Cranelift Project Developers"]
+version = "0.15.0"
+description = "DSL for cranelift-codegen code generator library"


As a minor point, I'm thinking I'd like to move away from "DSL" terminology in general. It feels like part of the problem with the Python code is the DSL approach, which leans a little toward being its own specialized world and a little away from being idiomatic Python code. This makes working within the system more streamlined, but it also makes debugging the system and changing how the system works harder.

So how about "Metaprogram for cranelift-codegen code generator library"?

sunfishcode · 2018-07-17T17:51:43Z

lib/codegen-meta/src/cdsl/types.rs

+        }
+    }
+
+    /// Return the name of this type for other Rust source files.


"Rust" is a little more ambiguous now that the metaprogram is Rust too :-). I think it's enough to say "generated Rust source files" in the comment here.

sunfishcode · 2018-07-17T17:57:22Z

lib/codegen-meta/src/cdsl/types.rs

+
+    /// Get the name of this vector type.
+    pub fn name(&self) -> String {
+        format!("{}X{}", self.base.name(), self.lanes,)


This is a lower-case 'x' in the Python code, which I think is a little easier to read.
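A tiny sketch of the lower-case form. The VectorType struct here is a simplified stand-in that stores the base name as a String, whereas the type in the diff calls self.base.name().

```rust
/// Simplified, hypothetical stand-in for the vector type in the diff.
struct VectorType {
    base_name: String,
    lanes: u64,
}

impl VectorType {
    /// Get the name of this vector type, using a lower-case 'x' separator.
    fn name(&self) -> String {
        format!("{}x{}", self.base_name, self.lanes)
    }
}
```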

sunfishcode · 2018-07-17T18:04:42Z

lib/codegen-meta/src/cdsl/types.rs

+    pub fn number(&self) -> u8 {
+        let b = f64::from(self.base.number());
+        let l = (self.lanes as f64).log2();
+        let num = 16_f64 * l + b;


Can we do this computation without using floating point? I guess it's a little more work because we need a helper:

fn floor_log2(x: u64) -> u32 { 63 - x.leading_zeros() }

but it would mean that we won't need to add an exception for the float_arithmetic clippy lint that we enable in some configurations.

Also, we should give the "16" constant a name.

I thought about this a little bit. After re-reading this comment I think I had an idea:

// Vector types are encoded with the lane type in the low 4 bits and log2(lanes)
// in the high 4 bits, giving a range of 2-256 lanes.

The reason we are calculating 16 * log_2(number_of_lanes) is to shift that value into the top four bits. Maybe it would be more legible if we used << instead? It's probably also worth reiterating that numbering comment above the VectorType's number method too.
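A sketch of the integer-only version using a shift, under stated assumptions: LANE_BITS and vector_type_number are names invented here, and the lane number used in any example is arbitrary. floor_log2 is the helper proposed in the review, with an added debug assertion because it is undefined for zero.

```rust
/// Number of low bits reserved for the lane type. Shifting by this amount
/// is the same as multiplying by the "16" constant (hypothetical name).
const LANE_BITS: u32 = 4;

/// Floor of log2(x), as proposed in the review. `x` must be non-zero.
fn floor_log2(x: u64) -> u32 {
    debug_assert!(x > 0, "log2 of zero is undefined");
    63 - x.leading_zeros()
}

/// Hypothetical helper: combine a lane type number with a lane count.
/// Equivalent to 16 * log2(lanes) + lane_number, without floating point.
fn vector_type_number(lane_number: u8, lanes: u64) -> u8 {
    debug_assert!(lanes.is_power_of_two());
    let log2_lanes = floor_log2(lanes) as u8;
    lane_number + (log2_lanes << LANE_BITS)
}
```

With a made-up lane number of 0x06 and 4 lanes, log2(lanes) is 2, so the result is 0x26: the lane type sits in the low four bits and the lane count exponent in the high four.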

sunfishcode · 2018-07-17T18:15:07Z

lib/codegen-meta/src/gen_types.rs

+//! `lib/codegen/ir/types.rs`. The file provides constant definitions for the
+//! most commonly used types, including all of the scalar types.
+//!
+//! This ensures that Python and Rust use the same type numbering.


This ensures that the metaprogram and the generated program see the same type numbering :-).

sunfishcode · 2018-07-17T18:37:22Z

lib/codegen-meta/src/srcgen.rs

+/// Given a multi-line string, split it into a sequence of lines after
+/// stripping a common indentation. This is useful for strings defined with
+/// doc strings.
+fn parse_multiline(s: &str) -> Vec<String> {


I was initially concerned that this is too Python-specific, but looking at the way Rust handles multi-line literals, we may indeed want to write Rust code for this in much the way we write Python:

fmt.doc_comment("
    An instruction format

    Every opcode has a corresponding instruction format which is represented by
    both the `InstructionFormat` and the `InstructionData` enums.
")

which would then want the same logic of stripping common leading whitespace, and so on. Does that sound right?
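One way the indentation stripping could look, as a rough sketch rather than the implementation in this PR. It assumes indentation is made of ASCII spaces, and it also drops leading and trailing blank lines, which is an extra choice made here.

```rust
/// Split a multi-line string into lines after stripping the indentation
/// common to all non-blank lines. Assumes ASCII-space indentation.
fn parse_multiline(s: &str) -> Vec<String> {
    let lines: Vec<&str> = s.lines().collect();

    // Smallest indentation among the non-blank lines.
    let indent = lines
        .iter()
        .filter(|l| !l.trim().is_empty())
        .map(|l| l.len() - l.trim_start().len())
        .min()
        .unwrap_or(0);

    // Strip the common indentation; blank lines become empty strings.
    let mut out: Vec<String> = lines
        .iter()
        .map(|l| {
            if l.trim().is_empty() {
                String::new()
            } else {
                l[indent..].trim_end().to_string()
            }
        })
        .collect();

    // Drop blank lines at the start and end (a choice made in this sketch).
    while out.first().map_or(false, |l| l.is_empty()) {
        out.remove(0);
    }
    while out.last().map_or(false, |l| l.is_empty()) {
        out.pop();
    }
    out
}
```

Lines indented deeper than the common prefix keep their extra indentation, so nested structure inside a doc comment survives.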

sunfishcode · 2018-07-17T19:54:29Z

lib/codegen-meta/Cargo.toml

+authors = ["The Cranelift Project Developers"]
+version = "0.15.0"
+description = "DSL for cranelift-codegen code generator library"
+license = "Apache-2.0"


Also, please add a copy of the LICENSE file under lib/codegen-meta (this is a recent change).

data-pup · 2018-07-17T23:39:00Z

Added a series of commits to address the various points in your notes :)

I wasn't sure about the doc_comment note above, however. That makes sense to me. If so, should I change anything in the parse_multiline function? Let me know if there are any other details I should change! Thanks again for the feedback.

(Update: Checked through the travis failures, I can fix the 1.25 issues today.)

sunfishcode

This looks good, with just one stylistic comment below. Also, there's a minor merge conflict with a version number change in one of the Cargo.toml files.

I'm thinking once that's squared away, I'd like to merge this. I expect we'll stick with new_types.rs for a little while yet, so we can get some experience with build-time crates in various settings, but if things go smoothly, then we can think about flipping that switch too :-).

sunfishcode · 2018-07-18T20:41:16Z

lib/codegen-meta/src/error.rs

+    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
+        match self {
+            &ErrorInner::Msg(ref s) => write!(f, "{}", s),
+            &ErrorInner::IoError(ref e) => write!(f, "{}", e),


Here and elsewhere, you can change this to match *self, and then you don't need a & on the front of each match arm. This also makes it more consistent with the rest of the code in Cranelift.
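A sketch of the suggested style. The enum definition is inferred from the match arms in the diff, so its exact shape is an assumption.

```rust
use std::fmt;
use std::io;

/// Assumed shape of the error enum, inferred from the match arms.
enum ErrorInner {
    Msg(String),
    IoError(io::Error),
}

impl fmt::Display for ErrorInner {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Dereference once in the scrutinee; the arms then need only `ref`
        // bindings, with no leading `&` on each pattern.
        match *self {
            ErrorInner::Msg(ref s) => write!(f, "{}", s),
            ErrorInner::IoError(ref e) => write!(f, "{}", e),
        }
    }
}
```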

data-pup · 2018-07-19T14:40:16Z

Took care of the style issue mentioned :) Sounds good! I'll leave the generated file as 'new_types.rs'.

sunfishcode · 2018-07-19T16:57:42Z

Thanks! I'm excited about where we'll go with this :-)

data-pup · 2018-07-19T18:05:33Z

Me too! It was a lot of fun to work on this. If you have any ideas for next steps, I'd love to try and help out with some of that as well! We can talk about that back in #342 😄

Added initial Rust codegen-meta implementation.

c05527a

data-pup commented Jul 16, 2018

bjorn3 reviewed Jul 17, 2018

data-pup added 6 commits July 17, 2018 07:26

Replace 'Cretonne' in comments.

4e3205a

Prevent iterator overflow.

565516b

1.25.0 compatibility changes.

4944d8c

Implemented debug traits for type variants.

69f34e0

Added consistent comments.

61378a8

Cleaned up a loop via clippy fix.

194ec5e

sunfishcode reviewed Jul 17, 2018

data-pup added 2 commits July 17, 2018 15:43

Added new license to codegen-meta Cargo.toml

2f42b13

Edited lane type iterator next method.

9d90806

sunfishcode reviewed Jul 17, 2018

data-pup added 14 commits July 17, 2018 16:46

Removed functions that are not needed in Rust, and edited desc.

8842b28

Debug trait derived for valuetype.

b9a1835

Added comments for iterator types in the base types submodule.

ebed1c9

Numbering is now handled in the cdsl/types.rs file.

7c39c33

Moved type number logic into cdsl/types.

bba565b

Repeating the lane change cleanup.

f842b07

Removed codegen-meta crate from codegen deps.

a62db2e

Typo fix.

07e2919

Addressing a patch note.

1a8f003

Addressing patch note.

1d121c5

Lowercase in vector names.

2c1232d

Fixing a comment bug.

d8e9423

Added a copy of the license file.

73b6da6

Formatting changes.

352a859

Cleaned up the vector type numbering.

551b419

1.25 compatibility.

9534bed

sunfishcode reviewed Jul 18, 2018

Fixed pattern match arms.

46d4bb1

data-pup force-pushed the dsl-gentypes-rs branch from 5663f1e to 46d4bb1 Compare July 19, 2018 14:01

sunfishcode merged commit 5e93243 into bytecodealliance:master Jul 19, 2018

sunfishcode mentioned this pull request Jul 20, 2018

Port the DSL to Rust #342

Closed
