Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide syntax warnings to Java #10645

Merged
merged 7 commits into from
Jul 24, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
72 changes: 49 additions & 23 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@
import org.enso.compiler.core.ir.Name;
import org.enso.compiler.core.ir.Pattern;
import org.enso.compiler.core.ir.Type;
import org.enso.compiler.core.ir.Warning;
import org.enso.compiler.core.ir.expression.Application;
import org.enso.compiler.core.ir.expression.Case;
import org.enso.compiler.core.ir.expression.Comment;
Expand All @@ -35,6 +36,7 @@
import org.enso.syntax2.Base;
import org.enso.syntax2.DocComment;
import org.enso.syntax2.Line;
import org.enso.syntax2.Parser;
import org.enso.syntax2.TextElement;
import org.enso.syntax2.Token;
import org.enso.syntax2.Tree;
Expand Down Expand Up @@ -918,7 +920,9 @@ yield switch (op.codeRepr()) {
var lhs = unnamedCallArgument(app.getLhs());
var rhs = unnamedCallArgument(app.getRhs());
var loc = getIdentifiedLocation(app);
yield applyOperator(op, lhs, rhs, loc);
var ir = applyOperator(op, lhs, rhs, loc);
attachTranslatedWarnings(ir, app);
yield ir;
}
};
}
Expand Down Expand Up @@ -1197,6 +1201,14 @@ yield translateFunction(fun, name, isOperator, fun.getArgs(), fun.getBody(),
};
}

private void attachTranslatedWarnings(IR ir, Tree tree) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With following signature

private static <T extends IR> IR attachTranslatedWarnings(T ir, Tree tree);

one can write yield attachTranslatedWarnings(ir, app).

for (var warning : tree.getWarnings()) {
var message = Parser.getWarningMessage(warning);
var irWarning = new Warning.Syntax(ir, message);
ir.diagnostics().add(irWarning);
}
}

private Operator applyOperator(Token.Operator op, CallArgument lhs, CallArgument rhs,
Option<IdentifiedLocation> loc) {
var name = new Name.Literal(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -144,4 +144,19 @@ object Warning {
override def diagnosticKeys(): Array[Any] = Array(ir.name)
}

case class Syntax(ir: IR, message: String) extends Warning {

/** @return a human-readable description of this error condition.
*/
override def message(source: (IdentifiedLocation => String)): String =
message

/** The location at which the diagnostic occurs. */
override val location: Option[IdentifiedLocation] = ir.location

/** The important keys identifying identity of the diagnostic
*/
override def diagnosticKeys(): Array[Any] = Array()
}

}
73 changes: 64 additions & 9 deletions lib/rust/parser/debug/src/bin/bench_parse.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
//! Parses Enso sources, measuring time spent in the parser.

#![feature(test)]
// === Non-Standard Linter Configuration ===
#![allow(clippy::option_map_unit_fn)]
#![allow(clippy::precedence)]
Expand All @@ -11,24 +12,78 @@



// =============
// === Tests ===
// =============
// ===========
// === CLI ===
// ===========

fn main() {
let args = std::env::args().skip(1);
let parser = enso_parser::Parser::new();
let parse_time: std::time::Duration = args
.map(|path| {
let code = std::fs::read_to_string(path).unwrap();
let mut code = code.as_str();
if let Some((_meta, code_)) = enso_parser::metadata::parse(code) {
code = code_;
}
let code = read_source(path).unwrap();
let start = std::time::Instant::now();
std::hint::black_box(parser.run(code));
std::hint::black_box(parser.run(&code));
start.elapsed()
})
.sum();
println!("Parse time: {} ms", parse_time.as_millis());
}

fn read_source(path: impl AsRef<Path>) -> io::Result<String> {
let code = fs::read_to_string(path)?;
Ok(if let Some((_meta, code)) = enso_parser::metadata::parse(&code) {
code.to_owned()
} else {
code
})
}



// ===============================
// === `cargo bench` interface ===
// ===============================

extern crate test;

use std::fs::DirEntry;
use std::fs::{self};
use std::io;
use std::path::Path;

fn visit_files<F: FnMut(&DirEntry)>(dir: &Path, f: &mut F) -> io::Result<()> {
if dir.is_dir() {
for entry in fs::read_dir(dir)? {
let entry = entry?;
let path = entry.path();
if path.is_dir() {
visit_files(&path, f)?;
} else {
f(&entry);
}
}
}
Ok(())
}

#[bench]
fn bench_std_lib(b: &mut test::Bencher) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that there is a framework for benchmarks in the engine. We currently have benchmarks for the engine compiler, like InlineCOmpilerBenchmark. We don't have benchmarks for parser yet, but they should not be difficult to implement.

Is this benchmark run automatically on the CI, or is it supposed to be run manually? Shouldn't we just add parser benchmarks to the already existing engine runtime-benchmarks infrastructure?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to have an EnsoParser benchmark to measure the overhead of JNI calls when doing TreeToIr. Probably just a copy of ManyLocalVarsBenchmark that doesn't call compiler, but only TreeToIr.

let mut sources = vec![];
visit_files(Path::new("../../../../distribution/lib"), &mut |dir_ent| {
let path = dir_ent.path();
if let Some(ext) = path.extension() {
if ext == "enso" {
sources.push(read_source(path).unwrap())
}
}
})
.unwrap();
let parser = enso_parser::Parser::new();
b.bytes = sources.iter().map(|s| s.len() as u64).sum();
b.iter(|| {
for source in &sources {
test::black_box(parser.run(source));
}
});
}
Original file line number Diff line number Diff line change
Expand Up @@ -94,6 +94,8 @@ private Parser(long stateIn) {

private static native long getMetadata(long state);

private static native String getWarningTemplate(int warningId);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The performance of native methods isn't comparable to non-native ones. If we call this method frequently (for each Tree?), we may see slowdowns. The design of the Tree serde was:

  • pass source to parser via native call
  • get byte[] array back
  • deserialize the array via Java-only means
  • never calls native method again

This getWarningMessage (as well as getUuidHigh and getUuidLow) are compromising the design and will become a bottleneck once (when we remove other, bigger bottlenecks).

Copy link
Contributor Author

@kazcw kazcw Jul 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Rust side, I implemented warnings with a table of messages with the idea being we can transfer the table all at once, ideally at build time. This method is an initial unoptimized implementation that was much simpler. The method is called for each warning encountered, so this temporary solution will not have any performance impact when warnings are not numerous.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment isn't really related to this PR, but I don't know where else to record these findings. Anyway it illustrates what I meant by

will become a bottleneck once (when we remove other, bigger bottlenecks).

I have just performed a measurement on 2024.3.1-nightly.2024.7.27 and executed:

enso$ ./built-distribution/enso-engine-*/enso-*/bin/enso --run test/Base_Tests/ --profiling-path start.npss Nic

that is a nice benchmark for parser as it needs to parse 118 .enso source files in test/Base_Tests and then it executes almost no Enso code. Following hotspots were identified in start.npss main thread by VisualVM:

HotSpots

There is still five more bottlenecks ahead of Parser.parseTree, but still it takes 1.3s to parse those files...


static native long getUuidHigh(long metadata, long codeOffset, long codeLength);

static native long getUuidLow(long metadata, long codeOffset, long codeLength);
Expand Down Expand Up @@ -131,6 +133,10 @@ public Tree parse(CharSequence input) {
return Tree.deserialize(message);
}

public static String getWarningMessage(Warning warning) {
return getWarningTemplate(warning.getId());
}

@Override
public void close() {
freeState(state);
Expand Down
2 changes: 1 addition & 1 deletion lib/rust/parser/jni/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ license-file = "../../LICENSE"
[dependencies]
enso-prelude = { path = "../../prelude" }
enso-parser = { path = "../" }
jni = "0.19.0"
jni = "0.21.0"

[lib]
name = "enso_parser"
Expand Down
Loading
Loading