Extended IR #147

aalexandrov · 2016-01-18T16:52:39Z

This is a meta-issue that should track the planned changes in the IR.

Throughout the discussion in this issue, we will use the following schema based on the MostConsistentClassifier workflow:

// product types
case class Profile(id: Long, name: String, surname: String)
case class Email(id: Long, ip: Int, content: String)
case class Server(ip: IP, isBlacklisted: Boolean)
case class Classifier(model: Long, weights: SparseVector[Double])

Motivation

Most of the transformation-based optimizations in Emma can only be applied if certain conditions hold. Deciding whether these conditions are satisfied typically involves some form of dataflow analysis on top of the lifted code.

Example: Fold-Group Fusion

Consider the following code that produces two aggregates over the emails grouped by their ip value:

for {
  g <- emails.groupBy(_.ip)
} yield {
  val agg1 = g.values.sum()
  val agg2 = g.values.map(_.content.size()).sum()
  (g.key, agg1, agg2)
}

The code is eligible for fold-group fusion (FGF), as all uses of g.values appear within a chain of (zero or more) map applications ending in a fold.
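The eligibility condition can be sketched as a small check over a toy expression tree. This is only an illustration: the node types (Ref, Values, MapOp, Fold, Key, Tuple) and the functions isMapChain and eligible are hypothetical and are not Emma's actual IR or API.

```scala
object FgfEligibility {
  // Hypothetical toy IR, NOT Emma's actual node types.
  sealed trait Expr
  case class Ref(name: String) extends Expr             // variable reference
  case class Key(group: Expr) extends Expr              // g.key
  case class Values(group: Expr) extends Expr           // g.values
  case class MapOp(xs: Expr, f: String) extends Expr    // xs.map(f)
  case class Fold(xs: Expr, name: String) extends Expr  // xs.sum(), xs.count(), ...
  case class Tuple(elems: List[Expr]) extends Expr

  // True if `e` is `g.values` followed by zero or more map applications.
  def isMapChain(e: Expr, g: String): Boolean = e match {
    case Values(Ref(n)) => n == g
    case MapOp(xs, _)   => isMapChain(xs, g)
    case _              => false
  }

  // True if every use of the group `g` in `e` is either `g.key`
  // or a map chain over `g.values` that ends in a fold.
  def eligible(e: Expr, g: String): Boolean = e match {
    case Fold(xs, _) if isMapChain(xs, g) => true
    case Key(Ref(_))  => true                  // g.key does not touch the values
    case Ref(name)    => name != g             // a bare `g` could leak its values
    case Values(x)    => x != Ref(g) && eligible(x, g)
    case MapOp(xs, _) => eligible(xs, g)
    case Fold(xs, _)  => eligible(xs, g)
    case Key(x)       => eligible(x, g)
    case Tuple(es)    => es.forall(eligible(_, g))
  }

  // Head of the comprehension above: (g.key, agg1, agg2) with the aggregates inlined.
  val head: Expr = Tuple(List(
    Key(Ref("g")),
    Fold(Values(Ref("g")), "sum"),
    Fold(MapOp(Values(Ref("g")), "_.content.size()"), "sum")))

  // A head that returns g.values unfolded, which rules out fusion.
  val leaky: Expr = Tuple(List(Key(Ref("g")), Values(Ref("g"))))
}
```

Here eligible(head, "g") holds, while eligible(leaky, "g") does not, because g.values escapes without being folded.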

While FGF currently works for the comprehension in its first form, which accesses g.values directly, the following variations will not be considered eligible. At the moment there is no direct and simple way to account for constant propagation in the Scala expressions referenced in comprehended terms:

// structural typing on the argument
for {
  (key, values) <- emails.groupBy(_.ip)
} yield {
  val agg1 = values.sum()
  val agg2 = values.map(_.content.size()).sum()
  (key, agg1, agg2)
}

// projection aliases
for {
  g <- emails.groupBy(_.ip)
} yield {
  val values = g.values
  val agg1 = values.sum()
  val agg2 = values.map(_.content.size()).sum()
  (g.key, agg1, agg2)
}
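To illustrate why propagating such aliases would help, here is a minimal sketch that inlines let-bound projections in a toy expression tree, so that the aliased variant normalizes to the same tree as the direct one. The node types and the functions subst and inlineAliases are hypothetical, and the sketch assumes that projections are pure and that bound names are unique.

```scala
object AliasInlining {
  // Hypothetical toy IR, NOT Emma's actual node types.
  sealed trait Expr
  case class Ref(name: String) extends Expr
  case class Sel(target: Expr, member: String) extends Expr   // target.member
  case class Call(target: Expr, method: String) extends Expr  // target.method()
  case class Let(name: String, rhs: Expr, body: Expr) extends Expr

  // Replace every free reference to `name` in `e` by `by`.
  def subst(e: Expr, name: String, by: Expr): Expr = e match {
    case Ref(n)     => if (n == name) by else e
    case Sel(t, m)  => Sel(subst(t, name, by), m)
    case Call(t, m) => Call(subst(t, name, by), m)
    case Let(n, rhs, body) =>
      // Stop at a rebinding of `name` (shadowing).
      Let(n, subst(rhs, name, by), if (n == name) body else subst(body, name, by))
  }

  // A projection is a reference followed by zero or more member selections.
  def isProjection(e: Expr): Boolean = e match {
    case Ref(_)    => true
    case Sel(t, _) => isProjection(t)
    case _         => false
  }

  // Inline every `val x = <projection>` into its uses (assumes projections are pure).
  def inlineAliases(e: Expr): Expr = e match {
    case Let(n, rhs, body) =>
      val rhs1 = inlineAliases(rhs)
      if (isProjection(rhs1)) inlineAliases(subst(body, n, rhs1))
      else Let(n, rhs1, inlineAliases(body))
    case Sel(t, m)  => Sel(inlineAliases(t), m)
    case Call(t, m) => Call(inlineAliases(t), m)
    case r: Ref     => r
  }

  // val values = g.values; val agg1 = values.sum(); agg1
  val aliased: Expr =
    Let("values", Sel(Ref("g"), "values"),
      Let("agg1", Call(Ref("values"), "sum"), Ref("agg1")))

  // val agg1 = g.values.sum(); agg1
  val direct: Expr =
    Let("agg1", Call(Sel(Ref("g"), "values"), "sum"), Ref("agg1"))
}
```

After inlining, inlineAliases(aliased) equals direct, so an eligibility check written against the direct form would also accept the aliased one.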

Example: Join Ordering

Another scenario where a better IR might be beneficial is join ordering. Consider the following code:

for {
  e <- emails
  s <- people
  r <- people
  if from(e.content) == s.email
  if to(e.content) == r.email
} yield {
  val x1 = e1 // depends on e and s
  val x2 = e2 // depends on x1 and r
  f(x2)
}

To reduce the communication cost when this expression is compiled to a join cascade, we might want to compute x1 right after we join e and s, prune all extra fields, and only then join with r.

In effect, this means that the head expression is split into multiple parts, and some of these parts might be applied on top of the rhs of an aggregator during the combinator rewrite phase.
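The effect of splitting the head can be sketched on plain Scala collections. Everything in this sketch is a hypothetical stand-in: the simplified Email and Person types, the from and to helpers (which assume a content string of the form "sender->recipient:body"), and the concrete choice of x1 and x2.

```scala
object JoinOrdering {
  // Hypothetical stand-ins for the types and helpers used in the example above.
  case class Email(id: Long, content: String)
  case class Person(name: String, email: String)

  // Assumes content of the form "sender->recipient:body".
  def from(content: String): String = content.takeWhile(_ != '-')
  def to(content: String): String =
    content.dropWhile(_ != '>').drop(1).takeWhile(_ != ':')

  val emails = Seq(Email(1, "a@x->b@x:hi"), Email(2, "b@x->a@x:yo"))
  val people = Seq(Person("Ann", "a@x"), Person("Bob", "b@x"))

  // One comprehension: the head sees the full (e, s, r) triple.
  val naive: Seq[String] = for {
    e <- emails
    s <- people
    r <- people
    if from(e.content) == s.email
    if to(e.content) == r.email
  } yield {
    val x1 = (e.id, s.name)                   // depends on e and s
    val x2 = s"${x1._1}: ${x1._2} -> ${r.name}" // depends on x1 and r
    x2
  }

  // Split head: compute x1 right after joining e and s and keep only
  // the join key for r plus x1, dropping all other fields.
  val stage1: Seq[(String, (Long, String))] = for {
    e <- emails
    s <- people
    if from(e.content) == s.email
  } yield (to(e.content), (e.id, s.name))

  // Second join: only the pruned tuples are matched against r.
  val split: Seq[String] = for {
    (rKey, x1) <- stage1
    r <- people
    if rKey == r.email
  } yield s"${x1._1}: ${x1._2} -> ${r.name}"
}
```

Both naive and split produce the same result, but in the split version only the pruned (key, x1) pairs would need to be shipped to the second join.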

Approach

In order to make the compiler oblivious to such code modifications, I propose to extend the current comprehension IR along the following lines.

Target a limited subset of Scala expressions based on the usage patterns observed in emma-examples. Lift, normalize, and transform the whole quoted tree rather than only the comprehended expressions.
Use a flavour of an SSA-like IR which simplifies analyses based on use-def and def-use chains. A good candidate seems to be direct style in let-normal form. For a good overview and a short introduction to this IR, please refer to Chapter 6 in the SSA Book.
Restructure the code in a way which will allow us to use it both for macro-based and reflection-based compilation.
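As a rough illustration of the let-normal form idea, the sketch below flattens a nested toy expression into a sequence of single-call bindings over atomic arguments, after which use-def and def-use lookups become simple scans. The types (Ref, Lit, Call, LNF) and functions (normalize, definitionOf, usesOf) are hypothetical and do not reflect the IR that was eventually implemented.

```scala
import scala.collection.mutable.ListBuffer

object LetNormalForm {
  // Hypothetical toy source language, NOT Scala ASTs or Emma's IR.
  sealed trait Expr
  case class Ref(name: String) extends Expr
  case class Lit(value: Int) extends Expr
  case class Call(fn: String, args: List[Expr]) extends Expr

  // Let-normal form: `val x_i = fn(atoms)` bindings plus an atomic result.
  case class LNF(bindings: List[(String, Call)], result: Expr)

  def normalize(e: Expr): LNF = {
    var counter = 0
    val bindings = ListBuffer.empty[(String, Call)]
    def fresh(): String = { counter += 1; s"x$counter" }

    // Name every nested call so that all call arguments are atoms.
    def atomize(e: Expr): Expr = e match {
      case Call(fn, args) =>
        val atoms = args.map(atomize) // arguments first, left to right
        val name = fresh()
        bindings += ((name, Call(fn, atoms)))
        Ref(name)
      case atom => atom
    }

    val result = atomize(e)
    LNF(bindings.toList, result)
  }

  // use-def: the unique definition of a name.
  def definitionOf(p: LNF, name: String): Option[Call] =
    p.bindings.collectFirst { case (n, rhs) if n == name => rhs }

  // def-use: the names whose definitions mention `name`.
  def usesOf(p: LNF, name: String): List[String] =
    p.bindings.collect { case (n, Call(_, args)) if args.contains(Ref(name)) => n }

  // sum(map(values(g), f))
  val nested: Expr =
    Call("sum", List(Call("map", List(Call("values", List(Ref("g"))), Ref("f")))))

  // val x1 = values(g); val x2 = map(x1, f); val x3 = sum(x2); x3
  val program: LNF = normalize(nested)
}
```

Because every intermediate result has exactly one name and one definition, questions such as "where is g.values used?" reduce to usesOf(program, "x1").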

Refactoring Plan

The refactoring should follow the plan below. Each item will be tracked by a separate issue:

Immutable comprehensions
Consistent comprehension naming and nomenclature
Simplify macro-level IR
Design a universe-agnostic, holistic IR
Simplify runtime-level IR in emma.ir
Scala AST to Let Normal Form translation


joroKr21 · 2016-01-18T17:07:26Z

Actually we could use a somewhat hacky solution to bring the separate hierarchies together. Since Scala's Literal(Constant(_)) is polymorphic, we could wrap the comprehension nodes in constants and then use a modified form of traversal that works on union types (or possibly do 2 passes - one for Scala nodes and one for Emma nodes).

Resolves #153. Adapted the ComprehensionModel nodes so all fields are immutable. The only place where mutability is left is in the Generator type, as otherwise we would have to also rewrite the FoldGroupFusion optimization, which won't be trivial and at the moment seems unnecessary. The Generator type should be fixed as we make further progress on #147.

joroKr21 · 2016-12-06T17:58:15Z

I think we can close this now

aalexandrov · 2016-12-09T14:32:35Z

Alright. I've linked the issue on the wiki for future reference.

aalexandrov added the IR label Jan 18, 2016

aalexandrov self-assigned this Jan 18, 2016

aalexandrov added this to the Feb 2016 milestone Jan 18, 2016

aalexandrov mentioned this issue Jan 18, 2016

Unify traversal schemes #148

Closed

aalexandrov mentioned this issue Jan 20, 2016

Consistent comprehension naming and nomenclature #149

Closed

aalexandrov added the MACROS label Jan 21, 2016

aalexandrov mentioned this issue Jan 22, 2016

Immutable Comphrehensions #153

Closed

aalexandrov mentioned this issue Jan 25, 2016

Simplify macro-level IR #155

Closed

aalexandrov modified the milestones: Feb 2016, Mar 2016 Feb 26, 2016

joroKr21 mentioned this issue Mar 7, 2016

The expression problem #35

Closed

aalexandrov modified the milestones: Mar 2016, Apr 2016 Apr 4, 2016

aalexandrov removed the MACROS label Apr 5, 2016

aalexandrov modified the milestone: Apr 2016 Apr 5, 2016

aalexandrov closed this as completed Dec 9, 2016
