Add Scala frontend #570

Merged
merged 21 commits into from
Aug 8, 2022
Commits
5c33e9b
Set up Scala frontend initially
robinmaisch Jul 25, 2022
3738c8c
Merge branch 'master-remote' into scala
robinmaisch Jul 25, 2022
e1ecedc
Add more token types, properly integrated Scala module in project
robinmaisch Jul 27, 2022
4286a3c
Minimize complete test file, add MEMBER token
robinmaisch Jul 30, 2022
fa1dec2
Add complete sample file, refined Token extraction
robinmaisch Aug 3, 2022
43a9f4f
Merge branch 'master-remote' into scala
robinmaisch Aug 3, 2022
2a8b9f4
Adapt to interface changes
robinmaisch Aug 3, 2022
40f917a
Reformat
robinmaisch Aug 3, 2022
4631803
Remove redundant method implementations
robinmaisch Aug 3, 2022
38cd72f
Add README, remove redundant parts
robinmaisch Aug 3, 2022
cedac83
Introduce ScalaTokenConstants.numberOfTokens
robinmaisch Aug 3, 2022
e9e2977
Add more documentation and examples, implement operator filter
robinmaisch Aug 6, 2022
baf4136
Merge branch 'master-remote' into scala
robinmaisch Aug 6, 2022
6e8e5b9
Update (and sort) CLI dependencies; update Parser.scala in test files
robinmaisch Aug 6, 2022
2180e2a
Check access modifiers, remove redundant methods
robinmaisch Aug 6, 2022
a2cd9f4
Remove duplicate match case
robinmaisch Aug 6, 2022
aec0132
Make README more precise
robinmaisch Aug 6, 2022
b6b0b3b
Remove redundant parts of POM, upgrade dependencies to Scala 2.13
robinmaisch Aug 7, 2022
e82aa5e
Update README after dependency upgrade
robinmaisch Aug 7, 2022
66f2f78
Fix bug
robinmaisch Aug 7, 2022
5fde60f
Handle tuple assignments, add custom control structure example, fix a…
robinmaisch Aug 8, 2022
2 changes: 1 addition & 1 deletion README.md
@@ -48,7 +48,7 @@ Usage: JPlag [ options ] [ <root-dir> ... ] [ -new <new-dir> ... ] [ -old <old-d

named arguments:
-h, --help show this help message and exit
- -l {java,python3,cpp,csharp,golang,kotlin,rlang,char,text,scheme} Select the language to parse the submissions (default: java)
+ -l {java,python3,cpp,csharp,golang,kotlin,rlang,scala,char,text,scheme} Select the language to parse the submissions (default: java)
-bc BC Path of the directory containing the base code (common framework used in all submissions)
-v {quiet,long} Verbosity of the logging (default: quiet)
-d Debug parser. Non-parsable files will be stored (default: false)
10 changes: 9 additions & 1 deletion jplag.cli/pom.xml
@@ -38,12 +38,20 @@
</dependency>
<dependency>
<groupId>de.jplag</groupId>
- <artifactId>rlang</artifactId>
+ <artifactId>golang</artifactId>
</dependency>
<dependency>
<groupId>de.jplag</groupId>
<artifactId>kotlin</artifactId>
</dependency>
+ <dependency>
+ <groupId>de.jplag</groupId>
+ <artifactId>rlang</artifactId>
+ </dependency>
+ <dependency>
+ <groupId>de.jplag</groupId>
+ <artifactId>scala</artifactId>
+ </dependency>
<dependency>
<groupId>de.jplag</groupId>
<artifactId>scheme</artifactId>
92 changes: 92 additions & 0 deletions jplag.frontend.scala/README.md
@@ -0,0 +1,92 @@
# JPlag Scala language frontend

The JPlag Scala frontend allows the use of JPlag with submissions in Scala. <br>
It is based on the [Scalameta library](https://scalameta.org/) parser, and is adapted from
the [CodeGra-de Scala frontend](https://github.com/CodeGra-de/jplag/tree/master/jplag.frontend.scala) for JPlag, both
licensed under BSD-3.

### Scala specification compatibility

The dependencies only allow compatibility up to Scala 2.13.8 (January 2022), so the new syntactical features of Scala 3
are not supported yet.

Scalameta is not yet available for Scala 3 (see the [GitHub issue](https://github.com/scalameta/scalameta/issues/2485)),
so the upgrade has to wait. Once this frontend can use Scalameta for Scala 3, it should be able to handle both Scala 2 and Scala 3, as [the syntax is backwards compatible](https://scala-lang.org/2019/12/18/road-to-scala-3.html#:~:text=Scala%203%20is%20backwards%20compatible%20with%20Scala%202) for the most part.

### Token Extraction

#### General

The choice of tokens is intended to be similar to that of the Java or C# frontends. In particular, it includes a
range of nesting structures (class and method declarations, control flow expressions) as well as variable declarations,
object creation, assignments, and keywords that alter control flow. <br>
Blocks are distinguished by their context, i.e. there are separate `TokenConstants` for `if` blocks, `for` blocks, class
bodies, method bodies, array constructors, and the like.
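
As a rough illustration of context-dependent block tokens, the following sketch wraps each body in begin/end tokens named after the construct that encloses it. The node types and token names (`IF_BEGIN`, `FOR_END`, and so on) are assumptions made for this example only; they are not the frontend's actual `TokenConstants`, and the tree is hand-written rather than a Scalameta AST.

```scala
// Simplified model: all node types and token names are hypothetical.
object BlockContextSketch {
  sealed trait Node
  final case class Block(statements: List[Node]) extends Node
  final case class If(body: Node) extends Node
  final case class For(body: Node) extends Node
  final case class MethodDef(body: Node) extends Node
  case object Assign extends Node

  // Wraps the tokens of a body in begin/end tokens named after the enclosing context.
  private def enclose(context: String, body: Node): List[String] =
    (s"${context}_BEGIN" :: tokens(body)) :+ s"${context}_END"

  def tokens(node: Node): List[String] = node match {
    case Block(statements) => statements.flatMap(tokens) // a bare block adds no token itself
    case If(body)          => enclose("IF", body)
    case For(body)         => enclose("FOR", body)
    case MethodDef(body)   => enclose("METHOD", body)
    case Assign            => List("ASSIGN")
  }
}
```

In this model, the same `Block` node yields different surrounding tokens depending on whether it sits under an `If`, a `For`, or a `MethodDef`, which is the distinction described above.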

More syntactic elements of Scala may turn out to be helpful to include in the future, especially those that are newly
introduced.

#### Problem in Scala (1): Method calls

Scala's syntax allows omitting the parentheses when calling methods without arguments. Such method calls are
indistinguishable from member references. This makes the system vulnerable to attacks where an empty pair of parentheses is
simply added after the member name. To address this, method calls with an empty argument list are not marked with an `APPLY`
token, even though they are recognizable as method calls.

```scala
myObject.member // may be member reference or method call
// gets MEMBER token

myObject.member2() // must be method call
// gets MEMBER token

myObject.member3(arg1, arg2) // must be method call
// gets APPLY MEMBER ARG ARG tokens
```
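
The rule illustrated above can be sketched as a small recursive function. This is a simplified model under stated assumptions: the `Term` types below are a hand-written stand-in for the parser's AST, not Scalameta's classes, and only the token names `MEMBER`, `APPLY` and `ARG` are taken from this document.

```scala
// Simplified model of the method-call rule; the Term hierarchy is hypothetical.
object MethodCallTokenSketch {
  sealed trait Term
  final case class Name(value: String) extends Term
  final case class Select(qualifier: Term, member: String) extends Term
  final case class Apply(fun: Term, args: List[Term]) extends Term

  def tokens(term: Term): List[String] = term match {
    case Name(_)              => Nil
    case Select(qualifier, _) => tokens(qualifier) :+ "MEMBER"
    // Empty parentheses add no token, so `a.b` and `a.b()` look the same.
    case Apply(fun, Nil)      => tokens(fun)
    case Apply(fun, args)     => ("APPLY" :: tokens(fun)) ::: args.flatMap(arg => "ARG" :: tokens(arg))
  }
}
```

Because `Apply(fun, Nil)` contributes nothing of its own, adding or removing an empty pair of parentheses does not change the token sequence in this model.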

#### Problem in Scala (2): Operators

Operators are implemented as regular method calls. Additionally, custom operators on objects/classes can be defined,
possibly overloading existing ones like `+`, `&=` etc.

In other frontends, operations are not assigned tokens, but "regular" method calls are. The frontend therefore has to
distinguish operations from what we understand as "regular" method calls. This is not entirely possible with
parsing information alone, so we decided to approach the problem as follows:

- Calls to methods with an identifier that is used as an operator are NOT treated as method calls. This is accomplished
by comparing the identifier to a hard-coded list of standard operators on numbers, booleans, lists, and types (although type
operators cannot be used in the same contexts as the others). This applies to both infix and dot notation.
- Calls to methods with any other identifier, be it alphanumeric, symbolic or any combination, are treated as method
calls and are assigned `APPLY` and `ARG` tokens if applicable, see (1).
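
A minimal sketch of such a filter is shown below. The set of operators is an illustrative subset chosen for this example; the frontend's actual hard-coded list is not reproduced here, and the names `standardOperators` and `isMethodCall` are hypothetical.

```scala
// Illustrative operator filter; the operator set is an assumed subset, not the real list.
object OperatorFilterSketch {
  val standardOperators: Set[String] = Set(
    "+", "-", "*", "/", "%", "==", "!=", "<", ">", "<=", ">=",
    "&&", "||", "!", "+=", "-=", "&=", "::", ":::", "++"
  )

  // A call is tokenized as a method call only if its name is not a standard operator.
  // The check looks at the name alone, so it behaves the same for `a + b` and `a.+(b)`.
  def isMethodCall(name: String): Boolean = !standardOperators.contains(name)
}
```

With this set, `+` and `&=` are filtered out, while a custom symbolic name such as `<=>` or an alphanumeric name such as `plus` still counts as a method call.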

#### Problem in Scala (3): `return` is optional and discouraged

In Scala, the use of the `return` keyword is regarded as a code smell because it may disrupt the control flow in ways
that less experienced Scala developers do not intend.
Instead, like any block of code, the method body evaluates to the value of the last expression evaluated in it.

```scala
def power(base: Int, exponent: Int): Int = {
  if (exponent == 0) 1 // mark this return value?
  else if (exponent == 1) base // and this one?
  else if (exponent % 2 == 0)
    ((i: Int) => i*i)(power(base, exponent / 2)) // and this one?
  else base * power(base, exponent - 1) // and this one?
}
```
This raises the question of whether to mark these more implicit return values as well, so that the output of this frontend
is consistent with that of the others.

Determining all possible return values requires semantic information about control structures, which may be tedious
to extract from the AST but is possible (e.g. by means of a stack-based mechanism).
On the other hand, "the last expression evaluated in a block" does not carry the same _syntactical_ weight as a return
statement.

For the moment, implicit block values are neglected.
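
Should implicit values be marked in the future, the expressions in tail position could be collected roughly as in the sketch below. It is not part of the frontend: it uses plain recursion over a tiny hand-written expression tree (the `Expr` types are hypothetical and omit conditions) instead of the Scalameta AST, and it only covers blocks and `if`/`else`.

```scala
// Hypothetical sketch: finds the expressions a block can evaluate to.
object ImplicitReturnSketch {
  sealed trait Expr
  final case class Simple(source: String) extends Expr
  final case class Block(statements: List[Expr]) extends Expr
  final case class If(thenBranch: Expr, elseBranch: Expr) extends Expr

  // Collects the expressions in tail position, i.e. the possible values of `expr`.
  def tailExpressions(expr: Expr): List[Simple] = expr match {
    case s: Simple                  => List(s)
    // Only the last statement of a block determines its value.
    case Block(statements)          => statements.lastOption.toList.flatMap(tailExpressions)
    // Either branch may provide the value.
    case If(thenBranch, elseBranch) => tailExpressions(thenBranch) ++ tailExpressions(elseBranch)
  }
}
```

Modelling the body of `power` as nested `If` nodes inside a `Block` yields the four expressions marked with comments in the example above. Other constructs such as `match` or `try` would need their own cases.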

### Usage

To use the Scala frontend, add the `-l scala` flag in the CLI, or use a `JPlagOption` object set
to `LanguageOption.SCALA` in the Java API, as described in the usage information in
the [readme of the main project](https://github.com/jplag/JPlag#usage)
and [in the wiki](https://github.com/jplag/JPlag/wiki/1.-How-to-Use-JPlag).
73 changes: 73 additions & 0 deletions jplag.frontend.scala/pom.xml
@@ -0,0 +1,73 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>

<parent>
<groupId>de.jplag</groupId>
<artifactId>aggregator</artifactId>
<version>${revision}</version>
</parent>
<artifactId>scala</artifactId>

<properties>
<scala.version>2.13.8</scala.version>
<scala.compat.version>2.13</scala.compat.version>
</properties>

<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>de.jplag</groupId>
<artifactId>frontend-utils</artifactId>
<version>${revision}</version>
</dependency>

<!-- https://mvnrepository.com/artifact/org.scalameta/scalameta -->
<dependency>
<groupId>org.scalameta</groupId>
<artifactId>scalameta_${scala.compat.version}</artifactId>
<version>4.5.9</version>
</dependency>

<!-- Test -->
<dependency>
<groupId>de.jplag</groupId>
<artifactId>frontend-testutils</artifactId>
<version>${revision}</version>
<type>test-jar</type>
<scope>test</scope>
</dependency>
</dependencies>

<build>
<plugins>
<plugin>
<!-- see http://davidb.github.com/scala-maven-plugin -->
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.3.2</version>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
<configuration>
<args>
<arg>-dependencyfile</arg>
<arg>${project.build.directory}/.scala_dependencies</arg>
</args>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/java</testSourceDirectory>
</build>

</project>
23 changes: 23 additions & 0 deletions jplag.frontend.scala/src/main/scala/de/jplag/scala/Language.scala
@@ -0,0 +1,23 @@
package de.jplag.scala

import de.jplag.TokenList

import java.io.File

class Language extends de.jplag.Language {
  private val parser = new Parser
  private final val fileExtensions = Array(".scala", ".sc")

  override def suffixes: Array[String] = fileExtensions

  override def getName = "Scala parser"

  override def getShortName = "scala"

  override def minimumTokenMatch = 8

  override def parse(dir: File, files: Array[String]): TokenList = this.parser.parse(dir, files)

  override def hasErrors: Boolean = this.parser.hasErrors

}