Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scala frontend in development -- question #568

Closed
robinmaisch opened this issue Aug 3, 2022 · 4 comments
Closed

Scala frontend in development -- question #568

robinmaisch opened this issue Aug 3, 2022 · 4 comments
Labels
question A question, so neither a bug nor a enhancement proposal.
Milestone

Comments

@robinmaisch
Copy link
Contributor

I cannot quite seem to remember the solution that we came up with for this problem, so I'll just quickly put it up for discussion again.

In Scala, operators are just standard methods. They can be called in infix notation or standard notation.

val sum = 1 + 2   // evaluates to (1).+(2)

In other frontends, operations are not assigned tokens, but merthod calls are. The task would be to try to distingush between the two.
I came up with these solutions, all of which come with a downside.

  • treat all methods as such, even operators
    -> more tokens are generated than by other frontends, danger of cluttering; but is consistent

  • filter out the operators and only assign a token to "real" methods

    • by a list: +, -, *, /, %, ...
    • by a pattern, e.g. [\w_]+
      -> both versions allow to hide method calls, as even standard operators can be overloaded. Is this a risk worth taking?
  • decide whether an operator is just standard or a custom implementation
    -> that requires type information not available to the parser

  • "reject" solutions that use invalid method identifiers
    -> this is not the responsibility of JPlag as I understand it

I lack the experience with typical attack patterns and the sensitivity of the algorithm to judge what's best. Please help!

@tsaglam
Copy link
Member

tsaglam commented Aug 4, 2022

I would say we ignore the risk that someone overloads operators. Furthermore, I would say custom operators should be treated as methods, while the standard operators are filtered out (by list preferably), even if that includes filtering out overloaded standard operators. These design decisions need to be properly documented in the frontend readme.

@tsaglam tsaglam added this to the v4.0.0 milestone Aug 4, 2022
@tsaglam tsaglam added enhancement Issue/PR that involves features, improvements and other changes minor Minor issue/feature/contribution/change question A question, so neither a bug nor a enhancement proposal. and removed enhancement Issue/PR that involves features, improvements and other changes minor Minor issue/feature/contribution/change labels Aug 4, 2022
@sebinside
Copy link
Member

Yes, this is what we discussed. The risk of attack is mitigated by only filtering a defined and limited number of operators (&&, +, and so on) and treating all other combinations of special characters like methods. In this way, both attack surfaces of A) hiding behind overwritten operators and B) hiding behind fake method calls are minimized.

@robinmaisch
Copy link
Contributor Author

robinmaisch commented Aug 5, 2022

Great! This should apply regardless of whether these standard operators are called in infix or dot notation, right?
EDIT: It might not be easy to figure out the method name, because the complete term up to the opening parenthesis is treated as the called function.

val sum = myObject.reference.chain.that.leads.up.to.a.number.attribute.+(1)
          ^ ------ the parser thinks this is the function name ------- ^

@tsaglam
Copy link
Member

tsaglam commented Aug 5, 2022

@robinmaisch

Great! This should apply regardless of whether these standard operators are called in infix or dot notation, right?

Yes!

EDIT: It might not be easy to figure out the method name, because the complete term up to the opening parenthesis is treated as the called function.

Can't you just split the name with a dot and take the last substring? Or are there cases where this does not work?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question A question, so neither a bug nor a enhancement proposal.
Projects
None yet
Development

No branches or pull requests

3 participants