New ore filter implementation #1561

Merged
merged 46 commits into from
Apr 7, 2023

@Tictim Tictim commented Mar 4, 2023

What

This PR replaces the previous ore dictionary filter system with a brand new implementation of an ore filter expression parser and interpreter. It features a newly designed language based on the previous ore filter expression, with the cheeky name "oreglob" (ore + glob).

Implementation Details

Language Specification

The grammar of the language in EBNF is as follows.

oreGlob = [ FLAG ], [ or ], EOF

or  = and, { '|', { '|' }, and }
and = xor, { '&', { '&' }, xor }
xor = not, { '^', not }

not = '!', { '!' }, '(', [ or ], [ ')', [ not ] ]
    | { '!' }, primary, [ not ]

primary = LITERAL
        | '(', [ or ], [ ')' ]
        | ( '*' | '?' ), { '*' | '?' }

FLAG = '$', CHARACTER - WHITESPACE, { CHARACTER - WHITESPACE }
LITERAL = CHARACTER - WHITESPACE, { CHARACTER - WHITESPACE }

WHITESPACE = ' ' | '\t' | '\n' | '\r'

CHARACTER = ? every character with codepoint of <= 0xFFFF ?

This grammar is very similar to the previous ore filter expression. For example:

  • | denotes logical OR.
  • & denotes logical AND.
  • ^ denotes logical XOR.
  • ! denotes logical negation.
  • * denotes wildcard; i.e. 0 or more characters.

A number of changes and additions are also present, namely:

  • Closing parentheses are optional, and lenient checking of redundant operators is now part of the language spec.
  • The binary operators have precedence; for example, in both a | b & c and a & b | c, logical AND is computed first.
  • ! has additional rules to prevent possible confusion.
    • If two or more matches follow a ! token, they are grouped as one match before negation; for example, !*plate* is equivalent to !(*plate*).
    • If ( directly follows !, the matches after the closing parenthesis are not grouped. For example, !(ore)Gold does not mean !(oreGold); this is syntactic sugar that treats negation of a group the same as negation inside the group. (!(a) == (!a))
  • A new keyword, ?, matches any one character. To match input with 1 or more characters, you can write ?*
  • Empty group () denotes nothing, i.e. zero characters. This is logically equivalent to !(?*).
  • Escape character (\) allows keyword characters to be included in literals.
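
The precedence rule in the list above can be illustrated with a small recursive-descent evaluator. This is a minimal sketch and not the OreGlobParser from this PR: the class name PrecedenceSketch, the eval method, and the idea of looking names up in a set of "true" names are all assumptions made for illustration. It is also stricter than oreglob (closing parentheses are required, repeated operators are rejected, and ! only negates the single term that follows it).

```java
import java.util.Set;

// Hypothetical illustration only; none of these names exist in the PR.
final class PrecedenceSketch {
    private final String src;
    private final Set<String> trueNames;
    private int pos;

    private PrecedenceSketch(String src, Set<String> trueNames) {
        this.src = src;
        this.trueNames = trueNames;
    }

    // Evaluates expr; a name is true if and only if it is contained in trueNames.
    static boolean eval(String expr, Set<String> trueNames) {
        PrecedenceSketch p = new PrecedenceSketch(expr, trueNames);
        boolean result = p.or();
        p.skipWhitespace();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return result;
    }

    // or = and, { '|', and }  (lowest precedence)
    private boolean or() {
        boolean v = and();
        while (accept('|')) v |= and(); // non-short-circuit, so the operand is always parsed
        return v;
    }

    // and = xor, { '&', xor }
    private boolean and() {
        boolean v = xor();
        while (accept('&')) v &= xor();
        return v;
    }

    // xor = not, { '^', not }  (highest precedence of the binary operators)
    private boolean xor() {
        boolean v = not();
        while (accept('^')) v ^= not();
        return v;
    }

    // Simplified negation: '!' applies to the single term that follows.
    private boolean not() {
        if (accept('!')) return !not();
        return primary();
    }

    private boolean primary() {
        if (accept('(')) {
            boolean v = or();
            if (!accept(')')) throw new IllegalArgumentException("expected ')'");
            return v;
        }
        skipWhitespace();
        int start = pos;
        while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) pos++;
        if (start == pos) throw new IllegalArgumentException("expected a name at " + pos);
        return trueNames.contains(src.substring(start, pos));
    }

    private boolean accept(char c) {
        skipWhitespace();
        if (pos < src.length() && src.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    private void skipWhitespace() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }
}
```

With only a being true, a | b & c evaluates to true because it is read as a | (b & c); a left-to-right reading, (a | b) & c, would give false.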

Outside of the grammar, the language contains some behavior changes, including:

  • OreGlob is case insensitive by default. This means that the expression ingotiron can match ingotIron and so on. Users can opt out of this and make matching case sensitive again by inserting $c (a compilation flag) at the start of the expression.
  • Though not part of the language spec, ore filters can now match items with no ore dictionary entries if the expression can match the empty string. This change may break certain setups; for example, setups with * as the expression will start to match items with no oredict.
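
The two behavior changes above can be sketched roughly as follows. This is not the code from this PR: OreFilterBehaviorSketch, literal and matchesItem are hypothetical names, and the rule that an item passes when any one of its ore names matches is an assumption made for illustration.

```java
import java.util.Collection;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names exist in the PR.
final class OreFilterBehaviorSketch {

    // Builds a whole-string literal matcher. caseSensitive models the $c flag:
    // without it, "ingotiron" also matches "ingotIron".
    static Predicate<String> literal(String text, boolean caseSensitive) {
        return caseSensitive ? text::equals : text::equalsIgnoreCase;
    }

    // Tests an item against a compiled expression.
    // Assumption: an item passes if any of its ore names matches.
    // An item with no ore names is tested against the empty string instead.
    static boolean matchesItem(Predicate<String> glob, Collection<String> oreNames) {
        if (oreNames.isEmpty()) {
            return glob.test("");
        }
        for (String name : oreNames) {
            if (glob.test(name)) return true;
        }
        return false;
    }
}
```

Under these assumptions, an expression that accepts every string (such as *) passes an item with no ore names, while a literal such as ingotiron does not.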

A simple parser, OreGlobParser, was created using a top-down approach.

Interpreter Design

The interpreter, which was inspired by this article, uses a set of states to track the match result. A match is considered successful if any of the final output states is equal to the input string's length, indicating that the state chewed through the entire string while evaluating each OreGlob node.
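
The state-set idea can be sketched on a reduced pattern language that has only literal characters, ? and *. This is a minimal sketch under that assumption and not the NodeInterpreter from this PR: StateSetSketch and matches are hypothetical names, and unlike OreGlob the comparison here is case sensitive. Each state is the number of input characters consumed so far.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; this is not the PR's NodeInterpreter.
final class StateSetSketch {

    // Returns true if the pattern (literals, '?' and '*' only) matches the whole input.
    static boolean matches(String pattern, String input) {
        Set<Integer> states = new TreeSet<>();
        states.add(0); // nothing consumed yet

        for (int p = 0; p < pattern.length(); p++) {
            char c = pattern.charAt(p);
            Set<Integer> next = new TreeSet<>();
            for (int s : states) {
                if (c == '*') {
                    // '*' may consume zero or more characters from this state.
                    for (int i = s; i <= input.length(); i++) next.add(i);
                } else if (s < input.length() && (c == '?' || c == input.charAt(s))) {
                    // '?' or a matching literal consumes exactly one character.
                    next.add(s + 1);
                }
            }
            states = next;
            if (states.isEmpty()) return false; // no state survived, so no match is possible
        }
        // Success if some state consumed the entire input.
        return states.contains(input.length());
    }
}
```

In this sketch, the pattern ?* rejects the empty string and accepts everything else, and an empty pattern accepts only the empty string, which mirrors the relationship between () and !(?*) described earlier.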

A more detailed explanation is written in NodeInterpreter.

User Convenience Feature

The goal of OreGlob is not just to clean up quirks of the old implementation, but to provide much nicer usability in general, including for users without extensive knowledge of programming languages. One way of achieving this goal is literally telling players the compiled result of expressions and errors.

The compiler may output errors and warnings along with the OreGlob instance, which can be displayed as text. The OreGlob instance itself can be translated to a formatted string, which looks like the following.

Input: dust*Gold | (plate* & !*Double*)
Output: 
one of...
> 'dust'
  ... followed by anything
  ... followed by 'Gold'
> or anything that is...
  > 'plate'
    ... followed by anything
  > and not:
      anything
      ... followed by 'Double'
      ... followed by anything

API layer

I have tried to hide implementation details behind an abstraction layer as much as possible, to account for a potential future rewrite. For now, only 7 classes are defined in the API package, with 3 of them being inner classes.

UI changes

The ore filter UI has been redesigned to accommodate the new features available with OreGlob, as well as to provide better quality-of-life features.

Potential Compatibility Issues

As the new language spec contains numerous behavior changes, some existing setups may behave differently. Users should be notified about this potential issue before updating.

@brachy84 brachy84 added the type: refactor Suggestion to refactor a section of code label Mar 4, 2023
Tictim added 26 commits March 4, 2023 21:34
Permit empty input as valid oreglob expression
Some node optimizations
Fix 'n or more' counting one more N on some scenario
Tests
Rename NotNode to GroupNode, make it respect inverted flag
…rted any char, any char or more & inverted any char or more; fix some tokens getting included in literals
Tictim added 4 commits March 14, 2023 00:15
# Conflicts:
#	src/main/java/gregtech/GregTechMod.java
- 'inversion' -> 'not'/'negation'; Turns out 'logical inversion' means completely different thing oops
- 'nothing' -> 'empty'
- 'something' -> 'nonempty'
- 'impossible' -> 'nothing'
@Tictim Tictim marked this pull request as ready for review March 14, 2023 05:27
@Tictim Tictim changed the title OreGlob prototype New ore filter implementation Mar 14, 2023
Tictim added 6 commits March 14, 2023 14:31
- using $ without providing any compilation flags produces error
- consecutive compilation flag blocks at the beginning correctly displays "compilation flags in middle of expression" error
- fixed $ not getting formatted if the compilation flag is terminated with EOF
@LAGIdiot

Good work on this PR! I am testing it right now, so I will leave a few comments on what I think needs addressing.

@LAGIdiot

When there is no string in the text field, there is no indication of what will happen.

It would be good to have something like a question mark in the slot's top right corner. And the 3 dot slot could mention something like "expression is empty, nothing will be filtered".

image

@Tictim Tictim mentioned this pull request Mar 26, 2023

@LAGIdiot LAGIdiot left a comment

Thanks for updating this PR! I feel that it is now complete. The implementation looks reasonable and the new behavior is great. In in-game testing I found no problems with upgrading old filters, and adding new filters is now way easier.

@TechLord22 TechLord22 merged commit 690788e into GregTechCEu:master Apr 7, 2023
@Tictim Tictim deleted the oreglob branch April 7, 2023 01:51
MrKono added a commit to MrKono/GregTech that referenced this pull request Apr 8, 2023