Release v0.4.0: Switch to ANTLR grammar for Java #80

Merged
merged 11 commits into polystat:master from the antlr-grammar branch
Apr 19, 2022

@fizruk fizruk commented Apr 17, 2022

Edit: this PR now also contains the version update and the CHANGELOG.

  • Switch to ANTLR grammar (see #80):
    • Supports Java 17 (allegedly)
    • Verified on Hadoop project (~1.8 million LoC without comments, commit ec0ff1d)
    • The concrete syntax tree generated by ANTLR is mapped to J2EO's abstract syntax tree (see parser.TreeMappings.kt)
  • Translate unknown syntax into placeholder expressions

Here's a sketch of an alternative (backup) parser for J2EO, using the ANTLR grammar for Java from antlr/grammars-v4.

The documentation for this grammar claims that it supports Java 17, and its benchmarks show that it has been tested to some extent on large projects, such as Spring Framework, ElasticSearch, and Log4j. Unfortunately, these benchmarks do not show whether the parser works correctly, so some adjustments may be required. Still, this grammar can serve at least as a backup.

Now, to make use of this grammar, we need to map ANTLR's generated concrete syntax tree to J2EO's abstract syntax tree. This is currently a work in progress:

  • AltAnnotationQualifiedName
  • Annotation
  • AnnotationConstantRest
  • AnnotationMethodOrConstantRest
  • AnnotationMethodRest
  • AnnotationTypeBody
  • AnnotationTypeDeclaration
  • AnnotationTypeElementDeclaration
  • AnnotationTypeElementRest
  • Arguments
  • ArrayCreatorRest
  • ArrayInitializer
  • Block
  • BlockStatement
  • CatchClause
  • CatchType
  • ClassBody
  • ClassBodyDeclaration
  • ClassCreatorRest
  • ClassDeclaration
  • ClassOrInterfaceModifier
  • ClassOrInterfaceType
  • ClassType
  • CompilationUnit
  • ConstDeclaration
  • ConstantDeclarator
  • ConstructorDeclaration
  • CreatedName
  • Creator
  • DefaultValue
  • ElementValue
  • ElementValueArrayInitializer
  • ElementValuePair
  • ElementValuePairs
  • EnhancedForControl
  • EnhancedForControlControl
  • EnumBodyDeclarations
  • EnumConstant
  • EnumConstants
  • EnumDeclaration
  • ExplicitGenericInvocation
  • ExplicitGenericInvocationSuffix
  • Expression
  • ExpressionList
  • FieldDeclaration
  • FinallyBlock
  • FloatLiteral
  • ForControl
  • ForInit
  • ForInitControl
  • FormalParameter
  • FormalParameterList
  • FormalParameters
  • GenericConstructorDeclaration
  • GenericInterfaceMethodDeclaration
  • GenericMethodDeclaration
  • GuardedPattern
  • Identifier
  • ImportDeclaration
  • InnerCreator
  • IntegerLiteral
  • InterfaceBody
  • InterfaceBodyDeclaration
  • InterfaceCommonBodyDeclaration
  • InterfaceDeclaration
  • InterfaceMemberDeclaration
  • InterfaceMethodDeclaration
  • InterfaceMethodModifier
  • LambdaBody
  • LambdaExpression
  • LambdaLVTIList
  • LambdaLVTIParameter
  • LambdaParameters
  • LastFormalParameter
  • Literal
  • LocalTypeDeclaration
  • LocalVariableDeclaration
  • MemberDeclaration
  • MethodBody
  • MethodCall
  • MethodDeclaration
  • Modifier
  • ModuleBody
  • ModuleDeclaration
  • ModuleDirective
  • NonWildcardTypeArguments
  • NonWildcardTypeArgumentsOrDiamond
  • PackageDeclaration
  • ParExpression
  • Pattern
  • Primary
  • PrimitiveType
  • QualifiedName
  • QualifiedNameList
  • ReceiverParameter
  • RecordBody
  • RecordComponent
  • RecordComponentList
  • RecordDeclaration
  • RecordHeader
  • RequiresModifier
  • Resource
  • ResourceSpecification
  • Resources
  • Statement
  • SuperSuffix
  • SwitchBlockStatementGroup
  • SwitchExpression
  • SwitchLabel
  • SwitchLabeledRule
  • SwitchRuleOutcome
  • TypeArgument
  • TypeArguments
  • TypeArgumentsOrDiamond
  • TypeBound
  • TypeDeclaration
  • TypeList
  • TypeParameter
  • TypeParameters
  • TypeType
  • TypeTypeOrVoid
  • VariableDeclarator
  • VariableDeclaratorId
  • VariableDeclarators
  • VariableInitializer
  • VariableModifier
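
To illustrate the general shape of such a mapping, here is a minimal, self-contained sketch in Java. It is not the code from parser.TreeMappings.kt: `CstNode`, `AstNode`, and `map` are hypothetical stand-ins, the child layout of each rule is an assumption made for the example, and no ANTLR classes are used. It shows two ideas from the description above: known rules are translated to abstract nodes, and any rule without a mapping becomes a placeholder instead of failing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: none of these types come from J2EO or from ANTLR.
final class CstToAstSketch {

    /** Concrete syntax node: a grammar rule name, its source text, and its children. */
    static final class CstNode {
        final String rule;
        final String text;
        final List<CstNode> children;

        CstNode(String rule, String text, CstNode... children) {
            this.rule = rule;
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    /** Abstract syntax node: keeps only what a later translation stage would need. */
    static final class AstNode {
        final String kind;
        final String value;
        final List<AstNode> children;

        AstNode(String kind, String value, List<AstNode> children) {
            this.kind = kind;
            this.value = value;
            this.children = children;
        }

        @Override
        public String toString() {
            return kind + "(" + value + ")" + (children.isEmpty() ? "" : children.toString());
        }
    }

    /** Maps one concrete node to an abstract node, recursing into children where needed. */
    static AstNode map(CstNode node) {
        switch (node.rule) {
            case "IntegerLiteral":
                return new AstNode("IntLiteral", node.text, Collections.<AstNode>emptyList());
            case "Identifier":
                return new AstNode("Name", node.text, Collections.<AstNode>emptyList());
            case "ParExpression":
                // Parentheses carry no meaning in the abstract tree: unwrap the inner expression.
                return map(node.children.get(0));
            case "MethodCall": {
                // Assumed layout for this sketch: first child is the callee name, the rest are arguments.
                List<AstNode> args = new ArrayList<>();
                for (CstNode child : node.children.subList(1, node.children.size())) {
                    args.add(map(child));
                }
                return new AstNode("Call", node.children.get(0).text, args);
            }
            default:
                // Unknown syntax: keep going and record which rule was not translated.
                return new AstNode("Placeholder", node.rule, Collections.<AstNode>emptyList());
        }
    }

    public static void main(String[] args) {
        CstNode call = new CstNode("MethodCall", "f((1), switch ...)",
                new CstNode("Identifier", "f"),
                new CstNode("ParExpression", "(1)", new CstNode("IntegerLiteral", "1")),
                new CstNode("SwitchExpression", "switch ..."));
        // Prints: Call(f)[IntLiteral(1), Placeholder(SwitchExpression)]
        System.out.println(map(call));
    }
}
```

The `default` branch is the design choice that matters here: a rule from the list that has no mapping yet degrades to a placeholder node, so a large code base can still be processed end to end while the mapping is incomplete.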

@fizruk fizruk marked this pull request as ready for review April 19, 2022 07:59
@fizruk fizruk changed the title Switch to ANTLR grammar for Java Release v0.4.0: Switch to ANTLR grammar for Java Apr 19, 2022
@IamMaxim

@rultor merge

rultor commented Apr 19, 2022

@rultor merge

@IamMaxim OK, I'll try to merge now. You can check the progress of the merge here

@rultor rultor merged commit ae9c13f into polystat:master Apr 19, 2022
rultor commented Apr 19, 2022

@rultor merge

@IamMaxim Done! FYI, the full log is here (took me 2min)

@fizruk fizruk deleted the antlr-grammar branch April 19, 2022 08:14
5 participants