Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lexer: replace version comments with feature-ids mechanism #777

Merged
merged 7 commits into from
Mar 31, 2020

Conversation

tangenta
Copy link
Contributor

What problem does this PR solve?

Executable special comments like /*T!40000 xxx */ are not suitable for TiDB. Due to the nature of branch-release, new features can be cherry-picked to different major versions TiDB. As a result, a TiDB with a higher version may not support the features in a TiDB with a lower version.

To solve this problem, this PR gives a mechanism to manage the features that need to support forward compatibility:

  • For the features that need to be downgrade compatible, assign a unique identifier to each of them, called feature-id.
  • The parsers in different release branches maintain their own featureMap, which contains a set of currently supported feature-ids.
  • The pattern /*T![feature_id] xxx */ is used to wrap the key part during logical synchronization.
  • To decide whether a statement with special comments should be parsed, parser can search feature_id in featureMap.
  • In rare cases, a comment can have multiple feature ids like /*T![feature_id1,feature_id2,...] xxx */. It can only be parsed if all feature IDs in the comment can be found in the featureMap.

What is changed and how it works?

Add scanFeatureIDs() to scan feature ids in special comments.

Check List

Tests

  • Unit test
  • Integration test

Code changes

  • Has exported variable/fields change

Side effects

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation

@tangenta tangenta requested a review from a team March 16, 2020 15:50
@ghost ghost requested review from kennytm and removed request for a team March 16, 2020 15:50
@codecov
Copy link

codecov bot commented Mar 16, 2020

Codecov Report

Merging #777 into master will increase coverage by 0.05%.
The diff coverage is 92.45%.

@@            Coverage Diff             @@
##           master     #777      +/-   ##
==========================================
+ Coverage   78.05%   78.11%   +0.05%     
==========================================
  Files          40       40              
  Lines       14669    14703      +34     
==========================================
+ Hits        11450    11485      +35     
+ Misses       2537     2536       -1     
  Partials      682      682

lexer.go Outdated Show resolved Hide resolved
lexer.go Outdated Show resolved Hide resolved
lexer.go Outdated Show resolved Hide resolved
lexer_test.go Outdated Show resolved Hide resolved
misc.go Outdated Show resolved Hide resolved
misc.go Outdated Show resolved Hide resolved
@tiancaiamao
Copy link
Collaborator

The parser should not care about the detail of what /*T!40000 xxx */ means.
It just parses that value as an opaque data, and return it to TiDB.

If we change to FeaturesMap that means the code bind with the parser.
It's no longer a black box for the parser.

Copy link
Contributor

@AilinKid AilinKid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM

lexer.go Outdated

func (s *Scanner) scanFeatureIDs() (featureIDs []string) {
pos := s.r.pos()
const init, expectAlpha, alpha = 0, 1, 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

state name should more clear?

@tangenta
Copy link
Contributor Author

@tiancaiamao Do you have any suggestions to deal with the problem mentioned in description?

How about putting the FeaturesMap into a part of the parser driver? Let users decide what should be parsed.

@tiancaiamao
Copy link
Collaborator

Do you have any suggestions to deal with the problem mentioned in the description?

The idea sounds good. I just have concerns about the compatibility issue.
When the TiDB uses this new feature, what happens if the customer rolling updates from an old cluster?

How about putting the FeaturesMap into a part of the parser driver? Let users decide what should be parsed.

Much better.

@tangenta
Copy link
Contributor Author

IMO, it doesn't affect rolling updates.

Currently, only auto_random feature is using the special comments like /*T!40000 xxx */. I think compatibility is not the problem, because it is an experimental feature :)

In later TiDB versions, the syntax or semantics of AUTO_RANDOM might change.
https://pingcap.com/docs/dev/reference/sql/attributes/auto-random/

@kennytm
Copy link
Contributor

kennytm commented Mar 23, 2020

in the worst case we could support both /*T!12345 ... */ and /*T![abc,def] ... */ since the two syntax are mutually exclusive.

misc.go Outdated Show resolved Hide resolved
@tiancaiamao
Copy link
Collaborator

If none of our users is actually using /*T!40000 xxx */, it doesn't affect rolling updates.

lexer.go Outdated Show resolved Hide resolved
lexer.go Outdated Show resolved Hide resolved
if commentVersion <= CommentCodeCurrentVersion {
// in '/*T!', try to match the pattern '/*T![feature1,feature2,...]'.
features := s.scanFeatureIDs()
if SpecialCommentsController.ContainsAll(features) {
s.inBangComment = true
return s.scan()
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the feature id is not included in the list?

Copy link
Contributor Author

@tangenta tangenta Mar 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will parse it as normal comments. For now, the behavior is:
FeatureMap: {"xxx", "yyy"}

/*T![xxx,yyy] part_of_sql */  # all the feature ids matched
/*T![xxx,zzz] part_of_sql */  # one feature id mismatched
/*T![xxx part_of_sql */       # no feature id detected
/*T!part_of_sql */            # no feature id detected

are equivalant to

part_of_sql

[xxx part_of_sql
part_of_sql

respectively.

@tiancaiamao
Copy link
Collaborator

LGTM
Please fix the integration test.

@tangenta
Copy link
Contributor Author

This PR should be merged together with pingcap/tidb#15412.

@kennytm kennytm merged commit 8dce7a4 into pingcap:master Mar 31, 2020
tangenta added a commit to tangenta/parser that referenced this pull request Apr 21, 2020
* lexer: replace version comments with feature-ids mechanism

* tests: retain the original version digit test

* fix scanVersionDigits

* add SpecialCommentsController to decide what comments can be parsed

* only ignore comments with unsupported feature-id

* remove debug log
AilinKid pushed a commit to AilinKid/parser that referenced this pull request Apr 24, 2020
* lexer: replace version comments with feature-ids mechanism

* tests: retain the original version digit test

* fix scanVersionDigits

* add SpecialCommentsController to decide what comments can be parsed

* only ignore comments with unsupported feature-id

* remove debug log
tangenta added a commit to tangenta/parser that referenced this pull request Apr 28, 2020
tiancaiamao pushed a commit to tiancaiamao/parser that referenced this pull request Apr 27, 2021
* lexer: replace version comments with feature-ids mechanism

* tests: retain the original version digit test

* fix scanVersionDigits

* add SpecialCommentsController to decide what comments can be parsed

* only ignore comments with unsupported feature-id

* remove debug log
lyonzhi pushed a commit to lyonzhi/parser that referenced this pull request Apr 25, 2024
* lexer: replace version comments with feature-ids mechanism

* tests: retain the original version digit test

* fix scanVersionDigits

* add SpecialCommentsController to decide what comments can be parsed

* only ignore comments with unsupported feature-id

* remove debug log
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants