Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sync-diff-inspector: skip validation for tables that exist only upstream or downstream and print skipped information in summary and progress #693

Merged
merged 21 commits into from
Jan 30, 2023

Conversation

liumengya94
Copy link
Contributor

@liumengya94 liumengya94 commented Dec 29, 2022

What problem does this PR solve?

Issue Number: ref #692

What is changed and how it works?

  • Add parameter skip-non-existing-table, which defaults to false.
  • Record and skip validation for tables that exist only upstream or downstream.
  • Show skipped information and row count comparison for missing tables in summary.txt.

image

  • Print progress information about skipped tables.

image

Check List

Tests

  • Unit test
  • Integration test

Code changes

  • Has exported function/method change
  • Has exported variable/fields change

Side effects

  • None

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation
  • Need to be included in the release note

@ti-chi-bot
Copy link
Member

ti-chi-bot commented Dec 29, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • lichunzhu

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@CLAassistant
Copy link

CLAassistant commented Dec 29, 2022

CLA assistant check
All committers have signed the CLA.

@liumengya94
Copy link
Contributor Author

/run-all-tests

@liumengya94
Copy link
Contributor Author

/cc @lichunzhu

@ti-chi-bot ti-chi-bot requested a review from lichunzhu December 30, 2022 01:27
sync_diff_inspector/diff.go Outdated Show resolved Hide resolved
sync_diff_inspector/diff.go Outdated Show resolved Hide resolved
sync_diff_inspector/report/report.go Outdated Show resolved Hide resolved
sync_diff_inspector/source/source.go Outdated Show resolved Hide resolved
sync_diff_inspector/progress/progress.go Outdated Show resolved Hide resolved
sync_diff_inspector/progress/progress.go Outdated Show resolved Hide resolved
tests/sync_diff_inspector/table_skip/run.sh Show resolved Hide resolved
@liumengya94 liumengya94 changed the title sync-diff-inspector: skip validation for tables that exist only upstream or downstream and print skipped information in summary sync-diff-inspector: skip validation for tables that exist only upstream or downstream and print skipped information in summary and progress Jan 5, 2023
Comment on lines 423 to 428
func AllTableExist(tableDiffs *common.TableDiff) bool {
if tableDiffs.NeedSkippedTable == common.AllTableExistFlag {
return true
}
return false
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func AllTableExist(tableDiffs *common.TableDiff) bool {
if tableDiffs.NeedSkippedTable == common.AllTableExistFlag {
return true
}
return false
}
func AllTableExist(needSkipTable int) bool {
return tableDiffs.NeedSkippedTable == common.AllTableExistFlag
}

Comment on lines 300 to 310
var isEqual, isSkip bool
isAllTableExist := tables[tableIndex].NeedSkippedTable
if source.AllTableExist(tables[tableIndex]) {
var err error
isEqual, isSkip, err = df.compareStruct(ctx, tableIndex)
if err != nil {
return errors.Trace(err)
}
} else {
isEqual, isSkip = false, true
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var isEqual, isSkip bool
isAllTableExist := tables[tableIndex].NeedSkippedTable
if source.AllTableExist(tables[tableIndex]) {
var err error
isEqual, isSkip, err = df.compareStruct(ctx, tableIndex)
if err != nil {
return errors.Trace(err)
}
} else {
isEqual, isSkip = false, true
}
isEqual, isSkip, isAllTableExist := false, true, tables[tableIndex].NeedSkippedTable
if source.AllTableExist(tables[tableIndex]) {
var err error
isEqual, isSkip, err = df.compareStruct(ctx, tableIndex)
if err != nil {
return errors.Trace(err)
}
}

@@ -24,12 +24,12 @@ import (

func TestProgress(t *testing.T) {
p := NewTableProgressPrinter(4, 0)
p.RegisterTable("1", true, true)
p.RegisterTable("1", true, true, 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use constant instead of 1 0.

ChunkMap map[string]*ChunkResult `json:"chunk-result"` // `ChunkMap` stores the `ChunkResult` of each chunk of the table
UpCount int64 `json:"up-count"` // `UpCount` is the number of rows in the table from upstream
DownCount int64 `json:"down-count"` // `DownCount` is the number of rows in the table from downstream
TableSkipped int `json:"table-skipped"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe TableLack to differ from DataSkip

sync_diff_inspector/report/report.go Show resolved Hide resolved
sync_diff_inspector/report/report.go Show resolved Hide resolved
@@ -104,7 +110,7 @@ func TestReport(t *testing.T) {
report.CalculateTotalSize(ctx, db)

// Test Table Report
report.SetTableStructCheckResult("test", "tbl", true, false)
report.SetTableStructCheckResult("test", "tbl", true, false, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

tests/sync_diff_inspector/table_skip/run.sh Outdated Show resolved Hide resolved
Copy link
Contributor

@lichunzhu lichunzhu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM

if rangeInfo.ChunkRange.Type == chunk.Empty {
dml.node.State = checkpoints.IgnoreState
// for tables that don't exist upstream or downstream
if !source.AllTableExist(tableDiff) {
upCount, _ := dbutil.GetRowCount(ctx, df.upstream.GetDB(), schema, table, "", nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if upstream has multiple mysql sources?

Comment on lines 138 to 144
if isExist == common.UpstreamTableLackFlag {
state = TABLE_STATE_NOT_EXSIT_UPSTREAM | TABLE_STATE_REGISTER
} else if isExist == common.DownstreamTableLackFlag {
state = TABLE_STATE_NOT_EXSIT_DOWNSTREAM | TABLE_STATE_REGISTER
} else {
state = TABLE_STATE_RESULT_FAIL_STRUCTURE_DONE | TABLE_STATE_REGISTER
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if isExist == common.UpstreamTableLackFlag {
state = TABLE_STATE_NOT_EXSIT_UPSTREAM | TABLE_STATE_REGISTER
} else if isExist == common.DownstreamTableLackFlag {
state = TABLE_STATE_NOT_EXSIT_DOWNSTREAM | TABLE_STATE_REGISTER
} else {
state = TABLE_STATE_RESULT_FAIL_STRUCTURE_DONE | TABLE_STATE_REGISTER
}
switch isExist {
case common.UpstreamTableLackFlag:
state = TABLE_STATE_NOT_EXSIT_UPSTREAM | TABLE_STATE_REGISTER
case common.DownstreamTableLackFlag:
state = TABLE_STATE_NOT_EXSIT_DOWNSTREAM | TABLE_STATE_REGISTER
default:
state = TABLE_STATE_RESULT_FAIL_STRUCTURE_DONE | TABLE_STATE_REGISTER
}

isEqual, isSkip, isAllTableExist := false, true, tables[tableIndex].NeedSkippedTable
if source.AllTableExist(tables[tableIndex]) {
isEqual, isSkip, isAllTableExist := false, true, tables[tableIndex].TableLack
if common.AllTableExist(tables[tableIndex].TableLack) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if common.AllTableExist(tables[tableIndex].TableLack) {
if common.AllTableExist(isAllTableExist) {

Comment on lines 425 to 426
upCount := df.upstream.GetCountAndCrc32(ctx, rangeInfo).Count
downCount := df.downstream.GetCountAndCrc32(ctx, rangeInfo).Count
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only need count here. However, this function will also compute checksum. I'm afraid this will affect the efficiency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I defined a new interface GetCountForLackTable in the next commit(34293bb).

var count int64
if matchSources != nil {
for _, ms := range matchSources {
count, err = dbutil.GetRowCount(ctx, ms.DBConn, ms.OriginSchema, ms.OriginTable, "", nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error will be returned in ChecksumInfo. Is that expected?

Comment on lines 339 to 341
var isMatched bool
// get all tables from all source db instance
if f.MatchTable(targetSchema, targetTable) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var isMatched bool
// get all tables from all source db instance
if f.MatchTable(targetSchema, targetTable) {
isMatched := f.MatchTable(targetSchema, targetTable)
if isMatched {

Comment on lines 250 to 255
var isMatched bool
if f.MatchTable(targetSchema, targetTable) {
// if match the filter, we should respect it and check target has this table later.
sourceTablesAfterRoute[uniqueId] = struct{}{}
isMatched = true
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified.

Copy link
Contributor

@lichunzhu lichunzhu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lichunzhu
Copy link
Contributor

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: 8d7c343

@ti-chi-bot
Copy link
Member

@liumengya94: Your PR was out of date, I have automatically updated it for you.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@liumengya94
Copy link
Contributor Author

/run-unit-test

@ti-chi-bot ti-chi-bot merged commit b4be17a into pingcap:master Jan 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants