global sort: check duplicate key at client side #59659
Signed-off-by: lance6716 <[email protected]>
Signed-off-by: lance6716 <[email protected]>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #59659 +/- ##
================================================
+ Coverage 72.9805% 73.5826% +0.6021%
================================================
Files 1694 1729 +35
Lines 468596 481101 +12505
================================================
+ Hits 341984 354007 +12023
+ Misses 105568 105270 -298
- Partials 21044 21824 +780
rest lgtm
if needDupCheck {
	if lastKey4DupCheck != nil && bytes.Equal(lastKey4DupCheck, k) {
		return errors.Errorf("duplicate key found: %s", hex.EncodeToString(lastKey4DupCheck))
	}
	lastKey4DupCheck = k
}
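For readers without the surrounding function, here is a minimal standalone sketch of the same adjacent-key check. It assumes the keys arrive already sorted; the function name firstDuplicate and the sample keys are hypothetical and are not taken from the PR.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
)

// firstDuplicate scans keys that are assumed to be sorted already and
// returns an error for the first key that equals its predecessor.
// The name is illustrative only; it is not part of the PR.
func firstDuplicate(sortedKeys [][]byte) error {
	var lastKey []byte
	for _, k := range sortedKeys {
		// In sorted input, equal keys are always adjacent, so comparing
		// each key with the previous one is enough.
		if lastKey != nil && bytes.Equal(lastKey, k) {
			return fmt.Errorf("duplicate key found: %s", hex.EncodeToString(k))
		}
		lastKey = k
	}
	return nil
}

func main() {
	keys := [][]byte{[]byte("a"), []byte("b"), []byte("b"), []byte("c")}
	fmt.Println(firstDuplicate(keys)) // duplicate key found: 62
}
```

The check piggybacks on a pass over the keys that has to happen anyway, so it costs one extra bytes.Equal per key and no additional iteration.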
Maybe add this check in loadBatchRegionData, where there is already a sort phase; it would be simpler to check there.
After the sorting in loadBatchRegionData there is no place that iterates over all keys, so I would need to add an iteration. I think it's better for performance to reuse the current iteration here.
There seems to be no performance penalty: we already do a byte comparison, bytes.Compare(e.memKVsAndBuffers.keys[i], e.memKVsAndBuffers.keys[k]) < 0, when sorting. I'm not sure, though, since sorty's less function takes 4 params.
sorty sorts concurrently in multiple goroutines, but we would need a single-threaded iteration to do the check.
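It seems to work:
package dupcheck

import (
	"cmp"
	"fmt"
	"math/rand"
	"sync/atomic"
	"testing"

	"github.com/jfcg/sorty/v2"
)

func TestSorty(t *testing.T) {
	arr := make([]int, 0, 1<<20)
	arr = append(arr, 1)
	for i := 0; i < 1<<20; i++ {
		arr = append(arr, rand.Int())
	}
	arr = append(arr, 1) // guarantees at least one duplicate value
	// The callback runs in several goroutines, so the flag must be atomic.
	var dupFound atomic.Bool
	sorty.Sort(len(arr), func(i, k, r, s int) bool {
		res := cmp.Compare(arr[i], arr[k])
		if res == 0 {
			dupFound.Store(true)
		}
		if res < 0 {
			if r != s {
				arr[r], arr[s] = arr[s], arr[r]
			}
			// The callback must report less(i, k); without this the slice is not sorted.
			return true
		}
		return false
	})
	fmt.Println(dupFound.Load())
}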
=== RUN TestSorty
true
--- PASS: TestSorty (1.04s)
I'll check what sorty guarantees. I'm not sure whether it may split [1,2,3,3,4,5] into [1,2,3] and [3,4,5]; if the duplicate value is split across 2 sorting groups, the sorting function can't find the duplication.
I guess sorty still needs to compare or merge those 2 parts; otherwise, how would it keep the whole slice sorted?
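To illustrate the argument with the [1,2,3] and [3,4,5] example from above, here is a sketch of a plain two-way merge that records whether any comparison saw equal elements. This is not sorty's actual algorithm (its internals are not shown in this thread); mergeSorted is a hypothetical helper that only demonstrates why combining two sorted groups has to compare the duplicated value across the boundary.

```go
package main

import "fmt"

// mergeSorted merges two ascending slices and reports whether any
// comparison made during the merge found two equal elements.
// It only detects duplicates that span the two inputs, not duplicates
// inside a single input. Hypothetical helper, not sorty's implementation.
func mergeSorted(left, right []int) (merged []int, sawEqual bool) {
	merged = make([]int, 0, len(left)+len(right))
	i, j := 0, 0
	for i < len(left) && j < len(right) {
		switch {
		case left[i] < right[j]:
			merged = append(merged, left[i])
			i++
		case left[i] > right[j]:
			merged = append(merged, right[j])
			j++
		default:
			// Equal elements from different groups meet here.
			sawEqual = true
			merged = append(merged, left[i])
			i++
		}
	}
	merged = append(merged, left[i:]...)
	merged = append(merged, right[j:]...)
	return merged, sawEqual
}

func main() {
	merged, sawEqual := mergeSorted([]int{1, 2, 3}, []int{3, 4, 5})
	fmt.Println(merged, sawEqual) // [1 2 3 3 4 5] true
}
```

Whether sorty combines its groups by merging or by another comparison-based step still needs to be checked against the library itself.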
Signed-off-by: lance6716 <[email protected]>
LGTM
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: D3Hunter, GMHDBJD
What problem does this PR solve?
Issue Number: close #59650
Problem Summary:
What changed and how does it work?
The server side may return a duplicate key error in the IMPORT INTO + unique key (UK) use case, but it's better to detect duplicate keys on the client side.
Check List
Tests
Side effects
Documentation
Release note