feat: Enable secondary index for compound filter conditions #3417
base: develop
Codecov Report
Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #3417      +/-   ##
===========================================
- Coverage    78.28%   78.27%   -0.01%
===========================================
  Files          392      393       +1
  Lines        36045    36106      +61
===========================================
+ Hits         28217    28260      +43
- Misses        6163     6185      +22
+ Partials      1665     1661       -4

... and 19 files with indirect coverage changes
question: Do you know why the coverage takes a hit of -23% 😢? @islamaliev
EDIT: I noticed most of the CI coverage test actions were failing and therefore only some reports were submitted; maybe that's why.
I've only reviewed the first few files and need to go and eat :)
Looks good so far, just documentation requests.
Is looking good, I'm continuing my review but it is taking a while so I thought I'd submit the outstanding comments now.
// It also takes a propExists boolean to indicate if the property exists in the data.
// It's needed because the behavior of the operators can change if the property doesn't exist.
// For example, _ne operator should return true if the property doesn't exist.
// This can also be used in the future if we introduce operators like _has.

praise: This example is particularly useful - thanks Islam!
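For readers following along, the rule described in that code comment can be sketched roughly as below. This is only an illustration under stated assumptions: `matchNe` is a hypothetical name, it is not connor's actual implementation, and it only handles comparable scalar values.

```go
package main

import "fmt"

// matchNe is a hypothetical stand-in for a `_ne` operator (not connor's code).
// propExists reports whether the document has the property at all.
// Assumption: condition and data hold comparable scalar values.
func matchNe(condition, data any, propExists bool) bool {
	if !propExists {
		// A missing property cannot be equal to the condition, so `_ne` matches.
		return true
	}
	return condition != data
}

func main() {
	fmt.Println(matchNe("Sample", "Sample", true)) // property present and equal
	fmt.Println(matchNe("Sample", "Other", true))  // property present and different
	fmt.Println(matchNe("Sample", nil, false))     // property missing
}
```

Without the extra boolean, the last two cases would be indistinguishable to the caller, which appears to be the motivation given in the comment.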
f.doc.MergeProperties(encDoc)
if f.indexDesc.Unique && !hasNilField {
f.currentDocID = immutable.Some(string(res.value))
} else {

question: This `else` block looks odd to me. Why are you uncertain of the type?
todo: The bytes else-if block is untested. Please either remove it, or test it.
return nil, nil
filter.TraverseProperties(
f.indexFilter.Conditions,
func(prop *mapper.PropertyIndex, condMap map[connor.FilterKey]any) bool {

todo: IMO this is way too large a function to declare inline; I struggled to find where it ended. Please make it a named function and add a line or so documenting what it is trying to do.
EDIT: I see that right at the very end of this very large inline function you mutate a variable (`found`) belonging to the encompassing scope. I think this needs rework to allow you to name this function. Perhaps you will need to rename `TraverseProperties` and return a boolean from it, I am not sure, but please change this as it is a bit hard to follow IMO (especially due to the mutation of `found`).
}
break jsonPathLoop
jsonPath = jsonPath.AppendProperty(prop.Name)
condMap = filterVal.(map[connor.FilterKey]any)

todo: Why is this cast safe (no `ok` check)? Please add your answer as a code comment.
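As a point of comparison, the checked form of such a type assertion could look like the sketch below. It is a simplification, not the PR's code: `asCondMap` is a hypothetical helper and the map key is a plain `string` rather than `connor.FilterKey`.

```go
package main

import "fmt"

// asCondMap is a hypothetical helper: it narrows a filter value to a nested
// condition map and reports a mismatch as an error instead of panicking.
// Assumption: keys are plain strings here, not connor.FilterKey.
func asCondMap(filterVal any) (map[string]any, error) {
	condMap, ok := filterVal.(map[string]any)
	if !ok {
		return nil, fmt.Errorf("expected map[string]any, got %T", filterVal)
	}
	return condMap, nil
}

func main() {
	_, err := asCondMap(map[string]any{"_eq": 1})
	fmt.Println("nested map:", err)

	_, err = asCondMap("not a map")
	fmt.Println("scalar:", err)
}
```

If the unchecked form is kept because the value is guaranteed to be a map at that point, a code comment stating that guarantee serves the same purpose for the reader.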
var matcher valueMatcher
// we have a separate branch for null matcher because default matching behavior
// is what we need: for filter `_ne: null` it will match all non-null values
if v.IsNull() {

suggestion: This would be a lot more readable if you returned early instead of the current if-else-if-else nesting.
The current format forces the reader to read the entire function if they only care about a single if block. For example, if I am debugging an issue with `_, ok := v.Number()`, I am forced to read through and check that `matcher` is not later overwritten or otherwise interacted with, whereas if it returned early I could instead leave this function and proceed further with my investigation.
For example:
if v.IsNull() {
    return &jsonNullMatcher{matchNull: condition.op == opEq}
}
if jsonVal, ok := v.Number(); ok {
    return &jsonComparingMatcher[float64]{
        value: jsonVal,
        getValueFunc: func(j client.JSON) (float64, bool) { return j.Number() },
        evalFunc: getCompareValsFunc[float64](condition.op),
    }
}
... etc
}

var matcher valueMatcher
if v, ok := condition.val.Int(); ok {

suggestion: Same as above, I think this would be easier to read if you returned early instead of using the large if-else.
Looks good Islam! I'm nearly done reviewing but really need to eat :)
- // The below properties are only held in state in order to temporarily adhear to the [Fetcher]
+ // The below properties are only held in state in order to temporarily adhere to the [Fetcher]

:) thanks
if err != nil {
return err
}
if indexFetcher != nil {

todo: Why would `indexFetcher` be nil here? You just called the constructor, and with its current name (and documentation) it definitely should not be returning nil. Please remove this check, or rename and redocument `newIndexFetcher`.
EDIT: I've just seen the documentation below; that documentation should probably be incorporated into the renamed `newIndexFetcher` func docs.
}

// the index fetcher might not have been created if there is no efficient way to use fetch indexes
// with given filter conditions. In this case we fall back to the prefix fetcher

praise: This was a very useful comment thank you!
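One possible shape for the rename requested in the previous thread is a constructor whose name and signature make the "no usable index" outcome explicit. The sketch below is purely illustrative: `tryNewIndexFetcher`, `chooseFetcher`, and the two fetcher types are hypothetical names, not the PR's API.

```go
package main

import "fmt"

// fetcher is a hypothetical minimal interface used only for this sketch.
type fetcher interface{ Name() string }

type indexFetcher struct{}

func (indexFetcher) Name() string { return "index" }

type prefixFetcher struct{}

func (prefixFetcher) Name() string { return "prefix" }

// tryNewIndexFetcher is a hypothetical name. The boolean is false when there
// is no efficient way to serve the filter conditions from an index.
func tryNewIndexFetcher(filterUsesIndex bool) (fetcher, bool) {
	if !filterUsesIndex {
		return nil, false
	}
	return indexFetcher{}, true
}

// chooseFetcher falls back to the prefix fetcher when no index fetcher is created.
func chooseFetcher(filterUsesIndex bool) fetcher {
	if f, ok := tryNewIndexFetcher(filterUsesIndex); ok {
		return f
	}
	return prefixFetcher{}
}

func main() {
	fmt.Println(chooseFetcher(true).Name())
	fmt.Println(chooseFetcher(false).Name())
}
```

With a comma-ok return, the fallback reads as an expected branch rather than as a defensive nil check on a freshly constructed value.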
// }
// }
//
// The callback would receive path=["author", "books", "title"] and value="Sample"

praise: This is excellent documentation and really helped me understand the functions here, thanks Islam :)
// }
//
// The callback would receive path=["author", "books", "title"] and value="Sample"
func TraverseFields(conditions map[string]any, f func([]string, any) bool) {

todo: The return value of this function is unused, and unexplained by the documentation. Please remove it.
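To make the documented behaviour concrete, here is a minimal sketch of a traversal with the same callback shape. It is not the PR's `TraverseFields`: `walk` and `isOperator` are hypothetical names, operators are assumed to be keys starting with `_`, and list-valued compound operators are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// isOperator is an assumption: filter operators are keys starting with "_".
func isOperator(key string) bool { return strings.HasPrefix(key, "_") }

// walk visits every leaf value and passes the field path that leads to it.
// It returns false as soon as the callback asks to stop.
func walk(path []string, value any, visit func([]string, any) bool) bool {
	condMap, ok := value.(map[string]any)
	if !ok {
		return visit(path, value)
	}
	for key, child := range condMap {
		next := path
		if !isOperator(key) {
			// Copy before appending so sibling branches never share a backing array.
			next = make([]string, len(path), len(path)+1)
			copy(next, path)
			next = append(next, key)
		}
		if !walk(next, child, visit) {
			return false
		}
	}
	return true
}

func main() {
	conditions := map[string]any{
		"author": map[string]any{
			"books": map[string]any{
				"title": map[string]any{"_eq": "Sample"},
			},
		},
	}
	walk(nil, conditions, func(path []string, value any) bool {
		fmt.Println(path, value)
		return true // keep traversing
	})
}
```

In this sketch the callback's boolean is used to stop the traversal early; whatever meaning the real callback's return value has would need to be stated in the function's documentation.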
switch t := value.(type) {
case map[string]any:
for k, v := range t {
if !isKeyOp(k) {

suggestion: There seems to be little benefit in inverting the if-else by negating `isKeyOp(k)`; if you remove the `!` it will reduce the cognitive load on the reader slightly.
newPath := make([]string, len(path), len(path)+1)
copy(newPath, path)
newPath = append(newPath, k)
if !traverseFields(newPath, k, v, f) {

suggestion: `k` and `v` are short lived, but here, combined with the longer-scoped `f`, the shortened variables read a little like algebra. I suggest renaming them to `key` and `value`.
Same goes for the other similar areas in this function.
var f fetcher.Fetcher
if cid.HasValue() {
f = new(fetcher.VersionedFetcher)
} else {
f = fetcher.NewDocumentFetcher()

if index.HasValue() {

praise: I was worried you'd have to put this back when you moved the index prop to init in the fetcher. Thank you very much for cleaning this up.
}
return immutable.None[client.IndexDescription]()
slices.SortFunc(indexCandidates, func(a, b client.IndexDescription) int {
switch {

todo: This looks like it is a re-implementation of `strings.Compare`, please use `strings.Compare` instead, or document why you are not using it, and what the differences between this and that function are.
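For illustration only, a comparator built on `strings.Compare` could look like the sketch below. The body of the PR's `switch` is not shown in this thread, so the sort key is an assumption: `indexDesc` is a hypothetical stand-in for `client.IndexDescription` and candidates are assumed to be ordered by name.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// indexDesc is a hypothetical stand-in for client.IndexDescription.
type indexDesc struct {
	Name string
}

// compareByName orders candidates by name (an assumed sort key).
// strings.Compare already returns -1, 0, or +1, which is what slices.SortFunc expects.
func compareByName(a, b indexDesc) int {
	return strings.Compare(a.Name, b.Name)
}

func main() {
	candidates := []indexDesc{{Name: "users_name"}, {Name: "users_age"}, {Name: "users_email"}}
	slices.SortFunc(candidates, compareByName)
	fmt.Println(candidates)
}
```

Note that `slices.SortFunc` requires Go 1.21 or later.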
return true
})
if len(indexCandidates) == 0 {
return immutable.Option[client.IndexDescription]{}

suggestion: I think `None` is more descriptive than the default, which kind of looks like it has (or might have) a value.
Reviewed! Overall it looks really good Islam, the provided documentation was very useful when reading the code. I think all my requests are/were fairly localised, hopefully they all make sense to you :)
Thanks for resolving the issue.
}

// we store child's own filter in case an index kicks in and replaces it with its own filter
join.subFilter = getScanNode(childSide.plan).filter

suggestion:
return invertibleTypeJoin{
    docMapper: docMapper{parent.documentMapping},
    parentSide: parentSide,
    childSide: childSide,
    skipChild: skipChild,
    // we store child's own filter in case an index kicks in and replaces it with its own filter
    subFilter: getScanNode(childSide.plan).filter,
}
@@ -699,7 +718,7 @@ func (join *invertibleTypeJoin) Next() (bool, error) {
return true, nil
}

- func (join *invertibleTypeJoin) nextJoinedSecondaryDoc() (bool, error) {
+ func (join *invertibleTypeJoin) fetchRelatedSecondaryDocWithChildren(primaryDoc core.Doc) (bool, error) {

praise: Thanks for this name change
@@ -721,28 +742,33 @@ func (join *invertibleTypeJoin) nextJoinedSecondaryDoc() (bool, error) {
join.encounteredDocIDs = append(join.encounteredDocIDs, secondaryDocID)
}

- hasDoc, err := fetchDocWithID(secondSide.plan, secondaryDocID)
+ //secondaryDocOpt, err := fetchDocWithID(secondSide.plan, secondaryDocID)

todo: Please remove the commented out code
@@ -167,7 +167,7 @@ func TestArrayIndex_WithFilterOnIndexedArrayUsingNone_ShouldUseIndex(t *testing.
},
testUtils.Request{
Request: makeExplainQuery(req),
- Asserter: testUtils.NewExplainAsserter().WithIndexFetches(9),
+ Asserter: testUtils.NewExplainAsserter().WithIndexFetches(0),

question: This is surprising, why has this changed? It looks like it is no longer using the index.
todo: If the change is correct, please update the name and/or document the test, as it doesn't make any sense to me atm.
},
},
}

testUtils.ExecuteTestCase(t, test)
}

func TestJSONIndex_WithNeFilterAgainstNonNullValue_ShouldFetchNullValues(t *testing.T) {
type testCase struct {

suggestion: I am really not a fan of bundling multiple tests into the same test. It makes debugging much harder and typically involves the repeated commenting and uncommenting of test cases in order to deal with one aspect at a time.
Please break this up.
@@ -48,7 +48,7 @@ func TestQueryWithIndex_IfIndexFilterWithRegular_ShouldFilter(t *testing.T) {
},
testUtils.Request{
Request: makeExplainQuery(req),
- Asserter: testUtils.NewExplainAsserter().WithFieldFetches(3).WithIndexFetches(3),
+ Asserter: testUtils.NewExplainAsserter().WithIndexFetches(3),

question: Why have the field fetches here been removed?
@@ -49,7 +49,7 @@ func TestQueryWithIndex_WithNonIndexedFields_ShouldFetchAllOfThem(t *testing.T)
},
testUtils.Request{
Request: makeExplainQuery(req),
- Asserter: testUtils.NewExplainAsserter().WithFieldFetches(1).WithIndexFetches(1),
+ Asserter: testUtils.NewExplainAsserter().WithIndexFetches(1),

question: Regarding all the tests that had non-zero field fetches removed, why have the field fetches been removed?
@@ -804,3 +804,213 @@ func TestQueryWithIndex_WithFilterOn2Relations_ShouldFilter(t *testing.T) {

testUtils.ExecuteTestCase(t, test)
}

func TestQueryWithIndex_WithNeFilterAgainstNonNilValue_ShouldFetchNilValues(t *testing.T) {
type testCase struct {

suggestion: Same as other similar comment, I have a preference for this to be broken up.
@@ -163,3 +163,129 @@ func TestQueryJSON_WithNotEqualFilterWithNullValue_ShouldFilter(t *testing.T) {

testUtils.ExecuteTestCase(t, test)
}

func TestQueryJSON_WithNeFilterAgainstNonNullValue_ShouldFetchNullValues(t *testing.T) {
type testCase struct {

suggestion: Same suggestion as other comments, I have a preference for this to be broken up - it is multiple tests pretending to be a single test and this creates problems for anyone using it.
Relevant issue(s)
Resolves #3299
Description
Utilize secondary indexes even when compound filter conditions are present.
For this to work, a new filter-traversing utility function is introduced that can be configured for different needs.
Now that indexes are exposed to more complex conditions, they started to produce more false-positive docs that weren't checked by the filter, because the index fetcher was not part of the new fetcher chain.
Make the index fetcher implement the new fetcher interface so that the documents it fetches can be checked against the scanner filter and permissions.
Change the behavior of connor to recognize whether a field exists. This is needed to distinguish whether a _ne filter returns false because 2 values are different or because the document doesn't have the field.
Make the fieldFetched explain metric count all fields fetched, not only fields that were requested.