Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add support for aggregate filters on inline arrays #622

Merged
merged 5 commits into from
Jul 13, 2022

Conversation

AndrewSisley
Copy link
Contributor

Relevant issue(s)

Resolves #392

Description

Add support for aggregate filters on inline arrays.

Does not add and/or support - this has been broken out to #621 as it will only expand on the functionality added in this PR - it doesn't modify the syntax, and keeping it out keeps things a little simpler.

@AndrewSisley AndrewSisley added feature New feature or request area/query Related to the query component action/no-benchmark Skips the action that runs the benchmark. labels Jul 11, 2022
@AndrewSisley AndrewSisley added this to the DefraDB v0.3 milestone Jul 11, 2022
@AndrewSisley AndrewSisley requested a review from a team July 11, 2022 21:09
@AndrewSisley AndrewSisley self-assigned this Jul 11, 2022
@codecov
Copy link

codecov bot commented Jul 11, 2022

Codecov Report

Merging #622 (ad2394b) into develop (4bafe70) will decrease coverage by 0.01%.
The diff coverage is 79.06%.

❗ Current head ad2394b differs from pull request most recent head ce8d4dc. Consider uploading reports for the commit ce8d4dc to get more accurate results

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #622      +/-   ##
===========================================
- Coverage    56.87%   56.85%   -0.02%     
===========================================
  Files          122      122              
  Lines        14598    14661      +63     
===========================================
+ Hits          8302     8336      +34     
- Misses        5586     5608      +22     
- Partials       710      717       +7     
Impacted Files Coverage Δ
query/graphql/planner/sum.go 77.57% <50.00%> (-2.43%) ⬇️
query/graphql/planner/count.go 80.80% <70.73%> (-8.75%) ⬇️
connor/connor.go 92.30% <100.00%> (ø)
query/graphql/mapper/mapper.go 87.79% <100.00%> (-1.56%) ⬇️
query/graphql/mapper/targetable.go 53.84% <100.00%> (+3.84%) ⬆️
query/graphql/schema/generate.go 83.09% <100.00%> (+0.22%) ⬆️
query/graphql/schema/manager.go 97.05% <100.00%> (+0.12%) ⬆️


if source.Filter != nil {
switch array := property.(type) {
case []core.Doc:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking forward to getting generics lol

@@ -158,3 +158,7 @@ func (t *Targetable) cloneTo(index int) *Targetable {
OrderBy: t.OrderBy,
}
}

func (t *Targetable) AsTargetable() (*Targetable, bool) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: in the file requestable.go should probably add these checks if we want Targetable and AggregateTarget to satisfy Requestable (which they currently do):

	_ Requestable = (*Targetable)(nil)
	_ Requestable = (*AggregateTarget)(nil)

question: I can see the benefit in turning the embedded type Field, in Targetable struct into a named type because all Requestable methods that Field implements are dropped into Targetable making it satisfy Requestable interface implicitly?. If in future a dev removes Targetable.AsTargetable() method (which returns true) then the method Targetable.Field.AsTargetable() will automatically take over now returning false. If Field is named then we would force the user to be explicit about which method to use? Also if we do want Targetable to satisfy Requestable then perhaps for readability it might be better to have those methods explicitly defined and not inherited by the embedded type?

Copy link
Contributor Author

@AndrewSisley AndrewSisley Jul 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the file requestable.go should probably add these checks if we want Targetable and AggregateTarget to satisfy Requestable (which they currently do)

Good shout - will check/add (might already do)

  • compiler stuff

dropped into Targetable making it satisfy Requestable interface implicitly...

This is deliberate and desirable IMO

we would force the user to be explicit about which method to use

The user should not have a choice here, or have to worry about it.

perhaps for readability it might be better to have those methods explicitly defined and not inherited by the embedded type?

Maybe - I'm 50-50 here, there are better ways of handling this in other languages. Inheritance in data-structures is something I like quite a lot, but there is no good way of doing this in Go atm (maybe when we get generics, but even then I'm not sure due to Golangs embedding mechanic - I never looked at whether they allow generic constraints to account for embedded types). I don't think this is worth really paying much attention to atm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the file requestable.go should probably add these checks if we want Targetable and AggregateTarget to satisfy Requestable (which they currently do)

Rejected on further look. Targetable and AggregateTarget are not Requestable and should not implement that interface (other than accidentally), that they implement some of their functions is coincidence and not a requirement (this coincidence is used here, but not required).

@@ -99,13 +99,55 @@ func (n *countNode) Next() (bool, error) {
// v.Len will panic if v is not one of these types, we don't want it to panic
case reflect.Array, reflect.Chan, reflect.Map, reflect.Slice, reflect.String:
length := v.Len()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion(non-blocking): Since you are already changing this function, can you just call v.Len() on line 161 and remove length?

Copy link
Contributor Author

@AndrewSisley AndrewSisley Jul 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I can move this now

  • move len

@@ -99,13 +99,55 @@ func (n *countNode) Next() (bool, error) {
// v.Len will panic if v is not one of these types, we don't want it to panic
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: I feel like this entire for loop can be more simpler, consider the following:

	for _, source := range n.aggregateMapping {
		property := n.currentValue.Fields[source.Index]

		v := reflect.ValueOf(property)
		// isValueOfPropertyValid is true if v is one of the reflect types that can call v.Len() without panic.
		isValueOfPropertyValid := false
		switch v.Kind() {
		case reflect.Array, reflect.Chan, reflect.Map, reflect.Slice, reflect.String:
			isValueOfPropertyValid = true
		}

		if !isValueOfPropertyValid {
			continue
		}

		// v.Len() will panic if isValueOfPropertyValid is false.
		if source.Filter == nil {
			count = count + v.Len()
			continue
		}

		for i := 0; i < v.Len(); i++ {
			var passed bool
			var err error

			switch array := property.(type) {
			case []core.Doc:
				passed, err = mapper.RunFilter(array[i], source.Filter)
			case []bool:
				passed, err = mapper.RunFilter(array[i], source.Filter)
			case []int64:
				passed, err = mapper.RunFilter(array[i], source.Filter)
			case []float64:
				passed, err = mapper.RunFilter(array[i], source.Filter)
			case []string:
				passed, err = mapper.RunFilter(array[i], source.Filter)
			}

			if err != nil {
				return false, err
			}
			if passed {
				count += 1
			}
		}
	}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suggestion contains less code duplication, but I'm not sure it is simpler, and I'm really not a fan of isValueOfPropertyValid :)

Breaking up the switch cases breaks up the code-flow and makes reading it harder IMO. You also dont want to move switch array := property.(type) { inside the array loop, as that is a waste of resources.

This can be shrunk when we have generics, but until then I'm happy as-is

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: you don't have to do a type switch. You can simply get the value of the property and pass it to RunFilter.

for _, source := range n.aggregateMapping {
		// track if property is a Slice or Array
		isIterable := false

		property := n.currentValue.Fields[source.Index]
		v := reflect.ValueOf(property)
		// isValueOfPropertyValid is true if v is one of the reflect types that can call v.Len() without panic.
		switch v.Kind() {
		case reflect.Array, reflect.Slice:
			isIterable = true
		case reflect.Chan, reflect.Map, reflect.String:
		default:
			continue
		}

		// v.Len() will panic if isValueOfPropertyValid is false.
		if source.Filter == nil {
			count = count + v.Len()
			continue
		}
        
		if isIterable {
			for i := 0; i < v.Len(); i++ {
				passed, err := mapper.RunFilter(v.Index(i).Interface(), source.Filter)
				if err != nil {
					return false, err
				}
				if passed {
					count += 1
				}
			}
		}
	}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:) I tried something very similar briefly (is a cleaner way that avoids isIterable and just appends the length) - but I missed the Interface() call and shrugged it off pretty quick as not working.

Again, this is trivial to shrink with generics (avoiding the runtime cost of reflect/casting/etc) - so I really don't think it matters at all. Do you have a strong preference to using something similar to the above in the short-term, or are you happy to wait for generics?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have too strong of a preference, but since you are already changing the function it would be nice to clean it up until we get generics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference in cost between changing it now (I have no plans to do so unless you guys really want), and changing it in a month or two is minimal.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your initial version works and is readable. I don't like the repetition but if you think it will be cleaned up with generics in the near future, then just go with whatever you prefer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ticket added in case I forget/get-lazy #633

@AndrewSisley AndrewSisley force-pushed the sisley/feat/I392-agg-inline-array-filter branch from c61e322 to 12c2c45 Compare July 12, 2022 14:04
@@ -223,10 +223,24 @@ func (n *sumNode) Next() (bool, error) {
}
case []int64:
for _, childItem := range childCollection {
passed, err := mapper.RunFilter(childItem, source.Filter)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: perhaps can be simplified too similar to the countNode.Next() suggestion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see my response to that comment. I'll apply anything that comes out of that convo here though if suitable

@@ -291,15 +291,28 @@ func (g *Generator) createExpandedFieldAggregate(
) {
for _, aggregateTarget := range f.Args {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Maybe the for loop can be one less nested-level with:

	for _, aggregateTarget := range f.Args {
		target := aggregateTarget.Name()
		var filterTypeName string
		filterType := obj.Fields()[target].Type
		if target == parserTypes.GroupFieldName {
			filterTypeName = obj.Name() + "FilterArg"
		} else if list, isList := filterType.(*gql.List); isList && gql.IsLeafType(list.OfType) {
			// If it is a list of leaf types - the filter is just the set of OperatorBlocks
			// that are supported by this type - there can be no field selections.
			if notNull, isNotNull := list.OfType.(*gql.NonNull); isNotNull {
				// GQL does not support '!' in type names, and so we have to manipulate the
				// underlying name like this if it is a nullable type.
				filterTypeName = fmt.Sprintf("NotNull%sOperatorBlock", notNull.OfType.Name())
			} else {
				filterTypeName = genTypeName(list.OfType, "OperatorBlock")
			}
		} else {
			filterTypeName = filterType.Name() + "FilterArg"
		}

		expandedField := &gql.InputObjectFieldConfig{
			Type: g.manager.schema.TypeMap()[filterTypeName],
		}
		aggregateTarget.Type.(*gql.InputObject).AddFieldConfig("filter", expandedField)
	}

Copy link
Contributor Author

@AndrewSisley AndrewSisley Jul 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would expand the scope of filterType, and you'd have to worry about whether obj.Fields()[target].Typemight panic if target == parserTypes.GroupFieldName - not worth it IMO

}

expandedField := &gql.InputObjectFieldConfig{
Type: g.manager.schema.TypeMap()[targetType+"FilterArg"],
Type: g.manager.schema.TypeMap()[filterTypeName],
}
aggregateTarget.Type.(*gql.InputObject).AddFieldConfig("filter", expandedField)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
aggregateTarget.Type.(*gql.InputObject).AddFieldConfig("filter", expandedField)
aggregateTarget.Type.(*gql.InputObject).AddFieldConfig(parserTypes.FilterClause, expandedField)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest out of scope, and there are a few of these dotted about in here that can be cleaned up in one go.

@@ -0,0 +1,71 @@
// Copyright 2022 Democratized Data Foundation
//
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Thanks for taking the time to add these :)

testUtils "github.com/sourcenetwork/defradb/tests/integration"
)

func TestQueryInlineIntegerArrayWithsWithSumWithFilter(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: typo ? a bit too many withs in WithsWithSumWith haha.

Copy link
Contributor Author

@AndrewSisley AndrewSisley Jul 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol my bad :)

  • test name copy-paste typo

executeTestCase(t, test)
}

func TestQueryInlineFloatArrayWithsWithSumWithFilter(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: same as above.

Comment on lines 30 to 33
(`{
"Name": "Shahzad",
"FavouriteIntegers": [-1, 2, -1, 1, 0]
}`)},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(`{
"Name": "Shahzad",
"FavouriteIntegers": [-1, 2, -1, 1, 0]
}`)},
`{
"Name": "Shahzad",
"FavouriteIntegers": [-1, 2, -1, 1, 0]
}`},

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved for all

Comment on lines 56 to 60
0: {
(`{
"Name": "Shahzad",
"FavouriteFloats": [3.1425, 0.00000000001, 10]
}`)},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
0: {
(`{
"Name": "Shahzad",
"FavouriteFloats": [3.1425, 0.00000000001, 10]
}`)},
0: {
`{
"Name": "Shahzad",
"FavouriteFloats": [3.1425, 0.00000000001, 10]
}`},

testUtils "github.com/sourcenetwork/defradb/tests/integration"
)

func TestQueryInlineBoolArrayWithsWithCountWithFilter(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: same as other comment

executeTestCase(t, test)
}

func TestQueryInlineIntegerArrayWithsWithCountWithFilter(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: same as other comment

executeTestCase(t, test)
}

func TestQueryInlineFloatArrayWithsWithCountWithFilter(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: same as other comment

executeTestCase(t, test)
}

func TestQueryInlineStringArrayWithsWithCountWithFilter(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: same as other comment

Comment on lines 109 to 119
Docs: map[int][]string{
0: {
(`{
"Name": "Shahzad",
"PreferredStrings": ["", "the previous", "the first", "empty string"]
}`)},
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Docs: map[int][]string{
0: {
(`{
"Name": "Shahzad",
"PreferredStrings": ["", "the previous", "the first", "empty string"]
}`)},
},
Docs: map[int][]string{
0: {
`{
"Name": "Shahzad",
"PreferredStrings": ["", "the previous", "the first", "empty string"]
}`},
},

Comment on lines 84 to 87
(`{
"Name": "Shahzad",
"FavouriteFloats": [3.1425, 0.00000000001, 10]
}`)},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(`{
"Name": "Shahzad",
"FavouriteFloats": [3.1425, 0.00000000001, 10]
}`)},
`{
"Name": "Shahzad",
"FavouriteFloats": [3.1425, 0.00000000001, 10]
}`},

Comment on lines 55 to 60
Docs: map[int][]string{
0: {
(`{
"Name": "Shahzad",
"FavouriteIntegers": [-1, 2, -1, 1, 0]
}`)},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Docs: map[int][]string{
0: {
(`{
"Name": "Shahzad",
"FavouriteIntegers": [-1, 2, -1, 1, 0]
}`)},
Docs: map[int][]string{
0: {
`{
"Name": "Shahzad",
"FavouriteIntegers": [-1, 2, -1, 1, 0]
}`},

Comment on lines 28 to 33
Docs: map[int][]string{
0: {
(`{
"Name": "Shahzad",
"LikedIndexes": [true, true, false, true]
}`)},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Docs: map[int][]string{
0: {
(`{
"Name": "Shahzad",
"LikedIndexes": [true, true, false, true]
}`)},
Docs: map[int][]string{
0: {
`{
"Name": "Shahzad",
"LikedIndexes": [true, true, false, true]
}`},

testUtils "github.com/sourcenetwork/defradb/tests/integration"
)

func TestQueryInlineIntegerArrayWithsWithAverageWithFilter(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: same as other comment

executeTestCase(t, test)
}

func TestQueryInlineFloatArrayWithsWithAverageWithFilter(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: same as other comment

Comment on lines 55 to 63
Docs: map[int][]string{
0: {
(`{
"Name": "Shahzad",
"FavouriteFloats": [3.4, 3.6, 10]
}`)},
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Docs: map[int][]string{
0: {
(`{
"Name": "Shahzad",
"FavouriteFloats": [3.4, 3.6, 10]
}`)},
},
Docs: map[int][]string{
0: {
`{
"Name": "Shahzad",
"FavouriteFloats": [3.4, 3.6, 10]
}`},
},

Comment on lines 28 to 35
Docs: map[int][]string{
0: {
(`{
"Name": "Shahzad",
"FavouriteIntegers": [-1, 2, -1, 1, 0]
}`)},
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Docs: map[int][]string{
0: {
(`{
"Name": "Shahzad",
"FavouriteIntegers": [-1, 2, -1, 1, 0]
}`)},
},
Docs: map[int][]string{
0: {
`{
"Name": "Shahzad",
"FavouriteIntegers": [-1, 2, -1, 1, 0]
}`},
},

@AndrewSisley AndrewSisley requested a review from shahzadlone July 12, 2022 14:50
Copy link
Collaborator

@fredcarle fredcarle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just a couple comments.

)

// Match is the default method used in Connor to match some data to a
// set of conditions.
func Match(conditions map[FilterKey]interface{}, data core.Doc) (bool, error) {
func Match(conditions map[FilterKey]interface{}, data interface{}) (bool, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: you changed it to core.Doc previously and now back to interface{}. We are no longer just comparing core.Doc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, this now needs to handle inline arrays (noted in commit message)

Copy link
Collaborator

@fredcarle fredcarle Jul 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(noted in commit message)

well not explicitly 😉

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I fluffed up the commit contents, the message is:

Remove filter type restrictions
This needs to be able to accomodate inline arrays shortly

@@ -99,13 +99,55 @@ func (n *countNode) Next() (bool, error) {
// v.Len will panic if v is not one of these types, we don't want it to panic
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: you don't have to do a type switch. You can simply get the value of the property and pass it to RunFilter.

for _, source := range n.aggregateMapping {
		// track if property is a Slice or Array
		isIterable := false

		property := n.currentValue.Fields[source.Index]
		v := reflect.ValueOf(property)
		// isValueOfPropertyValid is true if v is one of the reflect types that can call v.Len() without panic.
		switch v.Kind() {
		case reflect.Array, reflect.Slice:
			isIterable = true
		case reflect.Chan, reflect.Map, reflect.String:
		default:
			continue
		}

		// v.Len() will panic if isValueOfPropertyValid is false.
		if source.Filter == nil {
			count = count + v.Len()
			continue
		}
        
		if isIterable {
			for i := 0; i < v.Len(); i++ {
				passed, err := mapper.RunFilter(v.Index(i).Interface(), source.Filter)
				if err != nil {
					return false, err
				}
				if passed {
					count += 1
				}
			}
		}
	}

@AndrewSisley AndrewSisley requested a review from fredcarle July 13, 2022 14:36
@AndrewSisley AndrewSisley force-pushed the sisley/feat/I392-agg-inline-array-filter branch from ad2394b to ce8d4dc Compare July 13, 2022 16:23
@AndrewSisley AndrewSisley merged commit 6cee137 into develop Jul 13, 2022
@AndrewSisley AndrewSisley deleted the sisley/feat/I392-agg-inline-array-filter branch July 13, 2022 16:37
shahzadlone pushed a commit to shahzadlone/defradb that referenced this pull request Feb 23, 2024
…rk#622)

* Remove out of date comment

* Remove unuseful 0 check

* Remove filter type restrictions

This needs to be able to accomodate inline arrays shortly

* Add support for inline array aggregate filters
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/no-benchmark Skips the action that runs the benchmark. area/query Related to the query component feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Aggregate: Filter for inline arrays
3 participants