functions that range over collections #4616
Replies: 3 comments 5 replies
-
To be clear, perhaps we should allow things like:
as a way to do implicit joins on collections, something we never supported before. It doesn't seem like a terrible idea to me. |
Beta Was this translation helpful? Give feedback.
-
I introduced that to allow reusing the same alias in multiple places. I guess you are referring to #4602 when you say this doesn't work? I will look into it. |
Beta Was this translation helpful? Give feedback.
-
I don't really like that. IMO it's much clearer to use a correlated subquery e.g. |
Beta Was this translation helpful? Give feedback.
-
So this is not so much a design proposal, as a clarification of what are supposed to be the semantics of all the functions we have for quantifying over collection elements, aggregating over elements, or for qualifying an element.
These come in three varieties:
maxelement()
and friends,elements()
andindices()
, andvalue()
,index()
, andkey()
.Aggregation
A function like
maxelement(author.books)
is supposed to :It should be an error if it's applied to a collection element, as in:
That doesn't make any sense:
b
doesn't have elements. Furthermore, to respect the semantics, we would have to add agroup by b.author_id
to the generated SQL and we don't. So the query compiler should not accept this.Quantification
A function like
elements(author.books)
is supposed to:elements()
andindices()
were always meant to be quantified somehow! Either with:any
/every
,in
, orexists
It does not make sense to write something like:
Unfortunately, we started supporting this sort of query at some stage, and users have come to rely on it, and instead of telling them not to do it we've been fixing bugs relating to this usage. So for reasons of backwards compatibility we're going to have to keep accepting this as some sort of a weird way to write a join.
Qualification
A function like
index()
is supposed to:It does not make sense to write:
Apparently in at least some cases we currently interpret this as an implicit join followed by application of
index()
on the element, which is not the craziest interpretation if it works reliably, which I'm skeptical of.(So perhaps we should discuss this one.)
Why am I writing all this?
I'm writing this up because in the new query parser there's currently little validation of any of this stuff, and so we need to sort that out.
Also because (totally my fault) there was never a spec for any of this stuff, so it's not surprising that everyone came up with their own interpretations. I can hardly blame them, since the maximum documentation they ever got was two lines in the old docs.
Worse,
maxindex()
and friends don't really work at all, and result in the addition of extraneous joins to the root query, and duplicate results.(Note that none of these functions should ever add a join to the root query.)
I care about that because I want to expand the set of aggregate functions for collections, and let you write stuff like
max(index my.list)
andavg(value your.map)
.Beta Was this translation helpful? Give feedback.
All reactions