Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: EXPOSED-560 Support DISTINCT ON from Postgres #2275

Merged
merged 4 commits into from
Oct 30, 2024

Conversation

obabichevjb
Copy link
Collaborator

Description

Added support for DISTINCT ON clauses in the Query class, enabling retrieval of distinct rows based on specified columns for SQL SELECT statements.

Detailed description:

  • What: Introduced new methods withDistinctOn(Column<*>, vararg Column<*>) and withDistinctOn(Pair<Column<*>, SortOrder>, vararg Pair<Column<*>, SortOrder>) in the Query class to allow the use of DISTINCT ON clauses. These methods enable distinct rows selection based on the specified columns and optionally ordered by specified sort orders.
  • Why: We had several requests for that feature from GitHub and Slack
  • How: The changes were implemented by:
    • Adding two new methods to the Query class to set the distinctOn property.
    • Adjusting the prepareSQL method to include the DISTINCT ON clause when distinctOn is set.

Type of Change

  • New feature

Affected databases:

  • Postgres
  • H2

Related Issues

EXPOSED-560 Support DISTINCT ON from Postgres
#500

@obabichevjb obabichevjb force-pushed the obabichev/exposed-560-distinct-on branch from 1139cde to 6ed3b83 Compare October 15, 2024 09:14
@obabichevjb obabichevjb requested review from joc-a and bog-walk October 15, 2024 10:01
@obabichevjb obabichevjb force-pushed the obabichev/exposed-560-distinct-on branch from 6ed3b83 to 7a67160 Compare October 15, 2024 11:56
@obabichevjb obabichevjb force-pushed the obabichev/exposed-560-distinct-on branch from 1890d3b to 7cc0529 Compare October 17, 2024 10:25
require(!distinct) { "DISTINCT ON cannot be used with the DISTINCT modifier. Only one of them should be applied." }
@Suppress("SpreadOperator")
withDistinctOn(*columns.map { it.first }.toTypedArray())
return orderBy(*columns)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor) with this implementation it would currently be allowed to -- arguably unintentionally -- add an entry to orderByExpressions twice for a DISTINCT ON column (since that attribute is a list and not a distinct set), e.g.

table
  .selectAll()
  .withDistinctOn(table.col to SortOrder.ASC)
  .orderBy(table.col, SortOrder.DESC)

postgresql allows for this and uses the last specified sorting predicate specified for a column but i'm not sure if you're intentionally supporting this use case; i don't personally know if the behaviour of having a column mentioned twice in ORDER BY is consistent across all sql dialects.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the feedback,

when I was adding the option of withDistinctOn with ordering I also thought that it could cause misunderstanding,

In general, the main purpose of that variant is make a use case when there is only distinctOn without extra ordering simpler, and prevent necessity to call orderBy right after withDistinctOn() with the same columns order, what could be broken on any refactoring.

So my expectation at the current moment that for any complex logic of adding ordering, the version withDistinctOn(vararg columns: Column<*>) (without ordering) should be used.

require(!distinct) { "DISTINCT ON cannot be used with the DISTINCT modifier. Only one of them should be applied." }
@Suppress("SpreadOperator")
withDistinctOn(*columns.map { it.first }.toTypedArray())
return orderBy(*columns)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that if withDistinctOn() is called after orderBy() (which i think is technically possible) then the ordering may be incorrect (DISTINCT ON at least in postgresql requires the specified columns to appear in the ORDER BY expression in the same order), e.g.

table
  .selectAll()
  .orderBy(table.colThree)
  .withDistinctOn(table.colOne, table.colTwo)

i think results in ORDER BY table.col_three, table.col_one, table.col_two which generates a SQL error.

a possible solution would be to manipulate the underlying orderByExpressions mutable list directly to prepend the list of columns passed in, instead of calling orderBy(). something like:

columns.forEachIndexed { index, column -> (orderByExpressions as MutableList).add(index, column) }

(since private set on the declaration orderByExpressions prohibits redefining the var)

…s not generate wrong SQL with empty distinctOn
@obabichevjb obabichevjb requested a review from bog-walk October 29, 2024 11:24
Copy link
Member

@bog-walk bog-walk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a logger left in the new test.

@obabichevjb obabichevjb merged commit 9f3396c into main Oct 30, 2024
5 checks passed
@obabichevjb obabichevjb deleted the obabichev/exposed-560-distinct-on branch October 30, 2024 10:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants