[query] Expose references via ExecuteContext #14686

Merged
hail-ci-robot merged 1 commit into main from ehigham/ctx-references
Jan 31, 2025

ehigham
Member

@ehigham ehigham commented Sep 17, 2024

This change is split out from a larger refactoring effort on the various Backend
implementations. The goals of this effort are to provide query-level
configuration to the backend that's currently tied to the lifetime of a backend,
reduce code duplication and reduce state duplication.

In this change, I'm restoring references to the execute context [1] and
decoupling them from the backend. In a future change, they'll be lifted out of
the backend implementations altogether. This is to reduce the surface area of
the Backend interface to the details that are actually different.

Both the Local and Spark backends have state that's manipulated from Python via
various py methods. These pollute the Backend interface [2] and so have been
extracted into the trait Py4JBackendExtensions. In future changes, this will
become a facade that owns state set in Python.

Notes
[1] "Restoring" old behaviour I foolishly removed in fe5ed32
[2] "Pollute" in that they obfuscate what's different about backend query plan
and execution
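
To illustrate the kind of extraction described above, here is a minimal sketch. It is not the actual Hail code: `BackendSketch`, `PyExtensionsSketch`, `pySetFlag`, `pyGetFlag` and `LocalBackendSketch` are hypothetical, simplified stand-ins. The idea is only that the Python-facing state moves into a mixin so the core interface keeps just what differs between backends.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins; not the real Hail interfaces.

// The core interface keeps only what actually differs between backends.
trait BackendSketch {
  def name: String
  def execute(query: String): String
}

// Python-facing state and entry points live in a separate mixin.
trait PyExtensionsSketch { self: BackendSketch =>
  private val pyFlags = mutable.Map.empty[String, String]

  def pySetFlag(key: String, value: String): Unit = pyFlags.update(key, value)
  def pyGetFlag(key: String): Option[String] = pyFlags.get(key)
}

// A concrete backend opts in to the Python-facing surface by mixing it in.
final class LocalBackendSketch extends BackendSketch with PyExtensionsSketch {
  val name = "local"
  def execute(query: String): String = s"$name ran: $query"
}
```

Code that only plans and executes queries can depend on `BackendSketch` alone, while the Python bridge depends on the mixin.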

@ehigham ehigham marked this pull request as ready for review September 17, 2024 00:43
@ehigham ehigham force-pushed the ehigham/ctx-references branch 2 times, most recently from 3518244 to e0b89a5 Compare September 17, 2024 15:47
@ehigham ehigham force-pushed the ehigham/cloud-credentials branch from 9cff2fc to d14679d Compare September 19, 2024 20:52
@ehigham ehigham force-pushed the ehigham/ctx-references branch from 92da6f0 to 4576acf Compare September 19, 2024 20:52
@ehigham ehigham force-pushed the ehigham/cloud-credentials branch from d14679d to ed2e8e8 Compare October 1, 2024 19:44
@ehigham ehigham force-pushed the ehigham/ctx-references branch from 4576acf to 8a42e93 Compare October 1, 2024 19:44
@ehigham ehigham force-pushed the ehigham/cloud-credentials branch from ed2e8e8 to fc06f02 Compare October 8, 2024 19:18
@ehigham ehigham force-pushed the ehigham/ctx-references branch from 341e6a7 to 1682964 Compare October 8, 2024 19:18
@ehigham ehigham force-pushed the ehigham/cloud-credentials branch from fc06f02 to a101b55 Compare October 8, 2024 20:30
@ehigham ehigham force-pushed the ehigham/ctx-references branch from 1682964 to b1c0d04 Compare October 8, 2024 20:30
@ehigham ehigham force-pushed the ehigham/cloud-credentials branch from a101b55 to 1682a0f Compare October 16, 2024 20:02
@ehigham ehigham force-pushed the ehigham/ctx-references branch from b1c0d04 to 84ddcf3 Compare October 16, 2024 20:02
@ehigham ehigham force-pushed the ehigham/cloud-credentials branch from 1682a0f to 6e63e69 Compare October 16, 2024 21:30
@ehigham ehigham force-pushed the ehigham/ctx-references branch from 84ddcf3 to a2ff477 Compare October 16, 2024 21:30
@ehigham ehigham force-pushed the ehigham/cloud-credentials branch from 6e63e69 to d31f732 Compare October 17, 2024 14:59
@ehigham ehigham force-pushed the ehigham/ctx-references branch 2 times, most recently from df5c723 to adc0602 Compare October 21, 2024 15:24
@ehigham ehigham force-pushed the ehigham/cloud-credentials branch from 82e6e7a to a70aa24 Compare October 21, 2024 18:50
@ehigham ehigham force-pushed the ehigham/py4j-backend-extensions branch from b8b22b8 to 2d4edc1 Compare December 17, 2024 16:47
@ehigham ehigham force-pushed the ehigham/ctx-references branch from a33bb7f to b19073d Compare December 17, 2024 16:47
@ehigham ehigham force-pushed the ehigham/py4j-backend-extensions branch from 2d4edc1 to 79ecea8 Compare December 17, 2024 20:00
@ehigham ehigham force-pushed the ehigham/ctx-references branch from b19073d to 75f4fd5 Compare December 17, 2024 20:00
@ehigham ehigham force-pushed the ehigham/py4j-backend-extensions branch from 79ecea8 to 1d36659 Compare January 13, 2025 16:36
@ehigham ehigham force-pushed the ehigham/ctx-references branch from 75f4fd5 to 5c0308f Compare January 13, 2025 16:36
@ehigham ehigham force-pushed the ehigham/py4j-backend-extensions branch from 1d36659 to 0f02bdc Compare January 17, 2025 21:15
@ehigham ehigham force-pushed the ehigham/ctx-references branch from 5c0308f to 90a9649 Compare January 17, 2025 21:15
Base automatically changed from ehigham/py4j-backend-extensions to main January 21, 2025 16:37
@ehigham ehigham force-pushed the ehigham/ctx-references branch 4 times, most recently from 482d085 to 901fafe Compare January 22, 2025 16:45
@ehigham ehigham force-pushed the ehigham/ctx-references branch 2 times, most recently from 7c3af40 to dd052e7 Compare January 27, 2025 23:15
Collaborator

@chrisvittal chrisvittal left a comment

Looks good to me. Won't change my approval, I just have one question.

hail/hail/src/is/hail/io/reference/package.scala (outdated, resolved)
@ehigham ehigham force-pushed the ehigham/ctx-references branch 4 times, most recently from bde8755 to 1ed1a39 Compare January 29, 2025 21:40
Collaborator

@patrick-schultz patrick-schultz left a comment

Looks good overall, just a few questions

try {
backend.flags.set("shuffle_cutoff_to_local_sort", "40")
def testDistributedSortHelper(myTable: TableIR, sortFields: IndexedSeq[SortField]): Unit =
ctx.local(flags = ctx.flags + ("shuffle_cutoff_to_local_sort" -> "40")) { ctx =>
Collaborator

This is a bit awkward, and I don't love HailFeatureFlags having both a mutable interface and a + operator that returns a new object. How about if ExecuteContext.local took only the bindings to add/update?

Member Author

@ehigham ehigham Jan 30, 2025

I don't see what's awkward; can you elaborate?

This creates a local context where the flags are updated.
local itself accepts more than flags and takes care of the whole

val whatItWasBefore = get
set newValue
try {
...
} finally {
set whatItWasBefore
}

noise.
If you wanted to modify the flags for the rest of the execution, you could just do

ctx.flags.set("foo", "bar")
...

This interface gives a simple way of doing both.
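
The save/set/restore boilerplate described in this comment is the sort of thing a scoped helper hides. Below is a minimal sketch of that pattern, assuming a mutable flags store. `MutableFlagsSketch` and `withFlag` are hypothetical names, and this is not a claim about how `ExecuteContext.local` is actually implemented.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a mutable flags store; not Hail's HailFeatureFlags.
final class MutableFlagsSketch(initial: Map[String, String]) {
  private val values = (mutable.Map.empty[String, String] ++= initial)

  def get(key: String): Option[String] = values.get(key)
  def set(key: String, value: String): Unit = values.update(key, value)
  def unset(key: String): Unit = { values.remove(key); () }

  // Sets `key` for the duration of `body`, then restores the previous state
  // (including "absent"), even if `body` throws.
  def withFlag[A](key: String, value: String)(body: => A): A = {
    val before = get(key)
    set(key, value)
    try body
    finally {
      before match {
        case Some(old) => set(key, old)
        case None      => unset(key)
      }
    }
  }
}
```

A caller writes `flags.withFlag("shuffle_cutoff_to_local_sort", "40") { ... }` and never touches the restore logic, while a plain `set` still changes the flag for the rest of the execution.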

Collaborator

Huh, I could have sworn I replied to this, but my comment seems to have disappeared.

Sorry, I should have been more specific. I just meant that it might be simpler for ExecuteContext.local to take an overrideFlags: Seq[(String, String)] = Seq.empty, so you could call ctx.local(overrideFlags = Seq("shuffle_cutoff_to_local_sort" -> "40")). I.e. local overrides individual flags in the scope, rather than the entire flags map.

But this is minor, so I'm happy to approve as is if you don't want to change it.

Member Author

Thanks, I see what you're getting at. But how would I unset flags? By mapping the value to null? Yuck!
I think the current thing is simpler and so I'm inclined to keep it.
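
The trade-off in this exchange can be sketched side by side. This is an illustration under simplifying assumptions, not Hail's API: flags are modelled as a plain immutable `Map[String, String]`, and `CtxSketch`, `local` and `localOverriding` are hypothetical names.

```scala
// Hypothetical stand-in; flags are a plain immutable Map here, which is an
// assumption made for brevity (the real type is HailFeatureFlags).
final case class CtxSketch(flags: Map[String, String]) {

  // Variant A: the caller supplies the complete flags for the scope.
  // Adding, updating and removing are all ordinary Map operations.
  def local[A](flags: Map[String, String])(body: CtxSketch => A): A =
    body(copy(flags = flags))

  // Variant B: the caller supplies only bindings to add or update.
  // There is no way to express "remove this flag" without a sentinel value.
  def localOverriding[A](overrideFlags: Seq[(String, String)])(body: CtxSketch => A): A =
    body(copy(flags = flags ++ overrideFlags))
}

object CtxSketchDemo {
  def main(args: Array[String]): Unit = {
    val ctx = CtxSketch(Map("a" -> "1", "b" -> "2"))
    println(ctx.local(ctx.flags + ("a" -> "40"))(_.flags))    // add or update
    println(ctx.local(ctx.flags - "b")(_.flags))              // unset
    println(ctx.localOverriding(Seq("a" -> "40"))(_.flags))   // add or update only
  }
}
```

Variant B is shorter at the call site for the common case, but unsetting a flag would need a sentinel such as `null` or a change of the value type to `Option[String]`.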

hail/hail/src/is/hail/backend/Backend.scala (resolved)
@ehigham ehigham force-pushed the ehigham/ctx-references branch from 1ed1a39 to aca48d3 Compare January 30, 2025 19:52
@hail-ci-robot hail-ci-robot merged commit 0bdffbc into main Jan 31, 2025
3 checks passed
@hail-ci-robot hail-ci-robot deleted the ehigham/ctx-references branch January 31, 2025 18:29
4 participants