-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ruby phase 2 - Workflows #96
Conversation
ruby/sdk-phase-2.md
Outdated
* `Temporalio::Workflow` is both the class to extend and the class with class methods for calling statically to do | ||
things. | ||
* The `workflow_` class methods on the `Temporalio::Workflow` can only be called at class definition time and cannot | ||
be called on `Temporalio::Workflow` directly. These are for adjusting the definition. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When a user writing client code writes something like
supported_languages = wf_handle.query(
GreetingWorkflow.<TAB>
will their IDE show them all the "framework" methods (including the unavailable methods starting workflow_
)? It is nice that in other languages the user's IDE only offers the methods from their own workflow interface. I think we would like to have the same be true for Ruby if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I am afraid this is only available at runtime (via method_missing
). Ruby does not allow you to access an instance method at the class level.
I think we would like to have the same be true for Ruby if possible.
Completely agree, but I cannot think of a way to do this without some form of proxying which were trying to avoid for a few reasons (mostly because we'd have to lie about what the type actually is). And even if we did do proxying, Ruby is often not expressive enough to represent this well statically. Here may be some other options, but all have more tradeoffs than the current approach I think.
class Workflow
# @return [self]
def self.stub
StubDelegator.new(self)
end
end
# This if you want to keep accurate with typing, but no clear way to provide
# handle and args
my_wf = MyWorkflow.stub
my_wf.update('my arg')
# This if you want better options, but you'd fail a type checker
my_wf_handle.update(MyWorkflow.stub.update, 'my arg')
# You could use blocks, but it gets confusing for users too
my_wf_handle.update(wait_for_stage: whatever) { |stub| stub.update('my arg') }
Open to suggestions here, but I fear Ruby just cannot statically express things to the IDE like this. Ruby developers are used to having to use symbols or runtime methods that are not in their IDE. Of course a user can provide YARD macros or Sorbet/RBI types to support this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've probably already implicitly explained this, but in Python this is done by implementing static things like @query
decorators as functions on a workflow
module. Can you (re-)summarize why an approach like that doesn't work for Ruby?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Python you can reference an instance method via Class.instance_method
, but in Ruby you cannot. You must have an instance of a class to reference an instance method on it. FooWorkflow.my_query_method
will give a NoMethodError
. You can do hacky things like FooWorkflow.instance_method(:my_query_method)
to get an unbounde method, but you're into reflection there not IDE help.
We will support FooWorkflow.my_query_method
via method_missing
approaches, but that is runtime, not definition time like IDEs need. But I'm open to ideas here. There are concepts of "stubs" we have used in Java, but they have many of their own problems.
I am investigating https://github.com/Shopify/tapioca#writing-custom-dsl-compilers to see if we may be able to write a custom DSL compiler/extension to help here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have confirmed that even for Sorbet users, custom DSL compilers don't run in IDEs by default anyways. I cannot see a way at this time to support typesafe workflow messages without a fake/proxy type of object which I am trying to avoid. Ruby users don't really expect DSLs to be well typed in their IDE (and in the Ruby world, our need to define certain methods as special like as a query is only really done via DSL).
ruby/sdk-phase-2.md
Outdated
* ❓ Is this acceptable? This is a consequence of us wanting to share workflow context class with workflow base | ||
class. Basically at the top of these calls there's a `if self != Workflow` check. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's pretty much fine, but... why not just have a WorkflowContext
class and be explicit? Or, have that be Workflow
and the other be WorkflowDefinition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is definitely a good question and subject to debate and is the reason I put "Temporalio::Workflow
is both the class to extend and the class with class methods for calling statically to do things." at the top of the controversial decisions.
So basically you have these options:
Option 1 - Together
class MyWorkflow < Temporalio::Workflow
def execute
Temporalio::Workflow.wait_condition { false }
end
end
Option 2 - Context class named separate
class MyWorkflow < Temporalio::Workflow
def execute
Temporalio::Workflow::Context.wait_condition { false }
end
end
One benefit to this is that we do have Temporalio::Activity::Context
that houses all of the activity things, but if you look at some other SDKs like Java and .NET, the class name is "activity context" but not "workflow context", so that disparity does exist elsewhere.
Option 3 - Workflow class named separate
class MyWorkflow < Temporalio::WorkflowImpl
def execute
Temporalio::Workflow.wait_condition { false }
end
end
I am using WorkflowImpl
here because there will be a Temporalio::Workflow::Definition
that represents the static definition built from the impl (and signal, query, and definition classes, same as there is already Temporalio::Activity::Definition
). Open to a better name, I just struggled to come up with one.
Option 4 - Don't even require class extension
class MyWorkflow
def execute
Temporalio::Workflow.wait_condition { false }
end
end
You aren't buying that much with a class extension, but it is clear and is common in Ruby and is what we do for activities.
Thoughts/preference? Any other/better options here? Mixins would suffer the same class-for-two-purposes reuse issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like option 3. Clear separation, just as terse as option 1 for the vast majority of the code..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, which is basically the same as option 2 just different on who gets the "Workflow" part. I will say I like "Context" for users more than I like "WorkflowImpl" for extenders, but I do like callers of workflow utilities to only need Temporalio::Workflow
instead of Temporalio::Workflow::Context
(and what drove my original approach). Either way, I am starting to lean towards separating what you extend from what you invoke. Any ideas for other names though besides either "Context" for workflow utilities or "WorkflowImpl" for the extension class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really lol. Those are pretty obvious choices. Maybe WorkflowDefinition
like I suggested instead of impl. I would err on making that longer since you type it way less.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, after some internal discussion, I think I'm landing at something like:
module Temporalio
class Workflow
class << self
protected
def workflow_name(name)
raise 'TODO'
end
end
module API
def execute_activity(activity, blah1, blah2)
raise 'TODO'
end
end
extend API
end
end
So basically you have protected class methods for adjusting the workflow definition, then an API
set of module methods in the workflow class by default (extend
in this way I think avoids YARD assuming they are part of the class, still investigating). This will allow others to write utilities for workflows using Temporalio::Workflow::API
directly (or extending it themselves too). Doing a bit more investigation before accepting this approach.
EDIT: This is still no good, the IDEs and docs still see the dozens of methods in API
as part of the class extending Workflow
. I am looking at various options here, but I'm struggling to find a way beyond splitting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussion, proposal updated to require workflows extend from Temporalio::Workflow::Definition
and use module functions on Temporalio::Workflow
. As a consistency consequence, activities will now also have to extend from Temporalio::Activity::Definition
.
* 💭 Why? It is helpful for callers/client-side to be able to reference something more explicit than a symbol, and it | ||
also helps with making sure if the name is changed we use that name instead of the method name. Symbols can always | ||
be used though. | ||
* ❓ Is this acceptable or too much magic? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this kind of magic is largely fine because it's not really pretending some behavior is different, it's just advanced sugar for fetching the definition in a nice way. Works for me. The only sort of magic bit I suppose is that, if you call that, it's not invoking the method, but it would seem fairly obviously undesirable to do that anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. It is not very common in Ruby to have class methods and instance methods of the same name with different behavior, though it does happen here and there. But I think our use case uniquely justifies it.
The bigger struggle is type safety integration, but Ruby type safety is in such poor state I 1) cannot find a good way to support caller-side type safety in RBS/Sorbet (unless relying on something like this which doesn't work in IDEs anyways), and 2) do not want to degrade ease of use by the majority of users using this as normal Ruby without type safety checks. Users can of course write their own signatures for these things, e.g. in RBS you might have:
class MyWorkflow < Temporalio::Workflow
def self.my_update: -> Temporalio::Workflow::Definition::Update[MyReturnType, MyArgType]
def my_update: (MyArgType arg) -> MyReturnType
end
I am open to any/all designs here. Requirements for others reading:
- Workflow handlers for signal, query, and update must be easily implemented as instance methods
- Workflow handlers for signal, query, and update must be easily usable by clients without having an instance of the workflow (we don't want fake/proxy objects where method invocation does different things)
ruby/sdk-phase-2.md
Outdated
def self.any_of_no_raise(*futures) | ||
# Returns future whose result is set to nil when all futures complete | ||
# regardless of whether any future failed. | ||
def self.all_of_no_raise(*futures) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these are confusing. The wait
and wait_no_raise
method accomplishes the waiting & maybe raising part just fine.
From what I understand these are really more like different conditions for when any/all resolve. The first is more like any_success
and the second is really all_of
, where the current all_of
is really try_all_of
or something. IE: It looks like the originals short circuit on any failure, and these versions do not.
That or I'm not understanding them, because what's the difference between any_of_no_raise(...).wait
and any_of_no_raise(...).wait_no_raise
? The latter reads particularly strangely. Either way if I'm wrong or right, I think it says the names are a bit confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because what's the difference between
any_of_no_raise(...).wait
andany_of_no_raise(...).wait_no_raise
?
No big difference if you have both of these calls together. But they have value independently (the first one returns a future the second you have chosen not to even be able to look at the failure) and have value as consistent shortcuts. I can't imagine anyone would call the latter combo, wait_no_raise
will more often be used for people using singular futures they don't care about.
Most languages have forms of these. But I am open to better names though don't want to get too inconsistent with each other and don't want these to be arguments since they change the return type. Basic requirements I think are:
- Ability to wait for first of multiple futures or raise on first failure
- Ability to wait for first of multiple futures but do not raise on first failure
- Ability to wait on all of multiple futures or raise on first failure
- Ability to wait on all of multiple futures but do not raise on first failure
- Ability to wait on a future or raise on failure
- Ability to wait on a future and not raise on failure
This was my best attempt after many permutations to get those 6 methods in. Definitely open to clearer suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I would then change the names a-la my suggestions - the most important part I think is just not to use no_raise
in these names since it's not really about not raising, it's about not short-circuiting in the event one fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I would then change the names a-la my suggestions
Do you have a specific suggestion on names? So instead of:
any_of
any_of_no_raise
all_of
all_of_no_raise
wait
wait_no_raise
reading comment above, something like:
any_of
any_of_success
- says success but includes failure?try_all_of
- fails future if any fail?all_of
- does not fail future if any fail, so one "x_of" doesn't ail the future and one does?wait
wait_no_raise
Maybe something like try
instead of _no_raise
, e.g.
any_of
try_any_of
all_of
try_all_of
wait
try_wait
Looking for symmetry in naming between messages that fail their things and ones that don't
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the try names, but I think they should actually be the short-circuiting versions. IE: Existing any_of
and all_of
become the try_xxx
versions, and the current _no_raise
versions just become any_of
and all_of
. Since "try" implies "try until something breaks". This also matches the Rust naming scheme for these, FWIW.
I would probably keep wait_no_raise
though rather than try
there just because that's more obvious. And, I think, to my point, that is actually about raising or not where the other stuff is about short-circuiting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrmm, to me "try" means failure is captured or doesn't fail as it otherwise would, e.g. .NET int.Parse
vs int.TryParse
. One fails as normal, another captures failure (same as the actual try
statement in many languages).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The try
here is like "try them, but fail if any fails soon" whereas without it is "definitely do all the futures even if some of them fail".
But, I wouldn't cry about having it the other way around, I think that can work too w/ clear docs. I do think wait
and wait_no_raise
should stay that way though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, so wait
, wait_no_raise
settled. Any other term we can think of for "swallow error" form of the others besides try
? I will update proposal with the try_
form, but definitely open to other words here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Proposal updated to change any_of_no_raise
and all_of_no_raise
to try_any_of
and try_all_of
and clarify they do not raise when wait
ed on.
* ❓ Is this an acceptable default? Python uses `[Etc.nprocessors, 4].max` as the max threads, but we figure it can | ||
be unbounded here and remain bounded by the Core `max_concurrent_workflow_tasks` (TODO: expose this, it was left | ||
off in `main`). | ||
* ❓ Should the default share the same thread pool as activity executor? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so, specifically because of some of the workflow isolation stuff that may or may not need to happen in them. Seems easier and cleaner to keep them separate without much downside.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a difference between thread pool and executor in this case. One benefit of reusing the thread pool (but still different executors of course) is that it is an "unbounded caching thread pool" which means the same threads could be reused across workflows and activities. The threads, once returned back to the pool, should have no side effects from its use. By having separate unbounded thread pools here, both pools will have idle threads that the other could be using unnecessarily. Definitely not an easy/hard thing to share vs keep separate because we will have lazily created global defaults for these (and already do for activity executor).
So the only argument against I can think of is maybe some kind of concern for user clarity, but my hope is that users never touch this stuff for the most part.
ruby/sdk-phase-2.md
Outdated
is that not only are Ractors not stable and may have bugs, but they also give off a warning. A workflow instance will be | ||
in a Ractor and Core activations will be communicated. | ||
|
||
* ❓ Should we enable this by default, disable this by default, or somehow force the user to make a choice |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think anyone cares about choosing between these two really. I think we need to figure out what has acceptable performance and correctness and do that. The warning / experimental status for Ractors seems like a nonstarter, though.
But, other than that, I can't really imagine why someone is coming along and deciding "Oh I really need to use Ractors and not TracePoint" for some reason. The only thing they care about is that the workflows run fast & errors get caught.
That is, of course, excepting some other weird interactions that aren't mentioned here or we don't know about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The warning / experimental status for Ractors seems like a nonstarter, though.
I am not sure this is true. The Ractor gives so much benefit to ensuring safe determinism, it's basically a language-supported sandbox. And if we don't enable them by default (or force the user to choose whether to use them for now), we never can enable them by default. And they are quite decent for now for limited uses from my understanding.
I can't really imagine why someone is coming along and deciding "Oh I really need to use Ractors and not TracePoint" for some reason
This section is purely for Ractors, I think we will turn on TracePoint by default.
It is a very tough call here on whether to enable Ractors by default, because Ractors are a perfect fit for this use case and we would support turning them off (at the worker level), and if we have a non-Ractor default now, we will never be able to change to the safer Ractor default if we want to later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we never can enable them by default.
So why is that true?
I think the warning isn't so much the problem as the content of the warning that basically says "these are buggy". If that's not actually true, then cool, I think we can be fine with it, but, we should know that for sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So why is that true?
To clarify I mean "if we want this as default, we have to do the more restrictive default now because we can't later" because you can't go from a less-restrictive default to a more-restrictive default without breaking people. Ractors prevent state sharing (e.g. a global is different in each Ractor). Also, things that are shared across Ractors must be marked as such via https://docs.ruby-lang.org/en/master/Ractor.html#method-c-make_shareable (which is an awesome feature and we already require this be done with payload and failure converters knowing they may be used inside sandbox). So turning on Ractors by default when they weren't before would break a lot of people that were, say, using their logger or OTel impl or other things people sometimes opt out of sandboxes for.
I think the warning isn't so much the problem as the content of the warning that basically says "these are buggy". If that's not actually true, then cool, I think we can be fine with it, but, we should know that for sure.
We definitely have to test here, and they may be buggy for some advanced uses and maybe not our own. But even if that's the case, defaulting to experimental thing is rough and it spits out a warning, even though plenty of Ruby users do use Ractors today. Even if we default, we have to make it very easy to change and document it very clearly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. I see. But arguably they already need to not be doing all that stuff for proper workflows anyway?
I think, if we find that we're not running into implementation bugs, it can be an OK default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. I see. But arguably they already need to not be doing all that stuff for proper workflows anyway?
Usually, yes, but some things people accept they need to cheat (e.g. interceptor sending exception to sentry when not replaying, or a logger that writes to disk but is not the logger we offer)
I think, if we find that we're not running into implementation bugs, it can be an OK default.
I'm leaning there too. But it will be super loud that we use them and how not to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My vote is to force users to explicitly specify whether to use Ractors or not.
If we think we might want to change the default in the future, I think it's fine to require users to explicitly specify it now and remove the requirement later (so long as we're not asking them to do that for too many things). This enables us to migrate to the more restrictive default for future new users without breaking existing users.
We did something similar for Update IIRC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrmm, no SDK requires extra options for running workflows. I think if we choose Ractors we can change the default to be less restrictive if we ever had to (though I don't suspect we ever would).
We did something similar for Update IIRC.
This is not comparable I don't think because we know we will provide a default in the future but it is not yet implemented. Here, we don't know, so this isn't necessarily temporary. I think we might as well decide now, because in 5 years I don't think our decision would change.
I'm leaning towards the safer-though-technically-warns-as-experimental default that is easy to opt out of. Obviously it will be a heavily tested default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Proposal updated to Ractors by default with easy, well-documented way to opt out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
general non-blocking comments, no need to wait for my approval.
ruby/sdk-phase-2.md
Outdated
* ❓ Why not just take a dependency on `async` gem? It does have a deterministic scheduler and does a decent job of | ||
everything we want, but not only do we want to avoid transitive dependencies in our library (for versioning and | ||
other reasons), we cannot rely on its logic into the future. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is async
s deterministic behavior actually part of its contract or no? (I'm thinking of how we got burned by Python's change in behavior for asyncio; I imagine you are too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is asyncs deterministic behavior actually part of its contract or no?
Not explicitly, but most of the things it does are delegated to the underlying fiber scheduler. The other problem is not just that it is deterministic, but that it also doesn't make changes that would be incompatible (e.g. schedule two fibers in a different order in a newer version of the library).
Concerning determinism, they do say as a comment inside https://github.com/socketry/async/blob/72fae62658b68e85d53d4c6adf13d3b34108cfe8/lib/async/scheduler.rb#L428-L435 that they intentionally try to preserve determinism, but obviously a code comment is not a clear contract (and that's their scheduler, we'll have our own).
It's a tough call, because it's a full featured library with lots of utilities that leans on the deterministic fiber scheduler, and we surely don't want to recreate every construct they have already created. But maybe we don't want to encourage use of this project? Java SDK gets along just fine with its primitive Promise
support. We may still want to lean on traditional Ruby things like queues, timeouts, and sleep I think, but maybe not (see #96 (comment)). Regardless, we'd have to write our own mutex (we're already writing our own future).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this would be a much easier call if determinism were an explicit design goal, but since it's not as such, I think it's probably safer to avoid the dependency for now. Perhaps we could align our own API surface with its API so we could switch later if there is user demand? That might also help reduce the cognitive overhead of learning Temporal if the user already knows async
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's probably safer to avoid the dependency for now
To clarify, we would never have a dependency, just that it would happen to work and we could recommend its use (and we'd test that it does in our tests). Having said that, I think I agree maybe we should not even recommend its use or discuss it.
Perhaps we could align our own API surface with its API so we could switch later if there is user demand
It's too large. I think we can have Future
(what they call Task
) and Mutex
manually implemented, and then decide whether to explicitly implement sleep
and Timeout
and Queue
or allow reuse of the standard library ones (knowing in Ruby this isn't the same as leaning on a large set of async primitives, these are 3 very limited low-level common Ruby things that will not change behavior).
I think I'm leaning towards no async
library recommendation, Future
+ Mutex
implemented, sleep
+ Timeout
+ Queue
just work from the standard library. We haven't really needed more than this in any other language.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Proposal updated to discourage use of async
library and clarify that we do support stdlib things that work in all environments.
ruby/sdk-phase-2.md
Outdated
* `Temporalio::Worker::WorkflowTaskExecutor::ThreadPool` that accepts a thread pool which defaults to max threads | ||
unbounded. | ||
* There is a `default` class method with a lazily created global default. | ||
* ❓ Is this an acceptable default? Python uses `[Etc.nprocessors, 4].max` as the max threads, but we figure it can | ||
be unbounded here and remain bounded by the Core `max_concurrent_workflow_tasks` (TODO: expose this, it was left | ||
off in `main`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uhmmm, unbounded thread creation seems scary to me. I think I'm OK leaving it bounded by max_concurrent_workflow_tasks
as long as it's actually designed that way intentionally and it's not just something that happens by accident (e.g. it should be explicit in our code/code comments that this is the intention).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is bounded by that from Core. But that is per worker and the thread pool is shared across workers for obvious benefits. So yes, Core ensures it cannot be more than the sum of max concurrent tasks across all workers using the thread pool.
ruby/sdk-phase-2.md
Outdated
is that not only are Ractors not stable and may have bugs, but they also give off a warning. A workflow instance will be | ||
in a Ractor and Core activations will be communicated. | ||
|
||
* ❓ Should we enable this by default, disable this by default, or somehow force the user to make a choice |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My vote is to force users to explicitly specify whether to use Ractors or not.
If we think we might want to change the default in the future, I think it's fine to require users to explicitly specify it now and remove the requirement later (so long as we're not asking them to do that for too many things). This enables us to migrate to the more restrictive default for future new users without breaking existing users.
We did something similar for Update IIRC.
Updated with the following changes:
|
class GreetingWorkflow < Temporalio::Workflow::Definition | ||
workflow_query_attr :language | ||
|
||
def initialize(name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider: getting ahead of Workflow Init and making that a default in ruby. I don't foresee a lot of downside to taking workflow args by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it should be the default and we chose not to make it the default in .NET even though [WorkflowInit]
was present since its inception. There is a workflow_init
below to facilitate opting in just like every other SDK that allows it, not make Ruby some special thing here. Would also like not to eagerly validate that the constructor arguments match the execute arguments if not opted in. And the exact workflow_init
behavior likely wouldn't default anyways, instead we'd have to reflectively look at initialize
and it'd optionally be provided start args only if they were accepted (which is what @dandavison considered doing instead of @workflow.init
in Python before deciding otherwise). We've found requiring either "parameterless + no workflow init" or "workflow init + exact execute parameter matching" to be safer/clearer than "optional workflow init based on reflection".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBC, what I'm proposing is just to require initialize to take start workflow args always. That feels like the most normal and straightforward approach in any object-oriented setting, familiar to everyone. Your approach will dramatically reduce the number of folks who take advantage.
I understand that from your perspective, the workflow init approach is "normal." But stepping out of your Chad shoes, what we've done in other SDKs is pretty esoteric. To most users, using the constructor is normal unless they happen to have used temporal in other languages, and even they will be plenty familiar with constructors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think a default parameterless constructor is esoteric nor do I think an absent constructor is esoteric. These are the defaults for the class in the language, and can/should be the default for us. What is esoteric in Ruby and most programming languages is requiring a constructor with certain parameters even if you don't need it. That's what's non-normal.
@mutex = Temporalio::Workflow::Mutex.new | ||
end | ||
|
||
def execute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: might want to show how args appear here.
(should they appear here, or should they just be part of initialize?)
I think having both methods take them is a bit confusing, but I can buy the convenience of having execute
take them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(should they appear here, or should they just be part of initialize?)
Here
I think having both methods take them is a bit confusing, but I can buy the convenience of having execute take them.
Like every other SDK, and consistent with our behavior of workflow entry points
end | ||
|
||
workflow_update_validator(:set_language) | ||
def validate_set_language(language) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the experience if they don't keep the args in sync b/w validator and handler? I'd like them to get reasonable feedback.
suggestion: This seems like one to see if sorbet/rbs can wrangle static typesafety via a generic, perhaps by putting in a nested class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fails at register time like every other register-time validator in every other SDK
This seems like one to see if sorbet/rbs can wrangle static typesafety via a generic, perhaps by putting in a nested class.
I can't envision how a nested class might look, regardless would like to keep this as simple as possible for regular Ruby users
activities: [GreetingActivity], | ||
workflows: [GreetingWorkflow] | ||
) | ||
worker.run(shutdown_signals: ['SIGINT']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is shutdown_signals
a choice that the user needs to make, or is there a good default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a choice that a user needs to make. This is an application not a library and it's not neighborly for us to trap signals by default (or have any global side effects by default). This is already in main
btw and was part of the activity PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to comment here since that PR is in main.
I think this is important since every user will need to set up a Worker
, and this will be part of the "getting started" scenario, so I have a few questions.
Looks like we default to []
. If this is a choice that every developer needs to make, why is there a default? What % of users do you think that default will be acceptable for?
Looking at your docs: https://github.com/temporalio/sdk-ruby/blob/67741fcbd9e354b9a8d03e7e923efe8844ccf1ca/temporalio/lib/temporalio/worker.rb#L82
I don't see any guidance. It's typed as String | Int
in rbs, and I don't see any runtime validations. How do users know what values are acceptable or get guidance on what's normal to pass in there in what situations and get feedback if they make a mistake?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is a choice that every developer needs to make, why is there a default?
It is not a choice every developer will make. This is a library and many developers will prefer to use a block body or the Cancellation
token as a way to control worker lifetime. Signals is but one way to stop a worker, by no means the default and by no means one that everyone will want. These workers are just things that run how the user wants, we just provide this helper to cancel the Cancellation
for them on Signal.trap
.
How do users know what values are acceptable or get guidance on what's normal to pass in there in what situations and get feedback if they make a mistake?
I will clarify in docs that these values are just passed to Signal.trap
. That's typed in https://github.com/ruby/rbs/blob/d9000d23c2c1d032a194848562806896c54fd0af/core/signal.rbs#L92-L93 and https://github.com/sorbet/sorbet/blob/0ab8f3c29260f8ecee31aa960de3692419e2d3c3/rbi/core/signal.rbi#L100-L113 as general item (I need to add Symbol
there it seems), there is no more typing guidance than that. Unsure what is expected wrt "get feedback if they made a mistake" but if https://docs.ruby-lang.org/en/master/Signal.html#method-c-trap is not documented enough in this area, we can consider contributing.
# Send query | ||
supported_languages = wf_handle.query( | ||
GreetingWorkflow.languages, | ||
{ include_unsupported: false } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's our static type safety story here? Probably worth making sure that .rbi users can get good static validations..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Static typing is still under research, but our early research has shown it would not affect design (nor would we want it to for the majority of Ruby users that don't care about it and don't deserve to be affected by it), so we are not blocking this effort.
ruby/sdk-phase-2.md
Outdated
# Exposes an attribute as a query. 💭 Why? We found this _very_ valuable in | ||
# .NET. Think of it as attr_reader for workflows. It was also decided there | ||
# was not enough value to have attr_accessor shortcut for update or | ||
# attr_writer shortcut for signals since both are not commonly just | ||
# getter/setter. | ||
def self.workflow_query_attr(query_method) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is cool.
nit: I wonder if workflow_query_attr_reader
would more grokkable by the ruby community, despite the partial reduncancy of query vs reader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Can add suffix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
def self.workflow_name(name) | ||
|
||
# Placed above initialize to get start arguments. | ||
def self.workflow_init |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as mentioned above, I wonder if we can avoid this rigmarole since we're starting fresh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replied above, but I think this should be opt-in and was chosen that way in the last SDK even though we were starting fresh there too.
* Handler methods will use `RawValue` for args if told to, similar with workflow `execute`/`initialize` if | ||
`workflow_raw_args` is set. | ||
* `RawValue` can be returned too. | ||
* To mark a workflow as dynamic, `workflow_dynamic` class method is called. Only one dynamic workflow can be present for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a layering violation. The workflow itself doesn't need to know whether it's dynamic; instead, it should be registered as dynamic on the worker, which also makes it easier to select which is the dynamic workflow for the worker. Without that, if the user has two workflows registered as dynamic_workflow
in the same codebase, it would barf even if only one is registered in the worker.
See https://github.com/coinbase/temporal-ruby/blob/f41efb7bb0c9e755a07382b94e00a4f31bcd33c1/lib/temporal/worker.rb#L97 for what we did in the coinbase SDK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The workflow itself doesn't need to know whether it's dynamic
This is not true in any SDK nor should it be true here. Dynamic workflows/activities should be allowed to be in the same list at registration time with the rest of the workflows. We want authors to clearly mark and be aware that a workflow, activity, signal, query, or update is dynamic. While technically dynamic workflow and activity's signatures are no different, dynamic signal, query, and update signatures are different and must opt in. So all dynamics in all forms are clearer as author opt-in.
Without that, if the user has two workflows registered as dynamic_workflow in the same codebase, it would barf even if only one is registered in the worker.
This is not true here or in any SDK, this is only checked at registration time, same as in every SDK with dynamics. You can have as many as you want implemented, you can only have one dynamic per worker. There are no globals here.
|
||
## Workflow Asynchrony | ||
|
||
Ruby does not have a built-in task/promise/future/etc concept. However it does have fibers and we can leverage those. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like this is out of date and seems to contradict your Future
stuff below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The statements are accurate. Ruby does not have a built-in concept and we can leverage fibers. We add our own future later on top of fibers of course.
class ManyActivityWorkflow < Temporalio::Workflow | ||
def execute(count) | ||
futures = count.times.map do | ||
Future.new do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found Temporal ruby's implicit Futures weird, so this could work well.
However, please try to make sure that return types propagate through the future for static types. Would be a shame to lose type safety for async.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Static typing is still under research, but our early research has shown it would not affect design (nor would we want it to for the majority of Ruby users that don't care about it and don't deserve to be affected by it), so we are not blocking this effort.
ruby/sdk-phase-2.md
Outdated
def info | ||
def current_update_info |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def info | |
def current_update_info | |
def self.info | |
def self.current_update_info |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Made two minor fixes, but otherwise no outstanding/unresolved things (or even updates) for a while. Merging. |
Summary
For general discussion beyond comments here, please also join us on
#ruby-sdk
in Slack.