Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Schedule API updates #230

Merged
merged 9 commits into from
Sep 13, 2022
Merged

Schedule API updates #230

merged 9 commits into from
Sep 13, 2022

Conversation

dnr
Copy link
Member

@dnr dnr commented Aug 25, 2022

What changed?

  • Add StructuredCalendarSpec and specify that CalendarSpec gets compiled into it.
  • Add CronString which is also compiled into StructuredCalendarSpec.
  • Update various comments that were out of date or needed clarification.

Why?
Make the API easier to consume by SDKs and enable more advanced UI features.

Breaking changes
exclude_calendar is deprecated

@dnr dnr requested review from bergundy and lorensr August 25, 2022 06:31
@dnr dnr requested review from a team as code owners August 25, 2022 06:31
Copy link
Contributor

@lorensr lorensr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating! I like the comment fields ☺️

Comment on lines +184 to +189
// Any timestamps matching any of exclude_* will be skipped.
repeated CalendarSpec exclude_calendar = 3 [deprecated = true]; // use exclude_structured_calendar
repeated StructuredCalendarSpec exclude_structured_calendar = 9;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why deprecate? If non-structured is the better-DX input format, I'd allow it here too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, none of these is a great input format for excludes: the way this works (for now) is that excludes have to match the same "second" as the thing they're excluding. So the obvious:

exclude_calendar: {"month":"12","day_of_month":"31"}

doesn't work, because of the 0 defaults for h/m/s. You have to write

exclude_calendar: {"month":"12","day_of_month":"31","hour":"*","minute:"*","second":"*"}

For structured, you'd have to spell out all those ranges.

If you can come up with a better way to do excludes, please share. I think this will eventually require some user feedback. In the meantime I'd like to keep the number of options small. Actually, that's a good argument for not supporting any of this at all until we have a good solution. Maybe I should remove this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calendar spec has default 0 h/m/s. I assume @daily also uses default 0?

So for example if I create with @daily and exclude_calendar: {"month":"12","day_of_month":"31"}, it will exclude Dec 31st?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically correct (although I haven't implemented @daily yet). But it will fail to exclude if you add jitter, for example. Basically it doesn't capture the user's intent very well. I think we need to better understand the intent first. The uses I've heard of are mostly around excluding holidays, and carving out exclusions when external systems are known to be down for maintenance or whatever.

There's also the idea of creating a shared "holiday calendar" that can be referenced from multiple schedules so they don't all have to repeat/maintain the same (potentially large) list of excludes. Of course, this can also be done inside the workflow: check the start time against a list of holidays, possibly even obtained from an external service, and exit early if it's not supposed to run.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think exclude would be much easier to use if it applied to the pre-jitter-applied timestamp—would that be possible?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there other cases in which the exclude schedule you specify won't apply as intended? I think that if we say in the exclude docs that everything including seconds must match, and someone has a spec with seconds: 30, they'd get that they'd have to put seconds: 30 or seconds: '*' in the exclude.

One hard thing might be providing interval spec and calendar exclude. Like if my interval is every 10 minutes since epoc and I want to skip every 4th, then I'd like to be able to specify exclude_interval: { every: 40 }.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for excludes, we may want to default unspecified fields to *.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely don't like the idea of changing the semantics of Calendar or CronString depending on the context.

It does seem like excludes in general could use more work. As @lorensr said, in principle there's no reason why you couldn't have an exclude_interval. So we kinda want something like:

  message SpecSet {
    repeated CalendarSpec calendar;
    repeated CronString cron_string;
    repeated IntervalSpec interval;
    repeated StructuredCalendarSpec structured_calendar;
  }

  message ScheduleSpec {
    SpecSet include;
    SpecSet exclude;
  }

But that's a fairly incompatible change and I don't think we should go there until we have a better idea of what users actually want here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better idea of what users actually want here.

Better idea during beta or after full release of schedules? If latter, are we okay breaking the API post release? If not, should we make proto changes now such that adding exclude interval etc later would be backcompat?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after. we can change things in a compatible way later. possibly more messy than doing it upfront, but we have to release something before we know exactly what to do.

Comment on lines 350 to 355
// From spec:
// Some fields are too large/unimportant for the purpose of listing, so we'll clear them
// from this copy of spec: exclude_calendar, jitter, timezone_data.
// from this copy of spec: exclude_*, jitter, timezone_data.
ScheduleSpec spec = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd include exclude and jitter, just to avoid having to teach users they won't be there / avoid the confusion of:

  • having set exclude and jitter
  • not seeing them in the listinfo along with the other spec fields
  • thinking they created the schedule wrong

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that we're currently passing this value through a search attribute, as json, which is a bit of a hack until we have mutable memos. Search attributes are limited to 2048 bytes. And now the structured calendar specs are going to be a bunch larger too. I can make it try to include them and only drop them if they're too big.

I realize this is pretty awful, but it's only temporary until we have mutable memos. And it's only in list output; with describe you always get the whole thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we target mutable memo to be available in 1.18 and switch to use that before we make this more available?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think we can get it all ready in time...

Comment on lines 144 to 148
// Match years (2000-).
repeated Range year = 6;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will the default end value be? Like [{ start: 2000, end: 'INFINITY' }] or some large number, like max safe int32 end: 2147483647

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The effective end value (the end of time 😱) is 2100, but I don't want to put that in a comment since it's an easily-adjusted constant. But there's no limit to the values that can appear in this Range.

Note that none of these have "defaults": see the comment above, you have to specify something. The range in parens is just documentation about what the meaning of the fields are, i.e. month and day of month are 1-indexed, others are zero-indexed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the year input defaults to '*' (in either 5-part cron expression or calendar spec), so I'm wondering how that's represented in the response. [{ start: 2000, end: 2100 }]?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, yeah, if you get it back you'll see { start: 2000, end: 2100 }. That's not great, maybe we should have a better sentinel value there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ends at 2100 is not too bad. Maybe you want just document it that current implementation will ends at 2100, but it is easily adjustable. No big deal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer a better sentinel, but if we keep 2100, how would I document it—like this?

When using '*' for the year (default), the year response value will currently be [{ start: 2000, end: 2100 }]. The end value will change to a later year when the current year approaches 2100.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a specific suggestion for sentinel values? I'm fine with anything here, just not sure what's easiest to work with.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like [{ start: 2000, end: 'INFINITY' }] or some large number, like max safe int32 end: 2147483647

Between these two, integer is easier to work with. Maybe make it an even number so it's more obviously a sentinel? Like 1 billion.

If humans are still around by then, we'll probably at least be done with Earth
image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are ints, so we can't use a string here. practically, I think there's not much real difference. the max possible year isn't at thing the proto level anyway, it's just a convention. (well, unless we introduce a new field.) if we establish a strong convention, or another way to signal this (e.g. leave out the year field here), it can be done in a backwards-compatible way. so I think we can just leave this unspecified for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving out is my favorite solution

Comment on lines 91 to 92
// 6 fields: minute, hour, day_of_month, month, day_of_week, year
// 7 fields: second, minute, hour, day_of_month, month, day_of_week, year
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know we support year. Is there legit use case where you need to specify year?
hmmm, maybe you only want to run for a curtain year.... never thought about this before.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. It's a natural extension and other systems have it.
  2. It lets you do stuff like make a list of holidays (that fall on different dates each year). Those can be used for either actions or excludes.

Not quite related: I was also considering adding another type of spec

message PrecalculatedSpec {
  repeated google.protobuf.Timestamp timestamp = 1;
}

so we can support other types on the client-side without building them into the server, as long as you're willing to expand it all. Thoughts?

Comment on lines 144 to 148
// Match years (2000-).
repeated Range year = 6;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ends at 2100 is not too bad. Maybe you want just document it that current implementation will ends at 2100, but it is easily adjustable. No big deal.

// Calendar-based specifications of times. Note that calendar and
// cron_string are input-only! They will be compiled into
// structured_calendar when the schedule is created or updated, so when you
// describe a schedule, you'll only see that.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will only see that, what is that? (I think it is structured_calendar.)
// probably because i'm not native english speaker.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is admittedly awkward wording

Suggested change
// describe a schedule, you'll only see that.
// describe a schedule, you won't see calendar or cron_string, you'll see
// only structured_calendar and interval.

Comment on lines +184 to +189
// Any timestamps matching any of exclude_* will be skipped.
repeated CalendarSpec exclude_calendar = 3 [deprecated = true]; // use exclude_structured_calendar
repeated StructuredCalendarSpec exclude_structured_calendar = 9;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for excludes, we may want to default unspecified fields to *.

Comment on lines 350 to 355
// From spec:
// Some fields are too large/unimportant for the purpose of listing, so we'll clear them
// from this copy of spec: exclude_calendar, jitter, timezone_data.
// from this copy of spec: exclude_*, jitter, timezone_data.
ScheduleSpec spec = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we target mutable memo to be available in 1.18 and switch to use that before we make this more available?

Comment on lines 126 to 128
// calendar, in a form that's easy to work with programmatically. Each field is
// be one or more ranges.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// calendar, in a form that's easy to work with programmatically. Each field is
// be one or more ranges.
// calendar, in a form that's easy to work with programmatically. Each field can
// be one or more ranges.

// @every <interval>[/<phase>] is accepted and gets compiled into an
// IntervalSpec instead. <interval> and <phase> should have a unit suffix s, m,
// h, or d.
message CronString {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thought on this one: to make it even simpler, this could not be a message at all, we could just have repeated string cron_string in ScheduleSpec. And then to support a comment we can just stuff it in the string: document that any text after # will be treated as a comment and copied to the comment field. That seems a little more in line with the idea of having a single string shorthand. Thoughts?

Comment on lines 144 to 148
// Match years (2000-).
repeated Range year = 6;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a specific suggestion for sentinel values? I'm fine with anything here, just not sure what's easiest to work with.

Comment on lines 179 to 180
// See comment above for cron_string format.
repeated string cron_string = 8;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could move the comment here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm. I left there because it's really long and wanted this to read more easily, but I guess it'll screw up the generated docs, so I should move it

@dnr dnr merged commit ddf07ab into temporalio:master Sep 13, 2022
@dnr dnr deleted the sched5 branch September 13, 2022 00:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants