Improve workflow task retry handling #431

Open
dhiaayachi opened this issue Sep 5, 2024 · 2 comments

Comments

@dhiaayachi
Owner

Is your feature request related to a problem? Please describe.
The existing logic slows down workflow task retries by increasing the workflow task start-to-close timeout (up to 10 min) and relying on the fact that the SDK won't respond with a workflow task failure if the workflow task attempt is greater than 1.

However, this doesn't work well with features that need to wait for a pending workflow task to complete.
One example is Query. If there's a pending workflow task, a query must wait for it to complete before it can be dispatched. Until then, the query is buffered in memory in the workflow mutable state. Since the workflow task can now take a long time to complete, the mutable state may get evicted from the cache and the query API will fail with an Unavailable error, or the query API call itself can time out.

Describe the solution you'd like
Essentially, we need a better way to handle workflow task retries. Options:

  • Workflow pause feature
  • Do not override the workflow task start-to-close timeout; instead, schedule a backoff time for the next workflow task attempt.
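
The second option could look roughly like the sketch below: the start-to-close timeout stays fixed and only the time at which the next attempt is scheduled moves out. This is a minimal illustration, not the Temporal server's implementation. The type and method names (WorkflowTaskAttemptPlan, WorkflowTaskRetryPlanner, PlanNextAttempt) are hypothetical, and the 10 s timeout, 1 s initial backoff, doubling factor, and 10 min cap are assumed values chosen for the example.

```csharp
using System;

// Demo: print the wait before the next attempt for a few failed attempts.
var now = DateTimeOffset.UtcNow;
foreach (int failedAttempt in new[] { 1, 2, 3, 12 })
{
    var plan = WorkflowTaskRetryPlanner.PlanNextAttempt(failedAttempt, now);
    Console.WriteLine(
        $"after attempt {failedAttempt}: wait {(plan.ScheduleAt - now).TotalSeconds} s, " +
        $"timeout {plan.StartToCloseTimeout.TotalSeconds} s");
}

// Hypothetical type: when to schedule the next attempt and which timeout it gets.
public sealed record WorkflowTaskAttemptPlan(DateTimeOffset ScheduleAt, TimeSpan StartToCloseTimeout);

// Hypothetical planner; all constants are assumptions made for this sketch.
public static class WorkflowTaskRetryPlanner
{
    private static readonly TimeSpan StartToCloseTimeout = TimeSpan.FromSeconds(10);
    private static readonly TimeSpan InitialBackoff = TimeSpan.FromSeconds(1);
    private static readonly TimeSpan MaxBackoff = TimeSpan.FromMinutes(10);

    // failedAttempt is 1-based: the number of the attempt that just failed.
    public static WorkflowTaskAttemptPlan PlanNextAttempt(int failedAttempt, DateTimeOffset now)
    {
        if (failedAttempt < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(failedAttempt));
        }

        // Double the backoff per failed attempt; bound the exponent so the shift cannot overflow.
        int exponent = Math.Min(failedAttempt - 1, 20);
        long backoffMs = (long)InitialBackoff.TotalMilliseconds << exponent;
        backoffMs = Math.Min(backoffMs, (long)MaxBackoff.TotalMilliseconds);

        // Only the schedule time moves; the timeout is never stretched.
        return new WorkflowTaskAttemptPlan(now + TimeSpan.FromMilliseconds(backoffMs), StartToCloseTimeout);
    }
}
```

With these assumed values the waits are 1, 2, 4, and 600 seconds, while the timeout stays at 10 seconds for every attempt. Because the task is not pending during the backoff, a buffered query would not have to sit behind a single task that can run for minutes.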


@dhiaayachi
Owner Author

Thank you for reporting this issue.

The behavior you're describing, where the existing logic slows down workflow task retries and affects features like Query, is a known issue that is being actively worked on. There is no immediate workaround; until a fix is implemented, we recommend keeping Temporal's default behavior for retrying workflow tasks.

We appreciate your patience and understanding as we work to resolve this.

@dhiaayachi
Owner Author

Thank you for reporting this issue. The behavior you're experiencing with workflow task retries and its impact on features like Query is a known issue. We are actively working on addressing this in future releases.

In the meantime, a potential workaround is to set the next retry delay from your Activity code, based on the attempt number, when the Activity fails. This lets you control the interval between Activity retries dynamically instead of relying on a fixed retry policy.

Here's an example of how to implement this:

using System;
using System.Threading.Tasks;
using Temporalio.Activities;
using Temporalio.Exceptions;

public class MyActivities
{
    private const int MaxRetryAttempts = 5;
    private const int InitialRetryInterval = 1000; // milliseconds
    private const int BackoffCoefficient = 2;
    private static readonly TimeSpan MaximumRetryInterval = TimeSpan.FromMinutes(5);

    [Activity]
    public async Task<string> MyActivityAsync()
    {
        // ... your activity logic; the code below runs when it fails ...

        // Attempt is 1-based: 1 on the first execution, 2 on the first retry, and so on.
        int attempt = ActivityExecutionContext.Current.Info.Attempt;
        if (attempt < MaxRetryAttempts)
        {
            // Exponential delay, capped at MaximumRetryInterval
            TimeSpan retryDelay = TimeSpan.FromMilliseconds(InitialRetryInterval * Math.Pow(BackoffCoefficient, attempt - 1));
            retryDelay = TimeSpan.FromMilliseconds(Math.Min(retryDelay.TotalMilliseconds, MaximumRetryInterval.TotalMilliseconds));
            throw new ApplicationFailureException("Retry due to error", errorType: "RetryError", nextRetryDelay: retryDelay);
        }

        // Max attempts reached: mark the failure as non-retryable so no further retry is scheduled
        throw new ApplicationFailureException("Max retries reached", errorType: "MaxRetryError", nonRetryable: true);
    }
}

This approach allows your Activity to dynamically adjust the retry interval based on the number of attempts, potentially preventing the workflow task timeout from escalating unnecessarily.
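
To make the arithmetic concrete, the sketch below evaluates the same delay formula for each attempt that can still retry, using the constants from the example (1000 ms initial interval, coefficient 2, 5-minute cap, 5 attempts). The RetryDelays class and DelayFor method are hypothetical names introduced only for this illustration, and the code has no Temporal dependency.

```csharp
using System;

// Attempts 1..4 are retried in the example; attempt 5 takes the "max retries" branch.
for (int attempt = 1; attempt < 5; attempt++)
{
    Console.WriteLine($"attempt {attempt} failed -> next retry in {RetryDelays.DelayFor(attempt).TotalSeconds} s");
}
// prints:
// attempt 1 failed -> next retry in 1 s
// attempt 2 failed -> next retry in 2 s
// attempt 3 failed -> next retry in 4 s
// attempt 4 failed -> next retry in 8 s

// Hypothetical helper that mirrors the delay calculation in the example.
public static class RetryDelays
{
    private const int InitialRetryIntervalMs = 1000;
    private const int BackoffCoefficient = 2;
    private static readonly TimeSpan MaximumRetryInterval = TimeSpan.FromMinutes(5);

    // attempt is 1-based, matching the Activity attempt counter.
    public static TimeSpan DelayFor(int attempt)
    {
        double delayMs = InitialRetryIntervalMs * Math.Pow(BackoffCoefficient, attempt - 1);
        return TimeSpan.FromMilliseconds(Math.Min(delayMs, MaximumRetryInterval.TotalMilliseconds));
    }
}
```

With these constants the 5-minute cap is never reached: the formula first exceeds 300 s at attempt 10 (512 s), well past the limit of 5 attempts.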

We appreciate your feedback and will keep you updated on the progress of this issue.
