Configurable Serf ReconnectTimeout and TombstoneTimeout #333

gregory-m · 2015-10-25T10:20:26Z

Part of our infrastructure runs on AWS spot instances. I experimented with Nomad on spot instances and ended up with 50 members in the "failed" state. I only run 3 instances concurrently, so I can imagine what will happen when I run 50 agents on spot instances. The default 3-day timeout is huge if you are running on spot instances.

If you agree, I can open a PR.

Thanks.

dadgar · 2015-10-26T18:12:58Z

Hey,

Is there a concern with this? Nomad won't schedule onto the failed nodes; they just remain in the system in case they reconnect.

But I agree, we need to expose more of the configuration, so a PR would be appreciated!

gregory-m · 2015-10-26T18:34:46Z

It's only a user interface issue. And yes, it only matters to those who run Nomad in highly changeable environments like spot instances. For example, we bid on 10 spot instances. After an hour or two somebody outbids us, and we decide to launch 10 regular instances. Five hours later spot instance prices go down, so we decide to run 10 spot instances again and shut down the regular ones.

This will lead to 20 instances in the "leave" state which will never rejoin the cluster.

Not related to this ticket, but one more question:
Serf doesn't provide any public methods to remove instances from the cluster. Maybe it's a good idea to make the "reap" function public?

github-actions · 2022-12-29T02:14:47Z

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

dadgar added type/enhancement theme/core labels Oct 26, 2015

This was referenced Oct 28, 2015

Serf timeouts #354

Closed

Configure Node GC Threshold #362

Merged

dadgar closed this as completed in #362 Oct 29, 2015

benbuzbee pushed a commit to benbuzbee/nomad that referenced this issue Jul 21, 2022

Update changelog. (hashicorp#333)

2016deb

github-actions bot locked as resolved and limited conversation to collaborators Dec 29, 2022
