Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Sentinel - support hostname resolve and annonce #1247

Merged
merged 3 commits into from
Feb 28, 2025

Conversation

mkl262
Copy link
Contributor

@mkl262 mkl262 commented Feb 23, 2025

Description
This feature enables support for hostname resolution and announcing for sentinel configuration, for a more flexible configuration

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist

  • Tests have been added/modified and all tests pass.
  • Functionality/bugs have been confirmed to be unchanged or fixed.
  • I have performed a self-review of my own code.
  • Documentation has been updated or added where necessary.

Additional Context

An additional pull request was created in redis docker image repo to support new environment variables.
OT-CONTAINER-KIT/redis#94

@mkl262 mkl262 force-pushed the sentinel-hostname-support branch 2 times, most recently from baa9a65 to 5b60bc7 Compare February 24, 2025 19:18
@mkl262 mkl262 changed the title Sentinel - support hostname resolve and annonce master feat: Sentinel - support hostname resolve and annonce master Feb 25, 2025
@mkl262 mkl262 changed the title feat: Sentinel - support hostname resolve and annonce master feat: Sentinel - support hostname resolve and annonce Feb 25, 2025
@nicogigi92
Copy link

+1 for this

I tested this PR, and it works well on my side. This is a feature we would love to see included in the operator. Is it planned to be merged?

Just few comment if I may:

  • It would be great to be able to add the cluster's internal domain to the sentinels' hostname. For performance reasons, it's sometimes better to avoid iterating over search domains by using FQDNs.

  • I also noticed one missing piece: currently, only the cluster discovery happens via hostname, but the runtime still relies on IPs. The following needs to be added to ensure full hostname-based communication:
    sentinel announce-ip $HOSTNAME
    Without this, we end up with IP addresses in the Sentinel config instead of hostnames:
    sentinel known-sentinel myMaster 10.244.1.9 26379 17cfaeaa4102e39fdcb8336d7a992fedf13db4a1

  • Last but not least, is there a planned PR to implement the same behavior for replication so that they communicate and announce using hostnames? By doing this, we could avoid having to push IP/HOSTNAME env var changes through the reconciliation loop. We could rely solely on hostnames, and Redis Sentinel would handle the master <-> slave failover on its own thanks to DNS resolution.

@drivebyer
Copy link
Collaborator

Fix the lint check plz @mkl262

@mkl262 mkl262 force-pushed the sentinel-hostname-support branch from 5b60bc7 to 06109c9 Compare February 27, 2025 18:33
@mkl262
Copy link
Contributor Author

mkl262 commented Feb 27, 2025

@nicogigi92

I am working on improving the redundancy of the redis replication after some internal test.
I am planing to add a service with a clusterIP for each of the pods and the service DNS or its IP will be used for replication between redis pods. This will allow each pod to have a static ip address.
I will also pass the list of services into sentinel and add them as known replicas.

@drivebyer should be fixed

image

Copy link

codecov bot commented Feb 28, 2025

Codecov Report

Attention: Patch coverage is 56.09756% with 18 lines in your changes missing coverage. Please review.

Project coverage is 34.97%. Comparing base (60ef5a2) to head (ecbf821).
Report is 50 commits behind head on master.

Files with missing lines Patch % Lines
pkg/k8sutils/redis-sentinel.go 56.09% 15 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1247      +/-   ##
==========================================
+ Coverage   30.20%   34.97%   +4.76%     
==========================================
  Files          55       49       -6     
  Lines        6294     5573     -721     
==========================================
+ Hits         1901     1949      +48     
+ Misses       4200     3452     -748     
+ Partials      193      172      -21     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: yangw <[email protected]>
Signed-off-by: yangw <[email protected]>
@drivebyer drivebyer enabled auto-merge (squash) February 28, 2025 03:39
Copy link
Collaborator

@drivebyer drivebyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@drivebyer drivebyer merged commit aafc663 into OT-CONTAINER-KIT:master Feb 28, 2025
19 checks passed
@nicogigi92
Copy link

nicogigi92 commented Feb 28, 2025

Thank you for the clarification @mkl262 !

However, I still don't understand why additional services are being added and why the operator is given control over the master. The replicas only need to announce their hostname (headless service) via announce-ip, and the failover will be handled by Sentinel without requiring an environment variable push (which would trigger a StatefulSet rollout).

Moreover, I still don't see the use of the announce-ip option in the Sentinel configuration, which, according to the official Redis documentation, seems necessary to avoid mixing hostnames and IP addresses.

When I ran tests, I found that using announce-ip <headless-svc> without the operator improved the high availability of a replication + Sentinel system because it eliminates the rollouts triggered by the operator while maintaining cluster synchronization over DNS.

Please let me know if I'm missing something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants