fix(redis-ha): recover no-master cycle when sentinel names a replica as master#410
Open
mohamed3laa333 wants to merge 1 commit into
Open
Conversation
…as master After disorderly pod restarts the Sentinel quorum can settle on a master address that points at a node whose role is actually `slave`, leaving the cluster in an all-replicas / no-master cycle with HAProxy `bk_redis_master` serving 0 backends (a write outage). PR DandyDeveloper#404 (closing DandyDeveloper#397/DandyDeveloper#398) only added a `sentinel reset` for the case where `get-master-addr-by-name` returns an EMPTY master. The non-empty case (Sentinel names a real address that answers `role:slave`) instead falls into the `MASTER = ANNOUNCE_IP` branch, which calls `reinit`. For any pod that is not StatefulSet ordinal 0, `reinit` re-runs init.sh (which re-derives `slaveof`) and shuts the pod down, so it restarts straight back into a replica while Sentinel keeps naming it: an endless reinit/shutdown CrashLoop (the "unnecessary master shutdown" of DandyDeveloper#383) that never elects a master. Fix: when Sentinel names this pod as master but Redis here is a replica, promote it with `replicaof no one` instead of reinit-looping it. Sentinel has already elected this pod, and only one pod can match get-master-addr-by-name, so this cannot create two masters; the other replicas already target this address, so their links recover once it is a real master. This generalizes the self-healing of DandyDeveloper#404 to the pingable-slave variant. Refs DandyDeveloper#398, DandyDeveloper#397, DandyDeveloper#383.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
After disorderly pod restarts (e.g. several redis pods restarting near-simultaneously), the Sentinel quorum can settle on a master address that points at a node whose role is actually
slave. Every node ends up a replica of another (an all-replicas / no-master cycle), so HAProxybk_redis_masterhas 0 backends and writes fail until a human intervenes withsentinel reset.#404 (closing #397/#398) added a
sentinel resetself-heal, but it only fires whenget-master-addr-by-namereturns an empty master. The non-empty variant — Sentinel returns a real address that answersrole:slave— is not covered.Root cause
In
fix-split-brain.sh, whenMASTER == ANNOUNCE_IP(Sentinel names this pod as master) but the local role isslave, the script callsreinit.reinitre-runsinit.sh(which re-derivesslaveoffor any pod that is not StatefulSet ordinal 0) andshutdowns the pod. So the pod restarts straight back into a replica while Sentinel keeps naming it → an endless reinit/shutdown CrashLoop (the "unnecessary master shutdown" reported in #383) that never elects a master, and the cycle never breaks.Fix
When Sentinel names this pod as master but Redis here is a replica, promote it with
replicaof no oneinstead of reinit-looping it:get-master-addr-by-name, so this cannot create two masters.This generalizes the self-healing introduced in #404 to the pingable-slave variant. The
elsebranch keeps the original "no need to reinitialize" log for the healthy case.Validation
helm lintpasses;helm templateconfirmspromote_self/replicaof no onerender into the split-brainConfigMapand the rest of the chart is unchanged.Refs #398, #397, #383.