Known troubleshooting scenarios
This page lists specific operational issues that we know about, and how to solve them.
Contact us or contribute if there is a scenario you’d like to suggest or add yourself.
One offline unit and other units as secondaries
Problem: The primary unit went offline and the primary re-election failed, leaving the remaining units in read-only (RO) mode.
Solution:
Restart the mysqld_safe service on the secondaries, i.e.:
    # for each secondary unit `n`
    juju ssh --container mysql mysql-k8s/n pebble restart mysqld_safe
Wait for the update-status hook to trigger recovery. For faster recovery, you can shorten the update-status hook interval with:
    juju model-config update-status-hook-interval=30s -m mymodel
    # after recovery, set default interval of 5 minutes
    juju model-config update-status-hook-interval=5m -m mymodel
Explanation: When the secondaries are restarted, all MySQL instances come back as offline, which triggers a cluster recovery.
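The restart step above has to be repeated once per secondary unit, so it can be wrapped in a small loop. The following is a minimal sketch, not part of the charm or of Juju: the function name `restart_mysqld_safe`, the `DRY_RUN` variable, and the unit numbers `1` and `2` are assumptions made for illustration. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch: restart mysqld_safe on each listed unit of the mysql-k8s application.
# restart_mysqld_safe and DRY_RUN are hypothetical helpers, not Juju features.
# DRY_RUN=1 (the default) prints the commands; DRY_RUN=0 runs them.
restart_mysqld_safe() {
    for n in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "juju ssh --container mysql mysql-k8s/$n pebble restart mysqld_safe"
        else
            juju ssh --container mysql "mysql-k8s/$n" pebble restart mysqld_safe
        fi
    done
}

# Example call: units 1 and 2 are assumed to be the secondaries.
restart_mysqld_safe 1 2
```

Once the printed commands look right, setting `DRY_RUN=0` in the environment makes the same call run them against the current model.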
Two primaries, one in “split-brain” state
Problem: The original primary had a transient network cut, and a new primary was elected. On returning, the old primary entered a split-brain state.
Solution:
Restart the mysqld_safe service on the unit in split-brain state, i.e.:
    # using `n` as the unit in split-brain state
    juju ssh --container mysql mysql-k8s/n pebble restart mysqld_safe
Wait for the unit to rejoin the cluster.
Explanation: On restart, the unit resets its state and tries to rejoin the cluster as a secondary.
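The waiting step can be scripted as a simple polling loop. The sketch below is an illustration under stated assumptions: `wait_until` and `unit_is_active` are hypothetical helper names, and treating an `active` workload status on the unit's line of `juju status` as "rejoined" is an assumption, not something the charm documents here.

```shell
#!/bin/sh
# Sketch: retry a check command until it succeeds or the attempts run out.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0    # check passed
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1            # gave up after ATTEMPTS failed checks
}

# Hypothetical check (assumption): the unit's line in the tabular
# `juju status` output contains the word "active".
unit_is_active() {
    juju status "$1" | grep "^$1" | grep -q "active"
}
```

For example, `wait_until 20 30 unit_is_active mysql-k8s/1` would poll every 30 seconds for up to 20 attempts, where `mysql-k8s/1` stands in for the restarted unit.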