Script ganeti_watcher
Tool to restart erroneously downed virtual machines.
This program and set of classes implement a watchdog to restart
virtual machines in a Ganeti cluster that have crashed or been killed by
a node reboot. Run from cron or similar.

NotMasterError
Exception raised when this host is not the master.

NodeMaintenance
Talks to confd daemons and possibly shuts down instances/DRBD
devices.

WatcherState
Interface to a state file recording restart attempts.

Instance
Abstraction for a Virtual Machine instance.

Watcher
Encapsulates the logic for restarting erroneously halted virtual
machines.

ShouldPause()
Check whether we should pause.

StartNodeDaemons()
Start all the daemons that should be running on all nodes.

RunWatcherHooks()
Run the watcher hooks.

GetClusterData()
Get a list of instances on this cluster.

OpenStateFile(path)
Opens the state file and acquires a lock on it.

IsRapiResponding(hostname)
Connects to RAPI port and does a simple test. Returns bool.

MAXTRIES = 5
RETRY_EXPIRATION = 8 * 3600
BAD_STATES = ['ERROR_down']
HELPLESS_STATES = ['ERROR_nodedown', 'ERROR_nodeoffline']
NOTICE = 'NOTICE'
ERROR = 'ERROR'
KEY_RESTART_COUNT = "restart_count"
KEY_RESTART_WHEN = "restart_when"
KEY_BOOT_ID = "bootid"
client = None

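The constants above suggest a retry policy: instances in BAD_STATES are restart candidates, instances in HELPLESS_STATES cannot be helped by a restart, and attempts are counted and expire after a while. The following is a minimal sketch of one such policy, not the watcher's actual code. The function name `decide_action`, its return strings, the per-instance `record` dict, and the exact way MAXTRIES and RETRY_EXPIRATION interact are all assumptions made for illustration.

```python
import time

# Values copied from the variable list above.
MAXTRIES = 5
RETRY_EXPIRATION = 8 * 3600
BAD_STATES = ["ERROR_down"]
HELPLESS_STATES = ["ERROR_nodedown", "ERROR_nodeoffline"]
KEY_RESTART_COUNT = "restart_count"
KEY_RESTART_WHEN = "restart_when"


def decide_action(status, record, now=None):
    """Return 'restart', 'give_up', 'skip' or 'ok' for one instance.

    `record` is a hypothetical per-instance dict as it might be kept in
    the state file; it is updated in place when a restart is chosen.
    """
    now = time.time() if now is None else now
    if status in HELPLESS_STATES:
        # The node itself is down or offline; restarting cannot help.
        return "skip"
    if status not in BAD_STATES:
        return "ok"
    count = record.get(KEY_RESTART_COUNT, 0)
    when = record.get(KEY_RESTART_WHEN, 0)
    if now - when > RETRY_EXPIRATION:
        # Assumed behaviour: old attempts are forgotten after expiry.
        count = 0
    if count >= MAXTRIES:
        return "give_up"
    record[KEY_RESTART_COUNT] = count + 1
    record[KEY_RESTART_WHEN] = now
    return "restart"
```

Passing `now` explicitly keeps the decision deterministic, which makes the policy easy to exercise without waiting for real time to pass.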
Imports:
os, sys, time, logging, OptionParser, utils, constants, serializer,
errors, opcodes, cli, luxi, ssconf, bdev, hypervisor, rapi,
confd_client, netutils, ganeti

OpenStateFile(path)
Opens the state file and acquires a lock on it.
- Parameters:
path (string) - Path to state file
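Opening a file and locking it in one step can be sketched as follows. This is an illustration only, assuming a Unix host and an exclusive, non-blocking `fcntl.flock` lock; the locking primitive the watcher really uses is not stated here, and the name `open_state_file` and its return-None-on-contention behaviour are hypothetical.

```python
import fcntl
import os


def open_state_file(path):
    """Open `path` read/write (creating it if needed) and lock it.

    Returns the open file object, or None if another holder already
    has the lock. The lock is released when the file is closed.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Exclusive lock; fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return os.fdopen(fd, "r+")
```

Taking the lock before reading anything is what prevents two watcher runs from updating the restart counters at the same time.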
IsRapiResponding(hostname)
Connects to RAPI port of hostname and does a simple test. At this
time, the test is GetVersion.
- Parameters:
hostname (string) - hostname of the node to connect to.
- Returns: bool
- Whether RAPI is working properly
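The check described above amounts to "build a client, call GetVersion, report success or failure". A minimal sketch, under stated assumptions: `client_factory` is a hypothetical injection point standing in for whatever RAPI client class is used, and comparing the result against `expected_version` (default 2 here) is an assumption, not something this page specifies.

```python
def is_rapi_responding(hostname, client_factory, expected_version=2):
    """Return True if a RAPI client for `hostname` answers GetVersion().

    `client_factory` is a hypothetical callable that takes a hostname
    and returns an object with a GetVersion() method.
    """
    try:
        client = client_factory(hostname)
        version = client.GetVersion()
    except Exception:
        # Any connection or protocol failure counts as "not responding".
        return False
    return version == expected_version
```

Injecting the factory keeps the function usable in tests with a stub client, without a running RAPI daemon.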
Parse the command line options.
- Returns:
- (options, args) as from OptionParser.parse_args()
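For readers unfamiliar with the (options, args) return shape of OptionParser.parse_args(), here is a small sketch. The `-d/--debug` flag and the name `parse_options` are hypothetical and chosen only for illustration; the watcher's real option set is not listed on this page.

```python
from optparse import OptionParser


def parse_options(argv=None):
    """Return (options, args) as OptionParser.parse_args() does.

    With argv=None, optparse falls back to sys.argv[1:].
    """
    parser = OptionParser(usage="%prog [-d]")
    # Hypothetical flag, for illustration only.
    parser.add_option("-d", "--debug", dest="debug", action="store_true",
                      default=False, help="write debug messages")
    return parser.parse_args(argv)
```

For example, `parse_options(["-d", "extra"])` yields an options object whose `debug` attribute is True, plus the leftover positional list `["extra"]`.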
Main function.
- Decorators:
@rapi.client.UsesRapiClient