Package watcher
Tool to restart erroneously downed virtual machines.
  This program and set of classes implement a watchdog to restart 
  virtual machines in a Ganeti cluster that have crashed or been killed by 
  a node reboot.  Run from cron or similar.
Classes:
  NotMasterError
      Exception raised when this host is not the master.
  NodeMaintenance
      Talks to confd daemons and possibly shuts down instances/DRBD
      devices.
  WatcherState
      Interface to a state file recording restart attempts.
  Instance
      Abstraction for a Virtual Machine instance.
  Watcher
      Encapsulate the logic for restarting erroneously halted virtual
      machines.

Functions:
  ShouldPause()
      Check whether we should pause.
  StartNodeDaemons()
      Start all the daemons that should be running on all nodes.
  GetClusterData()
      Get a list of instances on this cluster.

  
    | 
       
     | 
      
      
     | 
  
    | 
      bool
     | 
      
      
     | 
  
    | 
       
     | 
      
      
     | 
  
    | 
       
     | 
      
      
     | 
  
Variables:
  MAXTRIES = 5
  RETRY_EXPIRATION = 8 * 3600
  BAD_STATES = ['ERROR_down']
  HELPLESS_STATES = ['ERROR_nodedown', 'ERROR_nodeoffline']
  NOTICE = 'NOTICE'
  ERROR = 'ERROR'
  KEY_RESTART_COUNT = "restart_count"
  KEY_RESTART_WHEN = "restart_when"
  KEY_BOOT_ID = "bootid"
  client = None
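
The retry constants and state lists above suggest a bounded-restart policy. The following is a minimal sketch of how such bookkeeping could work, assuming a per-instance dict keyed by KEY_RESTART_COUNT and KEY_RESTART_WHEN. The helper names (restart_attempts, record_attempt, next_action), the returned action strings, and the exact expiry semantics are assumptions made for illustration; the watcher's real logic may differ.

```python
import time

# Values copied from the variable list above.
MAXTRIES = 5
RETRY_EXPIRATION = 8 * 3600
BAD_STATES = ['ERROR_down']
HELPLESS_STATES = ['ERROR_nodedown', 'ERROR_nodeoffline']
KEY_RESTART_COUNT = "restart_count"
KEY_RESTART_WHEN = "restart_when"


def restart_attempts(record, now=None):
    """Return the number of unexpired restart attempts in `record`.

    `record` is a hypothetical per-instance dict. Attempts whose last
    timestamp is older than RETRY_EXPIRATION seconds are forgotten.
    """
    now = time.time() if now is None else now
    when = record.get(KEY_RESTART_WHEN, 0)
    if now - when > RETRY_EXPIRATION:
        record.pop(KEY_RESTART_COUNT, None)
        record.pop(KEY_RESTART_WHEN, None)
        return 0
    return record.get(KEY_RESTART_COUNT, 0)


def record_attempt(record, now=None):
    """Increment the attempt counter and stamp it with the current time."""
    now = time.time() if now is None else now
    record[KEY_RESTART_COUNT] = restart_attempts(record, now) + 1
    record[KEY_RESTART_WHEN] = now


def next_action(status, record, now=None):
    """Decide what to do with an instance in the given status (assumed policy)."""
    if status in HELPLESS_STATES:
        return "wait"      # the node itself is down or offline
    if status not in BAD_STATES:
        return "ok"        # nothing to repair
    if restart_attempts(record, now) >= MAXTRIES:
        return "give_up"   # retry budget exhausted until it expires
    record_attempt(record, now)
    return "restart"
```

Under these assumptions, an instance in ERROR_down is restarted at most MAXTRIES times, and the budget is restored once RETRY_EXPIRATION seconds have passed since the last attempt.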
  
Imports:
  os,
  sys,
  time,
  logging,
  OptionParser,
  utils,
  constants,
  serializer,
  errors,
  opcodes,
  cli,
  luxi,
  ssconf,
  bdev,
  hypervisor,
  rapi,
  confd_client,
  netutils,
  ganeti
  Opens the state file and acquires a lock on it.

    - Parameters:
        path (string) - Path to state file
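
As an illustration of the open-and-lock step described above, here is a minimal sketch using the standard fcntl module on a Unix system. The function name open_state_file, the file mode, and the choice of a non-blocking exclusive flock are assumptions; the documented function may rely on different Ganeti utilities.

```python
import fcntl
import os


def open_state_file(path):
    """Open `path` read/write, creating it if needed, and lock it.

    Raises OSError (BlockingIOError) if another open file description
    already holds the lock. The lock is released when the returned
    file object is closed.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Exclusive, non-blocking advisory lock on the whole file.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    return os.fdopen(fd, "r+")
```

Taking the lock in non-blocking mode lets a second watcher run fail fast instead of queueing behind the first one, which suits a job started periodically from cron.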
 
  Connects to the RAPI port of hostname and does a simple test. At this
  time, the test is GetVersion.

    - Parameters:
        hostname (string) - Hostname of the node to connect to.
    - Returns: bool
        Whether RAPI is working properly
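
The shape of such a check can be sketched without a network connection. The example below assumes a client object exposing a GetVersion() method and compares its result against an expected version supplied by the caller; the function name is_rapi_responding, the expected_version parameter, and the FakeClient stub are hypothetical and exist only for illustration.

```python
def is_rapi_responding(client, expected_version):
    """Return True if `client.GetVersion()` succeeds and matches.

    `client` is any object with a GetVersion() method; any exception
    raised by the call is treated as "RAPI is not working".
    """
    try:
        version = client.GetVersion()
    except Exception:
        return False
    return version == expected_version


class FakeClient:
    """Hypothetical stand-in for a real RAPI client, used for the demo."""

    def __init__(self, version=None, fail=False):
        self._version = version
        self._fail = fail

    def GetVersion(self):
        if self._fail:
            raise ConnectionError("RAPI port unreachable")
        return self._version


print(is_rapi_responding(FakeClient(version=2), 2))
print(is_rapi_responding(FakeClient(fail=True), 2))
```

Returning a bool rather than raising keeps the caller's decision simple: either the daemon answered the version query correctly or it needs attention.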
 
  Parse the command line options.

    - Returns:
        (options, args) as from OptionParser.parse_args()
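
For readers unfamiliar with the (options, args) return convention, a minimal sketch using the standard optparse module follows. The function name parse_options and the -d/--debug flag are assumptions chosen for illustration; the options actually accepted by the watcher are not listed on this page.

```python
from optparse import OptionParser


def parse_options(argv):
    """Parse `argv` (without the program name) and return (options, args)."""
    parser = OptionParser(usage="%prog [-d]",
                          description="Hypothetical watcher command line")
    # Hypothetical flag: enable debug logging.
    parser.add_option("-d", "--debug", dest="debug", action="store_true",
                      default=False, help="write debug messages")
    options, args = parser.parse_args(argv)
    return options, args


options, args = parse_options(["-d"])
print(options.debug, args)
```

Here options is an object whose attributes are named by each option's dest, and args is the list of leftover positional arguments.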
 
  Main function.

    - Decorators:
        @rapi.client.UsesRapiClient