Design for executing commands via RPC

Current state and shortcomings

We have encountered situations where a node no longer responded to SSH connection attempts, or where SSH became unavailable through other means. Quite often the node daemon is still available in such situations, even when there is little free memory, because the node daemon is locked into main memory using mlock(2).

Since the node daemon does not allow the execution of arbitrary commands, quite often the only solution left was either to attempt a powercycle request via said node daemon or to physically reset the node.

Proposed changes

The goal of this design is to allow the execution of non-arbitrary commands via RPC requests. Since this can be dangerous in case the cluster certificate (server.pem) is leaked, some precautions need to be taken:

  • No parameters may be passed
  • No absolute or relative path may be passed, only a filename
  • Executable must reside in /etc/ganeti/restricted-commands, which must be owned by root:root and have mode 0755 or stricter
      • Must be regular files or symlinks
      • Must be executable by root:root
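The precautions above can be sketched as a single verification step. This is a minimal illustration, not the actual node daemon code: the function name verify_restricted_command, the exception type, and the uid/gid parameters are hypothetical. It also assumes that "0755 or stricter" means no permission bits outside 0755, and that "executable by root:root" means the file is owned by root:root with the owner execute bit set.

```python
import os
import stat

# Directory named in the design; all other names below are illustrative.
RESTRICTED_COMMANDS_DIR = "/etc/ganeti/restricted-commands"


class RestrictedCommandError(Exception):
    """Hypothetical error type; its message is meant for local logs only."""


def verify_restricted_command(name, base_dir=RESTRICTED_COMMANDS_DIR,
                              uid=0, gid=0):
    """Return the full path of command `name` or raise RestrictedCommandError.

    uid/gid default to root:root; they are parameters only so that the
    sketch can be exercised without root privileges.
    """
    # Only a bare filename is acceptable: no separators, no "." or "..".
    if not name or name in (".", "..") or os.path.basename(name) != name:
        raise RestrictedCommandError("invalid command name")

    path = os.path.join(base_dir, name)
    try:
        dir_st = os.stat(base_dir)
        file_st = os.stat(path)  # follows symlinks to their target
    except (OSError, ValueError) as err:
        raise RestrictedCommandError("cannot inspect command: %s" % err)

    # Directory: owned by uid:gid and no mode bits outside 0755.
    if not stat.S_ISDIR(dir_st.st_mode):
        raise RestrictedCommandError("command directory is not a directory")
    if (dir_st.st_uid, dir_st.st_gid) != (uid, gid):
        raise RestrictedCommandError("command directory has wrong owner")
    if stat.S_IMODE(dir_st.st_mode) & ~0o755:
        raise RestrictedCommandError("command directory mode too permissive")

    # Executable: regular file (possibly via symlink), owned by uid:gid,
    # with the owner execute bit set.
    if not stat.S_ISREG(file_st.st_mode):
        raise RestrictedCommandError("command is not a regular file")
    if (file_st.st_uid, file_st.st_gid) != (uid, gid):
        raise RestrictedCommandError("command has wrong owner")
    if not file_st.st_mode & stat.S_IXUSR:
        raise RestrictedCommandError("command is not executable by owner")

    return path
```

Because no parameters may be passed, the caller would execute the returned path on its own, without a shell and without arguments.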

There shall be no way to list available commands or to retrieve an executable’s contents. The result from a request to execute a specific command will either be its output and exit code, or a generic error message. Only the receiving node’s log files shall contain information as to why executing the command failed.
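The result handling described above might look roughly like the following sketch. The function name run_restricted, the (success, payload) return shape, and the wording of the generic message are assumptions made for illustration; the design only specifies that the caller sees either output and exit code or a generic error, with details confined to the node's log.

```python
import logging
import subprocess

# Hypothetical wording; the design only requires the message to be generic.
GENERIC_ERROR = "Executing command failed"


def run_restricted(path):
    """Run an already verified executable without arguments.

    Returns (True, (exit_code, output)) if the command could be started,
    otherwise (False, GENERIC_ERROR). The reason for a failure is written
    to the local log only and never returned to the caller.
    """
    try:
        proc = subprocess.run(
            [path],                    # no parameters, no shell
            stdin=subprocess.DEVNULL,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the output
            check=False,               # non-zero exit is a valid result
        )
    except OSError as err:
        logging.error("Restricted command %r failed: %s", path, err)
        return (False, GENERIC_ERROR)

    output = proc.stdout.decode("utf-8", errors="replace")
    return (True, (proc.returncode, output))
```

A non-zero exit code is reported as a normal result here, since the design treats only the inability to execute the command as an error.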

To slow down dictionary attacks on command names in case an attacker manages to obtain a copy of server.pem, a system-wide, file-based lock is acquired before verifying the command name and its executable. If a command cannot be executed for some reason, the lock is released only after a delay of several seconds, after which the generic error message is returned to the caller.
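A minimal sketch of this locking scheme, assuming flock(2) on a dedicated lock file: the lock file path is supplied by the caller, and the function name run_locked, the five-second default delay, and the generic message text are all hypothetical. The `action` callable stands for the verification and execution of the command and is expected to raise an exception on any failure.

```python
import fcntl
import logging
import os
import time

# Hypothetical wording; the design only requires the message to be generic.
GENERIC_ERROR = "Executing command failed"


def run_locked(lock_path, action, failure_delay=5.0):
    """Run action() while holding an exclusive, file-based lock.

    On success returns (True, result). On failure the lock is kept for
    failure_delay seconds before (False, GENERIC_ERROR) is returned, so
    that repeated guesses at command names are serialised and slowed down.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks until no other process holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        try:
            return (True, action())
        except Exception as err:
            # The reason is recorded locally only.
            logging.error("Restricted command failed: %s", err)
            # Sleep while still holding the lock.
            time.sleep(failure_delay)
            return (False, GENERIC_ERROR)
    finally:
        # Closing the descriptor releases the flock.
        os.close(fd)
```

Holding the lock during the delay is what limits an attacker: concurrent requests queue behind it, so each failed guess costs at least the full delay.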

At first, restricted commands will not be made available through the remote API, though that could be done at a later point (with a separate password).

On the command line, a new sub-command will be added to the gnt-node script.
