Documents Ganeti version 2.3
Ganeti supports a remote API that enables external tools to easily retrieve information about a cluster’s state. The remote API daemon, ganeti-rapi, is automatically started on the master node. By default it runs on TCP port 5080, but this can be changed either in .../constants.py or via the command line parameter -p. SSL mode, which is used by default, can also be disabled by passing command line parameters.
ganeti-rapi reads users and passwords from a file (usually /var/lib/ganeti/rapi_users) on startup. Changes to the file will be read automatically.
Each line consists of two or three fields separated by whitespace. The first two fields are for username and password. The third field is optional and can be used to specify per-user options. Currently, write is the only option supported and enables the user to execute operations modifying the cluster. Lines starting with the hash sign (#) are treated as comments.
Passwords can either be written in clear text or as a hash. Clear text passwords may not start with an opening brace ({) unless they are prefixed with {cleartext}. To use the hashed form, get the MD5 hash of the string $username:Ganeti Remote API:$password (e.g. echo -n 'jack:Ganeti Remote API:abc123' | openssl md5) [1] and prefix it with {ha1}. Using the scheme prefix for all passwords is recommended. Scheme prefixes are not case sensitive.
Example:
# Give Jack and Fred read-only access
jack abc123
fred {cleartext}foo555
# Give write access to an imaginary instance creation script
autocreator xyz789 write
# Hashed password for Jessica
jessica {HA1}7046452df2cbb530877058712cf17bd4 write
[1] Using the MD5 hash of username, realm and password is described in RFC 2617 (“HTTP Authentication”), sections 3.2.2.2 and 3.3. The reason for using it over another algorithm is forward compatibility. If ganeti-rapi were to implement HTTP Digest authentication in the future, the same hash could be used. In the current version ganeti-rapi’s realm, Ganeti Remote API, can only be changed by modifying the source code.
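For example, a hashed entry like the one for jessica above can be generated with a few lines of Python (a minimal sketch; the password shown is made up and is not the one behind the hash above):

import hashlib

username = "jessica"
password = "abc123"            # made-up example password
realm = "Ganeti Remote API"    # fixed realm used by ganeti-rapi

# HA1 as described in RFC 2617: MD5("username:realm:password")
ha1 = hashlib.md5(("%s:%s:%s" % (username, realm, password)).encode()).hexdigest()
print("%s {ha1}%s write" % (username, ha1))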
The protocol used is JSON over HTTP designed after the REST principle. HTTP Basic authentication as per RFC 2617 is supported.
JSON as used by Ganeti RAPI does not conform to the specification in RFC 4627. Section 2 defines a JSON text to be either an object ({"key": "value", …}) or an array ([1, 2, 3, …]). In violation of this RAPI uses plain strings ("master-candidate", "1234") for some requests or responses. Changing this now would likely break existing clients and cause a lot of trouble.
Unlike Python’s JSON encoder and decoder, other programming languages or libraries may only provide a strict implementation, not allowing plain values. For those, responses can usually be wrapped in an array whose first element is then used, e.g. the response "1234" becomes ["1234"]. This works equally well for more complex values. Example in Ruby:
require "json"
# Insert code to get response here
response = "\"1234\""
decoded = JSON.parse("[#{response}]").first
Short of modifying the encoder to allow encoding to a less strict format, requests will have to be formatted by hand. Newer RAPI requests already use a dictionary as their input data and shouldn’t cause any problems.
According to RFC 2616 the main difference between PUT and POST is that POST can create new resources but PUT can only create the resource the URI was pointing to on the PUT request.
Unfortunately, due to historic reasons, the Ganeti RAPI library is not consistent with this usage, so just use the methods as documented below for each resource.
For more details have a look in the source code at lib/rapi/rlib2.py.
A few generic parameter types that are referred to below, and the values they allow.
A few parameters mean the same thing across all resources which implement them.
Bulk-mode means that for resources which usually return just a list of child resources (e.g. /2/instances, which returns just instance names), the output will instead contain detailed data for all these subresources. This is more efficient than querying the sub-resources themselves.
The boolean dry-run argument, if provided and set, signals to Ganeti that the job should not be executed; only the pre-execution checks will be done.
This is useful in trying to determine (without guarantees though, as in the meantime the cluster state could have changed) if the operation is likely to succeed or at least start executing.
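As an illustration, both parameters can be passed as ordinary query arguments. The following Python sketch uses the third-party requests library; the cluster name, credentials and instance name are the made-up values from the examples in this document:

import requests

BASE = "https://CLUSTERNAME:5080"
AUTH = ("jack", "abc123")   # read-only user from the example rapi_users file

# Bulk mode: detailed data for all instances instead of just their names
instances = requests.get(BASE + "/2/instances", params={"bulk": 1},
                         auth=AUTH, verify=False).json()

# Dry run: only the pre-execution checks of the job are performed
job_id = requests.delete(BASE + "/2/instances/web.example.com",
                         params={"dry-run": 1},
                         auth=("autocreator", "xyz789"), verify=False).json()

verify=False is used here only because the example cluster presumably runs with a self-signed certificate.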
You can access the API using your favorite programming language as long as it supports network connections.
Ganeti includes a standalone RAPI client, lib/rapi/client.py.
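A short sketch of what using the bundled client might look like; the class and method names below (GanetiRapiClient, GetInfo, GetInstances) are taken from lib/rapi/client.py but should be treated as assumptions, since they may differ between Ganeti versions:

from ganeti.rapi import client

# Connect to the master node's RAPI port with the example credentials
cl = client.GanetiRapiClient("CLUSTERNAME", username="jack", password="abc123")

print(cl.GetInfo())                 # corresponds to GET /2/info
print(cl.GetInstances(bulk=True))   # corresponds to GET /2/instances?bulk=1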
Using wget:
wget -q -O - https://CLUSTERNAME:5080/2/info
or curl:
curl https://CLUSTERNAME:5080/2/info
Warning
While it’s possible to use JavaScript, it poses several potential problems, including browsers blocking requests due to non-standard ports or different domain names. Fetching the data on the webserver is easier.
var url = 'https://CLUSTERNAME:5080/2/info';
var info;
var xmlreq = new XMLHttpRequest();
xmlreq.onreadystatechange = function () {
  if (xmlreq.readyState != 4) return;
  if (xmlreq.status == 200) {
    info = eval("(" + xmlreq.responseText + ")");
    alert(info);
  } else {
    alert('Error fetching cluster info');
  }
  xmlreq = null;
};
xmlreq.open('GET', url, true);
xmlreq.send(null);
Cluster information resource.
It supports the following commands: GET.
Returns cluster information.
Example:
{
  "config_version": 2000000,
  "name": "cluster",
  "software_version": "2.0.0~beta2",
  "os_api_version": 10,
  "export_version": 0,
  "candidate_pool_size": 10,
  "enabled_hypervisors": [
    "fake"
  ],
  "hvparams": {
    "fake": {}
  },
  "default_hypervisor": "fake",
  "master": "node1.example.com",
  "architecture": [
    "64bit",
    "x86_64"
  ],
  "protocol_version": 20,
  "beparams": {
    "default": {
      "auto_balance": true,
      "vcpus": 1,
      "memory": 128
    }
  }
}
Redistribute configuration to all nodes.
It supports the following commands: PUT.
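For example, assuming the standard /2/redistribute-config path and the illustrative setup from the earlier sketches:

import requests

# Submits a job that pushes the current configuration to all nodes
job_id = requests.put("https://CLUSTERNAME:5080/2/redistribute-config",
                      auth=("autocreator", "xyz789"), verify=False).json()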
The instances resource.
It supports the following commands: GET, POST.
Returns a list of all available instances.
Example:
[
  {
    "name": "web.example.com",
    "uri": "\/instances\/web.example.com"
  },
  {
    "name": "mail.example.com",
    "uri": "\/instances\/mail.example.com"
  }
]
If the optional bool bulk argument is provided and set to a true value (i.e. ?bulk=1), the output contains detailed information about instances as a list.
Example:
[
  {
    "status": "running",
    "disk_usage": 20480,
    "nic.bridges": [
      "xen-br0"
    ],
    "name": "web.example.com",
    "tags": ["tag1", "tag2"],
    "beparams": {
      "vcpus": 2,
      "memory": 512
    },
    "disk.sizes": [
      20480
    ],
    "pnode": "node1.example.com",
    "nic.macs": ["01:23:45:67:89:01"],
    "snodes": ["node2.example.com"],
    "disk_template": "drbd",
    "admin_state": true,
    "os": "debian-etch",
    "oper_state": true
  },
  ...
]
Creates an instance.
If the optional bool dry-run argument is provided, the job will not be actually executed, only the pre-execution checks will be done. Querying the job result will return, in both the dry-run and normal case, the list of nodes selected for the instance.
Returns: a job ID that can be used later for polling.
Body parameters:
Instance-specific resource.
It supports the following commands: GET, DELETE.
It supports the following commands: GET.
Reboots URI for an instance.
It supports the following commands: POST.
Reboots the instance.
The URI takes optional type=soft|hard|full and ignore_secondaries=0|1 parameters.
type defines the reboot type. soft is just a normal reboot, without terminating the hypervisor. hard means full shutdown (including terminating the hypervisor process) and startup again. full is like hard but also recreates the configuration from the ground up, as if you had done a gnt-instance shutdown and gnt-instance start on it.
ignore_secondaries is a bool argument indicating whether to start the instance even if secondary disks are failing.
It supports the dry-run argument.
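As an illustration, a hard reboot could be requested like this (Python with the requests library; the /2/instances/[instance_name]/reboot path and the credentials follow the earlier examples and may need adjusting):

import requests

resp = requests.post(
    "https://CLUSTERNAME:5080/2/instances/web.example.com/reboot",
    params={"type": "hard", "ignore_secondaries": 0},
    auth=("autocreator", "xyz789"), verify=False)
job_id = resp.json()   # returned as a plain JSON string, e.g. "1234"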
Instance shutdown URI.
It supports the following commands: PUT.
Instance startup URI.
It supports the following commands: PUT.
Installs the operating system again.
It supports the following commands: POST.
Replaces disks on an instance.
It supports the following commands: POST.
Takes the parameters mode (one of replace_on_primary, replace_on_secondary, replace_new_secondary or replace_auto), disks (comma separated list of disk indexes), remote_node and iallocator.
Either remote_node or iallocator needs to be defined when using mode=replace_new_secondary.
mode is a mandatory parameter. replace_auto tries to determine the broken disk(s) on its own and replace them.
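For example, moving the secondary of disk 0 to a new node (the made-up node3.example.com) might look like the following sketch, with the parameters passed as query arguments and the same assumptions as the previous examples:

import requests

job_id = requests.post(
    "https://CLUSTERNAME:5080/2/instances/web.example.com/replace-disks",
    params={"mode": "replace_new_secondary", "disks": "0",
            "remote_node": "node3.example.com"},
    auth=("autocreator", "xyz789"), verify=False).json()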
Activate disks on an instance.
It supports the following commands: PUT.
Deactivate disks on an instance.
It supports the following commands: PUT.
Prepares an export of an instance.
It supports the following commands: PUT.
Exports an instance.
It supports the following commands: PUT.
Returns a job ID.
Body parameters:
Modifies an instance.
Supports the following commands: PUT.
Returns a job ID.
Body parameters:
Manages per-instance tags.
It supports the following commands: GET, PUT, DELETE.
Individual job URI.
It supports the following commands: GET, DELETE.
Returns a job status.
Returns: a dictionary with job parameters.
The result includes:
For a successful opcode, the opresult field corresponding to it will contain the raw result from its LogicalUnit. In case an opcode has failed, its element in the opresult list will be a list of two elements:
The error classification is most useful for the OpPrereqError error type - these errors happen before the OpCode has started executing, so it’s possible to retry the OpCode without side effects. But whether it makes sense to retry depends on the error classification:
Note that in the above list, by entity we refer to a node or instance, while by a resource we refer to an instance’s disk, or NIC, etc.
Waits for changes on a job. Takes the following body parameters in a dict:
Returns None if no changes have been detected and a dict with two keys, job_info and log_entries otherwise.
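Instead of using the wait resource, a client can also simply poll the individual job resource until the job reaches a final state. A minimal sketch, assuming the standard /2/jobs/[job_id] path and that the status field eventually becomes either success or error:

import time
import requests

BASE = "https://CLUSTERNAME:5080"
AUTH = ("jack", "abc123")

def wait_for_job(job_id):
    # Polls GET /2/jobs/[job_id] until the job has finished one way or the other
    while True:
        job = requests.get("%s/2/jobs/%s" % (BASE, job_id),
                           auth=AUTH, verify=False).json()
        if job["status"] in ("success", "error"):
            return job
        time.sleep(2)

job = wait_for_job("1234")
print(job["status"], job["opresult"])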
Nodes resource.
It supports the following commands: GET.
Returns a list of all nodes.
Example:
[
  {
    "id": "node1.example.com",
    "uri": "\/nodes\/node1.example.com"
  },
  {
    "id": "node2.example.com",
    "uri": "\/nodes\/node2.example.com"
  }
]
If the optional bulk argument is provided and set to a true value (i.e. ?bulk=1), the output contains detailed information about nodes as a list.
Example:
[
  {
    "pinst_cnt": 1,
    "mfree": 31280,
    "mtotal": 32763,
    "name": "www.example.com",
    "tags": [],
    "mnode": 512,
    "dtotal": 5246208,
    "sinst_cnt": 2,
    "dfree": 5171712,
    "offline": false
  },
  ...
]
Evacuates all secondary instances off a node.
It supports the following commands: POST.
To evacuate a node, either one of the iallocator or remote_node parameters must be passed:
evacuate?iallocator=[iallocator]
evacuate?remote_node=[nodeX.example.com]
The result value will be a list, each element being a triple of the job id (for this specific evacuation), the instance which is being evacuated by this job, and the node to which it is being relocated. In case the node is already empty, the result will be an empty list (without any jobs being submitted).
An additional parameter early_release signifies whether to try to parallelize the evacuations, at the risk of increasing I/O contention and increasing the chances of data loss, if the primary node of any of the instances being evacuated is not fully healthy.
If the dry-run parameter was specified, the evacuation jobs are not actually submitted, and the job IDs will be null.
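An illustrative evacuation request using an instance allocator (a hedged sketch; hail stands for whatever allocator script is installed on the cluster, and the node and credentials are made up):

import requests

resp = requests.post(
    "https://CLUSTERNAME:5080/2/nodes/node1.example.com/evacuate",
    params={"iallocator": "hail", "early_release": 0},
    auth=("autocreator", "xyz789"), verify=False)
# A list of (job id, evacuated instance, target node) triples, or [] for an empty node
print(resp.json())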
Migrates all primary instances from a node.
It supports the following commands: POST.
If no mode is explicitly specified, each instance’s hypervisor default migration mode will be used. Query parameters:
Manages node role.
It supports the following commands: GET, PUT.
The role is always one of the following:
- drained
- master
- master-candidate
- offline
- regular
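Setting the role is a good example of the plain-string bodies mentioned earlier: the PUT body is just a JSON-encoded string. A hedged sketch, reusing the made-up cluster and credentials from the previous examples:

import json
import requests

url = "https://CLUSTERNAME:5080/2/nodes/node2.example.com/role"
auth = ("autocreator", "xyz789")

print(requests.get(url, auth=auth, verify=False).json())   # e.g. "regular"

# The new role is sent as a bare JSON string, one of the values listed above
job_id = requests.put(url, data=json.dumps("master-candidate"),
                      auth=auth, verify=False).json()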
Manages storage units on the node.
Modifies storage units on the node.
Repairs a storage unit on the node.
Manages per-node tags.
It supports the following commands: GET, PUT, DELETE.
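Tags can be passed as repeated tag query arguments (an assumption based on the behaviour of the bundled client); adding and reading them could look like this sketch, following the same conventions as the examples above:

import requests

url = "https://CLUSTERNAME:5080/2/nodes/node1.example.com/tags"
auth = ("autocreator", "xyz789")

print(requests.get(url, auth=auth, verify=False).json())   # current list of tags

# Add two tags; deleting works the same way with requests.delete()
job_id = requests.put(url, params={"tag": ["web", "dmz"]},
                      auth=auth, verify=False).json()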