Upgrade notes¶
This document details the steps needed to upgrade a cluster to newer versions of Ganeti.
As a general rule the node daemons need to be restarted after each software upgrade; if using the provided example init.d script, this means running the following command on all nodes:
$ /etc/init.d/ganeti restart
2.11 and above¶
Starting with version 2.10, Ganeti supports installing several versions in parallel and automated upgrades. For 2.11 and higher, the default configuration already installs the new version in parallel without changing the running version. If both versions, the installed one and the one to upgrade to, are 2.10 or higher, the actual switch of the live version can be carried out by running the following command on the master node:
$ gnt-cluster upgrade --to 2.11
This will carry out the steps described below in the section on upgrades from 2.1 and above. Downgrades to the previous minor version can be done in the same way, specifying the smaller version in the --to argument.
Note that gnt-cluster upgrade only manages the actual switch between versions as described below on upgrades from 2.1 and above. It does not install or remove any binaries. Having the new binaries installed is a prerequisite of calling gnt-cluster upgrade (and the command will abort if the prerequisite is not met). The old binaries can be used to downgrade back to the previous version; once the system administrator decides that going back to the old version is not needed any more, they can be removed. Addition and removal of the Ganeti binaries should happen in the same way as for all other binaries on your system.
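The version condition above (both versions must be 2.10 or higher) can be checked before the switch is attempted. The following is a minimal sketch, not part of Ganeti: the helper names version_at_least and can_switch_automatically are hypothetical, and versions are assumed to be passed as MAJOR.MINOR strings with an optional revision. The components are compared numerically, since 2.9 is lower than 2.10 even though a plain string comparison would say otherwise.

```shell
# Hypothetical helpers for illustration only; they are not Ganeti tools.

# version_at_least VERSION MINIMUM
# Succeeds if VERSION >= MINIMUM, looking only at MAJOR.MINOR.
version_at_least() (
  v_major=${1%%.*}; v_rest=${1#*.}; v_minor=${v_rest%%.*}
  m_major=${2%%.*}; m_rest=${2#*.}; m_minor=${m_rest%%.*}
  [ "$v_major" -gt "$m_major" ] ||
    { [ "$v_major" -eq "$m_major" ] && [ "$v_minor" -ge "$m_minor" ]; }
)

# can_switch_automatically INSTALLED TARGET
# Succeeds if both versions are 2.10 or higher.
can_switch_automatically() {
  version_at_least "$1" 2.10 && version_at_least "$2" 2.10
}

# Possible use on the master node (assumes the new binaries are installed):
#   can_switch_automatically 2.10 2.11 && gnt-cluster upgrade --to 2.11
```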
2.13¶
When upgrading to 2.13, first apply the instructions of 2.11 and above. 2.13 comes with the new feature of enhanced SSH security through individual SSH keys. This feature needs to be enabled after the upgrade by running:
$ gnt-cluster renew-crypto --new-ssh-keys --no-ssh-key-check
Note that new SSH keys are generated automatically without warning when upgrading with gnt-cluster upgrade.
If you instructed Ganeti to not touch the SSH setup (by using the --no-ssh-init option of gnt-cluster init), the changes in the handling of SSH keys will not affect your cluster.
If you want to be prompted for each newly created SSH key, leave out the --no-ssh-key-check option in the command listed above.
Note that after a downgrade from 2.13 to 2.12, the individual SSH keys will not get removed automatically. This can lead to reachability errors under very specific circumstances (Issue 1008). If you plan on keeping 2.12 for a while and do not intend to upgrade to 2.13 again soon, we recommend replacing the SSH key pairs of all non-master nodes with the master node’s SSH key pair.
2.12¶
Due to issue #1094 in Ganeti 2.11 and 2.12 up to version 2.12.4, we advise rerunning gnt-cluster renew-crypto --new-node-certificates after an upgrade to 2.12.5 or higher.
2.11¶
When upgrading to 2.11, first apply the instructions of 2.11 and above. 2.11 comes with the new feature of enhanced RPC security through client certificates. This feature needs to be enabled after the upgrade by running:
$ gnt-cluster renew-crypto --new-node-certificates
Note that new node certificates are generated automatically without warning when upgrading with gnt-cluster upgrade.
2.1 and above¶
Starting with Ganeti 2.0, upgrades between revisions (e.g. 2.1.0 to 2.1.1) should not need manual intervention. As a safety measure, minor releases (e.g. 2.1.3 to 2.2.0) require the cfgupgrade command for changing the configuration version. Below you find the steps necessary to upgrade between minor releases.
To run commands on all nodes, the distributed shell (dsh) can be used, e.g. dsh -M -F 8 -f /var/lib/ganeti/ssconf_online_nodes gnt-cluster --version.
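Since several of the steps below have to run on all nodes, the dsh invocation can be wrapped in a small helper. This is only a sketch: the function name on_all_nodes and the NODE_FILE and DRY_RUN variables are hypothetical, and the dsh options are simply the ones from the example above. In dry-run mode, which is the default here, the command is printed instead of executed.

```shell
# Hypothetical wrapper around the dsh example; not shipped with Ganeti.
# NODE_FILE and DRY_RUN are illustrative variables, not Ganeti settings.
NODE_FILE=${NODE_FILE:-/var/lib/ganeti/ssconf_online_nodes}

on_all_nodes() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run: show what would be executed on every node.
    echo "dsh -M -F 8 -f $NODE_FILE $*"
  else
    dsh -M -F 8 -f "$NODE_FILE" "$@"
  fi
}

# Possible use:
#   on_all_nodes /etc/init.d/ganeti stop              # prints the command
#   DRY_RUN=0; on_all_nodes /etc/init.d/ganeti stop   # actually runs it
```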
Ensure no jobs are running (master node only):
$ gnt-job list
Pause the watcher for an hour (master node only):
$ gnt-cluster watcher pause 1h
Stop all daemons on all nodes:
$ /etc/init.d/ganeti stop
Back up the old configuration (master node only):
$ tar czf /var/lib/ganeti-$(date +%FT%T).tar.gz -C /var/lib ganeti
(/var/lib/ganeti can also contain exported instances, so make sure to back up only the files you are interested in, for example by using --exclude export.)
Install new Ganeti version on all nodes
Run cfgupgrade on the master node:
$ /usr/lib/ganeti/tools/cfgupgrade --verbose --dry-run
$ /usr/lib/ganeti/tools/cfgupgrade --verbose
(cfgupgrade supports a number of parameters; run it with --help for more information.)
Upgrade the directory permissions on all nodes:
$ /usr/lib/ganeti/ensure-dirs --full-run
Note that, for security reasons, ensure-dirs does not create the directories for file and shared-file storage. They need to be created manually. For details see man gnt-cluster.
Create the (missing) required users and make users part of the required groups on all nodes:
$ /usr/lib/ganeti/tools/users-setup
This will ask for confirmation. To execute directly, add the --yes-do-it option.
Restart daemons on all nodes:
$ /etc/init.d/ganeti restart
Re-distribute configuration (master node only):
$ gnt-cluster redist-conf
If you use file storage, check that /etc/ganeti/file-storage-paths is correct on all nodes. For security reasons it’s not copied automatically, but it can be copied manually via:
$ gnt-cluster copyfile /etc/ganeti/file-storage-paths
Restart daemons again on all nodes:
$ /etc/init.d/ganeti restart
Enable the watcher again (master node only):
$ gnt-cluster watcher continue
Verify cluster (master node only):
$ gnt-cluster verify
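For the backup step above, the advice to leave out exported instances can be folded into the tar call. The following is a sketch under stated assumptions: the function name backup_ganeti_config and its two arguments are hypothetical, and the exports are assumed to live in the export subdirectory of the Ganeti data directory.

```shell
# backup_ganeti_config LIBDIR ARCHIVE
# Archives LIBDIR/ganeti into ARCHIVE but skips LIBDIR/ganeti/export.
# Hypothetical helper for illustration; not part of Ganeti.
backup_ganeti_config() {
  # --exclude is given before the member name so that it applies to it.
  tar czf "$2" --exclude='ganeti/export' -C "$1" ganeti
}

# Possible use on the master node:
#   backup_ganeti_config /var/lib "/var/lib/ganeti-$(date +%FT%T).tar.gz"
```

Listing the archive with tar tzf afterwards is a quick way to confirm that the configuration files are present and the exports are not.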
Reverting an upgrade¶
As with upgrades, going back between revisions (e.g. 2.1.1 to 2.1.0) requires no manual intervention.
Starting from version 2.8, cfgupgrade supports the --downgrade option to bring the configuration back to the previous stable version. This is useful if you upgrade Ganeti and after some time run into problems with the new version. You can downgrade the configuration without losing the changes made since the upgrade. Any feature not supported by the old version will be removed from the configuration, but you get a warning about it. If there is a new feature whose default value you haven’t changed, you don’t have to worry about it, as it will get the same value when you upgrade again.
Automatic downgrades¶
From version 2.11 onwards, downgrades can be done by using the gnt-cluster upgrade command:
$ gnt-cluster upgrade --to 2.10
Manual downgrades¶
The procedure is similar to upgrading, but note that you have to revert the configuration before installing the old version.
Ensure no jobs are running (master node only):
$ gnt-job list
Pause the watcher for an hour (master node only):
$ gnt-cluster watcher pause 1h
Stop all daemons on all nodes:
$ /etc/init.d/ganeti stop
Back up the old configuration (master node only):
$ tar czf /var/lib/ganeti-$(date +%FT%T).tar.gz -C /var/lib ganeti
Run cfgupgrade on the master node:
$ /usr/lib/ganeti/tools/cfgupgrade --verbose --downgrade --dry-run
$ /usr/lib/ganeti/tools/cfgupgrade --verbose --downgrade
You may want to copy all the messages about features that have been removed during the downgrade, in case you want to restore them when upgrading again.
Install the old Ganeti version on all nodes
NB: in Ganeti 2.8, the cmdlib.py file was split into a series of files contained in the cmdlib directory. If Ganeti is installed from sources and not from a package, when downgrading to a pre-2.8 version it is important to remove the cmdlib directory from the directory containing the Ganeti Python files (usually ${PREFIX}/lib/python${VERSION}/dist-packages/ganeti). A simpler upgrade/downgrade procedure will be made available in future versions of Ganeti.
Restart daemons on all nodes:
$ /etc/init.d/ganeti restart
Re-distribute configuration (master node only):
$ gnt-cluster redist-conf
Restart daemons again on all nodes:
$ /etc/init.d/ganeti restart
Enable the watcher again (master node only):
$ gnt-cluster watcher continue
Verify cluster (master node only):
$ gnt-cluster verify
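The first step of both the upgrade and the downgrade procedure, ensuring that no jobs are running, can be turned into a scripted check. The sketch below makes two assumptions that should be verified against the installed version: that gnt-job list --no-headers -o status prints one status per line, and that success, error and canceled are the only final job states. The function name has_unfinished_jobs is hypothetical.

```shell
# has_unfinished_jobs
# Reads one job status per line on standard input and succeeds if at
# least one job is not in a final state. Treating "success", "error"
# and "canceled" as the final states is an assumption of this sketch.
has_unfinished_jobs() {
  # Drop padding and empty lines, then look for any non-final status.
  tr -d ' \t' | grep -v '^$' | grep -qvE '^(success|error|canceled)$'
}

# Possible use on the master node (output format is an assumption):
#   if gnt-job list --no-headers -o status | has_unfinished_jobs; then
#     echo "jobs still pending, do not stop the daemons yet" >&2
#   fi
```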
Specific tasks for 2.11 to 2.10 downgrade¶
After running cfgupgrade, the client.pem and ssconf_master_candidates_certs files need to be removed from Ganeti’s data directory on all nodes. While this step is not necessary for 2.10 to run cleanly, leaving them will cause problems when upgrading again after the downgrade.
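The cleanup described above can be sketched as a small function to be run on every node. The function name remove_2_11_cert_files is hypothetical, and the default data directory of /var/lib/ganeti is an assumption; pass a different path if Ganeti was installed with another prefix.

```shell
# remove_2_11_cert_files [DATADIR]
# Deletes client.pem and ssconf_master_candidates_certs from DATADIR on
# the local node. The /var/lib/ganeti default is an assumption.
# Hypothetical helper for illustration; not part of Ganeti.
remove_2_11_cert_files() (
  datadir=${1:-/var/lib/ganeti}
  # -f keeps the call quiet on nodes where a file is already gone.
  rm -f "$datadir/client.pem" "$datadir/ssconf_master_candidates_certs"
)
```

Other files in the data directory are left untouched, so the function can safely be run more than once.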
2.0 releases¶
2.0.3 to 2.0.4¶
No changes needed except restarting the daemon; but rollback to 2.0.3 might require configuration editing.
If you’re using Xen-HVM instances, please double-check the network configuration (nic_type parameter) as the defaults might have changed: 2.0.4 adds any missing configuration items and, depending on the version of the software the cluster has been installed with, some new keys might have been added.
2.0.1 to 2.0.2/2.0.3¶
Between 2.0.1 and 2.0.2 there have been some changes in the handling of block devices, which can cause some issues. 2.0.3 was then released which adds two new options/commands to fix this issue.
If you use DRBD-type instances and see problems in instance start or activate-disks with messages from DRBD about “lower device too small” or similar, it is recommended to:
- Run gnt-instance activate-disks --ignore-size $instance for each of the affected instances
- Then run gnt-cluster repair-disk-sizes which will check that instances have the correct disk sizes
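The two steps above can be combined into one loop over the affected instances. This is a minimal sketch: the function name fix_drbd_sizes and the DRY_RUN variable are hypothetical, and the list of affected instances has to be supplied by the administrator. With DRY_RUN=1, the default here, the commands are only printed.

```shell
# fix_drbd_sizes INSTANCE...
# Prints (DRY_RUN=1, the default) or runs the two repair steps.
# Hypothetical helper for illustration; not part of Ganeti.
fix_drbd_sizes() (
  run=
  [ "${DRY_RUN:-1}" = "1" ] && run=echo
  for instance in "$@"; do
    # Step 1: activate the disks of each affected instance.
    $run gnt-instance activate-disks --ignore-size "$instance"
  done
  # Step 2: let Ganeti check and correct the recorded disk sizes.
  $run gnt-cluster repair-disk-sizes
)

# Possible use (instance names are placeholders):
#   fix_drbd_sizes instance1.example.com instance2.example.com
```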
1.2 to 2.0¶
Prerequisites:
- Ganeti 1.2.7 is currently installed
- All instances have been migrated from DRBD 0.7 to DRBD 8.x (i.e. no remote_raid1 disk template)
- Upgrade to Ganeti 2.0.0~rc2 or later (~rc1 and earlier don’t have the needed upgrade tool)
In the steps below, replace /var/lib with $libdir if Ganeti was not installed with this prefix (e.g. /usr/local/var). The same applies to /usr/lib.
Execution (all steps are required in the order given):
Make a backup of the current configuration, for safety:
$ cp -a /var/lib/ganeti /var/lib/ganeti-1.2.backup
Stop all instances:
$ gnt-instance stop --all
Make sure no DRBD devices are in use; the following command should show no active minors:
$ gnt-cluster command grep cs: /proc/drbd | grep -v cs:Unconf
Stop the node daemons and rapi daemon on all nodes (note: you should be logged in via the master node name, not the cluster name, as the command below will remove the cluster IP from the master node):
$ gnt-cluster command /etc/init.d/ganeti stop
Install the new software on all nodes, either from packaging (if available) or from sources; the master daemon will not start and will give error messages about a wrong configuration file, which is normal
Upgrade the configuration file:
$ /usr/lib/ganeti/tools/cfgupgrade12 -v --dry-run
$ /usr/lib/ganeti/tools/cfgupgrade12 -v
Make sure ganeti-noded is running on all nodes (and start it if not)
Start the master daemon:
$ ganeti-masterd
Check that a simple node-list works:
$ gnt-node list
Redistribute updated configuration to all nodes:
$ gnt-cluster redist-conf
$ gnt-cluster copyfile /var/lib/ganeti/known_hosts
Optional: if needed, install RAPI-specific certificates under /var/lib/ganeti/rapi.pem and run:
$ gnt-cluster copyfile /var/lib/ganeti/rapi.pem
Run a cluster verify; this should show no problems:
$ gnt-cluster verify
Remove some obsolete files:
$ gnt-cluster command rm /var/lib/ganeti/ssconf_node_pass
$ gnt-cluster command rm /var/lib/ganeti/ssconf_hypervisor
Update the xen pvm (if this was a pvm cluster) setting for 1.2 compatibility:
$ gnt-cluster modify -H xen-pvm:root_path=/dev/sda
Depending on your setup, you might also want to reset the initrd parameter:
$ gnt-cluster modify -H xen-pvm:initrd_path=/boot/initrd-2.6-xenU
Reset the instance autobalance setting to default:
$ for i in $(gnt-instance list -o name --no-headers); do \
    gnt-instance modify -B auto_balance=default $i; \
  done
Optional: start the RAPI daemon:
$ ganeti-rapi
Restart instances:
$ gnt-instance start --force-multiple --all
At this point, gnt-cluster verify should show no errors and the migration is complete.
1.2 releases¶
1.2.4 to any other higher 1.2 version¶
No changes needed. Rollback will usually require manual edit of the configuration file.
1.2.3 to 1.2.4¶
No changes needed. Note that going back from 1.2.4 to 1.2.3 will require manual edit of the configuration file (since we added some HVM-related new attributes).
1.2.2 to 1.2.3¶
No changes needed. Note that the drbd7-to-8 upgrade tool does a disk format change for the DRBD metadata, so in theory this might be risky. It is advised to have (good) backups before doing the upgrade.
1.2.1 to 1.2.2¶
No changes needed.
1.2.0 to 1.2.1¶
No changes needed. Only some bugfixes and new additions that don’t affect existing clusters.
1.2.0 beta 3 to 1.2.0¶
No changes needed.
1.2.0 beta 2 to beta 3¶
No changes needed. A new version of the debian-etch-instance OS (0.3) has been released, but upgrading it is not required.
1.2.0 beta 1 to beta 2¶
Beta 2 switched the config file format to JSON. Steps to upgrade:
Stop the daemons (/etc/init.d/ganeti stop) on all nodes
Disable the cron job (default is /etc/cron.d/ganeti)
Install the new version
Make a backup copy of the config file
Upgrade the config file using the following command:
$ /usr/share/ganeti/cfgupgrade --verbose /var/lib/ganeti/config.data
Start the daemons and run gnt-cluster info, gnt-node list and gnt-instance list to check if the upgrade process finished successfully
The OS definition also needs to be upgraded. There is a new version of the debian-etch-instance OS (0.2) that goes along with beta 2.