Migrating with migrator.sh (deprecated)
Important: The migrator.sh script is deprecated and has been replaced by sdc-migrate for platform versions 2020-03-26 and above. Please refer to Migrating instances between compute nodes for instructions on using sdc-migrate.
The following details the migration of instances, both zones (SmartOS and Container Native Linux) and HVMs (Linux/Windows), between two compute nodes (CNs). Differences between zone, bhyve, and KVM instances are noted in the details below.
Notes:
- Both HVMs and ZONES include 2 ZFS datasets:
  - zones/:UUID, mounted under /zones/:UUID
  - zones/cores/:UUID, mounted under /zones/:UUID/cores
- Bhyve VMs include 2 additional ZVOL child datasets:
  - zones/:UUID/disk0
  - zones/:UUID/disk1
- KVMs include 2 additional ZVOL sibling datasets:
  - zones/:UUID-disk0
  - zones/:UUID-disk1
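As a quick way to see this layout for a given instance, you can list its datasets on the hosting compute node. This is a minimal sketch; substitute your own instance UUID:
source-server# INSTANCE_UUID="<instance-uuid>"
source-server# zfs list -o name,used,avail,refer,mountpoint | grep ${INSTANCE_UUID}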
Warnings
- The migration of instances between compute nodes is not a supported process. It is recommended to reprovision when an instance needs to be moved to a new compute node. You assume all responsibility and risk when using this process; MNX is not responsible for any loss or damage caused by using this procedure.
- Current versions of the migration tooling do not support migrating instances that have delegated datasets. A future update to the script will address this limitation.
Because of this risk, customers with current Triton support contracts must contact MNX support for access to the migrator.sh script. To do this, please send an email to help@mnxsolutions.com.
Overview
The migrator.sh script is designed to be used from the head node to handle the full migration of an instance between two compute nodes.
Of note, migrator.sh will determine the current run state of the instance prior to migration. If the instance is running prior to migration, it will be started on the destination compute node at the end of migration. If it is in any state other than running, the instance will not be restarted.
The script will create its own SSH keys and push them to the source and destination compute nodes; these keys are used to enable migrator.sh to move the ZFS data between the compute nodes without requiring passwords. These keys are removed at the completion of the migration.
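Conceptually, that temporary key exchange looks like the following sketch (the key file name and destination IP are illustrative; migrator.sh performs these steps for you and removes the key afterwards):
source-server# ssh-keygen -t rsa -N "" -f /root/.ssh/migr-key-temp
source-server# cat /root/.ssh/migr-key-temp.pub | ssh root@10.1.1.34 "cat >> /root/.ssh/authorized_keys"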
At this time, migrator.sh will handle three kinds of migrations: full, incremental, and automatic incremental.
- A full migration is where migrator.sh will immediately shut down the instance (if running), fully migrate all of the instance's datasets over to the destination compute node, and, if the instance was previously running, restart it on the destination compute node.
- An incremental migration allows the source instance to continue running while the datasets are incrementally transferred, and requires the source instance to be brought down only for roughly 2-15 minutes (depending on the instance's activity and amount of changed data) at the end of the process.
- An automatic incremental migration is a variation of the incremental migration in which the instance stays up until the first incremental snapshot has been transferred, is then shut down, and the final snapshot is sent over.
There are a number of options available for the migrator.sh script:
[root@headnode (cak-1) ~]# /opt/custom/bin/migrator.sh -h
migrator.sh migrates an instance from one CN to another within an
AZ in SDC. A migration can be all at once or incremental. All at
once means the instance will be immediately shut down, datasets
transferred, config files copied, APIs updated, and restarted once
transfer is complete to the new CN. Incremental means the instance
will stay online while incremental dataset snapshots are
transferred to the new CN, decreasing in size until the instance's
migration state file is removed. Once the state file is removed,
the instance is shut down, a final snapshot taken and transferred
to the new CN, config files copied, APIs updates, and the instance
is restarted on the new CN.
Usage: migrator.sh [ -h ]
migrator.sh [ -v ]
migrator.sh [ -i | -a ] [ -D ] INST_UUID DEST_CN_NAME
-h This help output
-D Allow migration to an older dest PI
-v Display migrator.sh version and exit
-i Perform an incremental migration (optional); A log
file and state file will be created at:
/var/tmp/migrator-INST_UUID-DEST_CN_NAME-ST_EPOCH.EXT
where EXT is either 'log' or 'st'; incremental
dataset transfers will continue to occur while
the state file exists; once the state file is
removed, the migration be will finalized of all
other work to complete the migration
-a Automatic incremental mode (optional); A hybrid mode in which
one incremenatal dataset transfer will occurr after which
migrator will remove the statefile for you and finish the
migration normally. a implies i
INST_UUID Instance UUID to be migrated
DEST_CN_NAME CN Hostname to migrate the instance to
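For reference, typical invocations look like the following (using the instance UUID and compute node name from the examples later in this document):
headnode# ./migrator.sh 5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa dest-server        # full migration
headnode# ./migrator.sh -i 5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa dest-server     # incremental migration
headnode# ./migrator.sh -a 5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa dest-server     # automatic incremental migration
Run without arguments, the script prompts for the instance UUID and destination compute node instead.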
The migrator.sh script now supports migrating instances on Fabric (VXLAN) Networks and Docker instances.
Full migration
This is the default migration option, and requires that the source instance be down during the entire process.
At a high level, the migrator.sh script performs the following for a full migration:
- Validates the instance UUID.
- Validates access to the source and destination compute nodes.
- Generates and pushes SSH keys for use between the compute nodes.
- Reserves the instance's IP address(es)
- Verifies the instance on the source compute node.
- Shuts the instance down and validates the attributes.
- Snapshots the datasets associated with the instance.
- Transfers the datasets and configuration from the source to the destination.
- Reviews the transfer and cleans up the snapshots.
- Creates the cores dataset on the destination compute node.
- Sets up the /etc/zones/index file on the destination compute node.
- Forces the API to acknowledge the instance on the destination compute node.
- Verifies the instance in the API.
- Boots the instance and validates state (only if the instance was in a running state).
- Unreserves the instance's IP address(es).
- Provides instructions on cleaning data from the source compute node.
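The dataset transfer at the core of these steps is a standard ZFS snapshot send/receive over SSH. The following is a minimal sketch of that mechanism, not the script's exact commands; the snapshot name and destination IP are placeholders:
source-server# INSTANCE_UUID="<instance-uuid>"
source-server# zfs snapshot zones/${INSTANCE_UUID}@migration
source-server# zfs send zones/${INSTANCE_UUID}@migration | ssh root@10.1.1.34 "zfs receive zones/${INSTANCE_UUID}"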
Preparation
You will need to have root level access to the source compute node (CN), the destination compute node, and the head node (HN). You will also need to know the UUID of the instance to be migrated, and you should ensure that the destination compute node has the requisite traits, network tags, disk space, memory, and CPU requirements necessary to host the migrated instance.
Note: It is highly recommended that you push an SSH key from the head node to the compute nodes to allow passwordless SSH between the head node and the compute node. This will eliminate the need to type passwords during this process. The keys that are created by migrator.sh are used only between the source and destination compute nodes, not the head node and the compute nodes.
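One way to set that up is to append the head node's public key to each compute node's authorized_keys, for example (a sketch; the key path may differ in your environment):
headnode# ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa     # only if the head node has no key yet
headnode# cat /root/.ssh/id_rsa.pub | ssh root@10.1.1.33 "cat >> /root/.ssh/authorized_keys"
headnode# cat /root/.ssh/id_rsa.pub | ssh root@10.1.1.34 "cat >> /root/.ssh/authorized_keys"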
Identify instance
Identify the instance's current server_uuid, alias, create_timestamp, and state. Also determine the hostname of the server it is running on; this can be done using the sdc-vmapi and sdc-cnapi tools. For this example, we'll be migrating SmartOS instance "kirby" with the UUID of "5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa" from the server source-server.
headnode# sdc-vmapi /vms/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa| json -Ha server_uuid alias create_timestamp state
00000000-0000-0000-0000-d43d7ef73056 kirby 2013-12-12T20:37:34.494Z running
headnode# sdc-cnapi /servers/00000000-0000-0000-0000-d43d7ef73056 | json -Ha hostname
source-server
Estimate migration time
To estimate the amount of time required to migrate an instance, the mig-estimator.sh script can be used. This script is designed to be run on the compute node hosting the instance to be migrated, and it assumes a data transfer rate of 20 MB/s; the script can be adjusted to use a different data transfer rate if desired. The script provides an estimate for each instance found on the compute node, as well as an overall "purge" value indicating how long it would take to migrate all instances off the compute node.
All networks are different; test your network speed and adjust the mig-estimator.sh script as required for your network.
For more about how to use this script, please contact support at help@mnxsolutions.com.
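If you only need a rough manual estimate, you can divide the dataset's USED size by an assumed transfer rate; the sketch below uses the same 20 MB/s figure as mig-estimator.sh:
source-server# INSTANCE_UUID="<instance-uuid>"
source-server# USED_MB=$(zfs get -Hp -o value used zones/${INSTANCE_UUID} | awk '{print int($1/1048576)}')
source-server# echo "~$((USED_MB / 20 / 60)) minutes at 20 MB/s, plus a few minutes of overhead"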
Locate a destination compute node
You will need to ensure that the destination compute node meets all necessary requirements for the migrated instance:
- Amount of available RAM.
- Amount of available Disk.
- Any required NIC Tags.
- Any required Traits.
- Connectivity to source compute node.
The migrator is an operator-level tool, so it bypasses normal sanity checks (e.g., verifying that the required networks, traits, and so on are present on the destination). If you are not familiar with operator tools, it is possible to render the VM unbootable on the new hardware; consider contacting support instead.
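One way to spot-check a candidate destination from the head node is to inspect its CNAPI record; treat this as a sketch, as the exact fields available can vary by platform version:
headnode# sdc-server list | grep dest-server
headnode# sdc-cnapi "/servers?hostname=dest-server" | json -Ha uuid hostname memory_available_bytes reserved traits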
Validate access to source and destination compute nodes
Identify the IP address for both source (src) and destination (dest) compute nodes; the code snippet below is one way to determine the IP addresses. Replace 10.1.1 with the internal subnet you are using for your Triton installation:
headnode# sdc-oneachnode -n source-server,dest-server 'ifconfig -a| grep 10.1.1' | awk '!/HOST/{print $1" ("$3")"}'
source-server (10.1.1.33)
dest-server (10.1.1.34)
Once you have the IP addresses, validate that you are able to establish an SSH connection. In our example below we are using public/private key authentication; although it is acceptable to use passwords, it is highly recommended that you configure SSH keys between your head node and the compute nodes to help streamline this process.
headnode# ssh 10.1.1.33 uname -a
SunOS SRC_SERVER 5.11 joyent_20140115T175151Z i86pc i386 i86pc
headnode# ssh 10.1.1.34 uname -a
SunOS dest-server 5.11 joyent_20140115T175151Z i86pc i386 i86pc
Using the migrator.sh script - full migration
You can use migrator.sh from the head node to handle fully migrating an instance between two compute nodes. Of note, migrator.sh will determine the current run state of the instance prior to migration. If the instance is running prior to migration, it will be started on the destination compute node at the end of migration. If it is in any state other than running, the instance will not be restarted.
Verify the instance on the source compute node
Verify the current run state and datasets of the instance on the source compute node:
source-server# INSTANCE_UUID="5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa" ; zoneadm list -vic | grep ${INSTANCE_UUID} ; zfs list -rtall | grep ${INSTANCE_UUID}
3 5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa running /zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa joyent excl
zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa 208M 24.8G 2.41G /zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa
zones/cores/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa 31K 100G 31K /zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa/cores
Note on Time Estimation
From the above datasets, there is about 208 MB of data that will need to be migrated to the destination compute node, based on the USED column of the zfs output above (the instance's base image is imported separately on the destination). For these examples, we assume a data transfer rate of 1 - 1.5 GB of data per minute. This means data transfer should take between 10 and 20 seconds, with about 2-3 minutes of overhead from migrator.sh. However, all networks are different, so please test your network speed and throughput and adjust accordingly.
Destination compute node prep work
On the dest compute node, run the following to watch the progress of the migration (the output repeats once per minute, but you can adjust by changing the sleep time):
dest-server# INSTANCE_UUID="5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa" ; while : ; do date ; zoneadm list -vic | grep ${INSTANCE_UUID} ; zfs list -rtall | grep ${INSTANCE_UUID} ; echo ; sleep 60 ; done
March 3, 2014 09:58:11 PM UTC
Destroy or recreate NAT zones
Use nat-recreate.sh to destroy or recreate a NAT zone, and to migrate the zone off a compute node.
The migrator.sh script will not migrate a NAT zone. It also will not migrate any core zones used by Triton DataCenter or Object Storage.
Start the migration
On the head node, start the migration. You'll want to execute migrator.sh with root privileges. Note that you can either pass the instance UUID and compute node name on the command line, or allow the script to prompt you for that data. The following is the full runtime output of our migration:
headnode# ./migrator.sh
-: Welcome to the SDC Migrator :-
UUID of Instance to Move: 5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa
Destination Server: dest-server
* Please wait, gathering instance and CN data....
+ retrieving instance alias, server_uuid, brand, create_timestamp,
image_uuid, zone_state, quota, ram, and owner_uuid .... [ DONE ]
+ checking instance for any IP addrs to potentially reserve during
the migration ... [ DONE ]
+ retrieving SRC CN hostname, SDC Version, reservation status, and
IP addr ... [ DONE ]
+ retrieving DEST CN UUID, SDC Version, reservation status, and
IP addr ... [ DONE ]
+ checking instance for datasets to migrate during the migration ... [ DONE ]
+ Creating an SSH key on source-server to copy to authorized_keys on
dest-server for the migration; Key will be removed once migration completes. [ DONE ]
+ Copying SSH key to dest-server... [ DONE ]
- Data gathering complete!
We will be migrating:
INSTANCE:
uuid: 5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa
alias: kirby
IP Addr(s): 192.168.212.115
datacenter: cak-1
zone state: running
type: ZONE
owner: d56db211-59cf-4913-c10c-cb35d3a26bee
create at: 2014-04-14T16:52:42.749Z
base image: 398deede-c025-11e3-8b24-f3ba141900bd
total dataset size: 207.98 MBytes across 1 datasets
est. migration time: 2.17 minutes (@ ~20 MB / second (~1.17 GB / minute); +2 minutes extra)
migration type: Non-Incremental
SRC CN DEST CN
---------------------------------------------- ----------------------------------------------
Host: source-server Host: dest-server
UUID: 10895171-5599-db11-8667-763f24705829 UUID: 00000000-0000-0000-0000-d43d7ef73056
SDC Ver: 7.0 SDC Ver: 7.0
IP Addr: 10.1.1.33 IP Addr: 10.1.1.1
reserved: false reserved: false
migr key: /root/.ssh/migr-key-38135 auth_keys bkup: /root/.ssh/authorized_keys.38135
Are you ready to proceed? [y|n] y
Here we go...
* Checking if tcp_max_buf, tcp_xmit_hiwat, and tcp_recv_hiwat have
been tuned on both source-server and dest-server ...
- nothing to do for source-server
- nothing to do for dest-server
* Checking for origin dataset for the instance...
+ origin dataset is zones/398deede-c025-11e3-8b24-f3ba141900bd@final
> origin DS doesn't exist on dest-server, will need to try import
of origin DS (398deede-c025-11e3-8b24-f3ba141900bd@final)
* Attempting import of image (398deede-c025-11e3-8b24-f3ba141900bd) to dest-server
- Image imported successfully!
**** Runtime impact and pre-migration changes to ZONE instance
**** 5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa (kirby)
**** start here!
* Instance kirby is running; shutting it down, please wait .. [ DONE ]
- New state is: stopped
* Source XML file modifications...
+ Checking instance XML file for create-timestamp (JPC-1212)...
= create-timestamp already exists
+ Checking instance XML file for dataset-uuid (image_uuid; AGENT-629)...
= dataset-uuid already exists
- Source XML file modifications complete!
* Getting Zone index configuration... [ DONE ]
+ checking if we need to correct it (JPC-1421)
- Zone index configuration check complete!
* Creating dataset snapshots on source-server
+ Creating zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa snapshot... [ DONE ]
* Transferring dataset snapshots to dest-server
+ Tranferring zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa@vmsnap-1393137060_ops_migration_1393137060 ... The authenticity of host '10.1.1.1 (10.1.1.1)' can't be established.
RSA key fingerprint is 3c:86:e6:13:e6:59:4d:fc:9a:ae:59:05:19:30:fb:11.
Are you sure you want to continue connecting (yes/no)? yes
done
* Creating cores dataset on dest-server
- Created cores dataset: zones/cores/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa
* Transferring Zone XML to the destination... [ DONE ]
* Deleting migration snapshots on dest-server... [ DONE ]
* Disabling instance on source-server... [ DONE ]
* Importing & Attaching Zone on dest-server... [ DONE ]
* Restarting vmadmd on dest-server, to ensure detection of transferred VM's presence.
* Temporarily reserving instance IP addrs so they don't get reprovisioned elsewhere
in the short amount of time that the instance may show up as "destroyed"...
+ reserving 192.168.212.115 on network_uuid e76a5115-1353-4788-b3a3-7302e2f2b710... Done
* Setting attr 'do-not-inventory' for 5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa on source-server...
+ Restarting vmadmd again on destination, to hasten update of VM's presence.
+ Restarting heartbeater on destination, to hasten update of VM's presence.
* Checking VMAPI for 'do-not-inventory' update (VMAPI will show dest-server's
server_uuid if updated); may take up to a minute or so ......... Success!
* Unreserving instance IP addrs since we're no longer at risk of
the instance showing up as "destroyed".
- unreserving 192.168.212.115 on network_uuid e76a5115-1353-4788-b3a3-7302e2f2b710... Done
* Enabling Autostart on the dest-server... [ DONE ]
* VM kirby is ready for startup, please wait for boot..Done. (State is: running)
=== Done! ===
ZONE Instance: 5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa (kirby)
is now installed on
Dest CN: dest-server (00000000-0000-0000-0000-d43d7ef73056)
Migration started: April 28, 2014 03:43:40 PM UTC
Migration ended: April 28, 2014 03:51:42 PM UTC
Duration of migration: 2.23 minutes
Instance downtime: 17 seconds
Migration type: Non-Incremental
# dataset increments: 1
Don't forget to at least comment out 5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa
on source-server in /etc/zones/index, if not outright remove it.
To remove it on source-server:
* /usr/bin/ssh 10.1.1.33 "/usr/sbin/zonecfg -z 5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa delete -F"
* /usr/bin/ssh 10.1.1.33 "/usr/sbin/zfs destroy zones/cores/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa"
* /usr/bin/ssh 10.1.1.33 "/usr/sbin/zfs destroy -r zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa"
Please validate in AdminUI: https://10.1.1.26/vms/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa
or validate via CLI: /opt/smartdc/bin/sdc-vmapi /vms/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa
* Clearing migration SSH keys
Monitoring progress
On the destination compute node, we started a while loop to watch the progress of the dataset transfers. The following is a sample (taken at 1-minute intervals) of that output while migrator.sh was running:
dest-server# INSTANCE_UUID="5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa" ; while : ; do date ; zoneadm list -vic | grep ${INSTANCE_UUID} ; zfs list -rtall | grep ${INSTANCE_UUID} ; echo ; sleep 60 ; done
March 3, 2014 09:58:11 PM UTC
zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa 100M 24.8G 2.69G /zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa
March 3, 2014 09:59:11 PM UTC
zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa 200M 24.8G 2.69G /zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa
March 3, 2014 10:00:11 PM UTC
zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa 215M 24.8G 2.69G /zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa
zones/cores/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa 144K 2.50G 144K /zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa/cores
End state on Destination compute node
Post migration, the instance should appear on the Destination compute node as:
dest-server# INSTANCE_UUID="5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa" ; zoneadm list -vic | grep ${INSTANCE_UUID} ; zfs list -rtall | grep ${INSTANCE_UUID}
28 5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa running /zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa joyent excl
zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa 215M 24.8G 2.69G /zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa
zones/cores/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa 144K 2.50G 144K /zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa/cores
Validate migration via APIs
Validate the instance was successfully migrated and is reported as being hosted by the destination compute node:
dest-server# sdc-vmapi /vms/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa | json -Ha server_uuid alias create_timestamp state
00000000-0000-0000-0000-d43d7ef73056 kirby 2014-04-14T16:52:42.749Z running
Verify instance
Verify the state and sanity of the migrated instance with the instance user. Provided they have no issues, you can destroy the instance on the source compute node.
Clean up the source compute node
If you haven't yet verified the instance with the instance user, at least comment it out in /etc/zones/index on the source compute node by placing a # in front of the appropriate line.
source-server# INSTANCE_UUID="5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa "; grep ${INSTANCE_UUID} /etc/zones/index
#5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa :installed:/zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa:5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa
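If you prefer to comment the entry out remotely from the head node, something like the following works (a sketch; it keeps a backup copy of the index file):
headnode# UUID="5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa"
headnode# ssh 10.1.1.33 "cp /etc/zones/index /etc/zones/index.bak && sed 's|^${UUID}:|#${UUID}:|' /etc/zones/index.bak > /etc/zones/index"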
If you have verified the instance as stable with the user, remove the instance from the source compute node:
headnode# /usr/bin/ssh 10.1.1.33 "/usr/sbin/zonecfg -z 5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa delete -F"
headnode# /usr/bin/ssh 10.1.1.33 "/usr/sbin/zfs destroy zones/cores/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa"
headnode# /usr/bin/ssh 10.1.1.33 "/usr/sbin/zfs destroy -r zones/5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa"
Zone, Bhyve and KVM differences
KVM instances have 2 extra datasets that Zones do not have, zones/:UUID-disk0 and zones/:UUID-disk1. If we had migrated a KVM instance, there would be additional destroy commands for the 2 zones/:UUID-disk# datasets associated with a KVM instance.
Bhyve instances have the top-level disk quota and reservation at 100% of the sum of the root and child dataset quotas. There is no additional quota for snapshots. For this reason, the migrator has extra steps to temporarily bump up the quota before migration and reset it after migration.
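Conceptually, that quota adjustment looks like the following (a sketch; migrator.sh handles this for you, and the 10 GB of headroom is purely illustrative):
source-server# INSTANCE_UUID="<instance-uuid>"
source-server# CUR=$(zfs get -Hp -o value quota zones/${INSTANCE_UUID})           # record the current quota in bytes
source-server# zfs set quota=$((CUR + 10*1024*1024*1024)) zones/${INSTANCE_UUID}  # add headroom before migration
source-server# zfs set quota=${CUR} zones/${INSTANCE_UUID}                        # restore the original quota afterwards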
Incremental migration
An incremental migration differs from a standard migration in that migrator.sh enables the operator to minimize the amount of downtime the instance incurs as a result of the migration. The benefit of an incremental migration is fully realized when you have large instances that would take hours (or days) to migrate.
Rather than immediately shutting down the instance as a full migration would, an incremental migration keeps the instance online until the last set of snapshots. Provided the amount of data that changes before removal of the state file is no more than a few GB (say, under 2 GB), the instance will only be down for about 2 minutes.
In the course of an incremental migration, migrator.sh will start sending snapshots of the instance's datasets to the destination compute node immediately, while the instance is still running.
Once the first round of snapshots has been received, migrator.sh will pause for 60 seconds, create a new set of snapshots for the instance, and send the delta between the new snapshots and the original snapshots. During this, the instance continues to run.
Once the second round of snapshots has been received, migrator.sh will pause for 60 seconds, create a new set of snapshots for the instance, and send the delta between the new snapshots and the second round of snapshots. To track when it should finalize the migration and stop snapshotting, migrator.sh creates a state file indicating that the migration type is incremental. As long as that state file exists on the head node, migrator.sh continues to create and send snapshots.
As soon as that state file is removed, migrator.sh will finish sending the current round of snapshots, shut down the instance, and take one final incremental snapshot of the instance's datasets. It will then send the delta between the final snapshots and the penultimate snapshots, at which point the destination compute node has the full content of the instance's datasets. From there, migrator.sh finishes the migration of the instance just as it would in a full migration.
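Each round after the first is a standard ZFS incremental send/receive; the following is a minimal sketch of the underlying mechanism, not the script's exact commands (snapshot names and the destination IP are placeholders):
source-server# INSTANCE_UUID="<instance-uuid>"
source-server# zfs snapshot zones/${INSTANCE_UUID}@migr-1
source-server# zfs send -i zones/${INSTANCE_UUID}@migr-0 zones/${INSTANCE_UUID}@migr-1 | ssh root@10.1.1.34 "zfs receive zones/${INSTANCE_UUID}"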
At a high level, the migrator.sh script performs the following for an incremental migration:
- Validates the instance UUID.
- Validates access to the source and destination compute nodes.
- Creates a state file to be used to determine when to complete the migration.
- Generates and pushes SSH keys for use between the compute nodes.
- Reserves the instance's IP address(es)
- Verifies the instance on the source compute node.
- Snapshots the datasets associated with the instance.
- Transfers the datasets to the destination compute node.
- Pauses at completion of transfer and then loops and:
- Creates a new set of snapshots.
- Sends the deviation between the snapshots to the destination compute node.
- Checks to see if the state file exists.
- If the state file exists, goes back to the top of the loop.
- If the state file does not exist, the script breaks out of the loop and continues.
- Transfers the configuration from the source to the destination.
- Shuts the instance down and validates the attributes.
- Reviews the transfer and cleans up the snapshots.
- Creates the cores dataset on the destination compute node.
- Sets up the /etc/zones/index file on the destination compute node.
- Forces the API to acknowledge the instance on the destination compute node.
- Verifies the instance in the API.
- Boots the instance and validates state.
- Unreserves the instance's IP address(es).
- Provides instructions on cleaning data from the source compute node.
Note that an incremental migration creates a growing number of dataset snapshots (one per transfer pass) until the migration is finalized.
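The snapshot loop is driven entirely by the presence of the state file on the head node; conceptually it behaves like this sketch (the state file path follows the pattern shown in the usage output above):
# conceptual sketch of the incremental loop, not literal script code
while [ -f /var/tmp/migrator-INST_UUID-DEST_CN_NAME-ST_EPOCH.st ]; do
    # take snapshot N and send the delta from snapshot N-1 to the destination
    sleep 60
done
# state file removed: shut the instance down, send one final delta, and finish the migration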
Preparation
You will need to have root level access to the source compute node (CN), the destination compute node, and the head node (HN). You will also need to know the UUID of the instance to be migrated, and you should ensure that the destination compute node has the requisite traits, network tags, disk space, memory, and CPU requirements necessary to host the migrated instance.
Note: It is highly recommended that you push an SSH key from the head node to the compute nodes to allow passwordless SSH between the head node and the compute node. This will eliminate the need to type passwords during this process. The keys that are created by migrator.sh are used only between the source and destination compute nodes, not the head node and the compute nodes.
Identify instance
Identify the instance's current server_uuid, alias, create_timestamp, and state. Also determine the hostname of the server it is running on; this can be done using the sdc-vmapi and sdc-cnapi tools. For this example, we'll be migrating SmartOS instance "pkgbuild" with the UUID of "43aba933-2cd6-6c47-e63e-b1d2d1ab4956" from the server source-server.
headnode# sdc-vmapi /vms/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 | json -Ha server_uuid alias create_timestamp state
00000000-0000-0000-0000-d43d7ef73056 pkgbuild 2013-12-12T20:37:34.494Z running
headnode# sdc-cnapi /servers/00000000-0000-0000-0000-d43d7ef73056 | json -Ha hostname
source-server
Estimate migration time
To estimate the amount of time required to migrate an instance, the mig-estimator.sh script can be used. This script is designed to be run on the compute node hosting the instance to be migrated, and it assumes a data transfer rate of 20 MB/s; the script can be adjusted to use a different data transfer rate if desired. The script provides an estimate for each instance found on the compute node, as well as an overall "purge" value indicating how long it would take to migrate all instances off the compute node.
All networks are different; test your network speed and adjust the mig-estimator.sh script as required for your network.
For more about how to use this script, please contact support at help@mnxsolutions.com.
Locate a destination compute node
You will need to ensure that the Destination compute node meets all necessary requirements for the migrated instance:
- Amount of available RAM.
- Amount of available Disk.
- Any required NIC Tags.
- Any required Traits.
- Connectivity to source compute node.
Validate access to source and destination compute nodes
Identify the IP address for both source (src) and destination (dest) compute nodes; the code snippet below is one way to determine the IP addresses. Replace 10.1.1 with the internal subnet you are using for your Triton installation:
headnode# sdc-oneachnode -n source-server,dest-server 'ifconfig -a| grep 10.1.1' | awk '!/HOST/{print $1" ("$3")"}'
source-server (10.1.1.33)
dest-server (10.1.1.34)
Once you have the IP addresses, validate that you are able to establish an SSH connection. In our example below we are using public/private key authentication; although it is acceptable to use passwords, it is highly recommended that you configure SSH keys between your head node and the compute nodes to help streamline this process.
headnode# ssh 10.1.1.33 uname -a
SunOS SRC_SERVER 5.11 joyent_20140115T175151Z i86pc i386 i86pc
headnode# ssh 10.1.1.34 uname -a
SunOS dest-server 5.11 joyent_20140115T175151Z i86pc i386 i86pc
Using the migrator.sh script - incremental migration
You can use migrator.sh from the head node to handle migrating an instance between two compute nodes. Of note, migrator.sh will determine the current run state of the instance prior to migration. If the instance is running prior to migration, it will be started on the destination compute node at the end of migration. If it is in any state other than running, the instance will not be restarted.
Verify the instance on the source compute node
Verify the current run state and datasets of the instance on the source compute node:
source-server# INSTANCE_UUID="43aba933-2cd6-6c47-e63e-b1d2d1ab4956" ; zoneadm list -vic | grep ${INSTANCE_UUID} ; zfs list -rtall | grep ${INSTANCE_UUID}
- 43aba933-2cd6-6c47-e63e-b1d2d1ab4956 installed /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 joyent excl
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 98G 34.8G 98G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
zones/cores/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 144K 15.0G 144K /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956/cores
Note on time estimation
From the above datasets, there is about 98 GB of data that will need to be migrated to the destination compute node, based on the 4th column (REFER) of the zfs output above. For these examples, we assume a data transfer rate of 1 - 1.5 GB of data per minute. This means data transfer should take between 80 and 120 minutes, with about 2-3 minutes of overhead from migrator.sh. However, all networks are different, so please test your network speed and throughput and adjust accordingly.
Destination compute node prep work
On the dest compute node, run the following to watch the progress of the migration (the output repeats once per minute, but you can adjust by changing the sleep time):
dest-server# INSTANCE_UUID="5ca5b3dc-48fe-6f75-ed8c-8d8fbd77e2aa" ; while : ; do date ; zoneadm list -vic | grep ${INSTANCE_UUID} ; zfs list -rtall | grep ${INSTANCE_UUID} ; echo ; sleep 60 ; done
March 3, 2014 09:58:11 PM UTC
Start the migration
On the head node, start the migration. You'll want to execute migrator.sh with root privileges. Note that you can either pass the instance UUID and compute node name on the command line, or allow the script to prompt you for that data. The following is the full runtime output of our migration:
headnode# ./migrator.sh -i
-: Welcome to the SDC Migrator :-
UUID of Instance to Move: 43aba933-2cd6-6c47-e63e-b1d2d1ab4956
Destination Server: dest-server
* Please wait, gathering instance and CN data....
+ retrieving instance alias, server_uuid, brand, create_timestamp,
image_uuid, zone_state, quota, ram, and owner_uuid .... [ DONE ]
+ checking instance for any IP addrs to potentially reserve during
the migration ... [ DONE ]
+ retrieving SRC CN hostname, SDC Version, reservation status, and
IP addr ... [ DONE ]
+ retrieving DEST CN UUID, SDC Version, reservation status, and
IP addr ... [ DONE ]
+ checking instance for datasets to migrate during the migration ... [ DONE ]
+ Creating an SSH key on source-server to copy to authorized_keys on
dest-server for the migration; Key will be removed once migration completes. [ DONE ]
+ Copying SSH key to dest-server... [ DONE ]
- Data gathering complete!
We will be migrating:
INSTANCE:
uuid: 43aba933-2cd6-6c47-e63e-b1d2d1ab4956
alias: pkgbuild
IP Addr(s): 192.168.212.108
datacenter: cak-1
zone state: installed
type: ZONE
owner: d56db211-59cf-4913-c10c-cb35d3a26bee
create at: 2014-03-13T15:44:35.586Z
base image: 74c3b232-7961-11e3-a7a7-935768270b93
total dataset size: 97.72 GBytes across 1 datasets
est. migration time: 1.42 hours (@ ~20 MB / second (~1.17 GB / minute); +2 minutes extra)
migration type: Incremental
migration log file: cak-1 HN:/var/tmp/migrator-43aba933-2cd6-6c47-e63e-b1d2d1ab4956-dest-server-1398709539.log
migration state file: cak-1 HN:/var/tmp/migrator-43aba933-2cd6-6c47-e63e-b1d2d1ab4956-dest-server-1398709539.st
SRC CN DEST CN
---------------------------------------------- ----------------------------------------------
Host: source-server Host: dest-server
UUID: 10895171-5599-db11-8667-763f24705829 UUID: 00000000-0000-0000-0000-d43d7ef73056
SDC Ver: 7.0 SDC Ver: 7.0
IP Addr: 10.1.1.33 IP Addr: 10.1.1.1
reserved: false reserved: false
migr key: /root/.ssh/migr-key-52742 auth_keys bkup: /root/.ssh/authorized_keys.52742
Are you ready to proceed? [y|n] y
Here we go...
* Checking if tcp_max_buf, tcp_xmit_hiwat, and tcp_recv_hiwat have
been tuned on both source-server and dest-server ...
- nothing to do for source-server
- nothing to do for dest-server
* Checking for origin dataset for the instance...
+ origin dataset null or non-standard (-)
> instance DS origin does not reference IMG_UUID@final, relation to
origin DS will be lost in the course of migrating the instance.
* Creating dataset snapshots on source-server
+ Creating zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 snapshot 0... [ DONE ]
* Transferring incremental dataset snapshot 0 to dest-server
+ Tranferring zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-0 ... done
+ Sleeping for 60 seconds before next snapshot.
* Creating dataset snapshots on source-server
+ Creating zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 snapshot 1... [ DONE ]
* Transferring incremental dataset snapshot 1 to dest-server
+ Tranferring zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-1 ... done
+ Sleeping for 60 seconds before next snapshot.
* Creating dataset snapshots on source-server
+ Creating zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 snapshot 2... [ DONE ]
* Transferring incremental dataset snapshot 2 to dest-server
+ Tranferring zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-2 ... done
+ Sleeping for 60 seconds before next snapshot.
* Creating dataset snapshots on source-server
+ Creating zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 snapshot 3... [ DONE ]
* Transferring incremental dataset snapshot 3 to dest-server
+ Tranferring zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-3 ... done
+ Sleeping for 60 seconds before next snapshot.
* Creating dataset snapshots on source-server
+ Creating zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 snapshot 4... [ DONE ]
* Transferring incremental dataset snapshot 4 to dest-server
+ Tranferring zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-4 ... done
+ Sleeping for 60 seconds before next snapshot.
* Creating dataset snapshots on source-server
+ Creating zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 snapshot 5... [ DONE ]
* Transferring incremental dataset snapshot 5 to dest-server
+ Tranferring zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-5 ... done
+ Sleeping for 60 seconds before next snapshot.
* Creating dataset snapshots on source-server
+ Creating zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 snapshot 6... [ DONE ]
* Transferring incremental dataset snapshot 6 to dest-server
+ Tranferring zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-6 ... done
+ Sleeping for 60 seconds before next snapshot.
**** Runtime impact and pre-migration changes to ZONE instance
**** 43aba933-2cd6-6c47-e63e-b1d2d1ab4956 (pkgbuild)
**** start here!
* Instance pkgbuild is already shutdown. Proceeding.
* Source XML file modifications...
+ Checking instance XML file for create-timestamp (JPC-1212)...
= create-timestamp already exists
+ Checking instance XML file for dataset-uuid (image_uuid; AGENT-629)...
= dataset-uuid already exists
- Source XML file modifications complete!
* Getting Zone index configuration... [ DONE ]
+ checking if we need to correct it (JPC-1421)
- Zone index configuration check complete!
* Creating final incrmental dataset snapshot on source-server
+ Creating zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 snapshot 7... [ DONE ]
+ Tranferring zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-7 ... done
* Creating cores dataset on dest-server
- Created cores dataset: zones/cores/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
* Transferring Zone XML to the destination... [ DONE ]
* Deleting migration snapshots on dest-server... [ DONE ]
* Disabling instance on source-server... [ DONE ]
* Importing & Attaching Zone on dest-server... [ DONE ]
* Restarting vmadmd on dest-server, to ensure detection of transferred VM's presence.
* Temporarily reserving instance IP addrs so they don't get reprovisioned elsewhere
in the short amount of time that the instance may show up as "destroyed"...
+ reserving 192.168.212.108 on network_uuid e76a5115-1353-4788-b3a3-7302e2f2b710... Done
* Setting attr 'do-not-inventory' for 43aba933-2cd6-6c47-e63e-b1d2d1ab4956 on source-server...
+ Restarting vmadmd again on destination, to hasten update of VM's presence.
+ Restarting heartbeater on destination, to hasten update of VM's presence.
* Checking VMAPI for 'do-not-inventory' update (VMAPI will show dest-server's
server_uuid if updated); may take up to a minute or so ........ Success!
* Unreserving instance IP addrs since we're no longer at risk of
the instance showing up as "destroyed".
- unreserving 192.168.212.108 on network_uuid e76a5115-1353-4788-b3a3-7302e2f2b710... Done
* Enabling Autostart on the dest-server... [ DONE ]
- VM pkgbuild was in state installed when we started
thus not running. Leaving in "installed" state.
=== Done! ===
ZONE Instance: 43aba933-2cd6-6c47-e63e-b1d2d1ab4956 (pkgbuild)
is now installed on
Dest CN: dest-server (00000000-0000-0000-0000-d43d7ef73056)
Migration started: April 28, 2014 06:43:10 PM UTC
Migration ended: April 28, 2014 09:30:00 PM UTC
Duration of migration: 1.50 hours
Instance downtime: 36 seconds
Migration type: Incremental
# dataset increments: 8
migration log file: cak-1 HN:/var/tmp/migrator-43aba933-2cd6-6c47-e63e-b1d2d1ab4956-dest-server-1398709539.log
Don't forget to at least comment out 43aba933-2cd6-6c47-e63e-b1d2d1ab4956
on source-server in /etc/zones/index, if not outright remove it.
To remove it on source-server:
* /usr/bin/ssh 10.1.1.33 "/usr/sbin/zonecfg -z 43aba933-2cd6-6c47-e63e-b1d2d1ab4956 delete -F"
* /usr/bin/ssh 10.1.1.33 "/usr/sbin/zfs destroy zones/cores/43aba933-2cd6-6c47-e63e-b1d2d1ab4956"
* /usr/bin/ssh 10.1.1.33 "/usr/sbin/zfs destroy -r zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956"
Please validate in AdminUI: https://10.1.1.26/vms/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
or validate via CLI: /opt/smartdc/bin/sdc-vmapi /vms/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
* Clearing migration SSH keys
Note that the migration above ran from 06:43 PM to 09:30 PM (roughly 2.78 hours of elapsed time), but the actual instance downtime was only 36 seconds.
Monitoring progress
On the destination compute node, we started a while loop to watch the progress of the dataset transfers. The following is a sample (taken at 1-minute intervals) of that output while migrator.sh was running; the output has been trimmed for brevity, but you can see the snapshots accumulate once the first iteration of the data transfer has completed, followed by the removal of those snapshots once the state file is removed and the transfer is finalized.
dest-server# INSTANCE_UUID="43aba933-2cd6-6c47-e63e-b1d2d1ab4956" ; while : ; do date ; zoneadm list -vic | grep ${INSTANCE_UUID} ; zfs list -rtall | grep ${INSTANCE_UUID} ; echo ; sleep 60 ; done
April 28, 2014 06:43:02 PM UTC
April 28, 2014 06:44:02 PM UTC
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 540M 149G 540M /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
April 28, 2014 06:45:02 PM UTC
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 1.19G 149G 1.19G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
April 28, 2014 06:46:02 PM UTC
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 1.85G 148G 1.85G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
April 28, 2014 06:47:02 PM UTC
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 2.63G 147G 2.63G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
April 28, 2014 06:48:02 PM UTC
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 3.57G 146G 3.57G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
<----------SNIP---------->
April 28, 2014 08:01:57 PM UTC
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 114G 36.3G 114G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
April 28, 2014 08:02:58 PM UTC
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 114G 35.6G 114G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
April 28, 2014 08:03:59 PM UTC
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 115G 34.9G 115G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
April 28, 2014 08:05:01 PM UTC
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 115G 34.8G 115G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-0 0 - 115G -
April 28, 2014 08:07:01 PM UTC
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 115G 34.8G 115G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-0 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-1 0 - 115G -
April 28, 2014 08:08:01 PM UTC
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 115G 34.8G 115G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-0 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-1 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-2 0 - 115G -
April 28, 2014 08:09:01 PM UTC
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 115G 34.8G 115G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-0 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-1 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-2 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-3 0 - 115G -
April 28, 2014 08:10:01 PM UTC
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 115G 34.8G 115G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-0 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-1 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-2 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-3 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-4 0 - 115G -
April 28, 2014 08:11:01 PM UTC
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 115G 34.8G 115G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-0 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-1 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-2 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-3 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-4 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-5 0 - 115G -
April 28, 2014 08:12:01 PM UTC
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 115G 34.8G 115G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-0 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-1 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-2 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-3 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-4 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-5 8K - 115G -
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956@vmsnap-1393137060_ops_migration_1393137060-6 0 - 115G -
<----------SNIP---------->
April 28, 2014 08:15:01 PM UTC
- 43aba933-2cd6-6c47-e63e-b1d2d1ab4956 installed /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 joyent excl
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 115G 34.8G 115G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
zones/cores/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 144K 15.0G 144K /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956/cores
April 28, 2014 08:16:01 PM UTC
- 43aba933-2cd6-6c47-e63e-b1d2d1ab4956 installed /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 joyent excl
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 115G 34.8G 115G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
zones/cores/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 144K 15.0G 144K /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956/cores
Remove state file to finish migration
By design, the incremental migration will continue to loop and send new snapshots every 60 seconds (measured from the end of the previous snapshot send) until the state file is deleted. The state file path is shown in the header information echoed by the script at startup, and is also recorded in the log:
headnode# ls -l migrator-43aba933-2cd6-6c47-e63e-b1d2d1ab4956-dest-server-1398709539.*
-rw-r--r-- 1 root root 5300 Apr 28 21:28 migrator-43aba933-2cd6-6c47-e63e-b1d2d1ab4956-dest-server-1398709539.log
-rw-r--r-- 1 root root 0 Apr 28 18:25 migrator-43aba933-2cd6-6c47-e63e-b1d2d1ab4956-dest-server-1398709539.st
Removing the state file tells the script to shut down the instance and continue:
headnode# rm migrator-43aba933-2cd6-6c47-e63e-b1d2d1ab4956-dest-server-1398709539.st
End state on destination compute node
Post migration, the instance should appear on the destination compute node as:
dest-server# INSTANCE_UUID="43aba933-2cd6-6c47-e63e-b1d2d1ab4956" ; zoneadm list -vic | grep ${INSTANCE_UUID} ; zfs list -rtall | grep ${INSTANCE_UUID}
- 43aba933-2cd6-6c47-e63e-b1d2d1ab4956 installed /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 joyent excl
zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 115G 34.8G 115G /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956
zones/cores/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 144K 15.0G 144K /zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956/cores
Validate migration via APIs
Validate the instance was successfully migrated and is reported as being hosted by the destination compute node:
dest-server# sdc-vmapi /vms/43aba933-2cd6-6c47-e63e-b1d2d1ab4956 | json -Ha server_uuid alias create_timestamp state
00000000-0000-0000-0000-d43d7ef73056 pkgbuild 2014-03-13T15:44:35.586Z running
Verify instance
Verify the state and sanity of the migrated instance with the instance user. Provided they have no issues, you can destroy the instance on the source compute node.
Clean up the source compute node
If you haven't yet verified the instance with the instance user, at least comment it out in /etc/zones/index on the source compute node by placing a # in front of the appropriate line.
source-server# INSTANCE_UUID="43aba933-2cd6-6c47-e63e-b1d2d1ab4956 "; grep ${INSTANCE_UUID} /etc/zones/index
#43aba933-2cd6-6c47-e63e-b1d2d1ab4956 :installed:/zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956:43aba933-2cd6-6c47-e63e-b1d2d1ab4956
If you have verified the instance as stable with the user, remove the instance from the source compute node:
headnode# /usr/bin/ssh 10.1.1.33 "/usr/sbin/zonecfg -z 43aba933-2cd6-6c47-e63e-b1d2d1ab4956 delete -F"
headnode# /usr/bin/ssh 10.1.1.33 "/usr/sbin/zfs destroy zones/cores/43aba933-2cd6-6c47-e63e-b1d2d1ab4956"
headnode# /usr/bin/ssh 10.1.1.33 "/usr/sbin/zfs destroy -r zones/43aba933-2cd6-6c47-e63e-b1d2d1ab4956"
Zone and KVM differences
KVM instances have 2 extra datasets that Zones do not have, zones/:UUID-disk0 and zones/:UUID-disk1. If we had migrated a KVM instance, there would be additional destroy commands for the 2 zones/:UUID-disk# datasets associated with a KVM instance.
Bhyve instances have the top-level disk quota and reservation at 100% of the sum of the root and child dataset quotas. There is no additional quota for snapshots. For this reason, the migrator has extra steps to temporarily bump up the quota before migration and reset it after migration.