Troubleshooting Triton
This document provides basic instruction on troubleshooting both a Triton installation and the instances running within it.
Troubleshooting the installation
Installation level troubleshooting generally involves working on the installation head node. Most troubleshooting tasks will require root (or root-equivalent) access.
Checking the health of Triton
The sdc-healthcheck script is designed to check the major components of Triton and provide a simplified report indicating the status of key services. Please see checking the health of a Triton installation for details on how to run and interpret the output of this script.
Checking compute node access
In order to check access to all defined compute nodes in an installation, you can use the sdc-oneachnode command to issue a hostname command to each configured compute node.
For example, this is the output for a Triton installation with one head node and 5 compute nodes:
[root@headnode (mxpa) /opt/custom/bin]# sdc-oneachnode -a hostname
HOST OUTPUT
headnode headnode
cnode01 cnode01
cnode02 cnode02
cnode03 cnode03
cnode04 cnode04
cnode05 cnode05
Notes:
- The sdc-oneachnode command executes with root permissions, and when called with the -a flag will execute commands on all nodes in the Triton installation. Because of this, care should be taken when utilizing it.
- Versions of Triton shipped prior to 1-July-2014 do not use the -a flag for the sdc-oneachnode command; the supplied command is run on all nodes by default.
You should receive a response from each compute node in your installation. If you fail to see a compute node, you will need to investigate to ensure that you are able to reach that compute node (via ping, ssh, etc.), as this could point to an issue with that compute node. It could also point to an issue with the cnapi zone on the head node.
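As a minimal sketch (assuming cnode03 is the admin-network hostname of the node that failed to respond; substitute your own node name or its admin IP address), you can check basic reachability from the head node with:
# ping cnode03
# ssh cnode03 hostname
If the node answers both but still does not appear in the sdc-oneachnode output, the problem is more likely with the cnapi zone than with the compute node itself.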
Changing the root password
Because of the way that Triton boots (from USB key for the head node, and via PXE boot for the compute nodes), the following steps must be taken when changing the root password to ensure that the change persists across a reboot. To change the root password in Triton, please follow the procedure entitled changing the root password in Triton.
Troubleshooting a compute node
Issues with a compute node affect all instances that live on that particular compute node. In order to troubleshoot a compute node you will need to log onto that node with root (or root-equivalent) access.
Problems with a compute node can be broken into two broad areas:
- Performance issues
- Hardware issues
Hardware issues should first be diagnosed by checking the output of the fmadm utility as described in Working with the Fault Management Configuration Tool fmadm.
Performance issues should always be investigated from the "bottom up"; that is, you should first check the performance within the instance that is experiencing the issue. For more information on troubleshooting an instance, please see Troubleshooting Virtual Machines / Instances below.
Once you have satisfied yourself that the instance is performing properly, you should then check the performance on the underlying compute node by reviewing the sections below and performing whatever troubleshooting is relevant to the symptoms you are seeing.
Working with the fault management configuration tool fmadm
The fmadm(1M) utility is used to administer and service problems detected by the Solaris Fault Manager, fmd(1M). If a component has been diagnosed as faulty, fmadm will report what component has failed, and the response taken for the failed device.
To view a list of failed components, run fmadm with the faulty option:
# fmadm faulty
-------------- ------------------------------------ -------------- ---------
TIME EVENT-ID MSG-ID SEVERITY
-------------- ------------------------------------ -------------- ---------
May 02 20:00:34 abe52661-52aa-ec45-983e-f019e465db53 ZFS-8000-FD Major
Host : headnode
Platform : MS-7850 Chassis_id : To-be-filled-by-O.E.M.
Product_sn :
Fault class : fault.fs.zfs.vdev.io
Affects : zfs://pool=zones/vdev=5dbf266cd162b324
faulted and taken out of service
Problem in : zfs://pool=zones/vdev=5dbf266cd162b324
faulted and taken out of service
Description : The number of I/O errors associated with a ZFS device exceeded
acceptable levels. Refer to
http://illumos.org/msg/ZFS-8000-FD for more information.
Response : The device has been offlined and marked as faulted. An attempt
will be made to activate a hot spare if available.
Impact : Fault tolerance of the pool may be compromised.
Action : Run 'zpool status -x' and replace the bad device.
The above provides an example of what you may see if a device in a zpool has been diagnosed as faulted. For more details, please see the fmadm(1M) man page.
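To dig deeper into a specific fault reported by fmadm faulty, you can pass its EVENT-ID to fmdump(1M); a quick sketch using the event from the example above (fmdump is available in the global zone on stock SmartOS):
# fmdump -V -u abe52661-52aa-ec45-983e-f019e465db53
The -V flag prints the full verbose detail for the event, including the affected FMRIs.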
Compute node disk performance
Disk performance on a compute node is a function of the speed of the disks, the amount of ARC cache, and the I/O load on the system.
The following sections approach disk performance from the standpoint of the compute node. For information on viewing disk performance from the standpoint of an instance, please see the section entitled Checking instance disk usage.
Compute node disk throttling
In order to handle a multi-tenant load, Triton implements a ZFS I/O throttle. The throttle is composed of two components: one tracks and accounts for each zone's I/O requests, and the other throttles any zone that exceeds its fair share of disk I/O. When the throttle detects that a zone is consuming more than is appropriate, each read or write system call is delayed by up to 100 microseconds. This allows other zones to interleave I/O requests during those delays.
Disk throttling activity can be viewed using the vfsstat(1m) command.
In this example, we pass the -Zz flags to vfsstat, which tell it to report on all zones (-Z) but to omit zones that are not showing any activity (-z). The 5 5 tells it to report every 5 seconds, 5 times.
# vfsstat -Zz 5 5
r/s w/s kr/s kw/s ractv wactv read_t writ_t %r %w d/s del_t zone
112.9 13.7 566.0 21.9 0.0 0.0 47.2 45.8 0 0 0.0 0.0 global (0)
0.1 0.0 0.2 0.1 0.0 0.0 530.0 220.0 0 0 0.0 10.6 f45b2cc1 (2)
0.1 0.0 0.2 0.1 0.0 0.0 566.8 324.9 0 0 0.0 9.1 52ea67c4 (3)
0.1 0.0 0.2 0.1 0.0 0.0 569.3 316.4 0 0 0.0 8.2 7352a1d5 (4)
0.1 0.0 0.2 0.1 0.0 0.0 508.0 323.6 0 0 0.0 10.7 dce9c215 (5)
242.4 829.3 16292.4 15085.3 0.2 0.1 960.4 131.8 9 7 514.5 126.7 42ef02d8 (6)
0.6 0.1 2.0 0.4 0.0 0.0 178.5 120.7 0 0 0.1 48.4 929245ee (7)
150.4 0.1 475.0 0.0 0.0 0.0 3.5 143.9 0 0 0.0 8.4 446966ef (8)
55.9 0.1 176.0 0.0 0.0 0.0 4.3 76.7 0 0 0.0 5.0 3496feff (9)
151.8 0.1 479.5 0.0 0.0 0.0 3.6 142.6 0 0 0.0 9.7 e9e4589b (10)
147.1 0.1 465.3 0.0 0.0 0.0 5.1 139.5 0 0 0.0 24.9 211a5737 (11)
46.2 0.1 145.7 0.0 0.0 0.0 4.8 116.0 0 0 0.0 8.8 708ec2c4 (12)
52.5 2.5 161.7 0.7 0.0 0.0 9.6 44.3 0 0 0.0 18.3 c3de353c (13)
0.0 0.0 0.0 0.0 0.0 0.0 7.0 7.8 0 0 1.7 61.8 3ca31d30 (15)
0.0 0.0 0.0 0.0 0.0 0.0 86.3 13.8 0 0 0.7 34.5 119f6ffc (16)
0.0 0.0 0.0 0.0 0.0 0.0 1903.5 16.8 0 0 0.1 48.0 e25f589d (17)
0.0 0.0 0.0 0.0 0.0 0.0 8.5 13.8 0 0 0.1 103.6 dbaa3a0d (18)
0.0 0.0 0.0 0.0 0.0 0.0 7.5 6.6 0 0 0.0 35.0 d451978f (20)
0.0 0.0 0.0 0.0 0.0 0.0 7.1 12.7 0 0 0.0 79.3 f0cd9950 (21)
0.0 0.0 0.0 0.0 0.0 0.0 9.4 7.3 0 0 0.1 24.7 b5020965 (22)
159.5 0.1 503.5 0.0 0.0 0.0 3.6 138.3 0 0 0.0 6.8 2d716614 (31)
162.5 0.1 513.0 0.0 0.0 0.0 3.4 123.7 0 0 0.0 5.5 7d11de51 (34)
0.0 0.0 0.0 0.0 0.0 0.0 12.7 16.3 0 0 0.0 9.7 9911693c (51)
0.0 0.0 0.0 0.0 0.0 0.0 7.2 10.7 0 0 0.1 36.5 914ed552 (55)
0.0 0.0 0.0 0.0 0.0 0.0 109.8 13.3 0 0 0.0 30.6 529c4dc3 (56)
175.4 0.1 553.8 0.1 0.0 0.0 3.5 126.1 0 0 0.0 7.5 b89cbc9d (62)
0.0 0.0 0.0 0.0 0.0 0.0 35.3 7.2 0 0 6.9 85.1 b94f668a (63)
174.0 0.5 547.9 0.3 0.0 0.0 3.8 24.5 0 0 0.0 7.5 4c827f59 (65)
12.8 0.3 39.8 0.3 0.0 0.0 11.3 29.6 0 0 0.0 5.5 0f37f0e0 (66)
r/s w/s kr/s kw/s ractv wactv read_t writ_t %r %w d/s del_t zone
237.1 1.3 576.5 0.3 0.0 0.0 4.4 53.3 0 0 0.0 0.0 global (0)
1733.5 38.6 8285.1 90.4 0.0 0.0 14.6 20.1 2 0 0.0 0.0 42ef02d8 (6)
1.1 1.8 0.1 0.2 0.0 0.0 11.5 40.2 0 0 0.0 0.0 c3de353c (13)
0.2 0.0 0.0 0.0 0.0 0.0 28.7 0.0 0 0 0.0 0.0 7d11de51 (34)
r/s w/s kr/s kw/s ractv wactv read_t writ_t %r %w d/s del_t zone
240.2 2.0 584.2 0.5 0.0 0.0 3.6 45.9 0 0 0.0 0.0 global (0)
898.1 113.3 3533.2 315.1 0.0 0.0 14.2 17.3 1 0 0.0 0.0 42ef02d8 (6)
0.2 0.0 0.0 0.0 0.0 0.0 20.5 0.0 0 0 0.0 0.0 929245ee (7)
0.2 0.0 0.0 0.0 0.0 0.0 23.8 0.0 0 0 0.0 0.0 708ec2c4 (12)
1.7 1.8 0.1 0.3 0.0 0.0 13.3 46.4 0 0 0.0 0.0 c3de353c (13)
0.7 0.4 0.6 0.1 0.0 0.0 17.0 15.1 0 0 0.0 0.0 b89cbc9d (62)
r/s w/s kr/s kw/s ractv wactv read_t writ_t %r %w d/s del_t zone
245.6 3.3 666.2 1.2 0.0 0.0 3.9 56.3 0 0 0.0 0.0 global (0)
1317.7 27.5 6121.2 74.7 0.0 0.0 16.5 19.4 1 0 0.0 0.0 42ef02d8 (6)
1.1 3.9 0.1 0.6 0.0 0.0 11.2 38.4 0 0 0.0 0.0 c3de353c (13)
r/s w/s kr/s kw/s ractv wactv read_t writ_t %r %w d/s del_t zone
241.6 1.7 587.8 0.4 0.0 0.0 4.1 54.2 0 0 0.0 0.0 global (0)
1533.1 45.6 6427.7 106.1 0.0 0.0 13.9 21.6 1 0 0.0 0.0 42ef02d8 (6)
1.9 2.4 0.1 0.4 0.0 0.0 11.6 47.3 0 0 0.0 0.0 c3de353c (13)
0.2 0.0 0.0 0.0 0.0 0.0 34.0 0.0 0 0 0.0 0.0 2d716614 (31)
In the output above you will see five sets of data, each preceded by a header line:
r/s w/s kr/s kw/s ractv wactv read_t writ_t %r %w d/s del_t zone
The key fields to look at are d/s and del_t, which show the number of operations per second that are being delayed by the I/O throttle and the average length of each delay, in microseconds.
The first set of data is historical (averaged since boot) and can be safely ignored; what we want to review are the remaining four sets.
Based on the data above, we can see some activity on this compute node, especially in the zone with UUID 42ef02d8. However, no disk throttling is taking place currently (although we can see from the historical data that zones on this compute node have been throttled in the past).
For more detailed information on how Triton manages disk throttling, please see Our ZFS IO Throttle.
Compute node ARC cache
The ARC, or Adaptive Replacement Cache, is used by Triton to improve file system and disk performance, with the goal of driving down overall system latency. In order to function smoothly, ARC requires a percentage of the available RAM in the system. This value, which can be set on the compute node level via the reservation ratio
value, is typically between 10-15% of the available RAM on the compute node.
Systems that have a low amount of ARC available will tend to exhibit symptoms of poor performance, such as stalls and overall lack of disk responsiveness.
The utilization and performance of the ARC cache can be viewed by using the arcstat perl script.
A list of options and field definitions can be viewed by running arcstat with the -v flag:
# arcstat -v
Usage: arcstat [-hvx] [-f fields] [-o file] [interval [count]]
Field definitions are as follows:
mtxmis : mutex_miss per second
arcsz : ARC Size
mrug : MRU Ghost List hits per second
l2hit% : L2ARC access hit percentage
mh% : Metadata hit percentage
l2miss% : L2ARC access miss percentage
read : Total ARC accesses per second
c : ARC Target Size
mfug : MFU Ghost List hits per second
miss : ARC misses per second
dm% : Demand Data miss percentage
dhit : Demand Data hits per second
pread : Prefetch accesses per second
dread : Demand data accesses per second
pmis : Prefetch misses per second
l2miss : L2ARC misses per second
time : Time
l2bytes : bytes read per second from the L2ARC
pm% : Prefetch miss percentage
mm% : Metadata miss percentage
hits : ARC reads per second
mfu : MFU List hits per second
l2read : Total L2ARC accesses per second
mmis : Metadata misses per second
rmis : recycle_miss per second
mhit : Metadata hits per second
dmis : Demand Data misses per second
mru : MRU List hits per second
ph% : Prefetch hits percentage
eskip : evict_skip per second
l2size : Size of the L2ARC
l2hits : L2ARC hits per second
hit% : ARC Hit percentage
miss% : ARC miss percentage
dh% : Demand Data hit percentage
mread : Metadata accesses per second
phit : Prefetch hits per second
In the example below, we see that the compute node we are viewing currently has 14GB allocated to ARC:
# arcstat 5 5
time read miss miss% dmis dm% pmis pm% mmis mm% arcsz c
14:23:03 0 0 0 0 0 0 0 0 0 14G 14G
14:23:08 44 1 2 1 2 0 0 0 0 14G 14G
14:23:13 224 1 0 1 0 0 0 0 0 14G 14G
14:23:18 125 0 0 0 0 0 0 0 0 14G 14G
14:23:23 81 0 0 0 0 0 0 0 0 14G 14G
Based on the stats, we can see this is a fairly quiet compute node. Key areas to examine are:
- Total size of ARC (arcsz) versus ARC target size (c). An undersized ARC will result in performance problems.
- Total ARC accesses per second (read). Shows how active ARC is.
- ARC misses per second (miss); how often a request is not satisfied by data in ARC.
- ARC miss percentage (miss%); the percentage of the time a request is not satisfied by data in ARC.
It is possible to customize the fields shown by arcstat as shown below.
# arcstat -f arcsz,read,hit%,miss% 5 5
arcsz read hit% miss%
14G 0 0 0
14G 1.3K 99 0
14G 60 100 0
14G 71 99 0
14G 50 95 4
The example above shows us the Total ARC reads, the ARC hit percentage, and the ARC miss percentage. Again, this shows that the compute node in question is performing very well.
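The same underlying counters can also be read directly with kstat; a minimal sketch using the standard ZFS arcstats kstats (sizes are in bytes, and the hit/miss counters are cumulative since boot):
# kstat -p zfs:0:arcstats:size zfs:0:arcstats:c
# kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses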
For an in-depth discussion of ARC, please see Activity of the ZFS ARC.
Compute node memory utilization
The memory on a compute node is used for provisioned instances (virtual machines), ARC, and the running OS. Although the bulk of the memory is reserved for instances, the compute node needs sufficient free memory for both the ARC and the OS in order to function optimally.
ARC and memory
Please see the section above entitled Compute Node ARC Cache for information on how compute node memory is used for ARC.
Using zonememstat
The zonememstat(1m) command is used to show memory utilization by zone on a compute node.
This command can be run in three ways:
Show all zones and their usage:
```
[jschmidt@30Q7HS1 (us-east-1) ~]$ zonememstat
ZONE RSS(MB) CAP(MB) NOVER POUT(MB)
global 1179 - - -
f45b2cc1-4bec-c504-8661-bcb984796771 42 256 0 0
52ea67c4-65d6-4072-8442-8a035727ae85 68 256 0 0
7352a1d5-aff7-c206-f3d9-d5992ba5d713 66 256 0 0
dce9c215-2200-c27a-c339-94a8af9b2706 49 256 0 0
42ef02d8-1f25-4636-b964-cf054691d3f2 73 16384 0 0
929245ee-fcee-676d-fad7-c80c72e73487 252 256 12733 7821
446966ef-c957-ebfb-f667-f8ddafec7075 163 640 0 0
3496feff-1f80-c176-dd10-a7e36b53482e 36 4096 0 0
e9e4589b-4895-463c-e37a-d4a3720b900b 138 640 0 0
211a5737-e006-cb83-bca2-aa07e7d16411 226 256 2059 17171
708ec2c4-6bb0-4570-eed2-96de1ac4413f 24 256 0 0
c3de353c-5001-67ea-e6d9-d1a08eafc046 97 4096 0 0
3ca31d30-1613-404b-9db1-0dea9c5228fc 8309 9216 0 0
119f6ffc-6fde-4cf7-80ba-daa176a33da6 1863 2048 0 0
e25f589d-0fd9-6008-b27b-d088d9c6333c 295 512 0 0
dbaa3a0d-8254-4fa3-8869-bf596a9283b8 4141 4352 0 0
d451978f-a325-618c-ae03-b765657ce82d 17695 17792 0 0
f0cd9950-f6ab-e8b5-c2c1-fa54c408f297 1848 2048 0 0
b5020965-b563-cc4e-9d9d-8b83a54d3348 1867 2048 0 0
2d716614-0610-cc15-a2f3-b23a001d9e84 47 256 0 0
7d11de51-2a4d-6a89-a284-e1d0e4211f0c 47 4096 0 0
9911693c-af0d-6b05-9e16-ff0826b8ce32 285 512 0 0
914ed552-6aa6-e6dc-f380-9594645f96ac 2126 2304 0 0
529c4dc3-d470-e168-fa57-c00cf6c2da21 1073 1280 0 0
b89cbc9d-4948-6185-d6c3-b2f6661c9b7f 48 256 0 0
b94f668a-01b6-cc70-e0b0-d7934bdfda25 7739 7936 0 0
4c827f59-3318-c24f-c86c-d426c08a121a 38 256 0 0
0f37f0e0-8c93-4f5f-bd9d-135a173bb901 73 1024 0 0
```
Show only zones that have gone over their cap:
```
# zonememstat -o
ZONE RSS(MB) CAP(MB) NOVER POUT(MB)
929245ee-fcee-676d-fad7-c80c72e73487 197 256 12733 7821
211a5737-e006-cb83-bca2-aa07e7d16411 226 256 2059 17171
```
Show only one zone:
```
# zonememstat -z 929245ee-fcee-676d-fad7-c80c72e73487
ZONE RSS(MB) CAP(MB) NOVER POUT(MB)
929245ee-fcee-676d-fad7-c80c72e73487 233 256 12733 7821
```
The key fields to look at are:
Field | Meaning |
---|---|
ZONE | The UUID (name) of the zone |
RSS(MB) | The current RSS (resident set size) of the zone, in MB |
CAP(MB) | The memory cap of the zone, in MB |
NOVER | The number of times the zone has gone over the memory cap since booting |
POUT(MB) | The amount of memory paged out since boot as a result of the zone exceeding its memory cap, in MB |
Looking at our example above, we can see that zone 929245ee-fcee-676d-fad7-c80c72e73487 has a 256MB cap and is currently using 233MB of memory. It has gone over its cap 12,733 times since the instance was booted, paging out a total of roughly 7.8GB. This instance should likely be investigated, and a resize suggested to the instance owner.
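To see which processes inside that instance are consuming the memory, one approach (a sketch, not the only method) is to run prstat from the compute node's global zone, restricted to that zone and sorted by resident set size:
# prstat -z 929245ee-fcee-676d-fad7-c80c72e73487 -s rss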
Zone memory usage calculation
For additional information on how memory usage/capping works in Triton and SmartOS, please see About Calculating Memory Usage and Capping.
Compute node CPU utilization
Although it does show you how to investigate at the zone/instance level, the documentation below primarily concerns overall CPU utilization on the compute node; it does not discuss the Fair Share scheduler and how it manages CPU resources. For more information on Fair Share please see About CPU Usage on our wiki.
Load averages
The uptime(1) command can be used to provide a quick overview of the server's activity. The numbers reported by this command provide a rough count of the number of processes currently executing plus the number of processes in the run queue. The first number is the average over the last minute, the second over the last 5 minutes, and the third over the last 15 minutes.
Although a high load average may be evidence of problems, it is not enough by itself to diagnose a problem. For that you will need to run additional diagnostics.
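A hypothetical example (the load averages shown are illustrative only):
# uptime
 10:15am  up 45 day(s),  3:02,  1 user,  load average: 3.97, 5.51, 9.82
Here the 1-minute average (3.97) is well below the 15-minute average (9.82), suggesting that whatever load spike occurred is already subsiding.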
The prstat command
SmartOS uses prstat(1M) instead of top; it understands SmartOS better and has lower overhead. The man page contains a great deal of detail on the various options available, but the most common invocations are shown below. The first example shows per-process activity followed by a per-zone summary, as produced when prstat is run with the -Z flag:
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
15971 root 7749M 7737M cpu5 1 0 108:26:25 5.4% qemu-system-x86/11
16622 root 8320M 8307M cpu7 59 0 423:50:23 2.9% qemu-system-x86/10
82497 103 91M 90M cpu10 4 0 1:44:35 0.8% node/5
15554 root 1876M 1861M sleep 59 0 96:05:36 0.6% qemu-system-x86/5
82499 103 176M 135M sleep 14 0 3:46:18 0.4% node/7
15755 root 4150M 4138M sleep 1 0 82:44:55 0.3% qemu-system-x86/5
16429 root 17G 17G sleep 59 0 17:43:16 0.1% qemu-system-x86/5
16483 root 1878M 1865M sleep 59 0 14:35:47 0.1% qemu-system-x86/6
3133 root 180M 151M sleep 1 0 6:56:48 0.1% node/7
85 root 0K 0K sleep 99 -20 86:20:47 0.1% zpool-zones/166
86339 root 1084M 1071M sleep 59 0 3:41:59 0.1% qemu-system-x86/4
64300 root 2137M 2124M sleep 59 0 4:35:36 0.1% qemu-system-x86/5
15838 root 1861M 1846M sleep 59 0 8:53:51 0.0% qemu-system-x86/4
80341 root 296M 283M sleep 59 0 2:46:39 0.0% qemu-system-x86/4
42078 root 28M 20M sleep 1 0 0:09:07 0.0% nscd/35
16065 root 305M 292M sleep 59 0 7:37:14 0.0% qemu-system-x86/4
34728 zabbix 6152K 3944K sleep 59 0 4:06:40 0.0% zabbix_agentd/1
12526 root 54M 36M sleep 58 0 2:20:22 0.0% node/7
18025 root 52M 25M sleep 59 0 4:39:01 0.0% node/7
11053 zabbix 5592K 3184K sleep 59 0 2:44:30 0.0% zabbix_agentd/1
72139 829 210M 7708K sleep 59 0 1:29:42 0.0% mongod/11
16466 829 210M 7588K sleep 59 0 1:22:50 0.0% mongod/11
22362 829 210M 8500K sleep 59 0 0:37:48 0.0% mongod/11
17703 829 211M 8440K sleep 59 0 1:48:13 0.0% mongod/11
17856 829 211M 7372K sleep 59 0 1:44:26 0.0% mongod/11
7185 jschmidt 4732K 3984K cpu15 59 0 0:00:00 0.0% prstat/1
34733 zabbix 6192K 2940K sleep 59 0 0:55:59 0.0% zabbix_agentd/1
4306 root 70M 48M sleep 1 0 2:29:19 0.0% node/7
16418 root 1852K 904K sleep 59 0 0:00:00 0.0% pfexecd/2
4292 root 2144K 772K sleep 1 0 0:00:00 0.0% iscsid/2
ZONEID NPROC SWAP RSS MEMORY TIME CPU ZONE
63 2 7749M 7737M 7.9% 108:26:25 5.4% b94f668a-01b6-cc70-e0b0-d79*
15 2 8320M 8307M 8.5% 423:50:23 2.9% 3ca31d30-1613-404b-9db1-0de*
7 18 327M 249M 0.2% 5:40:51 1.2% 929245ee-fcee-676d-fad7-c80*
16 2 1876M 1861M 1.9% 96:05:36 0.6% 119f6ffc-6fde-4cf7-80ba-daa*
18 2 4150M 4138M 4.2% 82:44:55 0.3% dbaa3a0d-8254-4fa3-8869-bf5*
0 116 2047M 1373M 1.2% 174:06:38 0.3% global
20 2 17G 17G 18% 17:43:16 0.1% d451978f-a325-618c-ae03-b76*
22 2 1878M 1865M 1.9% 14:35:47 0.1% b5020965-b563-cc4e-9d9d-8b8*
56 2 1084M 1071M 1.1% 3:41:59 0.1% 529c4dc3-d470-e168-fa57-c00*
55 2 2137M 2124M 2.2% 4:35:36 0.1% 914ed552-6aa6-e6dc-f380-959*
21 2 1861M 1846M 1.9% 8:53:51 0.0% f0cd9950-f6ab-e8b5-c2c1-fa5*
Total: 488 processes, 2310 lwps, load averages: 3.97, 5.51, 9.82
To restrict the output to a single zone, run prstat with the -z flag, given the zone ID or UUID of the instance you are interested in:
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
22362 829 210M 8500K sleep 59 0 0:37:48 0.0% mongod/11
22189 root 14M 10M sleep 1 0 0:00:19 0.0% nscd/26
22364 root 2280K 1084K sleep 59 0 0:00:00 0.0% ttymon/1
22370 root 6528K 2640K sleep 58 0 0:00:00 0.0% inetd/3
21852 root 8584K 7012K sleep 43 0 0:00:18 0.0% svc.configd/13
21744 root 0K 0K sleep 60 - 0:00:00 0.0% zsched/1
22363 root 2144K 1056K sleep 59 0 0:00:00 0.0% sac/1
21810 root 2680K 1652K sleep 59 0 0:00:01 0.0% init/1
21850 root 6244K 4296K sleep 59 0 0:00:09 0.0% svc.startd/13
22405 root 4344K 1776K sleep 59 0 0:00:06 0.0% sshd/1
22252 root 2492K 1260K sleep 59 0 0:00:00 0.0% pfexecd/3
22371 root 1940K 1024K sleep 59 0 0:00:00 0.0% ttymon/1
22359 root 6072K 3944K sleep 59 0 0:00:17 0.0% rsyslogd/6
22355 root 1872K 1068K sleep 59 0 0:00:00 0.0% cron/1
22369 root 1652K 816K sleep 59 0 0:00:01 0.0% utmpd/1
21916 netadm 4448K 3124K sleep 59 0 0:00:05 0.0% ipmgmtd/4
Total: 16 processes, 87 lwps, load averages: 3.66, 5.25, 9.55
Adding the -mL flags produces output that includes microstate accounting information and information on LWPs (lightweight processes).
The -c flag tells prstat to print each new set of information below the previous set. This is ideal for writing prstat information to a log file, for example.
# prstat -mLc
Please wait...
PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
15971 root 0.7 20 0.0 0.0 0.0 0.2 74 4.6 1K 542 9K 6 qemu-system-/3
16622 root 9.7 8.1 0.0 0.0 0.0 0.3 79 2.5 19K 71 .1M 0 qemu-system-/1
82499 103 16 0.9 0.1 0.0 0.0 0.0 73 10 111 645 976 0 node/1
15971 root 0.6 16 0.0 0.0 0.0 0.1 79 3.9 1K 283 6K 5 qemu-system-/4
7334 jschmidt 2.9 13 0.0 0.0 0.0 0.0 84 0.0 41 41 49K 0 prstat/1
16622 root 0.3 8.7 0.0 0.0 0.0 0.2 87 4.2 2K 260 2K 0 qemu-system-/3
16622 root 0.3 8.7 0.0 0.0 0.0 0.2 87 4.2 2K 261 2K 0 qemu-system-/5
16622 root 0.3 8.7 0.0 0.0 0.0 0.2 87 4.2 2K 267 2K 0 qemu-system-/4
16622 root 0.3 8.7 0.0 0.0 0.0 0.2 87 4.2 2K 262 2K 0 qemu-system-/6
15971 root 4.9 3.6 0.0 0.0 0.0 0.2 89 2.6 10K 84 81K 0 qemu-system-/1
82497 103 8.1 0.0 0.0 0.0 0.0 0.0 90 2.2 9 112 100 0 forza/1
15554 root 3.5 3.0 0.0 0.0 0.0 0.0 92 1.8 7K 54 58K 0 qemu-system-/1
5155 root 0.0 6.3 0.0 0.0 0.0 94 0.1 0.0 29 2 302 0 zoneadmd/5
15755 root 3.6 2.1 0.0 0.0 0.0 0.0 91 2.9 6K 200 44K 0 qemu-system-/1
15554 root 0.2 4.5 0.0 0.0 0.0 0.0 94 0.8 1K 47 1K 0 qemu-system-/3
Total: 489 processes, 2334 lwps, load averages: 3.39, 4.73, 8.93
PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
15971 root 2.0 58 0.0 0.0 0.0 0.1 40 0.4 4K 288 24K 5 qemu-system-/3
15971 root 1.3 53 0.0 0.0 0.0 0.1 45 0.3 4K 88 17K 8 qemu-system-/4
82499 103 37 2.5 0.0 0.0 0.0 0.0 44 16 256 573 2K 0 node/1
16622 root 12 9.5 0.0 0.0 0.0 0.0 78 1.2 22K 24 .2M 0 qemu-system-/1
82497 103 18 0.0 0.0 0.0 0.0 0.0 78 3.5 9 150 109 0 forza/1
16622 root 0.5 12 0.0 0.0 0.0 0.1 87 0.2 3K 3 3K 0 qemu-system-/3
16622 root 0.5 12 0.0 0.0 0.0 0.2 87 0.4 3K 217 3K 0 qemu-system-/6
16622 root 0.5 11 0.0 0.0 0.0 0.2 88 0.4 3K 261 3K 0 qemu-system-/4
15971 root 6.1 4.8 0.0 0.0 0.0 0.0 88 1.0 12K 92 .1M 0 qemu-system-/1
16622 root 0.4 10 0.0 0.0 0.0 0.1 89 0.2 3K 7 3K 0 qemu-system-/5
15554 root 4.4 3.8 0.0 0.0 0.0 0.0 91 0.6 9K 9 74K 0 qemu-system-/1
15554 root 0.4 5.9 0.0 0.0 0.0 0.1 93 0.2 2K 27 3K 0 qemu-system-/3
15755 root 3.7 1.8 0.0 0.0 0.0 0.0 94 0.5 5K 118 40K 0 qemu-system-/1
16429 root 0.0 4.4 0.0 0.0 0.0 0.0 96 0.0 71 16 56 0 qemu-system-/4
16622 root 0.9 2.5 0.0 0.0 0.0 0.0 95 1.2 18K 5 36K 18K qemu-system-/7
Total: 489 processes, 2319 lwps, load averages: 3.39, 4.71, 8.90
The kstat command
The kstat utility permits you to read available kernel statistics; to access this facility you can call kstat(1M) with a set of criteria which it will then try to match against available statistics. Data that matches is printed with its module, instance, and name fields, as well as its actual value.
To view information on all zones, use kstat -p caps::cpucaps_zone_*
# kstat -p caps::cpucaps_zone_*
caps:2:cpucaps_zone_2:above_base_sec 52
caps:2:cpucaps_zone_2:above_sec 22
caps:2:cpucaps_zone_2:baseline 7
caps:2:cpucaps_zone_2:below_sec 2659222
caps:2:cpucaps_zone_2:burst_limit_sec 0
caps:2:cpucaps_zone_2:bursting_sec 0
caps:2:cpucaps_zone_2:class zone_caps
caps:2:cpucaps_zone_2:crtime 2007.585090139
caps:2:cpucaps_zone_2:effective 50
caps:2:cpucaps_zone_2:maxusage 65
caps:2:cpucaps_zone_2:nwait 0
caps:2:cpucaps_zone_2:snaptime 2661232.852473399
caps:2:cpucaps_zone_2:usage 0
caps:2:cpucaps_zone_2:value 50
caps:2:cpucaps_zone_2:zonename f45b2cc1-4bec-c504-8661-bcb984796771
<-------- SNIP -------->
caps:66:cpucaps_zone_66:above_base_sec 20
caps:66:cpucaps_zone_66:above_sec 1
caps:66:cpucaps_zone_66:baseline 30
caps:66:cpucaps_zone_66:below_sec 146678
caps:66:cpucaps_zone_66:burst_limit_sec 0
caps:66:cpucaps_zone_66:bursting_sec 0
caps:66:cpucaps_zone_66:class zone_caps
caps:66:cpucaps_zone_66:crtime 2514572.257500655
caps:66:cpucaps_zone_66:effective 200
caps:66:cpucaps_zone_66:maxusage 217
caps:66:cpucaps_zone_66:nwait 0
caps:66:cpucaps_zone_66:snaptime 2661241.997279293
caps:66:cpucaps_zone_66:usage 0
caps:66:cpucaps_zone_66:value 200
caps:66:cpucaps_zone_66:zonename 0f37f0e0-8c93-4f5f-bd9d-135a173bb901
Viewing information on a specific zone is a two step process.
- Get the zone_id by running zoneadm list -v and then grep'ing for the UUID of the zone you want information on.
# zoneadm list -v | grep f0cd9950-f6ab-e8b5-c2c1-fa54c408f297
21 f0cd9950-f6ab-e8b5-c2c1-fa54c408f297 running /zones/f0cd9950-f6ab-e8b5-c2c1-fa54c408f297 kvm excl
- Call kstat with the zone_id from above.
# kstat -p caps::cpucaps_zone_21
caps:21:cpucaps_zone_21:above_base_sec 0
caps:21:cpucaps_zone_21:above_sec 5
caps:21:cpucaps_zone_21:baseline 0
caps:21:cpucaps_zone_21:below_sec 2659508
caps:21:cpucaps_zone_21:burst_limit_sec 0
caps:21:cpucaps_zone_21:bursting_sec 0
caps:21:cpucaps_zone_21:class zone_caps
caps:21:cpucaps_zone_21:crtime 2031.315195651
caps:21:cpucaps_zone_21:effective 100
caps:21:cpucaps_zone_21:maxusage 162
caps:21:cpucaps_zone_21:nwait 0
caps:21:cpucaps_zone_21:snaptime 2661537.715075964
caps:21:cpucaps_zone_21:usage 1
caps:21:cpucaps_zone_21:value 100
caps:21:cpucaps_zone_21:zonename f0cd9950-f6ab-e8b5-c2c1-fa54c408f297
The key values you want to check are listed in the table below.
Name | Description |
---|---|
usage | current CPU usage |
maxusage | high watermark of CPU usage |
value | CPU cap value. This is the most CPU the instance can use while bursting. |
baseline | CPU minimum value. This is the guaranteed minimum CPU usage for the instance. |
above_base_sec | Number of seconds the instance was bursting (above baseline) |
The percentage is the total across all CPUs (psrinfo). So, a value of 200 is equivalent to 2 virtual CPUs (a virtual CPU is either a core or a hyper-thread).
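As a worked example using the kstat output shown earlier: zone 66 has value 200 and baseline 30, meaning the instance may burst up to the equivalent of two full CPUs but is guaranteed roughly 30% of one CPU, while zone 2 has value 50 and is capped at half of one CPU.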
Compute node hard drive failures
Hard drive failures on compute nodes affect all instances that live on the server. Three key tools to help troubleshoot and identify potential problems with the underlying storage are iostat(1m), kstat(1m), and zpool(1m).
Verifying failures using iostat / kstat
For a quick look at potential disk errors, you can run the iostat(1m) command with the -En option as shown below:
# iostat -En
c0t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: Generic Product: STORAGE DEVICE Revision: 9451 Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c1t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: Kingston Product: DataTraveler 2.0 Revision: PMAP Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 82 Predictive Failure Analysis: 0
c2t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: WDC WD10EZEX-00K Revision: 1H15 Serial No: WD-WCC1S5975038
Size: 1000.20GB <1000204886016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 116 Predictive Failure Analysis: 0
c2t1d0 Soft Errors: 0 Hard Errors: 104 Transport Errors: 0
Vendor: ATA Product: WDC WD5000AVVS-0 Revision: 1B01 Serial No: WD-WCASU7437291
Size: 500.11GB <500107862016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 121 Predictive Failure Analysis: 0
c2t2d0 Soft Errors: 0 Hard Errors: 26 Transport Errors: 0
Vendor: HL-DT-ST Product: DVDRAM GH24NS95 Revision: RN01 Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 26 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
In the example above, device c2t1d0 is showing 104 hard errors. This is indicative of a potential failure, and the disk should be monitored closely for impending failure.
The values reported by iostat -E can also be pulled from the kstat command:
[root@headnode (cak-1) ~]# kstat -n sd3,err
module: sderr instance: 3
name: sd3,err class: device_error
crtime 48.741894257
Device Not Ready 0
Hard Errors 104
Illegal Request 62
Media Error 0
No Device 0
Predictive Failure Analysis 0
Product WDC WD5000AVVS-09
Recoverable 0
Revision 1B01
Serial No WD-WCASU7437291
Size 500107862016
snaptime 766415.722265268
Soft Errors 0
Transport Errors 0
Vendor ATA
Verifying the status of zpools
The zpool command manages and configures ZFS storage pools, which are collections of virtual devices (physical drives or LUNs) that are provided to ZFS datasets (zones). The actual physical configuration of the disks comprising your zpool will depend on numerous factors, including your hardware vendor, the type and number of disks, the presence of a RAID card, etc.
To view the health and status of storage pools, you can run the following zpool command:
# zpool status
pool: zones
state: ONLINE
scan: none requested
config:
NAME STATE READ WRITE CKSUM
zones ONLINE 0 0 0
c2t0d0 ONLINE 0 0 0
cache
c2t1d0 ONLINE 0 0 0
errors: No known data errors
The above output indicates a healthy storage pool with no errors or disk maintenance activities in progress.
The output below shows a zpool that is currently in a degraded state.
# zpool status
pool: zones
state: DEGRADED
status: One or more devices is in an offline state.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: none requested
config:
NAME STATE READ WRITE CKSUM
zones DEGRADED 0 0 0
mirror-0 ONLINE 0 0 0
c2t0d0 ONLINE 0 0 0
c2t1d0 ONLINE 0 0 0
mirror-1 DEGRADED 0 0 0
c3t0d0 OFFLINE 0 0 0
c3t1d0 ONLINE 0 0 0
spares
c4t0d0 AVAIL
c4t1d0 AVAIL
As you can see from the output above, instructions on how to correct this issue are provided. In this case we are able to simply online the device that is showing as offline.
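For example, to bring the offline device in the pool above back online (or, if the disk had actually failed, to replace it with a new disk, shown here with the hypothetical device c5t0d0):
# zpool online zones c3t0d0
# zpool replace zones c3t0d0 c5t0d0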
# zpool status
pool: zones
state: ONLINE
scan: resilvered 143K in 0h0m with 0 errors on Fri Jun 27 13:30:37 2014
config:
NAME STATE READ WRITE CKSUM
zones ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c2t0d0 ONLINE 0 0 0
c2t1d0 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
c3t0d0 ONLINE 0 0 0
c3t1d0 ONLINE 0 0 0
spares
c4t0d0 AVAIL
c4t1d0 AVAIL
errors: No known data errors
The zpool is now showing as healthy; the scan line shows us that the resilvering process to bring the device we onlined back to a functional state completed without errors.
Recovering a compute node in Triton
In extreme cases where a hardware fault renders a compute node non-functional, it is generally possible to recover the compute node. To recover a failed compute node, please see recovering a compute node in Triton.
Troubleshooting manatee
Manatee is an automated failover, fault monitoring and leader-election system built for managing a set of replicated Postgres servers. It is written completely in Node.js, and is used by Triton for the storage of persistent data. Please see Manatee overview and troubleshooting manatee for more information on manatee, how to check the health of manatee, and how to recover from failure modes with manatee.
Troubleshooting Cloud Firewall (FWAPI)
Triton comes with Cloud Firewall, a service that manages firewalls within the installation for infrastructure containers running SmartOS. Please see Troubleshooting Cloud Firewall for more information on Cloud Firewall (FWAPI) and how to troubleshoot it.
Troubleshooting virtual machines / instances
Instance (or Virtual Machine) level troubleshooting will require access to the instance in question. This access can come in one of four ways:
- Connecting to the instance via SSH (for SmartOS, Linux, or FreeBSD based instances).
- Connecting to the instance via RDP (for Windows based instances).
- Connecting to the instance via zlogin(1) (for containers running SmartOS).
- Connecting via VNC console access (for Windows, Linux, or FreeBSD based instances). This is the only option for hardware virtual machines that are not reachable via the networks defined for the instance.
Most of these troubleshooting processes will require root (or root equivalent) access to the virtual machine / instance in question.
Checking instance disk usage
There are numerous tools that can be used to manage containers and hardware virtual machines in Triton. Please see instance disk usage for a discussion of those tools as well as some of the key differences between how disk space is allocated, used, and managed between the two types of instances.
Checking instance networking
The analysis of network utilization inside instances varies by instance type.
Instance Type | Tools |
---|---|
SmartOS | netstat(1m), netcat, nmap |
Linux | netstat(8), netcat, nmap |
FreeBSD | netstat(1), netcat, nmap |
Windows | Performance Monitor |
Triton provides tools to view and manage instance level networking in order to troubleshoot, diagnose, and solve problems at the instance level. For more information please see checking instance networking.
Checking instance CPU usage
The analysis of CPU utilization inside instances varies by instance type. The links in the table below provide additional information on the listed tools.
Instance Type | Tools |
---|---|
SmartOS | prstat(1m), kstat(1m) |
Linux | top(1) |
FreeBSD | top(1) |
Windows | Performance Monitor |
Checking instance memory usage
The analysis of memory utilization inside instances varies by instance type. The links in the table below provide additional information on the listed tools.
Instance Type | Tools |
---|---|
SmartOS | prstat(1m), kstat(1m) |
Linux | top(1) |
FreeBSD | top(1) |
Windows | Performance Monitor |
Clearing an instance stuck in “provisioning”
There are certain job failure modes which will leave an instance stuck in a provisioning state. In order to clear this, you will need to follow the procedure outlined in the document Clearing an instance stuck in “provisioning”.
Forcing a re-sync of VMAPI for an instance
There are certain rare job failure modes which can cause the output of vmadm(1m) on a compute node and the data contained in VMAPI to differ. It is possible to force VMAPI to synchronize with the actual state of the instance as reported by vmadm. To do this, you will need to follow the procedure outlined in the document Forcing a re-sync of VMAPI.
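Before forcing a re-sync, you can confirm the mismatch yourself; a rough sketch, where $uuid is a placeholder for the instance UUID:
On the compute node:
# vmadm get $uuid
On the head node:
# sdc-vmapi /vms/$uuid
Compare fields such as state in the two JSON documents; if they differ, a re-sync is warranted.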
Advanced troubleshooting
The following section outlines more advanced troubleshooting procedures.
General provision troubleshooting
At times, the provisioning process will fail. When this happens, the cause for the failure should be investigated. Please see provision troubleshooting for more information.
Changing NTP and/or DNS servers post install
The DNS and NTP settings that are configured at installation time for Triton are intended to remain static. However, if your circumstances dictate that either or both of these must be changed, it can be achieved as described in the document entitled Changing global NTP and DNS settings post configuration.
Sending files to MNX
In the case of severe or chronic problems, MNX support may request that you submit a crash dump or support bundle. These files provide valuable information that is used by MNX support and Engineering to investigate the particular issue you are experiencing.
Generating a crash dump
Any time a system becomes nonfunctional and has to be rebooted, MNX support requires that a crash dump be written. This can be done via the console, or via the generation of an NMI (non-maskable interrupt). Without a crash dump, diagnosing system hangs is effectively impossible.
Additionally, there may be other situations where you are asked to generate a crash dump in order to supply MNX support with information to help debug an issue you are experiencing.
To generate a crash dump, please see generating a crash dump.
Sending support bundles to MNX
Triton DataCenter includes a feature to create and optionally send a support bundle to MNX support to help troubleshoot problems with your installation. This bundle includes a number of logs and configuration files from your Triton DataCenter installation. To generate and send a support bundle to MNX support, please see sending support bundles to MNX support.
Additional resources
- Triton DataCenter offers several different training courses to enable you to get the most out of SmartOS and Triton. For more information, please go to Triton DataCenter Training Services.
- The USE Method is a framework for breaking down and solving complex performance problems, and can be applied at the Triton level as well as the instance level. See The USE Method for more information.