Analyzing zone memory utilization

Modified: 08 Sep 2022 04:28 UTC

At the lowest level all containers run within a zone provided by the host SmartOS hypervisor. This page explains some of the ways that the containers can be monitored at the zone level.

Monitoring and analyzing memory on zones

Memory utilization for zones can be monitored using tools that are available on the compute node. The most useful tools include (but are not limited to) zonememstat, prstat, and vmstat, each of which is covered below.

Using and understanding zonememstat

The zonememstat command provides an easy-to-read summary of a zone's memory usage, and offers a quick way to verify if a zone is running out of memory.

When zonememstat is run without any options from the global zone of a compute node, it returns memory utilization statistics for every zone (including HVM instances) on the server:

computenode# zonememstat
                                 ZONE  RSS(MB)  CAP(MB)    NOVER  POUT(MB)
                               global      789        -        -         -
 fea0641a-00c6-40f7-9e28-61a9403afd2e      107     1024        0         0
 b149e29d-5922-413e-8824-b2143445f2dc      902     8192        0         0
 b1733fdd-34cb-4cc2-b96c-988a42b133a1      176     2048        0         0
 993033e3-15d0-4061-8d87-8c57f0a72451     3852     4096        0         0
 f5f37f43-223a-4839-b6ea-5fc70527897c      232      512        0         0
 da7b4c70-2742-47a1-86d6-f8a9fe38d0ac     1741     1792       46      7111
 c176cf01-fa85-4d13-8ea1-6a59b443ac93     1648     8192        0         0
 78ef3709-7122-4b06-b997-17d0292082c4      877     4096      956      4275
 c16b63b5-792e-46d6-9fb4-02ec5e1be94b      263      512        0         0

It is important to note that for hardware virtualized containers (HVM instances), the RSS(MB) column shows the entire amount of allocated memory, whether or not that memory is in use. For SmartOS and Container Native Linux zones, it shows only the amount of memory the zone is using at that moment in time.

You can quickly see which zones on a compute node are running out of memory by running zonememstat with the -o option as shown below:

computenode# zonememstat -o
                                 ZONE  RSS(MB)  CAP(MB)    NOVER  POUT(MB)
 da7b4c70-2742-47a1-86d6-f8a9fe38d0ac     1808     1792       46      7111
 78ef3709-7122-4b06-b997-17d0292082c4     1456     4096      956      4275

The above example shows two zones that are running out of memory. The NOVER column provides the number of times the zone has gone over its cap, and the POUT(MB) column indicates the total amount of memory that has been paged out when the zone has gone over its cap.

You can also look at memory utilization for a specific zone by using the -z option, as shown below:

computenode# zonememstat -z da7b4c70-2742-47a1-86d6-f8a9fe38d0ac

In general, if you see that a zone's RSS is approaching its cap and it has nonzero values in the NOVER and POUT(MB) columns, then the zone is likely running out of physical memory.
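As a minimal sketch of that check, the filter below scans zonememstat-style output for zones whose RSS is near their cap or that have gone over it. It assumes the column layout shown above (ZONE, RSS, CAP, NOVER, POUT); the function name and the 90% threshold are illustrative, not part of the tool.

```shell
# near_cap: flag zones close to (or over) their memory cap.
# Reads zonememstat-style columns (ZONE RSS CAP NOVER POUT) on stdin.
# The 90% threshold is an arbitrary example value; the global zone's
# "-" cap is skipped.
near_cap() {
    awk 'NR > 1 && $3 != "-" {
        if ($2 > 0.9 * $3 || $4 > 0)
            printf "%s: RSS %sMB of %sMB cap (NOVER=%s, POUT=%sMB)\n", $1, $2, $3, $4, $5
    }'
}
```

Run it in the global zone with `zonememstat | near_cap` to list only the zones that warrant a closer look.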

Using and understanding prstat

The prstat utility reports on active processes on a compute node, and can also be used to monitor and examine memory usage. Using prstat with the -Z option will give you a live snapshot of process and memory activity, broken out by zone.

Each process will report the RSS being used, and a general summary of activity by zone can be monitored at the bottom of the output.

computenode# prstat -Z

   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
  3152 root       72M   17M sleep  100    -   0:18:49 0.2% node/6
  3040 root       63M   54M sleep    1    0   0:06:40 0.2% node/7
 84656 root     4208K 3168K sleep    1    0   0:00:00 0.0% bash/1
    85 root        0K    0K sleep   99  -20   0:07:12 0.0% zpool-zones/229
 84651 root     6384K 3644K sleep    1    0   0:00:00 0.0% sshd/1
  3041 root      180M  145M sleep    1    0   0:04:02 0.0% node/3
  3365 root       20M   15M sleep    1    0   0:01:10 0.0% dtrace/1
  3051 root       62M   54M sleep    1    0   0:01:54 0.0% node/7
  2844 root     4212K 2932K sleep   29    0   0:00:00 0.0% picld/4
  3015 root     1948K 1432K sleep   59    0   0:00:00 0.0% ctrun/1
  3016 root       41M   34M sleep    1    0   0:00:01 0.0% node/7
  2874 root     2360K 1416K sleep    1    0   0:00:00 0.0% svc.ipfd/1
ZONEID    NPROC  SWAP   RSS MEMORY      TIME  CPU ZONE
     0       74  966M  665M    16%   0:43:23 0.6% global
     2       26  504M  167M   4.1%   0:00:23 0.0% 3ba2d253-6a5b-66af-aca7-f70*

The zone 3ba2d253-6a5b-66af-aca7-f70* in the output above is currently using 504M of swap and 167M of RSS. That zone is using only about 4.1% of the system's memory, and is therefore well under its physical memory cap.

Another useful view is provided by adding the -mLc options to prstat, as shown below:

computenode# prstat -mLc
Please wait...
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
 84969 root      21  79 0.1 0.0 0.0 0.0 0.0 0.0   0 330 .2M   0 prstat/1
  3152 root     0.7 0.1 0.0 0.0 0.0 0.0  99 0.0   4   0  44   0 node/1
  3040 root     0.1 0.2 0.0 0.0 0.0 0.0 100 0.0   8   0  60   0 node/1
  3041 root     0.0 0.1 0.0 0.0 0.0 0.0 100 0.0   3   0  26   0 node/1
  3032 root     0.0 0.1 0.0 0.0 0.0 0.0 100 0.0   2   0  25   0 node/1
  3051 root     0.0 0.1 0.0 0.0 0.0 0.0 100 0.0   2   0  17   0 node/1
  3365 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0  10   0  1K   0 dtrace/1
    85 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   9   0   0   0 zpool-zones/100
    85 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   9   0   0   0 zpool-zones/97
    85 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   9   0   0   0 zpool-zones/94
 84656 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0  50   0 bash/1
 84651 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   2   0  22   0 sshd/1
  3936 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   2   0   3   0 vmadmd/1
  3034 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   1   0 metadata/1
    85 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0  24   0   0   0 zpool-zones/131
Total: 100 processes, 628 lwps, load averages: 0.07, 0.07, 0.06

Specifically, a high percentage in the DFL column indicates that a process is spending much of its time handling data faults (paging), which is a sign that its zone is running out of memory.
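As a rough sketch of scanning for that condition, the filter below picks out threads with a high DFL percentage from prstat -mLc output. It assumes the column layout shown above, where DFL is the seventh field; the function name and the 5% threshold are illustrative.

```shell
# high_dfl: flag threads spending noticeable time on data faults.
# Reads prstat -mLc output on stdin; DFL is field 7 and PROCESS/LWPID
# is field 15 in the layout shown above. The 5% threshold is an
# arbitrary example value. Header and summary lines have non-numeric
# field 7 and are therefore skipped by the numeric comparison.
high_dfl() {
    awk '$7 + 0 > 5 { printf "%s (pid %s): DFL %s%%\n", $15, $1, $7 }'
}
```

A typical invocation would be `prstat -mLc 1 1 | high_dfl` to catch heavy pagers in a single sample.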

Similar to zonememstat, if you are only interested in monitoring memory usage for one specific zone, you can do so by specifying the zone uuid with the -z option:

computenode# prstat -Z -z 3ba2d253-6a5b-66af-aca7-f7083bf370d3

   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
 82793 901        10M 4592K sleep    1    0   0:00:00 0.0% pickup/1
  4469 root     5856K 2596K sleep    1    0   0:00:00 0.0% rsyslogd/6
  4472 root     4388K 1632K sleep   20    0   0:00:00 0.0% sshd/1
  4557 847        37M 2856K sleep    1    0   0:00:00 0.0% httpd/1
...

ZONEID    NPROC  SWAP   RSS MEMORY      TIME  CPU ZONE
     2       26  504M  167M   4.1%   0:00:23 0.0% 3ba2d253-6a5b-66af-aca7-f70*

Using and understanding vmstat

The vmstat tool provides a summary of the overall health of a system in terms of both CPU and memory usage.

The key columns you'll want to focus on for monitoring memory usage are:

Column   Meaning
swap     measure of virtual memory
free     measure of DRAM, the actual main memory of the system
sr       scan rate; the number of pages scanned
w        the count of threads that have been copied out to swap

The vmstat tool can be run as follows:

computenode# vmstat 1 5

The first number (1) is the interval in seconds at which you want to obtain your summary, and the second number (5) is the count, or number of intervals, you want returned. In other words, this requests an overall average of CPU and memory usage per second, for the next 5 seconds.

The following is an example of vmstat output:

computenode# vmstat 1 5
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr lf rm s0 s1   in   sy   cs us sy id
 0 0 0 4358320 323188 26 574 1  0  0  0  3  0 -245 58 19 3131 852 942  1  1 98
 0 0 0 4255148 217032 2  48  0  0  0  0  0  0  0  0 26 2797  686  645  0  0 100
 0 0 0 4255068 216964 51 462 0  0  0  0  0  0  0  0  0 2778 1496  592  0  1 98
 0 0 0 4255068 216968 0   2  0  0  0  0  0  0  0  0  0 2765  617  538  0  0 100
 0 0 0 4255068 216984 0   2  0  0  0  0  0  0  0  0  0 2769  605  537  0  0 100

The very first line reports the summary since boot, meaning that it reports the averages for the entire time that the system has been up. While in some cases, the first line can be ignored if all you care about is what the system is doing now, it can offer a good baseline in terms of how the system is trending (is it getting worse or better?).

As mentioned earlier, the key columns to pay attention to for memory usage are: swap, free, sr, and w.

If the swap column drops to 0, mallocs (requests for memory) start to fail. This can often bring applications requesting memory to their knees. In general, it can be alleviated by expanding virtual memory (i.e. adding more swap devices).

If the free column starts to get low, then the system will start paging and swapping, which will affect performance.

If you see indications of scanning under the sr column, then the system is suffering from severe memory pressure.

If you see a count of threads under the w column, then the system is suffering from extreme memory issues, and is likely to become entirely unresponsive.
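The conditions above can be checked mechanically. The sketch below watches vmstat output for nonzero sr and w values; it assumes the column layout shown earlier (w is field 3, sr is field 12) and skips the two header lines plus the since-boot summary line. The function name is illustrative.

```shell
# vmstat_warn: report the warning signs described above.
# Reads vmstat output on stdin. NR > 3 skips the two header lines and
# the first data line (the since-boot summary). In the layout shown
# earlier, field 3 is w (swapped-out threads) and field 12 is sr
# (scan rate).
vmstat_warn() {
    awk 'NR > 3 {
        if ($12 > 0) printf "scanning: sr=%s\n", $12
        if ($3 > 0)  printf "swapped-out threads: w=%s\n", $3
    }'
}
```

Running `vmstat 1 5 | vmstat_warn` produces output only when the system shows signs of memory pressure.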

Keep in mind that vmstat provides memory and CPU usage averages for the entire system (i.e. the compute node), and is not zone-aware.