NCC Health Check: pcvm_disk_usage_check

Description

The NCC health check pcvm_disk_usage_check verifies that the amount of disk or system partition usage in the Prism Central (PC) VM is within limits.

This check has the following parts:

  1. Checking the individual data disk usage (added in NCC 3.5.1):
    • If usage is more than 75% for several hours, a WARNING is returned to identify the disk.
    • If usage is more than 90% for several hours, a FAIL is returned to identify the disk.
       
  2. Checking the overall data disk usage (added in NCC 3.10.1):
    • If overall usage is more than 90% for several hours, a WARNING is returned.
       
  3. Checking the Prism Central VM system root partition usage (added in NCC 3.9.4):
    • If the usage is more than 95%, a FAIL is returned. No WARNING is returned for this partition.
     
  4. Checking the Prism Central VM home partition usage (added in NCC 3.9.4):
    • If the usage is more than 75%, a WARNING is returned.
    • If the usage is more than 90%, a FAIL is returned.
       
  5. Checking the Prism Central VM CMSP partition usage (added in NCC 3.10.1):
    • If usage is more than 75%, a WARNING is returned.
    • If the usage is more than 90%, a FAIL is returned.
       
  6. Checking the Prism Central VM Upgrade disk partition usage (added in NCC 4.6.0):
    • If the usage is more than 70%, a FAIL is returned.
    • This check runs every 5 minutes.
    • If there are more than 5 failures (30 minutes), a critical alert is raised.
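
The thresholds above can be summarized in a small shell sketch. This is illustrative only: classify_usage is a hypothetical helper, not part of NCC, and the partition names it accepts (data_disk, overall_data, root, home, cmsp, upgrade) and the PASS label are assumptions made for this example. It takes whole-number percentages and does not model the "several hours" persistence requirement or the alerting behavior.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of NCC: maps a usage percentage to the status
# this article describes for each partition type. Thresholds come from the
# list above; a value of 101 means "this status is never returned".
classify_usage() {
  local kind="$1" pct="$2" warn fail
  case "$kind" in
    data_disk|home|cmsp) warn=75;  fail=90  ;;  # WARNING above 75%, FAIL above 90%
    overall_data)        warn=90;  fail=101 ;;  # WARNING only
    root)                warn=101; fail=95  ;;  # FAIL only
    upgrade)             warn=101; fail=70  ;;  # FAIL only
    *) echo "UNKNOWN"; return 1 ;;
  esac
  if   [ "$pct" -gt "$fail" ]; then echo "FAIL"
  elif [ "$pct" -gt "$warn" ]; then echo "WARN"
  else echo "PASS"
  fi
}

classify_usage home 80      # prints WARN
classify_usage root 96      # prints FAIL
```

Note that the comparisons are strict ("more than"), so a /home partition at exactly 75% still passes in this sketch.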

Note: If you are running LCM-2.6 or LCM-2.6.0.1, LCM log collection can fill up the /home directory. Refer to KB-14671 for the workaround.

Running the NCC check
Run the NCC check as part of the complete NCC health checks.

Checking Disk Usage in PC VM
Following is an example of how to check disk usage on a PC VM.

Scenarios that trigger pcvm_disk_usage_check WARN/FAIL on the /home partition

Solution

If the check reports a WARN or FAIL status, disk usage is above the threshold and needs investigation. Generally, space utilization can be queried using df -h. The output below shows the mount points as follows:

  • /dev/sdb1 is the root partition
  • /dev/sdb3 is the home partition
  • /dev/sdc1 is the data disk partition

nutanix@pcvm$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        7.9G     0  7.9G   0% /dev
tmpfs           7.9G   44K  7.9G   1% /dev/shm
tmpfs           7.9G  6.1M  7.9G   1% /run
tmpfs           7.9G     0  7.9G   0% /sys/fs/cgroup
/dev/sdb1       9.8G  7.4G  2.3G  77% /
/dev/sdb3        50G  8.5G   41G  18% /home
/dev/sdc1       492G  150M  486G   1% /home/nutanix/data/stargate-storage/disks/NFS_2_0_267_5a298323_3c9f_4a6f_a265_10c4c1e6593e
tmpfs           1.6G     0  1.6G   0% /run/user/1000
/dev/sde         98G  401M   93G   1% /home/nutanix/data/sys-storage/NFS_1_0_264_1f5cda9a_2b3f_4f49_b348_baeb0ae338b8
tmpfs           1.6G     0  1.6G   0% /run/user/0
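
To pick out only the mount points that are above a given percentage, the df output can be filtered with awk. The following is a minimal sketch: flag_high_usage is a hypothetical helper name, and it assumes the POSIX column layout produced by df -P (Use% in column 5, mount point in column 6, no spaces in the mount path).

```shell
# Hypothetical helper: reads `df -P`-style output on stdin and prints every
# mount point whose Use% is strictly above the given threshold.
flag_high_usage() {
  local threshold="$1"
  awk -v t="$threshold" '
    NR > 1 {                      # skip the header line
      pct = $5
      sub(/%/, "", pct)           # "77%" -> "77"
      if (pct + 0 > t) print $6, pct "%"
    }'
}

# Example on a live system, using the 75% WARNING threshold quoted for /home:
# df -P | flag_high_usage 75
```

Applied to the sample output above with a threshold of 75, only the root partition (/ at 77%) would be listed.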

Data disk usage (/dev/sdXX) or overall multi-vDisk usage:

Verify that the number of VMs is within the limit supported for your Prism Central size (consult the Prism Central Guide for your version on the Support Portal for the limits). Contact Nutanix Support. When opening a support case, attach the output of the following commands to the case.

nutanix@pcvm$ allssh df -h
nutanix@pcvm$ ncc health_checks system_checks pcvm_disk_usage_check

Prism Central VM home partition (/home):

Inspect the NCC output to determine which Prism Central VM has high usage, then perform the following:

  1. Log in to the Prism Central VM.
  2. Use the cd command to change the location to the /home partition.
  3. List the contents of the directory by size using the command below:
    nutanix@pcvm$ ls -al | sort -k5,5nr

    Examine the output for any large unused files that can be deleted.

  4. Run the du command below to list the usage of each file and sub-directory:
    nutanix@pcvm$ sudo du -skxh * | sort -h

    Examine the output for large sub-directories. You can run the du command on each sub-directory in question to further identify large unused files that can be deleted.

  5. Below are some common sub-directories of /home where large unused files are likely to exist:
    • /home/nutanix/software_downloads/ - delete any old versions other than the versions you are currently upgrading.
    • /home/nutanix/software_uncompressed/ - delete any old versions other than the versions you are currently upgrading.
    • /home/nutanix/data/cores - delete old stack traces that are no longer needed.
    • /home/nutanix/data/log_collector/ - delete old NCC log bundles with names in the format NCC-logs-2018-07-20-11111111111111-1032057545.tar.
    • /home/nutanix/foundation/isos/ - old ISOs.
    • /home/nutanix/foundation/tmp/ - temporary files that can be deleted.
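
As a rough sketch, the inspection in steps 3 to 5 can be repeated across the common sub-directories with a short loop. largest_entries is a hypothetical helper introduced here for illustration. It only lists candidates and deletes nothing, and it may need sudo for directories the current user cannot read.

```shell
# Hypothetical helper: prints the N largest entries directly under a
# directory (sizes in KiB, biggest first) without crossing filesystems.
largest_entries() {
  local dir="$1" count="${2:-10}"
  [ -d "$dir" ] || return 0           # silently skip directories that do not exist
  du -skx -- "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Directories taken from the list above; review the output before deleting anything.
for d in /home/nutanix/software_downloads \
         /home/nutanix/software_uncompressed \
         /home/nutanix/data/cores \
         /home/nutanix/data/log_collector \
         /home/nutanix/foundation/isos \
         /home/nutanix/foundation/tmp; do
  largest_entries "$d" 5
done
```

Sizes are printed in KiB rather than human-readable units so that a plain numeric sort orders them correctly.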

If the above steps do not resolve the issue or if the issue matches one of the scenarios presented earlier in this article, follow the solution steps outlined below.

Prism Central VM root system partition (/) or CMSP partition (/dev/sdXX):
Consider engaging Nutanix Support. Gather the output of the commands below and attach it to the support case:

nutanix@pcvm$ allssh df -h
nutanix@pcvm$ sudo du -h --max-depth=1 / 2>/dev/null
nutanix@pcvm$ ncc health_checks system_checks pcvm_disk_usage_check

Scenario 1

Scenario 2
If your Prism Central instance matches this scenario, refer to KB-12707 Scenario #2 and open a case with Nutanix Support for assistance in recovering from the issue.

Scenario 3

Scenario 4
Nutanix is aware of the issue. The fix for this issue will be made available in a future PC release. For a workaround, engage Nutanix Support.

Scenario 5
If you see that the catalina.out log file is consuming a lot of space, use the following command to restart the prism service on the PCVM.

Scenario 6

Follow KB-6082 to clear the inode usage.

Document ID: HT516503
Original Publish Date: 05/16/2024
Last Modified Date: 05/23/2024