NCC Health Check: inode_usage_check

Description

The NCC check inode_usage_check verifies whether the number of free inodes on CVMs is getting low.

An inode holds the metadata of a file in a file system, such as where the data is stored, the file permissions, ownership and timestamps. Each file and each directory has exactly one corresponding inode, regardless of file size. The maximum number of inodes for a filesystem is defined when the filesystem is created and cannot be increased afterwards.
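
As a small illustration of the paragraph above, the following sketch prints the inode numbers of a few newly created objects. The helper name inode_of and the temporary paths are made up for this example, and the stat -c syntax assumes GNU coreutils, as found on most Linux systems.

```shell
# Hypothetical helper: print the inode number of a path (GNU stat syntax).
inode_of() {
    stat -c '%i' "$1"
}

# Work in a throwaway directory so nothing on the system is touched.
tmpdir="$(mktemp -d)"
mkdir "$tmpdir/subdir"
touch "$tmpdir/file_a" "$tmpdir/file_b"

# Each file and each directory is backed by its own inode.
inode_of "$tmpdir/subdir"
inode_of "$tmpdir/file_a"
inode_of "$tmpdir/file_b"
```

Every object created this way uses one inode, which is why a very large number of small files can exhaust the inodes of a filesystem while plenty of disk space is still free.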

The inode_usage_check returns the following statuses:

  • PASS - if inode usage is at 75 percent or below
  • WARN - if inode usage is above 75 percent and at or below 90 percent
  • FAIL - if inode usage is above 90 percent
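
The status levels above can be sketched as a small shell function. This is only an illustration of the thresholds as listed here, not the NCC implementation. The function name classify_inode_usage is hypothetical, and the input is assumed to be a whole-number percentage.

```shell
# Hypothetical helper, not part of NCC: map an inode usage percentage
# (integer, without the percent sign) to the status levels listed above.
classify_inode_usage() {
    pct="$1"
    if [ "$pct" -gt 90 ]; then
        echo "FAIL"        # above 90 percent
    elif [ "$pct" -gt 75 ]; then
        echo "WARN"        # above 75 and up to 90 percent
    else
        echo "PASS"        # 75 percent or below
    fi
}

classify_inode_usage 76   # prints WARN
```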

Running the NCC Check

This check can be run as part of the complete NCC health checks:

nutanix@cvm$ ncc health_checks run_all

or individually as:

nutanix@cvm$ ncc health_checks hardware_checks disk_checks inode_usage_check

You can also run the checks from the Prism web console Health page: select Actions > Run Checks. Select All checks and click Run.

This check is scheduled to run every 5 minutes, by default.
This check generates alert A1027 after 1 failure across scheduled intervals.

Sample output

For Status: PASS

Running : health_checks hardware_checks disk_checks inode_usage_check
[==================================================] 100%
/health_checks/hardware_checks/disk_checks/inode_usage_check on the node [ PASS ]
----------------------------------------------------------------------------------------+
+---------------+
| State | Count |
+---------------+
| Pass  | 1     |
| Total | 1     |
+---------------+
Plugin output written to /home/nutanix/data/logs/ncc-output-latest.log

For Status: WARN

/health_checks/hardware_checks/disk_checks/inode_usage_check on the node [ WARN ]
----------------------------------------------------------------------------------------+
Detailed information for inode_usage_check:
Node x.y.z.10:
FAIL: '/dev/md2' (mounted at '/home') inode usage at %76 (greater than threshold, %75)
Refer to KB 1532 for details on inode_usage_check

For Status: FAIL

/health_checks/hardware_checks/disk_checks/inode_usage_check on the node [ FAIL ]
----------------------------------------------------------------------------------------+
Detailed information for inode_usage_check:
Node x.y.z.10:
FAIL: '/dev/md2' (mounted at '/home') inode usage at %91 (greater than threshold, %90)
Refer to KB 1532 for details on inode_usage_check

Output messaging

Check ID: 1004
Description: Check if current inode usage is high.
Causes of failure: Inode usage is high.
Resolutions: Reduce disk usage or replace disk.
Impact: Cluster performance may be significantly degraded. In the case of multiple nodes with the same condition, the cluster may become unable to service I/O requests.
Alert ID: A1027
Alert Smart Title: Disk Inode Usage High on Controller VM svm_ip_address
Alert Title: Disk Inode Usage High
Alert Message: Inode usage for one or more disks on Controller VM svm_ip_address has exceeded inode_usage_threshold%.

Solution

NOTE: AOS versions older than 6.5.3 are affected by an issue in which all inodes on the / partition can be exhausted during normal operation, leading to cluster downtime and potential VM workload disruption.
If your AOS version is below 6.5.3, upgrade the cluster to 6.5.3 or a later version promptly after addressing the inode alert.

The NCC health check inode_usage_check fails when one or more filesystems on the disks are running out of free inodes, or when the overall cluster storage is running out of free inodes.

Check the inode usage on the CVM (Controller VM) that is reported in the failure section of the NCC check output:

nutanix@cvm:~$ df -i

Example output (note the IUse% column):

nutanix@cvm:~$ df -i -t ext4
Filesystem Inodes   IUsed IFree    IUse% Mounted on
/dev/md1   655360   58570 596790   9%    /
/dev/loop0 65536    60    65476    1%    /tmp
/dev/md2   2621440  25753 2595687  1%    /home
/dev/sdc1  61054976 81234 60973742 1%    /home/nutanix/data/stargate-storage/disks/9XGxxxS2
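
To pick out the affected filesystems from a long listing, the IUse% column can be filtered with awk. The sketch below is one possible approach under stated assumptions: the function name high_inode_filesystems is hypothetical, the input is expected in the six-column layout shown above, and mount points containing spaces are not handled.

```shell
# Hypothetical helper: read `df -i` style output on stdin and print the
# filesystem, IUse% and mount point of every line above the threshold.
# The threshold defaults to 75 (the WARN level of this check).
high_inode_filesystems() {
    awk -v limit="${1:-75}" '
        NR > 1 {                         # skip the header line
            use = $5
            sub(/%$/, "", use)           # strip the trailing percent sign
            if (use ~ /^[0-9]+$/ && use + 0 > limit)
                print $1, $5, $6
        }'
}

# Possible usage on a CVM:
#   df -i -t ext4 | high_inode_filesystems 75
```

With the example output above, nothing would be printed, because no filesystem exceeds 75 percent.
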
  • If the inode usage is high for the / partition and the directory "/var/spool/postfix/maildrop" is consuming thousands of inodes, refer to Nutanix KB-6082. The following command shows the number of inodes consumed by the "/var/spool/postfix/maildrop" directory:
    nutanix@NTNX-CVM:~$ sudo du --inodes /var/spool/postfix/maildrop
    <number of inodes consumed>     /var/spool/postfix/maildrop
  • If /home shows a high inode usage percentage, determine which directories hold a large number of entries. The following command lists the directories under /home whose own size exceeds 100 KB, which indicates that they contain many files:
    nutanix@cvm$ sudo find /home -xdev -type d -size +100k
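
As a complementary view, the number of entries per directory can be counted directly. The following is a minimal sketch, not a Nutanix-provided tool: the function name top_inode_dirs is hypothetical, -printf assumes GNU find, and file names containing newlines would skew the counts. Hard links are counted once per name, so the totals approximate inode consumption.

```shell
# Hypothetical helper: print the directories below a path that contain
# the most entries, largest first. Stays on one filesystem (-xdev).
# Arguments: <path> [number of directories to show, default 10]
top_inode_dirs() {
    root="$1"
    count="${2:-10}"
    # %h prints the parent directory of every entry found.
    find "$root" -xdev -mindepth 1 -printf '%h\n' 2>/dev/null |
        sort | uniq -c | sort -rn | head -n "$count"
}

# Possible usage on a CVM (sudo may be needed to read every directory):
#   top_inode_dirs /home 10
```

Each output line shows an entry count followed by the directory path.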
    

Note: If any other mount point (for example, /home/nutanix/data/stargate-storage/disks/<serial>) shows a high inode usage percentage, run the same command against that path:

nutanix@cvm$ sudo find /home/nutanix/data/stargate-storage/disks/<serial> -xdev -type d -size +100k

Once the above outputs are collected, engage Nutanix Support.

Additional Information

Document ID: HT516508
Original Publish Date: 05/21/2024
Last Modified Date: 05/23/2024