NetApp Monitoring, v3.8.1

RC1 for 3.8.1 now available

Version 3.8.0_03 will be available shortly in the Q-Portal as the first release candidate for version 3.8.1 of check_netapp_pro. This release contains the following changes:

  • StorageUtilization output units can be controlled by --factor=ki|Mi|Gi|...
  • New check: check_netapp_quotas.pl – monitors quotas on a cDOT filer.
  • ShelfEnvironment: False positive for voltage-sensors in DataONTAP 9.1P1 (solved with temporary switch --DataONTAP_91P1)
  • DiskPaths: Bugfix and additional switch --port_pattern_ok=AAAA | BBBB | ...

We are happy to receive your feedback. Please write to developer@netapp-monitoring.info.

NetApp Monitoring, Quotas, v3.9.0

Quota Check

We’ve developed a check, check_netapp_quotas, that monitors quotas on a Cluster Mode filer, and we plan to ship it with the next release. We are happy to provide a test release of this check: version 3.8.0_02 will be available today in the Q-Portal for beta testers. Please contact the distributor to obtain the necessary access rights.

Check Logic and Parameters

What does this check look like? As with the previous release (check_netapp 2.x), this check is rather simple to use. The hostname is the only required parameter. Thresholds are not necessary, since the plugin operates on the soft and hard limits configured on the filer.

./check_netapp_quotas.pl -H filer91
All checked quotas within limits on filer91.
| srv1.vol1_disk-used=0kiB;0;0;0; srv1.vol1_files-used=6;0;0;0; srv1.vol1.*_disk-used=0kiB;0;51200;0; srv1.vol1.*_files-used=0;0;0;0;

If needed, --qtree or --volume can be used to restrict the check to specific qtrees or volumes.
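The part of the sample output after the "|" looks like the usual monitoring-plugin performance data format (label=value[unit];warn;crit;min;max). As a minimal sketch, assuming that format and assuming labels without spaces as in the sample, such a line could be split up in Python as follows. The function name parse_perfdata is made up for this illustration and is not part of the plugin.

```python
import re

# One perfdata item: label=value[unit];warn;crit;min;max
# Trailing fields may be empty, as in the sample output.
PERF_ITEM = re.compile(
    r"(?P<label>[^=\s]+)="
    r"(?P<value>-?\d+(?:\.\d+)?)"
    r"(?P<unit>[A-Za-z%]*)"
    r"(?P<rest>(?:;[^;\s]*)*)"
)


def parse_perfdata(perfdata):
    """Map each label to its value, unit and raw threshold fields.

    Threshold fields are kept as strings because the plugin format
    also allows range expressions there (assumption: not needed here).
    """
    parsed = {}
    for match in PERF_ITEM.finditer(perfdata):
        # rest looks like ";0;51200;0;" -> drop the piece before the first ';'
        fields = match.group("rest").split(";")[1:]
        fields += [""] * (4 - len(fields))  # pad missing fields
        warn, crit, minimum, maximum = fields[:4]
        parsed[match.group("label")] = {
            "value": float(match.group("value")),
            "unit": match.group("unit"),
            "warn": warn,
            "crit": crit,
            "min": minimum,
            "max": maximum,
        }
    return parsed


sample = (
    "srv1.vol1_disk-used=0kiB;0;0;0; srv1.vol1_files-used=6;0;0;0; "
    "srv1.vol1.*_disk-used=0kiB;0;51200;0; srv1.vol1.*_files-used=0;0;0;0;"
)
for label, item in parse_perfdata(sample).items():
    print(label, item["value"], item["unit"], item["warn"], item["crit"])
```

How the individual fields map to the soft and hard limits on the filer is not specified here, so the sketch only separates the fields and does not interpret them.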

That is the current state of the check. If you would like to request any changes, you are welcome to leave a comment below or send us an email at developer@netapp-monitoring.info.

NetApp Monitoring, SnapMirror

SnapMirrorMetrics and unfinalized Transfers

We received a feature request for SnapMirrorMetrics:

Since we often have lagtime false positives because complete transfers are still being finalized on the volume, we would like to have a switch for the SnapMirrorMetrics check: --finalizing_relation_is_ok

We are currently discussing two possible solutions:

The easy solution: --finalizing=OK|WARN

With this switch, the check would return OK (or WARNING) when the lag time threshold has been exceeded, provided that the respective transfer has the status finalizing.

At this point we should remind ourselves of the meaning of finalizing. According to the SDK:

'finalizing' – SnapMirror transfers are enabled, currently in the post-transfer phase for vault or extended data protection incremental transfers

The clean solution: --finalizing_plus=12h

In contrast to the solution proposed above, this approach would extend the thresholds by up to 12 hours, depending on the status of the transfer. This prevents a transfer that is stuck in an endless finalizing loop from always returning OK and going unnoticed by the monitoring system.

An example to demonstrate its usage:

SnapMirrorMetrics ... --what=lag_time -w 1d -c 2d --finalizing_plus=12h

This example would result in a WARNING for a lag time of more than 36 hours and a CRITICAL alarm after 2.5 days when the relationship status is 'finalizing'. For all other relationship states, a WARNING is triggered after one day and a CRITICAL after two days, as before.
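The arithmetic behind this proposal can be sketched in a few lines of Python. This is only an illustration of the idea, not the plugin's code: the function lag_state and its parameters are invented here, the status string "idle" is a placeholder for any non-finalizing state, and the sketch simply adds the full grace period whenever the status is finalizing.

```python
from datetime import timedelta

OK, WARNING, CRITICAL = 0, 1, 2  # usual plugin return codes


def lag_state(lag, warn, crit, status, finalizing_plus=timedelta(0)):
    """Evaluate a lag time against thresholds.

    Assumption: the grace period is added to both thresholds,
    but only while the relationship status is "finalizing".
    """
    extra = finalizing_plus if status == "finalizing" else timedelta(0)
    if lag > crit + extra:
        return CRITICAL
    if lag > warn + extra:
        return WARNING
    return OK


# Values from the example: -w 1d -c 2d --finalizing_plus=12h
warn = timedelta(days=1)
crit = timedelta(days=2)
plus = timedelta(hours=12)

for hours, status in [(30, "idle"), (30, "finalizing"),
                      (40, "finalizing"), (61, "finalizing")]:
    state = lag_state(timedelta(hours=hours), warn, crit, status, plus)
    print(f"{hours}h {status}: {state}")
```

Because the grace period is bounded, a relationship that never leaves the finalizing state still reaches WARNING and CRITICAL eventually, just 12 hours later than other relationships.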

Your opinion?

We are leaning towards the second, cleaner approach. Preventing false positives is an important concern, but it should never make false negatives (i.e. undetected errors) possible, no matter how improbable they are. In the end, our goal is to meet the needs of our customers. If you are interested, leave us a comment below or send us an email at developer@netapp-monitoring.info.

NetApp Monitoring, tmp, v3.8.0

Stable Release 3.8.0 now available

Today we completed our tests for the next stable release, 3.8.0, which is now available in the Q-Portal.

The most important new features are:

  • New check: FCPAdapter checks the status of FCP adapters (online, …)
  • Usage-check: --check_only=500GiB..1TiB (ranges to specify different thresholds depending on the volume's total size)
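To illustrate what a size range such as 500GiB..1TiB means, here is a small Python sketch that converts the range to bytes and tests whether a volume's total size falls inside it. It is a guess at the semantics, not the plugin's implementation: the helper names are invented, binary units are assumed, and whether the real switch treats the boundaries as inclusive or accepts open-ended ranges is not known here.

```python
import re

# Binary unit factors (assumption: the switch uses IEC units as in the example)
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2,
         "GiB": 1024**3, "TiB": 1024**4, "PiB": 1024**5}


def parse_size(text):
    """Convert a string such as '500GiB' into a number of bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([KMGTP]iB|B)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(float(match.group(1)) * UNITS[match.group(2)])


def parse_range(spec):
    """Split 'LOW..HIGH' into a (low_bytes, high_bytes) tuple."""
    low, separator, high = spec.partition("..")
    if not separator:
        raise ValueError(f"not a range: {spec!r}")
    return parse_size(low), parse_size(high)


def in_range(size_bytes, spec):
    """True if the size lies in the range (lower bound inclusive,
    upper bound exclusive; this boundary choice is an assumption)."""
    low, high = parse_range(spec)
    return low <= size_bytes < high


print(parse_range("500GiB..1TiB"))
print(in_range(750 * 1024**3, "500GiB..1TiB"))
print(in_range(2 * 1024**4, "500GiB..1TiB"))
```

With ranges like this, one service definition could apply one pair of thresholds to mid-sized volumes and another pair to large ones.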

Additionally, getters now ignore temp directories, and the ShelfEnvironment getter no longer needs a --node switch.

Detailed information about changes in 3.8.0

As usual, the history contains a list of all changes.

All blog articles about release 3.8.0.

DiskPaths, NetApp Monitoring, tmp

Should DiskPaths be more flexible?

We received the following inquiry about a possible extension for DiskPaths.

One of our clients has a single-node MetroCluster, i.e. one node per site instead of an HA pair. As a result, the check complains that only two instead of four paths are available. The same problem occurs for a single-node cluster (entry-level system) that only sees one instead of two paths for internal disks.

We suggest adding a switch so that we can configure the number of paths that are OK.

This raises the question of whether to implement a switch --paths=1|2|4|8 in DiskPaths. The check could then be invoked as follows:

$ ./check_netapp_pro.pl DiskPaths -H filer
# 8 paths would still be the default setting

$ ./check_netapp_pro.pl DiskPaths -H filer --paths=2
# explicit setting for a single-node MetroCluster
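The comparison such a switch would control can be sketched in Python. This is a hypothetical illustration only: the function name, the disk names and the rule that only disks with fewer paths than configured are reported are assumptions, not the behavior of DiskPaths.

```python
ALLOWED_PATHS = (1, 2, 4, 8)  # values from the proposed --paths=1|2|4|8


def disks_with_missing_paths(paths_per_disk, expected_paths):
    """Return the disks that have fewer paths than expected.

    paths_per_disk maps a disk name to the number of paths seen.
    Assumption: more paths than expected is not treated as an error.
    """
    if expected_paths not in ALLOWED_PATHS:
        raise ValueError(f"--paths must be one of {ALLOWED_PATHS}")
    return {disk: count for disk, count in paths_per_disk.items()
            if count < expected_paths}


# Hypothetical disks on a single-node MetroCluster
disks = {"1.0.0": 2, "1.0.1": 2, "1.0.2": 1}

print(disks_with_missing_paths(disks, 2))  # only the disk with one path
print(disks_with_missing_paths(disks, 4))  # every disk is reported
```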

If you are interested in this future extension, please leave a comment below or send us an email: developer@netapp-monitoring.info.

Logfile Monitoring, NetApp Monitoring, tmp

Logchecker – new Testing Release

We improved our event log check (EMS). Since the check is still a test version, you can request it directly from the developers (developer@netapp-monitoring.info).

Changes

  • date and time are displayed – the first version used UNIX timestamps ("epoch seconds").
  • source is displayed for each entry
  • the option --severity is compliant with syslog
  • improved documentation: INSTALL.txt explains the installation, --examples provides a few configuration examples.

Find more information about our logfile check in the category NetApp Logfile Monitoring.

NetApp Monitoring, v3.8.0

RC1 for 3.8.0

Unstable Version 3.7.1_04 is our first release candidate for 3.8.0 (not the second one, as mistakenly mentioned in one of our previous blog posts).

The most important changes are:

This unstable version is available to our beta-eligible customers in the Q-Portal.

Additional Information

All blog entries related to the changes in 3.8.0

Release History