NetApp Monitoring, v3.9.0

3.9.0 Release available

Release 3.9.0, originally planned for autumn, is available now. Please keep in mind that this release changes the store format, so we recommend studying the manual, especially the chapter Upgrading to Version 3.9.0.

Details regarding the new features and fixes are documented, as always, in the Release History online.

When planning the upgrade, please take into account that we have reduced support capacity during the summer.

NetApp Monitoring, Performance Monitoring

Solving perfdata display issues

Following a customer request, we would like to show how the switch --perfdata_uom_string can be used when a monitoring system has trouble with unit of measurement (UOM) strings (the unit attached to performance data) and therefore cannot display the data properly.

For example, if the WAFL check returns the CP counter per second (/s), the monitoring system may conclude that there is no performance data because it can’t process the string wafl=0/s. In this case, we get an empty graph, such as the one below:

image.png

The switch --perfdata_uom_string=persec resolves this issue. The performance data is now recognized and the graph displays it correctly:

image.png
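
To illustrate why the /s suffix can trip up a parser, here is a minimal Python sketch. It assumes a hypothetical, simplified parser that accepts only letters or % as the unit and ignores thresholds; the parser in your monitoring system may behave differently. It also assumes that the switch makes the check emit wafl=0persec, which is an assumption based on the description above and not confirmed plugin output.

```python
import re

# Hypothetical, simplified perfdata pattern: label=value[UOM]
# Assumption: the unit may only consist of letters or '%'.
# Thresholds (;warn;crit;min;max) are deliberately left out.
PERFDATA_RE = re.compile(
    r"^(?P<label>[^=]+)=(?P<value>-?\d+(?:\.\d+)?)(?P<uom>[A-Za-z%]*)$"
)


def parse_perfdata(item):
    """Return (label, value, uom), or None if the item is not recognized."""
    match = PERFDATA_RE.match(item)
    if match is None:
        return None
    return match.group("label"), float(match.group("value")), match.group("uom")


# The slash in '/s' is not a valid unit character for this parser.
print(parse_perfdata("wafl=0/s"))
# A purely alphabetic unit string is accepted.
print(parse_perfdata("wafl=0persec"))
```

Under these assumptions the first call finds no performance data at all, which corresponds to the empty graph, while the second call yields a label, a value and a unit.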

NetApp Monitoring

Ignore Case for --exclude

The parameter --exclude= can be used to exclude specific instances from a check. For instance, a group of volumes that follow a naming convention can be excluded, e.g.:

$ ./check_netapp_pro.pl Usage -o volume --exclude=VMWare

This command excludes all volumes containing 'VMWare' in their name (AB_VMWare_vol1, AB_VMWare_vol2, …).

But the devil is in the detail: the match is case-sensitive, so storage admins have to pay close attention to the spelling of names, otherwise unwanted side effects can occur. For the volumes AB_VMWare_vol1 and AB_vmware_vol2, the pattern above would only exclude the first volume.

With the new switch --ignore_case, this issue no longer occurs:

$ ./check_netapp_pro.pl Usage -o volume --exclude=VMWare --ignore_case

In this case, all possible spellings of VMWare are matched, such as:

  • VMWare
  • VMware
  • VmWaRe
  • vmware

In the same way, --ignore_case also works for --include.
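
The difference between the two invocations can be sketched in a few lines of Python. This is only an illustration and not the plugin's code: it assumes plain substring matching, the function name exclude is hypothetical, and the volume AB_data_vol3 is made up for the example.

```python
def exclude(names, pattern, ignore_case=False):
    """Return the names that do NOT contain the pattern.

    Assumption: plain substring matching; the real check may match differently.
    """
    if ignore_case:
        needle = pattern.lower()
        return [name for name in names if needle not in name.lower()]
    return [name for name in names if pattern not in name]


# AB_data_vol3 is a made-up volume that should always stay in the check.
volumes = ["AB_VMWare_vol1", "AB_vmware_vol2", "AB_data_vol3"]

# Case-sensitive: AB_vmware_vol2 slips through and is still checked.
print(exclude(volumes, "VMWare"))
# Case-insensitive: both VMware volumes are excluded.
print(exclude(volumes, "VMWare", ignore_case=True))
```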

--ignore_case will be included in version 3.9.0. Beta testers can already download it today as version 3.8.1_05 from the Q-Portal.

NetApp Monitoring, v3.9.0

RC1 for 3.9.0 now available

In version 3.9.0 we are introducing a new store format that will increase overall check performance.

Customers who want to test this new release can now download the first release candidate, version 3.8.1_04, from the Q-Portal.

Please make sure to read the (short) section about the upgrade in the installation instructions included with the checks, since the store files have to be converted before the upgrade. Also bear in mind that there is no way of converting the files back to the old format. When downgrading, all store files have to be deleted beforehand (we know that this is generally no big deal).

Apart from the quicker store format, this version includes a number of bug fixes and the new LunSize check. Please refer to the release history for more details.

check_netapp_anycli, NetApp Monitoring

Alarming failed disks on an old filer

Neither the checks nor the CLI command sysconfig -c were able to find failed disks on an ONTAP 8.2.4 (7-Mode) system. aggr status -f reports: "Broken disks (empty)". The only way to detect them was sysconfig -a, which returns a long list similar to the one below:


134L84  : NETAPP   X414_HV60A15 NA03 560.0GB 520B/sect (LXY…4N)
134L85  : NETAPP   X414_HV60A15 NA03 560.0GB 520B/sect (LXY…1N)
134L86  : NETAPP   X414_HV60A15 NA03   0.0GB 0B/sect (Failed)
134L87  : NETAPP   X414_HV60A15 NA03 560.0GB 520B/sect (LXY…HN)
134L88  : NETAPP   X414_S160A15 NA08 560.0GB 520B/sect (6SL…NF)

This makes for an interesting use case for check_netapp_anycli.pl.

This is how an alert can be raised for these failed disks:

./check_netapp_anycli.pl -H my_old_netapp --in=sysconfig --in=-a --out="Failed|failed" --like_result=CRITICAL --unlike_result=OK

CRITICAL - output matches pattern 'Failed|failed'

Please note that the syntax for check_netapp_anycli.pl will change with version 3.8.2 (release planned for June 2017). The example above already uses the new syntax.

Logfile Monitoring, NetApp Monitoring

More improvements for check_netapp_events

The log file checker for the EMS log has seen a few improvements:

  • the lookback period can now be set in minutes, hours or days (e.g. --lookback=2h evaluates entries from the last 7200 seconds)
  • you can now use the switch --authfile to point to a credentials file, so that username and password don’t have to be written on the command line or in the configuration file. This file is partly compatible with the authfile of check_netapp_pro (host sections are not supported yet, see --help)
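
The arithmetic behind the lookback example can be sketched in Python. Only the value 2h appears above; the suffixes m and d, the treatment of a bare number as seconds, and the function name lookback_seconds are assumptions made for this illustration (see --help for the syntax the plugin actually accepts).

```python
import re

# Seconds per unit. Only 'h' is shown in the post; 'm' and 'd' are assumed.
UNIT_SECONDS = {"m": 60, "h": 3600, "d": 86400}


def lookback_seconds(value):
    """Convert a lookback string such as '2h' into seconds.

    Assumption: a bare number without suffix is interpreted as seconds.
    """
    match = re.fullmatch(r"(\d+)([mhd]?)", value)
    if match is None:
        raise ValueError(f"unsupported lookback value: {value!r}")
    number, unit = match.groups()
    return int(number) * UNIT_SECONDS.get(unit, 1)


print(lookback_seconds("2h"))  # 2 * 3600 = 7200
```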

Since the plugin is still unofficial, download links have to be requested from the developer.