Ensuring Nagios service restarts with Puppet and collected Nagios resources

Nagios is somewhat old and clunky. One thing that can really help is managing it indirectly, via a configuration management solution like Puppet.

The best part of this combo is that with Puppet you can also take advantage of exported resources as a way of automagically generating monitors for your Puppetized machines.

The Problem

A recent thread on Reddit, however, revealed some problems that arise from using the (unmaintained?) Nagios resource types in Puppet. I set out to demonstrate a solution to two questions:

How to ensure old exported resources, once removed from Puppet, are also removed from the Nagios config
How to ensure that when these resources are removed, the Nagios service is restarted to put the change into effect

An attempt

I figured setting a Nagios config subdirectory to purge, and having that notify the Nagios service, would take care of everything.

But the following did not work:

# client.pp
@@nagios_host { $::fqdn:
  host_name             => $::fqdn,
  alias                 => $::fqdn,
  address               => $::ipaddress,
  tag                   => 'nagiosconfig',
  target                => "/etc/nagios/conf.d/${::fqdn}.cfg",
}

# server.pp
# Extra nagios config files go into conf.d: misc configs as well as our templates
file { '/etc/nagios/conf.d/':
  ensure  => 'directory',
  purge   => true,
  recurse => true,
  notify  => Service['nagios'],
}
Nagios_host <<| tag == 'nagiosconfig' |>>

And why not? I blame the jankiness of the Nagios resources.

With the above, I found that setting purge => true on the config subdirectory can get Puppet into a race condition between the purging File directory resource and whatever the hell the Nagios resources are doing.

That is, you wind up with the collected config targets being alternately created and deleted, in variable order, on each run. Not a comfortable situation for your monitoring system…

And all because the Nagios types take their actions without managing the config files themselves as File resources!

The solution

If you make a slight modification to explicitly manage the targeted Nagios config files as File resources, you gain both automatic removal of resources that are no longer collected and the ability to notify the Nagios service.

You’ll have a handful of potentially huge Nagios files this way, but at least you can enforce their relationship to the service!

# client.pp
@@nagios_host { $::fqdn:
  # [...]
  tag                   => 'nagiosconfig',
  target                => '/etc/nagios/conf.d/hosts.cfg',
}
@@nagios_service { "${::fqdn}_someservice":
  # [...]
  tag                   => 'nagiosconfig',
  target                => '/etc/nagios/conf.d/services.cfg',
}

# server.pp
file { '/etc/nagios/conf.d/':
  ensure  => 'directory',
  purge   => true,
  recurse => true,
  notify  => Service['nagios'],
}
-> file { '/etc/nagios/conf.d/hosts.cfg':
  ensure => 'file',
}
-> file { '/etc/nagios/conf.d/services.cfg':
  ensure => 'file',
}
Nagios_host <<| tag == 'nagiosconfig' |>>
Nagios_service <<| tag == 'nagiosconfig' |>>

This can be split up however you like so long as Puppet on the Nagios server is explicitly managing the target files.
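As one way of splitting things up, here is a minimal sketch that declares the target files from a list rather than one by one. It assumes Puppet 4 or later (for the each iteration) and that Service['nagios'] is declared elsewhere on the server, as in the manifests above. The $nagios_targets variable is a hypothetical name introduced only for this illustration.

```puppet
# server.pp (sketch; would replace the individual file declarations for the targets)
# Hypothetical list of target files used by the exported resources.
$nagios_targets = ['hosts.cfg', 'services.cfg']

file { '/etc/nagios/conf.d/':
  ensure  => 'directory',
  purge   => true,
  recurse => true,
  notify  => Service['nagios'],  # assumes Service['nagios'] is declared elsewhere
}

# Declare each target as a File resource so the purge leaves it alone.
# File resources autorequire a managed parent directory, so no explicit
# ordering arrow is needed here.
$nagios_targets.each |String $cfg| {
  file { "/etc/nagios/conf.d/${cfg}":
    ensure => 'file',
  }
}
```

Adding a new target, say for another Nagios type, then only means adding one entry to the list and pointing the exported resources at it.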

Now Puppet is explicitly managing the collected Nagios host and service configuration files, and any changes to the conf.d directory are guaranteed to restart the Nagios service.
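A complementary sketch, not part of the setup above: a collector can carry an attribute block, so a notify can be attached directly to every collected resource. This should make added or changed host and service definitions signal the service as well. It does not handle removals on its own, since a resource that is no longer exported never reaches the catalog, so the purging directory is still needed. As before, it assumes Service['nagios'] is declared elsewhere.

```puppet
# server.pp (sketch; would stand in for the plain collectors)
# The attribute block is applied to every resource the collector gathers.
Nagios_host <<| tag == 'nagiosconfig' |>> {
  notify => Service['nagios'],  # assumes Service['nagios'] is declared elsewhere
}

Nagios_service <<| tag == 'nagiosconfig' |>> {
  notify => Service['nagios'],
}
```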

Done!

^ Page image based on work by Alex Yomare, Pixabay License.