A Blog for Those With a Big Appetite for IT Knowledge...: Munin and Alerting: Method 3

Integration With Nagios: Via a Nagios Plugin

If you don't want to use passive checks, you can use the check_munin_rrd plugin.

Munin-node data is stored on the Munin server as usual, and Nagios reads that data to check the status of the node.

$ /usr/lib/nagios/plugins/check_munin_rrd.pl --help

Monitor server via Munin-node pulled data
Usage: /usr/lib/nagios/plugins/check_munin_rrd.pl -H <HOST> -M <MODULE> [-D <DOMAIN>] -w <WARN> -c <CRIT> [-V]
-h, --help
   print this help message
-H, --hostname=HOST
   name or IP address of host to check
-M, --module=MUNIN MODULE
   Munin module value to fetch
-D, --domain=DOMAIN
   Domain as defined in munin
-w, --warn=INTEGER
   warning level
-c, --crit=INTEGER
   critical level
-v --verbose
   Be verbose
-V, --version
   prints version number
check_munin_rrd.pl (nagios-plugins 1.4.2) 0.9
The nagios plugins come with ABSOLUTELY NO WARRANTY. You may redistribute
copies of the plugins under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING.
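
The -w and -c options follow the usual Nagios plugin convention, where the exit code carries the state: 0 for OK, 1 for WARNING, 2 for CRITICAL. As a rough sketch of that comparison (not the plugin's actual code), the hypothetical function nagios_state below assumes that a value greater than or equal to a threshold triggers the corresponding state; the real plugin may compare differently.

```shell
#!/bin/sh
# Sketch only: map a value and two thresholds to a Nagios state.
# nagios_state is a hypothetical helper, not part of check_munin_rrd.pl.
# Assumption: "value >= threshold" triggers the state.
nagios_state() {
    value=$1
    warn=$2
    crit=$3
    # awk does the comparison so non-integer values (e.g. 75.4) also work.
    awk -v v="$value" -v w="$warn" -v c="$crit" 'BEGIN {
        if (v >= c)      { print "CRITICAL"; exit 2 }
        else if (v >= w) { print "WARNING";  exit 1 }
        else             { print "OK";       exit 0 }
    }'
}
```

For example, nagios_state 80 75 90 prints WARNING and returns exit code 1, which is the code Nagios would read as a warning.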

The previous implementation had Nagios run a check directly against Munin-node, which is overkill because the Munin server already collects the data via cron.

You need to define a new command:

define command{
   command_name check_munin
   command_line /usr/lib/nagios/plugins/check_munin_rrd.pl -H $HOSTALIAS$ -M $ARG1$ -w $ARG2$ -c $ARG3$
   }

A new service template:

# generic service template definition check via munin
define service{
   name generic-munin-service ; The 'name' of this service template
   active_checks_enabled 1 ; Active service checks are enabled
   passive_checks_enabled 0 ; Passive service checks are disabled
   parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)
   obsess_over_service 1 ; We should obsess over this service (if necessary)
   check_freshness 0 ; Default is to NOT check service 'freshness'
   notifications_enabled 1 ; Service notifications are enabled
   event_handler_enabled 1 ; Service event handler is enabled
   flap_detection_enabled 1 ; Flap detection is enabled
   failure_prediction_enabled 1 ; Failure prediction is enabled
   process_perf_data 1 ; Process performance data
   retain_status_information 1 ; Retain status information across program restarts
   retain_nonstatus_information 1 ; Retain non-status information across program restarts
   notification_interval 0 ; Only send notifications on status change by default.
   is_volatile 0
   check_period 24x7
   normal_check_interval 5 ; This directive is used to define the number of "time units" to wait before scheduling the next "regular" check of the service.
   retry_check_interval 3 ; This directive is used to define the number of "time units" to wait before scheduling a re-check of the service.
   max_check_attempts 2 ; This directive is used to define the number of times that Nagios will retry the service check command if it returns any state other than an OK state. Setting this value to 1 will cause Nagios to generate an alert without retrying the service check again.
   notification_period 24x7
   notification_options w,u,c,r
   contact_groups admins
   register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
   }

Don't set normal_check_interval lower than 5 minutes: Munin updates its data only every 5 minutes, so more frequent checks would read the same values again.

A new service example:

# check the disk usage via munin
define service{
   hostgroup_name web-servers
   service_description disk-usage
   check_command check_munin!df!75!90
   use generic-munin-service
   }
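
In a check_command value, the fields after the command name are separated by ! and fill $ARG1$, $ARG2$ and $ARG3$ in the command_line. The sketch below imitates that expansion for the check_munin command defined earlier, so you can see the plugin call Nagios would end up running. The function name expand_check_command and the host alias web01.example.com are made up for illustration; Nagios does this substitution internally.

```shell
#!/bin/sh
# Sketch only: imitate how Nagios expands "command!arg1!arg2!arg3"
# into the command_line of the check_munin command definition.
# expand_check_command is a hypothetical helper for illustration.
expand_check_command() {
    check_command=$1   # e.g. 'check_munin!df!75!90'
    host_alias=$2      # value Nagios would substitute for $HOSTALIAS$
    # Field 1 is the command name; fields 2-4 become $ARG1$..$ARG3$.
    arg1=$(printf '%s' "$check_command" | cut -d'!' -f2)
    arg2=$(printf '%s' "$check_command" | cut -d'!' -f3)
    arg3=$(printf '%s' "$check_command" | cut -d'!' -f4)
    printf '%s -H %s -M %s -w %s -c %s\n' \
        /usr/lib/nagios/plugins/check_munin_rrd.pl \
        "$host_alias" "$arg1" "$arg2" "$arg3"
}
```

Calling expand_check_command 'check_munin!df!75!90' web01.example.com prints the full plugin command line, which you can then run by hand on the Munin server to see what Nagios would get back.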

Wednesday, July 21, 2010
