If you don't want to use passive checks, you can use the check_munin_rrd plugin.
Munin-node data is stored on the Munin server as usual, and Nagios reads that data to check the status of the node.
$ /usr/lib/nagios/plugins/check_munin_rrd.pl --help
Monitor server via Munin-node pulled data
Usage: /usr/lib/nagios/plugins/check_munin_rrd.pl -H -M
-h, --help
print this help message
-H, --hostname=HOST
name or IP address of host to check
-M, --module=MUNIN MODULE
Munin module value to fetch
-D, --domain=DOMAIN
Domain as defined in munin
-w, --warn=INTEGER
warning level
-c, --crit=INTEGER
critical level
-v --verbose
Be verbose
-V, --version
prints version number
check_munin_rrd.pl (nagios-plugins 1.4.2) 0.9
The nagios plugins come with ABSOLUTELY NO WARRANTY. You may redistribute
copies of the plugins under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING.
The previous implementation ran a check from Nagios directly against Munin-node, which is overkill since the Munin server already collects the data via cron.
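The plugin's source is not shown here, but like any Nagios plugin it reports its result through the exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. The sketch below illustrates that convention for the -w and -c thresholds. It is not the plugin's actual code: the function name munin_status is hypothetical, and it assumes integer values where higher is worse and where reaching a threshold already triggers it.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of check_munin_rrd.pl.
# Maps a value read from Munin's stored data to a Nagios status,
# using thresholds in the style of the plugin's -w and -c options.
# Assumption: integer values, higher is worse, ">=" triggers the level.
munin_status() {
    value=$1
    warn=$2
    crit=$3
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL"
        return 2    # Nagios exit code for CRITICAL
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING"
        return 1    # Nagios exit code for WARNING
    fi
    echo "OK"
    return 0        # Nagios exit code for OK
}
```

Called as munin_status 80 75 90, this prints WARNING and returns 1, which is how Nagios would see a disk at 80% with the thresholds used later in this post.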
You need to define:
- a new command:
define command{
command_name check_munin_rrd
command_line /usr/lib/nagios/plugins/check_munin_rrd.pl -H $HOSTALIAS$ -M $ARG1$ -w $ARG2$ -c $ARG3$
}
- a new service template:
# generic service template definition for checks via munin
define service{
name generic-munin-service ; The 'name' of this service template
active_checks_enabled 1 ; Active service checks are enabled
passive_checks_enabled 0 ; Passive service checks are disabled
parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)
obsess_over_service 1 ; We should obsess over this service (if necessary)
check_freshness 0 ; Default is to NOT check service 'freshness'
notifications_enabled 1 ; Service notifications are enabled
event_handler_enabled 1 ; Service event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
failure_prediction_enabled 1 ; Failure prediction is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
notification_interval 0 ; Only send notifications on status change by default.
is_volatile 0
check_period 24x7
normal_check_interval 5 ; This directive is used to define the number of "time units" to wait before scheduling the next "regular" check of the service.
retry_check_interval 3 ; This directive is used to define the number of "time units" to wait before scheduling a re-check of the service.
max_check_attempts 2 ; This directive is used to define the number of times that Nagios will retry the service check command if it returns any state other than an OK state. Setting this value to 1 will cause Nagios to generate an alert without retrying the service check again.
notification_period 24x7
notification_options w,u,c,r
contact_groups admins
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
Don't set normal_check_interval lower than 5 (with the default one-minute time unit): Munin updates its data every 5 minutes, so more frequent checks would only re-read the same values.
- a new service example:
# check the disk usage via munin
define service{
hostgroup_name web-servers
service_description disk-usage
check_command check_munin_rrd!df!75!90
use generic-munin-service
}
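To see what this service ends up running, recall that Nagios splits the check_command value on "!": the first field names the command, and the remaining fields become $ARG1$, $ARG2$ and $ARG3$ in its command_line. The sketch below only imitates that substitution for illustration; Nagios does this internally, and the function name expand_check and the host alias web01 are made up for the example.

```shell
#!/bin/sh
# Illustration only: imitate how Nagios fills $ARG1$..$ARG3$ and
# $HOSTALIAS$ into the command_line defined earlier in this post.
expand_check() {
    check_command=$1    # e.g. 'check_munin_rrd!df!75!90'
    host_alias=$2       # stands in for $HOSTALIAS$
    old_ifs=$IFS
    IFS='!'
    # Split on "!": $1 = command name, $2..$4 = $ARG1$..$ARG3$
    set -- $check_command
    IFS=$old_ifs
    echo "/usr/lib/nagios/plugins/check_munin_rrd.pl -H $host_alias -M $2 -w $3 -c $4"
}

expand_check 'check_munin_rrd!df!75!90' web01
# prints: /usr/lib/nagios/plugins/check_munin_rrd.pl -H web01 -M df -w 75 -c 90
```

Running the printed command by hand on the Nagios server is a quick way to confirm that the Munin module name and thresholds are what you intended before reloading Nagios.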