
Nagios Conference 2014 - James Clark - Nagios Cool Tips and Tricks

James Clark's presentation on Nagios Cool Tips and Tricks. The presentation was given during the Nagios World Conference North America held Oct 13th - Oct 16th, 2014 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/conference
Published on: Mar 3, 2016
Published in: Technology      
Source: www.slideshare.net


Transcripts - Nagios Conference 2014 - James Clark - Nagios Cool Tips and Tricks

  • 1. Nagios – Cool Tips and Tricks Jim Clark jclark@itconvergence.com
  • 2. Introduction & Agenda • About Me • Cool Tips and Tricks • Released Scripts • Questions and Answers
  • 3. About Me • Have been in the IT industry since 1988 • Have been using Nagios since ~2003 • Switched to XI ~2010 • Work for IT Convergence as Global Manager – Monitoring • Personal web page is http://www.bandits-home-on-the-web.com
  • 4. Nagios Environment
  • 5. Add new NRPE check without restarting • Reason for implementing • 100+ AIX servers • Understaffed AIX admin group • Needed a way to add a new plugin without needing to restart the NRPE service
  • 6. Add new NRPE check without restarting • Add this check command • command[check_whatever]=/usr/opt/nagios/libexec/open_scripts/$ARG1$ $ARG2$ $ARG3$ • Restart NRPE one last time • Security Concerns • As long as you nest the scripts one folder down as I did, use SSL, and lock NRPE with only_from to the proper IP, the security risk should be relatively small
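One way to narrow the security concern on this slide is to put a small wrapper in front of the open_scripts folder. The sketch below is not from the presentation: the function name run_open_script and the OPEN_SCRIPTS_DIR variable are hypothetical, and only the directory path comes from the slide. It refuses any plugin name that contains a slash, so a caller cannot step outside the folder.

```shell
#!/bin/sh
# Hypothetical wrapper (not from the talk): only run plugins that live
# directly inside one directory. The default path is the one on the slide;
# OPEN_SCRIPTS_DIR can be overridden for testing.
OPEN_SCRIPTS_DIR="${OPEN_SCRIPTS_DIR:-/usr/opt/nagios/libexec/open_scripts}"

run_open_script() {
    name=$1
    # Reject empty names, ".", "..", and anything containing a slash.
    case "$name" in
        ''|.|..|*/*)
            echo "UNKNOWN: invalid plugin name '$name'"
            return 3
            ;;
    esac
    if [ ! -x "$OPEN_SCRIPTS_DIR/$name" ]; then
        echo "UNKNOWN: $name not found or not executable in $OPEN_SCRIPTS_DIR"
        return 3
    fi
    shift
    # Run the plugin with the remaining arguments; its exit code is returned.
    "$OPEN_SCRIPTS_DIR/$name" "$@"
}
```

To use it as a script, end the file with run_open_script "$@" and point command[check_whatever] at that file instead of at the folder. Note that NRPE only accepts $ARG1$-style arguments when dont_blame_nrpe=1 is set in nrpe.cfg.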
  • 7. Check by ssh with password I know, I know…bad! bad! BAD! Sometimes, though, you just can’t do things the proper way. Plus, it is only on my personal network. • Install ‘sshpass’ on your Nagios server • Create a shell script • #!/bin/sh • sshpass -p "$1" ssh "$2@$4" "$3"
  • 8. Check by ssh with password • Use this command definition in Nagios • $USER1$/check_freenas $ARG1$ $ARG2$ $ARG3$ $HOSTADDRESS$ • ARG1=Password, ARG2=User, ARG3=command to run
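A possible variation on the script from slide 7, shown only as a sketch: sshpass can read the password from the SSHPASS environment variable when called with -e, which keeps it off the sshpass command line. The function name check_ssh_pass is hypothetical. The password still reaches the wrapper as $ARG1$, so this narrows the exposure rather than removing it.

```shell
#!/bin/sh
# Sketch (not from the talk): same argument order as the command definition
# on the slide: PASSWORD USER COMMAND HOST.
check_ssh_pass() {
    if [ "$#" -ne 4 ]; then
        echo "UNKNOWN: usage: check_ssh_pass PASSWORD USER COMMAND HOST"
        return 3
    fi
    # "sshpass -e" takes the password from the SSHPASS environment variable.
    SSHPASS=$1 sshpass -e ssh "$2@$4" "$3"
}
```

Ending the file with check_ssh_pass "$@" makes it a drop-in replacement for the two-line script, with the same $ARG1$ $ARG2$ $ARG3$ $HOSTADDRESS$ order.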
  • 9. Check by ssh with local script • Reason for implementation • Only have to modify the scripts in one location, the Nagios server • How to implement • For a bash script use • ssh nagios@$HOSTADDRESS$ 'bash -s' -- < $USER1$/$ARG1$ $ARG2$ • For a perl script use • ssh nagios@$HOSTADDRESS$ 'perl - $ARG3$' -- < $USER1$/$ARG1$ $ARG2$
  • 10. Check by ssh with local script • Known issues • Must be a script; it cannot be a binary. At least I haven’t found the proper command yet. • Nagios Core 4 / NagiosXI 2014 and newer versions require a wrapper around the command instead of just using the command directly
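The slides do not show the wrapper that newer Nagios versions need, so the following is only a guess at its shape, assuming its job is to perform the stdin redirection from slide 9 itself. The function name run_remote_script is hypothetical; the ssh invocation is the bash form from slide 9.

```shell
#!/bin/sh
# Hypothetical wrapper sketch. Usage: run_remote_script HOST LOCAL_SCRIPT [ARGS...]
run_remote_script() {
    host=$1
    script=$2
    if [ "$#" -lt 2 ] || [ ! -r "$script" ]; then
        echo "UNKNOWN: cannot read local script '$script'"
        return 3
    fi
    shift 2
    # The remote bash reads the script from stdin; the remaining arguments
    # become its positional parameters. ssh joins them with spaces, so
    # arguments that contain spaces are split again on the remote side.
    ssh "nagios@$host" 'bash -s' -- "$@" < "$script"
}
```

With run_remote_script "$@" as the last line of the file, the Nagios command would pass $HOSTADDRESS$, the full path of the local plugin, and then the plugin's arguments.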
  • 11. Alert Different Groups Based on Day of Week • Reason for implementation • The group works 4-day and 3-day shifts. One group covers Monday – Thursday and the other Friday – Sunday. • Method used • Escalations • Special time periods • Contact groups
  • 12. Alert Different Groups Based on Day of Week
      define serviceescalation{
          host_name               ASPIT01P
          service_description     *
          contact_groups          pkms_01p-mon-thu
          first_notification      1
          escalation_period       mon-thu
          last_notification       0
          notification_interval   15
      }
      define serviceescalation{
          host_name               ASPIT01P
          service_description     *
          contact_groups          pkms_01p-fri-sun
          first_notification      1
          escalation_period       fri-sun
          last_notification       0
          notification_interval   15
      }
      define serviceescalation{
          host_name               ASPIT01P
          service_description     *
          contact_groups          pkms_01p-managers
          first_notification      3
          last_notification       0
          notification_interval   15
      }
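The escalations above reference two time periods, mon-thu and fri-sun, whose definitions are not shown in the slides. As a sketch only, the small generator below prints timeperiod objects with those names. It assumes each shift covers the full 24 hours of its days, and the helper name emit_timeperiod is made up for this example.

```shell
#!/bin/sh
# Sketch (definitions are not from the talk): print a Nagios timeperiod
# object that is active all day on the listed weekdays.
# Usage: emit_timeperiod NAME DAY...
emit_timeperiod() {
    name=$1
    shift
    printf 'define timeperiod{\n'
    printf '    timeperiod_name %s\n' "$name"
    printf '    alias           %s\n' "$name"
    for day in "$@"; do
        # Assumption: the shift covers the whole day.
        printf '    %-15s 00:00-24:00\n' "$day"
    done
    printf '}\n'
}

# Names match the escalation_period values used on the slide.
emit_timeperiod mon-thu monday tuesday wednesday thursday
emit_timeperiod fri-sun friday saturday sunday
```

Redirecting the output into a file under the Nagios object configuration directory would make both periods available to the escalations; the hours can be narrowed if the shifts do not run around the clock.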
  • 13. Check for new *nix mount point • Reason for implementing • We monitor each mount point separately, as each one may have a different contact group • If Unix admins add a new mount point, they may forget to ask monitoring to start watching it • Nagios Command • $USER1$/check_new_disk $USER1$/check_nrpe -n -H $HOSTADDRESS$ -t 30 -c check_disk -a '$ARG1$'
  • 14. Check for new *nix mount point • Bash script
      #!/bin/bash
      if [[ $("$@") == "DISK UNKNOWN - free space:|" ]]
      then
          echo "OK: No new drives!"; exit 0;
      else
          echo "CRITICAL: New drives!"; exit 2;
      fi;
  • 15. Check for new *nix mount point • Example usage from cli • /usr/local/nagios/libexec/check_new_disk /usr/local/nagios/libexec/check_nrpe -n -H 10.97.235.15 -t 30 -c check_disk -a "-w 1000 -c 500 -A -x / -x /usr -x /home -x /tmp -x /u01 -x /proc -x /opt -x /tomaxbin -i '/var*$' -i '^/notes*$'"
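The script on slide 14 reports CRITICAL for any output other than the expected string, which would include an NRPE timeout or connection error. A possible refinement, shown as a sketch: treat only output that starts with "DISK " as a real check_disk result and report everything else as UNKNOWN. The function name classify_disk_output is hypothetical, and the "DISK " prefix is an assumption about check_disk's output format.

```shell
#!/bin/sh
# Sketch (not from the talk): classify the text returned by the wrapped
# check_nrpe/check_disk call. Usage: classify_disk_output "OUTPUT TEXT"
classify_disk_output() {
    case "$1" in
        "DISK UNKNOWN - free space:|")
            # Every mount point was excluded, so nothing new has appeared.
            echo "OK: No new drives!"
            return 0
            ;;
        "DISK "*)
            # check_disk reported on at least one mount point not excluded.
            echo "CRITICAL: New drives! ($1)"
            return 2
            ;;
        *)
            # Assumption: anything else is a transport or plugin error.
            echo "UNKNOWN: unexpected check output: $1"
            return 3
            ;;
    esac
}
```

Inside check_new_disk this could be called as classify_disk_output "$("$@")" followed by exit $?, keeping the command definition from slide 13 unchanged.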
  • 16. Custom SNMP Trap Handling • Reason for implementing • I use sitescan to monitor building health at the data center and send traps to Nagios. • Unfortunately those traps are not very good and the data requires manipulation before writing the trap to Nagios. • What I did • Make a copy of snmptraphandling.py to snmptraphandlingss.py.
  • 17. Custom SNMP Trap Handling • What I did • Modified snmptt.conf, changing the line that calls the script to use the new filename and send over all important data. • Modified snmptraphandlingss.py to do what I need. • Changed line in snmptt.conf • EXEC /usr/local/bin/snmptraphandlingss.py "$r" "SNMP Traps" "$s" "$@" "$-*" "$*"
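The modified handler itself is a Python script and is not shown in the slides. As a rough shell illustration of the final step, assuming the handler ends by submitting a passive service check result, the sketch below formats the external command line that Nagios reads from its command file. The function name format_passive_result is hypothetical.

```shell
#!/bin/sh
# Sketch (the talk's handler is Python and is not shown): build a
# PROCESS_SERVICE_CHECK_RESULT external command line.
# Usage: format_passive_result TIMESTAMP HOST SERVICE STATE OUTPUT
#   STATE is the usual plugin code: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
format_passive_result() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$1" "$2" "$3" "$4" "$5"
}
```

A handler would append the result to the Nagios command file, for example format_passive_result "$(date +%s)" "$host" "SNMP Traps" 2 "$message" >> /usr/local/nagios/var/rw/nagios.cmd, where the path is the default for a source install and may differ on other systems.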
  • 18. Newer On-Call Handling • Reason for implementing • Last year I gave a presentation on how we had previously incorporated on-call. That method had one flaw: it required daily restarts of Nagios. • Wanted a way for Nagios to display who is on-call • Script details • Only works with NagiosXI • Comes with a component to add a link on the main menu to display who is on-call
  • 19. Newer On-Call Handling • Script details • Does not create the on-call data files. These need to be supplied manually or by some other method (we use SharePoint to schedule, and it automatically writes out the data files). • Works with escalations as well • Adds new notification handlers that still follow each user’s notification preferences in their XI account
  • 20. Newer On-Call Handling
  • 21. Script: Check E-Mail Subject • Reason for implementing • We send an email with a virus every 30 minutes to an outside address • Our checker should catch it and send an alert email • We check the account every 30 minutes for the presence of that email • Script details • Can be found on the Exchange • Uses NTLM for auth
  • 22. Script: Acknowledge by Email • Reason for implementing • Multiple Nagios servers • Some servers are behind special firewalls, so we cannot use Nagios Mobile or other solutions • No need for on-call individuals to carry around tablets or laptops if they can use their phones to easily acknowledge alerts
  • 23. Script: Acknowledge by Email • Details • Script is located on the Exchange • It is a fork of the NagMailAck script that uses NTLM auth • Every Nagios server has its own identity string that gets added to the email subject when replying • All Nagios servers can monitor the same email account for replies and just search for subjects with their identity
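The identity-string idea can be illustrated with a short sketch. This is not the released script: the function name ack_command_for_subject is hypothetical, and how the host and service are parsed out of the reply is left to the caller because the slides do not describe the subject format. The function ignores replies that lack this server's identity string and otherwise prints an ACKNOWLEDGE_SVC_PROBLEM external command.

```shell
#!/bin/sh
# Sketch (not the released script).
# Usage: ack_command_for_subject TIMESTAMP IDENTITY SUBJECT HOST SERVICE AUTHOR
ack_command_for_subject() {
    case "$3" in
        *"$2"*) ;;          # subject carries this server's identity string
        *) return 1 ;;      # reply is meant for another Nagios server
    esac
    # Fields after the service: sticky=2, notify=1, persistent=0, author, comment.
    printf '[%s] ACKNOWLEDGE_SVC_PROBLEM;%s;%s;2;1;0;%s;Acknowledged by email\n' \
        "$1" "$4" "$5" "$6"
}
```

Each server would run this against every reply in the shared mailbox with its own identity string and append any output to its Nagios command file.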
  • 24. Script: Check E-Mail Delivery • Reason for implementing • Need to verify email is flowing • Script details • Uses NTLM for authentication • Sends an email with a specific subject and then reconnects and verifies that email is in the inbox. • Uses my check_email_subject script • Uses phpmailer to send the email
  • 25. Script: Check E-Mail Delivery • Script
      command="php /usr/local/nagios/bin/email_delivery.phps \"*** Check for E-Mail Working\""
      eval "$command"
      command2="/usr/local/nagios/libexec/check_email_subject.rb \"*** Check for E-Mail Working\""
      eval "$command2"
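The two-step flow on this slide (send, then verify) can also be written so that a failed send is reported on its own rather than surfacing later as a missing email. The sketch below is an assumption-laden variant, not the released script: the function name run_delivery_check and the optional wait parameter are hypothetical, and it uses sh -c where the slide uses eval.

```shell
#!/bin/sh
# Sketch (not the released script).
# Usage: run_delivery_check SEND_CMD CHECK_CMD [WAIT_SECONDS]
run_delivery_check() {
    # Step 1: send the test email; stop early if the sender fails.
    if ! sh -c "$1" >/dev/null 2>&1; then
        echo "CRITICAL: could not send the test email"
        return 2
    fi
    # Optional pause so the message has time to arrive (default: none).
    sleep "${3:-0}"
    # Step 2: run the subject check; its output and exit code are the result.
    sh -c "$2"
}
```

The two command strings from the slide would be passed as SEND_CMD and CHECK_CMD, for example with a wait of a few seconds between them.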
  • 26. Conclusion • There are other scripts of mine located on the Exchange under the owner ‘banditbbs’ • I am always browsing the Nagios forums and offering help when I can • There are a few other Nagios scripts and hints on my personal web page linked earlier in this presentation
  • 27. Questions? Any questions? Thanks!
  • 28. The End Jim Clark jclark@itconvergence.com
