Nagios, Getting Started.

Nagios is a world standard for monitoring IT infrastructure. This presentation will help you get started with Nagios.
Published on: Mar 3, 2016
Published in: Software      
Source: www.slideshare.net


Transcripts - Nagios, Getting Started.

  • 1. The Industry Standard In IT Infrastructure Monitoring
  • 2. Who is using Nagios
  • 3. Agenda
      • What is Nagios
      • What can you do with Nagios
      • Features
      • Basics
          o Architecture
          o Terminology
      • Monitoring
      • State Types
      • Active / Passive Checks
      • Reports
  • 4. What is Nagios Core
      Open source system and network monitoring application
  • 5. With Nagios you can
      • Monitor your entire IT infrastructure
      • Spot problems before they occur
      • Know immediately when problems arise
      • Share availability data with stakeholders
      • Detect security breaches
      • Plan and budget for IT upgrades
      • Reduce downtime and business losses
  • 6. Features
      • Monitoring of network services
          o SMTP
          o POP3
          o HTTP
          o PING and more
      • Monitoring of host resources
          o Processor load
          o Disk usage and more
      • Simple plugin design that allows users to easily develop their own service checks
      • Parallelized service checks
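The "simple plugin design" mentioned above rests on a small convention: a plugin prints one status line and reports its result through its exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The following is a minimal sketch of that convention, not an official plugin; the function names and the 80/90 percent thresholds are assumptions made for illustration.

```python
import shutil

# Exit codes that Nagios plugins conventionally return.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate_disk(used_percent, warn=80.0, crit=90.0):
    """Map a disk usage percentage to (exit_code, status_line).

    The thresholds are illustrative defaults, not values from the slides.
    """
    if used_percent >= crit:
        code = CRITICAL
    elif used_percent >= warn:
        code = WARNING
    else:
        code = OK
    return code, f"DISK {LABELS[code]} - {used_percent:.1f}% used"


def run_check(path="/"):
    """Measure real disk usage, print one status line, return the exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"DISK UNKNOWN - {exc}")
        return UNKNOWN
    code, line = evaluate_disk(100.0 * usage.used / usage.total)
    print(line)
    return code
```

A real plugin script would end by passing the value returned from `run_check()` to `sys.exit()`, so that Nagios can read the result from the process exit status.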
  • 7. Features
      • Ability to define network host hierarchy/groups
          o Allows detection of, and distinction between, hosts that are down and hosts that are unreachable
      • Contact notifications when service or host problems occur and when they are resolved, via
          o Email
          o Pager
          o User-defined methods
      • Ability to define event handlers that run during service or host events for proactive problem resolution
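The distinction between down and unreachable hosts follows from the host hierarchy: a host that fails its check is DOWN if at least one of its parents is still UP (or it has no parents), and UNREACHABLE if every parent is itself down or unreachable. The sketch below illustrates that rule under simplifying assumptions; the function name, the input format, and the host names are hypothetical, and the hierarchy is assumed to contain no cycles.

```python
def classify_hosts(parents, responding):
    """Classify every host as UP, DOWN, or UNREACHABLE.

    parents:    dict mapping each host to the list of its parent hosts
    responding: set of hosts whose checks currently succeed
    """
    states = {}

    def state_of(host):
        if host in states:
            return states[host]
        if host in responding:
            states[host] = "UP"
        else:
            host_parents = parents.get(host, [])
            # No parents, or at least one parent UP: the fault is the host itself.
            if not host_parents or any(state_of(p) == "UP" for p in host_parents):
                states[host] = "DOWN"
            else:
                # Every path to the host is blocked by a failed parent.
                states[host] = "UNREACHABLE"
        return states[host]

    for host in parents:
        state_of(host)
    return states
```

For example, with `router` as the parent of `switch` and `switch` as the parent of `web`, a failed switch is reported as DOWN while `web` behind it is reported as UNREACHABLE.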
  • 8. Features
      • Automatic log file rotation
      • Support for implementing redundant monitoring hosts
      • Optional web interface for viewing
          o Current network status
          o Notifications
          o Problem history
          o Log file and more
  • 9. Basics
  • 10. Basics
  • 11. Basics
  • 12. Basics
  • 13. Basics
  • 14. Basics Definitions
      • Host
      • Service
      • Contacts
      • Commands
      • Time periods
      • Event handlers
  • 15. Basics Host Defines a physical server, workstation, device, etc. that resides on your network.
  • 16. Basics Host
      define host{
          host_name               remotehost
          alias                   some Remote Host
          address                 192.168.1.50
          contacts                admin
          max_check_attempts      3
          check_period            24x7
          notification_interval   60
          notification_period     24x7
      }
  • 17. Basics Service
      • Identifies a "service" that runs on the host, in a loose sense of the word:
          o An actual service on the host (POP, SMTP, HTTP, etc.)
          o A metric associated with the host (response to a ping, number of logged-in users, free disk space, etc.)
  • 18. Basics Service
  • 19. Basics Service
      define service {
          host_name               linux-server
          service_description     check-disk-sda1
          check_command           check-disk!/dev/sda1
          max_check_attempts      5
          check_interval          5
          retry_interval          3
          check_period            24x7
          notification_interval   30
          notification_period     24x7
          notification_options    w,c,r
          contact_groups          admins
      }
  • 20. Basics Contacts
      Identify someone who should be contacted in the event of a problem.
      define contact{
          contact_name                    admin
          alias                           admin
          host_notifications_enabled      1
          service_notifications_enabled   1
          service_notification_period     24x7
          host_notification_period        24x7
          service_notification_options    w,u,c,r
          host_notification_options       d,u,r
          service_notification_commands   notify-by-email
          host_notification_commands      host-notify-by-email
          email                           admin@organisation.com
          address1                        abc.def@organisation.com
      }
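The `service_notification_options` value in the contact definition is a comma-separated list of letters that select which service states the contact hears about: `w` (warning), `u` (unknown), `c` (critical), `r` (recovery), or `n` (none). A minimal sketch of that filter follows; the function name is hypothetical, and real Nagios applies further filters (time periods, downtime, flapping) that are left out here.

```python
# Option letters used in service_notification_options, keyed by service state.
# "r" (recovery) applies when a service returns to OK.
SERVICE_OPTION_FOR_STATE = {
    "WARNING": "w",
    "UNKNOWN": "u",
    "CRITICAL": "c",
    "OK": "r",
}


def wants_service_notification(options, state):
    """Return True if a contact with these options is notified for this state."""
    enabled = {opt.strip() for opt in options.split(",")}
    if "n" in enabled:
        # "n" means the contact receives no service notifications at all.
        return False
    return SERVICE_OPTION_FOR_STATE[state] in enabled
```

With the `w,u,c,r` value from the slide, the contact is notified for every service problem state and for recoveries.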
  • 21. Basics Commands
      define command{
          name            check_http
          command_name    check_http
          command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
      }
      define host {
          ..
          address         192.168.1.50
          ..
      }
      define service {
          ..
          check_command   check-disk!/dev/sda1
          ..
      }
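The three definitions on this slide are tied together by macros: `$HOSTADDRESS$` comes from the host's `address`, and `$ARG1$` comes from the text after the `!` in the service's `check_command`. The sketch below shows that substitution in a simplified form. It is not Nagios's actual macro processor; the function name is hypothetical, the `$USER1$` value is an assumed plugin directory (it is normally set in the resource file), and the `check_http!-p 8080` value in the test is an illustrative argument rather than one taken from the slides.

```python
import re


def expand_command(command_line, check_command, host_address,
                   user1="/usr/local/nagios/libexec"):
    """Return (command_name, expanded_command_line).

    check_command has the form "name!arg1!arg2...".
    user1 is an assumed plugin directory, not a value from the slides.
    """
    command_name, *args = check_command.split("!")
    macros = {"$USER1$": user1, "$HOSTADDRESS$": host_address}
    for index, value in enumerate(args, start=1):
        macros[f"$ARG{index}$"] = value
    expanded = command_line
    for macro, value in macros.items():
        expanded = expanded.replace(macro, value)
    # Any $ARGn$ macro without a matching argument expands to nothing.
    expanded = re.sub(r"\$ARG\d+\$", "", expanded)
    return command_name, expanded.strip()
```

The returned command name is what Nagios would use to look up the matching `define command` block; the expanded line is what would actually be executed.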
  • 22. Basics Time Period
      Valid times for notifications and service checks.
      define timeperiod{
          timeperiod_name nonworkhours
          alias           Non-Work Hours
          sunday          00:00-24:00
          monday          00:00-09:00,17:00-24:00
          tuesday         00:00-09:00,17:00-24:00
          wednesday       00:00-09:00,17:00-24:00
          thursday        00:00-09:00,17:00-24:00
          friday          00:00-09:00,17:00-24:00
          saturday        00:00-24:00
      }
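To make the time period semantics concrete, the following sketch checks whether a given moment falls inside the `nonworkhours` ranges from the slide. It only handles plain weekday entries with comma-separated `HH:MM-HH:MM` ranges; the function and variable names are hypothetical, and exceptions or date-specific entries that Nagios also supports are not modelled.

```python
from datetime import datetime

# Weekday names in the order returned by datetime.weekday() (Monday is 0).
DAYS = ["monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday"]

NONWORKHOURS = {
    "sunday": "00:00-24:00",
    "monday": "00:00-09:00,17:00-24:00",
    "tuesday": "00:00-09:00,17:00-24:00",
    "wednesday": "00:00-09:00,17:00-24:00",
    "thursday": "00:00-09:00,17:00-24:00",
    "friday": "00:00-09:00,17:00-24:00",
    "saturday": "00:00-24:00",
}


def _minutes(hhmm):
    """Convert "HH:MM" to minutes since midnight ("24:00" becomes 1440)."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def in_timeperiod(period, when):
    """Return True if the datetime `when` lies inside one of the day's ranges."""
    ranges = period.get(DAYS[when.weekday()], "")
    now = when.hour * 60 + when.minute
    for time_range in filter(None, ranges.split(",")):
        start, end = time_range.split("-")
        if _minutes(start) <= now < _minutes(end):
            return True
    return False
```

A notification or check bound to this period would be allowed early on a weekday morning or at any time on a weekend, but not at midday on a weekday.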
  • 23. Event Handlers
      Event handlers are optional system commands (scripts or executables) that are run whenever a host or service state change occurs. Typical uses:
      • Restarting a failed service
      • Entering a trouble ticket into a helpdesk system
      • Logging event information to a database
      • Cycling power on a host
  • 24. Event Handlers
      Event handlers are executed when a service or host:
      • Is in a SOFT problem state
      • Initially goes into a HARD problem state
      • Initially recovers from a SOFT or HARD problem state
      define service {
          ..
          event_handler           command_name
          event_handler_enabled   [0/1]
          ..
      }
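Because a handler runs on every one of these state changes, the script itself has to decide when to act. A common pattern is to pass the `$SERVICESTATE$`, `$SERVICESTATETYPE$`, and `$SERVICEATTEMPT$` macros to the handler and restart the service only late in the SOFT retries or once the state is HARD. The sketch below shows just that decision; the function name and the choice of the third attempt are assumptions, and the restart action itself is deliberately left out.

```python
def should_restart(state, state_type, attempt, soft_attempt=3):
    """Decide whether an event handler should try to restart a service.

    state:      value of $SERVICESTATE$ (OK, WARNING, CRITICAL, UNKNOWN)
    state_type: value of $SERVICESTATETYPE$ (SOFT or HARD)
    attempt:    value of $SERVICEATTEMPT$ as an integer
    soft_attempt is an illustrative threshold, not a Nagios setting.
    """
    if state != "CRITICAL":
        # Recoveries and lesser problems need no restart.
        return False
    if state_type == "HARD":
        return True
    # Act once during the SOFT retries, just before the state would turn HARD.
    return state_type == "SOFT" and attempt == soft_attempt
```

Acting on a late SOFT attempt gives the handler a chance to fix the problem before any contact is notified.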
  • 25. Basics Other Blocks
      • contactgroup
      • servicegroup
      • servicedependency
      • serviceescalation
      • serviceextinfo
      • hostdependency
      • hostescalation
      • hostextinfo
  • 26. Monitoring Services
      Nagios can be used to monitor private and public services.
      • Private services
          o CPU load
          o Memory usage
          o Disk usage
          o Logged-in users
          o Running processes
      • Public services (publicly available services provided by Linux servers)
          o HTTP
          o FTP
          o SSH
          o SMTP
  • 27. Monitoring Private Services
      • Plugins/addons are mostly used for monitoring private services.
      • The NRPE addon (Nagios Remote Plugin Executor) is installed on the target servers.
      • It allows you to execute plugins on remote Linux/Unix hosts.
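On the monitoring server, NRPE is driven through the `check_nrpe` plugin, which takes the remote host with `-H` and the name of a command configured on that host with `-c`. As a rough sketch, the argument list for such a call could be built like this; the function name and the plugin directory are assumptions, and `check_load` stands in for whatever command name the remote NRPE configuration actually defines.

```python
def nrpe_check_argv(host_address, remote_command,
                    plugin_dir="/usr/local/nagios/libexec"):
    """Build the argument list for a check_nrpe call.

    plugin_dir is an assumed install location; adjust it to the local setup.
    remote_command must match a command name defined on the remote host.
    """
    return [f"{plugin_dir}/check_nrpe", "-H", host_address, "-c", remote_command]
```

The resulting list can be handed to `subprocess.run`, whose `returncode` then carries the usual plugin result (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).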
  • 28. Monitoring Private Services
      • NSCA addon (Nagios Service Check Acceptor)
      • Allows you to send passive check results from remote Linux/Unix hosts to the Nagios daemon running on the monitoring server.
      • This is very useful in distributed and redundant/failover monitoring setups.
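On the remote side, results are usually handed to the `send_nsca` client as tab-separated lines on standard input: host name, service description, return code, and plugin output. A small sketch of building such a line follows; the function name and the example values are hypothetical.

```python
def nsca_service_line(host, service, return_code, output):
    """Format one service check result as a tab-separated send_nsca input line."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0, 1, 2 or 3")
    return "\t".join([host, service, str(return_code), output]) + "\n"
```

The host name and service description have to match definitions that already exist on the monitoring server, otherwise the result is discarded there.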
  • 29. Monitoring Public Services
      • Check Nagios Exchange for existing plugins first.
      • Walk through
          o Define the host in a file within the cfg dir.
          o Define a service for each process/service that needs to be monitored.
          o Each service uses a pre-defined or custom-defined command.
          o Define the contacts who will receive notifications and take action.
  • 30. State Types
      • Based on the max_check_attempts variable
      • A SOFT state is logged when
          o A check returns a problem state but has not yet been retried max_check_attempts times
          o A service or host recovers from a soft error. This is considered a soft recovery.
  • 31. State Types
      • A HARD state is logged when
          o A check returns a problem state and has been retried max_check_attempts times
          o A host or service transitions from one hard error state to another error state (e.g. WARNING to CRITICAL, or running to down)
          o A service check results in a non-OK state and its corresponding host is either DOWN or UNREACHABLE
          o A host or service recovers from a hard error state. This is considered a hard recovery.
      • On a HARD state change, contacts are notified of the host or service problem or recovery.
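The SOFT/HARD rules on the two State Types slides can be traced with a small simulation. The sketch below walks through a sequence of service check results and labels each with its state type and attempt number. It is a simplified model under stated assumptions: the function name is hypothetical, the service starts in a hard OK state, and the rule that a failing host forces a HARD service state is not modelled.

```python
def track_states(results, max_check_attempts=3):
    """Return a list of (state, state_type, attempt) for a run of check results."""
    history = []
    attempt = 1
    prev_state, prev_type = "OK", "HARD"
    for state in results:
        if state == "OK":
            # Recovering from a soft problem is a soft recovery; otherwise hard.
            if prev_state != "OK" and prev_type == "SOFT":
                state_type = "SOFT"
            else:
                state_type = "HARD"
            attempt = 1
        else:
            if prev_state == "OK":
                attempt = 1                    # first failing check
            elif prev_type == "SOFT":
                attempt += 1                   # still retrying
            else:
                attempt = max_check_attempts   # already in a hard problem state
            state_type = "HARD" if attempt >= max_check_attempts else "SOFT"
        history.append((state, state_type, attempt))
        prev_state, prev_type = state, state_type
    return history
```

With `max_check_attempts` set to 3, two failed checks stay SOFT and only the third turns the problem HARD, which is the point at which contacts would be notified.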
  • 32. Active / Passive Checks
      Active Checks
      ● Initiated by the Nagios process
      ● Run on a regularly scheduled basis
  • 33. Active / Passive Checks
      Passive Checks
      ● Initiated and performed by external applications/processes
      ● Results are submitted to Nagios for processing
      ● Used for services that are
          o Asynchronous in nature
          o Located behind a firewall and cannot be checked actively from the monitoring host
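Submitting a passive result locally means writing a `PROCESS_SERVICE_CHECK_RESULT` line to the Nagios external command file. The sketch below only formats that line; the function name and the example host and service are hypothetical, and the location of the command file is installation-specific, so writing to it is left to the caller.

```python
import time


def format_passive_result(host, service, return_code, output, timestamp=None):
    """Format a PROCESS_SERVICE_CHECK_RESULT external command line.

    timestamp defaults to the current time in seconds since the epoch.
    """
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0, 1, 2 or 3")
    stamp = int(time.time()) if timestamp is None else int(timestamp)
    return (f"[{stamp}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{return_code};{output}\n")
```

An external job, such as a backup script, could emit one such line when it finishes, which suits the asynchronous checks described above.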
  • 34. Nagios in Action Demo Time : http://nagioscore.demos.nagios.com/
  • 35. Reports
      • Availability Report: uptime report for hosts and services
      • Trends Report: graphical breakdown of the state of a particular host or service
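The core figure in an availability report is the share of time a host or service spent in a good state. As a rough illustration of that arithmetic only, and not of how the Nagios report is actually computed (which also accounts for scheduled downtime and undetermined time), a sketch with a hypothetical function name:

```python
def availability(intervals):
    """Return the percentage of time spent in an OK or UP state.

    intervals: list of (state, duration_in_seconds) tuples.
    """
    total = sum(duration for _, duration in intervals)
    if total == 0:
        return 0.0
    good = sum(duration for state, duration in intervals
               if state in ("OK", "UP"))
    return 100.0 * good / total
```

For instance, 900 seconds UP followed by 100 seconds DOWN gives an availability of 90 percent.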
  • 36. Reports • Alert History Report Record of historical alerts
  • 37. Reports • Alert Summary Report
  • 38. Reports • Alert Histogram Report Frequency graph of host and service alerts
  • 39. Reports • Notification Report Provides historical record of notifications sent to contacts
  • 40. Summary
      • Infrastructure monitoring
      • Anomaly and outage detection
      • Automatic problem remedy
      • Scheduled downtime
      • Outage alerts
      • Alert escalations
      • Historical reporting
      • Maintenance planning
  • 41. Advice for Beginners
      • Relax - it's going to take some time.
      • Use the quickstart instructions.
      • Read the documentation.
      • Visit the Nagios Support Forum at http://support.nagios.com/forum/.
  • 42. Next Steps
      • Get your hands dirty
      • Get training
          o Live / self-paced training
      • Get certified
          o Nagios Certified Professional
          o Nagios Certified Administrator
      • Use it to monitor your infrastructure
  • 43. References • Nagios Documentation • Nagios Online Demo • Slideshare • NRPE Blog
  • 44. Thank You
