hi.
Graphing Nagios
dave@librato.com
@davejosephsen
github: djosephsen
Design Patterns for Metrics Processing
Things that extract, transport and process
Things that measure
Things that store and graph
All of the above
Today At 4:30
Pattern1: Centralized Polling
you ok?
yup.
Hey, I said you ok?
DUDE HELLO?!
host[2] is NOT OK
ya ok?
Yeah I’m ok | rta=50.582001ms; pl=0%;0
(performance data)
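The performance data after the "|" is what a graphing tool has to pull apart. A minimal sketch of that step in Go, assuming the usual label=value[unit] layout with optional ";"-separated thresholds; the Metric type and parsePerfdata function are hypothetical names, and quoted labels containing spaces are not handled:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Metric is one label/value/unit triple taken from a perfdata string.
type Metric struct {
	Label string
	Value float64
	Unit  string
}

// parsePerfdata splits the text after the "|" into metrics.
// Thresholds and min/max after the first ";" are ignored in this sketch.
func parsePerfdata(perf string) ([]Metric, error) {
	var out []Metric
	for _, field := range strings.Fields(perf) {
		label, rest, ok := strings.Cut(field, "=")
		if !ok {
			return nil, fmt.Errorf("no '=' in %q", field)
		}
		raw, _, _ := strings.Cut(rest, ";") // value plus unit, e.g. "50.582001ms"
		// The numeric part is the longest prefix of sign, digits, and dot.
		i := 0
		for i < len(raw) && strings.ContainsRune("+-.0123456789", rune(raw[i])) {
			i++
		}
		v, err := strconv.ParseFloat(raw[:i], 64)
		if err != nil {
			return nil, fmt.Errorf("bad value in %q: %w", field, err)
		}
		out = append(out, Metric{Label: label, Value: v, Unit: raw[i:]})
	}
	return out, nil
}

func main() {
	output := "Yeah I'm ok | rta=50.582001ms; pl=0%;0"
	_, perf, _ := strings.Cut(output, "|")
	metrics, err := parsePerfdata(perf)
	if err != nil {
		panic(err)
	}
	for _, m := range metrics {
		fmt.Printf("%s %g %s\n", m.Label, m.Value, m.Unit)
	}
}
```

Each parsed metric carries its unit, so round-trip time and packet loss can be graphed on separate axes.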
all good (perfdata)
you ok?
yup.
<-- Limits Resolution
<-- Limits Scalability
Pattern2: Autonomous Agents
Still Doesn’t Scale
Pattern3: The Rollup Pattern
gmond gmetad
Complicated, and therefore
Hard to reason about, and therefore
Hard to Maintain
That’s a server right? —>
That’s a thread —>
Auto-Scaling Group 1b-17 east
instance-eaa6238ad5ff00
instance-d3b07384d11
instance-3edec49
instance-d3b07384d11 instance-3edec49
ZOMG IT’S 4:00 ON FRIDAY! FINISH ALL THE WORKS!!
That already doesn’t exist anymore -->
Pattern 4: Emitter/Reporter
Instrumentation as code
output, _ := daves.function(input)
metric.time( output, _ := daves.function(input))
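The metric.time(...) line above is slide shorthand rather than compilable Go. One way the idea might look in real code is a helper that takes a closure, sketched here under assumptions: timeIt, timings, and davesFunction are hypothetical names, and the in-memory map stands in for whatever backend the timings would be shipped to.

```go
package main

import (
	"fmt"
	"time"
)

// timings collects durations by metric name. A real emitter would ship these
// to a metrics backend instead of keeping them in memory.
// Not safe for concurrent use; guard with a mutex if called from goroutines.
var timings = map[string][]time.Duration{}

// timeIt runs fn, records how long it took under name, and returns the duration.
func timeIt(name string, fn func()) time.Duration {
	start := time.Now()
	fn()
	d := time.Since(start)
	timings[name] = append(timings[name], d)
	return d
}

// davesFunction stands in for the function being measured.
func davesFunction(input int) (int, error) {
	return input * 2, nil
}

func main() {
	var output int
	d := timeIt("daves.function", func() {
		output, _ = davesFunction(21)
	})
	fmt.Println(output, d)
}
```

The measured call keeps its original shape inside the closure, so adding or removing the timer is a small change at the call site.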
func handleInput(inBytes *[]byte, typ string) {
	if typ == "A" {
		handleTypeA(inBytes)
	} else if typ == "B" {...
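The second version of this slide adds aCounter.Inc(1), bCounter.Inc(1), and cCounter.Inc(1) next to the branches. A self-contained sketch of that counting, assuming a minimal Counter type written for this example (a metrics library would normally supply one) and empty stand-in handlers:

```go
package main

import "fmt"

// Counter is a minimal stand-in for a metrics-library counter.
// It is not goroutine-safe; sync/atomic would be needed for concurrent use.
type Counter struct{ n int64 }

// Inc adds delta to the counter.
func (c *Counter) Inc(delta int64) { c.n += delta }

// Count returns the current value.
func (c *Counter) Count() int64 { return c.n }

var aCounter, bCounter, cCounter Counter

// Empty handlers so the example compiles; the real ones do the work.
func handleTypeA(inBytes *[]byte) {}
func handleTypeB(inBytes *[]byte) {}
func handleTypeC(inBytes *[]byte) {}

// handleInput dispatches on the input type and counts each branch taken.
// The parameter is named typ because type is a reserved word in Go.
func handleInput(inBytes *[]byte, typ string) {
	if typ == "A" {
		aCounter.Inc(1)
		handleTypeA(inBytes)
	} else if typ == "B" {
		bCounter.Inc(1)
		handleTypeB(inBytes)
	} else if typ == "C" {
		cCounter.Inc(1)
		handleTypeC(inBytes)
	}
}

func main() {
	buf := []byte("payload")
	for _, typ := range []string{"A", "B", "A", "C"} {
		handleInput(&buf, typ)
	}
	fmt.Println(aCounter.Count(), bCounter.Count(), cCounter.Count())
}
```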
rrdtool create target.rrd \
--start 1023654125 \
--step 300 \
DS:mem:GAUGE:600:0:671744 \
RRA:AVERAGE:0.5:12:24 \
RRA...
BOB.SOMEWHERE.COM
THROW METRICS HERE
http://github.com/shawn-sterling/graphios
Carbon Whisper Graphite
Carbon
2003/TCP
thing.cpu.load .021 1412366606
format: <metric path>SPACE<metric value>SPACE<metric timestamp>
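Because the format is one line of text per metric, a sender needs very little code. A minimal sketch in Go: carbonLine and sendMetric are hypothetical names, and localhost:2003 is an assumed address for a Carbon listener.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"time"
)

// carbonLine renders one metric as "<path> <value> <timestamp>\n".
func carbonLine(path string, value float64, ts int64) string {
	return path + " " + strconv.FormatFloat(value, 'f', -1, 64) +
		" " + strconv.FormatInt(ts, 10) + "\n"
}

// sendMetric opens a TCP connection to addr and writes a single metric line.
func sendMetric(addr, path string, value float64, ts int64) error {
	conn, err := net.DialTimeout("tcp", addr, 5*time.Second)
	if err != nil {
		return err
	}
	defer conn.Close()
	_, err = conn.Write([]byte(carbonLine(path, value, ts)))
	return err
}

func main() {
	fmt.Print(carbonLine("thing.cpu.load", 0.021, 1412366606))
	// Assumed address; this fails harmlessly if nothing is listening there.
	if err := sendMetric("localhost:2003", "thing.cpu.load", 0.021, time.Now().Unix()); err != nil {
		fmt.Println("send failed:", err)
	}
}
```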
Carbon
Whisper
2003/TCP
1412366606
http://play.grafana.org
Woes
HTTP POST
Carbon
JSON+UDP
LevelDB, RocksDB, HyperLevelDB, or LMDB
select value from response_times
where time > '2013-08-12 23:32:01.232' and
time < '2013-08-13';
1 second, 1 year: 250MB
1 second, 1 year, 100 Measurements: 25GB
1 second, 1 year, 100 Measurements, 100 servers: 2.5TB
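The storage figures above follow from simple multiplication. A sketch of that arithmetic, assuming roughly 8 bytes per stored point (an assumption chosen because it reproduces the 250MB figure, not a number stated on the slides); storageBytes is a hypothetical helper:

```go
package main

import "fmt"

const secondsPerYear = 365 * 24 * 60 * 60

// storageBytes estimates the raw bytes needed to keep every point for the
// whole retention period, with no rollups or compression.
func storageBytes(intervalSec, retentionSec, bytesPerPoint, measurements, servers int64) int64 {
	points := retentionSec / intervalSec
	return points * bytesPerPoint * measurements * servers
}

func main() {
	const bytesPerPoint = 8 // assumption, see the note above

	one := storageBytes(1, secondsPerYear, bytesPerPoint, 1, 1)
	hundred := storageBytes(1, secondsPerYear, bytesPerPoint, 100, 1)
	fleet := storageBytes(1, secondsPerYear, bytesPerPoint, 100, 100)

	fmt.Printf("1 measurement:              %.1f MB\n", float64(one)/1e6)
	fmt.Printf("100 measurements:           %.1f GB\n", float64(hundred)/1e9)
	fmt.Printf("100 measurements x 100 servers: %.2f TB\n", float64(fleet)/1e12)
}
```

One year at one-second resolution is 31,536,000 points per measurement, which is why raw retention gets expensive so quickly.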
[Figure: timeline marked at 10 seconds, 1 minute, and 2 minutes]
10-second resolution data (stored for 24 hours)
1-minute resolution data (stored for 1 year)
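Turning 10-second data into 1-minute data means collapsing every six points into one. A minimal sketch that uses averaging as the rollup function (an assumption; max, min, or sum are equally plausible choices); rollup is a hypothetical helper:

```go
package main

import "fmt"

// rollup averages consecutive groups of n points into one point each.
// A trailing partial group is averaged over the points it has.
func rollup(points []float64, n int) []float64 {
	if n <= 0 {
		return nil
	}
	var out []float64
	for start := 0; start < len(points); start += n {
		end := start + n
		if end > len(points) {
			end = len(points)
		}
		sum := 0.0
		for _, p := range points[start:end] {
			sum += p
		}
		out = append(out, sum/float64(end-start))
	}
	return out
}

func main() {
	// Two minutes of 10-second samples become two 1-minute averages.
	tenSecond := []float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
	fmt.Println(rollup(tenSecond, 6))
}
```

Averaging smooths out short spikes, which is the trade-off for keeping a year of data at a sixth of the storage.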
1-minute, 10cents/month
5-minute, 5cents/month
Whitelist metrics here
Questions?
Nagios Conference 2014 - David Josephsen - Graphing Nagios

Nagios Conference 2014 - David Josephsen - Graphing Nagios

David Josephsen's presentation on Graphing Nagios. The presentation was given during the Nagios World Conference North America held Oct 13th - Oct 16th, 2014 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/conference
Published on: Mar 3, 2016
Published in: Technology      
Source: www.slideshare.net


Transcripts - Nagios Conference 2014 - David Josephsen - Graphing Nagios

  • 1. hi.
  • 2. hi. dave@librato.com @davejosephsen github: djosephsen Graphing
  • 3. Graphing Nagios dave@librato.com @davejosephsen github: djosephsen
  • 4. Graphing Nagios
  • 5. Design Patterns for Metrics Processing
  • 6. Things that extract, transport and process Things that measure Things that store and graph All of the above
  • 7. Today At 4:30
  • 8. Pattern1: Centralized Polling
  • 9. Pattern1: Centralized Polling
  • 10. Pattern1: Centralized Polling
  • 11. Pattern1: Centralized Polling
  • 12. Pattern1: Centralized Polling you ok?
  • 13. Pattern1: Centralized Polling you ok? yup.
  • 14. Pattern1: Centralized Polling you ok? yup.
  • 15. Pattern1: Centralized Polling Hey, I said you ok?
  • 16. Pattern1: Centralized Polling DUDE HELLO?!
  • 17. Pattern1: Centralized Polling host[2] is NOT OK
  • 18. Pattern1: Centralized Polling ya ok?
  • 19. Pattern1: Centralized Polling Yeah I’m ok | rta=50.582001ms; pl=0%;0
  • 20. Pattern1: Centralized Polling Yeah I’m ok | rta=50.582001ms; pl=0%;0 (performance data)
  • 21. Pattern1: Centralized Polling all good (perfdata)
  • 22. Pattern1: Centralized Polling all good (perfdata)
  • 23. Pattern1: Centralized Polling
  • 24. Pattern1: Centralized Polling
  • 25. Pattern1: Centralized Polling you ok? yup.
  • 26. Pattern1: Centralized Polling <——Limits Resolution
  • 27. Pattern1: Centralized Polling <——Limits Resolution <——Limits Scalability
  • 28. Pattern2: Autonomous Agents
  • 29. Pattern2: Autonomous Agents
  • 30. Pattern2: Autonomous Agents
  • 31. Pattern2: Autonomous Agents
  • 32. Pattern2: Autonomous Agents
  • 33. Pattern2: Autonomous Agents
  • 34. Pattern2: Autonomous Agents Still Doesn’t Scale
  • 35. Pattern3: The Rollup Pattern
  • 36. Pattern3: The Rollup Pattern
  • 37. Pattern3: The Rollup Pattern
  • 38. Pattern3: The Rollup Pattern
  • 39. Pattern3: The Rollup Pattern
  • 40. Pattern3: The Rollup Pattern
  • 41. Pattern3: The Rollup Pattern
  • 42. Pattern3: The Rollup Pattern
  • 43. Pattern3: The Rollup Pattern
  • 44. Pattern3: The Rollup Pattern gmond gmetad
  • 45. Pattern3: The Rollup Pattern gmond gmetad
  • 46. Pattern3: The Rollup Pattern gmond gmetad
  • 47. Pattern3: The Rollup Pattern
  • 48. Pattern3: The Rollup Pattern Complicated, and therefore Hard to reason about, and therefore Hard to Maintain
  • 49. —>
  • 50. That’s a server right? —>
  • 51. That’s a thread —>
  • 52. Auto-Scaling Group 1b-17 east instance-eaa6238ad5ff00 instance-d3b07384d11 instance-3edec49
  • 53. instance-d3b07384d11 instance-3edec49 ZOMG IT’S 4:00 ON FRIDAY! FINISH ALL THE WORKS!!
  • 54. That already doesn’t exist anymore —>
  • 55. Pattern 4: Emitter/Reporter
  • 56. Instrumentation as code
  • 57. output, _ := daves.function(input)
  • 58. metric.time( output, _ := daves.function(input))
  • 59. func handleInput(inBytes *[]byte, typ string) { if typ == "A" { handleTypeA(inBytes) } else if typ == "B" { handleTypeB(inBytes) } else if typ == "C" { handleTypeC(inBytes) } }
  • 60. func handleInput(inBytes *[]byte, typ string) { if typ == "A" { aCounter.Inc(1); handleTypeA(inBytes) } else if typ == "B" { bCounter.Inc(1); handleTypeB(inBytes) } else if typ == "C" { cCounter.Inc(1); handleTypeC(inBytes) } }
  • 61. rrdtool create target.rrd --start 1023654125 --step 300 DS:mem:GAUGE:600:0:671744 RRA:AVERAGE:0.5:12:24 RRA:AVERAGE:0.5:288:31
  • 62. BOB.SOMEWHERE.COM
  • 63. THROW METRICS HERE
  • 64. THROW METRICS HERE
  • 65. http://github.com/shawn-sterling/graphios
  • 66. Carbon Whisper Graphite
  • 67. Carbon 2003/TCP
  • 68. Carbon 2003/TCP thing.cpu.load .021 1412366606 format: <metric path>SPACE<metric value>SPACE<metric timestamp>
  • 69. Carbon Whisper 2003/TCP 1412366606
  • 70. Whisper 1412366606
  • 71. Whisper 1412366606
  • 72. Whisper 1412366606
  • 73. Whisper 1412366606
  • 74. 1412366606
  • 75. Whisper 1412366606 http://play.grafana.org
  • 76. Woes
  • 77. Woes
  • 78. HTTP POST Carbon JSON+UDP
  • 79. LevelDB, RocksDB, HyperLevelDB, or LMDB
  • 80. select value from response_times where time > '2013-08-12 23:32:01.232' and time < '2013-08-13';
  • 81. 1 second, 1 year 250MB
  • 82. 1 second, 1 year 25GB 100 Measurements
  • 83. 1 second, 1 year 2.5TB 100 Measurements 100 servers
  • 84. 2.5TB
  • 85. [Figure: timeline marked at 10 seconds, 1 minute, and 2 minutes]
  • 86. 1-minute resolution data (stored for 1 year); 10-second resolution data (stored for 24 hours)
  • 87. 1-minute, 10cents/month 5-minute, 5cents/month Whitelist metrics here
  • 88. Questions?
