NATS - A new nervous system for distributed cloud platforms

NATS is an open-source, high-performance, lightweight cloud messaging system. NATS was created by Derek Collison, Founder/CEO of Apcera who has spent 20+ years designing, building, and using publish-subscribe messaging systems. Unlike traditional enterprise messaging systems, NATS has an always-on dial tone that does whatever it takes to remain available. This forms a great base for building modern, reliable, and scalable cloud and distributed systems.
Published on: Mar 3, 2016
Published in: Technology      
Source: www.slideshare.net


Transcripts - NATS - A new nervous system for distributed cloud platforms

  • 1. NATS A nervous system for modern distributed systems
  • 2. Derek Collison @derekcollison https://github.com/derekcollison derek@apcera.com derek.collison@gmail.com
  • 3. Background • Good Performance is good •Predictably Good Performance is king! •Measure everything (can’t fix what you don’t know) •Understand your data •Understand your user experience • Don’t be a failure of your own success Why Even Listen to Me?
  • 4. Derek Collison Google 6yrs TIBCO > 10yrs Architected TIBCO Rendezvous and EMS Architected the OpenPaaS CloudFoundry Building Messaging Systems and Solutions > 20yrs
  • 5. Why Messaging?
  • 6. Background •MicroServices Architectures •Event-Driven Architectures •HTTP as an interface only goes so far •1:N / 1:1 of N Patterns •Cascading Request/Reply •Subject/Topic based routing
  • 7. a brief Network Recap
  • 8. Networks •IP: TCP and UDP •Streaming vs limited packet size and unreliability •Effective 1:N -> UDP Broadcast / Multicast •Late 90s TCP becomes only fast-path option
  • 9. Networks •Multicast has too much admin, failed •Multicast trunked or disallowed •UDP BC TOR trunked in most Cloud Platforms
  • 10. Messaging
  • 11. Basic Messaging Patterns ✓Publish-Subscribe ✓Queuing ✓Request-Reply
  • 12. Messaging - Publish Subscribe 1 : N Publisher Subscriber Subscriber Subscriber Subject
  • 13. Messaging - Queuing 1 : 1 Publisher Queue Subscriber Subscriber Message #1 Subscriber
  • 14. Messaging - Queuing 1 : 1 Publisher Queue Subscriber Subscriber Message #2 Subscriber
  • 15. Messaging - Queuing 1 : 1 Publisher Queue Subscriber Subscriber Message #3 Subscriber
  • 16. Messaging - Request Reply 1 : 1 Publisher Reply Subscriber Subscriber Subscriber Subject
  • 17. Messaging - Request Reply 1 : N Reply Publisher Subscriber Subscriber Subscriber Subject
  • 18. Messaging Use Cases ✓Addressing, discovery ✓Command and control - Control Plane ✓Load-balancing ✓N-way scalability ✓Location Transparency ✓Fault-Tolerance
  • 19. Why Pub-Sub?
  • 20. Publish-Subscribe ✓A radio vs a phone call ✓E.g. Wall Street quote distribution ✓programmatic trading ✓fairness and delivery embargo ✓Don’t assume the Audience!
  • 21. Queueing
  • 22. Queueing Publish or Subscribe operation?
  • 23. Queueing Publish is Store and Forward
  • 24. Queueing Subscribe is distributed queueing
  • 25. Request- Reply
  • 26. Request-Reply ✓Don’t assume audience! ✓How many responders? ✓Always built on Publish-Subscribe
  • 27. Enterprise Messaging Patterns ✓Persistence ✓Store & Forward ✓Distributed Transactions ✓Enhanced Delivery Models
  • 28. Delivery
  • 29. Delivery Models ✓At Most Once ✓At Least Once ✓Exactly Once
  • 30. Delivery Models Exactly Once is very HARD!
  • 31. If you do it Correctly
  • 32. What if we looked at the problem differently?
  • 33. Should it do everything?
  • 34. OR..
  • 35. Should it do much less?
  • 36. NATS nats.io
  • 37. the Inspiration
  • 39. What is NATS?
  • 40. What NATS is.. ✓High-Performance ✓Always on and available ✓Extremely light-weight ✓Fire and Forget - At Most Once ✓Pub/Sub ✓Distributed Queues ✓Request/Reply
  • 41. What is NATS NOT?
  • 42. What NATS is NOT.. ✓Enterprise Messaging System ✓Persistence ✓Transactions ✓Enhanced Delivery Models ✓Queueing Product
  • 43. Disclaimer! I built NATS for myself!
  • 44. What’s Unique?
  • 45. What is Unique? ✓Clustered mode server ✓Cluster aware clients ✓Go, Node.js, Java, Scala, Python, Ruby ✓Auto-pruning of interest graph ✓Always Pub/Sub, NO Assumptions ✓Distributed queueing across clusters ✓Text-based protocol
  • 46. Performance
  • 47. Performance • Originally written to support CloudFoundry • In use by CloudFoundry, HTC, Baidu, Apcera and others • Written first in Ruby -> 150k msgs/sec • Rewritten at Apcera in Go (Client and Server) • First pass -> 500k msgs/sec • Current Performance -> 5-6m msgs/sec
  • 48. Performance 4k payloads Courtesy - http://www.bravenewgeek.com/dissecting-message-queues/
  • 49. Demo
  • 50. More Info slideshare.net/derekcollison/gophercon-2014
  • 51. Text-Based?
  • 52. Text-Based Protocol ✓Easy to get started with new clients ✓Does not affect performance ✓Can telnet directly to server
  • 53. Demo telnet demo.nats.io 4222
  • 54. Monitoring
  • 55. Monitoring ✓HTTP based monitoring ✓Modeled off of /varz in Google ✓Simple JSON payloads
  • 56. Demo curl demo.nats.io:8222/varz curl demo.nats.io:8222/connz
  • 57. Clients
  • 58. Clients ✓Go ✓Node.js ✓Java/Scala ✓Ruby ✓Python
  • 59. Clustered
  • 60. Clustering Client Connection GNATSD GNATSD GNATSD
  • 61. Clustering Client Connection GNATSD GNATSD GNATSD
  • 62. Clustering Client Connection GNATSD GNATSD GNATSD X
  • 63. Clustering Client Connection GNATSD GNATSD GNATSD X
  • 64. Auto-Pruning
  • 65. Big DEAL! (to me)
  • 66. Why?
  • 67. 1:1 of large N (think Google)
  • 68. Auto-Pruning ✓Able to express limited interest a priori ✓System uses circuit breakers ✓1:1 Requests to large N is very efficient! ✓Easily accessible in protocols ✓All clients support in Request/Reply
  • 69. Summary
  • 70. Summary ✓Modeled to be always-on dial-tone ✓Always available - NATS protects itself ✓High-Performance server ✓Clustered Servers / Cluster aware Clients ✓Clients in many languages, contribute!
  • 71. Futures
  • 72. Futures ✓NGINX C++ client to OSS ✓Performance gains in server and clients ✓C/C++, Lua clients ✓Monitoring dashboards ✓Auto-configuration service
  • 73. Thanks!
  • 74. Resources https://nats.io https://registry.hub.docker.com/u/apcera/gnatsd/ https://github.com/apcera/gnatsd http://www.slideshare.net/derekcollison/gophercon-2014
  • 75. Questions?
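
The three basic patterns in the transcript (publish-subscribe as 1:N, queueing as 1:1 of N, and request-reply built on publish-subscribe) can be illustrated with a toy in-process broker. This is only a sketch under stated assumptions: `MiniBroker`, its method names, the round-robin choice within a queue group, and the `_INBOX.<n>` reply subjects are hypothetical names invented for this illustration. It is not the gnatsd server and not the API of any NATS client library.

```python
import itertools
from collections import defaultdict


class MiniBroker:
    """Toy in-process broker for illustration; not the NATS server or a client API."""

    def __init__(self):
        # subject -> list of (queue_group_or_None, callback)
        self._subs = defaultdict(list)
        # (subject, queue_group) -> counter used to rotate through group members
        self._turns = defaultdict(itertools.count)
        self._inbox_ids = itertools.count(1)

    def subscribe(self, subject, callback, queue=None):
        self._subs[subject].append((queue, callback))

    def publish(self, subject, data, reply=None):
        # Fire and forget: deliver to current subscribers only, store nothing.
        groups = defaultdict(list)
        for queue, callback in list(self._subs[subject]):
            if queue is None:
                callback(data, reply)        # plain subscriber: 1:N fan-out
            else:
                groups[queue].append(callback)
        for queue, members in groups.items():
            # Queue group: one member per message (1:1 of N).
            turn = next(self._turns[(subject, queue)])
            members[turn % len(members)](data, reply)

    def request(self, subject, data):
        # Request-reply layered on publish-subscribe via a unique reply subject.
        inbox = f"_INBOX.{next(self._inbox_ids)}"
        replies = []
        self.subscribe(inbox, lambda msg, _reply: replies.append(msg))
        self.publish(subject, data, reply=inbox)
        del self._subs[inbox]
        return replies


broker = MiniBroker()

# Publish-subscribe: every subscriber on the subject sees the message.
fanout = []
broker.subscribe("quotes", lambda m, r: fanout.append(("a", m)))
broker.subscribe("quotes", lambda m, r: fanout.append(("b", m)))
broker.publish("quotes", "tick")

# Queueing: subscribers in one queue group share the message stream.
handled = []
for name in ("w1", "w2", "w3"):
    broker.subscribe("jobs", lambda m, r, name=name: handled.append((name, m)),
                     queue="workers")
for n in (1, 2, 3):
    broker.publish("jobs", f"Message #{n}")

# Request-reply: the requester does not assume how many responders exist.
for who in ("a", "b"):
    broker.subscribe("status", lambda m, r, who=who: broker.publish(r, f"{who} ok"))
replies = broker.request("status", "ping")
```

Note that the queueing behaviour lives entirely on the subscribe side: the publisher calls the same `publish` in all three cases, which matches the "Subscribe is distributed queueing" and "Always built on Publish-Subscribe" points in the slides.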

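The "Text-Based Protocol" slides say a new client is easy to start and that one can telnet directly to the server. A minimal sketch of what such a client has to frame is shown below. The frame layouts (`PUB <subject> [reply-to] <#bytes>`, `SUB <subject> [queue group] <sid>`, `MSG <subject> <sid> [reply-to] <#bytes>`, each terminated by CRLF) come from the published NATS client protocol, not from the slides, and the helper names `encode_pub`, `encode_sub`, and `parse_msg` are hypothetical.

```python
CRLF = b"\r\n"


def encode_pub(subject, payload, reply=None):
    # PUB <subject> [reply-to] <#bytes> CRLF <payload> CRLF
    fields = ["PUB", subject]
    if reply:
        fields.append(reply)
    fields.append(str(len(payload)))
    return " ".join(fields).encode() + CRLF + payload + CRLF


def encode_sub(subject, sid, queue=None):
    # SUB <subject> [queue group] <sid> CRLF
    fields = ["SUB", subject]
    if queue:
        fields.append(queue)
    fields.append(str(sid))
    return " ".join(fields).encode() + CRLF


def parse_msg(frame):
    # MSG <subject> <sid> [reply-to] <#bytes> CRLF <payload> CRLF
    head, sep, rest = frame.partition(CRLF)
    fields = head.decode().split()
    if not sep or len(fields) not in (4, 5) or fields[0] != "MSG":
        raise ValueError("not a complete MSG frame")
    subject, sid = fields[1], fields[2]
    reply = fields[3] if len(fields) == 5 else None
    size = int(fields[-1])
    payload = rest[:size]
    if len(payload) != size or rest[size:size + 2] != CRLF:
        raise ValueError("payload length does not match header")
    return subject, sid, reply, payload
```

Because the payload length is carried in the header line, the payload itself can hold arbitrary bytes, including CRLF, while the control lines stay readable in a telnet session.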