Over the years, I’ve used a lot of server monitoring systems: big enterprisey ones like Zabbix, Zenoss, and Hyperic; smaller ones like Munin and Monit; stuff in the middle like Graphite; and hosted solutions like New Relic. Throughout the search, I never found one that hit the sweet spot for me. They were either too complicated, requiring too much setup up front, or too limited in what they had to offer. Like any good developer frustrated with the tools available, I set out to build my own.
The result (still in its infancy) is Salmon. Its aim is to be a simple yet powerful (enough) server monitoring and alerting system. Salmon itself is a simple project, but it takes advantage of some great open source libraries for much of its advanced functionality.
Salt
Salt is the new kid on the block for server configuration management (think Chef and Puppet). Billing it as a configuration management tool is underselling it though. It’s really a remote execution framework. I hadn’t wrapped my head around this until a chat with a few of the guys from SaltStack at PyCon this year. Salt is a perfect tool for stats collection on remote servers. It offers a lot right out of the box:
1. It is easily installable on multiple platforms (including Windows). We basically get a monitoring agent (aka “salt-minion”) for free.
2. It uses an efficient, encrypted transport via ZeroMQ.
3. It can be configured to provide limited access to unprivileged users.
4. We’re already using it on our servers to handle configuration management.
A simple cron job can shell out to the Salt master and collect much of the data we need with Salt’s built-in modules.
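As a rough sketch of what that cron job might run (not Salmon's actual code), the snippet below shells out to the `salt` CLI and parses its JSON output. The helper names (`build_salt_command`, `collect`) are made up for illustration, and it assumes that `--out=json --static` yields a single JSON object keyed by minion ID.

```python
import json
import subprocess


def build_salt_command(target, function, args=()):
    """Build the argv for a `salt` CLI call that prints JSON."""
    # Assumption: --static makes salt emit one JSON document for all minions.
    return ["salt", "--out=json", "--static", target, function, *args]


def collect(target, function, args=(), runner=subprocess.check_output):
    """Run a Salt execution module and return {minion_id: result}.

    `runner` is injectable so the parsing can be exercised without a
    Salt master; by default it really shells out.
    """
    raw = runner(build_salt_command(target, function, args))
    return json.loads(raw)


if __name__ == "__main__":
    # Requires a Salt master on this host, e.g. load averages for all minions.
    for minion, result in collect("*", "status.loadavg").items():
        print(minion, result)
```

Keeping the command builder separate from the runner makes it easy to swap in a fake runner for tests, or to move to Salt's Python client later.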
Whisper Database
Whisper is a component from Graphite that is very good at storing historical data in a fixed size database (similar to RRDtool). It allows us to track metrics over time and capture slices of time for graphing on the front-end web interface.
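To make the "fixed size" idea concrete, here is a toy round-robin archive in plain Python. It is only an illustration of the concept, not Whisper's file format or API; the class name and its methods are hypothetical, and timestamps are assumed to be integer seconds.

```python
class RoundRobinArchive:
    """Toy fixed-size time series: new points overwrite the oldest slots."""

    def __init__(self, seconds_per_point, points):
        self.step = seconds_per_point
        self.points = points
        # Storage never grows: exactly `points` slots, allocated up front.
        self.slots = [None] * points

    def update(self, timestamp, value):
        """Store a value in the slot that owns this timestamp's interval."""
        aligned = timestamp - (timestamp % self.step)
        self.slots[(aligned // self.step) % self.points] = (aligned, value)

    def fetch(self, start, end):
        """Return [(timestamp, value_or_None)] for each step in [start, end)."""
        first = start - (start % self.step)
        result = []
        for ts in range(first, end, self.step):
            slot = self.slots[(ts // self.step) % self.points]
            # A slot holding a different timestamp has been overwritten
            # (or was never written), so that point is reported as missing.
            value = slot[1] if slot and slot[0] == ts else None
            result.append((ts, value))
        return result


archive = RoundRobinArchive(seconds_per_point=60, points=3)
for ts, load in [(0, 1.0), (60, 2.0), (120, 3.0), (180, 4.0)]:
    archive.update(ts, load)
print(archive.fetch(0, 240))
```

With three slots at one-minute resolution, the fourth update lands in the slot that held the first point, so the oldest value is gone while the most recent three minutes remain available for graphing.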
Django
Django powers the web interface of Salmon. At the moment, it’s a single app with two views: current system status and historical view of a single server. A management command handles collecting the data (via Salt) and storing it to the database (Whisper for historical data, SQLite for current data). It also handles emailing alerts when the results don’t pass a user-defined check.
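The alerting step could look something like the sketch below. The check format (a metric name mapped to an operator and a limit) and the `failed_checks` helper are hypothetical stand-ins, not Salmon's real configuration schema; the point is only to show results being compared against user-defined thresholds.

```python
import operator

# Comparison operators a hypothetical check definition may use.
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}


def failed_checks(results, checks):
    """Return one alert message per minion value that fails its check.

    results: {minion_id: {metric_name: value}}
    checks:  {metric_name: (operator_symbol, limit)}
    """
    alerts = []
    for minion, values in sorted(results.items()):
        for name, (op, limit) in sorted(checks.items()):
            if name not in values:
                continue  # this minion did not report the metric
            value = values[name]
            if not OPERATORS[op](value, limit):
                alerts.append(f"{minion}: {name}={value} (expected {op} {limit})")
    return alerts


results = {"web1": {"load": 0.5}, "db1": {"load": 7.2}}
checks = {"load": ("<", 4.0)}
print(failed_checks(results, checks))
```

Inside a Django management command, the resulting messages could then be handed to `django.core.mail.send_mail` to notify whoever is on call.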
We want Salmon to be easy to use for developers who aren’t necessarily familiar with Python or Django, so we took inspiration from Sentry and made it a standalone Django project, using logan.
Help Wanted
I managed to rope Yann Malet into contributing and we received our first pull request shortly after release, but we’re at a place where Salmon could use more eyeballs. We use it in production and it works for basic use cases, but we’d like to support more complex use cases as well. A few ideas:
- Develop additional Salt modules similar to what is found in Munin’s contrib
- Support graphing multiple values on a single graph
- Handle data sampling to calculate things like average reads/writes per second
- Write more tests and docs
If you’re interested in Salmon, check out the GitHub repo or come say hi in #lincolnloop on Freenode.