
Logging to Systemd's Journal in Python

I’ll be the first to admit that I was skeptical about Systemd and its companion log aggregator Journald, but after working with it for a few months, I’ve seen the light (or drunk the Kool-Aid… not sure).


Typically I log everything to flat log files on the disk. It’s safe and “grep-able”. If I’m using a log aggregator, I’ll configure something like filebeat to tail the flat files and ship them to the log aggregator.

Systemd’s journal is a totally different beast. It stores logs in a binary format that is not readable without going through the journalctl command first. In exchange, it gives you a system-wide log aggregator with rich querying abilities. This article on Digital Ocean’s blog is a good intro to its usage and capabilities.

A Note on Structured Logs

To get the most out of logging to the journal (or any log aggregator), you want to send it structured logs. Before, we might send a log line like this:

python
log.info('User %s successfully completed transaction %s.', 
          user.pk, transaction.pk)

While this is certainly a helpful log message, it is difficult to filter on. What if I want to see all the logs for this user? I probably need to be a grep/awk wizard to match not only this message, but others that have slightly different formats. This is where structured logs can really shine. A similar log line in a more structured format might look like this:

python
log.info('transaction completed', extra={
    'USER': user.pk,
    'STATUS': 'success',
    'TRANSACTION': transaction.pk
})
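Under the hood, Python’s standard logging module attaches each key in `extra` to the LogRecord as an attribute; that is what a journal handler reads to build structured fields. Here is a quick self-contained sketch using a throwaway in-memory handler (not the journal) to illustrate the mechanism:

```python
import logging

# Toy handler that just captures records so we can inspect them;
# a journal handler would forward these attributes as journal fields.
class CapturingHandler(logging.Handler):
    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)

log = logging.getLogger('extra-demo')
log.setLevel(logging.INFO)
capture = CapturingHandler()
log.addHandler(capture)

log.info('transaction completed', extra={
    'USER': 59032,
    'STATUS': 'success',
    'TRANSACTION': 1234,
})

record = capture.records[0]
# Each key from `extra` is now an attribute on the record:
print(record.USER, record.STATUS, record.TRANSACTION)  # 59032 success 1234
```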

Once entries with this data are pushed to the journal, they can easily be filtered. For example, to find all logs for the user with ID 59032:

bash
# journalctl USER=59032

Note: Your extra dictionary keys must be in all caps to be captured and filtered in the journal.

Sending Python Logs to the Journal

If your app is running under Systemd, anything it outputs to the console will be picked up and logged to the journal. It is only logged as plain-text, however, so you lose a lot of the nice filtering capabilities of the journal. Luckily, the friendly developers of Systemd already have this scenario covered for us. They publish a Python library which includes a log handler to push structured logs to the journal. It can be installed via:

pip install systemd-python

Note: It uses C extensions and will require the Systemd development headers to build. On recent Ubuntu releases, install the libsystemd-dev package.

Once installed, you need to set up the handler in your code. You probably don’t want it to always log to the journal, but only when it is running as a Systemd service. To accomplish that, I set a special environment variable in the systemd service definition, then use that to extend our standard Django logging config:

python
import os  # this snippet assumes it lives in your Django settings module

if 'SYSLOG_IDENTIFIER' in os.environ:
    try:
        # verify the systemd library is installed
        from systemd import journal
        LOGGING['formatters']['default'] = {
            'format': '%(message)s',
        }
        LOGGING['handlers']['console'] = {
            'class': 'systemd.journal.JournalHandler',
            'SYSLOG_IDENTIFIER': os.environ['SYSLOG_IDENTIFIER'],
        }
    except ImportError:
        pass
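For reference, the service-definition side of that environment variable might look like the following (a hypothetical unit file; the myapp name and ExecStart path are placeholders, not from this article):

ini
# /etc/systemd/system/myapp.service (hypothetical example)
[Service]
Environment=SYSLOG_IDENTIFIER=myapp
ExecStart=/usr/bin/python /srv/myapp/manage.py runserver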

The SYSLOG_IDENTIFIER will be applied to every journal entry and can help us with filtering as noted above.
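Outside of Django, the same idea can be wired up directly on a logger. A minimal sketch, assuming the same SYSLOG_IDENTIFIER environment-variable convention (the 'myapp' logger name is a placeholder); it falls back to plain console logging when the systemd library or the variable is absent:

```python
import logging
import os

log = logging.getLogger('myapp')  # 'myapp' is a placeholder name
log.setLevel(logging.INFO)

handler = logging.StreamHandler()  # default: plain console logging
if 'SYSLOG_IDENTIFIER' in os.environ:
    try:
        from systemd import journal
        # JournalHandler forwards `extra` keys as journal fields
        handler = journal.JournalHandler(
            SYSLOG_IDENTIFIER=os.environ['SYSLOG_IDENTIFIER'])
    except ImportError:
        pass  # systemd-python not installed; keep the console handler

log.addHandler(handler)
log.info('transaction completed', extra={'USER': 59032, 'STATUS': 'success'})
```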

Is it Better?

There are some clear benefits to sending your logs to the journal:

  • Aggregation of logs from multiple services
  • Rich filtering, e.g. show me all logs that happened in a five-minute window
  • Never worrying about log rotation, logs filling up the disk, etc.

That being said, it has its drawbacks as well:

  • Requires extra application configuration and an additional Python library (which must be compiled)
  • Binary storage format requires going through the journal for access
  • External tooling is slow to catch up. Log aggregators (shipping logs across multiple hosts) have limited support for pulling logs from the journal.

What do you think? Do you log your services to the journal, files, or something else? Are you using structured logs? This is a topic that doesn’t get a ton of attention in the Python/Django community, so I’d like to hear what other shops are doing.

Peter Baumgartner

About the author


Peter is the founder of Lincoln Loop, having built it up from a small freelance operation in 2007 to what it is today. He is constantly learning and is well-versed in many technical disciplines including devops, …