First off, realtime websites are hard. The current toolset is rudimentary. When I first started building Ginger, I thought I must be doing it wrong because of all the trial-and-error and pieces I was building from scratch. After watching Geoff Schmidt’s keynote at DjangoCon, I realized it’s not that I’m dumb, it’s that everyone is reinventing their own wheel. His co-creation, Meteor, promises to fix that, but until then, we’re stuck with the tools we have.

One thing nearly every realtime site needs is a replay log. Public networks are inherently unstable (doubly so for mobile). If a client disconnects for a short period of time, users will expect to receive any data they missed when they reconnect. That's where a replay log comes in. It keeps a log of all activity on a channel and, when a client reconnects, it streams back whatever the client is missing.

The architecture of the solution is platform-agnostic (JavaScript, Python, Ruby), but I’ll use Python and JavaScript in my examples because that’s what we use.

Step 1: Create the Replay Logs

Most realtime sites consist of a set of channels that users subscribe to and can push/pull data through. We want to maintain a replay log for each of those channels. The easiest way to do this is to put a thin wrapper around the code that publishes to the channel. For us, that looks something like this (Python):

import json
import time
import redis

db = redis.Redis()

def broadcast(channel, event_name, content):
    """
    Push relevant details to Redis pub/sub channel 
    and save to replay log.
    """
    timestamp = int(time.time() * 1000)
    # flatten data to put on queue
    flattened = json.dumps({'event_name': event_name, 'content': content, 
                            'timestamp': timestamp})
    # Notify realtime server that new message is ready.
    db.publish("mq:{0}".format(channel), flattened)
    # Save message to the replay log, scored by timestamp
    # (redis-py 3.x expects a {member: score} mapping)
    db.zadd("replay:{0}".format(channel), {flattened: timestamp})

While this probably isn’t code you can copy/paste into your site, you should get the idea. The last line stores each message in the channel’s replay log: a sorted set ordered by timestamp (this will be important later).
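To make the ordering concrete, here’s a minimal in-memory stand-in for the sorted set. This is illustrative only — the real log lives in Redis, and `zadd_local` is a hypothetical helper:

```python
import bisect

def zadd_local(log, member, score):
    """In-memory sketch of what ZADD does here: insert the flattened
    message keyed by its millisecond timestamp, keeping the set ordered
    by score. Illustrative only -- the real log lives in Redis."""
    bisect.insort(log, (score, member))

log = []
zadd_local(log, '{"event_name": "late"}', 2000)
zadd_local(log, '{"event_name": "early"}', 1000)
print([member for score, member in log])
# entries come back ordered by timestamp, not insert order
```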

Step 2: Unspool a Replay Log on Reconnect

To keep this short, I’ll leave detecting disconnects/reconnects as an exercise for the reader (let me know in the comments if you want another post). The client should keep track of the timestamps the server sends down. When it reconnects, it sends a message to the server saying, “The last message I got had this timestamp. Is there anything I missed?” The server checks the replay log for each channel the client is subscribed to and sends down anything with a newer timestamp (JavaScript/Node.js):

// db is a Redis connection
// pubsub is a Redis pub/sub channel that streams data to the client
'unspool': function (channel, since) {
    // given a channel and timestamp,
    // check the replay log for newer items and send if they exist
    var replayLog = 'replay:' + channel;
    db.zrangebyscore(replayLog, '(' + since, '+inf', function (err, messages) {
        if (err) { return; }
        // '(' makes the lower bound exclusive, so the client's last
        // message isn't resent
        messages.forEach(function (message) {
            pubsub.emit("message", channel, message);
        });
    });
}

If you aren’t familiar with zrangebyscore, read the docs. It does exactly what we need here (remember we set the score as the timestamp in Step 1).
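For completeness, here’s a sketch of the client-side bookkeeping that produces the `since` value. Our real client is JavaScript; this Python version, with the hypothetical `ReconnectTracker` name, just shows the logic: remember the newest timestamp seen, then send it back on reconnect.

```python
import json

class ReconnectTracker:
    """Client-side bookkeeping (our real client is JavaScript; this class
    and its names are hypothetical). Remembers the newest timestamp seen
    so the client can ask for anything newer after a reconnect."""

    def __init__(self):
        self.last_seen = 0

    def on_message(self, raw):
        # every broadcast carries the timestamp we stored in Step 1
        msg = json.loads(raw)
        self.last_seen = max(self.last_seen, msg['timestamp'])

    def unspool_request(self, channel):
        # payload sent to the server's 'unspool' handler on reconnect
        return {'action': 'unspool', 'channel': channel,
                'since': self.last_seen}

tracker = ReconnectTracker()
tracker.on_message('{"event_name": "chat", "content": "hi", "timestamp": 1500}')
print(tracker.unspool_request('lobby'))
# -> {'action': 'unspool', 'channel': 'lobby', 'since': 1500}
```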

Step 3: Prune the Replay Logs

Now that we’ve got it working, we need to make sure the replay logs don’t grow unbounded. We have a scheduled task that runs and removes everything that is older than a certain limit. In Python, that looks like this:

def prune_replay_logs():
    """
    Iterates over all the replay logs, deleting all items that are older
    than the expiration settings
    """
    # look up all replay logs by wildcard; SCAN avoids blocking
    # the server the way KEYS can on large keyspaces
    log_keys = db.scan_iter('replay:*')
    timestamp = int(time.time() * 1000)
    # create timestamp of oldest item to keep
    oldest = timestamp - settings.EXPIRE_REPLAY_LOGS_MS
    # for each log, delete everything older
    for key in log_keys:
        db.zremrangebyscore(key, 0, oldest)
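The cutoff arithmetic is easy to get backwards, so here’s an in-memory mirror of the prune step. The `prune` helper and the one-day `EXPIRE_REPLAY_LOGS_MS` value are illustrative assumptions:

```python
import time

EXPIRE_REPLAY_LOGS_MS = 24 * 60 * 60 * 1000  # assumed retention: one day

def prune(entries, now_ms=None):
    """In-memory mirror of ZREMRANGEBYSCORE key 0 oldest: drop every
    (timestamp, message) entry scored at or below the cutoff."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    oldest = now_ms - EXPIRE_REPLAY_LOGS_MS
    return [(ts, msg) for ts, msg in entries if ts > oldest]

print(prune([(1, 'stale'), (9999, 'fresh')], now_ms=EXPIRE_REPLAY_LOGS_MS + 100))
# only 'fresh' survives: its timestamp is newer than the cutoff
```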

Done!

Step back and revel in the awesomeness of Redis. Sorted sets provide an elegant way to bolt replay logs onto an existing site. I have a feeling a lot of people have built similar solutions, but my searches failed to turn up any details.

Already have a solution for reconnecting realtime clients? I’d love to hear about it in the comments!