As you start scaling an application out horizontally (adding more servers/instances), you may run into a problem that requires distributed locking. That’s a fancy term, but the concept is simple: sometimes you have to be sure that while one instance is running a block of code (usually one that modifies data somewhere), no other instance runs that same block. Ideally, you can design your code so it doesn’t require locks, but sometimes that’s unavoidable.
In general, locks are used when you need to modify state (e.g. the database) in an atomic manner. Some examples of when you might need a distributed lock:
- Cron jobs/scheduled tasks whose runtime may exceed the interval with which they are triggered
- Flushing a write-back cache to the database
- Bulk processing files
If running the code on multiple servers simultaneously would result in corrupted or duplicate data, you probably need to use a distributed lock. The code in question will acquire the lock before execution. Once it has the lock, any other attempt to acquire it will fail.
Implementations
Lock data must be stored in a location accessible to all application instances. If this is a standard Django site, you probably already have two such systems available to you: the cache and the database. The tricky part about locking is that acquiring the lock must be atomic to avoid a race condition when two processes try to acquire it simultaneously. Thankfully, this is already a solved problem in the standard cache and database backends used by Django. For each of these backends, you can find third-party libraries that expose the functionality in Python.
Redis
SETNX is the Redis primitive that is used for locking. The django-redis package provides a context manager for this functionality.
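A minimal sketch of how that might look, assuming django-redis is configured as Django’s default cache backend; the lock name and timeout here are arbitrary choices:

```python
# A minimal sketch, assuming django-redis is the configured default cache
# backend; the lock name and timeout values are arbitrary choices.
from django.core.cache import cache

def flush_writeback_cache():
    # cache.lock() is django-redis' locking context manager; the timeout
    # ensures the lock expires even if this process crashes mid-run.
    with cache.lock("flush-writeback-cache", timeout=60):
        ...  # code that must only run on one instance at a time
```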
Memcached
Memcached’s ADD works similarly to Redis’ SETNX. I don’t have experience with any libraries that implement a context manager around this, but both django-cache-lock and sherlock appear to provide it.
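If you’d rather not pull in a dependency, a hand-rolled sketch using Django’s own cache.add() (which maps to memcached’s atomic ADD) can serve as a crude lock; the key name and timeout below are illustrative:

```python
# A hand-rolled sketch (not django-cache-lock or sherlock) built on Django's
# cache.add(), which maps to memcached's atomic ADD; names are illustrative.
from django.core.cache import cache

LOCK_KEY = "bulk-file-import-lock"
LOCK_TIMEOUT = 60 * 10  # seconds; longer than the job should ever take

def process_files():
    # add() only succeeds if the key doesn't already exist, so only one
    # instance can win the "lock" at a time.
    if not cache.add(LOCK_KEY, "locked", LOCK_TIMEOUT):
        return  # another instance holds the lock
    try:
        ...  # do the work
    finally:
        cache.delete(LOCK_KEY)
```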
Postgres
Postgres has a pg_advisory_lock function which is utilized by django-pglocks.
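A short sketch of the basic usage, assuming django-pglocks’ advisory_lock context manager; the lock name is our own choice, so check the library’s docs for the exact options it supports:

```python
# A short sketch assuming django-pglocks; the lock name is illustrative.
from django_pglocks import advisory_lock

def generate_nightly_report():
    # Blocks until the Postgres advisory lock is acquired, then releases
    # it when the block exits.
    with advisory_lock("generate-nightly-report"):
        ...  # only one instance runs this at a time
```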
MySQL
MySQL provides a GET_LOCK function for distributed locks. It is exposed via a context manager in the django-mysql library.
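A short sketch assuming django-mysql’s Lock context manager; the lock name and acquire_timeout shown are illustrative, so verify the exact arguments against the library’s documentation:

```python
# A short sketch assuming the django-mysql library; the lock name and
# acquire_timeout are illustrative choices.
from django_mysql.locks import Lock

def send_invoices():
    # Wraps MySQL's GET_LOCK/RELEASE_LOCK; waits up to acquire_timeout
    # seconds before giving up.
    with Lock("send-invoices", acquire_timeout=2.0):
        ...  # only one instance runs this at a time
```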
Additional Considerations
Timeouts
While context managers should release the lock when they exit, there’s always the possibility that your application crashes before that can happen. Without a timeout on the lock, it would be held forever and prevent any similar code from running. Refer to the docs of the library you choose to see how to specify a timeout. Set this value to something longer than the code should ever take to execute, but short enough that it doesn’t prevent successive runs from executing.
Encountering a Lock
What your application does when it encounters a lock that has already been acquired is going to be specific to its requirements. Most implementations will throw an exception if the lock can’t be acquired, so you’ll usually wrap this code in a try/except (a sketch follows the list below). Possibilities for what to do in the exception case include:
- Retry later
- Wait for the lock to be released
- Log an error
- Do nothing
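As one example, here is a hedged sketch of the “log an error” option, reusing the django-redis lock from earlier; the LockError exception and blocking_timeout argument reflect redis-py’s lock interface and may differ in other libraries:

```python
# A sketch of skipping the run and logging when the lock is already held.
# Assumes django-redis; LockError and blocking_timeout come from redis-py's
# lock interface and may differ in other libraries.
import logging

from django.core.cache import cache
from redis.exceptions import LockError

logger = logging.getLogger(__name__)

def sync_orders():
    try:
        # blocking_timeout=0: give up immediately instead of waiting
        with cache.lock("sync-orders", timeout=300, blocking_timeout=0):
            ...  # the critical section
    except LockError:
        # Another instance is already running; log and move on
        logger.warning("sync_orders skipped: lock already held")
```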
Database Row Locking
If your code needs exclusive access to modify specific rows in the database and you want to prevent any other modification of those rows while the code executes, database row locking may be a better option. Django provides functionality for this with the select_for_update queryset method.
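A minimal sketch of that approach; the Invoice model and its fields are hypothetical stand-ins for your own models:

```python
# A minimal sketch of row locking with select_for_update(); the Invoice
# model and its fields are hypothetical.
from django.db import transaction

from myapp.models import Invoice  # hypothetical model

def mark_invoices_sent(invoice_ids):
    # select_for_update() must run inside a transaction; the matched rows
    # stay locked against concurrent updates until the transaction ends.
    with transaction.atomic():
        invoices = Invoice.objects.select_for_update().filter(pk__in=invoice_ids)
        for invoice in invoices:
            invoice.status = "sent"
            invoice.save(update_fields=["status"])
```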
Distributed locking sounds like a difficult technical issue, but in general, it is a solved problem. Any Django app should be able to grab a mature off-the-shelf solution. The only problems left to solve are identifying the code that requires a lock and deciding what should happen when a lock is encountered.