.. raw:: latex

  \mainmatter
  \pagestyle{chfooter}
  \includepdf[pages={1}]{intro.pdf}

================
The Big Picture
================


Introduction
============

It's not uncommon to hear people say "Django doesn't scale". Depending on how
you look at it, the statement is either completely true or patently false.
Django, on its own, doesn't scale. The same can be said of Ruby on Rails, Flask,
PHP, or any other framework or language used to build a database-driven dynamic
website. The good
news, however, is that Django interacts beautifully with a suite of caching and
load balancing tools that will allow it to scale to as much traffic as you can
throw at it. Contrary to what you may have read online, it can do so without
replacing core components often labeled as "too slow" such as the database ORM
or the template layer.

.. index:: Disqus, Instagram, Pinterest

Django's scaling success stories are almost too numerous to list at this
point. It backs Disqus, Instagram, and Pinterest. Want some more proof?
Instagram was able to sustain over 30 million users on Django with only 3
engineers (2 of whom had no back-end development experience). Back in 2013, Disqus was
serving 8 *billion* page views per month. You can be certain that by the time you're
reading this, the bigger players are serving many multiples of that. Those are some
**huge** numbers. These teams have proven Django most certainly *does* scale. Our
experience here at Lincoln Loop backs it up. We've built big Django sites capable of
spending the day on the Reddit homepage without breaking a sweat.

Every site has unique needs and different pain points requiring extra attention
to operate at scale. You may be surprised, however, to learn that their general
approaches all look very similar. Perhaps even more surprising is
that many parts of this infrastructure aren't even unique to Django
applications. The techniques we'll describe are widely used across high traffic
sites of many frameworks and languages.

Our point is this: Django scales, and the tactics described in this book will
help you build sites capable of withstanding millions of page views per day
and hundreds, if not thousands, of concurrent users. We have years of
experience applying these tactics on heavily trafficked production sites.
They work for us and we're confident they will work for you too.


Philosophy
===============

.. epigraph::

  Simplicity is a prerequisite for reliability.

  -- Edsger W. Dijkstra

For our team at Lincoln Loop, the guiding philosophy in designing high-traffic
Django sites is **simplicity**. Unfortunately, undisciplined developers will always
trend towards complexity. Without making a conscious effort to fight
complexity at every turn, it is too easy to waste time building complex,
unmaintainable monstrosities that will bite you down the road.

Simplicity means:

1. Using as few moving parts as possible to make it all work. "Moving parts" may
   be servers, services or third-party software.
2. Choosing proven and dependable moving parts instead of the new hotness.
3. Using a proven and dependable architecture instead of blazing your own trail.
4. Deflecting traffic away from complex parts and toward fast, scalable, and
   simple parts.

Simple systems are easier to scale, easier to understand, and easier to
develop. Of course, any non-trivial web application will bring its own unique
set of complex problems to solve, but by keeping the rest of the stack simple,
you and your team can spend more time focusing on the product rather than on
scaling and infrastructure.


The Pain Points
===============

Django apps, and for that matter most web applications, share many common
performance characteristics. Here are the most frequent pain points we
encounter building performant web applications; they should look familiar to you.

Database
^^^^^^^^

A relational database (e.g., Postgres, MySQL) is usually the slowest and most
complex beast in the stack. One option is to replace it with a faster and less
complex "NoSQL" database, but in many cases, that pushes the complexity into
your application and squarely into the hands of your developers. We've found
it simpler to keep it down in a proven RDBMS and handle the pain via caching.

Templates
^^^^^^^^^

Templates get complex quickly. To make matters worse, Django's template engine
trades speed for simplicity and usability. We could replace it with a faster
templating engine like Jinja2, but the template layer would still be the second
slowest part of our stack. We can avoid the pain via caching.

Python
^^^^^^

Python is "fast enough" for many workloads, and its mature developer tools and
ecosystem are well worth the trade-off in raw speed. The same can be said of
just about every other mature dynamic scripting language. Still, we can serve
requests faster from a web accelerator (e.g., Varnish) that returns cached
responses before a request even gets to the Python layer.


Cache All the Things
=====================

By now you probably see where we're headed. The simplest general approach is
to cache all the way down the stack. No matter how fast and how well tuned
your stack is, it will never be as fast as a dedicated cache.

Serving the entire HTTP response directly out of cache is ideal. When it isn't
possible, as many parts of the response as possible should come from cache.
Calls to the database can be kept to a bare minimum by implementing a caching
layer there as well.

All this caching might sound like a nightmare for readers who know Phil
Karlton's famous quote,

.. epigraph::

    There are only two hard things in Computer Science: cache invalidation and
    naming things.

In the following chapters, we'll teach you safe caching techniques to ensure
your users never see stale content (unintentionally). Additionally, we'll show
you how to tune your stack so it is as fast as possible, even on a cache miss.

Why the rush to cache?
^^^^^^^^^^^^^^^^^^^^^^

Multi-layer caching lets us push the bulk of our traffic away from the more
complex and custom built software onto battle-tested, high performance, open
source software.

.. image:: img/journey.*
    :align: center
    :width: 600

At each layer, load may be distributed horizontally across multiple systems.
But the farther down the stack any given request travels, the slower and more
taxing it will be on the infrastructure. Your goal, therefore, is to serve as
much of your traffic from as high up the stack as possible.

The common players in this stack are:

* **Load Balancer:**

  * Open Source: Traefik, HAProxy, Nginx, Varnish
  * Managed: All major cloud providers offer hosted load balancing solutions

* **Web Accelerator:**

  * Open Source: Varnish, Nginx + Memcached
  * Managed: Fastly, Cloudflare

* **App Server:** uWSGI, Gunicorn, Apache/mod_wsgi
* **Cache:** Memcached, Redis
* **Database:** Postgres, MySQL/MariaDB


The Journey of a Request
========================

At first glance, all these different pieces of software can be daunting. In
our consulting practice, we've seen sites that get the fundamentals of these
functional elements wrong and end up with a fragile infrastructure held
together with baling wire and duct tape. It's critical to understand the
purpose of each one and how they interact with each other before moving
forward.

Use your imagination and pretend you are a passenger in a magical vehicle
that's taken the form of an HTTP request and is traversing the layers of the web
stack. The journey starts in the browser where an unassuming user sends you on
your way by typing the domain of your website in the address bar.

A DNS lookup will happen (unless you've set a high :abbr:`TTL (time to live)`
and the lookup is already cached). The lookup will point your vehicle to the IP
address of a load balancer and send you rocketing off across the information
superhighway toward your first stop.

.. index::
  seealso: load balancer; Varnish director

Load Balancer
^^^^^^^^^^^^^^

Your first stop is the load balancer, whose main responsibility is to dispatch
traffic to the underlying infrastructure. It acts as a single proxy point that
receives requests from the internet and dispatches them to healthy application
servers (aka, the pool). It also does health checks and removes app servers from
the pool if they are determined to be misbehaving.

Most load balancers let you choose an algorithm (e.g., round robin, least
connections) for distributing requests to the application servers. It may also
be possible to specify weights to force some servers to receive more traffic
than others.

For most cases, round robin is a safe default. Routing traffic to the server
with the fewest connections sounds like an amazing idea, but it can be
problematic in some scenarios. Take, for example, adding an application server
to the pool during a traffic spike. Because the new server starts with zero
connections, every incoming request is routed to it, and it goes from idle to a
flood of connections as soon as it joins the pool. This can lead to an
undesirable result: the new server is overwhelmed, declared unhealthy, and
taken out of the rotation.
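
The difference between the two algorithms is easy to see in a toy simulation.
The sketch below is not how any particular load balancer is implemented. It
assumes a hypothetical pool of three servers and a burst of requests whose
connections all stay open for the duration of the burst.

```python
from itertools import cycle


def round_robin(servers, n_requests):
    """Hand requests to each server in turn; return new requests per server."""
    assigned = {name: 0 for name in servers}
    order = cycle(servers)
    for _ in range(n_requests):
        assigned[next(order)] += 1
    return assigned


def least_connections(servers, n_requests):
    """Send each request to the server with the fewest open connections."""
    open_conns = dict(servers)  # copy so the caller's pool is untouched
    assigned = {name: 0 for name in servers}
    for _ in range(n_requests):
        target = min(open_conns, key=open_conns.get)
        open_conns[target] += 1  # assume the connection stays open
        assigned[target] += 1
    return assigned


# Hypothetical pool: two busy servers plus one that just joined (0 connections).
pool = {"app1": 100, "app2": 100, "app3-new": 0}

rr = round_robin(pool, 90)
lc = least_connections(pool, 90)
```

With round robin, each server receives 30 of the 90 new requests. With least
connections, all 90 land on ``app3-new`` because it never catches up with the
other two during the burst.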

.. index:: TLS termination

The load balancer is a good place to do TLS termination. This is the act of
decrypting a request coming in via HTTPS and passing it down the stack as HTTP.
It's good to do this early on in the stack. Plain HTTP is simpler for the layers
below to handle, and the load balancer usually has the spare CPU cycles for this
task.
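
One consequence of terminating TLS early is that Django only ever sees plain
HTTP, so it needs to be told when the original request was secure. A minimal
settings sketch follows. It assumes your load balancer sets an
``X-Forwarded-Proto`` header on every request and strips any copy of that
header sent by the client; if it doesn't, this setting is unsafe.

```python
# settings.py fragment (sketch): trust the proxy's header to detect HTTPS.
# Assumes the load balancer always sets X-Forwarded-Proto itself.
SECURE_PROXY_SSL_HEADER = ("HTTP_X_FORWARDED_PROTO", "https")
```

With this in place, ``request.is_secure()`` returns ``True`` for requests the
load balancer received over HTTPS.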

Depending on your choice of software, the load balancer may also have some
overlapping functionality with the next layer on our journey, the web
accelerator.

.. index:: Varnish

Web Accelerator
^^^^^^^^^^^^^^^

As your vehicle passed through the load balancer, it directed you to one of
possibly many web accelerators at the next level of the stack. The web
accelerator (aka, caching HTTP reverse proxy) is the first line of defense for
your application servers farther down the stack. (In this book, we'll focus on
Varnish\ [#]_, our preferred web accelerator solution.)

One of the first tasks for the web accelerator is to determine if this is a
request for a resource where the response varies with each user. For many
applications it might seem like *every* request varies per user. There are
some tricks we'll show you later to work around this, but the basic question
at the web accelerator is this: is this page unique to you or the same for
everyone?

If the response is user-specific, it will wave your vehicle on to the next layer
in the stack. If not, it will see if it already has a copy of the response in
its internal cache. If it's found, your vehicle's journey stops here and you're
sent back to the browser with the cached response. If it isn't found in
cache, it will send you down the stack, but take a snapshot of the response on
your way back so it can store it in the cache.
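
That decision process can be sketched in a few lines of Python. This is a
simplified model and not Varnish's actual logic: the ``WebAccelerator`` class
is hypothetical, and treating "has a session cookie" as the signal for a
user-specific response is an assumption made for illustration.

```python
class WebAccelerator:
    """Toy model of a caching reverse proxy (hypothetical, for illustration)."""

    def __init__(self, backend):
        self.backend = backend  # callable taking a path, returning a body
        self.cache = {}

    def handle(self, path, has_session_cookie=False):
        if has_session_cookie:
            # User-specific: wave the request on and never store the result.
            return self.backend(path), "pass"
        if path in self.cache:
            # Cached copy found: the journey stops here.
            return self.cache[path], "hit"
        # Not cached: go down the stack and snapshot the response on the way back.
        body = self.backend(path)
        self.cache[path] = body
        return body, "miss"


backend_calls = []


def backend(path):
    """Hypothetical application server that records each request it receives."""
    backend_calls.append(path)
    return f"<html>{path}</html>"


accelerator = WebAccelerator(backend)
results = [
    accelerator.handle("/")[1],
    accelerator.handle("/")[1],
    accelerator.handle("/", has_session_cookie=True)[1],
]
```

Here the three requests are handled as a miss, a hit, and a pass, and only the
miss and the pass ever reach the backend.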

Ideally most requests' journeys end here. The web accelerator absorbs traffic
spikes generated by a marketing campaign or viral content on sites like Reddit
or Facebook.

Your journey continues, however. Next stop: the application server!

.. [#] https://www.varnish-cache.org/

.. index::
  seealso: application server; uWSGI

Application Server
^^^^^^^^^^^^^^^^^^

Up to now, you've been zooming along the high-speed interstate highway, but as you
start to pull into the application server, the road gets a little more winding
and your pace starts to slow down.

The application server has a simple task: it turns your HTTP request into a
:abbr:`WSGI (web server gateway interface)` request that Python can
understand. (Our preferred application server is uWSGI\ [#]_.) There
are lots of lanes of cars passing through the application (aka WSGI) server
and on the other side you catch sight of Django. The winding road now becomes
city streets complete with turns and stop signs.
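
To make the WSGI hand-off concrete, here is a minimal sketch of what the
application server calls on the Python side. The ``application`` callable below
is a toy written for this example; in a Django project, the equivalent callable
is the one exposed in ``wsgi.py``. The fake request is built with the standard
library's ``wsgiref`` helpers rather than a real server.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A WSGI app: takes the request environ, returns an iterable of bytes."""
    body = f"Hello from {environ['PATH_INFO']}\n".encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]


# Simulate what the application server does for each incoming HTTP request:
# build an environ dict, then call the application with a start_response hook.
environ = {"PATH_INFO": "/articles/"}
setup_testing_defaults(environ)  # fills in the remaining required WSGI keys

captured = {}


def start_response(status, headers):
    captured["status"] = status
    captured["headers"] = headers


chunks = application(environ, start_response)
```

The application server's job is everything around this call: parsing the raw
HTTP request into ``environ`` and writing the status, headers, and body chunks
back to the client.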

.. [#] http://uwsgi-docs.readthedocs.org/en/latest/

Django
^^^^^^

The Django streets should look familiar to you. You go through the middleware,
hand off your URL to the dispatcher, which points you towards one of the views
in the application. You notice, however, that there are a few differences
between this Django application and the ones you hack on from your laptop.

.. TODO: Do we want to recommend the per-site cache? Is it useful with Varnish in front?

Some requests are getting responses and zipping home in the middleware zone.
That's Django's per-site cache in action. As you enter the view you notice
that instead of having to stop and wait for every database query, some return
out of cache almost immediately (a database query cache). Rather than twisting
and turning through the template level, you notice some of the blocks simply
fly by (template caching).
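
For reference, the per-site cache is switched on through settings. The fragment
below is a sketch and not a recommended production configuration: the Memcached
address, timeout, and key prefix are placeholder values, and the
``PyMemcacheCache`` backend assumes Django 3.2 or later with ``pymemcache``
installed.

```python
# settings.py fragment (sketch): enable Django's per-site cache.
MIDDLEWARE = [
    # Must come first so it runs last on the way out and can store the response.
    "django.middleware.cache.UpdateCacheMiddleware",
    "django.middleware.common.CommonMiddleware",
    # Must come last so it runs last on the way in and can return a cached page.
    "django.middleware.cache.FetchFromCacheMiddleware",
]

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "127.0.0.1:11211",  # placeholder Memcached address
    }
}

CACHE_MIDDLEWARE_ALIAS = "default"
CACHE_MIDDLEWARE_SECONDS = 300  # placeholder: cache pages for five minutes
CACHE_MIDDLEWARE_KEY_PREFIX = "mysite"  # placeholder prefix
```

Template caching is applied in the templates themselves, by wrapping a block in
``{% load cache %}`` and ``{% cache 500 sidebar %} ... {% endcache %}``, where
the timeout and fragment name are again placeholders.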

While slow compared to the highway you were on earlier, the trip through Django
was pretty fast. You've now got the full response in tow and need to head home
to the browser. On your way, you'll pass back through each layer, checking in a
copy of your response with the per-site cache in Django and again with the web
accelerator on your way out to the internet.

-----

How long did your little request journey take? Surprisingly, all this happens in
just a fraction of a second.

When you start looking at your own application's response times, here are some
rough numbers you can shoot for. If your application is five times slower,
there's going to be a lot of room for improvement.

========= =========================
Estimated Response Times
===================================
10ms      Varnish cache hit
35ms      Django per-site cache hit
100-300ms Django with warm cache
500ms-2s  Django with cold cache
========= =========================
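
To see how much the cache hit rate matters, you can weight the numbers in the
table above by the share of traffic each layer answers. The sketch below is
back-of-the-envelope arithmetic only. The traffic mixes are hypothetical, and
using the midpoint of each range (200ms and 1250ms) is a simplification.

```python
# Latencies in milliseconds, taken from the table above.
# Ranges are reduced to their midpoints (a simplifying assumption).
LATENCY_MS = {
    "varnish_hit": 10,
    "per_site_hit": 35,
    "warm_cache": 200,   # midpoint of 100-300ms
    "cold_cache": 1250,  # midpoint of 500ms-2s
}


def mean_response_ms(traffic_share):
    """Average response time for a given share of traffic per layer."""
    total = sum(traffic_share.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("traffic shares must add up to 1.0")
    return sum(LATENCY_MS[layer] * share for layer, share in traffic_share.items())


# Hypothetical traffic mixes, for illustration only.
mostly_cached = {
    "varnish_hit": 0.80, "per_site_hit": 0.10, "warm_cache": 0.09, "cold_cache": 0.01,
}
mostly_uncached = {
    "varnish_hit": 0.10, "per_site_hit": 0.10, "warm_cache": 0.70, "cold_cache": 0.10,
}

cached_ms = mean_response_ms(mostly_cached)
uncached_ms = mean_response_ms(mostly_uncached)
```

Under these assumptions, the mostly cached site averages about 42ms per
request and the mostly uncached one about 270ms.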

Breaking it down further, for requests passing through to Django the total time
spent should not be overwhelmingly dominated by a single component (database,
cache, etc.). The majority of time should be spent working in Python with no
more than 30% or so spent in any given component.
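
A quick way to apply this rule of thumb is to compare each component's time
with the total for the request. The helper below is a hypothetical sketch, and
the timings are made-up numbers of the kind a profiling tool might report.

```python
def dominant_components(timings_ms, total_ms, threshold=0.30):
    """Return each component whose share of the total time exceeds threshold."""
    return {
        name: ms / total_ms
        for name, ms in timings_ms.items()
        if ms / total_ms > threshold
    }


# Hypothetical breakdown of a single 200ms request.
timings = {"database": 90, "cache": 10, "templates": 40}
flagged = dominant_components(timings, total_ms=200)
```

In this made-up request, only the database is flagged: it accounts for 45% of
the total, well over the 30% guideline, so that is where to look first.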

How does your Django app compare? Is there room for improvement? In the next
chapter we'll explore the development process and show you how and where to
optimize your application as you build it.