Most of the Django projects I work with take advantage of django.contrib.auth. It manages users and groups and is tightly coupled with django.contrib.admin. In this post, we are going to explore how it resists a potential attacker.
The study below assumes an attacker has obtained the hashed password column from the database:
mysql> select password from auth_user limit 1;
+-------------------------------------------------------------------------------+
| password |
+-------------------------------------------------------------------------------+
| pbkdf2_sha256$20000$H0dPx8NeajVu$GiC4k5kqbbR9qWBlsRgDywNqC2vd9kqfk7zdorEnNas= |
+-------------------------------------------------------------------------------+
Django stores passwords in the following format:
<algorithm>$<iterations>$<salt>$<hash>
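Each dollar-separated field gives an attacker everything needed to test guesses offline. Here is a minimal sketch of the check a cracker performs for every candidate, using only the Python standard library and the stored value from the example row above (the check_guess helper is my own):

import base64
import hashlib

# Stored value from the auth_user row shown above.
stored = "pbkdf2_sha256$20000$H0dPx8NeajVu$GiC4k5kqbbR9qWBlsRgDywNqC2vd9kqfk7zdorEnNas="
algorithm, iterations, salt, hash_b64 = stored.split("$")

def check_guess(guess):
    # Re-derive the key with the stored salt and iteration count --
    # the same derivation Django's PBKDF2PasswordHasher performs on login.
    derived = hashlib.pbkdf2_hmac("sha256", guess.encode(), salt.encode(), int(iterations))
    return base64.b64encode(derived).decode() == hash_b64

print(check_guess("letmein"))  # False unless the guess is correct

The salt defeats precomputed tables, but it does nothing against a dictionary attack on a single hash: the attacker simply runs this derivation once per guess.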
As computers get faster, hashing a password also gets faster. This is important because the only protection password hashing offers is to make “guessing” a password more time-consuming.
The two knobs you can turn to increase this time are the algorithm and the number of iterations. Since Django 1.4 (released in 2012), the default algorithm has been PBKDF2, as recommended by NIST. It is interesting to note that since then the iteration count has been increased at least four times: 1, 2, 3, 4.
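If you want to turn those knobs yourself, Django supports subclassing a hasher to raise its iteration count; existing hashes are transparently upgraded on the user's next successful login. A minimal sketch (the module path, class name, and multiplier are my own choices):

# myproject/hashers.py
from django.contrib.auth.hashers import PBKDF2PasswordHasher

class StrongerPBKDF2PasswordHasher(PBKDF2PasswordHasher):
    # Double the framework default; tune this to your login-latency budget.
    iterations = PBKDF2PasswordHasher.iterations * 2

# settings.py
PASSWORD_HASHERS = [
    "myproject.hashers.StrongerPBKDF2PasswordHasher",
    # Keep the stock hasher so existing hashes still validate.
    "django.contrib.auth.hashers.PBKDF2PasswordHasher",
]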
With this in mind, I was interested to know how many passwords I could crack with a limited budget of less than $100 and a few days. Performing a similar experiment against your own database will help you evaluate the effectiveness of your password policy. You are about to discover that it is easier than you might think. In fact, most password “recovery” tools handle Django’s password hash format out of the box.
Recovering Django passwords
The demonstration below uses a Google Cloud server and hashcat with some basic tuning. It performs a dictionary attack with the well-known rockyou dictionary, which is easily found on the internet.
Provisioning the server
First initialize the gcloud CLI with:
$ gcloud init
Then, in the developer console, create a project, assign it a budget under billing information, and enable the Google Compute API.
At this point, you should be able to list the available resources:
$ gcloud compute regions describe europe-west1 | grep -B1 GPU
- limit: 0.0
metric: NVIDIA_K80_GPUS
--
- limit: 0.0
metric: NVIDIA_P100_GPUS
--
- limit: 0.0
metric: PREEMPTIBLE_NVIDIA_K80_GPUS
--
- limit: 0.0
metric: PREEMPTIBLE_NVIDIA_P100_GPUS
--
- limit: 0.0
metric: NVIDIA_V100_GPUS
If, like me, your limits are 0.0 on all these cards, you will need to submit a quota increase. For this experiment, I requested a quota of 8 NVIDIA K80 GPUs. According to Google, this step might take a few days. I was asked to add $280 to my account, and the quota increase was granted the next day.
Based on the Google estimator, a cloud server with 8 NVIDIA K80 GPUs costs ~$63 per day, or $2.70 per hour. In practice, preemptible mode made it even cheaper.
With my quota increased, I started up the instance:
$ gcloud compute instances create yml-gpu-4 \
    --machine-type n1-standard-4 \
    --zone europe-west1-b \
    --accelerator type=nvidia-tesla-k80,count=8 \
    --image-family ubuntu-1604-lts \
    --image-project ubuntu-os-cloud \
    --maintenance-policy TERMINATE \
    --restart-on-failure \
    --preemptible
Setting up the server
Next, install some utilities that I find useful in this context:
$ sudo apt install gnupg2 tmux htop p7zip-full git-core
Then, install the latest NVIDIA driver with this script:
#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! dpkg-query -W cuda-8-0; then
  # The 16.04 installer works with 16.10.
  curl -O https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
  dpkg -i ./cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
  apt-get update
  apt-get install cuda-8-0 -y
fi
# Enable persistence mode
nvidia-smi -pm 1
Install hashcat:
$ wget https://hashcat.net/files/hashcat-4.1.0.7z
$ 7z x hashcat-4.1.0.7z
$ git clone https://github.com/praetorian-inc/Hob0Rules
Extracting the password hashes
I restricted the analysis to users that have been granted access to Django’s administration interface.
$ echo "select password from auth_user where is_staff;"| manage.py dbshell > django_hashes.txt
Running the attack
With everything in place, I ran the attack using the password dictionary:
$ ./hashcat64.bin -m 10000 ~/django_hashes.txt ~/rockyou.txt
You can spice up the dictionary with a rule-based attack. Note that this will drastically increase the time required to perform the attack:
$ ./hashcat64.bin -m 10000 -r ~/Hob0Rules/hob064.rule ~/django_hashes.txt ~/rockyou.txt
Piping rule-mangled candidates into a second hashcat process is a faster alternative, but the downside is that you lose the estimated completion time:
$ ./hashcat64.bin -w 3 --stdout -r ~/Hob0Rules/hob064.rule ~/rockyou.txt \
| ./hashcat64.bin -m 10000 ~/django_hashes.txt
Results
On completion, you can harvest your passwords from hashcat:
$ ./hashcat64.bin -m 10000 --show ~/django_hashes.txt
I tested this approach on a few different real-world datasets and was able to recover ~10-25% of the hashes in each set within a few hours. In all, I cracked 246 passwords and spent $73 on Google Cloud, a cost of roughly $0.30 per recovered hash.
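If you want the same statistics for your own run, here is a quick sketch, assuming you redirected the output of the --show command above to a file named cracked.txt (a name of my own choosing):

# Compare the input hash list with hashcat's --show output.
with open("django_hashes.txt") as f:
    total = sum(1 for line in f if line.strip())
with open("cracked.txt") as f:
    cracked = sum(1 for line in f if line.strip())
print("recovered %d/%d (%.1f%%)" % (cracked, total, 100.0 * cracked / total))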
Conclusion
This article barely scratches the surface of what you can do with hashcat. Lots of time can be invested in making it run faster on your hardware and in fine-tuning the attack type or the dictionary you use.
I learned that brute-forcing passwords is easier and cheaper than it sounds. If you are concerned about the security of your passwords, I recommend researching the following options to reduce your risk:
- Scramble/anonymize your database dumps for developers and non-production systems to reduce the risk of a data leak.
- Rate-limit login pages.
- Keep your administration interface off the open internet. Use SSH, VPN, etc.
- Enforce password complexity and length requirements (see the sketch after this list).
- Update Django to at least the latest LTS.
- Increase the number of iterations and/or choose a more computationally expensive algorithm.
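For the complexity and length point above, Django (1.9 and later) ships password validators that reject weak passwords at creation time; a minimal sketch for settings.py (the min_length value is my own):

AUTH_PASSWORD_VALIDATORS = [
    {
        "NAME": "django.contrib.auth.password_validation.MinimumLengthValidator",
        "OPTIONS": {"min_length": 12},
    },
    {
        # Rejects passwords found in a bundled list of common passwords --
        # the same kind of list a rockyou-based attack starts from.
        "NAME": "django.contrib.auth.password_validation.CommonPasswordValidator",
    },
]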