Treehugging

Brian Luft <brian@lincolnloop.com>

DjangoCon 2010 Portland, OR

Enough About Me

Working with Django for a few years.

With Lincoln Loop since 2008.

Clients include National Geographic, PBS, Nasuni, and redbeacon.

Introduction

Where do we find structured data?

  • Hierarhical Categories
  • Site navigation
  • eCommerce Product Catalog
  • Social network / Recommendations
  • Threaded Comments
  • Everywhere...

Trees and Relational Databases

A table is not a tree. We need a few tricks.

Example Tree

http://dev.mysql.com/tech-resources/articles/hierarchical-data.html

Relational Models

  • Adjacency List
  • Nested Sets
  • Materialized Path

Adjacency List

  • Adds a column for parent ID.
  • Root nodes are represented by NULL.
    +-------------+----------------------+--------+
    | category_id | name                 | parent |
    +-------------+----------------------+--------+
    |           1 | ELECTRONICS          |   NULL |
    |           2 | TELEVISIONS          |      1 |
    |           3 | TUBE                 |      2 |
    |           4 | LCD                  |      2 |
    |           5 | PLASMA               |      2 |
    |           6 | PORTABLE ELECTRONICS |      1 |
    |           7 | MP3 PLAYERS          |      6 |
    |           8 | FLASH                |      7 |
    |           9 | CD PLAYERS           |      6 |
    |          10 | 2 WAY RADIOS         |      6 |
    +-------------+----------------------+--------+

Adjancency List Queries

Select whole tree:

1 SELECT t1.name AS lev1, t2.name as lev2, t3.name as lev3, t4.name as lev4
2 FROM category AS t1
3 LEFT JOIN category AS t2 ON t2.parent = t1.category_id
4 LEFT JOIN category AS t3 ON t3.parent = t2.category_id
5 LEFT JOIN category AS t4 ON t4.parent = t3.category_id
6 WHERE t1.name = 'ELECTRONICS';

Adjancency List Queries

+-------------+----------------------+--------------+-------+
| lev1        | lev2                 | lev3         | lev4  |
+-------------+----------------------+--------------+-------+
| ELECTRONICS | TELEVISIONS          | TUBE         | NULL  |
| ELECTRONICS | TELEVISIONS          | LCD          | NULL  |
| ELECTRONICS | TELEVISIONS          | PLASMA       | NULL  |
| ELECTRONICS | PORTABLE ELECTRONICS | MP3 PLAYERS  | FLASH |
| ELECTRONICS | PORTABLE ELECTRONICS | CD PLAYERS   | NULL  |
| ELECTRONICS | PORTABLE ELECTRONICS | 2 WAY RADIOS | NULL  |
+-------------+----------------------+--------------+-------+

Adjancency Lists Queries

Select one path:

1 SELECT t1.name AS lev1, t2.name as lev2, t3.name as lev3, t4.name as lev4
2 FROM category AS t1
3 LEFT JOIN category AS t2 ON t2.parent = t1.category_id
4 LEFT JOIN category AS t3 ON t3.parent = t2.category_id
5 LEFT JOIN category AS t4 ON t4.parent = t3.category_id
6 WHERE t1.name = 'ELECTRONICS' AND t4.name = 'FLASH';

Adjancency Lists Queries

+-------------+----------------------+-------------+-------+
| lev1        | lev2                 | lev3        | lev4  |
+-------------+----------------------+-------------+-------+
| ELECTRONICS | PORTABLE ELECTRONICS | MP3 PLAYERS | FLASH |
+-------------+----------------------+-------------+-------+

Adjancency Lists Pros/Cons

  • Fast writes, slow reads
  • Fragile update operations
  • You have to know what level in the tree your item is at when you look it up
  • Easy to orphan sub-trees
  • Easy to start with, but maintaining the tree integrity takes a lot of extra work

Nested Sets

http://dev.mysql.com/tech-resources/articles/hierarchical-data.html

Nested Sets

Preorder tree traversal algorithm

+-------------+----------------------+-----+-----+
| category_id | name                 | lft | rgt |
+-------------+----------------------+-----+-----+
|           1 | ELECTRONICS          |   1 |  20 |
|           2 | TELEVISIONS          |   2 |   9 |
|           3 | TUBE                 |   3 |   4 |
|           4 | LCD                  |   5 |   6 |
|           5 | PLASMA               |   7 |   8 |
|           6 | PORTABLE ELECTRONICS |  10 |  19 |
|           7 | MP3 PLAYERS          |  11 |  14 |
|           8 | FLASH                |  12 |  13 |
|           9 | CD PLAYERS           |  15 |  16 |
|          10 | 2 WAY RADIOS         |  17 |  18 |
+-------------+----------------------+-----+-----+

Nested Sets

http://dev.mysql.com/tech-resources/articles/hierarchical-data.html

Nested Sets

Retrieve the whole tree

1 SELECT node.name
2 FROM nested_category AS node,
3 nested_category AS parent
4 WHERE node.lft BETWEEN parent.lft AND parent.rgt
5 AND parent.name = 'ELECTRONICS'
6 ORDER BY node.lft;

Query works regardless of the depth of tree.

Nested Sets

  • Efficient reads
  • High maintenance cost for write/delete

Materialized Paths

  • Every node in the tree has a "path" attribute.
    +-------------+----------------------+--------+
    | category_id | name                 | path   |
    +-------------+----------------------+--------+
    |           1 | ELECTRONICS          |   1    |
    |           2 | TELEVISIONS          |   1.1  |
    |           3 | TUBE                 |   1.1.1|
    |           4 | LCD                  |   1.1.2|
    |           5 | PLASMA               |   1.1.3|
    |           6 | PORTABLE ELECTRONICS |   1.2  |
    |           7 | MP3 PLAYERS          |   1.2.1|
    |           8 | FLASH                |   1.2.2|
    |           9 | CD PLAYERS           |   1.3  |
    |          10 | 2 WAY RADIOS         |   1.4  |
    +-------------+----------------------+--------+

Materialized Paths

  • Queries are simple and fast
  • Effectively denormalizes parent/child foreign keys
  • Writes slow
  • Requires maintenance to keep things ordered properly

Django Apps

  • django-mptt
  • django-treebeard

django-mptt vs. django-treebeard

http://www.qompr.com/charts/63;django-hierarchical-tree-data/

django-mptt vs. django-treebeard

Relational Model

Treebeard:

  • Nested Sets
  • Adjacency List
  • Materialized Path

MPTT:

  • Nested Sets

django-mptt vs. django-treebeard

Forms / Admin

Both provide Move Node forms.

django-mptt provides a TreeNodeChoiceField (ModelChoiceField)

django-mptt vs. django-treebeard

Active

django-mptt is being maintained

django-treebeard has slightly more active development

django-mptt vs. django-treebeard

Front-end

Treebeard: get_annotated_list

MPTT: a few nice template tags / filters

django-mptt vs. django-treebeard

Real-world Examples

django-treebeard: django-page-cms

django-mptt: django-cms

django-mptt vs. django-treebeard

Overall Impressions

  • treebeard model creation methods on model
  • django-mptt related items methods

Benchmarks

https://tabo.pe/projects/django-treebeard/docs/tip/benchmarks.html

Miscellaneous

  • django-treemenus
  • Neo4j
  • Suckerfish/Superfish

Playing Around With Apps

Fast-forward:

1 mkvirtualenv treehugging
2 pip install -e git+http://github.com/batiste/django-mptt.git#egg=django-mptt
3 pip install django-treebeard
4 pip install django-extensions
5 pip install ipython
6 django-admin.py startproject treehugging
7 django-admin.py startapp demo_mptt

Questions?

References and Resources

Managing Hierarchical Data in MySQL

http://dev.mysql.com/tech-resources/articles/hierarchical-data.html

Working With Taxonomy Trees using Django and jQuery

http://www.quanative.com/2010/01/01/working-with-taxonomy-trees-using-django-and-jquery/

django-treebeard docs

https://tabo.pe/projects/django-treebeard/docs/1.61/

Django Tree Libraries

http://blog.elsdoerfer.name/2010/02/28/django-tree-libraries/