Hierarchical Data Models: Adjacency List vs. Nested Sets


Nested sets give better read performance, provided you don’t need frequent updates or hierarchical ordering.

If you need either tree updates or hierarchical ordering, the parent-child (adjacency list) data model is the better choice.

Hierarchical queries over that model are easy to write in Oracle and SQL Server 2005+, and less easy (but still possible) in MySQL.
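To make the parent-child model concrete, here is a minimal sketch using Python's built-in `sqlite3` module rather than any of the databases named above. The `category` table, its columns, and the sample rows are all hypothetical. It uses a recursive query to return the tree in hierarchical order, and shows that moving a subtree only touches one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES category(id),  -- NULL for the root
        name      TEXT NOT NULL
    );
    INSERT INTO category (id, parent_id, name) VALUES
        (1, NULL, 'Electronics'),
        (2, 1,    'Televisions'),
        (3, 1,    'Portable'),
        (4, 3,    'MP3 Players'),
        (5, 2,    'LCD');
""")

# Walk the tree from the root, building a path string to sort on.
# Assumption: names contain no characters that sort below '/'.
ORDERED_TREE = """
    WITH RECURSIVE tree(id, name, depth, path) AS (
        SELECT id, name, 0, name FROM category WHERE parent_id IS NULL
        UNION ALL
        SELECT c.id, c.name, t.depth + 1, t.path || '/' || c.name
        FROM category AS c JOIN tree AS t ON c.parent_id = t.id
    )
    SELECT name, depth FROM tree ORDER BY path
"""

before = conn.execute(ORDERED_TREE).fetchall()

# Moving a subtree is a single-row update: re-parent 'Portable' under 'Televisions'.
conn.execute("UPDATE category SET parent_id = 2 WHERE id = 3")
after = conn.execute(ORDERED_TREE).fetchall()

for name, depth in after:
    print("  " * depth + name)
```

Sorting on a concatenated path is only one way to get hierarchical ordering. It is used here because it keeps the query short.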


I would use the Modified Preorder Tree Traversal (MPTT) algorithm for this sort of hierarchical data. It gives great performance when traversing the tree and finding children, if you don’t mind a bit of a penalty on changes to the structure.
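The trade-off can be seen in a small sketch, again using Python's `sqlite3` with a hypothetical `category` table whose `lft` and `rgt` columns hold the preorder numbering. Finding all descendants is a single range comparison, while adding one leaf has to renumber every node to its right. This illustrates the general technique, not how django-mptt implements it internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (
        name TEXT PRIMARY KEY,
        lft  INTEGER NOT NULL,
        rgt  INTEGER NOT NULL
    );
    INSERT INTO category (name, lft, rgt) VALUES
        ('Electronics', 1, 10),
        ('Televisions', 2, 5),
        ('LCD',         3, 4),
        ('Portable',    6, 9),
        ('MP3 Players', 7, 8);
""")

def descendants(conn, name):
    """Return every node strictly inside the (lft, rgt) interval of `name`."""
    lft, rgt = conn.execute(
        "SELECT lft, rgt FROM category WHERE name = ?", (name,)
    ).fetchone()
    rows = conn.execute(
        "SELECT name FROM category WHERE lft > ? AND rgt < ? ORDER BY lft",
        (lft, rgt),
    )
    return [row[0] for row in rows]

def add_last_child(conn, parent, name):
    """Insert a leaf as the last child of `parent`; return how many rows were renumbered."""
    (parent_rgt,) = conn.execute(
        "SELECT rgt FROM category WHERE name = ?", (parent,)
    ).fetchone()
    # Open a gap of two numbers at the parent's right edge.
    shifted_rgt = conn.execute(
        "UPDATE category SET rgt = rgt + 2 WHERE rgt >= ?", (parent_rgt,)
    ).rowcount
    shifted_lft = conn.execute(
        "UPDATE category SET lft = lft + 2 WHERE lft > ?", (parent_rgt,)
    ).rowcount
    conn.execute(
        "INSERT INTO category (name, lft, rgt) VALUES (?, ?, ?)",
        (name, parent_rgt, parent_rgt + 1),
    )
    return shifted_rgt + shifted_lft

all_below_root = descendants(conn, "Electronics")
renumbered = add_last_child(conn, "Televisions", "Plasma")
tv_children = descendants(conn, "Televisions")
print(all_below_root, renumbered, tv_children)
```

Even in this five-node tree, one insert rewrites most of the existing rows, which is the structural-change penalty mentioned above.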

Luckily, Django has a great library available for this, django-mptt. I’ve used it in a number of projects with a lot of success. There’s also django-treebeard, which offers several alternative algorithms, but I haven’t used it (and it doesn’t seem as popular as django-mptt anyway).


According to these articles:


“MySQL is the only system of the big four (MySQL, Oracle, SQL Server, PostgreSQL) for which the nested sets model shows decent performance and can be considered to store hierarchical data.”



The Adjacency List is much easier to maintain and Nested Sets are a lot faster to query.

The problem has always been that converting an Adjacency List to Nested Sets has taken way too long, thanks to a really nasty “push stack” method that’s loaded with RBAR (row-by-agonizing-row processing). So people end up either doing some really difficult maintenance in Nested Sets or not using them at all.

Now, you can have your cake and eat it, too! You can do the conversion on 100,000 nodes in less than 4 seconds and on a million rows in less than a minute! All in T-SQL, by the way! Please see the following articles.

Hierarchies on Steroids #1: Convert an Adjacency List to Nested Sets

Hierarchies on Steroids #2: A Replacement for Nested Sets Calculations
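For readers who have not seen the conversion itself, here is a minimal in-memory sketch in Python of what it computes: left and right bounds assigned by a depth-first walk of the parent-child data. This is not the T-SQL method from the articles above. The function name `adjacency_to_nested_sets` and the sample ids are hypothetical, and the explicit stack is only there to avoid recursion.

```python
from collections import defaultdict

def adjacency_to_nested_sets(parent_of):
    """Map {node: parent or None} to {node: (lft, rgt)} by depth-first numbering."""
    children = defaultdict(list)
    roots = []
    for node, parent in parent_of.items():
        if parent is None:
            roots.append(node)
        else:
            children[parent].append(node)

    lft = {}
    bounds = {}
    counter = 1
    # Stack entries are (node, leaving): False when entering a node, True when leaving it.
    stack = [(root, False) for root in reversed(roots)]
    while stack:
        node, leaving = stack.pop()
        if leaving:
            bounds[node] = (lft[node], counter)
        else:
            lft[node] = counter
            stack.append((node, True))
            # Push children in reverse so they are visited in their original order.
            for child in reversed(children[node]):
                stack.append((child, False))
        counter += 1
    return bounds

# 1 = Electronics, 2 = Televisions, 3 = Portable, 4 = MP3 Players, 5 = LCD
parent_of = {1: None, 2: 1, 3: 1, 4: 3, 5: 2}
bounds = adjacency_to_nested_sets(parent_of)
print(bounds)
```

Keeping the Adjacency List as the source of truth and regenerating the bounds from it gives the easy maintenance of one model and the fast reads of the other.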
