Converting an existing Django model to Django-MPTT

Jerry Stratton, February 12, 2016

For a long time I’ve been painfully aware of a glaring inefficiency in my custom blog software built off of Django. A lot of what it does relies on the tree model: folders within folders. Finding all the front-page articles for my main blog means getting all the children of each category, and then any descendants of those children, and so on. Previewing a new blog article can take a minute or more, mainly because of the recursive nature of building the sidebar. There are only seven articles on the front page, but they require looking through a tree with 1,505 articles spread over five levels.

I’ve been aware of various solutions for turning Django into more of a tree-hugger, but they’ve tended either to not work well with an existing installation, or if they did, their documentation hid that. For example, django-mptt until a few years ago hid the very important .rebuild method, which is necessary for building the tree data for an existing dataset.

Having just finished the basic installation, however, I’ve found django-mptt very helpful. On my very simple, not very tree-like blog, the Walkerville Weekly Reader, publish time dropped by around 13% to 21%; since it only took about two minutes anyway, that’s not necessarily a big deal. But Mimsy Were the Borogoves, which I’ve been running since the nineties, has a lot more articles and a much more complex structure. Its publish time dropped from about 30 minutes to about 10 minutes. That’s worth the installation troubles.

Converting my model to use django-mptt

I installed django-mptt by downloading version 0.8.0, unpacking it, and running setup, but, despite the documentation claiming that it’s okay with Django 1.8.x, it performed an automatic upgrade to Django 1.9.1. I had to downgrade again using:

sudo pip uninstall django
sudo pip install django==1.8.8

I’m using django-ajax-selects version 1.3.6, which doesn’t work with Django 1.9.x. While version 1.4 is compatible with Django 1.9, it doesn’t work on inlines, and inlines are the main reason I use ajax-selects. I have a list of 9,000 URLs and 14,000 pages that can be attached via inline to any page, not to mention the keywords and authors and so forth. Without ajax-selects, it takes forever for browsers to render pages because those lists become pull-down menus.

Converting to django-mptt is very easy. I added this to the top of my models.py:

from mptt.models import MPTTModel, TreeForeignKey, TreeManager

I had to include TreeManager because I have a custom manager. If you use the standard manager, you shouldn’t need it.

class PageManager(models.Manager):
    …
class Page(models.Model):
    parent = models.ForeignKey('self', blank=True, null=True)

became:

class PageManager(TreeManager):
    …
class Page(MPTTModel):
    parent = TreeForeignKey('self', blank=True, null=True)

I also had a “level” method on my Page model, which I needed to rename because it interfered with the MPTTModel’s level column.

I used “./manage.py sqlall pages” to find the new columns and indexes, and then added them by hand in SQLite:

ALTER TABLE pages_page ADD COLUMN lft INTEGER UNSIGNED NOT NULL DEFAULT 0;
ALTER TABLE pages_page ADD COLUMN rght INTEGER UNSIGNED NOT NULL DEFAULT 0;
ALTER TABLE pages_page ADD COLUMN tree_id INTEGER UNSIGNED NOT NULL DEFAULT 0;
ALTER TABLE pages_page ADD COLUMN level INTEGER UNSIGNED NOT NULL DEFAULT 0;
CREATE INDEX "pages_page_329f6fb3" ON "pages_page" ("lft");
CREATE INDEX "pages_page_e763210f" ON "pages_page" ("rght");
CREATE INDEX "pages_page_ba470c4a" ON "pages_page" ("tree_id");
CREATE INDEX "pages_page_20e079f4" ON "pages_page" ("level");

Your index names are likely to be different.
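
If you want to rehearse those statements before touching the live database, something like the following sketch might help. It runs the same kind of ALTER TABLE and CREATE INDEX statements against a throwaway in-memory SQLite table using Python’s sqlite3 module. The two-column table and the index names are hypothetical stand-ins, not the real pages_page schema or the names Django generates.

```python
import sqlite3

# Hypothetical stand-in for the real table; only id and parent_id are modeled.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages_page ("
    "id INTEGER PRIMARY KEY, "
    "parent_id INTEGER REFERENCES pages_page (id))"
)
conn.execute("INSERT INTO pages_page (id, parent_id) VALUES (1, NULL)")

MPTT_COLUMNS = ("lft", "rght", "tree_id", "level")

for column in MPTT_COLUMNS:
    # SQLite only accepts a NOT NULL column here because a default is given.
    conn.execute(
        "ALTER TABLE pages_page "
        "ADD COLUMN {0} INTEGER UNSIGNED NOT NULL DEFAULT 0".format(column)
    )
    # Index names are made up for this sketch; Django generates its own.
    conn.execute(
        "CREATE INDEX pages_page_{0}_idx ON pages_page ({0})".format(column)
    )

# Confirm the new columns exist and that the existing row picked up the default.
column_names = [row[1] for row in conn.execute("PRAGMA table_info(pages_page)")]
existing_row = conn.execute(
    "SELECT lft, rght, tree_id, level FROM pages_page WHERE id = 1"
).fetchone()
print(column_names)
print(existing_row)
```

Every existing row starts with zeros in all four columns, which is why the tree data still has to be built afterwards.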

After verifying that everything still worked, I built the tree data:

Page.objects.rebuild()

It took about ten minutes for it to create the tree data.
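
To get a feel for what the rebuild has to compute, here is a rough sketch of modified preorder tree traversal numbering in plain Python. It is not django-mptt’s actual implementation; build_mptt and the parent dictionary are hypothetical names for illustration, and sibling order is assumed to be by id.

```python
from collections import defaultdict


def build_mptt(parents):
    """Return {node_id: (tree_id, lft, rght, level)} from {node_id: parent_id}.

    Root nodes have a parent_id of None. Each root starts its own tree.
    """
    children = defaultdict(list)
    for node_id in sorted(parents):
        children[parents[node_id]].append(node_id)

    result = {}

    def number(node_id, tree_id, left, level):
        # lft is assigned on the way down, rght on the way back up,
        # so every descendant's numbers fall between its ancestor's pair.
        right = left + 1
        for child_id in children[node_id]:
            right = number(child_id, tree_id, right, level + 1) + 1
        result[node_id] = (tree_id, left, right, level)
        return right

    for tree_id, root_id in enumerate(children[None], start=1):
        number(root_id, tree_id, 1, 0)
    return result


# Page 1 is a root with children 2 and 3; page 4 is a child of page 2.
tree = build_mptt({1: None, 2: 1, 3: 1, 4: 2})
for node_id in sorted(tree):
    print(node_id, tree[node_id])
```

Every node has to be visited and written back, which is consistent with a one-time rebuild of a large table taking a while.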

At that point, I left it alone for a few days to make sure I hadn’t introduced anything that broke the blog.

Code

Your code changes will depend on your code, of course. In my case, I had two methods that needed to go:

#get all levels of responses to a blog article or members of a blog category
def getBlogArticles(self):
    articles = self.children().exclude(provider='nisus')
    possiblyUpdated = Page.objects.live().filter(parent__in=articles, provider='db').exclude(pagekey__keyword__key='Voices')
    for article in possiblyUpdated:
        articles = articles | article.parent.getBlogArticles()
    if self.blogCategory:
        articles = articles.order_by('rank', '-added')
    else:
        articles = articles.order_by('-added')
    return articles

#uses list to reduce size of query, as I think that without it the query is too long for sqlite
def alldescendants(self):
    idescendants = list(Page.objects.filter(parent=self).values_list('id', flat=True))
    descendants = []
    while idescendants:
        descendants.extend(idescendants)
        idescendants = list(Page.objects.filter(parent__in=idescendants).values_list('id', flat=True))
    descendants = Page.objects.filter(parent__in=descendants, live='public')
    return descendants

def modified(self):
    descendants = self.alldescendants()
    try:
        mostrecent = descendants.order_by('-edited')[:1].get().edited
        lastmodified = max(mostrecent, self.edited)
    except:
        lastmodified = self.changed()
    return lastmodified

As you can see, I had two methods of getting all descendants before installing django-mptt. One was to recurse the method on every child and then every child’s child, as I did with getBlogArticles. This had the advantage of always working, but the disadvantage of being very slow. The second, in alldescendants, was to get the ids of all children, then get every page that had a parent in that list of ids, then repeat that for the new set of pages, and so on, until no pages had parents in the furthest generation. This had the advantage of being faster, but the disadvantage of creating gigantic SQL statements.¹
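
The generation-by-generation approach can be sketched without Django at all. In this rough illustration, the parents dictionary stands in for the pages table, each pass through the loop stands in for one query, and all_descendant_ids is a hypothetical name. Unlike the alldescendants method above, it returns the collected ids themselves, so immediate children are included.

```python
def all_descendant_ids(parents, root_id):
    """Collect the ids of every descendant of root_id, one generation at a time.

    parents maps each page id to its parent's id (None for a root).
    """
    descendants = []
    # First generation: pages whose parent is the starting page.
    generation = [page for page, parent in parents.items() if parent == root_id]
    while generation:
        descendants.extend(generation)
        current = set(generation)
        # Next generation: pages whose parent is in the generation just found.
        generation = [page for page, parent in parents.items() if parent in current]
    return descendants


# Page 1 is the root; 2 and 3 are its children; 4 is under 2; 5 is under 4.
parents = {1: None, 2: 1, 3: 1, 4: 2, 5: 4}
print(sorted(all_descendant_ids(parents, 1)))
print(sorted(all_descendant_ids(parents, 4)))
```

The number of passes grows with the depth of the tree, and the list of ids handed to each pass grows with its width, which is where the oversized statements come from.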

With django-mptt, those methods became:

#get all levels of responses to a blog article or members of a blog category
def getBlogArticles(self):
    articles = self.get_descendants().filter(provider='db').exclude(pagekey__keyword__key='Voices')
    articles = articles & Page.objects.live()
    if self.blogcategory:
        articles = articles.order_by('rank', '-added')
    else:
        articles = articles.order_by('-added')
    return articles

def alldescendants(self, include_self=False):
    return self.get_descendants(include_self=include_self).filter(live='public')

def modified(self):
    descendants = self.alldescendants(include_self=True)
    return descendants.order_by('-edited')[:1].get().edited

One immediate benefit, besides the massive speed increase for the more complex blog, is that I no longer have to deal with those incredibly large flat lists of ids. When trying to find the most recent modification within a folder, doing so under a parent with too many descendants ran the risk of an error, at which point I just defaulted to the modification time of the parent². Using MPTT, however, the modified function no longer needs the try/except clause.
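
The reason the long id lists go away is that, with left and right numbers stored on each row, a whole subtree can be described by a pair of comparisons. The sketch below shows the general nested-set idea against a tiny in-memory SQLite table; the rows are hand-numbered sample data, descendant_ids is a hypothetical helper, and the SQL that django-mptt actually generates may well differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages_page ("
    "id INTEGER PRIMARY KEY, tree_id INTEGER, "
    "lft INTEGER, rght INTEGER, level INTEGER)"
)
# Sample tree: page 1 is the root, 2 and 3 are its children, 4 is under 2.
conn.executemany(
    "INSERT INTO pages_page VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, 8, 0), (2, 1, 2, 5, 1), (3, 1, 6, 7, 1), (4, 1, 3, 4, 2)],
)


def descendant_ids(conn, page_id):
    """Return ids of all descendants of page_id using one range query."""
    tree_id, lft, rght = conn.execute(
        "SELECT tree_id, lft, rght FROM pages_page WHERE id = ?", (page_id,)
    ).fetchone()
    # A descendant's numbers sit strictly inside its ancestor's pair.
    rows = conn.execute(
        "SELECT id FROM pages_page "
        "WHERE tree_id = ? AND lft > ? AND rght < ? ORDER BY lft",
        (tree_id, lft, rght),
    )
    return [row[0] for row in rows]


print(descendant_ids(conn, 1))
print(descendant_ids(conn, 2))
```

The query is the same size no matter how many descendants a page has or how deep the tree goes.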

If you’re running a large tree, and traversing the tree is becoming time-consuming, look into django-mptt.

¹ Note that there’s actually a bug in alldescendants, which I only noticed when checking the behavior of the new method against the behavior of the old one: the final parent__in should have included not just the descendants of self, but self as well. Otherwise, it won’t include self’s immediate children, only its grandchildren and later generations.
² This also had the effect of hiding a bug in modified: if the page has no descendants, .get() will fail, so the method falls into the exception code, which gets the current page’s edited date, and that turns out to be the correct date.

Django MPTT: “Utilities for implementing Modified Preorder Tree Traversal with your Django Models and working with trees of Model instances.”
django-ajax-selects: “Enables editing of ForeignKey, ManyToMany and CharField using jQuery UI Autocomplete.”
Mimsy Were the Borogoves: The official blog of Jerry Stratton.
The Walkerville Weekly Reader: In the end times, one newspaper dared to call God to task for His hypocrisy. That newspaper was not us, we swear it. Not the eternal flames!

Contents of Negative Space™ as a whole Copyright © 1994-2024 Jerry Stratton. Individual copyrights remain held by their respective authors unless they specify otherwise. Site titles, such as Negative Space, Strange Bedfellows, Biblyon Broadsheet, Highland Games, and FireBlade Coffeehouse are trademarks of Jerry Stratton.

Code and code snippets, to the extent that they are copyrightable, may be re-distributed under the terms of the GNU General Public License 3.

Converting an existing Django model to Django-MPTT last modified May 27th, 2017.
