Django ORM performance tricks

Tue 19 January 2010

Lately, I have been trying to improve the response time of http://zofa.be, which is written using Django. I have done this mainly for fun since, with about 30 000 pageviews a year, ZoFa is not exactly the size of YouTube and can be handled easily by my shared hosting on djangohosting.ch (when they are not DDoSed, but that is another story).

Here are my three most useful tips after this experience:

  1. As for any optimization effort, you should first be able to measure performance. For Django, the de facto standard tool seems to be the Django Debug Toolbar, even if it is officially not part of the Django framework itself. This tool is so great that it hurts. I would do many things to get a tool of this quality in my daily Java development. In my case, the most interesting part was the report on the number of SQL queries done for each page. To give you a hint of how good it is, check this video. (Notice the jazz soundtrack; jazz references are a hallmark of the Django culture.) The great feature of the Debug Toolbar for optimizing your queries is that it lets you see, directly in your browser, all the queries done to render the current page.

  2. Use the select_related() queryset method, documented here. The explanation is much better than whatever I could write, so read it.

  3. Using the select_related() method amounts to one idea: get all the objects you need in one SQL query rather than doing one query per object. (Nevertheless, beware of loading too many objects with these tricks; you should load just enough.) A similar idea is the following: when you need to access all the objects of a queryset, possibly several times, cast it to a list first, so that the query runs exactly once and every later access works on objects already in memory.

Put in an example, you should do this:

    # Items is assumed to be a model imported from the app's models module
    items = Items.objects.all()
    items = list(items)  # the query runs here, once
    for item in items:
        print(item)

instead of this:

    items = Items.objects.all()
    for item in items:
        print(item)

To be fair, the second snippet also makes a single query for a plain loop like this one, since a queryset fetches its rows on first iteration and keeps them in its own cache. The difference shows up when the queryset is used before that cache is filled: indexing or slicing an unevaluated queryset (items[0], then items[1], and so on) makes one query per access, while the list never goes back to the database. Notice that if you pass a queryset to a template that will display all its elements, you can cast it to a list too, for similar reasons. I first read about this in the Django documentation, but unfortunately, I am unable to find it again right now.
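The "one query rather than one query per object" idea behind select_related() can be sketched without Django at all. The example below is only an illustration using the standard sqlite3 module: the author and book tables, their contents, and the helper names (run, count_queries, and the two fetch functions) are all made up for the demonstration. In Django terms, the first function corresponds roughly to looping over books and touching book.author each time, and the second to what a select_related() call asks the database to do with a JOIN.

```python
import sqlite3

# Hypothetical schema, for illustration only: each book has one author.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES author(id)
    );
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book VALUES (1, 'First', 1), (2, 'Second', 1), (3, 'Third', 2);
""")

executed = []  # every statement sent through run() is recorded here


def run(sql, params=()):
    """Execute one SQL statement, record it, and return all rows."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def one_query_per_object():
    """One query for the books, then one more query for each book's author."""
    result = []
    for title, author_id in run("SELECT title, author_id FROM book ORDER BY id"):
        rows = run("SELECT name FROM author WHERE id = ?", (author_id,))
        result.append((title, rows[0][0]))
    return result


def single_join():
    """One JOIN fetches every book together with its author."""
    return run(
        "SELECT book.title, author.name FROM book "
        "JOIN author ON author.id = book.author_id ORDER BY book.id"
    )


def count_queries(fn):
    """Return fn's result and the number of statements it issued."""
    before = len(executed)
    result = fn()
    return result, len(executed) - before


for fn in (one_query_per_object, single_join):
    rows, n = count_queries(fn)
    print(fn.__name__, "->", n, "queries for", len(rows), "rows")
```

Both functions return the same rows; only the number of round trips differs, and the gap grows with the number of books. This is the kind of difference the Debug Toolbar's query report makes visible on a real page.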

One final note: since the site is heavily customized (à la Facebook) for each user, I am not sure I would gain much from caching (even if I do think that the template caching introduced in Django 1.2 could be very efficient). That said, if I ever get performance problems (which would in fact be a happy problem), that would be the first thing on the to-do list.

Edit: I just discovered the page in the official Django documentation about database access optimization. It makes this post pretty useless...
