Django querysets + memcached: best practices


Querysets are lazy, which means they don’t call the database until they’re evaluated. One way to evaluate them is to serialize them, which is what cache.set does behind the scenes. So no, this isn’t a waste of time: the entire contents of your Tournament model will be cached, if that’s what you want. It probably isn’t, though, and if you filter the cached queryset further, Django will just go back to the database, which would make the whole thing a bit pointless. You should just cache the model instances you actually need.

Note that the third point in your initial set isn’t quite right, in that this has nothing to do with Apache or preforking. It’s simply that a view is a function like any other, and anything defined in a local variable inside a function goes out of scope when that function returns. So a queryset defined and evaluated inside a view goes out of scope when the view returns the response, and a new one will be created the next time the view is called, i.e. on the next request. This is the case whichever way you are serving Django.

However, and this is important, if you do something like set your queryset to a global (module-level) variable, it will persist between requests. Most of the ways that Django is served, and this definitely includes mod_wsgi, keep a process alive for many requests before recycling it, so the value of the queryset will be the same for all of those requests. This can be useful as a sort of bargain-basement cache, but is difficult to get right because you have no idea how long the process will last, plus other processes are likely to be running in parallel which have their own versions of that global variable.
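The difference between the local and the module-level case can be sketched in plain Python, with no Django involved. The names here (`load_tournaments`, `view_with_local`, `view_with_global`, `_TOURNAMENTS`) are made up for the illustration, and `load_tournaments` is just a stand-in for evaluating a queryset.

```python
db_hits = 0  # counts simulated database calls


def load_tournaments():
    """Stand-in for evaluating a queryset; each call counts as one database hit."""
    global db_hits
    db_hits += 1
    return ["Spring Open", "Autumn Cup"]


def view_with_local(request=None):
    # Local variable: rebuilt on every call and discarded when the function returns.
    tournaments = load_tournaments()
    return ", ".join(tournaments)


_TOURNAMENTS = None  # module-level: lives for as long as this process does


def view_with_global(request=None):
    global _TOURNAMENTS
    if _TOURNAMENTS is None:
        # Only the first call in this process reaches the "database".
        _TOURNAMENTS = load_tournaments()
    return ", ".join(_TOURNAMENTS)


# Three "requests" against each view, handled by the same process.
for _ in range(3):
    view_with_local()
hits_local = db_hits

db_hits = 0
for _ in range(3):
    view_with_global()
hits_global = db_hits

print(hits_local, hits_global)
```

Note that the global version never refreshes its data, and each worker process would hold its own copy of `_TOURNAMENTS`, which is exactly why this approach is hard to get right.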

Updated to answer questions in the comment

Your questions show that you still haven’t quite grokked how querysets work. It’s all about when they are evaluated: if you call list() on a queryset, or iterate over it, or slice it with a step, that evaluates it (a plain slice of an unevaluated queryset just gives you another lazy queryset). It’s at that point that the database call is made (I count serialization under iterating, here) and the results are stored in the queryset’s internal cache. So, if you’ve already done one of those things to your queryset, and then set it to the (external) cache, that won’t cause another database hit.

But every filter() operation on a queryset, even one that’s already evaluated, means another database hit. That’s because it’s a modification of the underlying SQL query, so filter() returns a new queryset with its own, empty, internal cache, and Django goes back to the database as soon as that new queryset is evaluated.
