How to have a "random" order on a set of objects with paging in Django?

13👍

Exactly how random must these be? Does it have to be different for each user, or is it merely the appearance of randomness that is important?

If it is the latter, then you can simply add a field called ordering to the model in question, and populate it with random integers.
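
A minimal sketch of that idea in plain Python, with dictionaries standing in for model rows. The `ordering` key and both helper names are illustrative assumptions, not Django API; in a real project the field would presumably be an integer column on the model and the view would sort by it before paginating.

```python
import random


def assign_random_ordering(rows, rng=None):
    """Give every row a random integer in its (hypothetical) 'ordering' field."""
    rng = rng or random.Random()
    for row in rows:
        row["ordering"] = rng.randrange(0, 2**31)
    return rows


def get_page(rows, page_number, per_page=10):
    """Return one page of rows sorted by the stored 'ordering' value.

    The id is used as a tie-breaker so that rows with equal 'ordering'
    values still come back in the same order on every request.
    """
    ordered = sorted(rows, key=lambda row: (row["ordering"], row["id"]))
    start = (page_number - 1) * per_page
    return ordered[start:start + per_page]


rows = assign_random_ordering([{"id": i} for i in range(1, 26)])
first_page = get_page(rows, 1)
second_page = get_page(rows, 2)
```

Because the random values are stored rather than regenerated, every request sees the same order, so pages never overlap or skip records. Re-running the assignment step (for example on a schedule) would reshuffle the order for everyone.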

Otherwise, unless the recordset is small (and, given that it is being paged, I doubt it), storing a separate random queryset for each session could become a memory issue very quickly, unless you know that the user base is very small. Here is one possible solution that mimics randomness but in reality creates only 5 random sets:

import random

from django.core.cache import cache
from django.core.paginator import Paginator

RANDOM_EXPERIENCES = 5

def my_view(request):
    # Assign each session one of the shared random orderings. Test for the
    # key itself: 0 is a valid value and would fail a truthiness check.
    if 'random_exp' not in request.session:
        request.session['random_exp'] = random.randrange(0, RANDOM_EXPERIENCES)
    cache_key = 'random_exp_%d' % request.session['random_exp']
    object_list = cache.get(cache_key)
    if object_list is None:
        # Evaluate the shuffled queryset once and share it for 100 seconds.
        object_list = list(Object.objects.all().order_by('?'))
        cache.set(cache_key, object_list, 100)
    paginator = Paginator(object_list, 10)
    page = 1 # or whatever page we have
    display_list = paginator.page(page)
    ....

In this example, we do not create a separate queryset for each user (which could result in thousands of querysets in storage) and we do not store it in request.session (a less efficient storage mechanism than cache, which can be set to use something very efficient indeed, like memcached). Instead, we have just 5 querysets stored in cache, but hopefully a sufficiently random experience for most users. If you want more randomness, increasing the value of RANDOM_EXPERIENCES should help. I think you could probably go up as high as 100 with few performance issues.

If the records themselves change infrequently, you can set an extremely high timeout for the cache.

Update

Here’s a way to implement it that uses slightly more memory/storage but ensures that each user can “hold on” to their queryset without danger of its cache timing out (assuming that 3 hours is long enough to look at the records).

import datetime

...

    if 'random_exp' not in request.session:
        # The strftime result is a string, so it is formatted with %s.
        request.session['random_exp'] = "%s_%d" % (
            datetime.datetime.now().strftime('%Y%m%d%H'),
            random.randrange(0, RANDOM_EXPERIENCES)
        )
    cache_key = "random_exp_%s" % request.session['random_exp']
    object_list = cache.get(cache_key)
    if object_list is None:
        object_list = list(Object.objects.all().order_by('?'))
        # Store the list itself under the key, for 4 hours.
        cache.set(cache_key, object_list, 60*60*4)

Here we create a cached queryset that does not time out for 4 hours. However, the request.session key is set to the year, month, day, and hour so that someone coming in sees a recordset current for that hour. Anyone who has already viewed the queryset will be able to see it for at least another 3 hours (or for as long as their session is still active) before it expires. At most, there will be 5*RANDOM_EXPERIENCES querysets stored in cache.
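
To make the key scheme concrete, here is a small sketch of how such a session value could be built. The helper name `make_experience_key` and its parameters are assumptions for illustration; only the key layout (hour bucket, underscore, slot number) comes from the description above.

```python
import datetime
import random

RANDOM_EXPERIENCES = 5


def make_experience_key(now, experiences=RANDOM_EXPERIENCES, rng=random):
    """Build a session value made of the hour bucket plus a random slot number."""
    hour_bucket = now.strftime("%Y%m%d%H")  # year, month, day, hour
    slot = rng.randrange(0, experiences)
    return "%s_%d" % (hour_bucket, slot)


key = make_experience_key(datetime.datetime(2024, 3, 15, 14, 30))
cache_key = "random_exp_%s" % key
```

Every visitor whose session starts within the same hour draws from the same RANDOM_EXPERIENCES keys, while visitors arriving in a later hour get a fresh set of keys and therefore freshly shuffled lists.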

2👍

Try using the Django Meta option ordering.

Putting a question mark “?” in it results in random ordering.

https://docs.djangoproject.com/en/1.3/ref/models/options/#ordering

1👍

@Jordan Reiter’s solution is really great, but there is a small problem when using it. If a record is updated, it takes a long time for the change to take effect. It also uses too much cache space if the number of records is large.

I optimized it by caching only the primary key column. When records are updated, the change takes effect immediately.

import random

from django.core.cache import cache
from django.core.paginator import Paginator

RANDOM_EXPERIENCES = 5

# 0 is a valid value, so test for the key rather than its truthiness.
if 'random_exp' not in request.session:
    request.session['random_exp'] = random.randrange(0, RANDOM_EXPERIENCES)
cache_key = 'random_exp_%d' % request.session['random_exp']
id_list = cache.get(cache_key)
if id_list is None:
    # Cache only the shuffled primary keys, not the full objects.
    id_list = list(Object.objects.order_by('?').values_list('id', flat=True))
    cache.set(cache_key, id_list, 60*60*4)
paginator = Paginator(id_list, 9)
page = 1 # or whatever page we have
display_id_list = paginator.page(page).object_list
# id__in does not preserve the order of the ids, so restore it afterwards.
objects_by_id = {obj.id: obj for obj in Object.objects.filter(id__in=display_id_list)}
object_list = [objects_by_id[pk] for pk in display_id_list if pk in objects_by_id]

0👍

Your best course of action is probably to convert your queryset to a list, and then shuffle it:

from random import shuffle
object_list = list(object_list)
shuffle(object_list)
... continue with pagination ...

Do note, however, that converting a queryset to a list will evaluate it. This will become a performance nightmare if your Object table grows large.

If you want to store those objects, you can either create another table that associates user ids with a list of Object ids, or store the 100 ids or so in a session cookie. There is not much you can do about this: HTTP is stateless, so persistence has to be achieved with cookies or a data store (most likely an RDBMS).
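
A rough sketch of the session-based variant, using a plain dict as a stand-in for the session store. The function name, the `shuffled_ids` key, and the cap of 100 ids are illustrative assumptions rather than anything Django prescribes.

```python
import random


def get_user_page(session, all_ids, page_number, per_page=10, max_ids=100):
    """Return the ids for one page of this user's private random order.

    `session` is any dict-like store (a stand-in for request.session here).
    The shuffled ids are created once and reused on later requests.
    """
    if "shuffled_ids" not in session:
        ids = list(all_ids)
        random.shuffle(ids)
        session["shuffled_ids"] = ids[:max_ids]  # cap what is stored per user
    ids = session["shuffled_ids"]
    start = (page_number - 1) * per_page
    return ids[start:start + per_page]


session = {}  # hypothetical stand-in for a real session object
page_one = get_user_page(session, range(1, 501), 1)
page_one_again = get_user_page(session, range(1, 501), 1)
page_two = get_user_page(session, range(1, 501), 2)
```

Only ids are kept per user, so the stored data stays small; the actual objects would still be fetched from the database for each page.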

0👍

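If you only need pseudo-randomness and your database is PostgreSQL, manually calling setseed could suffice (and will avoid any memory/performance issues):

from django.db import connection


# setseed() expects a value between -1 and 1, so scale the count into that range.
seed = (qs.count() % 1000) / 1000.0
with connection.cursor() as cursor:
    # Pass the seed as a query parameter instead of interpolating it into the SQL.
    cursor.execute("SELECT setseed(%s);", [seed])
qs = qs.order_by('?')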

From there on, you can proceed as usual with pagination. The random order will only change when the number of objects in your queryset changes. This can be desirable (for example, people can refer to "the second entry on the second page", which is impossible with real randomness). You can also try another seed approach, for example seed = timezone.now().month / 12.0, to get a new random order every month (setseed only accepts values between -1 and 1).
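
Since PostgreSQL's setseed only accepts values between -1 and 1, any integer source has to be scaled first. The two helpers below are a hypothetical sketch of that mapping; their names and the modulus of 1000 are assumptions chosen for illustration.

```python
def seed_from_int(value, modulus=1000):
    """Map any integer onto [0, 1), which lies inside setseed's [-1, 1] range."""
    return (value % modulus) / float(modulus)


def seed_from_month(month):
    """Map a month number 1..12 onto (0, 1]."""
    return month / 12.0


count_seed = seed_from_int(12345)
month_seed = seed_from_month(6)
```

Two inputs that differ by a multiple of the modulus produce the same seed, so a larger modulus makes an accidental repeat of the order less likely.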
