[Fixed]-Django queryset exclude() with multiple related field clauses

32đź‘Ť

âś…

This is a still-present gotcha in Django, where exclude() is not acting as the reverse of filter(). Here’s the documentation explaining the difference:

Note

The behavior of filter() for queries that span multi-value
relationships, as described above, is not implemented equivalently for
exclude(). Instead, the conditions in a single exclude() call will not
necessarily refer to the same item.

For example, the following query would exclude blogs that contain both
entries with “Lennon” in the headline and entries published in 2008:

Blog.objects.exclude(
    entry__headline__contains='Lennon',
    entry__pub_date__year=2008,
)

However, unlike the behavior when using filter(), this will not limit
blogs based on entries that satisfy both conditions. In order to do
that, i.e. to select all blogs that do not contain entries published
with “Lennon” that were published in 2008, you need to make two
queries:

Blog.objects.exclude(
    entry__in=Entry.objects.filter(
        headline__contains='Lennon',
        pub_date__year=2008,
    ),
)

What you’ve done is probably the way to go.

👤M Somerville

6đź‘Ť

I implemented a quick-and-dirty solution so I could move on, expecting it to be horribly inefficient; however, on inspection of the SQL generated, it turned out to not be that bad:

>>> User.objects.exclude(id__in=User.objects.filter(preferences__preference="PREF_A", preferences__value=True))

I thought that the ORM would load the results of the subordinate query into the webserver’s memory before completing (this is a problem because our production app will have in the millions of users), but in fact it properly uses a subquery:

>>> User.objects.exclude(id__in=User.objects.filter(preferences__preference="PREF_A", preferences__value=True)).values('id').query.sql_with_params()
(u'SELECT "sgauth_user"."id" FROM "sgauth_user" WHERE NOT ("sgauth_user"."id" IN (SELECT U0."id" FROM "sgauth_user" U0 INNER JOIN "feeds_preference" U1 ON (U0."id" = U1."user_id") WHERE (U1."preference" = %s  AND U1."value" = %s )))', ('PREF_A', True))

I’m putting this up as one possible answer, but I am still interested if there is a way to do it with the straightforward exclude clause, or a way of generating a query through the ORM that works with straightforward joins and no subquery whatsoever.

2đź‘Ť

You can use the new django SubQuery to avoid doing 2 queries to the server:

User.objects.exclude(id__in=SubQuery(User.objects.filter(preferences__preference="PREF_A", preferences__value=True)))
👤Patrik Beck

Leave a comment