Django: how to do get_or_create() in a threadsafe way?

11👍

✅

This must be a very common situation. How do I handle it in a threadsafe way?

Yes.

The “standard” solution in SQL is to simply attempt to create the record. If it works, that’s good. Keep going.

If an attempt to create a record gets a “duplicate” exception from the RDBMS, then do a SELECT and keep going.
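
The insert-first pattern described above can be sketched with the standard library's sqlite3 module. This is only an illustration under stated assumptions: the table `tag` and the helper `create_or_get` are hypothetical names, and the approach relies on the UNIQUE constraint on `slug`.

```python
import sqlite3

# Hypothetical table for illustration; the UNIQUE constraint does the real work.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tag (id INTEGER PRIMARY KEY, slug TEXT UNIQUE, name TEXT)")


def create_or_get(conn, slug, name):
    """Try the INSERT first; on a duplicate, fall back to a SELECT.

    Returns a (row, created) tuple.
    """
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO tag (slug, name) VALUES (?, ?)", (slug, name))
        created = True
    except sqlite3.IntegrityError:
        # The database rejected the duplicate, so the row already exists.
        created = False
    row = conn.execute(
        "SELECT id, slug, name FROM tag WHERE slug = ?", (slug,)
    ).fetchone()
    return row, created


first, created_first = create_or_get(conn, "django", "Django")
second, created_second = create_or_get(conn, "django", "ignored")
```

The second call hits the duplicate error and returns the row stored by the first call, so the name passed in the second call is never written.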

Django, however, has an ORM layer with its own cache. So the logic is inverted: the common case works directly and quickly, and the uncommon case (the duplicate) raises a rare exception.

👤S.Lott

51👍

Since 2013 or so, get_or_create has been atomic, so it handles concurrency nicely. From the documentation:

This method is atomic assuming correct usage, correct database
configuration, and correct behavior of the underlying database.
However, if uniqueness is not enforced at the database level for the
kwargs used in a get_or_create call (see unique or unique_together),
this method is prone to a race-condition which can result in multiple
rows with the same parameters being inserted simultaneously.

If you are using MySQL, be sure to use the READ COMMITTED isolation
level rather than REPEATABLE READ (the default), otherwise you may see
cases where get_or_create will raise an IntegrityError but the object
won’t appear in a subsequent get() call.

From: https://docs.djangoproject.com/en/dev/ref/models/querysets/#get-or-create

Here’s an example of how you could do it:

Define a model with either unique=True:

from django.db import models

class MyModel(models.Model):
    slug = models.SlugField(max_length=255, unique=True)
    name = models.CharField(max_length=255)

# user_slug and user_name stand for values supplied by the caller.
obj, created = MyModel.objects.get_or_create(slug=user_slug, defaults={"name": user_name})

… or with unique_together:

class MyModel(models.Model):
    prefix = models.CharField(max_length=3)
    slug = models.SlugField(max_length=255)
    name = models.CharField(max_length=255)

    class Meta:
        unique_together = ("prefix", "slug")

# user_prefix, user_slug and user_name stand for values supplied by the caller.
obj, created = MyModel.objects.get_or_create(prefix=user_prefix, slug=user_slug, defaults={"name": user_name})

Note how the non-unique fields are in the defaults dict, NOT among the unique fields in get_or_create. This will ensure your creates are atomic.

Here’s how it’s implemented in Django: https://github.com/django/django/blob/fd60e6c8878986a102f0125d9cdf61c717605cf1/django/db/models/query.py#L466 – Try creating the object, catch any IntegrityError, and return the existing row in that case. In other words: handle atomicity in the database.
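
The get-first flow can be sketched outside Django with sqlite3. This is not Django's actual code, only a rough stand-in under stated assumptions: the table `mymodel` and the helpers `get_by_slug` and `get_or_create` are hypothetical, and the UNIQUE constraint on `slug` plays the role of `unique=True`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mymodel (id INTEGER PRIMARY KEY, slug TEXT UNIQUE, name TEXT)"
)


def get_by_slug(slug):
    """Return the matching row, or None if there is none."""
    return conn.execute(
        "SELECT id, slug, name FROM mymodel WHERE slug = ?", (slug,)
    ).fetchone()


def get_or_create(slug, defaults):
    # 1. Common case: the row already exists, so a plain lookup is enough.
    row = get_by_slug(slug)
    if row is not None:
        return row, False
    # 2. Not found: try to create it inside a transaction.
    try:
        with conn:
            conn.execute(
                "INSERT INTO mymodel (slug, name) VALUES (?, ?)",
                (slug, defaults["name"]),
            )
        return get_by_slug(slug), True
    except sqlite3.IntegrityError:
        # 3. Another writer inserted the row between steps 1 and 2;
        #    the unique constraint rejected ours, so fetch theirs.
        return get_by_slug(slug), False


obj_a, created_a = get_or_create("django", {"name": "Django"})
obj_b, created_b = get_or_create("django", {"name": "Other"})
```

Only the unique field is used for the lookup; the values in `defaults` are applied solely when a row is created, which mirrors the note above about keeping non-unique fields in the defaults dict.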

3👍

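Try the transaction.commit_on_success decorator on the callable that calls get_or_create(**kwargs). (commit_on_success was deprecated in Django 1.6 and removed in 1.8; transaction.atomic is its replacement.)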

“Use the commit_on_success decorator to use a single transaction for all the work done in a function. If the function returns successfully, then Django will commit all work done within the function at that point. If the function raises an exception, though, Django will roll back the transaction.”

Apart from that, in concurrent calls to get_or_create, both threads try to get the object using the arguments passed in (except for the “defaults” argument, a dict used only during the create call when get() fails to retrieve an object). If the get fails, both threads try to create the object, which results in duplicate objects unless a unique or unique_together constraint is enforced at the database level on the field(s) used in the get() call.
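
The race can be reproduced deterministically with a small stand-in. Everything here is hypothetical: `rows` plays the role of a table with no unique constraint, and the `threading.Barrier` exists only to force the unlucky interleaving in which both threads finish their lookup before either one inserts.

```python
import threading

rows = []  # stand-in for a table WITHOUT a unique constraint
barrier = threading.Barrier(2)


def naive_get_or_create(slug):
    # The "get" step: look for an existing row.
    existing = [r for r in rows if r["slug"] == slug]
    # Force both threads to complete the lookup before either creates.
    barrier.wait()
    if existing:
        return existing[0], False
    # The "create" step: nothing stops the second insert.
    row = {"slug": slug}
    rows.append(row)
    return row, True


threads = [
    threading.Thread(target=naive_get_or_create, args=("django",))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

duplicate_count = len(rows)
```

Both threads see an empty result and both insert, so the table ends up with two rows for the same slug. A database-level unique constraint would turn the second insert into an error instead.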

This is similar to the following post:
How do I deal with this race condition in django?

👤vijay shanker

3👍

So many years have passed, but nobody has written about threading.Lock. If, for legacy reasons, you cannot add a migration for unique_together, you can use threading.Lock or threading.Semaphore objects. Here is the pseudocode:

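from concurrent.futures import ThreadPoolExecutor
from threading import Lock

# One lock shared by all threads: only one at a time runs get_or_create.
_lock = Lock()


def get_staff(data: dict):
    # "with" acquires the lock and always releases it, even on an exception.
    with _lock:
        staff, created = MyModel.objects.get_or_create(**data)
        return staff


# MyModel and get_list_of_some_data() are placeholders defined elsewhere.
with ThreadPoolExecutor(max_workers=50) as pool:
    pool.map(get_staff, get_list_of_some_data())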
👤Mastermind
