[Django]-Django post_save as celery task strange behavior

5👍

There are two issues with your code.

The first one is the biggest: you are passing an object instance to the task. This is explicitly marked as a wrong approach in the Celery documentation. What you are doing is serializing your object in some state and passing it to Celery to work on, but in the meantime the object may change. As a solution, pass the object's id as the parameter, so the Celery task can fetch a fresh copy:

build.delay(instance.pk)

...

@task
def build(my_key):
    # fetch a fresh copy of the object inside the task
    instance = SomeModel.objects.get(pk=my_key)
    instance.status = 'Processing'
    instance.save()

The second issue is subtle in nature and rarely shows up on the radar. The first part of your code can be called inside a transaction, which means the task may start running (in Celery) before the transaction commits. Your model will then be saved first by the Celery task and afterwards overwritten by the committing transaction – and there is your issue.

If you change your code as suggested above, the situation described as the second issue may not happen, or it will surface as a different error: the task will fetch by pk before the transaction commits and not find the object at all.

To avoid such problems, it is good to schedule Celery tasks from a transaction.on_commit handler (introduced in Django 1.9).
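A minimal sketch of that wiring, assuming a post_save receiver for the same model (SomeModel and build are stand-in names from the examples above):

```python
from django.db import transaction
from django.db.models.signals import post_save
from django.dispatch import receiver


@receiver(post_save, sender=SomeModel)
def schedule_build(sender, instance, **kwargs):
    # The callback runs only after the surrounding transaction commits,
    # so the Celery worker is guaranteed to see the saved row.
    transaction.on_commit(lambda: build.delay(instance.pk))
```

Note that if there is no transaction in progress, `on_commit` executes the callback immediately, so this is safe to use in autocommit mode as well.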

One more comment: from what I can see, you are changing the status of the object:

instance.status = 'Processing'

most probably as something informational, but maybe also as a locking mechanism… There is a very nice option – the select_for_update method of QuerySet – which will lock the object for the duration of the transaction. This is especially handy for Celery tasks; when you do:

instance = SomeModel.objects.select_for_update().get(pk=my_key)

it will make your task wait for the others to finish (do not forget to put @transaction.atomic over this task).

If you pass nowait=True to select_for_update, it will raise a DatabaseError immediately instead of blocking, allowing you to handle the situation yourself.
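Putting the locking pieces together – a sketch, assuming the same hypothetical SomeModel and build names as above:

```python
from django.db import DatabaseError, transaction


@task
@transaction.atomic  # select_for_update() must run inside a transaction
def build(my_key):
    try:
        # nowait=True: raise instead of blocking if another
        # transaction already holds the row lock
        instance = SomeModel.objects.select_for_update(nowait=True).get(pk=my_key)
    except DatabaseError:
        # someone else is processing this object – skip, retry later, etc.
        return
    instance.status = 'Processing'
    instance.save()
```

Without nowait=True the task would simply block until the competing transaction releases the lock, which is often the simpler choice if double-processing is the only concern.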

👤Jerzyk
