Commit manually in Django data migration

11👍

The best workaround I found is manually exiting the atomic scope before running the data migration:

def modify_data(apps, schema_editor):
    schema_editor.atomic.__exit__(None, None, None)
    # [...]

Unlike resetting connection.in_atomic_block manually, this still allows using the atomic context manager inside the migration. There doesn’t seem to be a much saner way.

One can contain the (admittedly messy) transaction break-out logic in a decorator to be used with the RunPython operation:

from functools import wraps

def non_atomic_migration(func):
    """
    Close a transaction from within code that is marked atomic. This is
    required to break out of a transaction scope that is automatically wrapped
    around each migration by the schema editor. This should only be used when
    committing manually inside a data migration. Note that it doesn't re-enter
    the atomic block afterwards.
    """
    @wraps(func)
    def wrapper(apps, schema_editor):
        # Only exit if the schema editor actually opened an atomic block.
        if schema_editor.connection.in_atomic_block:
            schema_editor.atomic.__exit__(None, None, None)
        return func(apps, schema_editor)
    return wrapper
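To see what the decorator does without a database, here is a minimal sketch that runs it against stand-in objects. FakeConnection, FakeAtomic and FakeSchemaEditor are hypothetical test doubles, not Django classes; they only mimic the two attributes the decorator touches.

```python
from functools import wraps


def non_atomic_migration(func):
    """Leave the schema editor's atomic block (if any) before calling func."""
    @wraps(func)
    def wrapper(apps, schema_editor):
        if schema_editor.connection.in_atomic_block:
            schema_editor.atomic.__exit__(None, None, None)
        return func(apps, schema_editor)
    return wrapper


# Hypothetical stand-ins for Django's objects, used only to show the control flow.
class FakeConnection:
    def __init__(self):
        self.in_atomic_block = True


class FakeAtomic:
    def __init__(self, connection):
        self.connection = connection

    def __exit__(self, exc_type, exc_value, traceback):
        # Django's real atomic block commits on a clean exit;
        # this stand-in only flips the flag.
        self.connection.in_atomic_block = False


class FakeSchemaEditor:
    def __init__(self):
        self.connection = FakeConnection()
        self.atomic = FakeAtomic(self.connection)


@non_atomic_migration
def modify_data(apps, schema_editor):
    # By the time the body runs, the wrapper has already left the atomic block.
    return schema_editor.connection.in_atomic_block


editor = FakeSchemaEditor()
still_atomic = modify_data(None, editor)
```

In a real migration the decorated function would be passed to migrations.RunPython as usual.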

Update

Django 1.10 and later support non-atomic migrations: set atomic = False on the Migration class.

8👍

From the documentation about RunPython:

By default, RunPython will run its contents inside a transaction on databases that do not support DDL transactions (for example, MySQL and Oracle). This should be safe, but may cause a crash if you attempt to use the schema_editor provided on these backends; in this case, pass atomic=False to the RunPython operation.

So, instead of what you’ve got, pass atomic=False:

from django.db import migrations

class Migration(migrations.Migration):
    operations = [
        migrations.RunPython(modify_data, atomic=False),
    ]

1👍

First, set atomic = False on the Migration class:

class Migration(migrations.Migration):
    atomic = False

Then, in your function, you can wrap a block of code in transaction.atomic() to make only that block atomic:

from django.db import transaction

for row in rows:
    with transaction.atomic():
        do_something(row)
        # Changes made by `do_something` will be committed by this point

Here’s the relevant documentation: https://docs.djangoproject.com/en/4.1/howto/writing-migrations/#non-atomic-migrations

Gotcha: migrations.RunPython(forwards_func, atomic=False) does NOT do what you want. It only stops Django from wrapping that operation in a transaction of its own, which it doesn’t do on PostgreSQL anyway. The atomic=False option is meant for databases that don’t support DDL transactions, as the documentation states: https://docs.djangoproject.com/en/4.1/ref/migration-operations/#runpython

By default, RunPython will run its contents inside a transaction on databases that do not support DDL transactions (for example, MySQL and Oracle). This should be safe, but may cause a crash if you attempt to use the schema_editor provided on these backends; in this case, pass atomic=False to the RunPython operation.

On databases that do support DDL transactions (SQLite and PostgreSQL), RunPython operations do not have any transactions automatically added besides the transactions created for each migration.

0👍

For others coming across this: you can have both schema changes (ALTER TABLE) and data changes (RunPython) in the same migration. Just make sure all the ALTER TABLE operations go first; you cannot run RunPython before any ALTER TABLE.
