What is the best way to handle Django migration data with over 500k records for MySQL


You can work with a Subquery expression [Django-doc], and do the update in bulk:

from django.db.models import OuterRef, Subquery


def remove_foreign_keys_from_user_request(apps, schema_editor):
    # Use the historical models provided by the migration's app registry.
    UserRequests = apps.get_model('users', 'UserRequests')
    Action = apps.get_model('users', 'Action')
    Status = apps.get_model('users', 'ProcessingStatus')
    # A single UPDATE: each column is filled from a correlated subquery.
    UserRequests.objects.update(
        action_duplicate=Subquery(
            Action.objects.filter(
                pk=OuterRef('action_id')
            ).values('name')[:1]
        ),
        status_duplicate=Subquery(
            Status.objects.filter(
                pk=OuterRef('status_id')
            ).values('name')[:1]
        )
    )
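
To see what such a bulk update amounts to at the database level, here is a minimal, self-contained sketch that uses Python's sqlite3 module instead of Django and MySQL. The table names, column names, and sample rows are assumptions modelled on the code above, and the SQL only approximates what the ORM would emit. The point is that one UPDATE with a correlated subquery per column replaces 500k per-row queries.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE action (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE processing_status (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE user_requests (
            id INTEGER PRIMARY KEY,
            action_id INTEGER,
            status_id INTEGER,
            action_duplicate TEXT,
            status_duplicate TEXT
        );
        INSERT INTO action VALUES (1, 'create'), (2, 'delete');
        INSERT INTO processing_status VALUES (1, 'pending'), (2, 'done');
        INSERT INTO user_requests (id, action_id, status_id)
        VALUES (1, 1, 2), (2, 2, 1), (3, NULL, 1);
    """)
    return conn


def copy_names_in_bulk(conn):
    """Fill both *_duplicate columns with a single UPDATE statement."""
    cur = conn.execute("""
        UPDATE user_requests
        SET action_duplicate = (
                SELECT name FROM action
                WHERE action.id = user_requests.action_id
                LIMIT 1
            ),
            status_duplicate = (
                SELECT name FROM processing_status
                WHERE processing_status.id = user_requests.status_id
                LIMIT 1
            )
    """)
    conn.commit()
    # Number of rows touched by the UPDATE.
    return cur.rowcount
```

Rows without a matching action or status simply get NULL in the corresponding column, because the scalar subquery returns no row for them.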

That being said, it looks like what you are doing is actually the opposite of database normalization [wiki]. Usually, if there is duplicated data, you introduce an extra model with one Action/Status record per value, so that the same value is not stored many times in action_duplicate/status_duplicate. Copying the names into every row will make the database larger and harder to maintain.
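
The maintenance cost can be illustrated with a small sketch, again using sqlite3 with a hypothetical schema and made-up rows rather than the real models. When the name lives only in the lookup table, a rename is a one-row update. Once the name has been copied into every request, the copies silently go stale.

```python
import sqlite3


def build_denormalized_demo():
    """In-memory tables where the action name is also copied per request."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE action (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE user_requests (
            id INTEGER PRIMARY KEY,
            action_id INTEGER,
            action_duplicate TEXT
        );
        INSERT INTO action VALUES (1, 'create'), (2, 'delete');
        INSERT INTO user_requests VALUES
            (1, 1, 'create'), (2, 1, 'create'), (3, 2, 'delete');
    """)
    return conn


def rename_action(conn, action_id, new_name):
    """Rename an action in the lookup table only (a single-row update)."""
    conn.execute(
        "UPDATE action SET name = ? WHERE id = ?", (new_name, action_id)
    )
    conn.commit()


def count_stale_copies(conn):
    """Count requests whose copied name no longer matches the lookup table."""
    return conn.execute("""
        SELECT COUNT(*)
        FROM user_requests AS r
        JOIN action AS a ON a.id = r.action_id
        WHERE r.action_duplicate <> a.name
    """).fetchone()[0]
```

In this sketch, renaming one action leaves every request that copied the old name out of date until a second, table-wide update is run.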


Note: normally a Django model is given a singular name, so UserRequest instead of UserRequests.
