[Fixed] Django database synchronization for offline usage

5šŸ‘

It turns out I'm running a system like this in Django.

This is not a complete answer, just the approach that is currently (mostly) solving the problem.

  • Use UUIDs for primary keys. That greatly reduces the chance of primary key collisions between objects created on different sites.
  • Use Django's serialization framework for data interchange. The central admin site has an option to download the selected objects in the changelist to a Django-compatible serialized file. The user can then go offline, start a local admin site, and upload the serialized file there. When offline editing is finished, the same process runs in reverse: the objects are serialized to a file in the "offline" admin site and then uploaded to the central admin site.
  • The serialization framework is very useful, since you get an actual (unsaved) object and can decide whether to save it, and modify some fields before saving (see the sketch after this list).
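A minimal sketch of that export/import flow, assuming a hypothetical Article model; the function names are illustrative, not the actual code:

```python
import uuid

from django.core import serializers
from django.db import models


class Article(models.Model):
    # UUID primary keys make collisions between sites very unlikely.
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    title = models.CharField(max_length=200)
    body = models.TextField()


def export_articles(queryset, path):
    # Serialize the selected objects to a Django-compatible JSON file.
    with open(path, "w") as f:
        serializers.serialize("json", queryset, stream=f)


def import_articles(path):
    with open(path) as f:
        for deserialized in serializers.deserialize("json", f):
            article = deserialized.object  # a real, unsaved instance
            article.title = article.title.strip()  # optionally tweak fields first
            deserialized.save()  # write it to the local database
```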

We have run into very little trouble with this simple system, helped by the fact that the content is properly categorized and each editor only creates/edits a non-overlapping set of categories.

I have talked about this with some people, and they proposed several solutions:

  • Use a timestamp field: it helps decide which version to keep and which to discard.
  • Use version fields with major and minor version numbers. Minor editing (like spelling corrections) only updates the minor version number; major changes update the major version number and reset the minor to 0. That way, when comparing, you always know which one gets higher priority. However, this needs education and conventions among the editing users.
  • Object updates. A separate model stores updates coming from offline edits. A 'chief' editor then merges them into the actual object, helped by some additional admin views that show differences (using google-diff-match-patch and the like). An object can also be flagged to allow direct updates, that is, skipping the stored updates and applying changes directly on arrival. The inconvenience is that the 'chief' editor has to review all the updates, and that workload depends on how much information gets updated. A sketch combining these ideas follows this list.
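A hypothetical sketch of how those suggestions could look as models; none of these field names come from the original system:

```python
from django.db import models


class VersionedArticle(models.Model):
    title = models.CharField(max_length=200)
    body = models.TextField()
    # Timestamp: helps decide which version to keep on conflict.
    updated_at = models.DateTimeField(auto_now=True)
    # Major/minor versions: spelling fixes bump minor, real changes bump
    # major and reset minor to 0.
    major_version = models.PositiveIntegerField(default=1)
    minor_version = models.PositiveIntegerField(default=0)
    # Flagged objects skip review and are updated directly on arrival.
    allow_direct_updates = models.BooleanField(default=False)


class PendingUpdate(models.Model):
    # One offline edit, stored as serialized data until a 'chief' editor
    # reviews and merges it into the target object.
    target = models.ForeignKey(VersionedArticle, on_delete=models.CASCADE)
    payload = models.TextField()
    received_at = models.DateTimeField(auto_now_add=True)
    applied = models.BooleanField(default=False)
```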

Hope this helps in some way. If anyone decides to implement something like this, I'd love to hear about it.

1šŸ‘

Well, I actually don't know if there's a Django app to do that, but I would proceed like this:

Create a method for "offline_update": connect to the server's database, select all objects whose ids match ones in the local database, and update the local database. Then select the rest of the entries and add them to the local database.

Create a method for "online_update": the same routine, inverted.
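As a rough sketch, assuming a second database alias named "server" is configured in settings.DATABASES next to the local "default" one (the Article model is illustrative):

```python
from myapp.models import Article  # hypothetical model


def offline_update():
    # Pull every object from the server database into the local one;
    # matching ids are updated, new ids are inserted.
    for remote in Article.objects.using("server").all():
        remote.save(using="default")


def online_update():
    # The same routine, inverted: push local objects to the server.
    for local in Article.objects.using("default").all():
        local.save(using="server")
```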

PRO: easy to implement (Model.objects.all() gets you everything; then you manipulate and update, or save directly).

CONS: race conditions (what if two users update the same entry, not necessarily at the same time? Who has the most up-to-date one?).

You basically create a sort of "mysql-svn" to keep the two databases in sync.

I'm voting +1 on your question, because it's really interesting. I've always worked by dumping the database (via mysql) and then loading it into the local database, without using Django.

    1šŸ‘

    I built a Django app which does this. When model instances are created on the remote/laptop version of the app, they are marked as dirty and get a temporary id. The remote app checks regularly for connectivity to the master server. When there is a network connection, i.e. the app is online, it gets a permanent id for each new dirty model instance from the master server. The temporary ids are replaced with permanent ids, and then the dirty instances are synchronized to the master.

    I used the Django REST framework to receive and update the dirty model instances on the master server.

    Note that this also requires running a local web server on the offline computer. I chose CherryPy for that.
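    A hypothetical sketch of that dirty-flag scheme; the master URL, endpoints, and model fields are assumptions, not the author's actual code:

    ```python
    import requests
    from django.db import models

    MASTER = "https://master.example.com/api"  # assumed master server URL


    class Note(models.Model):
        # Permanent id assigned by the master; None while the instance
        # only has a local, temporary id.
        remote_id = models.IntegerField(null=True)
        dirty = models.BooleanField(default=True)
        body = models.TextField()


    def synchronize():
        # Called whenever the connectivity check finds the master reachable.
        for note in Note.objects.filter(dirty=True):
            if note.remote_id is None:
                # Ask the master for a permanent id to replace the temporary one.
                resp = requests.post(f"{MASTER}/allocate-id/")
                note.remote_id = resp.json()["id"]
            # Push the dirty instance to a Django REST framework endpoint.
            requests.put(f"{MASTER}/notes/{note.remote_id}/", data={"body": note.body})
            note.dirty = False
            note.save()
    ```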

    šŸ‘¤nmgeek
