Django multi-table inheritance alternatives for basic data model pattern

While awaiting a better one, here’s my attempt at an answer.

As suggested by Kevin Christopher Henry in the comments above, it makes sense to approach the problem from the database side. As my experience with database design is limited, I have to rely on others for this part.

Please correct me if I’m wrong at any point.

Data-model vs (Object-Oriented) Application vs (Relational) Database

A lot can be said about the object/relational mismatch, or, more accurately, the data-model/object/relational mismatch.

In the present context I guess it is important to note that a direct translation between data-model, object-oriented implementation (Django), and relational database implementation is not always possible or even desirable. A nice three-way Venn diagram could probably illustrate this.

Data-model level

To me, a data-model as illustrated in the original post represents an attempt to capture the essence of a real-world information system. It should be sufficiently detailed and flexible to enable us to reach our goal. It does not prescribe implementation details, but may limit our options nonetheless.

In this case, the inheritance poses a challenge mostly on the database implementation level.

Relational database level

Several SO answers deal with database implementations of (single) inheritance.

These all more or less follow the patterns described in Martin Fowler’s book Patterns of Enterprise Application Architecture. Until a better answer comes along, I am inclined to trust these views. The inheritance section in chapter 3 (2011 edition) sums it up nicely:

For any inheritance structure there are basically three options.
You can have one table for all the classes in the hierarchy: Single Table Inheritance (278) …;
one table for each concrete class: Concrete Table Inheritance (293) …;
or one table per class in the hierarchy: Class Table Inheritance (285) …

and

The trade-offs are all between duplication of data structure and speed of access. …
There’s no clearcut winner here. … My first choice tends to be Single Table Inheritance

A summary of patterns from the book is found on martinfowler.com.
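
To make the three options a bit more tangible, here is a rough sketch of the corresponding table layouts using Python's built-in sqlite3 module. The table names (single_party, concrete_person, class_party, and so on) and the sample row are made up for illustration; only the columns name, type and favorite_color come from the example in this answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Single Table Inheritance: one table for the whole hierarchy,
-- a discriminator column, and nullable child-specific columns.
CREATE TABLE single_party (
    id INTEGER PRIMARY KEY,
    type TEXT NOT NULL,
    name TEXT NOT NULL,
    favorite_color TEXT
);

-- Concrete Table Inheritance: one table per concrete class,
-- with the parent's columns duplicated in each.
CREATE TABLE concrete_person (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    favorite_color TEXT
);
CREATE TABLE concrete_organization (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);

-- Class Table Inheritance: one table per class in the hierarchy,
-- the child row points at its parent row.
CREATE TABLE class_party (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE class_person (
    party_id INTEGER PRIMARY KEY REFERENCES class_party(id),
    favorite_color TEXT
);
""")

# Store the same (made-up) person under each layout.
conn.execute(
    "INSERT INTO single_party (type, name, favorite_color) "
    "VALUES ('person', 'Alice', 'blue')"
)
conn.execute(
    "INSERT INTO concrete_person (name, favorite_color) VALUES ('Alice', 'blue')"
)
cur = conn.execute("INSERT INTO class_party (name) VALUES ('Alice')")
conn.execute(
    "INSERT INTO class_person (party_id, favorite_color) VALUES (?, 'blue')",
    (cur.lastrowid,),
)

# Reading a person back: only the class-table layout needs a join.
single = conn.execute(
    "SELECT name, favorite_color FROM single_party WHERE type = 'person'"
).fetchall()
concrete = conn.execute(
    "SELECT name, favorite_color FROM concrete_person"
).fetchall()
class_table = conn.execute(
    "SELECT p.name, c.favorite_color "
    "FROM class_person c JOIN class_party p ON p.id = c.party_id"
).fetchall()
```

All three queries return the same person, but the first two read a single table while the third has to join parent and child, which is the kind of trade-off the quote refers to.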

Application level

Django’s object-relational mapping (ORM) API allows us to implement these three approaches, although the mapping is not strictly one-to-one.

The Django Model inheritance docs distinguish three "styles of inheritance", based on the type of model class used (concrete, abstract, proxy):

  1. abstract parent with concrete children (abstract base classes):
    The parent class has no database table. Instead each child class has its own database
    table with its own fields and duplicates of the parent fields.
    This sounds a lot like Concrete Table Inheritance in the database.

  2. concrete parent with concrete children (multi-table inheritance):
    The parent class has a database table with its own fields, and each child class
    has its own table with its own fields and a foreign key to the parent table that
    doubles as its primary key (an automatically created OneToOneField).
    This looks like Class Table Inheritance in the database.

  3. concrete parent with proxy children (proxy models):
    The parent class has a database table, but the children do not.
    Instead, the child classes interact directly with the parent table.
    Now, if we add all the fields from the children (as defined in our data-model)
    to the parent class, this could be interpreted as an implementation of
    Single Table Inheritance.
    The proxy models provide a convenient way of dealing with the application side of
    the single large database table.

Conclusion

It seems to me that, for the present example, the combination of Single Table Inheritance with Django’s proxy models may be a good solution: everything lives in one table, so it avoids the "hidden" joins that multi-table inheritance introduces.

Applied to the example from the original post, it would look something like this:

from django.db import models


class Party(models.Model):
    """ All the fields from the hierarchy are on this class """
    name = models.CharField(max_length=20)
    type = models.CharField(max_length=20)
    favorite_color = models.CharField(max_length=20)


class Organization(Party):
    class Meta:
        """ A proxy has no database table (it uses the parent's table) """
        proxy = True

    def __str__(self):
        """ We can do subclass-specific stuff on the proxies """
        return '{} is a {}'.format(self.name, self.type)


class Person(Party):
    class Meta:
        proxy = True

    def __str__(self):
        return '{} likes {}'.format(self.name, self.favorite_color)


class Address(models.Model):
    """ 
    As required, we can link to Party, but we can set the field using
    either party=person_instance, party=organization_instance, 
    or party=party_instance
    """
    party = models.ForeignKey(to=Party, on_delete=models.CASCADE)

One caveat, from the Django proxy-model documentation:

There is no way to have Django return, say, a MyPerson object whenever you query for Person objects. A queryset for Person objects will return those types of objects.

A potential workaround is presented here.

👤djvg
