[Fixed]-Django Full Text SearchVectorField obsolete in PostgreSQL

7👍

I worked out how to do this using custom migrations. The main caveat is that you’ll need to update these migrations manually whenever the base model (that you’re searching) changes.

Note you must be using PostgreSQL 12 for the following to work:


First, create a database column to store the tsvector:

$ python manage.py makemigrations my_app --empty

Migrations for 'my_app':
  my_app/migrations/005_auto_20200625_1933.py

Open the new migration file to edit it. We will need to create a column to store the tsvector WITHOUT any associated field in the model definition, so that Django doesn’t try to update the auto-generated field itself.

The main downside to this approach is that because this isn’t synced to the Django model, if the fields change then a new migration will need to be manually created.

#my_app/migrations/0010_add_tsvector.py

"""
Use setweight() to rank results by weight (where 'A' is highest).
Use PostgreSQL tsvector concatenation operator || to combine multiple
fields from the table. Use `coalesce` ensure that NULL is not
returned if a field is empty.

In this case, `blog_table` is the database table name, and
`textsearch` is the new column, but you can choose anything here
"""

operations = [
    migrations.RunSQL(sql="""
        ALTER TABLE "blog_content" ADD COLUMN "textsearch" tsvector
        GENERATED ALWAYS AS (
        setweight(to_tsvector('english', coalesce(body, '')), 'A') ||
        setweight(to_tsvector('english', coalesce(title, '')), 'B') ||
        ' '
        ) STORED NULL;
    """, reverse_sql="""
        ALTER TABLE "blog_content" DROP COLUMN "textsearch";
    """
    )
]

To create the new column in the database, run:

$ python manage.py migrate my_app

Then, to use the column in text searches:

#my_app/views.py

from django.db.models.expressions import RawSQL
from django.contrib.postgres.search import SearchVectorField
from django.views.generic.list import ListView


class TextSearchView(ListView):
    def get_queryset(self):
        '''Return list of top results
        
        Since there is no model field, we must manually retrieve the
        column, using `annotate`
        '''
        query = self.request.GET.get('search_term')

        return Blog.objects.annotate(
                ts=RawSQL(
                    'textsearch',
                    params=[],
                    output_field=SearchVectorField()
                )
            ).filter(
                ts=query
            )

Note that the results will already be ranked, because the weights are applied each time the tsvector updates itself.

2👍

A django ticket that requests this new feature has been rejected:


Generated/Function-based virtual columns are a huge topic that would
require many changes and have many caveats, e.g. functions must be
deterministic. They are feasible also on Oracle. This kind of features
require few weeks (even months) of works, a precise plan, and should
be preceded by a discussion on DevelopersMailingList and even DEP.

Some workarounds include:

I have found an article that showcases some coding practices that will keep you going.

Leave a comment