[Answer]-How big can a string be in a Django field if the database is Postgresql and db_index=True?

1đź‘Ť

âś…

There is indeed a limit, but it’s not tiny.

ERROR: index row requires 9400 bytes, maximum size is 8191

To trigger this:

CREATE TABLE bigtext(x text);

CREATE INDEX bigtext_x ON bigtext(x);

 INSERT INTO bigtext(x) SELECT repeat('x', 819200);

Given the error you’d expect this to fail:

INSERT INTO bigtext(x) SELECT repeat('x', 8192);

but because of compression, it won’t; you can tack an extra zero on and it’ll still fit.

Smaller, less repetitive and therefore less compressible texts will fit less before overrunning a page and failing. In theory if you had totally random garbage then only 8191 bytes should fit, but in reality it’ll still be a bit more on a utf-8 db because utf-8 doesn’t permit total randomness; probably in the vicinity of 8191 totally random utf-8 chars though.

For this reason you can’t have a simple CHECK constraint, it’s not as simple as “number of chars”.

You might find pg_column_size(...) useful; it tells you the on-disk compressed size of a datum. It won’t help you in a CHECK constraint though, because it always shows unTOASTed datums at full uncompressed size.

The PostgreSQL docs could describe this limit a lot better (or at all).

For bigger fields you can index the left n bytes, or use a tool like tsearch2 to do fulltext search instead.

👤Craig Ringer

Leave a comment