[Fixed]-Django-compressor: how to write to S3, read from CloudFront?


I wrote a wrapper storage backend around the one provided by boto


import urlparse
from django.conf import settings
from storages.backends.s3boto import S3BotoStorage

def domain(url):
    return urlparse.urlparse(url).hostname    

class MediaFilesStorage(S3BotoStorage):
    def __init__(self, *args, **kwargs):
        kwargs['bucket'] = settings.MEDIA_FILES_BUCKET
        kwargs['custom_domain'] = domain(settings.MEDIA_URL)
        super(MediaFilesStorage, self).__init__(*args, **kwargs)

class StaticFilesStorage(S3BotoStorage):
    def __init__(self, *args, **kwargs):
        kwargs['bucket'] = settings.STATIC_FILES_BUCKET
        kwargs['custom_domain'] = domain(settings.STATIC_URL)
        super(StaticFilesStorage, self).__init__(*args, **kwargs)
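As an aside, `urlparse` moved to `urllib.parse` in Python 3, so a forward-compatible version of the `domain()` helper (a sketch, not part of the original answer) would be:

```python
# Python 3 version of the domain() helper: urlparse now lives in urllib.parse.
from urllib.parse import urlparse

def domain(url):
    # .hostname strips the scheme, port, and path, and lowercases the host
    return urlparse(url).hostname
```

For example, `domain("http://d111.cloudfront.net/static/app.css")` gives `"d111.cloudfront.net"`, which is what `custom_domain` expects (no scheme, no trailing slash).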

Where my settings.py file has…

STATIC_FILES_BUCKET = "myappstatic"
MEDIA_FILES_BUCKET = "myappmedia"
STATIC_URL = "http://XXXXXXXX.cloudfront.net/"
MEDIA_URL = "http://XXXXXXXX.cloudfront.net/"

DEFAULT_FILE_STORAGE = 'myapp.storage_backends.MediaFilesStorage'
COMPRESS_STORAGE = STATICFILES_STORAGE = 'myapp.storage_backends.StaticFilesStorage'


I made a few different changes to settings.py:

AWS_S3_CUSTOM_DOMAIN = 'XXXXXXX.cloudfront.net'  # important: no "http://"
AWS_S3_SECURE_URLS = True  # the default, but must be False if you use a CNAME alias on CloudFront

COMPRESS_STORAGE = 'example_app.storage.CachedS3BotoStorage' #from the docs (linked below)
STATICFILES_STORAGE = 'example_app.storage.CachedS3BotoStorage'

Compressor Docs

The solution above saved the files locally as well as uploading them to S3. This let me compress the files offline. If you aren't gzipping, the above ought to work for serving compressed files from CloudFront.
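For offline compression, the relevant knobs are django-compressor's standard settings (a sketch; the exact values here are assumptions, not from the original answer):

```python
# settings.py (sketch): enable offline compression so `manage.py compress`
# pre-builds the compressed files before collectstatic pushes them out
COMPRESS_ENABLED = True
COMPRESS_OFFLINE = True
COMPRESS_URL = STATIC_URL  # compressed output is served from CloudFront
```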

Adding gzip adds a wrinkle. In settings.py:

AWS_IS_GZIPPED = True

This resulted in an error whenever a compressible file (.css or .js, according to django-storages) was pushed to S3 during collectstatic:

AttributeError: 'cStringIO.StringO' object has no attribute 'name'

This was due to some error in the compression of the css/js files that I don't fully understand. These files I need locally, unzipped, and not on S3, so I could avoid the problem altogether by tweaking the storage subclass referenced above (and provided in the compressor docs).

new storage.py

from os.path import splitext 
from django.core.files.storage import get_storage_class  
from storages.backends.s3boto import S3BotoStorage  

class StaticToS3Storage(S3BotoStorage):

    def __init__(self, *args, **kwargs):
        super(StaticToS3Storage, self).__init__(*args, **kwargs)
        self.local_storage = get_storage_class('compressor.storage.CompressorFileStorage')()

    def save(self, name, content):
        ext = splitext(name)[1]
        parent_dir = name.split('/')[0]
        if ext in ['.css', '.js'] and not parent_dir == 'admin':
            # compressible, non-admin files stay local for offline compression
            self.local_storage._save(name, content)
            return name
        # everything else goes straight to S3
        return super(StaticToS3Storage, self).save(name, content)

This then saved all .css and .js files locally (excluding the admin files, which I serve uncompressed from CloudFront) while pushing the rest of the files to S3 (and not bothering to save them locally, though one could easily add another self.local_storage._save line).
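The branch condition in save() can be expressed as a plain predicate (a hypothetical helper for illustration, not part of the class above) to make the routing rule explicit:

```python
from os.path import splitext

def keep_local_only(name):
    """Return True for compressible, non-admin files (.css/.js) that should
    stay local for offline compression; everything else goes to S3."""
    ext = splitext(name)[1]
    parent_dir = name.split('/')[0]
    return ext in ('.css', '.js') and parent_dir != 'admin'
```

So `keep_local_only('app/style.css')` is true, while `admin/js/core.js` and `img/logo.png` both get pushed to S3.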

But when I run compress, I want my compressed .js and .css files to get pushed to S3, so I create another subclass for compressor to use:

class CachedS3BotoStorage(S3BotoStorage):
    """
    django-compressor uses this class to gzip the compressed files and send them to S3;
    these files are then saved locally, which ensures that they only create fresh copies
    when they need to.
    """
    def __init__(self, *args, **kwargs):
        super(CachedS3BotoStorage, self).__init__(*args, **kwargs)
        self.local_storage = get_storage_class('compressor.storage.CompressorFileStorage')()

    def save(self, filename, content):
        filename = super(CachedS3BotoStorage, self).save(filename, content)
        self.local_storage._save(filename, content)
        return filename

Finally, given these new subclasses, I need to update a few settings:

COMPRESS_STORAGE = 'example_app.storage.CachedS3BotoStorage' # from the compressor docs (linked above)
STATICFILES_STORAGE = 'example_app.storage.StaticToS3Storage'

And that is all I have to say about that.


It seems the problem was actually fixed upstream in Django: https://github.com/django/django/commit/5c954136eaef3d98d532368deec4c19cf892f664

The problematic _get_size method could probably be patched locally to work around it on older versions of Django.
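The failing size lookup can be worked around by measuring in-memory file objects with seek()/tell() instead of relying on a .name attribute; the core of such a fallback (a standalone sketch, not the actual Django patch) is:

```python
import os

def file_size(f):
    # Works for file-like objects with no .name (e.g. StringIO/BytesIO):
    # remember the current position, seek to the end, read the offset,
    # then restore the original position.
    pos = f.tell()
    f.seek(0, os.SEEK_END)
    size = f.tell()
    f.seek(pos)
    return size
```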

EDIT: Have a look at https://github.com/jezdez/django_compressor/issues/100 for an actual work around.



Actually, this also seems to be an issue in django-storages. When compressor compares the hashes of files on S3, django-storages doesn't unpack the gzipped content first, so it ends up comparing mismatched hashes. I've opened https://bitbucket.org/david/django-storages/pull-request/33/fix-gzip-support to fix that.

FWIW, there is also https://bitbucket.org/david/django-storages/pull-request/32/s3boto-gzip-fix-and-associated-unit-tests which fixes another issue of actually saving files to S3 when having AWS_IS_GZIPPED set to True. What a yak that has been.
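The hash mismatch is easy to reproduce (here with Python 3's gzip module for illustration): gzipping changes the stored bytes, so the hash of the object on S3 no longer matches the hash of the local source unless the content is decompressed first:

```python
import gzip
import hashlib

original = b"body { color: red; }\n"
gzipped = gzip.compress(original)

# Hashing the gzipped payload disagrees with the local file's hash...
assert hashlib.md5(gzipped).hexdigest() != hashlib.md5(original).hexdigest()

# ...but unpacking first, as the pull request does, restores the comparison
assert hashlib.md5(gzip.decompress(gzipped)).hexdigest() == hashlib.md5(original).hexdigest()
```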



Additionally, for streaming distributions it's useful to override the url method to return rtmp:// URLs, as in:

import urllib
from storages.backends.s3boto import S3BotoStorage

class VideoStorageForCloudFrontStreaming(S3BotoStorage):
    """
    Use when you need rtmp:// URLs for a CloudFront streaming distribution.
    Will return a proper CloudFront URL.

    Subclasses must be sure to set custom_domain.
    """
    def url(self, name):
        name = urllib.quote(self._normalize_name(self._clean_name(name)))
        return "rtmp://%s/cfx/st/%s" % (self.custom_domain, name)

    # handy for JW Player:
    def streamer(self):
        return "rtmp://%s/cfx/st" % self.custom_domain


It looks like CloudFront now offers compression built in. If it is enabled, a request goes to CloudFront first; if CloudFront does not have a compressed version in its cache, it requests the uncompressed file from the origin server (S3), compresses it automatically, caches it, and serves it to the viewer.

You can enable automatic compression by editing a "behavior" in your distribution. At the bottom, where it asks "Compress files automatically", select Yes.
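To confirm compression is working, request a file with an `Accept-Encoding: gzip` header and check the `Content-Encoding` response header (a sketch; the header check is factored out so it needs no network access, and the URL would be your own distribution):

```python
import urllib.request

def is_gzip_encoded(headers):
    # CloudFront sets "Content-Encoding: gzip" on responses it compressed
    return headers.get("Content-Encoding", "").lower() == "gzip"

def served_compressed(url):
    req = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(req) as resp:
        return is_gzip_encoded(dict(resp.headers))
```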

P.S. One requirement for this:

In the bucket permissions, change the CORS configuration to allow the Content-Length header, i.e. <AllowedHeader>Content-Length</AllowedHeader>


