CSV file upload from buffer to S3


Okay, disregard my earlier answer; I found the actual problem.

According to the boto3 documentation for the upload_fileobj function, the first parameter (Fileobj) needs to implement a read() method that returns bytes:

Fileobj (a file-like object) — A file-like object to upload. At a minimum, it must implement the read method, and must return bytes.

The read() function on a _io.StringIO object returns a string, not bytes. I would suggest swapping the StringIO object for a BytesIO object, adding in the necessary encoding and decoding.
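For illustration, the type difference is easy to see in a REPL:

import io

io.StringIO("a,b,c").read()    # 'a,b,c'  -> str, which upload_fileobj rejects
io.BytesIO(b"a,b,c").read()    # b'a,b,c' -> bytes, which it requires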

Here is a minimal working example. It’s not the most efficient solution – the basic idea is to copy the contents over to a second BytesIO object.

import io
import boto3
import csv

# Write the CSV into a text buffer first.
buff = io.StringIO()

writer = csv.writer(buff, dialect='excel', delimiter=',')
writer.writerow(["a", "b", "c"])

# Copy the text over into a bytes buffer, encoding it (UTF-8 by default)
# so that read() returns bytes, as upload_fileobj requires.
buff2 = io.BytesIO(buff.getvalue().encode())

bucket = 'changeme'
key = 'blah.csv'

client = boto3.client('s3')
client.upload_fileobj(buff2, bucket, key)
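
If you want to avoid the copy, one alternative (a sketch, not part of the original answer) is to write the CSV text straight into a single BytesIO through an io.TextIOWrapper:

import io
import boto3
import csv

buff = io.BytesIO()
# Wrap the binary buffer so csv.writer can write text to it; write_through
# pushes each write straight down to the BytesIO instead of buffering it.
wrapper = io.TextIOWrapper(buff, encoding='utf-8', newline='', write_through=True)

writer = csv.writer(wrapper, dialect='excel', delimiter=',')
writer.writerow(["a", "b", "c"])

wrapper.flush()
buff.seek(0)  # rewind: upload_fileobj reads from the current position

client = boto3.client('s3')
client.upload_fileobj(buff, 'changeme', 'blah.csv')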


As explained here, using the method put_object rather than upload_fileobj would do the job just right with an io.StringIO buffer.

So here, to match the initial example:

client = boto3.client('s3')
client.upload_fileobj(buff2, bucket, key)

would become

client = boto3.client('s3')
client.put_object(Body=buff2, Bucket=bucket, Key=key, ContentType='application/vnd.ms-excel')
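
Since put_object's Body parameter accepts raw bytes, you can also skip the second buffer object entirely and pass the StringIO's contents directly; a sketch:

client = boto3.client('s3')
# Body accepts bytes, so encode the StringIO's contents and pass them
# straight in, with no intermediate BytesIO:
client.put_object(Body=buff.getvalue().encode(), Bucket=bucket, Key=key,
                  ContentType='application/vnd.ms-excel')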


Have you tried calling buff.flush() first? It’s possible that your entirely sensible debugging check (calling getvalue()) creates the illusion that buff has been written to, when without a flush it actually hasn’t been.
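
A sketch of that check, with the caveat that flush() is a no-op on a bare io.StringIO (it matters when writes go through a buffered wrapper such as io.TextIOWrapper):

buff.flush()  # no-op for a bare StringIO; flushes pending writes otherwise
buff.seek(0)  # rewinding before the upload is a common companion fix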


You can use something like goofys to mount the bucket as a local filesystem and redirect output to S3.
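
For example, with the bucket already mounted (the mount point /mnt/s3 here is hypothetical, and the goofys mount command itself isn't shown), ordinary file I/O ends up in S3:

import csv

# Assumes goofys has mounted the bucket at /mnt/s3; writing a file under
# the mount point stores it in the bucket.
with open('/mnt/s3/blah.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(["a", "b", "c"])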

