Encoding gives "'ascii' codec can't encode character … ordinal not in range(128)"


If the data you are receiving is in fact encoded in UTF-8, then it should be a sequence of bytes: a Python ‘str’ object in Python 2.x.

You can verify this with an assertion:

assert isinstance(content, str)

Once you know that is true, you can move on to the actual encoding. Python does not transcode directly (from UTF-8 to ASCII, for instance). You first need to turn your sequence of bytes into a Unicode string by decoding it:

unicode_content = content.decode('utf-8')

(If you can trust parsed_feed.encoding, then use that instead of the literal ‘utf-8’. Either way, be prepared for errors.)
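One way to be prepared for those errors is sketched below. The helper name decode_bytes and its fallback order are assumptions for illustration, not part of feedparser or of the original answer: it tries the declared encoding first, then UTF-8, and as a last resort decodes as UTF-8 while replacing undecodable bytes with U+FFFD.

```python
def decode_bytes(raw, declared_encoding=None):
    """Decode a byte string, trying the declared encoding before UTF-8.

    Hypothetical helper: the name and the fallback order are illustrative.
    """
    for encoding in (declared_encoding, 'utf-8'):
        if not encoding:
            continue
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # UnicodeDecodeError: the bytes are not valid in this encoding.
            # LookupError: the encoding name itself is unknown.
            continue
    # Last resort: never fail, but mark undecodable bytes with U+FFFD.
    return raw.decode('utf-8', 'replace')
```

You would call it as decode_bytes(content, parsed_feed.encoding). Whether silently replacing bad bytes is acceptable depends on the feed; raising or logging instead may be the better choice.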

You can then take that string and encode it in ASCII, substituting each non-ASCII character with its XML numeric character reference:

xml_content = unicode_content.encode('ascii', 'xmlcharrefreplace')
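As a small illustration of what that call produces, here is a sketch with a made-up sample string (the sample text is an assumption, not data from the question):

```python
# Sample text (illustrative only): "cafe" with an accented e, then U+2603.
unicode_content = u'caf\xe9 \u2603'

# Characters outside ASCII become decimal numeric character references.
xml_content = unicode_content.encode('ascii', 'xmlcharrefreplace')

# xml_content is now the byte string: caf&#233; &#9731;
```

The result is pure ASCII, so it can be written or concatenated anywhere without triggering the 'ascii' codec error.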

The full method, then, would look something like this:

try:
    content = content.decode(parsed_feed.encoding).encode('ascii', 'xmlcharrefreplace')
except UnicodeDecodeError:
    # Couldn't decode the incoming string -- possibly not encoded in utf-8
    # Do something here to report the error
    raise  # placeholder: re-raise until real error reporting is added




I encountered this error while writing a file to a zip archive with ZipFile.write. The following failed:

ZipFile.write(root+'/%s'%file, newRoot + '/%s'%file)

and the following worked:

ZipFile.write(str(root+'/%s'%file), str(newRoot + '/%s'%file))
