MySQL is trying to store your byte string in a character column. Because the connection character set is UTF-8 but the byte string doesn't represent a valid UTF-8 sequence, it gets mangled.
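To see why arbitrary bytes get mangled, here is a minimal sketch that checks whether a byte string is valid UTF-8. The `raw` value and the `is_valid_utf8` helper are hypothetical, made up for illustration; they are not part of your code or of MySQL.

```python
# Hypothetical stand-in for ciphertext or a hash digest: arbitrary bytes.
raw = b'\x9c\xff\x00\x80'


def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode('utf-8')
        return True
    except UnicodeDecodeError:
        return False


# Arbitrary binary data is usually not valid UTF-8 ...
print(is_valid_utf8(raw))
# ... whereas text that was actually encoded as UTF-8 is.
print(is_valid_utf8(u'caf\xe9'.encode('utf-8')))
```

Anything that fails this check cannot survive a trip through a UTF-8 character column unchanged.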
To get raw bytes into the database properly you need to:
make the column a BINARY type (or, more generally, a character type with a binary collation), and
use parameterised queries to get the data into the database, instead of interpolating it into the query string where it might mix with non-binary (Unicode) content.
You should use parameterised queries anyway because the string interpolation you're using now, with no escaping, is vulnerable to SQL injection. In web.py that might look like:
query_string = 'INSERT INTO %s (%s) VALUES ($value)' % (table, column)
db.query(query_string, vars={'value': value})
(assuming that the table and column values are known-good.)
Doing it like this also means you don't have to worry about the dollar sign.
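If you want to try the idea without a MySQL server, the following is a rough, self-contained sketch that swaps in Python's built-in sqlite3 module for MySQL and web.py. The table name `secrets`, the column `payload`, and the sample bytes are all made up for illustration, and sqlite3 uses `?` placeholders rather than web.py's `$value` syntax. The principle is the same: a binary column plus a bound parameter.

```python
import sqlite3

# In-memory database standing in for MySQL (assumption: illustration only).
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE secrets (payload BLOB)')  # binary column type

# Hypothetical raw bytes containing a NUL, a dollar sign, a quote and
# some SQL-looking text; none of it needs escaping when bound as a parameter.
value = b"\x00\xff$1'; DROP TABLE secrets; --"

# The driver sends the value separately from the SQL text.
conn.execute('INSERT INTO secrets (payload) VALUES (?)',
             (sqlite3.Binary(value),))

stored = conn.execute('SELECT payload FROM secrets').fetchone()[0]
print(bytes(stored) == value)
```

The bytes come back exactly as they went in, and the quote and dollar sign inside the value are never interpreted as part of the query.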
The other approach is to use a normal character string, encoding away the non-ASCII bytes. You're doing this with uuencoding in your current workaround, but base64 would be a more common alternative that's easier to get to in Python (ciphertext.encode('base64')). Hex encoding (.encode('hex')) is most common for the case of a hash.
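As a sketch of this second approach, the snippet below uses the standard base64 and binascii modules, which work the same way on Python 2 and Python 3 (the .encode('base64') and .encode('hex') string codecs only exist on Python 2). The `ciphertext` value is a made-up placeholder.

```python
import base64
import binascii

# Hypothetical raw bytes, e.g. ciphertext or a hash digest.
ciphertext = b'\x00\xff\x10$\x80'

# Encode to plain ASCII text that is safe to store in a character column.
b64_text = base64.b64encode(ciphertext).decode('ascii')
hex_text = binascii.hexlify(ciphertext).decode('ascii')

# Decode again after reading the value back from the database.
print(base64.b64decode(b64_text) == ciphertext)
print(binascii.unhexlify(hex_text) == ciphertext)
```

Base64 output is about a third larger than the input and hex is twice as large, so size the column accordingly.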