Flask SQLAlchemy can't insert emoji to MySQL - python

I'm using Python 2.7 and flask framework with flask-sqlalchemy module.
I always get the following exception when trying to insert : Exception Type: OperationalError. Exception Value: (1366, "Incorrect string value: \xF09...
I already set MySQL database, table and corresponding column to utf8mb4_general_ci and I can insert emoji string using terminal.
Flask's app config already contains app.config['MYSQL_DATABASE_CHARSET'] = 'utf8mb4', however it doesn't help at all and I still get the exception.
Any help is appreciated

Maybe this will help someone in the future:
All I did was edit the SQL connection URI in my config file:
SQLALCHEMY_DATABASE_URI = 'mysql://user:password@localhost/database?charset=utf8mb4'
This way I'm able to store emojis without any altering of the database or tables.
source: https://blog.miguelgrinberg.com/post/the-flask-mega-tutorial-part-iv-database/page/13, comment #322

This works for me:
import pickle
data = request.get_json().get("data")
data = pickle.dumps(data)
Then you can insert "data" into the database. The "data" can contain any emoji you like, e.g. "😢".
Later, when you read "data" back from the database, you should do:
data = pickle.loads(data)
and then you get "data" back as "😢".
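The round trip above can be sketched end to end. Note that pickled data is bytes, so the column should be a binary type (BLOB/VARBINARY) rather than text:

```python
import pickle

# Hypothetical payload, standing in for request.get_json().get("data")
data = "😢 some text with emoji"

blob = pickle.dumps(data)      # bytes -> suitable for a BLOB column
assert isinstance(blob, bytes)

restored = pickle.loads(blob)  # after reading back from the database
assert restored == data
```

Storing opaque pickled bytes sidesteps the charset issue entirely, at the cost of making the column unreadable and unsearchable from plain SQL.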

In your main config file, set 'charset' => 'utf8mb4'.
You also have to alter the field in which you want to store emoji and set its collation to utf8mb4_unicode_ci.

Make sure to use a proper Python Unicode object, like the ones created with the u"..." literal. In other words, the type of your object should be unicode not str:
>>> type('ą')
<type 'str'>
>>> type(u'ą')
<type 'unicode'>
Please note that this only applies to Python 2, in Python 3 all string literals are Unicode by default.

Related

Best practices when inserting a JSON variable into a MySQL table column of type json, using Python's PyMySQL library

I have a Python script, that's using PyMySQL to connect to a MySQL database, and insert rows in there. Some of the columns in the database table are of type json.
I know that in order to insert a json, we can run something like:
my_json = {"key": "value"}
cursor = connection.cursor()
insert_query = """INSERT INTO my_table (my_json_column) VALUES ('%s')""" % (json.dumps(my_json),)
cursor.execute(insert_query)
connection.commit()
The problem in my case is that the json is a variable over which I do not have much control (it comes from an API call to a third-party endpoint), so my script keeps throwing new errors for non-valid json variables.
For example, the json could very well contain a stringified json as a value, so my_json would look like:
{"key": "{\"key_str\":\"val_str\"}"}
→ In this case, running the usual insert script would throw a [ERROR] OperationalError: (3140, 'Invalid JSON text: "Missing a comma or \'}\' after an object member." at position 1234 in value for column \'my_table.my_json_column\'.')
Or another example are json variables that contain a single quotation mark in some of the values, something like:
{"key" : "Here goes my value with a ' quotation mark"}
→ In this case, the usual insert script returns an error similar to the below one, unless I manually escape those single quotation marks in the script by replacing them.
[ERROR] ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'key': 'Here goes my value with a ' quotation mark' at line 1")
So my question is the following:
Are there any best practices that I might be missing out on, which I could use to keep my script from breaking in the two scenarios above, as well as with any other JSON that might break the insert query?
I read some existing posts like this one here or this one, where it's recommended to insert the json into a string or a blob column, but I'm not sure if that's a good practice / if other issues (like string length limitations for example) might arise from using a string column instead of json.
Thanks !

Sending UTF-8 formatted emojis from Android to Python API

I have been trying to send emojis through POST requests to my server (Python server side) to store in a database. I get the full string and convert it to UTF-8. The problem is that some emojis are sent fine while others throw an error on the server side: Incorrect string value: '\\xF0\\x9F\\x8E\\xAE
I think this is because some emojis are percent-encoded as %E2%9D%A4%EF%B8%8F on sending, like ❤️, but others as %F0%9F%8E%AE, like 🎮.
I have tested the requests through Postman and the red heart one works, but the others, with four bytes, don't, and I see that error.
Here is a Postman log capture.
And here is the error from the Python Django API:
OperationalError at /api/addcomment
(1366, "Incorrect string value: '\\xF0\\x9F\\x8E\\xAE' for column 'text' at row 1")
Django Version: 2.2.5
Exception Type: OperationalError
Exception Value:
(1366, "Incorrect string value: '\\xF0\\x9F\\x8E\\xAE' for column 'text' at row 1")
Exception Location: /var/www/vhosts/*/httpdocs/pythonvenv/lib/python3.5/site-packages/MySQLdb/connections.py in query, line 226
Python Executable: /var/www/vhosts/*/httpdocs/pythonvenv/bin/python
Python Version: 3.5.2
Python Path:
['/var/www/vhosts/*/httpdocs/pythonvenv/bin',
'/var/www/vhosts/*/httpdocs/app/app',
'/var/www/vhosts/*/httpdocs/app',
'/var/www/vhosts/*/httpdocs',
'/usr/share/passenger/helper-scripts',
'/var/www/vhosts/*/httpdocs/pythonvenv/lib/python35.zip',
'/var/www/vhosts/*/httpdocs/pythonvenv/lib/python3.5',
'/var/www/vhosts/*/httpdocs/pythonvenv/lib/python3.5/plat-x86_64-linux-gnu',
'/var/www/vhosts/*/httpdocs/pythonvenv/lib/python3.5/lib-dynload',
'/usr/lib/python3.5',
'/usr/lib/python3.5/plat-x86_64-linux-gnu',
'/var/www/vhosts/*/httpdocs/pythonvenv/lib/python3.5/site-packages']
I have changed the original URL with *
For more info: in phpMyAdmin I cannot insert those emojis either (the 4-byte ones like the gamepad), whether via SQL or the Insert tab, but I can insert the shorter ones like the red heart. I have tried several utf8 and utf8mb4 collations for both column and table.
This happens whether the db, table and column are set to utf8mb4 or not.
Any help? Thanks!
Both of these need to be set to utf8mb4:
The column charset
The database connection charset
The first one determines what strings can be stored in the column. The second determines the character set for string literals. (Oddly, if you put a 4-byte UTF-8 sequence in a string literal, MySQL can still think it's "3-byte utf8" and doesn't give an error until you try to use it)
To find if the database connection charset is the problem, you can try setting the character set on the string literal explicitly. If this works, the column encoding is fine, but the connection isn't:
insert into demo_table set `text` = _utf8mb4'🎮';
You seem to be using Django. I don't know much about Django but it looks like the connection encoding is set somewhere in the database connection options. Going by https://chriskief.com/2017/06/18/django-and-mysql-emoticons/ :
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        ...
        'OPTIONS': {'charset': 'utf8mb4'},
    }
}

Trying to save special characters in MySQL DB

I have a string that looks like this 🔴Use O Mozilla Que Não Trava! Testei! $vip ou $apoio
When I try to save it to my database with ...SET description = %s... and cursor.execute(sql, description) it gives me an error
Warning: (1366, "Incorrect string value: '\xF0\x9F\x94\xB4Us...' for column 'description' ...
Assuming this is an ASCII symbol, I tried description.decode('ascii') but this leads to
'str' object has no attribute 'decode'
How can I determine what encoding it is and how could I store anything like that to the database? The database is utf-8 encoded if that is important.
I am using Python3 and PyMySQL.
Any hints appreciated!
First, you need to make sure the table column has correct character set setting. If it is "latin1" you will not be able to store content that contains Unicode characters.
You can use following query to determine the column character set:
SELECT CHARACTER_SET_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA='your_database_name' AND TABLE_NAME='your_table_name' AND COLUMN_NAME='description'
Follow the MySQL documentation here if you want to change the column character set.
Also, you need to make sure the character set is properly configured for the MySQL connection. Quoted from the MySQL docs:
Character set issues affect not only data storage, but also
communication between client programs and the MySQL server. If you
want the client program to communicate with the server using a
character set different from the default, you'll need to indicate
which one. For example, to use the utf8 Unicode character set, issue
this statement after connecting to the server:
SET NAMES 'utf8';
Once the character set setting is correct, you will be able to execute your SQL statement. There is no need to encode / decode on the Python side; that is used for different purposes.

python convert unicode to readable character

I am using python 2.7 and psycopg2 for connecting to postgresql
I read a bunch of data from a source which has strings like 'Aéropostale'. I then store it in the database. However, in postgresql it is ending up as 'A\u00e9ropostale'. But I want it to get stored as 'Aéropostale'.
The encoding of postgresql database is utf-8.
Please tell me how can I store the actual string 'Aéropostale' instead.
I suspect that the problem is happening in Python. Please advise.
EDIT:
Here is my data source
response_json = json.loads(response.json())
response is obtained via service call and looks like:
print(type(response.json()))
>> <type 'str'>
print(response.json())
>> {"NameRecommendation": {"ValueRecommendation": [{"Value": "\"Handmade\""}, {"Value": "Abercrombie & Fitch"}, {"Value": "A\u00e9ropostale"}, {"Value": "Ann Taylor"}]}}
From the above data, my goal is to construct a list of all ValueRecommendation.Value and store in a postgresql json datatype column. So the python equivalent list that I want to store is
py_list = ["Handmade", "Abercrombie & Fitch", "A\u00e9ropostale", "Ann Taylor"]
Then I convert py_list in to json representation using json.dumps()
json_py_list = json.dumps(py_list)
And finally, to insert, I use psycopg2.cursor() and mogrify()
conn = psycopg2.connect("connectionString")
cursor = conn.cursor()
cursor.execute(cursor.mogrify("INSERT INTO table (columnName) VALUES (%s)", (json_py_list,)))
As I mentioned earlier, using the above logic, strings with special characters like è are getting stored as utf8 character codes.
Please spot my mistake.
json.dumps escapes non-ASCII characters by default so its output can work in non-Unicode-safe environments. You can turn this off with:
json_py_list = json.dumps(py_list, ensure_ascii=False)
Now you will get UTF-8-encoded bytes (unless you change that too with encoding=) so you'll need to make sure your database connection is using that encoding.
In general it shouldn't make any difference as both forms are valid JSON and even with ensure_ascii off there are still characters that get \u-encoded.
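The difference is easy to see in isolation:

```python
import json

py_list = ["Handmade", "Abercrombie & Fitch", "A\u00e9ropostale"]

escaped = json.dumps(py_list)                      # default: ensure_ascii=True
unescaped = json.dumps(py_list, ensure_ascii=False)

print(escaped)    # ... "A\u00e9ropostale" ...
print(unescaped)  # ... "Aéropostale" ...

# Both forms are valid JSON and decode back to the identical list.
assert json.loads(escaped) == json.loads(unescaped) == py_list
```

So whichever form lands in the column, reading it back through a JSON parser yields the same 'Aéropostale' string; ensure_ascii=False just makes the stored text human-readable.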

UTF-8 Error when trying to insert a string into a PostgresQL BYTEA column with SQLAlchemy

I have a column in a PostgresQL table of type BYTEA. The model class defines the column as a LargeBinary field, which the documentation says "The Binary type generates BLOB or BYTEA when tables are created, and also converts incoming values using the Binary callable provided by each DB-API."
I have a Python string which I would like to insert into this table.
The Python string is:
'\x83\x8a\x13,\x96G\xfd9ae\xc2\xaa\xc3syn\xd1\x94b\x1cq\xfa\xeby$\xf8\xfe\xfe\xc5\xb1\xf5\xb5Q\xaf\xc3i\xe3\xe4\x02+\x00ke\xf5\x9c\xcbA8\x8c\x89\x13\x00\x07T\xeb3\xbcp\x1b\xff\xd0\x00I\xb9'
The relevant snippet of my SQLAlchemy code is:
migrate_engine.execute(
    """
    UPDATE table
    SET x=%(x)s
    WHERE id=%(id)s
    """,
    x=the_string_above,
    id='1')
I am getting the error:
sqlalchemy.exc.DataError: (DataError) invalid byte sequence for encoding "UTF8": 0x83
'\n UPDATE table\n SET x=%(x)s\n WHERE id=%(id)s\n ' {'x': '\x83\x8a\x13,\x96G\xfd9ae\xc2\xaa\xc3syn\xd1\x94b\x1cq\xfa\xeby$\xf8\xfe\xfe\xc5\xb1\xf5\xb5Q\xaf\xc3i\xe3\xe4\x02+\x00ke\xf5\x9c\xcbA8\x8c\x89\x13\x00\x07T\xeb3\xbcp\x1b\xff\xd0\x00I\xb9', 'id': '1',}
If I go into the pgadmin3 console and enter the UPDATE command directly, the update works fine. The error is clearly from SQLAlchemy. The string is a valid Python2 string. The column has type BYTEA. The query works without SQLAlchemy. Can anyone see why Python thinks this byte string is in UTF-8?
Try wrapping the data in a buffer:
migrate_engine.execute(
    """
    UPDATE table
    SET x=%(x)s
    WHERE id=%(id)s
    """,
    x=buffer(the_string_above),
    id='1')
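For context on why the driver treats the plain str as UTF-8 text: passed bare, the value is sent as a text literal, and 0x83 is a UTF-8 continuation byte that cannot start a character. A quick check on the first bytes of the string above shows the same failure the server reports:

```python
raw = b'\x83\x8a\x13,\x96G\xfd9ae'

try:
    raw.decode('utf-8')
except UnicodeDecodeError as exc:
    # 0x83 cannot begin a UTF-8 sequence, which is exactly what the
    # server-side "invalid byte sequence for encoding UTF8: 0x83" means.
    print(exc)
```

Wrapping the value in buffer() marks it as binary so it is sent to the BYTEA column without any text decoding. On Python 3, where buffer() no longer exists, passing a bytes object (or psycopg2.Binary(...)) serves the same purpose.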
