I connect to an MS SQL Server from Python 2.7 using pymssql.
The connection string is:
mssql+pymssql://user:pass@server:1433/DB
The server collation is Cyrillic_General_CI_AS.
When I select from a table with a varchar column, it returns this string:
u'ÎÎÎ "ÒÎÐÏÅÄÀ"'
I tried to convert it with:
'ÎÎÎ "ÒÎÐÏÅÄÀ"'.decode('866')
and got this output:
├О├О├О "├Т├О├Р├П├Е├Д├А"
But the correct string in the database is:
ООО "ТОРПЕДА"
It seems like every second symbol is correct.
How can I get all varchar strings in the correct encoding?
Thank you
I found the answer just after posting the question:
u'ÎÎÎ "ÒÎÐÏÅÄÀ"'.encode('latin1').decode('1251')
There may be a better solution, but after a few days of searching I didn't find one.
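The one-liner above works because the driver handed back cp1251 bytes that had been decoded as latin-1, so re-encoding as latin-1 recovers the original bytes. A minimal sketch of that round trip, written for Python 3 (the helper name fix_cp1251_mojibake is hypothetical, and it assumes latin-1 really was the wrong codec applied):

```python
def fix_cp1251_mojibake(text):
    """Undo text whose cp1251 bytes were wrongly decoded as latin-1.

    Assumption: every character in `text` fits in latin-1, which holds
    when the damage came from a latin-1 decode of single-byte data.
    """
    raw = text.encode('latin-1')   # recover the original bytes
    return raw.decode('cp1251')    # decode them with the server's code page


garbled = 'ÎÎÎ "ÒÎÐÏÅÄÀ"'
fixed = fix_cp1251_mojibake(garbled)  # 'ООО "ТОРПЕДА"'
```

This only repairs values after the fact; if the driver can be told the right character set at connection time, that is likely the cleaner fix.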
I'm running a pretty basic script that is querying a MySQL table, but I'm running into an error that I can't find online. I see a lot of charmap/codec related issues online, but all have to do with reading text or data.
This error is happening when I attempt a MySQL SELECT query, or at least I think it is.
Does anyone know why Python would error on this?
My code reads like this:
Line 188 is where the error occurs.
Please look at the following answer:
Encoding Issue - MySQL
Adding the extra parameters while creating connection should resolve your issue.
con = mdb.connect('localhost', 'root', '', 'mydb', use_unicode=True, charset='utf8')
Thanks.
The alt column contains an OSC control character (0x9d), so this looks like a data problem. You could try replacing the character to verify that this is the issue, but the long-term solution should be to disallow those characters in strings, convert them to HTML entities, or specify a connection character set.
SELECT id, REPLACE(alt, UNHEX('9D'), '') ...
See also this.
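The exact error text is not shown in the question, so the following is only a guess at the mechanism: byte 0x9d has no mapping in cp1252 (a common Windows default, assumed here), so decoding it raises a 'charmap' error, while latin-1 accepts every byte. The helper name can_decode and the sample bytes are hypothetical:

```python
def can_decode(raw, encoding):
    """Return True if `raw` decodes cleanly with `encoding`."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical value resembling a column that contains the 0x9d byte.
suspect = b'caf\x9d'

cp1252_ok = can_decode(suspect, 'cp1252')    # 0x9d is unassigned in cp1252
latin1_ok = can_decode(suspect, 'latin-1')   # latin-1 maps all 256 bytes

# Python-side equivalent of REPLACE(alt, UNHEX('9D'), '')
cleaned = suspect.replace(b'\x9d', b'')
```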
I'm getting this error when using Python's MySQL Connector library:
Incorrect date value: 'STR_TO_DATE('2017-10-19T16:57:56Z','%Y-%m-%d%#%H:%i:%s%#')' for column 'estimated_delivery'
Essentially I'm using this matching string: %Y-%m-%d%#%H:%i:%s%#
For this input: 2017-10-19T16:57:56Z
I'm confused about where this error could possibly come from =/
Ah, I see: I was passing the STR_TO_DATE call into the SQL statement as a parameter to the execute function. That meant it was inserted as a string instead of being evaluated as SQL. Silly mistake on my part.
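To illustrate the fix, here is a sketch of two ways to keep the function call out of the parameters. The table name orders is hypothetical, and the format strings are my assumption for this particular input rather than the one from the question. The execute call is left commented out because it needs a live connection:

```python
from datetime import datetime

raw = '2017-10-19T16:57:56Z'

# Option 1: STR_TO_DATE lives in the SQL text; only data is bound.
# Passing the format as a parameter too avoids having literal % signs
# in a statement that also uses %s placeholders.
sql_server_side = (
    "INSERT INTO orders (estimated_delivery) "
    "VALUES (STR_TO_DATE(%s, %s))"
)
params_server_side = (raw, '%Y-%m-%dT%H:%i:%sZ')  # MySQL format codes

# Option 2: parse in Python and bind a datetime object instead.
sql_client_side = "INSERT INTO orders (estimated_delivery) VALUES (%s)"
params_client_side = (datetime.strptime(raw, '%Y-%m-%dT%H:%M:%SZ'),)

# cursor.execute(sql_server_side, params_server_side)
# cursor.execute(sql_client_side, params_client_side)
```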
There is one row in a MySQL table, as follows:
1000, Intel® Rapid Storage Technology
The table was created with charset='utf8'.
When I used Python code to read it, it became the following:
Intel® Management Engine Firmware
My Python code is as follows:
db = MySQLdb.connect(db,user,passwd,dbName,port,charset='utf8')
The weird thing is that when I removed charset='utf8', as follows, the result became correct:
db = MySQLdb.connect(db,user,passwd,dbName,port)
Why do I get the wrong result when I specify charset='utf8' in my code?
Have you tried leaving off the charset in the connect call and then setting it afterwards?
db = MySQLdb.connect(db,user,passwd,dbName,port)
db.set_character_set('utf8')
When trying to use utf8/utf8mb4, if you see Mojibake, check the following.
This discussion also applies to Double Encoding, which is not necessarily visible.
The bytes to be stored need to be utf8-encoded.
The connection when INSERTing and SELECTing text needs to specify utf8 or utf8mb4.
The column needs to be declared CHARACTER SET utf8 (or utf8mb4).
HTML should start with <meta charset=UTF-8>.
See also Python notes
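As an illustration of the double encoding mentioned in the checklist, this sketch shows how UTF-8 bytes that get treated as latin-1 somewhere along the way turn "®" into "Â®", and how the same steps in reverse undo it. The choice of latin-1 as the wrong codec and both helper names are assumptions for the example, not something taken from the question:

```python
def double_encode(text):
    """Simulate UTF-8 bytes being misread as latin-1."""
    return text.encode('utf-8').decode('latin-1')


def repair(text):
    """Reverse the misreading, assuming latin-1 was the wrong codec."""
    return text.encode('latin-1').decode('utf-8')


original = 'Intel\u00ae Rapid Storage Technology'
garbled = double_encode(original)   # 'IntelÂ® Rapid Storage Technology'
restored = repair(garbled)
```

If data was already stored this way, reading it with a correct utf8 connection will faithfully return the garbled form, which would explain why removing charset='utf8' appears to fix the output.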
I am trying to insert values to a SQL DB where I pull data from a dictionary. I ran into a problem when my program tries to enter 0xqb_QWQDrabGr7FTBREfhCLMZLw4ztx into a column named VersionId. The following is my sample code and error.
cursor.execute("""insert into [TestDB].[dbo].[S3_Files] ([Key],[IsLatest],[LastModified],[Size(Bytes)],[VersionID]) values (%s,%s,%s,%s,%s)""",(item['Key'],item['IsLatest'],item['LastModified'],item['Size'],item['VersionId']))
conn_db.commit()
pymssql.ProgrammingError: (102, "Incorrect syntax near 'qb_QWQDrabGr7FTBREfhCLMZLw4ztx'.DB-Lib error message 20018, severity 15:\nGeneral SQL Server error: Check messages from the SQL Server\n")
Based on the error I assume SQL does not like the 0x in the beginning of the VersionId string because of security issues. If my assumption is correct, what are my options? I also cannot change the value of the VersionId.
Edit: This is what I get when I print that cursor command
insert into [TestDB].[dbo].[S3_Files] ([Key],[IsLatest],[LastModified],[Size(Bytes)],[VersionID]) values (Docs/F1/Trades/Buy/Person1/Seller_Provided_-_Raw_Data/GTF/PDF/GTF's_v2/NID3154229_23351201.pdf,True,2015-07-22 22:05:38+00:00,753854,0xqb_QWQDrabGr7FTBREfhCLMZLw4ztx)
Edit 2: The odd thing is that when I enter the insert command manually in SQL Server Management Studio, it doesn't like the (') in the path name in the first parameter. So I escaped that character and added (') around each value except the number, and the command worked. At this point I am pretty stumped as to why the insert is not working.
Edit 3: I decided to wrap every insert in a try/except, and I see that the VersionIds that get caught all have the pattern 0x..... Again, does anyone know if my assumption about security is correct?
I guess that's what happens when our libraries try to be smarter than us...
No SQL Server around to test, but I assume the 0x values fail because the way pymssql passes the parameter causes the server to interpret it as a hexadecimal string, and the 'q' following the '0x' does not fit its expectation of 0-9 and A-F characters.
I don't have enough information to know if this is a library bug and/or if it can be worked around; the pymssql documentation is not very extensive, but I would try the following:
if you can, check in MSSQL Profiler what command is actually coming in
build your own command as a string and see if the error persists (remember Bobby Tables before putting that in production, though: https://xkcd.com/327/)
try to work around it by adding quotes etc
switch to another library / use SQLAlchemy
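For comparison, this is how bound parameters behave in a driver that sends values separately from the SQL text. It uses the standard library's sqlite3 as a stand-in, not pymssql or SQL Server, so it only illustrates the expected behavior and does not reproduce the bug. The table layout and the shortened path are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE s3_files (file_key TEXT, version_id TEXT)')

# One value contains an apostrophe, the other starts with 0x.
rows = [
    ("Docs/GTF's_v2/NID3154229_23351201.pdf",
     '0xqb_QWQDrabGr7FTBREfhCLMZLw4ztx'),
]

# The ? placeholders are bound by the driver; no quoting or escaping
# is done by hand, and the 0x prefix is just part of a string.
conn.executemany(
    'INSERT INTO s3_files (file_key, version_id) VALUES (?, ?)', rows
)
stored = conn.execute(
    'SELECT file_key, version_id FROM s3_files'
).fetchall()
conn.close()
```

If another library round-trips these values the same way against SQL Server, that would point to the parameter handling in pymssql rather than to a server-side security rule.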
I used pyodbc to access my MSSQL database.
When reading a uniqueidentifier field from MSSQL on macOS, I can print the correct value of the field (e.g. 4C444660-6003-13CE-CBD5-8478B3C9C984). However, when I run the same code on Linux (CentOS), I just see a very strange string like "???E??6??????c", and the type of the value is "buffer", not "str" as on macOS.
Could you explain why this happens and how I can get the correct uniqueidentifier value on Linux? Thanks
On Linux I use str(uuid.UUID(bytes_le=value)).upper() to get a string like 4C444660-6003-13CE-CBD5-8478B3C9C984 from the uniqueidentifier field.
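A small Python 3 sketch of that conversion, assuming the driver returns the 16 raw bytes in SQL Server's mixed-endian layout (which is what bytes_le expects). The helper name uniqueidentifier_to_str is hypothetical, and the sample bytes are generated from the UUID in the question rather than read from a database:

```python
import uuid


def uniqueidentifier_to_str(raw):
    """Convert 16 raw uniqueidentifier bytes to the usual text form.

    bytes(raw) also accepts bytearray and memoryview objects.
    """
    return str(uuid.UUID(bytes_le=bytes(raw))).upper()


# Stand-in for the value a driver might return on Linux.
sample = uuid.UUID('4C444660-6003-13CE-CBD5-8478B3C9C984').bytes_le
text = uniqueidentifier_to_str(sample)
```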
This is a few years old, but I've had to tackle this same problem recently. My solution was to simply CAST the unique identifier as a VARCHAR, which kept my Python code nice and tidy:
SELECT CAST(unique_id_column AS VARCHAR(36)) AS my_id FROM...
Then in Python, simply output row.my_id.