How to properly decode or encode a string in python?


I have the following string, which raises an error when I try to upload it to a database:
String_from_source = 'STRINGÊ'
String_in_dataset = 'STRING\xe6'
When I try to decode or encode, I get the following error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 17: ordinal not in range(128)
For example, I get this error for each of the following:
'STRING\xe6'.encode()
'STRING\xe6'.decode()
'STRING\xe6'.encode('utf-8')
I've also tried:
'STRING\xe6'.decode('utf-8')
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe6 in position 17: unexpected end of data
Is there a way to eliminate this type of error once and for all? I really just want to exclude all special characters.
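The `'ascii' codec` error raised by `.encode()` is typical of Python 2, where calling `.encode()` on a byte string first decodes it implicitly as ASCII. What follows is a minimal Python 3 sketch, assuming the data arrives as `bytes` (or as `str`) and that silently dropping every non-ASCII character is acceptable. `strip_non_ascii` and `strip_non_ascii_text` are hypothetical helper names, not library functions.

```python
def strip_non_ascii(raw: bytes) -> str:
    """Decode bytes as ASCII, discarding any byte outside the 0-127 range."""
    return raw.decode("ascii", errors="ignore")


def strip_non_ascii_text(text: str) -> str:
    """Same idea for text that is already a str: drop non-ASCII characters."""
    return text.encode("ascii", errors="ignore").decode("ascii")


print(strip_non_ascii(b"STRING\xe6"))        # STRING
print(strip_non_ascii_text("STRING\u00e6"))  # STRING
```

This approach loses information. If the original characters matter, decoding with the data's real encoding is usually preferable to discarding bytes.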

Related

Python Error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x81 in position 76: invalid start byte

I'm new to Python and followed a tutorial. I wrote the code exactly as shown there, but it always fails with:
command_output = subprocess.run(["netsh", "wlan", "show", "profiles"], capture_output = True).stdout.decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x81 in position 76: invalid start byte
Can you please help me with this?

This probably means the output stored in stdout is not UTF-8 encoded. You can try passing another character encoding as an argument to decode. Latin-1 can decode 0x81, though the result is not a printable character.
>>> b'\x81'.decode('latin-1')
'\x81'
In Latin-1, byte 0x81 maps to U+0081, a C1 control character with no visible glyph.
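As a sketch of the suggestion above: `bytes.decode` accepts both an encoding name and an `errors` policy, so undecodable bytes can be replaced instead of raising. The sample bytes and the choice of `cp850` (a console code page often seen on Western European Windows systems) are assumptions for illustration; the actual code page depends on the machine. `decode_output` is a hypothetical helper name.

```python
def decode_output(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode raw process output; undecodable bytes become U+FFFD instead of raising."""
    return raw.decode(encoding, errors="replace")


# Hypothetical sample standing in for the bytes returned in `.stdout`.
sample = b"Profile : M\x81ller"

as_utf8 = decode_output(sample)            # 0x81 is replaced by U+FFFD
as_cp850 = decode_output(sample, "cp850")  # 0x81 is the letter u-umlaut in code page 850
```

`subprocess.run` also accepts `encoding=` and `errors=` arguments, in which case `stdout` is returned as a `str` and no separate `.decode()` call is needed.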

Trying to decode a NASDAQ binary file. What kind of codec do those use? Tried utf-8, unicode-escape and ascii

I get this error with every codec I have tried so far:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xed in position 0: ordinal not in range(128).

How to solve UnicodeDecodeError when reading csv

I am trying to open a csv file with pandas but I get this error:
test_tweets = pd.read_csv(r"C:\Users\22587\Downloads\data\test_tweets.csv")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 75: invalid start byte

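0xa0 is the non-breaking space in single-byte encodings such as Latin-1 and cp1252. You may have copied your data from a website that contained this invisible character, which suggests the file is not actually UTF-8 encoded.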
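To illustrate: as a lone byte, 0xa0 is not a valid start byte in UTF-8, which is why the decoder rejects it. The sketch below assumes the file was saved as cp1252 (an assumption; the question does not state the real encoding) and uses a made-up sample line.

```python
# Made-up sample: a CSV fragment containing a cp1252 non-breaking space (0xa0).
raw_line = b"id,tweet\n1,hello\xa0world\n"

try:
    raw_line.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False  # 0xa0 cannot start a UTF-8 sequence

text = raw_line.decode("cp1252")       # decodes 0xa0 to U+00A0
cleaned = text.replace("\u00a0", " ")  # swap the invisible character for a plain space

print(utf8_ok)  # False
```

With pandas, the equivalent is to pass the encoding explicitly, for example `pd.read_csv(path, encoding="cp1252")`, where `path` is your own file.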

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd1 in position 2: invalid continuation byte

Basically, I was using pandas to read csv files to separate a column which had "Date + Hour" in the format "dd/mm/yy hh".
I had help here trying to write a script to separate the column in 2 different columns.
The joint field is "FECHA", and I managed to run this code on some of the csv files:
import pandas as pd
sal = pd.read_csv('C:/Users/drivasti/Documents/002_Script_Separa_Fecha_Hora/Anexo2_THP_UL.csv')
# Split "FECHA" at the first space: part 0 is the date, part 2 is the hour
df = sal.join(sal['FECHA'].str.partition(' ')[[0, 2]]).rename({0: 'DATE', 2: 'HOUR'}, axis=1)
df.to_csv('C:/Users/drivasti/Documents/002_Script_Separa_Fecha_Hora/Anexo2_THP_UL_2.csv', index=False)
They worked perfectly.
However, I encountered this error when I tried running it on another csv file (note that I change the name of the file every time I run it, but they're all csv files):
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd1 in position 2: invalid continuation byte
Now I have tried some of the answers here but none have helped:
UnicodeDecodeError: 'utf-8' codec can't decode byte
'utf-8' codec can't decode byte 0xdb in position 1:
Does anyone know how to parse this as UTF-8? Or is it a problem in the "FECHA" field?
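One way to investigate is to check how the raw bytes decode under a few candidate encodings. Byte 0xd1 is `Ñ` in Latin-1 and cp1252, which is plausible for Spanish data, but that is a guess about this particular file. A minimal sketch with a made-up sample and a hypothetical helper `try_encodings`:

```python
from typing import Dict, Iterable, Optional


def try_encodings(raw: bytes, encodings: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each candidate encoding to the decoded text, or None if decoding fails."""
    results: Dict[str, Optional[str]] = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Made-up sample: the word "AÑO" saved as Latin-1, so the middle byte is 0xd1.
sample = b"A\xd1O"
results = try_encodings(sample, ["utf-8", "latin-1", "cp1252"])
```

For a real file, the bytes can be read with `open(path, "rb").read()`. If a single-byte encoding yields sensible text, passing its name as `encoding=` to `pd.read_csv` is the usual next step.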

charmap codec character encoding error

I have a Thai address stored in my table, and a simple query returns it as:
u'35/1-2 8 \u0e16\u0e19\u0e19\u0e23\u0e31\u0e15\u0e19\u0e32\u0e18\u0e34\u0e40\u0e1a\u0e28\u0e23\u0e4c \u0e1a\u0e32\u0e07\u0e01\u0e23\u0e30\u0e2a\u0e2d \u0e40\u0e21\u0e37\u0e2d\u0e07\u0e19\u0e19\u0e17\u0e1a\u0e38\u0e23\u0e35 \u0e19\u0e19\u0e17\u0e1a\u0e38\u0e23\u0e35'
I tried to decode it with the following command:
QtGui.QTableWidgetItem(data[i][j].decode('utf-8'))
But I am getting this error:
data[i][j] Error btnManualSearch 'charmap' codec can't encode characters in position 10-24: character maps to <undefined>
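The `u'...'` value is already decoded text, so calling `.decode` on it should not be necessary; in Python 2 that call first encodes the text implicitly, which is one way encode errors arise. A `'charmap'` message is what Python reports when text is encoded to a single-byte code page that lacks the characters, such as cp1252 for Thai. A minimal Python 3 sketch, using a shortened sample of the address and a hypothetical helper `can_encode`:

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text exists in the target encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Shortened sample of the address from the question (already a str, no decode needed).
address = "35/1-2 8 \u0e16\u0e19\u0e19"

utf8_ok = can_encode(address, "utf-8")     # UTF-8 covers all of Unicode
cp1252_ok = can_encode(address, "cp1252")  # cp1252 has no Thai characters
```

If this reading is right, passing the text to the widget unchanged, and avoiding printing it to a console that uses a Western code page, may sidestep the error.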
