Python UnicodeEncodeError involving 'charmap' codec

This code was working fine before, but now when I try to write a list to a CSV file I get this error:
  File "C:/Users/wf5931/OneDrive - ENGIE/Documents/Python Scripts/Scrape Vehicle Reg Info/vehicleRegChecker 6.1.py", line 109, in openFile
    writer.writerow(x)
  File "C:\Users\wf5931\AppData\Local\Continuum\anaconda3\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2082' in position 78: character maps to <undefined>
from this:
    with open(vehicleRegInformation, 'w', newline='') as f:
        writer = csv.writer(f)
        for x in vehicleRegInfo:
            writer.writerow(x)

Try adding encoding="utf-8":
    with open(vehicleRegInformation, 'w', newline='', encoding="utf-8") as f:
        writer = csv.writer(f)
        for x in vehicleRegInfo:
            writer.writerow(x)

Add an encoding when opening the file:
    with open(vehicleRegInformation, 'w', newline='', encoding='utf8') as f:
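
To see why the encoding argument matters, here is a minimal sketch, assuming Python 3. The row contents and the file name demo.csv are made up for illustration and are not from the question. The character '\u2082' (subscript two) has no slot in cp1252, so encoding it fails, while UTF-8 can encode any Unicode character.

```python
import csv
import os
import tempfile

# Hypothetical row containing U+2082 (subscript two), as in "CO2" with a subscript.
row = ["AB12 CDE", "CO\u2082 emissions: 120 g/km"]

# cp1252 has no mapping for U+2082, which is what the traceback reports.
try:
    row[1].encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# Writing with an explicit UTF-8 encoding succeeds.
path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerow(row)

# Read the file back with the same encoding to confirm a clean round trip.
with open(path, newline="", encoding="utf-8") as f:
    rows_read = list(csv.reader(f))
```

If the file is meant to be opened in Excel, encoding="utf-8-sig" is a common alternative, since it writes a byte order mark that helps Excel recognise the file as UTF-8.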

Related

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbf in position 355: invalid start byte

I've been trying to iterate through a csv file with the following code:
`
import csv
import os, sys
directory = "/Users/aliharam/Desktop/Lamis File"
files = []
for filename in os.listdir(directory):
    f = os.path.join(directory, filename)
    # checking if it is a file
    if os.path.isfile(f):
        files.append(f)
files.pop()
for i in files:
    with open(i, 'r') as csvfile:
        datareader = csv.reader(csvfile)
        for row in datareader:
            print(row)
`
This is the error I am getting:
Traceback (most recent call last):
  File "/Users/aliharam/PycharmProjects/LamisTasks/Normalization.py", line 16, in <module>
    for row in datareader:
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbf in position 355: invalid start byte
['\tAli Haram \tAli Haram ']
Process finished with exit code 1
How do I fix this?
I tried using
    dataset = pd.read_csv(i, header=0, encoding='unicode_escape')
and
    with io.open(filename, 'r', encoding='utf-8') as fn:
        lines = fn.readlines()
but neither worked.

The file your program reads contains a byte (0xbf, at position 355) that cannot start a character in UTF-8, so the file is most likely not UTF-8 encoded.
Python is decoding the file as UTF-8, so either the data file is corrupted or it was saved in a different encoding. First find out which encoding the file actually uses, then pass that encoding to open().
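
One rough way to check the encoding is to try a short list of candidates and keep the first one that decodes without error. This is only a sketch: the helper name read_with_fallback and the candidate list are assumptions, not something from the question, and a clean decode does not prove the guess is right. In particular, latin-1 accepts any byte sequence, so it never fails but may produce the wrong characters.

```python
import os
import tempfile


def read_with_fallback(path, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    Hypothetical helper: the candidate list is a guess, not real detection.
    """
    with open(path, "rb") as f:
        raw = f.read()
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError(f"none of {encodings} could decode {path!r}")


# Demo file: 0xbf is an inverted question mark in cp1252,
# but it is not a valid start byte in UTF-8.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(demo_path, "wb") as f:
    f.write(b"name,question\nAli,\xbfQu\xe9?\n")

text, used = read_with_fallback(demo_path)
```

Once the encoding is known, pass it directly, for example open(i, 'r', encoding=used), rather than guessing on every run.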

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 123: character maps to <undefined>

I wrote this code:
@app.route('/cafes')
def cafes():
    with open('cafe-data.csv', newline='') as csv_file:
        csv_data = csv.reader(csv_file, delimiter=',')
        list_of_rows = []
        for row in csv_data:
            list_of_rows.append(row)
    return render_template('cafes.html', cafes=list_of_rows)
But I got this error on my website.

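You need to specify the encoding when opening the CSV file. If you leave it out, open() uses a platform-dependent default. The 'charmap' in the error suggests a Windows code page such as 'cp1252' was used, and byte 0x8f has no mapping there. The file is probably saved as 'utf-8', so try that first:
    with open('cafe-data.csv', newline='', encoding='utf-8') as csv_file:
        ...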
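
As a rough illustration, assuming Python 3, the sketch below prints nothing but computes three things: the default encoding open() would fall back to on the current machine, whether a UTF-8 file can be decoded as cp1252, and the rows read with an explicit encoding. The file contents are invented stand-ins for cafe-data.csv. The emoji is followed by a variation selector (U+FE0F), whose UTF-8 bytes end in 0x8f.

```python
import csv
import locale
import os
import tempfile

# The encoding open() typically falls back to when none is given (platform-dependent).
default_encoding = locale.getpreferredencoding(False)

# Hypothetical stand-in for cafe-data.csv, saved as UTF-8.
path = os.path.join(tempfile.mkdtemp(), "cafe-data.csv")
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows([["Cafe Name", "Coffee"], ["Caf\u00e9 Noir", "\u2615\ufe0f"]])

# U+FE0F is stored as the bytes EF B8 8F; 0x8f is undefined in cp1252,
# hence "character maps to <undefined>" when decoding with that code page.
with open(path, "rb") as f:
    raw = f.read()
try:
    raw.decode("cp1252")
    cp1252_ok = True
except UnicodeDecodeError:
    cp1252_ok = False

# Reading with the encoding the file was actually saved in works.
with open(path, newline="", encoding="utf-8") as csv_file:
    list_of_rows = list(csv.reader(csv_file, delimiter=","))
```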

Reading CSV file. UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 336: character maps to <undefined>

I'm trying to read two CSV files with the same function, but when reading one of them (yelp.csv) I encounter an error:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 336: character maps to <undefined>
I tried specifying the encoding, but the error persists. I have identified that the error occurs when calling .readlines(). I'm not sure how to fix it.
    def readDataFromFile(fileName, seperator, encoding="utf8"):
        with open(fileName, 'r') as panelf:
            panelf.readline() # skip header
            lines = []
            data = panelf.readlines()
            for line in data:
                line = line.strip("\n").split(seperator)
                lines.append(line)
        return lines
    panelData = readDataFromFile("Desktop/panel.csv", ",", encoding="utf-8")
    yelpData = readDataFromFile("Desktop/yelp.csv", ",", encoding="utf-8")

The encoding parameter is never used: the function accepts it but does not pass it to open(). Try:
    def readDataFromFile(fileName, seperator, encoding="utf8"):
        # Pass the encoding through so open() does not fall back to the platform default.
        with open(fileName, 'r', encoding=encoding) as panelf:
            panelf.readline() # skip header
            lines = []
            data = panelf.readlines()
            for line in data:
                line = line.strip("\n").split(seperator)
                lines.append(line)
        return lines
    panelData = readDataFromFile("Desktop/panel.csv", ",", encoding="utf-8")
    yelpData = readDataFromFile("Desktop/yelp.csv", ",", encoding="utf-8")
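
As an alternative sketch, assuming Python 3, the same job can be done with the csv module, which also copes with quoted fields that contain the separator (a plain split would break those apart). The name read_rows and the demo file are hypothetical and only there for illustration.

```python
import csv
import os
import tempfile


def read_rows(file_name, separator=",", encoding="utf-8"):
    """Return all rows after the header, decoded with the given encoding."""
    with open(file_name, newline="", encoding=encoding) as f:
        reader = csv.reader(f, delimiter=separator)
        next(reader, None)  # skip header; the default avoids StopIteration on empty files
        return list(reader)


# Demo with a hypothetical file standing in for yelp.csv.
demo_path = os.path.join(tempfile.mkdtemp(), "yelp_demo.csv")
with open(demo_path, "w", newline="", encoding="utf-8") as f:
    f.write('name,review\n"Caf\u00e9 Luna","Great, cosy place"\n')

rows = read_rows(demo_path, ",", encoding="utf-8")
```

If the file still fails to decode as UTF-8, it was probably saved in another encoding, and that encoding is the one to pass in.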

codecs.ascii_decode(input, self.errors)[0] UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 318: ordinal not in range(128)

I am trying to open a .txt file that contains a large amount of text and read its lines. Below is my code; I don't know how to solve this problem. Any help would be much appreciated.
file = input("Please enter a .txt file: ")
myfile = open(file)
x = myfile.readlines()
print (x)
When I enter the .txt file, the full error message displayed is:
  line 10, in <module>
    x = myfile.readlines()
  line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 318: ordinal not in range(128)

Instead of using codecs, I solved it this way:
    def test():
        path = './test.log'
        # Open with an explicit encoding; the with block closes the file afterwards.
        with open(path, 'r', encoding='utf-8') as file:
            for line in file:
                print(line)
You must pass the encoding parameter explicitly.

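You can also drop the non-ASCII characters from each line. Note that this only helps once the file can be decoded, so open it with the right encoding first:
    with open(file, encoding='utf-8') as f:
        for line in f:
            # Discard anything outside ASCII, then turn the bytes back into text.
            line = line.encode('ascii', 'ignore').decode('ascii')
            print(line)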

@AndriiAbramamov is right, you should check that question. Here is a way to open your file, which is also described at that link:
    import codecs
    f = codecs.open('words.txt', 'r', 'UTF-8')
    for line in f:
        print(line)
Another option is to use a regular expression to remove special characters, such as double quotes, once the file has been read.
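
To make the difference between these options concrete, here is a small sketch, assuming Python 3. The file contents are invented: a pound sign is used because its UTF-8 encoding starts with the byte 0xc2 from the error message. It compares strict ASCII decoding, UTF-8 decoding, and the errors="replace" option of open().

```python
import os
import tempfile

# Hypothetical file: U+00A3 (pound sign) is stored in UTF-8 as the bytes C2 A3.
path = os.path.join(tempfile.mkdtemp(), "words.txt")
with open(path, "wb") as f:
    f.write("price: \u00a35\n".encode("utf-8"))

# Decoding as ASCII fails on 0xc2, as in the traceback.
try:
    with open(path, encoding="ascii") as f:
        f.readlines()
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Decoding as UTF-8 recovers the original text.
with open(path, encoding="utf-8") as f:
    utf8_lines = f.readlines()

# errors="replace" never raises, but each undecodable byte becomes U+FFFD.
with open(path, encoding="ascii", errors="replace") as f:
    replaced_lines = f.readlines()
```

Passing the correct encoding keeps the data intact, whereas errors="replace" or errors="ignore" silently loses characters, so they are best kept as a last resort.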

Write ® to csv with csv.writer

I'm trying to write strings with '®' to a csv file:
    csvfile = open(destination, "wb")
    csv_writer = csv.writer(csvfile, quoting=csv.QUOTE_MINIMAL, delimiter='\t')
    for row in data:
        csv_writer.writerow(row)
    csvfile.close()
and row looks like this:
[123, "str", "str2 ®"]
The strings I'm trying to write to the CSV file are retrieved from XML, which I believe is encoded as UTF-8.
I get this error:
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/ec2-user/django/app/models.py", line 94, in import_data
    load_to_csv(out, out_data)
  File "/home/ec2-user/django/utils/util.py", line 90, in load_to_csv
    csv_writer.writerow(row)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xae' in position 66: ordinal not in range(128)
Then I tried to encode the strings as UTF-8:
    csvfile = open(destination, "wb")
    csv_writer = csv.writer(csvfile, quoting=csv.QUOTE_MINIMAL, delimiter='\t')
    for row in data:
        for i, r in enumerate(row):
            if type(r) is str:
                row[i] = r.encode('utf-8')
        csv_writer.writerow(row)
    csvfile.close()
But I still get the same error. Could anyone help? I have been stuck on this for a while.

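You have Unicode values, not byte strings, so the check against str never matches. In Python 2, test for unicode and encode those values before writing:
    for row in data:
        # Encode only the unicode cells; numbers and byte strings pass through unchanged.
        row = [c.encode('utf8') if isinstance(c, unicode) else c for c in row]
        csv_writer.writerow(row)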
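
The answer above is specific to Python 2, where csv.writer works with byte strings. For comparison, here is a minimal sketch of the same write under Python 3 (an assumption, since the question uses Python 2), where the csv module works with text and open() does the encoding. The destination path is a temporary file chosen for illustration.

```python
import csv
import os
import tempfile

data = [[123, "str", "str2 \u00ae"]]  # same shape as the row in the question

destination = os.path.join(tempfile.mkdtemp(), "out.tsv")  # hypothetical path

# Python 3: open in text mode with newline="" and let open() handle the encoding.
with open(destination, "w", newline="", encoding="utf-8") as csvfile:
    csv_writer = csv.writer(csvfile, quoting=csv.QUOTE_MINIMAL, delimiter="\t")
    for row in data:
        csv_writer.writerow(row)

# The registered sign ends up in the file as the UTF-8 bytes C2 AE.
with open(destination, "rb") as f:
    raw = f.read()
```

No per-cell encode() call is needed in this version, and calling it would write the repr of a bytes object (b'...') into the file instead of the text.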
