UnicodeDecodeError for writing file - python

I know that this is a very common error, but it's the first time I've encountered it when trying to write a file.
I'm using networkx to work with graphs for network analysis, and when I try to write into any format:
nx.write_gml(G, "Graph.gml")
nx.write_pajek(G, "Graph.net")
nx.write_gexf(G, "graph.gexf")
I get:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 2, in write_pajek
File "/Library/Python/2.7/site-packages/networkx/utils/decorators.py", line 263, in _open_file
result = func(*new_args, **kwargs)
File "/Library/Python/2.7/site-packages/networkx/readwrite/pajek.py", line 100, in write_pajek
path.write(line.encode(encoding))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 19: ordinal not in range(128)
I haven't found any documentation on this, so I'm quite confused.

You may be able to solve this with the codecs module: create the file object with codecs.open and pass it to networkx instead of a file name.
For example:
import codecs
f = codecs.open("graph.gml", "w", "utf-8")
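A note on why the traceback reports a decode error while writing: in Python 2, calling .encode() on a byte str first decodes it implicitly with the ascii codec, and that step fails on a non-ASCII byte such as 0xc3. So the node data most likely contains raw UTF-8 bytes rather than unicode text. A minimal sketch of that failure and of decoding explicitly before writing follows; the label value is made up for illustration, and io.open is used here in place of codecs.open only because it behaves the same way on Python 2.7 and Python 3.

```python
import io
import os
import tempfile

# Hypothetical node label held as raw UTF-8 bytes (not from the question).
raw_label = b"caf\xc3\xa9"

# Decoding with ASCII fails on byte 0xc3, as in the traceback.
try:
    raw_label.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Decoding with the real encoding yields text that can be written safely.
label = raw_label.decode("utf-8")

path = os.path.join(tempfile.mkdtemp(), "graph.gml")
with io.open(path, "w", encoding="utf-8") as f:
    f.write(label)

# Read the file back to confirm the text survived the round trip.
with io.open(path, "r", encoding="utf-8") as f:
    round_trip = f.read()
```

If this is the cause, decoding every label to text before adding it to the graph should let the write functions encode it without error.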

Related

Python: Can't seem to decode the textfile

I am trying to open the file i.md3 in python. Only the first line is displayed properly.
This file is correct as I can open this in C easily with structures and pointers.
How can I decode this file? I have tried many encodings, and most of them display only the first line correctly.
Without encoding="cp850" there is an error:
Traceback (most recent call last):
File "D:\Eclipse Workspace\IGG Project\Main.py", line 40, in <module>
line = fp1.read()
File "C:\Program Files (x86)\Python38-32\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 35: character maps to <undefined>
Code:
fp1 = open("i.md3", encoding="cp850")
while 1:
    line = fp1.read()
    if not line:
        break
    print (line)
The first few lines of the output are shown at the link below:
https://i.stack.imgur.com/TGZhq.png
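Since the file can be read in C with structures and pointers, it is probably binary data rather than text, in which case no text encoding will decode all of it. A minimal sketch of reading such a file in binary mode with the struct module follows. The header layout used here (a 4-byte magic string followed by a little-endian 32-bit integer) and the sample values are assumptions for illustration and should be checked against the real format; a small stand-in file is generated so the example runs on its own.

```python
import os
import struct
import tempfile

# Assumed header layout, for illustration only: a 4-byte magic string
# followed by a little-endian 32-bit integer version number.
HEADER = struct.Struct("<4si")

# Write a small stand-in file so the sketch runs without a real i.md3.
path = os.path.join(tempfile.mkdtemp(), "sample.bin")
with open(path, "wb") as out:
    out.write(HEADER.pack(b"IDP3", 15))  # sample values, not verified
    out.write(b"\x90\x00\xff")           # bytes a text codec may reject

# "rb" returns raw bytes, so no codec is involved and nothing is lost.
with open(path, "rb") as fp1:
    magic, version = HEADER.unpack(fp1.read(HEADER.size))
    rest = fp1.read()
```

The struct format string plays the role of the C structure definition: each field is unpacked from a fixed number of bytes instead of being decoded as characters.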

VADER-Sentiment-Analysis toolkit and decoding to UTF-8

I'm trying out this awesome sentiment analysis toolkit for python called Vader (https://github.com/cjhutto/vaderSentiment#python-code-example). However, I'm not even able to run their examples, because of a decoding problem (?).
I've tried .decode('utf-8'), but I still get this error:
Traceback (most recent call last):
File "/Users/solari/Codes/EmotionalTwitter/vader.py", line 22, in <module>
analyzer = SentimentIntensityAnalyzer()
File "/usr/local/lib/python3.6/site-packages/vaderSentiment/vaderSentiment.py", line 199, in __init__
self.lexicon_full_filepath = f.read()
File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 6573: ordinal not in range(128)
[Finished in 0.5s with exit code 1]
Why does it complain about this "ascii codec"? Because if I've read their documentation correctly this should be in utf-8 anyway. Also, I'm using Python 3.6.2.
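The "ascii codec" most likely appears because open() without an encoding argument uses the locale's preferred encoding, which seems to be ASCII in this environment, regardless of how the file itself was saved. A minimal sketch of the difference follows; the file name and its content are made up, and the same idea applies wherever the lexicon file is actually opened.

```python
import os
import tempfile

# Stand-in for a UTF-8 lexicon file; the name and content are made up.
path = os.path.join(tempfile.mkdtemp(), "lexicon.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("na\u00efve\t0.5\n")

# Forcing ASCII reproduces the kind of error shown in the traceback.
try:
    with open(path, encoding="ascii") as f:
        f.read()
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Naming the encoding makes the read independent of the locale.
with open(path, encoding="utf-8") as f:
    lexicon_text = f.read()
```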

About UnicodeDecodeError

I am writing a program to count words with Python (3.6). The code runs smoothly from the terminal, but if I run it in Python's IDLE, the error below occurs:
Traceback (most recent call last):
File "/Users/zhangchaont/python/Course Python Programming/6.7V2.py", line 122, in <module>
main()
File "/Users/zhangchaont/python/Course Python Programming/6.7V2.py", line 21, in main
for line in txtFile:
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 33: ordinal not in range(128)
How can I solve this?
Since there isn't much information about your code, I can only suggest that, instead of codecs, you could also use this package: https://github.com/iki/unidecode.
The method below should solve your problem: open your file with open and pass the result of file_handle.read() to it.
unidecode.unidecode_expect_nonascii(string)
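One plausible explanation for the difference between the terminal and IDLE is that the two run with different default encodings, so a file opened without an encoding argument is decoded as ASCII in one and as UTF-8 in the other. A minimal sketch of checking the default and of counting words with the encoding stated explicitly follows; the sample text is made up, and UTF-8 is an assumption about the real input file.

```python
import locale
import os
import tempfile

# The encoding open() falls back to when none is given; it can differ
# between a terminal session and IDLE.
default_encoding = locale.getpreferredencoding(False)

# Made-up sample text standing in for the real input file.
path = os.path.join(tempfile.mkdtemp(), "words.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("\u3042 \u3044\n\u3046\n")

# Iterate line by line with an explicit encoding and count the words.
word_count = 0
with open(path, encoding="utf-8") as txtFile:
    for line in txtFile:
        word_count += len(line.split())
```

Printing default_encoding in both environments would show whether they really differ.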

UnicodeEncodeError when trying to read a graph with networkx

I have a little script that takes hashtags from Twitter with the TwitterSearch API and uses them as nodes in a graph with networkx. TwitterSearch returns the hashtags as unicode strings, and I have no problems saving the graph with the write_pajek function. However, when I try to read the graph with read_pajek, it raises this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 2, in read_pajek
File "C:\Python27\lib\site-packages\networkx\utils\decorators.py", line 263, in _open_file
result = func(*new_args, **kwargs)
File "C:\Python27\lib\site-packages\networkx\readwrite\pajek.py", line 134, in read_pajek
return parse_pajek(lines)
File "C:\Python27\lib\site-packages\networkx\readwrite\pajek.py", line 170, in parse_pajek
splitline=shlex.split(str(next(lines)))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 4-5: ordinal not in range(128)
I think the problem is that it tries to encode some Chinese/Japanese characters with the ascii codec, but I don't know how to solve it. The second argument of the function lets you declare the encoding of the file, which defaults to "UTF-8", so in theory I shouldn't have any problem reading it.

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 0: ordinal not in range(128)

I have written the code below, which builds the path to an Excel file that will be created with the xlwt module:
master_path = r"C:\Users\nbt8ye8\Documents\Docs\Report Automation\KBE Reporting\Reports"
master_excel_file_raw = "KBE Master Data.xls"
master_excel_file = os.path.join(master_path, master_excel_file_raw)
Then later in the code I create the Excel file (which works without a problem) with the code below:
master_excel_wbook = xlwt.Workbook()
master_excel_wsheet = master_excel_wbook.add_sheet("All Data", cell_overwrite_ok=True)
master_excel_wbook.save(master_excel_file)
master_excel_wbook.save(tempfile.TemporaryFile())
However when I run the code it gives me the errors below.
Traceback (most recent call last):
File "C:\Users\nbt8ye8\workspace\Report Automation\import_data.py", line 1225, in <module>
createExcelFile()
File "C:\Users\nbt8ye8\workspace\Report Automation\import_data.py", line 1219, in createExcelFile
master_excel_wbook.save(master_excel_file)
File "build\bdist.win32\egg\xlwt\Workbook.py", line 662, in save
File "build\bdist.win32\egg\xlwt\Workbook.py", line 637, in get_biff_data
File "build\bdist.win32\egg\xlwt\Workbook.py", line 599, in __sst_rec
File "build\bdist.win32\egg\xlwt\BIFFRecords.py", line 76, in get_biff_record
File "build\bdist.win32\egg\xlwt\BIFFRecords.py", line 91, in _add_to_sst
File "build\bdist.win32\egg\xlwt\UnicodeUtils.py", line 50, in upack2
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 0: ordinal not in range(128)
Does anyone know how to resolve this issue? I have tried encoding and decoding the strings, and so far it has not worked, but it is also quite likely that I did not do it correctly. Any help would be greatly appreciated. Thank you!
@RyanG was correct: this error was thrown because there was unicode data in my Excel file. Once I modified the Excel file to remove the unicode data, the problem was resolved. Thanks again.
