Non-ASCII character '\xe2' in file but no encoding declared - python

I wrote a script to extract signals from the MIT-BIH dataset using the wfdb python library. The script was working fine when I was running it on windows but I recently shifted to Mac. After installing all the dependencies I got an error when I tried to import processing from the wfdb library. This is the error I get:
SyntaxError: Non-ASCII character '\xe2' in file /usr/local/lib/python2.7/site-packages/scipy/stats/_continuous_distns.py on line 3346, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
import wfdb works fine but there seems to be a problem when I do from wfdb import processing. Is there any way to solve this issue?

Please add following line at the top of the code.
# -*- coding: utf-8 -*-
Also, avoid using non-ascii quotations.

This error is caused due to copying and pasting code from web which causes stray byte floating. You can find it by running.
with open('my_script.py', 'r') as ms:
for i, line in enumerate(ms):
if '\xe2' in line:
print(i, repr(line))
And the line and its index value will be printed where there is '\xe2':
4, "\xe2 word=string.printable(random.randint[0,61]) # Gets the random word"
Note: You should replace my_script.py with your respective .py file.

This was due to a scipy bug which has been fixed.

Related

Print Bangla in pythone

I declare some variables in Bangla without any syntax error.
But when I want to print it, its gives me the error.
SyntaxError: Non-UTF-8 code starting with '\xff' in file D:/Project/Python Tutorials Repo/condition/condition.py on line 1, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
This is my script it Github: https://github.com/banglaosc/condition/blob/master/condition.py
unicode UTF-8 change picture
Here in this image, you can see the red mark. In your code editor, you can see that your Unicode change with another format(UTF-16LE). try to convert with Unicode UTF-16LE to UTF-8 from the bottom right corner. It's working for me.
The encoding is at the bottom right in that PyCharm screenshot: UTF-16LE. PEP 263 says Python will assume a file is ASCII, but it looks like that's been switched to UTF-8 in Python 3.
Try switching the file encoding to UTF-8 by clicking the "UTF-16LE" at the bottom right. If that's not possible, declare the encoding at the top of the script like this:
# -*- coding: UTF-16LE -*-
EDIT: The following applies to python3.5+ this will not work in python2.7
When I run your code I do not get that error. Pycharm show it as an error, but the python interpreter has no issue with the characters. The exception actually raised when running this is TypeError because the variable জুকারবার্গ is an int, and you are trying to use the + on it with a string. The following will execute with no errors.
বিল_গেটস = 20
জুকারবার্গ = 30
ওয়ারেন_বাফেট = 35
ইলন_মাস্ক = 10
if বিল_গেটস > জুকারবার্গ:
print(str(বিল_গেটস) + "বেশি ধনি")
else:
print(str(জুকারবার্গ) + "বেশি ধনি")

code runs from interpreter but not in editor

I am trying to add some symbols to a text file,i can not define these symbols in editor
but it works from command line.
symbols = '$¢£¥€¤' works in interpreter but not editor(sublime),however it doesn't print these symbols correctly in command.However if i decode("utf-8") then print works fine.
symbols = '$¢£¥€¤'
s=symbols.decode("utf-8")
I use python 2.7 and sublime text editor
this is the error i get when i run using editor
SyntaxError: Non-ASCII character '\xc2' in file /home/programmer/Desktop/seleniumIns.py on line 184, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
How can i fix these to add them to my original program in editor
When you run a python file containing unicode, you need to tell the interpreter what encoding is used.
In your case put at the very first line of your script this line:
# -*- coding: utf-8 -*-
And you'll be able to use utf-8!

Non-ASCII character '\xb7' of xlsm file

I am trying to run vba_extract.py in Visual Studio by IronPython. The compilation returns an error:
An unhandled exception of type 'Microsoft.Scripting.SyntaxErrorException' occurred in IronPython.dll
Additional information: Non-ASCII character '\xb7' in file ../../python/vba_extract.py ../../../tests/test.xlsm on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details
I searched this issue, and added encoding in Line 2 of vba_extract.py:
#!python
# -*- coding: utf-8 -*-
But it still does not work.
Additionally, I tried the following two lines, which returned the same error:
engine.ExecuteFile(#"../../python/irrelevant.py ../../../tests/test.xlsm");
engine.ExecuteFile(#"../../tests/test.xlsm");
So I guess the problem is from the encoding of test.xlsm.
Could anyone help?

"SyntaxError: Non-ASCII character" in running Python code

Im starting with python and when i use the interpreter and run this code:
>>>peliculas = ["movie1", "movie2", "movie3", "movie4"]
>>>print(peliculas[1])
when i use Pycharm IDE it doesnt compile:
peliculas = ["movie1", "movie2", "movie3", "movie4"]
print(peliculas[1])
This is the output:
File "/Users/user/PycharmProjects/untitled/Prueba2.py", line 1
SyntaxError: Non-ASCII character '\xc3' in file /Users/user/PycharmProjects/untitled/Prueba2.py on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details
What´s wrong with python?.. do i installed it correctly?
Just add these two lines at to top of python code.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
It's a file encoding problem. Try adding this at the beginning of the file:
#coding=utf-8
In PyCharm you can specify file encoding via the File | File encoding menu, or the file encoding item on the status bar. See this help article

Python "SyntaxError: Non-ASCII character '\xe2' in file" [duplicate]

This question already has answers here:
SyntaxError: Non-ASCII character '\xa3' in file when function returns '£'
(6 answers)
Closed 2 years ago.
I am writing some python code and I am receiving the error message as in the title, from searching this has to do with the character set.
Here is the line that causes the error
hc = HealthCheck("instance_health", interval=15, target808="HTTP:8080/index.html")
I cannot figure out what character is not in the ANSI ASCII set? Furthermore searching "\xe2" does not give anymore information as to what character that appears as. Which character in that line is causing the issue?
I have also seen a few fixes for this issue but I am not sure which to use. Could someone clarify what the issue is (python doesn't interpret unicode unless told to do so?), and how I would clear it up properly?
EDIT:
Here are all the lines near the one that errors
def createLoadBalancer():
conn = ELBConnection(creds.awsAccessKey, creds.awsSecretKey)
hc = HealthCheck("instance_health", interval=15, target808="HTTP:8080/index.html")
lb = conn.create_load_balancer('my_lb', ['us-east-1a', 'us-east-1b'],[(80, 8080, 'http'), (443, 8443, 'tcp')])
lb.configure_health_check(hc)
return lb
If you are just trying to use UTF-8 characters or don't care if they are in your code, add this line to the top of your .py file
# -*- coding: utf-8 -*-
You've got a stray byte floating around. You can find it by running
with open("x.py") as fp:
for i, line in enumerate(fp):
if "\xe2" in line:
print i, repr(line)
where you should replace "x.py" by the name of your program. You'll see the line number and the offending line(s). For example, after inserting that byte arbitrarily, I got:
4 "\xe2 lb = conn.create_load_balancer('my_lb', ['us-east-1a', 'us-east-1b'],[(80, 8080, 'http'), (443, 8443, 'tcp')])\n"
Or you could just simply use:
# coding: utf-8
at top of .py file
\xe2 is the '-' character, it appears in some copy and paste it uses a different equal looking '-' that causes encoding errors.
Replace the '-'(from copy paste) with the correct '-' (from you keyboard button).
Change the file character encoding,
put below line to top of your code always
# -*- coding: utf-8 -*-
I had the same error while copying and pasting a comment from the web
For me it was a single quote (') in the word
I just erased it and re-typed it.
Adding # coding=utf-8 line in first line of your .py file will fix the problem.
Please read more about the problem and its fix on below link, in this article problem and its solution is beautifully described : https://www.python.org/dev/peps/pep-0263/
I got this error for characters in my comments (from copying/pasting content from the web into my editor for note-taking purposes).
To resolve in Text Wrangler:
Highlight the text
Go the the Text menu
Select "Convert to ASCII"
Based on PEP 0263 -- Defining Python Source Code Encodings
Python will default to ASCII as standard encoding if no other
encoding hints are given.
To define a source code encoding, a magic comment must
be placed into the source files either as first or second
line in the file, such as:
# coding=<encoding name>
or (using formats recognized by popular editors)
#!/usr/bin/python
# -*- coding: <encoding name> -*-
or
#!/usr/bin/python
# vim: set fileencoding=<encoding name> :
I had the same issue and just added this to the top of my file (in Python 3 I didn't have the problem but do in Python 2
#!/usr/local/bin/python
# coding: latin-1
If it helps anybody, for me that happened because I was trying to run a Django implementation in python 3.4 with my python 2.7 command
I my case \xe2 was a ’ which should be replaced by '.
In general I recommend to convert UTF-8 to ASCII using e.g. https://onlineasciitools.com/convert-utf8-to-ascii
However if you want to keep UTF-8 you can use
#-*- mode: python -*-
# -*- coding: utf-8 -*-
After about a half hour of looking through stack overflow, It dawned on me that if the use of a single quote " ' " in a comment will through the error:
SyntaxError: Non-ASCII character '\xe2' in file
After looking at the traceback i was able to locate the single quote used in my comment.
I had this exact issue running the simple .py code below:
import sys
print 'version is:', sys.version
DSM's code above provided the following:
1 'print \xe2\x80\x98version is\xe2\x80\x99, sys.version'
So the issue was that my text editor used SMART QUOTES, as John Y suggested. After changing the text editor settings and re-opening/saving the file, it works just fine.
I am trying to parse that weird windows apostraphe and after trying several things here is the code snippet that works.
def convert_freaking_apostrophe(self,string):
try:
issuer_rename = string.decode('windows-1252')
except:
issuer_rename = string.decode('latin-1')
issuer_rename = issuer_rename.replace(u'’', u"'")
issuer_rename = issuer_rename.encode('ascii','ignore')
try:
os.rename(directory+"/"+issuer,directory+"/"+issuer_rename)
print "Successfully renamed "+issuer+" to "+issuer_rename
return issuer_rename
except:
pass
#HANDLING FOR FUNKY APOSTRAPHE
if re.search(r"([\x90-\xff])", issuer):
issuer = self.convert_freaking_apostrophe(issuer)
I fixed this using pycharm. At the bottom of pycharm you can see file encoding. I noticed that it is UT-8. I changed it to US-ASCII
I had the same issue but it was because I copied and pasted the string as it is.
Later when I manually typed the string as it is the error vanished.
I had the error due to the - sign. When I replaced it with manually inputting a - the error was solved.
Copied string 10 + 3 * 5/(16 − 4)
Manually typed string 10 + 3 * 5/(16 - 4)
you can clearly see there is a bit of difference between both the hyphens.
I think it's because of the different formatting used by different OS or maybe just different software.
For me the problem had caused due to "’" that symbol in the quotes. As i had copied the code from a pdf file it caused that error. I just replaced "’" by this "'".
If you want to spot what character caused this just assign the problematic variable to a string and print it in a iPython console.
In my case
In [1]: array = [[24.9, 50.5]​, [11.2, 51.0]] # Raises an error
In [2]: string = "[[24.9, 50.5]​, [11.2, 51.0]]" # Manually paste the above array here
In [3]: string
Out [3]: '[[24.9, 50.5]\xe2\x80\x8b, [11.2, 51.0]]' # Here they are!
for me, the problem was caused by typing my code into Mac Notes and then copied it from Mac Notes and pasted into my vim session to create my file. This made my single quotes the curved type. to fix it I opened my file in vim and replaced all my curved single quotes with the straight kind, just by removing and retyping the same character. It was Mac Notes that made the same key stroke produce the curved single quote.
I was unable to find what's the issue for long but later I realised that I had copied a line "UTC-12:00" from web and the hyphen/dash in this was causing the problem. I just wrote this "-" again and the problem got resolved.
So, sometimes the copy pasted lines also give errors. In such cases, just re-write the copy pasted code and it works. On re-writing, it would look like nothing got changed but the error will be gone.
Plenty of good solutions here.
One challenge not really addressed in any of them is how to visually identify certain hard-to-spot non-ASCII characters that resemble other plain ASCII ones. For example, en dashes can appear almost exactly like hyphens and curly quotes look a lot like straight quotes, depending on your text editor's font.
This one-liner, which should work on Mac or Linux, will strip characters not in the ASCII printable range and show you the differences side-by-side:
# assumes Bash shell; for Bourne shell (sh), rearrange as a pipe and
# give '-' as second argument to 'sdiff' instead
sdiff --suppress-common-lines script.py <(tr -cd '\11\12\15\40-\176' <script.py)
The characters \11, \12, and \15 are tab, newline, and carriage return, respectively, in octal; the remaining range is the visible ASCII characters. (hat tip)
Another tip gleaned from this SO thread uses an inverse character class consisting of anything not in the ASCII visible range, and highlights it:
grep --color '[^ -~]' script.py
This should also work fine with the macOS / BSD version of grep.
When I have a similar issue when reading text files i use...
f = open('file','rt', errors='ignore')

Categories

Resources