Getting "bad escape" when using nltk in py3 - python

NLTK version 3.4.5. Python 3.7.4. OSX version 10.14.5.
Upgrading the codebase from 2.7, started running into this issue just now. I've done a fresh no-cache reinstall of all packages and extensions, in a fresh virtualenv. Pretty mystified as to how this could be happening to only me and I can't find anyone else having the same error online.
(venv3) gmoss$ python
Python 3.7.4 (default, Sep 7 2019, 18:27:02)
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/site-packages/nltk/__init__.py", line 150, in <module>
from nltk.translate import *
File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/site-packages/nltk/translate/__init__.py", line 23, in <module>
from nltk.translate.meteor_score import meteor_score as meteor
File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/site-packages/nltk/translate/meteor_score.py", line 10, in <module>
from nltk.stem.porter import PorterStemmer
File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/site-packages/nltk/stem/__init__.py", line 29, in <module>
from nltk.stem.snowball import SnowballStemmer
File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/site-packages/nltk/stem/snowball.py", line 314, in <module>
class ArabicStemmer(_StandardStemmer):
File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/site-packages/nltk/stem/snowball.py", line 326, in ArabicStemmer
r'[\u064b-\u064c-\u064d-\u064e-\u064f-\u0650-\u0651-\u0652]'
File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/re.py", line 234, in compile
return _compile(pattern, flags)
File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/re.py", line 286, in _compile
p = sre_compile.compile(pattern, flags)
File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/sre_compile.py", line 764, in compile
p = sre_parse.parse(p, flags)
File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/sre_parse.py", line 930, in parse
p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/sre_parse.py", line 426, in _parse_sub
not nested and not items))
File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/sre_parse.py", line 536, in _parse
code1 = _class_escape(source, this)
File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/sre_parse.py", line 337, in _class_escape
raise source.error('bad escape %s' % escape, len(escape))
re.error: bad escape \u at position 1

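The error message says that the regular expression parser rejected the \u escape. Note, however, that since Python 3.3 the re module does accept \uXXXX in str patterns (only bytes patterns reject it), so this pattern would normally compile on 3.7.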
It's strange though that the error comes from the nltk package. The authors of that package surely know how to write regular expressions. Did you accidentally pick up the Python 2.7 version of the nltk package, even though it is installed in your 3.7 directory?
I expect that the nltk package has unit tests for all its code. I'd file a bug report against that package.
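For readers who want to check how their own interpreter handles this escape, here is a minimal sketch. It assumes Python 3.3 or later, where the re module accepts \uXXXX in str patterns but not in bytes patterns. The simplified character range and the helper name compiles are illustrative only and are not NLTK's actual code.

```python
import re

# str pattern: \uXXXX is understood by the regex parser itself (Python 3.3+).
diacritics = re.compile(r'[\u064b-\u0652]')
print(bool(diacritics.search('\u064e')))  # U+064E lies inside the range


def compiles(pattern):
    """Return True if the pattern compiles, False if re rejects it."""
    try:
        re.compile(pattern)
        return True
    except re.error as exc:
        print('rejected:', exc)
        return False


# bytes pattern: \u is not a valid escape there, so compilation fails.
print(compiles(rb'[\u064b-\u0652]'))
```

If the str pattern is rejected as well, the interpreter or its standard library files are probably not the ones you expect.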

In case anyone else runs into this, downgrading to 3.4.2 fixes the issue, as this is before the introduction of ArabicStemmer into the relevant file. I've opened an issue with nltk and hopefully it gets resolved.

To follow up, this was a false alarm: an errant cleanup script was deleting NLTK's shared object file inside my virtual environment, and I guess it was falling back to some other version.
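When an installation may be damaged like this, one way to see which file the interpreter would load is to ask the import system without actually importing the package. This is a minimal sketch: the helper name module_origin is made up for illustration, and json stands in for nltk so that the snippet runs anywhere.

```python
import importlib.util


def module_origin(name):
    """Return the path a top-level module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


print(module_origin('json'))                          # path inside the standard library
print(module_origin('definitely_not_installed_xyz'))  # no such module
```

Calling module_origin('nltk') inside the affected virtualenv should show whether the package resolves to the expected site-packages directory.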

Related

Getting error about bad escape during start of Arelle

I am trying to get Arelle working on Ubuntu linux 18.04 with Python 3.6.9.
Step-1: (Download Arelle software):
git clone https://github.com/Arelle/Arelle.git -b lxml
Step-2 Install Python LXML:
apt-get install -y python-lxml
Step-3 Install Python tk:
Due to error: 'No module named tkinter'
...I install:
apt-get install python3-tk
When it's time to start Arelle from terminal, I use:
python3 arelleGUI.pyw
I then get following error:
Traceback (most recent call last):
File "arelleGUI.pyw", line 9, in <module>
from arelle import CntlrWinMain
File "/tmp3/Arelle/arelle/CntlrWinMain.py", line 22, in <module>
from arelle import Cntlr
File "/tmp3/Arelle/arelle/Cntlr.py", line 8, in <module>
from arelle import ModelManager
File "/tmp3/Arelle/arelle/ModelManager.py", line 8, in <module>
from arelle import (ModelXbrl, Validate, DisclosureSystem)
File "/tmp3/Arelle/arelle/Validate.py", line 9, in <module>
from arelle import (ModelXbrl, ModelVersReport, XbrlConst, ModelDocument,
File "/tmp3/Arelle/arelle/ModelVersReport.py", line 9, in <module>
from arelle import (XbrlConst, XbrlUtil, XmlUtil, UrlUtil, ModelXbrl, ModelDocument, ModelVersObject)
File "/tmp3/Arelle/arelle/ModelDocument.py", line 9, in <module>
from arelle import (XbrlConst, XmlUtil, UrlUtil, ValidateFilingText, XmlValidate)
File "/tmp3/Arelle/arelle/ValidateFilingText.py", line 16, in <module>
docCheckPattern = re.compile(r"&\w+;|[^0-9A-Za-z`~!##$%&\*\(\)\.\-+ \[\]\{\}\|\\:;\"'<>,_?/=\t\n\r\m\f]") # won't match &#nnn;
File "/usr/lib/python3.6/re.py", line 233, in compile
return _compile(pattern, flags)
File "/usr/lib/python3.6/re.py", line 301, in _compile
p = sre_compile.compile(pattern, flags)
File "/usr/lib/python3.6/sre_compile.py", line 562, in compile
p = sre_parse.parse(p, flags)
File "/usr/lib/python3.6/sre_parse.py", line 855, in parse
p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
File "/usr/lib/python3.6/sre_parse.py", line 416, in _parse_sub
not nested and not items))
File "/usr/lib/python3.6/sre_parse.py", line 527, in _parse
code1 = _class_escape(source, this)
File "/usr/lib/python3.6/sre_parse.py", line 336, in _class_escape
raise source.error('bad escape %s' % escape, len(escape))
sre_constants.error: bad escape \m at position 67
I found this SO question that seems related to the issue.
This is an error in Arelle, which shows up for Python 3.6 and later. There is a pull request for it, but it is still open (since July 2017). Given that Python 3.6 has been out for quite a while, I don't know why this hasn't been fixed.
You are using the lxml branch, which has been stale for 10 years. So perhaps this error has actually been fixed (even if the pull request is still open) on the master branch, but not on the lxml branch. Try installing from master first, if that is an option for you.
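To illustrate the underlying change: from Python 3.6 on, an unknown escape made of a backslash and an ASCII letter is an error in a pattern. The sketch below uses a shortened character class rather than Arelle's full pattern, and it assumes that simply dropping \m is acceptable; the helper name try_compile is hypothetical, and what the pull request actually changes is not shown here.

```python
import re


def try_compile(pattern):
    """Return the compiled pattern, or the error message if compilation fails."""
    try:
        return re.compile(pattern)
    except re.error as exc:
        return str(exc)


# \m is not a recognised escape, so Python 3.6+ refuses the pattern.
print(try_compile(r"[\t\n\r\m\f]"))

# Without \m the class compiles; write a plain "m" if a literal m was intended.
print(try_compile(r"[\t\n\r\f]"))
```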

"sre_constants.error: nothing to repeat" error every time trying to install using pip

I just installed Python 2.7.5 on my Windows 10 machine and I also got pip installed through setuptools. However whenever I try to install something using pip e.g. pip install numpy, I get this error message:
Traceback (most recent call last):
File "C:\Python27\Scripts\pip-script.py", line 8, in <module>
load_entry_point('pip==9.0.1', 'console_scripts', 'pip')()
File "C:\Python27\lib\site-packages\pkg_resources.py", line 318, in load_entry_point
return get_distribution(dist).load_entry_point(group, name)
File "C:\Python27\lib\site-packages\pkg_resources.py", line 2221, in load_entry_point
return ep.load()
File "C:\Python27\lib\site-packages\pkg_resources.py", line 1954, in load
entry = __import__(self.module_name, globals(),globals(), ['__name__'])
File "C:\Python27\lib\site-packages\pip-9.0.1-py2.7.egg\pip\__init__.py", line 26, in <module>
from pip.utils import get_installed_distributions, get_prog
File "C:\Python27\lib\site-packages\pip-9.0.1-py2.7.egg\pip\utils\__init__.py", line 27, in <module>
from pip._vendor import pkg_resources
File "C:\Python27\lib\site-packages\pip-9.0.1-py2.7.egg\pip\_vendor\pkg_resources\__init__.py", line 73, in <module>
__import__('pip._vendor.packaging.specifiers')
File "C:\Python27\lib\site-packages\pip-9.0.1-py2.7.egg\pip\_vendor\packaging\specifiers.py", line 275, in <module>
class Specifier(_IndividualSpecifier):
File "C:\Python27\lib\site-packages\pip-9.0.1-py2.7.egg\pip\_vendor\packaging\specifiers.py", line 373, in Specifier
r"^\s*" + _regex_str + r"\s*$", re.VERBOSE | re.IGNORECASE)
File "C:\Python27\Lib\re.py", line 190, in compile
return _compile(pattern, flags)
File "C:\Python27\Lib\re.py", line 242, in _compile
raise error, v # invalid expression
sre_constants.error: nothing to repeat
Since I'm pretty new to Python, I don't really understand what this error means. Any workaround?
I had a similar issue with a library that uses regular expressions (via the re.compile() function). I solved it by installing the latest available version, 2.7.13 (I run Windows 7).
Before that I had version 2.7.6. I updated by installing the new version with the .msi installer, available on the Python website.
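As for what the message means in general: the regex parser found a repetition operator such as * with nothing in front of it to repeat. The sketch below shows that generic case on a current Python 3 interpreter. It does not reproduce the pip failure itself, which according to the answer above went away after upgrading Python; the helper name compile_error is made up for illustration.

```python
import re


def compile_error(pattern):
    """Return the error message from re.compile, or None if the pattern is valid."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None


print(compile_error(r"*abc"))  # the leading * has nothing to repeat
print(compile_error(r"a*bc"))  # valid: * repeats the preceding "a"
```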

I've installed the pywinauto package successfully with "pip install pywinauto", but importing it always fails. Why?

All,
I've installed the pywinauto package successfully with "pip install pywinauto", but importing it always fails. Why?
This is what I did:
pip install pywinauto
Then, in the Windows cmd environment, I started Python and ran:
import pywinauto
I got the following errors:
....
>>> import pywinauto
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\pywinauto\__init__.py", line 40, in <module>
from . import findwindows
File "C:\Python27\lib\site-packages\pywinauto\findwindows.py", line 42, in <module>
from . import controls
File "C:\Python27\lib\site-packages\pywinauto\controls\__init__.py", line 36,in <module>
from . import uiawrapper # register "uia" back-end (at the end of uiawrapper module)
File "C:\Python27\lib\site-packages\pywinauto\controls\uiawrapper.py", line 44, in <module>
from ..uia_defines import IUIA
File "C:\Python27\lib\site-packages\pywinauto\uia_defines.py", line 175, in <module>
pattern_ids = _build_pattern_ids_dic()
File "C:\Python27\lib\site-packages\pywinauto\uia_defines.py", line 163, in _build_pattern_ids_dic
if hasattr(IUIA().ui_automation_client, cls_name):
File "C:\Python27\lib\site-packages\pywinauto\uia_defines.py", line 50, in __call__
cls._instances[cls] = super(_Singleton, cls).__call__(*args, **kwargs)
File "C:\Python27\lib\site-packages\pywinauto\uia_defines.py", line 60, in __init__
self.UIA_dll = comtypes.client.GetModule('UIAutomationCore.dll')
File "C:\Python27\lib\site-packages\comtypes\client\_generate.py", line 97, in GetModule
tlib = comtypes.typeinfo.LoadTypeLibEx(tlib)
File "C:\Python27\lib\site-packages\comtypes\typeinfo.py", line 485, in LoadTypeLibEx
_oleaut32.LoadTypeLibEx(c_wchar_p(szFile), regkind, byref(ptl))
File "_ctypes/callproc.c", line 950, in GetResult
WindowsError: [Error -2147312566] Error loading type library/DLL
It looks like you're using an old version of Windows, such as Windows XP. MS UI Automation is included in Windows Vista and later, but you can install .NET Framework 3.0+ to make UIAutomationCore.dll available even on Windows XP. If you don't need MS UI Automation at all, just run pip uninstall comtypes and pywinauto will work with the Win32 API only.
If installing .NET Framework 3.0+ doesn't help (or if it's already installed) and you need to use Win XP, install Windows update KB971513. It solved the loading library issue for me.
I need to upgrade from WinXP to Win7, Win8, or Win10.
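A rough way to check whether the DLL is present at all is to try loading it with ctypes. This is only a sketch and only a proxy: the traceback above fails while loading the type library through comtypes, which this check does not exercise. The function name can_load_uiautomation is hypothetical.

```python
import ctypes
import sys


def can_load_uiautomation():
    """Return True or False on Windows, or None on any other platform."""
    if sys.platform != "win32":
        return None  # the DLL only exists on Windows
    try:
        ctypes.WinDLL("UIAutomationCore.dll")
        return True
    except OSError:
        return False


print(can_load_uiautomation())
```

If this reports False on Windows XP, installing the update or .NET Framework version mentioned above is the thing to try first.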

tarfile compressionerror bz2 module is not available

I'm trying to install twisted
pip install https://pypi.python.org/packages/18/85/eb7af503356e933061bf1220033c3a85bad0dbc5035dfd9a97f1e900dfcb/Twisted-16.2.0.tar.bz2#md5=8b35a88d5f1a4bfd762a008968fddabf
This is for a django-channels project, and I'm getting the following error:
Exception:
Traceback (most recent call last):
File "/home/petarp/.virtualenvs/ErasmusCloneFromGitHub/lib/python3.5/tarfile.py", line 1655, in bz2open
import bz2
File "/usr/local/lib/python3.5/bz2.py", line 22, in <module>
from _bz2 import BZ2Compressor, BZ2Decompressor
ImportError: No module named '_bz2'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/petarp/.virtualenvs/CloneFromGitHub/lib/python3.5/site-packages/pip/basecommand.py", line 215, in main
status = self.run(options, args)
File "/home/petarp/.virtualenvs/ErasmusCloneFromGitHub/lib/python3.5/site-packages/pip/commands/install.py", line 310, in run
wb.build(autobuilding=True)
File "/home/petarp/.virtualenvs/ErasmusCloneFromGitHub/lib/python3.5/site-packages/pip/wheel.py", line 750, in build
self.requirement_set.prepare_files(self.finder)
File "/home/petarp/.virtualenvs/ErasmusCloneFromGitHub/lib/python3.5/site-packages/pip/req/req_set.py", line 370, in prepare_files
ignore_dependencies=self.ignore_dependencies))
File "/home/petarp/.virtualenvs/ErasmusCloneFromGitHub/lib/python3.5/site-packages/pip/req/req_set.py", line 587, in _prepare_file
session=self.session, hashes=hashes)
File "/home/petarp/.virtualenvs/ErasmusCloneFromGitHub/lib/python3.5/site-packages/pip/download.py", line 810, in unpack_url
hashes=hashes
File "/home/petarp/.virtualenvs/ErasmusCloneFromGitHub/lib/python3.5/site-packages/pip/download.py", line 653, in unpack_http_url
unpack_file(from_path, location, content_type, link)
File "/home/petarp/.virtualenvs/ErasmusCloneFromGitHub/lib/python3.5/site-packages/pip/utils/__init__.py", line 605, in unpack_file
untar_file(filename, location)
File "/home/petarp/.virtualenvs/ErasmusCloneFromGitHub/lib/python3.5/site-packages/pip/utils/__init__.py", line 538, in untar_file
tar = tarfile.open(filename, mode)
File "/home/petarp/.virtualenvs/ErasmusCloneFromGitHub/lib/python3.5/tarfile.py", line 1580, in open
return func(name, filemode, fileobj, **kwargs)
File "/home/petarp/.virtualenvs/ErasmusCloneFromGitHub/lib/python3.5/tarfile.py", line 1657, in bz2open
raise CompressionError("bz2 module is not available")
tarfile.CompressionError: bz2 module is not available
Clearly I'm missing the bz2 module, so I tried to install it manually, but that didn't work for Python 3.5. How can I solve this?
I did what @e4c5 suggested, but for Python 3.5.1; the output is
➜ ~ python3.5
Python 3.5.1 (default, Apr 19 2016, 22:45:11)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import bz2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.5/bz2.py", line 22, in <module>
from _bz2 import BZ2Compressor, BZ2Decompressor
ImportError: No module named '_bz2'
>>>
[3] + 18945 suspended python3.5
➜ ~ dpkg -S /usr/local/lib/python3.5/bz2.py
dpkg-query: no path found matching pattern /usr/local/lib/python3.5/bz2.py
I am on Ubuntu 14.04 LTS and I have installed python 3.5 from source.
I don't seem to have any problem with import bz2 on my python 3.4 installation. So I did
import bz2
print (bz2.__file__)
And found that it's located at /usr/lib/python3.4/bz2.py then I did
dpkg -S /usr/lib/python3.4/bz2.py
This reveals:
libpython3.4-stdlib:amd64: /usr/lib/python3.4/bz2.py
Thus the following command should hopefully fix this:
apt-get install libpython3.4-stdlib
Update:
If you have compiled python 3.5 from sources, it's very likely the bz2 hasn't been compiled in. Please reinstall by first doing
./configure --with-libs='bzip'
The same applies for python 3.6 as well. Note that this will probably complain about other missing dependencies. You will have to install the missing dependencies one by one until everything is covered.
I was able to solve it by removing the _ and changing the import to
from bz2 import BZ2Compressor, BZ2Decompressor
On Ubuntu, run apt-get install libbz2-dev, then compile Python again.
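After rebuilding, a quick way to confirm that the C extension was compiled in is to try importing _bz2 directly with the rebuilt interpreter. A minimal sketch, where the helper name has_module is made up for illustration:

```python
import importlib


def has_module(name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


# False here means Python was built without the bzip2 development headers.
print(has_module("_bz2"))
```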

py2exe cannot find a module

I have a Python app that works fine. Now I use py2exe to create a Windows executable of this app; however, the resulting exe fails, complaining that it lacks the configobj module:
Traceback (most recent call last):
File "file1.py", line 1, in <module>
File "file2.pyc", line 10, in <module>
ImportError: No module named configobj
Line 10 in file2.py is merely from configobj import ConfigObj
I tried to explicitly add configobj to the list of packed modules by specifying the -i configobj argument, but then the py2exe run fails with a similar error:
running py2exe
creating C:\path\to\proj\dist
*** generate typelib stubs ***
collected 0 stubs from 1 type libraries
*** searching for required modules ***
Traceback (most recent call last):
File " C:\path\to\proj\py2exe_setup.py", line 18, in <module>
options = {"py2exe": {"typelibs": [('{00020813-0000-0000-C000-000000000046}', 0, 1, 5)]}},
File "C:\Python26\lib\distutils\core.py", line 152, in setup
dist.run_commands()
File "C:\Python26\lib\distutils\dist.py", line 975, in run_commands
self.run_command(cmd)
File "C:\Python26\lib\distutils\dist.py", line 995, in run_command
cmd_obj.run()
File "C:\Python26\lib\site-packages\py2exe\build_exe.py", line 243, in run
self._run()
File "C:\Python26\lib\site-packages\py2exe\build_exe.py", line 296, in _run
self.find_needed_modules(mf, required_files, required_modules)
File "C:\Python26\lib\site-packages\py2exe\build_exe.py", line 1297, in find_needed_modules
mf.import_hook(mod)
File "C:\Python26\lib\site-packages\py2exe\mf.py", line 719, in import_hook
return Base.import_hook(self,name,caller,fromlist,level)
File "C:\Python26\lib\site-packages\py2exe\mf.py", line 136, in import_hook
q, tail = self.find_head_package(parent, name)
File "C:\Python26\lib\site-packages\py2exe\mf.py", line 204, in find_head_package
raise ImportError, "No module named " + qname
ImportError: No module named configobj
The configobj module is installed on my computer in its default location
Python 2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import configobj
>>> print configobj.__version__
4.7.2
>>> import py2exe
C:\Python26\lib\site-packages\py2exe\build_exe.py:16: DeprecationWarning: the sets module is deprecated
import sets
>>> print py2exe.__version__
0.6.9
What am I doing wrong?
Reinstalling configobj from source fixed the problem. Damn me if I know why.
