Error when pickling ParseResult in Python 3.5.1 - python

I have test code that works in Python 2.7.11, but fails in Python 3.5.1:
import pyparsing as pp
import pickle
class Greeting():
    def __init__(self, toks):
        self.salutation = toks[0]
        self.greetee = toks[1]
word = pp.Word(pp.alphas+"'.")
salutation = pp.OneOrMore(word)
comma = pp.Literal(",")
greetee = pp.OneOrMore(word)
endpunc = pp.oneOf("! ?")
greeting = salutation + pp.Suppress(comma) + greetee + pp.Suppress(endpunc)
greeting.setParseAction(Greeting)
string = 'Good morning, Miss Crabtree!'
g = greeting.parseString(string)
pkl = 'test .pkl'
pickle.dump(g, open(pkl, 'wb'))
pickle.load(open(pkl, 'rb'))
The error message is as follows:
Traceback (most recent call last):
File "C:/Users/Arne/parser/test.py", line 23, in <module>
pickle.load(open(pkl, 'rb'))
TypeError: __new__() missing 1 required positional argument: 'toklist'
__new__() refers to pyparsing.ParseResults.__new__(cls, toklist, name=None, asList=True, modal=True ).
Is it still in general possible to pickle objects returned by pyparsing in Python 3.5.1 or has something changed?
Could somebody provide a brief code sample of their use of pickle and pyparsing 2.0.7?
My real grammar takes a few minutes to parse, so I really would appreciate being able to store the results before further processing.

This fails with protocol=2 (optional 3rd arg to pickle.dump), but passes if you use pickle protocol = 0 or 1. On Python 2.7.10, 0 is the default protocol. On Python 3.5, pickle has protocols 0-4, and again, pickling ParseResults only works with protocols 0 and 1. But in Py3.5, the default protocol has changed to 3. You can work around this problem for now by specifying a protocol of 0 or 1.
More info on pickle protocols at https://docs.python.org/3/library/pickle.html?highlight=pickle#data-stream-format
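A minimal sketch of that workaround: the protocol is passed as the third argument to pickle.dump. The helper names save_with_protocol and load_pickle are made up for illustration, and a plain list stands in for the ParseResults object so that the snippet runs without pyparsing installed.

```python
import os
import pickle
import tempfile


def save_with_protocol(obj, path, protocol=1):
    """Pickle obj to path with an explicit protocol (hypothetical helper)."""
    with open(path, "wb") as fh:
        pickle.dump(obj, fh, protocol)


def load_pickle(path):
    """Load a pickled object; the protocol is detected automatically."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Stand-in for the parse results (assumption: a plain list, not ParseResults).
tokens = ["Good", "morning", "Miss", "Crabtree"]

path = os.path.join(tempfile.mkdtemp(), "test.pkl")
save_with_protocol(tokens, path, protocol=1)
restored = load_pickle(path)
```

In the question's code, the equivalent change would be pickle.dump(g, open(pkl, 'wb'), 1). Opening the file in a with block, as above, also makes sure it is closed before it is read back.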

Why is my Discord Bot in GitHub not working?

When I run main.py it's giving me this error. I was just trying to replicate a bot from GitHub and I didn't know it would be this difficult, here is the GitHub:
https://github.com/Normynator/RagnaDBot
C:\testebot\RagnaDBot>python main.py
INFO:Config logging is enabled and set to: 10
DEBUG:Using proactor: IocpProactor
Traceback (most recent call last):
File "C:\testebot\RagnaDBot\main.py", line 29, in <module>
_settings = load.load_settings(_config)
File "C:\testebot\RagnaDBot\lib\load.py", line 11, in load_settings
document = yaml.load(document)
TypeError: load() missing 1 required positional argument: 'Loader'
load.py:
#!/usr/bin/env python3
import yaml
import logging
from lib import mvp # required for yaml
from lib.mvp import MVP
def load_settings(path):
    with open(path) as f:
        document = f.read()
    document = yaml.load(document)  # I think the problem is possibly here
    logging.debug(document)
    return document
main.py:
# Path to the config file
_config = "config.yml"
_client = discord.Client()
_settings = load.load_settings(_config)  # the problem is possibly here too
_mvp_list = load.parse_mvp_list(_settings['mvp_list'])
_channel = discord.Object(id=_settings['channel_id'])
_debug_core = False
_time_mult = 60 # .sleep works with seconds, to get minutes multiply by 60
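You are using PyYAML's old load() function, which for the longest time defaulted to possibly unsafe behavior on unchecked YAML. That is why calling load() without an explicit Loader was deprecated in PyYAML 5.1 and became an error in PyYAML 6.0, which is the TypeError you are seeing.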
If you don't have any tags in your YAML you should use:
document = yaml.safe_load(document)
If you do have tags in your YAML, you can use:
yaml.load(document, Loader=yaml.FullLoader)
Note that this requires registering classes for the tags. Few (too few) programs use tags, so try the safe_load option first to see if that works.
Please note that the recommended extension for YAML files has been .yaml since 2006.

Use dll-function in python with ctypes | python 2.7 vs. 3.6

I have a C++-Library (example.dll) with some functions and want to use them from Python. I got it working with ctypes and Python 2.7, but not with Python 3.6.
Here an excerpt from example.h :
extern "C" __declspec(dllexport) LONG _stdcall func1(LPTSTR filename, long cbAddress);
The function parses a text file, loads some data into memory, and returns a handle to this data.
My code for Python 2.7 is:
from ctypes import *
mydll = WinDLL(r'C:\temp\example.dll')
txt_path = c_char_p(r'C:\temp\file.txt')
func1 = mydll['func1']
func1.restype = c_long
func1.argtypes = (c_char_p , c_void_p)
handle = func1(txt_path, None)
This is working and returns a valid handle.
With Python 3.6, line 3 causes an error:
>>> txt_path = c_char_p(r'C:\temp\file.txt')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: bytes or integer address expected instead of str instance
so I changed it to
txt_path = c_char_p(b'C:\temp\file.txt')
which works, but leads to a problem with the function: func1 now returns 0, which is an internal error code for "could not read txt-file."
I tried to use different types, but it didn't work and I'm kind of stuck now.
Can anyone help?
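The issue was simply the escaping of backslashes (someone with a cynical point of view might say the issue is Windows). In a bytes literal without the r prefix, \t and \f are read as a tab and a form feed, so the DLL received a path that does not exist. The line in question, txt_path = c_char_p(b'C:\temp\file.txt'), had to be changed to: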
txt_path = c_char_p(b'C:\\temp\\file.txt')
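The difference between the three spellings can be checked without the DLL. This is a small sketch using only bytes literals; the variable names are chosen for illustration.

```python
# Without the r prefix, \t and \f are escape sequences (tab, form feed).
plain = b'C:\temp\file.txt'

# Doubling the backslashes keeps them as literal backslash characters.
escaped = b'C:\\temp\\file.txt'

# A raw bytes literal (rb prefix) is an alternative spelling of the same path.
raw = rb'C:\temp\file.txt'

same_path = (raw == escaped)
contains_tab = b'\t' in plain
```

Here raw and escaped hold the same 16 bytes, while plain is two bytes shorter because each backslash pair collapsed into a single control character. Either of the first two correct forms can be passed to c_char_p.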

Error when calling flickrapi.photosets.getPhotos method

I am trying to use flickrapi from @sybren on Python 3.4.
Therefore I cloned the main branch of the repo and installed the package.
Some function calls do work, but some give me this error:
Traceback (most recent call last):
File "D:\personal works\flickrWorks\flickr_derpage.py", line 20, in <module>
flickr.photosets.getPhotos(set_id)
TypeError: __call__() takes 1 positional argument but 2 were given
The call to the function is this one:
import flickrapi
import xml.etree.ElementTree as ET
# config stuff
api_key = 'fuhsdkjfsdjkfsjk'
api_secret = 'fdjksnfkjsdnfkj'
user_tbp_dev = "fednkjfnsdjkfnjksdn5"
# le program
flickr = flickrapi.FlickrAPI(api_key, api_secret)
sets = flickr.photosets.getList(user_id=user_tbp_dev)
set0 = sets.find('photosets').findall('photoset')
set_id = set0[0].get('id')
sett_photos = flickr.photosets.getPhotos(set_id)
print(ET.dump(sett_photos))
Another method which gives the same error is:
flickr.reflection.getMethodInfo("flickr.photos.search")
Any ideas what I might be doing wrong, or whether the library has some issues (as the python3 branch is still under development)?
Thanks!
The flickrapi expects the parameters to the functions to be named arguments, not positional. This works for me:
flickr.photosets_getPhotos(photoset_id=set_id, extras="license, date_upload, date_taken")
To get a list of the argument names for the Flickr calls, see the documentation here: https://www.flickr.com/services/api/flickr.photosets.getPhotos.html
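The error message itself can be reproduced with any callable whose __call__ accepts only keyword arguments. The class below is a hypothetical stand-in written for illustration, not flickrapi's actual implementation:

```python
class MethodProxy:
    """Hypothetical stand-in for an API method object (not flickrapi code)."""

    def __init__(self, name):
        self.name = name

    def __call__(self, **params):
        # Only named arguments are accepted; self is the single positional one.
        return {"method": self.name, **params}


get_photos = MethodProxy("flickr.photosets.getPhotos")

try:
    get_photos("12345")          # positional argument, as in the question
except TypeError as exc:
    error = str(exc)

result = get_photos(photoset_id="12345")   # named argument works
```

The "1 positional argument" in the message is self, which is why passing set_id positionally counts as a second one.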

pylint on in-memory file/stream

I'd like to embed pylint in a program. The user enters python programs (in Qt, in a QTextEdit, although not relevant) and in the background I call pylint to check the text he enters. Finally, I print the errors in a message box.
There are thus two questions: First, how can I do this without writing the entered text to a temporary file and giving it to pylint ? I suppose at some point pylint (or astroid) handles a stream and not a file anymore.
And, more importantly, is it a good idea? Would it cause problems with imports or anything else? Intuitively I would say no, since it seems to spawn a new process (with epylint), but I'm no Python expert so I'm really not sure. And if I use this to launch pylint, is that okay too?
Edit:
I tried tinkering with pylint's internals, even fought with it, but finally got stuck at some point.
Here is the code so far:
from astroid.builder import AstroidBuilder
from astroid.exceptions import AstroidBuildingException
from logilab.common.interface import implements
from pylint.interfaces import IRawChecker, ITokenChecker, IAstroidChecker
from pylint.lint import PyLinter
from pylint.reporters.text import TextReporter
from pylint.utils import PyLintASTWalker
class Validator():
    def __init__(self):
        self._messagesBuffer = InMemoryMessagesBuffer()
        self._validator = None
        self.initValidator()

    def initValidator(self):
        self._validator = StringPyLinter(reporter=TextReporter(output=self._messagesBuffer))
        self._validator.load_default_plugins()
        self._validator.disable('W0704')
        self._validator.disable('I0020')
        self._validator.disable('I0021')
        self._validator.prepare_import_path([])

    def destroyValidator(self):
        self._validator.cleanup_import_path()

    def check(self, string):
        return self._validator.check(string)


class InMemoryMessagesBuffer():
    def __init__(self):
        self.content = []

    def write(self, st):
        self.content.append(st)

    def messages(self):
        return self.content

    def reset(self):
        self.content = []


class StringPyLinter(PyLinter):
    """Does what PyLinter does but sets checkers once
    and redefines get_astroid to call build_string"""

    def __init__(self, options=(), reporter=None, option_groups=(), pylintrc=None):
        super(StringPyLinter, self).__init__(options, reporter, option_groups, pylintrc)
        self._walker = None
        self._used_checkers = None
        self._tokencheckers = None
        self._rawcheckers = None
        self.initCheckers()

    def __del__(self):
        self.destroyCheckers()

    def initCheckers(self):
        self._walker = PyLintASTWalker(self)
        self._used_checkers = self.prepare_checkers()
        self._tokencheckers = [c for c in self._used_checkers if implements(c, ITokenChecker)
                               and c is not self]
        self._rawcheckers = [c for c in self._used_checkers if implements(c, IRawChecker)]
        # notify global begin
        for checker in self._used_checkers:
            checker.open()
            if implements(checker, IAstroidChecker):
                self._walker.add_checker(checker)

    def destroyCheckers(self):
        self._used_checkers.reverse()
        for checker in self._used_checkers:
            checker.close()

    def check(self, string):
        modname = "in_memory"
        self.set_current_module(modname)
        astroid = self.get_astroid(string, modname)
        self.check_astroid_module(astroid, self._walker, self._rawcheckers, self._tokencheckers)
        self._add_suppression_messages()
        self.set_current_module('')
        self.stats['statement'] = self._walker.nbstatements

    def get_astroid(self, string, modname):
        """return an astroid representation for a module"""
        try:
            return AstroidBuilder().string_build(string, modname)
        except SyntaxError as ex:
            self.add_message('E0001', line=ex.lineno, args=ex.msg)
        except AstroidBuildingException as ex:
            self.add_message('F0010', args=ex)
        except Exception as ex:
            import traceback
            traceback.print_exc()
            self.add_message('F0002', args=(ex.__class__, ex))


if __name__ == '__main__':
    code = """
a = 1
print(a)
"""
    validator = Validator()
    print(validator.check(code))
The traceback is the following:
Traceback (most recent call last):
File "validator.py", line 16, in <module>
main()
File "validator.py", line 13, in main
print(validator.check(code))
File "validator.py", line 30, in check
self._validator.check(string)
File "validator.py", line 79, in check
self.check_astroid_module(astroid, self._walker, self._rawcheckers, self._tokencheckers)
File "c:\Python33\lib\site-packages\pylint\lint.py", line 659, in check_astroid_module
tokens = tokenize_module(astroid)
File "c:\Python33\lib\site-packages\pylint\utils.py", line 103, in tokenize_module
print(module.file_stream)
AttributeError: 'NoneType' object has no attribute 'file_stream'
# And sometimes this is added :
File "c:\Python33\lib\site-packages\astroid\scoped_nodes.py", line 251, in file_stream
return open(self.file, 'rb')
OSError: [Errno 22] Invalid argument: '<?>'
I'll continue digging tomorrow. :)
I got it running.
The first one (NoneType …) is really easy and is a bug in your code:
When get_astroid encounters an exception, it “fails” quietly, i.e. it adds one error message and implicitly returns None, and check then passes that None on to check_astroid_module.
But for the second one… such bullshit in pylint’s/logilab’s API… Let me explain: Your astroid object here is of type astroid.scoped_nodes.Module.
It’s also created by a factory, AstroidBuilder, which sets astroid.file = '<?>'.
Unfortunately, the Module class has following property:
@property
def file_stream(self):
    if self.file is not None:
        return open(self.file, 'rb')
    return None
And there’s no way to skip that except for subclassing (Which would render us unable to use the magic in AstroidBuilder), so… monkey patching!
We replace the ill-defined property with one that checks an instance for a reference to our code bytes (e.g. astroid._file_bytes) before engaging in above default behavior.
from io import BytesIO

def _monkeypatch_module(module_class):
    if module_class.file_stream.fget.__name__ == 'file_stream_patched':
        return  # only patch if patch isn’t already applied
    old_file_stream_fget = module_class.file_stream.fget
    def file_stream_patched(self):
        # prefer the in-memory source if it was attached to this instance
        if hasattr(self, '_file_bytes'):
            return BytesIO(self._file_bytes)
        return old_file_stream_fget(self)
    module_class.file_stream = property(file_stream_patched)
That monkeypatching can be called just before calling check_astroid_module. But one more thing has to be done. See, there’s more implicit behavior: Some checkers expect and use astroid’s file_encoding field. So we now have this code in the middle of check:
astroid = self.get_astroid(string, modname)
if astroid is not None:
    _monkeypatch_module(astroid.__class__)
    astroid._file_bytes = string.encode('utf-8')
    astroid.file_encoding = 'utf-8'
    self.check_astroid_module(astroid, self._walker, self._rawcheckers, self._tokencheckers)
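The patching technique can be tried in isolation, without pylint or astroid installed. In this sketch the Module class is a hypothetical stand-in that only copies the file_stream property quoted above; it is not the real astroid class.

```python
from io import BytesIO


class Module:
    """Hypothetical stand-in for astroid.scoped_nodes.Module."""
    file = '<?>'

    @property
    def file_stream(self):
        if self.file is not None:
            return open(self.file, 'rb')
        return None


def _monkeypatch_module(module_class):
    if module_class.file_stream.fget.__name__ == 'file_stream_patched':
        return  # already patched, nothing to do
    old_file_stream_fget = module_class.file_stream.fget

    def file_stream_patched(self):
        # Serve the in-memory bytes when the instance carries them.
        if hasattr(self, '_file_bytes'):
            return BytesIO(self._file_bytes)
        return old_file_stream_fget(self)

    module_class.file_stream = property(file_stream_patched)


node = Module()
_monkeypatch_module(Module)
_monkeypatch_module(Module)      # second call is a no-op
node._file_bytes = "a = 1\n".encode('utf-8')
data = node.file_stream.read()
```

Because the property is replaced on the class, instances created before the patch (like node here) pick up the new behavior as well.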
One could say that no amount of linting creates actually good code. Unfortunately pylint unites enormous complexity with a specialization of calling it on files. Really good code has a nice native API and wraps that with a CLI interface. Don’t ask me why file_stream exists if internally, Module gets built from but forgets the source code.
PS: I had to change something else in your code: load_default_plugins has to come before some other stuff (maybe prepare_checkers, maybe something else).
PPS: I suggest subclassing BaseReporter and using that instead of your InMemoryMessagesBuffer.
PPPS: this just got pulled (3.2014), and will fix this: https://bitbucket.org/logilab/astroid/pull-request/15/astroidbuilderstring_build-was/diff
4PS: this is now in the official version, so no monkey patching required: astroid.scoped_nodes.Module now has a file_bytes property (without leading underscore).
Working with an unlocatable stream may definitely cause problems in the case of relative imports, since the location is then needed to find the module that is actually imported.
Astroid supports building an AST from a stream, but this is not used/exposed through Pylint, which is a level higher and designed to work with files. So while you may achieve this, it will need a bit of digging into the low-level APIs.
The easiest way is definitely to save the buffer to a file and then use the SA answer to start pylint programmatically if you wish (totally forgot this other account of mine found in other responses ;). Another option is to write a custom reporter to gain more control.

Is the copyreg module being used properly here?

Can anyone tell me why my code and function serialization handlers are not working below? The copyreg module is fairly unfamiliar to me, and it is not clear if the code below is written properly.
>>> import pickle, copyreg, types, marshal
>>> def average(*args):
        return sum(args) / len(args)
>>> average_dump = pickle.dumps(average)
>>> del average
>>> average = pickle.loads(average_dump)
Traceback (most recent call last):
File "<pyshell#31>", line 1, in <module>
average = pickle.loads(average_dump)
AttributeError: 'module' object has no attribute 'average'
>>> copyreg.pickle(types.CodeType,
                   lambda code: (marshal.loads, (marshal.dumps(code),)),
                   marshal.loads)
>>> up = lambda co, ns, de, cl: types.FunctionType(co, globals(), ns, de, cl)
>>> copyreg.pickle(types.FunctionType,
                   lambda function: (up, (function.__code__,
                                          function.__name__,
                                          function.__defaults__,
                                          function.__closure__)),
                   up)
>>> def average(*args):
        return sum(args) / len(args)
>>> average_dump
b'\x80\x03c__main__\naverage\nq\x00.'
>>> pickle.dumps(average)
b'\x80\x03c__main__\naverage\nq\x00.'
>>> del average; average = pickle.loads(average_dump)
Traceback (most recent call last):
File "<pyshell#39>", line 1, in <module>
del average; average = pickle.loads(average_dump)
AttributeError: 'module' object has no attribute 'average'
My expectation is that if the registered functions were working properly, then both code and function objects would be serialized. If that worked as expected, unpickling functions would also be possible.
Edit: subclassing Pickler as suggested in this answer does not seem to help either. The function from the example is still being serialized by name instead of through the handlers registered with the copyreg module.
>>> import pickle, copyreg, types, marshal, io
>>> copyreg.pickle(types.CodeType,
                   lambda code: (marshal.loads, (marshal.dumps(code),)),
                   marshal.loads)
>>> up = lambda co, ns, de, cl: types.FunctionType(co, globals(), ns, de, cl)
>>> copyreg.pickle(types.FunctionType,
                   lambda function: (up, (function.__code__,
                                          function.__name__,
                                          function.__defaults__,
                                          function.__closure__)),
                   up)
>>> class MyPickler(pickle.Pickler):
        def __init__(self, *args):
            super().__init__(*args)
            self.dispatch_table = copyreg.dispatch_table
>>> def average(*args):
        return sum(args) / len(args)
>>> x = io.BytesIO(); y = MyPickler(x)
>>> y.dump(average)
>>> x.getvalue()
b'\x80\x03c__main__\naverage\nq\x00.'
If you want to serialize functions, please run the following command:
pip install dill
Once done, you can import dill and use it in place of the pickle module.
If you are running Python 3 and want easy access to pip, put a batch file in your Windows directory named pip.bat and put the following line in it (assumes you have a proxy that interferes with SSL):
py -3 -m pip %* --trusted-host pypi.org --trusted-host files.pythonhosted.org
Functions can't be pickled by value:
Note that functions (built-in and user-defined) are pickled by “fully qualified” name reference, not by value. This means that only the function name is pickled, along with the name of the module the function is defined in. Neither the function’s code, nor any of its function attributes are pickled. Thus the defining module must be importable in the unpickling environment, and the module must contain the named object, otherwise an exception will be raised.
(http://docs.python.org/2/library/pickle.html#what-can-be-pickled-and-unpickled)
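The by-reference behavior described in that quote can be observed with a function from the standard library. This is a small sketch; statistics.mean is chosen only because it is importable everywhere, and any importable function would behave the same way.

```python
import pickle
import statistics

# Pickling a function stores only its module and name, not its code.
payload = pickle.dumps(statistics.mean)

# Unpickling imports the module and looks the name up again.
restored = pickle.loads(payload)

stores_module_name = b"statistics" in payload
stores_function_name = b"mean" in payload
is_same_object = restored is statistics.mean
```

The payload is only a few dozen bytes long and unpickling returns the very same function object, which is why deleting average from __main__ before loading raises AttributeError in the question.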
