pydev multithread debugging - python

I'm trying to debug an application which makes use of the pynetdicom library. I'm not sure how relevant that specific detail is; what IS relevant is that it makes heavy use of multithreading to run background socket listener tasks without blocking the main thread. The storescp.py example can be used to reproduce this.
Whenever I place a breakpoint that gets encountered (regardless of what thread, main or child, it gets encountered in), I get the following traceback:
Traceback (most recent call last):
File "/Applications/Aptana Studio 3/plugins/org.python.pydev_2.7.0.2013012902/pysrc/pydevd.py", line 1397, in <module>
debugger.run(setup['file'], None, None)
File "/Applications/Aptana Studio 3/plugins/org.python.pydev_2.7.0.2013012902/pysrc/pydevd.py", line 1090, in run
pydev_imports.execfile(file, globals, locals) #execute the script
File "/Users/alexw/Development/Python/kreport2/KReport2/dicomdatascraper.py", line 183, in <module>
oldDicomList = copy.copy(newData)
File "/Users/alexw/Development/Python/kreport2/KReport2/dicomdatascraper.py", line 183, in <module>
oldDicomList = copy.copy(newData)
File "/Applications/Aptana Studio 3/plugins/org.python.pydev_2.7.0.2013012902/pysrc/pydevd_frame.py", line 135, in trace_dispatch
self.doWaitSuspend(thread, frame, event, arg)
File "/Applications/Aptana Studio 3/plugins/org.python.pydev_2.7.0.2013012902/pysrc/pydevd_frame.py", line 25, in doWaitSuspend
self._args[0].doWaitSuspend(*args, **kwargs)
File "/Applications/Aptana Studio 3/plugins/org.python.pydev_2.7.0.2013012902/pysrc/pydevd.py", line 832, in doWaitSuspend
self.processInternalCommands()
File "/Applications/Aptana Studio 3/plugins/org.python.pydev_2.7.0.2013012902/pysrc/pydevd.py", line 360, in processInternalCommands
thread_id = GetThreadId(t)
File "/Applications/Aptana Studio 3/plugins/org.python.pydev_2.7.0.2013012902/pysrc/pydevd_constants.py", line 140, in GetThreadId
return thread.__pydevd_id__
File "/Users/alexw/.virtualenvs/kreport2dev/devlibs/pynetdicom/source/netdicom/applicationentity.py", line 73, in __getattr__
obj = eval(attr)()
File "<string>", line 1, in <module>
NameError: name '__pydevd_id__' is not defined
My thought is that, to make things work, PyDev monkey-patches a __pydevd_id__ attribute into spawned threads, but fails to patch it into these threads because they are instances of a subclass rather than direct instances of threading.Thread (in this case, the worker is an instance of class Association(threading.Thread):).
Of course, I don't know PyDev well enough to confirm this theory, or else fix it. And it seems neither does the internet.
Is subclassing Thread such a rarely used pattern that it's simply not considered in the PyDev architecture? Without re-architecting the library, how could this issue be remedied?

I simply needed to look harder at that traceback.
The pynetdicom library, in its subclassing of threading.Thread, overrode __getattr__ and somewhat broke it. The problem was:
def __getattr__(self, attr):
    #while not self.AssociationEstablished:
    #    time.sleep(0.001)
    obj = eval(attr)
    # do some stuff
    return obj
When a nonexistent attribute is requested, eval raises a NameError. This isn't caught by PyDev's monkey-patching routine, which expects the standard behaviour (roughly: if reading thread.__pydevd_id__ raises AttributeError, then do thread.__pydevd_id__ = stuff).
The solution was to update that section thusly:
def __getattr__(self, attr):
    #while not self.AssociationEstablished:
    #    time.sleep(0.001)
    try:
        obj = eval(attr)
    except NameError:
        raise AttributeError
    # do some stuff
    return obj
This intercepts the NameError and raises an AttributeError instead, as __getattr__ should if the queried attribute doesn't exist.
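To see the failure mode outside pynetdicom, here is a minimal, self-contained sketch (ensure_id is a hypothetical stand-in for PyDev's patching routine, not its actual code):

class Broken(object):
    def __getattr__(self, attr):
        return eval(attr)  # unknown names raise NameError, not AttributeError

class Fixed(object):
    def __getattr__(self, attr):
        try:
            return eval(attr)
        except NameError:
            raise AttributeError(attr)

def ensure_id(thread, value):
    # "Attribute missing" must be signalled by AttributeError;
    # anything else propagates up to the caller.
    try:
        thread.__pydevd_id__
    except AttributeError:
        thread.__pydevd_id__ = value

ensure_id(Fixed(), 42)   # works: NameError is translated to AttributeError
ensure_id(Broken(), 42)  # NameError escapes, just like in the traceback above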

Related

Multiprocessing using Pool class in Python giving Pickling error

I am trying to run a simple multiprocessing example in Python 3.6 in a Zeppelin notebook (in Windows), but I am not able to execute it. Below is the code that I used:
from multiprocessing import Pool

def sqrt(x):
    return x**0.5

numbers = [i for i in range(1000000)]
with Pool() as pool:
    sqrt_ls = pool.map(sqrt, numbers)
After running this code I am getting the following error:
Traceback (most recent call last):
File "/tmp/zeppelin_python-3196160128578820301.py", line 315, in <module>
exec(code, _zcUserQueryNameSpace)
File "<stdin>", line 6, in <module>
File "/usr/lib64/python3.6/multiprocessing/pool.py", line 266, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/usr/lib64/python3.6/multiprocessing/pool.py", line 644, in get
raise self._value
File "/usr/lib64/python3.6/multiprocessing/pool.py", line 424, in _handle_tasks
put(task)
File "/usr/lib64/python3.6/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/usr/lib64/python3.6/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function sqrt at 0x7f6f84f1a620>: attribute lookup sqrt on __main__ failed
I am not sure if it's just me facing this issue, as I have seen many articles where people run this code without trouble. If you know the solution, please help.
Thanks
From the multiprocessing documentation:
Note: Functionality within this package requires that the main module be importable by the children. This is covered in Programming guidelines however it is worth pointing out here. This means that some examples, such as the Pool examples will not work in the interactive interpreter.
Notebooks run a Python interactive interpreter behind the scenes, which is probably why you get this error. You can try running your code from within an if __name__ == '__main__': block.
A Zeppelin notebook does not emulate a normal module well enough to support the pickling that is used to identify the correct operation to another process. You can put all the functions you want to call into a proper module that you import in the usual fashion.
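For example, a sketch of that approach (the module name mymath.py is hypothetical). First, put the worker function in a real module:

# mymath.py -- a regular, importable module holding the worker function
def sqrt(x):
    return x ** 0.5

Then import it where you create the pool:

from multiprocessing import Pool
from mymath import sqrt  # picklable now: children can look up mymath.sqrt

if __name__ == '__main__':
    numbers = list(range(1000000))
    with Pool() as pool:
        sqrt_ls = pool.map(sqrt, numbers)

Pickling a function records only its qualified name, so the worker processes must be able to import the same module and find sqrt there; a function defined in a notebook cell lives in a __main__ that the workers can't reconstruct.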

AttributeError: 'str' object has no attribute 'errno'

I placed a ClientConnectionError exception in a multiprocessing.Queue that was generated by asyncio. I did this to pass an exception generated in asyncio land back to a client in another thread/process.
My assumption is that this exception occurred during the deserialization process reading the exception out of the queue. It looks pretty much impossible to reach otherwise.
Traceback (most recent call last):
File "model_neural_simplified.py", line 318, in <module>
main(**arg_parser())
File "model_neural_simplified.py", line 314, in main
globals()[command](**kwargs)
File "model_neural_simplified.py", line 304, in predict
next_neural_data, next_sample = reader.get_next_result()
File "/project_neural_mouse/src/asyncs3/s3reader.py", line 174, in get_next_result
result = future.result()
File "/usr/lib/python3.6/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/usr/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "model_neural_simplified.py", line 245, in read_sample
f_bytes = s3f.read(read_size)
File "/project_neural_mouse/src/asyncs3/s3reader.py", line 374, in read
size, b = self._issue_request(S3Reader.READ, (self.url, size, self.position))
File "/project_neural_mouse/src/asyncs3/s3reader.py", line 389, in _issue_request
response = self.communication_channels[uuid].get()
File "/usr/lib/python3.6/multiprocessing/queues.py", line 113, in get
return _ForkingPickler.loads(res)
File "/usr/local/lib/python3.6/dist-packages/aiohttp/client_exceptions.py", line 133, in __init__
super().__init__(os_error.errno, os_error.strerror)
AttributeError: 'str' object has no attribute 'errno'
I figure it's a long shot to ask, but does anyone know anything about this issue?
Python 3.6.8, aiohttp.__version__ == 3.6.0
Update:
I managed to reproduce the issue (credit to Samuel in the comments for improving the minimal reproducible test case, and later to xtreak at bugs.python.org for further distilling it to a pickle-only test case):
import pickle

ose = OSError(1, 'unittest')

class SubOSError(OSError):
    def __init__(self, foo, os_error):
        super().__init__(os_error.errno, os_error.strerror)

cce = SubOSError(1, ose)
cce_pickled = pickle.dumps(cce)
pickle.loads(cce_pickled)
./python.exe ../backups/bpo38254.py
Traceback (most recent call last):
File "/Users/karthikeyansingaravelan/stuff/python/cpython/../backups/bpo38254.py", line 12, in <module>
pickle.loads(cce_pickled)
File "/Users/karthikeyansingaravelan/stuff/python/cpython/../backups/bpo38254.py", line 8, in __init__
super().__init__(os_error.errno, os_error.strerror)
AttributeError: 'str' object has no attribute 'errno'
References:
https://github.com/aio-libs/aiohttp/issues/4077
https://bugs.python.org/issue38254
OSError has a custom __reduce__ implementation; unfortunately, it's not friendly to subclasses whose constructors don't accept the same arguments. You can see the intermediate state of the pickling by calling __reduce__ manually:
>>> SubOSError.__reduce__(cce)
(modulename.SubOSError, (1, 'unittest'))
The first element of the tuple is the callable to call, the second is the tuple of arguments to pass. So when it tries to recreate your class, it does:
modulename.SubOSError(1, 'unittest')
having lost the information about the OSError the instance was originally created with.
If you must accept arguments that don't match what OSError.__reduce__/OSError.__init__ expects, you're going to need to write your own __reduce__ override to ensure the correct information is pickled. A simple version might be:
class SubOSError(OSError):
    def __init__(self, foo, os_error):
        self.foo = foo  # Must preserve information for pickling later
        super().__init__(os_error.errno, os_error.strerror)

    def __reduce__(self):
        # Pickle as type plus tuple of args expected by type
        return type(self), (self.foo, OSError(*self.args))
With that design, SubOSError.__reduce__(cce) would now return:
(modulename.SubOSError, (1, PermissionError(1, 'unittest')))
where the second element of the tuple is the correct arguments needed to recreate the instance (the change from OSError to PermissionError is expected; OSError actually returns its own subclasses based on the errno).
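With that design, the original repro round-trips cleanly. A quick sanity check, reusing the names from the snippet above:

import pickle

cce = SubOSError(1, OSError(1, 'unittest'))
restored = pickle.loads(pickle.dumps(cce))  # no AttributeError this time
print(restored.foo, restored.args)          # 1 (1, 'unittest')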
This issue was fixed and merged to aiohttp's master on 25 Sep 2019. I'll update this answer if I note which released version the fix lands in (feel free to edit this answer to add that).
GitHub issue with the fix:
https://github.com/aio-libs/aiohttp/issues/4077

Tweepy issues with twitter bot and python

I have a few Twitter bots that I run on my Raspberry Pi. I have most functions wrapped in a try/except to ensure that if something errors, it doesn't break the program and execution continues.
I'm also using Tweepy's streaming module as my way of monitoring the tags that I want the bot to retweet.
Here is an issue that happens that kills the program although I have the main function wrapped in a try/except:
Unhandled exception in thread started by <function startBot5 at 0x762fbed0>
Traceback (most recent call last):
File "TwitButter.py", line 151, in startBot5
'<botnamehere>'
File "/home/pi/twitter/bots/TwitBot.py", line 49, in __init__
self.startFiltering(trackList)
File "/home/pi/twitter/bots/TwitBot.py", line 54, in startFiltering
self.myStream.filter(track=tList)
File "/usr/local/lib/python3.4/dist-packages/tweepy/streaming.py", line 445, in filter
self._start(async)
File "/usr/local/lib/python3.4/dist-packages/tweepy/streaming.py", line 361, in _start
self._run()
File "/usr/local/lib/python3.4/dist-packages/tweepy/streaming.py", line 294, in _run
raise exception
File "/usr/local/lib/python3.4/dist-packages/tweepy/streaming.py", line 263, in _run
self._read_loop(resp)
File "/usr/local/lib/python3.4/dist-packages/tweepy/streaming.py", line 313, in _read_loop
line = buf.read_line().strip()
AttributeError: 'NoneType' object has no attribute 'strip'
My setup:
I have a parent class in TwitButter.py that creates objects from TwitBot.py. These objects are the bots, and each is started on its own thread so they can run independently.
I have a function in the TwitBot that runs the startFiltering() function. It is wrapped in a try/except, but my except code is never triggered.
My guess is that the error is occurring within the Streaming library. Maybe that library is poorly coded and breaks on the line that is specified at the bottom of the traceback.
Any help would be awesome, and I wonder if others have experienced this issue?
I can provide extra details if needed.
Thanks!!!
This actually is a problem in tweepy that was fixed by GitHub #870 in April 2017, so it should be resolved by updating your local copy to the latest master.
What I did to discover that:
Did a web search to find the tweepy source repo.
Looked at streaming.py for context on the last traceback lines.
Noticed that the most recent change to the file addressed this very problem.
I'll also note that most of the time you get a traceback from deep inside a Python library, the problem comes from the code calling it incorrectly, rather than a bug in the library. But not always. :)

Error with multiprocessing, atexit and global data

Sorry in advance, this is going to be long ...
Possibly related:
Python Multiprocessing atexit Error "Error in atexit._run_exitfuncs"
Definitely related:
python parallel map (multiprocessing.Pool.map) with global data
Keyboard Interrupts with python's multiprocessing Pool
Here's a "simple" script I hacked together to illustrate my problem...
import time
import multiprocessing as multi
import atexit

cleanup_stuff = multi.Manager().list([])

##################################################
# Some code to allow keyboard interrupts
##################################################
was_interrupted = multi.Manager().list([])

class _interrupt(object):
    """
    Toy class to allow retrieval of the interrupt that triggered its execution
    """
    def __init__(self, interrupt):
        self.interrupt = interrupt

def interrupt():
    was_interrupted.append(1)

def interruptable(func):
    """
    Decorator to allow functions to be "interruptable" by
    a keyboard interrupt when in python's multiprocessing.Pool.map.
    **Note**, this won't actually cause the map to be interrupted;
    it will merely cause the following functions to not be executed.
    """
    def newfunc(*args, **kwargs):
        try:
            if not was_interrupted:
                return func(*args, **kwargs)
            else:
                return False
        except KeyboardInterrupt as e:
            interrupt()
            return _interrupt(e)  # If we really want to know about the interrupt...
    return newfunc

@atexit.register
def cleanup():
    for i in cleanup_stuff:
        print(i)
    return

@interruptable
def func(i):
    print(i)
    cleanup_stuff.append(i)
    time.sleep(float(i)/10.)
    return i

# Must wrap func here, otherwise it won't be found in __main__'s dict
# (maybe because it was created dynamically using the decorator?)
def wrapper(*args):
    return func(*args)

if __name__ == "__main__":
    # This is an attempt to use signals -- I also attempted something similar where
    # the signals were only caught in the child processes... or only on the main process...
    #
    #import signal
    #def onSigInt(*args): interrupt()
    #signal.signal(signal.SIGINT, onSigInt)

    # Try 2 with signals (only catch signal on main process)
    #import signal
    #def onSigInt(*args): interrupt()
    #signal.signal(signal.SIGINT, onSigInt)
    #def startup(): signal.signal(signal.SIGINT, signal.SIG_IGN)
    #p = multi.Pool(processes=4, initializer=startup)

    # Try 3 with signals (only catch signal on child processes)
    #import signal
    #def onSigInt(*args): interrupt()
    #signal.signal(signal.SIGINT, signal.SIG_IGN)
    #def startup(): signal.signal(signal.SIGINT, onSigInt)
    #p = multi.Pool(processes=4, initializer=startup)

    p = multi.Pool(4)
    try:
        out = p.map(wrapper, range(30))
        #out = p.map_async(wrapper, range(30)).get()  # This doesn't work either...
        # The following lines don't work either
        # (effectively trying to roll my own p.map() with p.apply_async)
        #results = [p.apply_async(wrapper, args=(i,)) for i in range(30)]
        #out = [r.get() for r in results]
    except KeyboardInterrupt:
        print("Hello!")
        out = None
    finally:
        p.terminate()
        p.join()
    print(out)
This works just fine if no KeyboardInterrupt is raised. However, if I raise one, the following exception occurs:
10
7
9
12
^CHello!
None
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "test.py", line 58, in cleanup
for i in cleanup_stuff:
File "<string>", line 2, in __getitem__
File "/usr/lib/python2.6/multiprocessing/managers.py", line 722, in _callmethod
self._connect()
File "/usr/lib/python2.6/multiprocessing/managers.py", line 709, in _connect
conn = self._Client(self._token.address, authkey=self._authkey)
File "/usr/lib/python2.6/multiprocessing/connection.py", line 143, in Client
c = SocketClient(address)
File "/usr/lib/python2.6/multiprocessing/connection.py", line 263, in SocketClient
s.connect(address)
File "<string>", line 1, in connect
error: [Errno 2] No such file or directory
Error in sys.exitfunc:
Traceback (most recent call last):
File "/usr/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "test.py", line 58, in cleanup
for i in cleanup_stuff:
File "<string>", line 2, in __getitem__
File "/usr/lib/python2.6/multiprocessing/managers.py", line 722, in _callmethod
self._connect()
File "/usr/lib/python2.6/multiprocessing/managers.py", line 709, in _connect
conn = self._Client(self._token.address, authkey=self._authkey)
File "/usr/lib/python2.6/multiprocessing/connection.py", line 143, in Client
c = SocketClient(address)
File "/usr/lib/python2.6/multiprocessing/connection.py", line 263, in SocketClient
s.connect(address)
File "<string>", line 1, in connect
socket.error: [Errno 2] No such file or directory
Interestingly enough, the code does exit the Pool.map function without calling any of the additional functions ... The problem seems to be that the KeyboardInterrupt isn't handled properly at some point, but it is a little confusing where that is, and why it isn't handled in interruptable. Thanks.
Note, the same problem happens if I use out=p.map_async(wrapper,range(30)).get()
EDIT 1
A little closer ... If I enclose the out=p.map(...) in a try,except,finally clause, it gets rid of the first exception ... the other ones are still raised in atexit however. The code and traceback above have been updated.
EDIT 2
Something else that does not work has been added to the code above as a comment. (Same error). This attempt was inspired by:
http://jessenoller.com/2009/01/08/multiprocessingpool-and-keyboardinterrupt/
EDIT 3
Another failed attempt using signals added to the code above.
EDIT 4
I have figured out how to restructure my code so that the above is no longer necessary. In the (unlikely) event that someone stumbles upon this thread with the same use-case that I had, I will describe my solution ...
Use Case
I have a function which generates temporary files using the tempfile module, and I would like those temporary files to be cleaned up when the program exits. My initial attempt was to pack each temporary file name into a list and then delete all the elements of the list with a function registered via atexit.register. The problem was that the list was not being updated across multiple processes, which is where I got the idea of using multiprocessing.Manager to manage the list data. Unfortunately, that fails on a KeyboardInterrupt no matter how hard I tried, because the communication sockets between the processes get broken.

The solution turned out to be simple. Prior to using multiprocessing, set the temporary file directory (something like tempfile.tempdir = tempfile.mkdtemp()) and then register a function to delete that temporary directory. Each of the processes writes into the same temporary directory, so it works. Of course, this solution only applies where the shared data is a list of files that needs to be deleted at the end of the program's life.
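For reference, a minimal sketch of that final approach (the names here are illustrative, and it assumes the worker processes are forked so they inherit the setting):

import atexit
import shutil
import tempfile

# Create one shared scratch directory *before* any workers are spawned;
# forked children inherit tempfile.tempdir, so all their temporary
# files land in the same place.
tempfile.tempdir = tempfile.mkdtemp()

@atexit.register
def _remove_scratch_dir():
    # Runs in the main process at exit; deletes every temporary file in
    # one shot, with no Manager socket needed during shutdown.
    shutil.rmtree(tempfile.tempdir, ignore_errors=True)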

DBus object error

I'm trying to make a script that launches my custom script when my USB stick is connected.
I found a nice Python script here, but when it calls the GetAllProperties() method I get an exception:
ERROR:dbus.connection:Exception in handler for D-Bus signal:
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/dbus/connection.py", line 214, in maybe_handle_message
self._handler(*args, **kwargs)
File "./hal-automount", line 31, in device_added
properties = self.udi_to_device(udi).GetAllProperties()
File "/usr/lib/python2.7/site-packages/dbus/proxies.py", line 68, in __call__
return self._proxy_method(*args, **keywords)
File "/usr/lib/python2.7/site-packages/dbus/proxies.py", line 140, in __call__
**keywords)
File "/usr/lib/python2.7/site-packages/dbus/connection.py", line 630, in call_blocking
message, timeout)
DBusException: org.freedesktop.DBus.Error.AccessDenied: Rejected send message, 3 matched rules; type="method_call", sender=":1.39539" (uid=0 pid=9527 comm="python") interface="(unset)" member="getAllProperties" error name="(unset)" requested_reply=0 destination=":1.8" (uid=0 pid=3039 comm="/usr/sbin/hald")
OS: openSuSE 11.4
I haven't worked with DBus before; can you give me a hint about what's wrong?
Thanks.
Your DBus method call failed due to the access policy, probably because you called a method without specifying an interface. This looks like a bug in the script you tried to use (DBus methods should always be called via an interface).
Try replacing:
def udi_to_device(self, udi):
    return self.bus.get_object("org.freedesktop.Hal", udi)
With:
def udi_to_device(self, udi):
    obj = self.bus.get_object("org.freedesktop.Hal", udi)
    return dbus.Interface(obj, dbus_interface='org.freedesktop.Hal.Device')
BTW: HAL is now obsolete; you should probably switch to udisks. See http://www.freedesktop.org/wiki/Software/hal
