Paramiko inside Python Daemon causes IOError - python

I'm trying to execute ssh commands using paramiko from inside a python daemon process.
I'm using the following implementation for the daemon: https://pypi.python.org/pypi/python-daemon/
When the program is started, pycrypto raises an IOError ("Bad file descriptor") when paramiko tries to connect.
If I remove the daemon code (just uncomment the last line and comment out the two lines above it), the SSH connection is established as expected.
The code for a short test program looks like this:
#!/usr/bin/env python2
from daemon import runner
import paramiko

class App():
    def __init__(self):
        self.stdin_path = '/dev/null'
        self.stdout_path = '/dev/tty'
        self.stderr_path = '/dev/tty'
        self.pidfile_path = '/tmp/testdaemon.pid'
        self.pidfile_timeout = 5

    def run(self):
        ssh = paramiko.SSHClient()
        ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        ssh.load_system_host_keys()
        ssh.connect("hostname", username="username")
        ssh.close()

app = App()
daemon_runner = runner.DaemonRunner(app)
daemon_runner.do_action()
#app.run()
The trace looks like this:
Traceback (most recent call last):
File "./daemon-test.py", line 31, in <module>
daemon_runner.do_action()
File "/usr/lib/python2.7/site-packages/daemon/runner.py", line 189, in do_action
func(self)
File "/usr/lib/python2.7/site-packages/daemon/runner.py", line 134, in _start
self.app.run()
File "./daemon-test.py", line 22, in run
ssh.connect("hostname", username="username")
File "/usr/lib/python2.7/site-packages/paramiko/client.py", line 311, in connect
t.start_client()
File "/usr/lib/python2.7/site-packages/paramiko/transport.py", line 460, in start_client
Random.atfork()
File "/usr/lib/python2.7/site-packages/Crypto/Random/__init__.py", line 37, in atfork
_UserFriendlyRNG.reinit()
File "/usr/lib/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 224, in reinit
_get_singleton().reinit()
File "/usr/lib/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 171, in reinit
return _UserFriendlyRNG.reinit(self)
File "/usr/lib/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 99, in reinit
self._ec.reinit()
File "/usr/lib/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 62, in reinit
block = self._osrng.read(32*32)
File "/usr/lib/python2.7/site-packages/Crypto/Random/OSRNG/rng_base.py", line 76, in read
data = self._read(N)
File "/usr/lib/python2.7/site-packages/Crypto/Random/OSRNG/posix.py", line 65, in _read
d = self.__file.read(N - len(data))
IOError: [Errno 9] Bad file descriptor
I'm guessing this has something to do with the stream redirection when the daemon spawns. I've tried to set them all to /dev/tty or even to a normal file but nothing works.
When I run the program with strace I can see that something tries to close a file twice and that's when I get the error. But I couldn't find out which file the descriptor actually points to (strace shows a memory location that doesn't seem to be set anywhere).

This is a known issue that I am actually experiencing myself (which is what led me to this question). Basically, it has to do with the definition of a UNIX daemon process and the way paramiko implements its random number generator (RNG).
If you refer to PEP 3143 - Standard daemon process library, the first step in becoming a correct daemon is to "close all open file descriptors." Unfortunately, this closes the file descriptor to /dev/urandom which is used in the Crypto module's RNG which is in turn used by paramiko.
There are some workarounds for the moment, but the author has indicated that he doesn't currently have time to pursue this bug (although the last post in the first link is by the author and is 8 days old as of this writing).
Daemonizing after importing paramiko breaks the random number generator
EAGAIN on file when using RNG after daemon fork
In summary, if you import paramiko after your process becomes a daemon, it should work as desired, because the file descriptor to /dev/urandom is then opened after the daemonizing step has closed all file descriptors.
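For illustration, here is a minimal sketch of that ordering, based on the question's own test program (the hostname and username are placeholders, as in the question):

#!/usr/bin/env python2
from daemon import runner
# Note: paramiko is deliberately not imported at module level.

class App():
    def __init__(self):
        self.stdin_path = '/dev/null'
        self.stdout_path = '/dev/tty'
        self.stderr_path = '/dev/tty'
        self.pidfile_path = '/tmp/testdaemon.pid'
        self.pidfile_timeout = 5

    def run(self):
        # Imported only after daemonization, so pycrypto opens /dev/urandom
        # on a descriptor that the daemon has not already closed.
        import paramiko
        ssh = paramiko.SSHClient()
        ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        ssh.load_system_host_keys()
        ssh.connect("hostname", username="username")
        ssh.close()

app = App()
daemon_runner = runner.DaemonRunner(app)
daemon_runner.do_action()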
The user @xraj also had a hackish, yet clever workaround for finding and preserving the file descriptor to /dev/urandom when daemonizing (first link above):
import os
from resource import getrlimit, RLIMIT_NOFILE

def files_preserve_by_path(*paths):
    wanted = []
    for path in paths:
        fd = os.open(path, os.O_RDONLY)
        try:
            # Remember the (st_ino, st_dev) pair that identifies this file.
            wanted.append(os.fstat(fd)[1:3])
        finally:
            os.close(fd)

    def fd_wanted(fd):
        try:
            return os.fstat(fd)[1:3] in wanted
        except OSError:
            return False

    # Scan every possible descriptor number and keep the ones that point
    # at one of the requested paths.
    fd_max = getrlimit(RLIMIT_NOFILE)[1]
    return [fd for fd in xrange(fd_max) if fd_wanted(fd)]

daemon_context.files_preserve = files_preserve_by_path('/dev/urandom')
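With the DaemonRunner-based setup from the question, the equivalent wiring would presumably go on the runner's context before do_action() is called; the daemon_context attribute name is taken from python-daemon's runner module, so verify it against your installed version:

daemon_runner = runner.DaemonRunner(app)
# Keep the descriptor(s) pointing at /dev/urandom open across daemonization.
daemon_runner.daemon_context.files_preserve = files_preserve_by_path('/dev/urandom')
daemon_runner.do_action()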

This has recently been seen with daemons and multithreaded applications that close file descriptors in bulk from a separate thread. I traced the problem to the class pipe.PosixPipe: there is no synchronization between its set() and close() methods, so PosixPipe can read/write and close the socket descriptor at the same time.
An issue has been created: https://github.com/paramiko/paramiko/issues/692
A pull request has been submitted: https://github.com/paramiko/paramiko/pull/691/files

Related

Flask-RQ2 Redis error: ZADD requires an equal number of values and scores

I have tried implementing the basic Flask-RQ2 setup as per the docs, attempting to write to two separate files concurrently, but I am getting the following Redis error when the worker tries to perform a job in the Redis queue: redis.exceptions.RedisError: ZADD requires an equal number of values and scores.
Here's the full stack trace:
10:20:37: Worker rq:worker:1d0c83d6294249018669d9052fd759eb: started, version 1.2.0
10:20:37: *** Listening on default...
10:20:37: Cleaning registries for queue: default
10:20:37: default: tester('time keeps on slipping') (02292167-c7e8-4040-a97b-742f96ea8756)
10:20:37: Worker rq:worker:1d0c83d6294249018669d9052fd759eb: found an unhandled exception, quitting...
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/rq/worker.py", line 515, in work
self.execute_job(job, queue)
File "/usr/local/lib/python3.6/dist-packages/rq/worker.py", line 727, in execute_job
self.fork_work_horse(job, queue)
File "/usr/local/lib/python3.6/dist-packages/rq/worker.py", line 667, in fork_work_horse
self.main_work_horse(job, queue)
File "/usr/local/lib/python3.6/dist-packages/rq/worker.py", line 744, in main_work_horse
raise e
File "/usr/local/lib/python3.6/dist-packages/rq/worker.py", line 741, in main_work_horse
self.perform_job(job, queue)
File "/usr/local/lib/python3.6/dist-packages/rq/worker.py", line 866, in perform_job
self.prepare_job_execution(job, heartbeat_ttl)
File "/usr/local/lib/python3.6/dist-packages/rq/worker.py", line 779, in prepare_job_execution
registry.add(job, timeout, pipeline=pipeline)
File "/usr/local/lib/python3.6/dist-packages/rq/registry.py", line 64, in add
return pipeline.zadd(self.key, {job.id: score})
File "/usr/local/lib/python3.6/dist-packages/redis/client.py", line 1691, in zadd
raise RedisError("ZADD requires an equal number of "
redis.exceptions.RedisError: ZADD requires an equal number of values and scores
My main is this:
#!/usr/bin/env python3
from flask import Flask
from flask_rq2 import RQ
import time

import tester

app = Flask(__name__)
rq = RQ(app)
default_worker = rq.get_worker()
default_queue = rq.get_queue()

tester = tester.Tester()

while True:
    default_queue.enqueue(tester.tester, args=["time keeps on slipping"])
    default_worker.work(burst=True)
    with open('test_2.txt', 'w+') as f:
        data = f.read() + "it works!\n"
    time.sleep(5)

if __name__ == "__main__":
    app.run()
and my tester.py module is thus:
#!/usr/bin/env python3
import time

class Tester:
    def tester(string):
        with open('test.txt', 'w+') as f:
            data = f.read() + string + "\n"
            f.write(data)
        time.sleep(5)
I'm using the following:
python==3.6.7-1~18.04
redis==2.10.6
rq==1.2.0
Flask==1.0.2
Flask-RQ2==18.3
I've also tried the simpler setup from the documentation where you don't specify either queue or worker but implicitly rely on the Flask-RQ2 module defaults... Any help with this would be greatly appreciated.
Reading the docs a bit more carefully, it seems that for it to work with redis<3.0 you need Flask-RQ2==18.2.2.
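The pin works around an API change in redis-py: rq 1.2.0 calls zadd() with a mapping, which is the redis-py 3.x signature (visible in the traceback's registry.py line), while redis-py 2.10.6 still expects alternating score/member arguments and therefore raises the ZADD error. A sketch of the difference, assuming a local Redis server and a connection named conn:

import redis

conn = redis.StrictRedis()

# redis-py 2.10.x signature: alternating score, member arguments.
#conn.zadd('myqueue', 1.0, 'job-id')

# redis-py 3.x signature (what newer rq versions call internally): a member -> score mapping.
#conn.zadd('myqueue', {'job-id': 1.0})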
I have different errors now but I can work with this.

pyserial with multiprocessing gives me a ctype error

Hi, I'm trying to write a module that lets me read and send data via pyserial. I have to be able to read the data in parallel to my main script. With the help of a Stack Overflow user, I have a basic and working skeleton of the program, but when I try adding a class I created that uses pyserial (it handles finding the port, speed, etc.; found here), I get the following error:
File "<ipython-input-1-830fa23bc600>", line 1, in <module>
runfile('C:.../pythonInterface1/Main.py', wdir='C:/Users/Daniel.000/Desktop/Daniel/Python/pythonInterface1')
File "C:...\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
execfile(filename, namespace)
File "C:...\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/Daniel.000/Desktop/Daniel/Python/pythonInterface1/Main.py", line 39, in <module>
p.start()
File "C:...\Anaconda3\lib\multiprocessing\process.py", line 112, in start
self._popen = self._Popen(self)
File "C:...\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:...\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:...\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
reduction.dump(process_obj, to_child)
File "C:...\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
ValueError: ctypes objects containing pointers cannot be pickled
This is the code I am using to call the class in SerialConnection.py
import multiprocessing
from time import sleep
from operator import methodcaller
from SerialConnection import SerialConnection as SC

class Spawn:
    def __init__(self, _number, _max):
        self._number = _number
        self._max = _max
        # Don't call update here

    def request(self, x):
        print("{} was requested.".format(x))

    def update(self):
        while True:
            print("Spawned {} of {}".format(self._number, self._max))
            sleep(2)

if __name__ == '__main__':
    '''
    spawn = Spawn(1, 1) # Create the object as normal
    p = multiprocessing.Process(target=methodcaller("update"), args=(spawn,)) # Run the loop in the process
    p.start()
    while True:
        sleep(1.5)
        spawn.request(2) # Now you can reference the "spawn"
    '''
    device = SC()
    print(device.Port)
    print(device.Baud)
    print(device.ID)
    print(device.Error)
    print(device.EMsg)
    p = multiprocessing.Process(target=methodcaller("ReadData"), args=(device,)) # Run the loop in the process
    p.start()
    while True:
        sleep(1.5)
        device.SendData('0003')
What am I doing wrong for this class to give me problems? Is there some restriction on using pyserial and multiprocessing together? I know it can be done, but I don't understand how...
Here is the traceback I get from Python:
Traceback (most recent call last):
File "C:...\Python\pythonInterface1\Main.py", line 45, in <module>
p.start()
File "C:...\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:...\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:...\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:...\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:...\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
ValueError: ctypes objects containing pointers cannot be pickled
You are trying to pass a SerialConnection instance to another process as an argument. To do that, Python first has to serialize (pickle) the object, and that is not possible for SerialConnection objects.
As said in Rob Streeting's answer, a possible solution would be to allow the SerialConnection object to be copied to the other process' memory using the fork that occurs when multiprocessing.Process.start is invoked, but this will not work on Windows as it does not use fork.
A simpler, cross-platform and more efficient way to achieve parallelism in your code would be to use a thread instead of a process. The changes to your code are minimal:
import threading
p = threading.Thread(target=methodcaller("ReadData"), args=(device,))
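For a fuller picture, here is a sketch of the thread-based version, assuming SerialConnection exposes the ReadData() and SendData() methods used in the question (with a thread you can also pass the bound method directly instead of going through methodcaller):

import threading
from time import sleep
from SerialConnection import SerialConnection as SC

if __name__ == '__main__':
    device = SC()
    # Threads share memory with the main program, so nothing needs to be pickled.
    reader = threading.Thread(target=device.ReadData, daemon=True)
    reader.start()
    while True:
        sleep(1.5)
        device.SendData('0003')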
I think the problem is due to something inside device being unpicklable (i.e., not serializable by python). Take a look at this page to see if you can see any rules that may be broken by something in your device object.
So why does device need to be picklable at all?
When a multiprocessing.Process is started, it uses fork() at the operating system level (unless otherwise specified) to create the new process. What this means is that the whole context of the parent process is "copied" over to the child. This does not require pickling, as it's done at the operating system level.
(Note: On unix at least, this "copy" is actually a pretty cheap operation because it uses a feature called "copy-on-write". This means that both parent and child processes actually read from the same memory until one or the other modifies it, at which point the original state is copied over to the child process.)
However, the arguments of the function that you want the process to take care of do have to be pickled, because they are not part of the main process's context. So, that includes your device variable.
I think you might be able to resolve your issue by allowing device to be copied as part of the fork operation rather than passing it in as a variable. To do this though, you'll need a wrapper function around the operation you want your process to do, in this case methodcaller("ReadData"). Something like this:
if __name__ == "__main__":
    device = SC()

    def call_read_data():
        device.ReadData()

    ...

    p = multiprocessing.Process(target=call_read_data) # Run the loop in the process
    p.start()
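One thing to keep in mind with this approach: it relies on the context being inherited via fork(), so on Windows (where the question's traceback comes from, note popen_spawn_win32) the new process is spawned rather than forked and the device object will not be copied over; in that case the thread-based approach above is the more reliable option.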

EFOError when trying to connect Pyftpsync to remote server on port 22

I am trying to sync two folders via FTP. Yes, I know there are better or different ways, but for now I need to implement it this way. I was trying the example code from pyftpsync, since a sample code should work easily, right? I am just trying to connect between some test folders I made: one is empty (local), and the remote has a single text file that I want to fetch. It tries to connect, but after about 2 minutes I get this error.
Well, my FTP does work outside of python. I can connect over WinSCP just fine.
Some places mentioned that a proxy could possibly cause this. It seems I am not behind a proxy currently, but maybe I did not set that up properly and it believes there should be a proxy somehow?
Here is my code, just using commands on the prompt for pyftpsync produces the same errors for me. So it is possible some input parameter is off causing all of this.
import time
import os
import re
import shutil
import string
import sys
from ftpsync.targets import FsTarget
from ftpsync.ftp_target import FtpTarget
from ftpsync.synchronizers import DownloadSynchronizer
#synchronize a local folder with ftp
local = FsTarget( "C:\\testfolder\\" )
user = "login"
passwd = "password"
remote = FtpTarget("/my/folder/location/testfold/", "126.0.0.1",port=22, username=user,password=passwd,tls=False,timeout=None,extra_opts=None)
opts = {}
s=DownloadSynchronizer(local, remote, opts)
s.run()
This is the output I am getting; I have edited out the folder names and IP addresses.
INFO:keyring.backend:Loading KWallet
INFO:keyring.backend:Loading SecretService
INFO:keyring.backend:Loading Windows
INFO:keyring.backend:Loading chainer
INFO:keyring.backend:Loading macOS
INFO:pyftpsync:Download to C:\testfolder
from ftp://126.0.0.1/.../testfold
INFO:pyftpsync:Encoding local: utf-8, remote: utf-8
Traceback (most recent call last):
File "c:\..\.py", line 30, in <module>
s.run()
File "C:\\AppData\Local\Programs\Python\Python37-32\lib\site-
packages\ftpsync\synchronizers.py", line 1268, in run
res = super(DownloadSynchronizer, self).run()
File "C:\\AppData\Local\Programs\Python\Python37-
32\lib\site-packages\ftpsync\synchronizers.py", line 827, in run
res = super(BiDirSynchronizer, self).run()
File "C:\\AppData\Local\Programs\Python\Python37-
32\lib\site-packages\ftpsync\synchronizers.py", line 211, in run
self.remote.open()
File "C:\\AppData\Local\Programs\Python\Python37-
32\lib\site-packages\ftpsync\ftp_target.py", line 141, in open
self.ftp.connect(self.host, self.port)
File "C:\\AppData\Local\Programs\Python\Python37-
32\lib\ftplib.py", line 155, in connect
self.welcome = self.getresp()
File "C:\\Local\Programs\Python\Python37-
32\lib\ftplib.py", line 236, in getresp
resp = self.getmultiline()
File "C:\\AppData\Local\Programs\Python\Python37-
32\lib\ftplib.py", line 226, in getmultiline
nextline = self.getline()
File "C:\\AppData\Local\Programs\Python\Python37-
32\lib\ftplib.py", line 210, in getline
raise EOFError
EOFError
Anyway, any possible troubleshooting ideas would help. Thank you.
Pyftpsync uses the FTP protocol.
You are connecting to port 22, which is used for SSH/SFTP.
So if your server is actually an SFTP server, not an FTP server, you cannot use pyftpsync with it.
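If the remote side really is an SFTP server (which something listening on port 22 usually is), you need an SFTP client library instead; a minimal sketch using paramiko, with the host, credentials, paths and file name as placeholders modeled on the question:

import paramiko

transport = paramiko.Transport(("126.0.0.1", 22))
transport.connect(username="login", password="password")
sftp = paramiko.SFTPClient.from_transport(transport)

# Fetch the single text file from the remote test folder (file name is hypothetical).
sftp.get("/my/folder/location/testfold/example.txt", r"C:\testfolder\example.txt")

sftp.close()
transport.close()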

Python - python-daemon lockfile timeout on lock.aquire()

I am using python-daemon module to manage the daemon process of my Python script.
However, I am running into a headache when running the script that I simply can't figure out. Nor do I really know how to begin to debug it.
I have the code:
def run_application():
    # Do something here...
    pass

class App():
    def __init__(self):
        self.stdin_path = '/dev/null'
        self.stdout_path = 'stdout.txt'
        self.stderr_path = 'stdlog.log'
        self.pidfile_path = 'filelock.pid'
        self.pidfile_timeout = 5

    def run(self):
        run_application()

app = App()
daemon_runner = runner.DaemonRunner(app)
daemon_runner.do_action()
When run, it always writes the following to stdlog.log:
Traceback (most recent call last):
File "MyApp.py", line 335, in <module>
daemon_runner.do_action()
File "/anaconda/lib/python2.7/site-packages/daemon/runner.py", line 189, in do_action
func(self)
File "/anaconda/lib/python2.7/site-packages/daemon/runner.py", line 124, in _start
self.daemon_context.open()
File "/anaconda/lib/python2.7/site-packages/daemon/daemon.py", line 346, in open
self.pidfile.__enter__()
File "/anaconda/lib/python2.7/site-packages/lockfile/__init__.py", line 229, in __enter__
self.acquire()
File "/anaconda/lib/python2.7/site-packages/daemon/pidfile.py", line 42, in acquire
super(TimeoutPIDLockFile, self).acquire(timeout, *args, **kwargs)
File "/anaconda/lib/python2.7/site-packages/lockfile/pidlockfile.py", line 88, in acquire
self.path)
lockfile.LockTimeout: Timeout waiting to acquire lock for /MyApp/filelock.pid
So it appears to be timing out when attempting to lock filelock.pid. I have no idea why this is. I have deleted filelock.pid, I've changed permissions; same error every time.
How can I begin to debug this??? I'm at a loss.
I am using python-daemon version 1.6 (if it matters).
Update:
Following the advice here, I now see that there is already a process running. Now, how can I determine the PID of the running daemon process?
I agree with @ExploWare as far as how he demonstrates that you can capture those LockTimeout exceptions.
So as far as a way to debug and see what process is holding on to this lock, here is an external bit of code you can run...
import daemon.pidfile
import os
import lockfile
# We know the lockfile name.
pidfile = daemon.pidfile.PIDLockFile(
    os.path.join("/MyApp/", "filelock.pid"))
# This current process id...
os.getpid()
# 46337
So what process has acquired this lock if any?
pidfile.is_locked()
# True
pidfile.read_pid()
# 96856
When our PIDLockFile instance tries to "acquire",
pidfile.__dict__
# {'unique_name': '/MyApp/filelock.pid', 'lock_file': '/MyApp/filelock.pid.lock', 'hostname':
# 'MyMachine.local', 'pid': 46337, 'timeout': None, 'tname': '', 'path': '/MyApp/filelock.pid'}
pidfile.acquire()
#
# (Had to Control-C quit because I didnt set a timeout on PIDLockFile )
#
# ^CTraceback (most recent call last):
# File "<stdin>", line 1, in <module>
# File "/Users/michal/venf/lib/python2.7/site-packages/lockfile/pidlockfile.py", line 92, in acquire
# time.sleep(timeout is not None and timeout/10 or 0.1)
# KeyboardInterrupt
So instead, use @ExploWare's exception catching.
# Wait only 5 seconds.
pidfile.timeout = 5
try:
    pidfile.acquire()
except lockfile.LockTimeout:
    print 'locked . need to wait or move on.'
#
# locked . need to wait or move on.
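If read_pid() reports a process that no longer exists, the lock is stale. Here is a hedged sketch for detecting and clearing that case; break_lock() comes from the lockfile package, so confirm its behaviour on your installed version before relying on it:

import errno
import os

pid = pidfile.read_pid()
if pid is not None:
    try:
        # Signal 0 does not kill anything; it only checks whether the process exists
        # (an EPERM error would mean it exists but belongs to another user).
        os.kill(pid, 0)
    except OSError as e:
        if e.errno == errno.ESRCH:
            # No such process: the previous daemon died without cleaning up its pidfile.
            pidfile.break_lock()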
I found a nice way to handle this exception, so maybe it's helpful for you as well:
Add
from lockfile import LockTimeout
to the beginning of the script and surround daemon_runner.do_action() like this:
try:
    daemon_runner.do_action()
except LockTimeout:
    print "Error: couldn't acquire lock"
    #you can exit here or try something else

Error with multiprocessing, atexit and global data

Sorry in advance, this is going to be long ...
Possibly related:
Python Multiprocessing atexit Error "Error in atexit._run_exitfuncs"
Definitely related:
python parallel map (multiprocessing.Pool.map) with global data
Keyboard Interrupts with python's multiprocessing Pool
Here's a "simple" script I hacked together to illustrate my problem...
import time
import multiprocessing as multi
import atexit

cleanup_stuff = multi.Manager().list([])

##################################################
# Some code to allow keyboard interrupts
##################################################
was_interrupted = multi.Manager().list([])

class _interrupt(object):
    """
    Toy class to allow retrieval of the interrupt that triggered it's execution
    """
    def __init__(self, interrupt):
        self.interrupt = interrupt

def interrupt():
    was_interrupted.append(1)

def interruptable(func):
    """
    decorator to allow functions to be "interruptable" by
    a keyboard interrupt when in python's multiprocessing.Pool.map
    **Note**, this won't actually cause the Map to be interrupted,
    It will merely cause the following functions to be not executed.
    """
    def newfunc(*args, **kwargs):
        try:
            if(not was_interrupted):
                return func(*args, **kwargs)
            else:
                return False
        except KeyboardInterrupt as e:
            interrupt()
            return _interrupt(e) #If we really want to know about the interrupt...
    return newfunc

@atexit.register
def cleanup():
    for i in cleanup_stuff:
        print(i)
    return

@interruptable
def func(i):
    print(i)
    cleanup_stuff.append(i)
    time.sleep(float(i)/10.)
    return i

#Must wrap func here, otherwise it won't be found in __main__'s dict
#Maybe because it was created dynamically using the decorator?
def wrapper(*args):
    return func(*args)

if __name__ == "__main__":
    #This is an attempt to use signals -- I also attempted something similar where
    #The signals were only caught in the child processes...Or only on the main process...
    #
    #import signal
    #def onSigInt(*args): interrupt()
    #signal.signal(signal.SIGINT,onSigInt)

    #Try 2 with signals (only catch signal on main process)
    #import signal
    #def onSigInt(*args): interrupt()
    #signal.signal(signal.SIGINT,onSigInt)
    #def startup(): signal.signal(signal.SIGINT,signal.SIG_IGN)
    #p=multi.Pool(processes=4,initializer=startup)

    #Try 3 with signals (only catch signal on child processes)
    #import signal
    #def onSigInt(*args): interrupt()
    #signal.signal(signal.SIGINT,signal.SIG_IGN)
    #def startup(): signal.signal(signal.SIGINT,onSigInt)
    #p=multi.Pool(processes=4,initializer=startup)

    p = multi.Pool(4)
    try:
        out = p.map(wrapper, range(30))
        #out=p.map_async(wrapper,range(30)).get() #This doesn't work either...
        #The following lines don't work either
        #Effectively trying to roll my own p.map() with p.apply_async
        # results=[p.apply_async(wrapper,args=(i,)) for i in range(30)]
        # out = [ r.get() for r in results() ]
    except KeyboardInterrupt:
        print ("Hello!")
        out = None
    finally:
        p.terminate()
        p.join()
    print (out)
This works just fine if no KeyboardInterrupt is raised. However, if I raise one, the following exception occurs:
10
7
9
12
^CHello!
None
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "test.py", line 58, in cleanup
for i in cleanup_stuff:
File "<string>", line 2, in __getitem__
File "/usr/lib/python2.6/multiprocessing/managers.py", line 722, in _callmethod
self._connect()
File "/usr/lib/python2.6/multiprocessing/managers.py", line 709, in _connect
conn = self._Client(self._token.address, authkey=self._authkey)
File "/usr/lib/python2.6/multiprocessing/connection.py", line 143, in Client
c = SocketClient(address)
File "/usr/lib/python2.6/multiprocessing/connection.py", line 263, in SocketClient
s.connect(address)
File "<string>", line 1, in connect
error: [Errno 2] No such file or directory
Error in sys.exitfunc:
Traceback (most recent call last):
File "/usr/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "test.py", line 58, in cleanup
for i in cleanup_stuff:
File "<string>", line 2, in __getitem__
File "/usr/lib/python2.6/multiprocessing/managers.py", line 722, in _callmethod
self._connect()
File "/usr/lib/python2.6/multiprocessing/managers.py", line 709, in _connect
conn = self._Client(self._token.address, authkey=self._authkey)
File "/usr/lib/python2.6/multiprocessing/connection.py", line 143, in Client
c = SocketClient(address)
File "/usr/lib/python2.6/multiprocessing/connection.py", line 263, in SocketClient
s.connect(address)
File "<string>", line 1, in connect
socket.error: [Errno 2] No such file or directory
Interestingly enough, the code does exit the Pool.map function without calling any of the additional functions ... The problem seems to be that the KeyboardInterrupt isn't handled properly at some point, but it is a little confusing where that is, and why it isn't handled in interruptable. Thanks.
Note, the same problem happens if I use out=p.map_async(wrapper,range(30)).get()
EDIT 1
A little closer ... If I enclose the out=p.map(...) in a try,except,finally clause, it gets rid of the first exception ... the other ones are still raised in atexit however. The code and traceback above have been updated.
EDIT 2
Something else that does not work has been added to the code above as a comment. (Same error). This attempt was inspired by:
http://jessenoller.com/2009/01/08/multiprocessingpool-and-keyboardinterrupt/
EDIT 3
Another failed attempt using signals added to the code above.
EDIT 4
I have figured out how to restructure my code so that the above is no longer necessary. In the (unlikely) event that someone stumbles upon this thread with the same use-case that I had, I will describe my solution ...
Use Case
I have a function which generates temporary files using the tempfile module. I would like those temporary files to be cleaned up when the program exits. My initial attempt was to pack each temporary file name into a list and then delete all the elements of the list with a function registered via atexit.register. The problem is that the updated list was not being updated across multiple processes. This is where I got the idea of using multiprocessing.Manager to manage the list data. Unfortunately, this fails on a KeyboardInterrupt no matter how hard I tried because the communication sockets between processes were broken for some reason. The solution to this problem is simple. Prior to using multiprocessing, set the temporary file directory ... something like tempfile.tempdir=tempfile.mkdtemp() and then register a function to delete the temporary directory. Each of the processes writes to the same temporary directory, so it works. Of course, this solution only works where the shared data is a list of files that needs to be deleted at the end of the program's life.
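For reference, a minimal sketch of that arrangement, assuming the default fork start method on Unix so that child processes inherit the tempfile.tempdir setting (the names here are illustrative, not from the original program):

import atexit
import shutil
import tempfile
import multiprocessing as multi

# Create one shared temporary directory *before* any worker process starts,
# and point tempfile at it so every process creates its files in there.
tempfile.tempdir = tempfile.mkdtemp()

@atexit.register
def cleanup_tempdir():
    # Runs in the main process at exit; removes the directory and everything in it.
    shutil.rmtree(tempfile.tempdir, ignore_errors=True)

def work(i):
    # Each worker writes its temporary file inside the shared directory.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(str(i).encode())
    return i

if __name__ == "__main__":
    pool = multi.Pool(4)
    try:
        print(pool.map(work, range(10)))
    finally:
        pool.terminate()
        pool.join()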
