I have spent many hours on this problem but I cannot figure out what is going wrong.
I have two Python programs. P1 uses the gevent socket module to create two types of connection, TCP and UDP, each in its own greenlet. On its own, P1 runs fine.
Then I have another program, P2, that uses Popen from gevent.subprocess to invoke P1 as an external program, like this:
sub = Popen("python p1.py 1", stdout=PIPE, stderr=STDOUT, shell=True)
As soon as P1 starts, the greenlet with the UDP connection crashes with the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/gevent/greenlet.py", line 327, in run
    result = self._run(*self.args, **self.kwargs)
  File "gtest_runner.py", line 883, in run
    self.sock.sendto(data, master_addr)
  File "/usr/local/lib/python2.7/dist-packages/gevent/socket.py", line 475, in sendto
    ret_val = sock.sendto(*args)
TypeError: an integer is required
<Greenlet at 0x7f86a0140730: <bound method Beacon.run of <__main__.Beacon instance at 0x7f86a2c4aa28>>> failed with TypeError
I am using Python 2.7 on Ubuntu 14.
Thanks in advance!
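For what it's worth, this exact TypeError usually points at the address argument rather than at gevent itself: sendto expects a (host, port) tuple whose port is an integer, and on Python 2 a string port raises TypeError: an integer is required. That is not necessarily the cause here, but since P1 gets its configuration from the command line ("python p1.py 1"), it is worth checking that the port inside master_addr went through int(). A minimal illustration with made-up addresses:

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto("ping", ("127.0.0.1", 9999))    # works: the port is an int
sock.sendto("ping", ("127.0.0.1", "9999"))  # TypeError: an integer is required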
I'm facing a strange situation; I've searched on Google without any good results.
I'm running a Python script as a subprocess from a parent process, with nohup, using the subprocess package:
cmd = list()
cmd.append("nohup")
cmd.append(sys.executable)
cmd.append(os.path.abspath(script))
cmd.append(os.path.abspath(conf_path))

_env = os.environ.copy()
if env:
    _env.update({k: str(v) for k, v in env.items()})

p = subprocess.Popen(cmd, env=_env, cwd=os.getcwd())
After some time the parent process exits, while the subprocess (the one with the nohup) continues to run.
After another minute or two the nohup'd process exits as well and, for obvious reasons, becomes a zombie.
When running this on my local PC with Python 3.6 and Ubuntu 18.04, the following code works like a charm:
comp_process = psutil.Process(pid)
if comp_process.status() == "zombie":
    comp_status_code = comp_process.wait(timeout=10)
As I said, everything works like a charm: the zombie process is removed and I get the status code of the mentioned process.
But for some reason, when doing the SAME thing in a Docker container with the SAME Python version and Ubuntu version, it fails after the timeout (it doesn't matter whether that's 10 seconds or 10 minutes).
The error:
psutil.TimeoutExpired timeout after 60 seconds (pid=779)
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/psutil/_psposix.py", line 84, in wait_pid
    retpid, status = waitcall()
  File "/usr/local/lib/python3.6/dist-packages/psutil/_psposix.py", line 75, in waitcall
    return os.waitpid(pid, os.WNOHANG)
ChildProcessError: [Errno 10] No child processes

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File ".py", line 41, in run
    comp_status_code = comp_process.wait(timeout=60)
  File "/usr/local/lib/python3.6/dist-packages/psutil/__init__.py", line 1383, in wait
    return self._proc.wait(timeout)
  File "/usr/local/lib/python3.6/dist-packages/psutil/_pslinux.py", line 1517, in wrapper
    return fun(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/psutil/_pslinux.py", line 1725, in wait
    return _psposix.wait_pid(self.pid, timeout, self._name)
  File "/usr/local/lib/python3.6/dist-packages/psutil/_psposix.py", line 96, in wait_pid
    delay = check_timeout(delay)
  File "/usr/local/lib/python3.6/dist-packages/psutil/_psposix.py", line 68, in check_timeout
    raise TimeoutExpired(timeout, pid=pid, name=proc_name)
psutil.TimeoutExpired: psutil.TimeoutExpired timeout after 60 seconds (pid=779)
One possibility is the lack of an init process inside the container to reap zombies. You can fix this by running the container with docker run --init, or by using e.g. tini as the entrypoint. See https://hynek.me/articles/docker-signals/
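If adding an init process is not an option and your Python program is itself PID 1 inside the container, it has to reap orphaned children on its own. A minimal sketch of such a reaper (assuming the process really is PID 1; the handler name is made up):

import os
import signal

def reap_children(signum, frame):
    # Collect every child that has exited and been reparented to us,
    # without blocking, so no zombie is left behind.
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:  # no children at all
            return
        if pid == 0:  # children exist, but none has exited yet
            return

signal.signal(signal.SIGCHLD, reap_children)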
I'm working on a short, native (do NOT recommend an outside [non-native] module such as pexpect), cross-platform, insecure remote control application for python (Windows will use py2exe and an exe file). I am using start_new_thread for the blocking calls such as readline(). For some reason, however, I get this string of ugliness as my output:
Unhandled exception in thread started by <function read_stream at 0xb6918730>Unhandled exception in thread started by <function send_stream at 0xb69186f0>
Traceback (most recent call last):
Traceback (most recent call last):
File "main.py", line 17, in read_stream
s.send(pipe.stdout.readline())
AttributeError File "main.py", line 14, in send_stream
pipe.stdin.write(s.recv(4096))
AttributeError: 'NoneType' object has no attribute 'stdin'
: 'NoneType' object has no attribute 'stdout'
Here is my program (main.py):
#!/usr/bin/env python
import socket
import subprocess as sp
from thread import start_new_thread
from platform import system
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('10.0.0.201', 49200))
shell = 'powershell.exe' if system() == 'Windows' else '/bin/bash' # is this right?
pipe = sp.Popen(shell, shell=True, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE)
entered_command = False

def send_stream():  # send what you get from command center
    while True:
        pipe.stdin.write(s.recv(4096))

def read_stream():  # send back what is returned from shell command
    while True:
        s.send(pipe.stdout.readline())

start_new_thread(send_stream, ())
start_new_thread(read_stream, ())
Thanks for your help.
It turns out that the problem was that the program reached the end of the script right after the two start_new_thread calls and began to exit. During interpreter shutdown, module globals such as pipe and s are set to None, which is exactly why both threads died with AttributeError: 'NoneType' object has no attribute 'stdin' / 'stdout'. So I replaced:
start_new_thread(send_stream, ())
start_new_thread(read_stream, ())
With:
start_new_thread(send_stream, ())
read_stream()
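An alternative that makes the intent explicit is the threading module: non-daemon threads keep the interpreter alive until they finish, so the module is never torn down underneath them. A sketch, assuming the same send_stream and read_stream functions as above:

import threading

t1 = threading.Thread(target=send_stream)
t2 = threading.Thread(target=read_stream)
t1.start()
t2.start()
t1.join()  # the main thread blocks here, so interpreter shutdown never starts
t2.join()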
In my python script I have:
os.spawnvpe(os.P_WAIT, cmd[0], cmd, os.environ)
where cmd is something like ['mail', '-b', emails,...], which allows me to run mail interactively and return to the Python script after mail finishes.
The only problem is when I press Ctrl-C. It seems that "both mail and the python script react to it" (*), whereas I'd prefer that while mail is running, only mail reacts and no exception is raised by Python. Is it possible to achieve this?
(*) What happens exactly on the console is:
^C
(Interrupt -- one more to kill letter)
Traceback (most recent call last):
  File "./tutster.py", line 104, in <module>
    cmd(cmd_run)
  File "./tutster.py", line 85, in cmd
    code = os.spawnvpe(os.P_WAIT, cmd[0], cmd, os.environ)
  File "/usr/lib/python3.4/os.py", line 868, in spawnvpe
    return _spawnvef(mode, file, args, env, execvpe)
  File "/usr/lib/python3.4/os.py", line 819, in _spawnvef
    wpid, sts = waitpid(pid, 0)
KeyboardInterrupt
and then the mail is in fact sent (which is already bad, because the intention was to kill it), but the body is empty and the content is sent as an attachment with a .bin extension.
Wrap it in a try/except statement:

try:
    os.spawnvpe(os.P_WAIT, cmd[0], cmd, os.environ)
except KeyboardInterrupt:
    pass
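Note that try/except only suppresses the exception on the Python side; mail still receives the SIGINT. If you want Python to ignore Ctrl-C entirely while the child alone handles it, one common pattern is to ignore SIGINT in the parent and restore the default handler in the child. A sketch using subprocess instead of the original spawnvpe call:

import signal
import subprocess

# Ignore Ctrl-C in the parent for the duration of the child process.
# preexec_fn runs in the child between fork and exec, restoring the
# default handler there so mail still reacts to the interrupt.
old_handler = signal.signal(signal.SIGINT, signal.SIG_IGN)
try:
    code = subprocess.call(
        cmd,
        preexec_fn=lambda: signal.signal(signal.SIGINT, signal.SIG_DFL))
finally:
    signal.signal(signal.SIGINT, old_handler)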
OpenSolaris derivative (NexentaStor), Python 2.5.5
I've seen numerous examples and many seem to indicate that the problem is a deadlock. I'm not writing to stdin so I think the problem is that one of the shell commands exits prematurely.
What's executed in Popen is:
ssh <remotehost> "zfs send tank/dataset@snapshot | gzip -9" | gzip -d | zfs recv tank/dataset
In other words: log in to a remote host, send a replication stream of a storage volume, pipe it through gzip, and pipe that to zfs recv to write it to a local datastore.
I've seen the explanations about buffers, but I'm definitely not filling those up, and gzip is bailing out prematurely, so I think process.wait() never sees an exit.
process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
process.wait()
if process.returncode == 0:
    for line in process.stdout:
        stdout_arr.append([line])
    return stdout_arr
else:
    return False
Here's what happens when I run and interrupt it:
# ./zfs_replication.py
gzip: stdout: Broken pipe
^CKilled by signal 2.
Traceback (most recent call last):
  File "./zfs_replication.py", line 155, in <module>
    Exec(zfsSendRecv(dataset, today), LOCAL)
  File "./zfs_replication.py", line 83, in Exec
    process.wait()
  File "/usr/lib/python2.5/subprocess.py", line 1184, in wait
    pid, sts = self._waitpid_no_intr(self.pid, 0)
  File "/usr/lib/python2.5/subprocess.py", line 1014, in _waitpid_no_intr
    return os.waitpid(pid, options)
KeyboardInterrupt
I also tried the Popen.communicate() method, but that too hangs if gzip bails out. In this case the last part of my command (zfs recv) exits because the local dataset has been modified, so the incremental replication stream will not be applied. That part will be fixed, but there has got to be a way of dealing with gzip's broken pipe?
process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
stdout, stderr = process.communicate()
if process.returncode == 0:
    dosomething()
else:
    dosomethingelse()
And when run:
cannot receive incremental stream: destination tank/repl_test has been modified
since most recent snapshot
gzip: stdout: Broken pipe
^CKilled by signal 2.
Traceback (most recent call last):
  File "./zfs_replication.py", line 154, in <module>
    Exec(zfsSendRecv(dataset, today), LOCAL)
  File "./zfs_replication.py", line 83, in Exec
    stdout, stderr = process.communicate()
  File "/usr/lib/python2.5/subprocess.py", line 662, in communicate
    stdout = self._fo_read_no_intr(self.stdout)
  File "/usr/lib/python2.5/subprocess.py", line 1025, in _fo_read_no_intr
    return obj.read()
KeyboardInterrupt
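One thing worth pointing out about the first snippet: calling process.wait() before reading a PIPE'd stdout is itself a classic deadlock, independent of gzip, because the child blocks once the pipe buffer fills while the parent sits in wait(). A sketch of the usual ordering, using the same variables as the snippet above:

process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
# Drain stdout to EOF *before* waiting, so the child can never stall
# on a full pipe buffer while we block in wait().
stdout_arr = [[line] for line in process.stdout]
process.wait()
if process.returncode == 0:
    return stdout_arr
return False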
I'm trying to execute ssh commands using paramiko from inside a python daemon process.
I'm using the following implementation for the daemon: https://pypi.python.org/pypi/python-daemon/
When the program is started, pycrypto raises an IOError with a Bad file descriptor as soon as paramiko tries to connect.
If I remove the daemon code (just uncomment the last line and comment out the two lines above it), the ssh connection is established as expected.
The code for a short test program looks like this:
#!/usr/bin/env python2
from daemon import runner
import paramiko
class App():
    def __init__(self):
        self.stdin_path = '/dev/null'
        self.stdout_path = '/dev/tty'
        self.stderr_path = '/dev/tty'
        self.pidfile_path = '/tmp/testdaemon.pid'
        self.pidfile_timeout = 5

    def run(self):
        ssh = paramiko.SSHClient()
        ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        ssh.load_system_host_keys()
        ssh.connect("hostname", username="username")
        ssh.close()

app = App()
daemon_runner = runner.DaemonRunner(app)
daemon_runner.do_action()
#app.run()
The trace looks like this:
Traceback (most recent call last):
  File "./daemon-test.py", line 31, in <module>
    daemon_runner.do_action()
  File "/usr/lib/python2.7/site-packages/daemon/runner.py", line 189, in do_action
    func(self)
  File "/usr/lib/python2.7/site-packages/daemon/runner.py", line 134, in _start
    self.app.run()
  File "./daemon-test.py", line 22, in run
    ssh.connect("hostname", username="username")
  File "/usr/lib/python2.7/site-packages/paramiko/client.py", line 311, in connect
    t.start_client()
  File "/usr/lib/python2.7/site-packages/paramiko/transport.py", line 460, in start_client
    Random.atfork()
  File "/usr/lib/python2.7/site-packages/Crypto/Random/__init__.py", line 37, in atfork
    _UserFriendlyRNG.reinit()
  File "/usr/lib/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 224, in reinit
    _get_singleton().reinit()
  File "/usr/lib/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 171, in reinit
    return _UserFriendlyRNG.reinit(self)
  File "/usr/lib/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 99, in reinit
    self._ec.reinit()
  File "/usr/lib/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 62, in reinit
    block = self._osrng.read(32*32)
  File "/usr/lib/python2.7/site-packages/Crypto/Random/OSRNG/rng_base.py", line 76, in read
    data = self._read(N)
  File "/usr/lib/python2.7/site-packages/Crypto/Random/OSRNG/posix.py", line 65, in _read
    d = self.__file.read(N - len(data))
IOError: [Errno 9] Bad file descriptor
I'm guessing this has something to do with the stream redirection when the daemon spawns. I've tried setting all the streams to /dev/tty or even to a normal file, but nothing works.
When I run the program under strace I can see that something tries to close a file twice, and that's when I get the error. But I couldn't find out which file the descriptor actually points to (strace shows a memory location that doesn't seem to be set anywhere).
This is a known issue that I am actually experiencing myself (which is what led me to this question). Basically, it has to do with the definition of a UNIX daemon process and the way paramiko implements its random number generator (RNG).
If you refer to PEP 3143 - Standard daemon process library, the first step in becoming a correct daemon is to "close all open file descriptors." Unfortunately, this closes the file descriptor to /dev/urandom which is used in the Crypto module's RNG which is in turn used by paramiko.
There are some workarounds for the moment, but the author has indicated that he doesn't currently have time to pursue this bug (although the last post in the first link is by the author and is 8 days old as of this writing).
Daemonizing after importing paramiko breaks the random number generator
EAGAIN on file when using RNG after daemon fork
In summary, if you import paramiko after your process becomes a daemon, then it should work as desired because the file descriptor will have been opened after the daemonizing closes all file descriptors.
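In code, the fix can be as small as moving the import: drop the top-level import paramiko and defer it into run(), so the import (and the open of /dev/urandom) happens after DaemonRunner has daemonized. A sketch against the App class from the question:

    def run(self):
        # Imported here, after the daemon has closed all inherited
        # descriptors, so Crypto opens /dev/urandom afresh.
        import paramiko
        ssh = paramiko.SSHClient()
        ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        ssh.load_system_host_keys()
        ssh.connect("hostname", username="username")
        ssh.close()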
The user #xraj also had a hackish, yet clever workaround for finding and preserving the file descriptor to /dev/urandom when daemonizing (first link above):
import os
from resource import getrlimit, RLIMIT_NOFILE

def files_preserve_by_path(*paths):
    wanted = []
    for path in paths:
        fd = os.open(path, os.O_RDONLY)
        try:
            wanted.append(os.fstat(fd)[1:3])
        finally:
            os.close(fd)

    def fd_wanted(fd):
        try:
            return os.fstat(fd)[1:3] in wanted
        except OSError:
            return False

    fd_max = getrlimit(RLIMIT_NOFILE)[1]
    return [fd for fd in xrange(fd_max) if fd_wanted(fd)]

daemon_context.files_preserve = files_preserve_by_path('/dev/urandom')
This has recently been seen in daemons and multithreaded applications that mass-close file descriptors in a loop on a separate thread. I found the problem in the pipe.PosixPipe class: there is no synchronization between its set() and close() methods, so PosixPipe methods can read/write and close the socket's descriptor at the same time.
Issue was created: https://github.com/paramiko/paramiko/issues/692
Pull was requested: https://github.com/paramiko/paramiko/pull/691/files
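To illustrate the race generically (this is a sketch, not paramiko's actual code): without a lock, one thread can close a descriptor while another is mid-write on it. Serializing both operations through a single lock removes the race:

import os
import threading

class GuardedPipe(object):
    def __init__(self, fd):
        self._fd = fd
        self._lock = threading.Lock()

    def write(self, data):
        with self._lock:  # a writer and a closer can never interleave
            if self._fd is None:
                raise ValueError("pipe already closed")
            os.write(self._fd, data)

    def close(self):
        with self._lock:
            if self._fd is not None:
                os.close(self._fd)
                self._fd = None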