ZMQ socket graceful termination in Python

I have the following ZMQ script
#!/usr/bin/env python2.6
import signal
import sys
import zmq
context = zmq.Context()
socket = context.socket(zmq.SUB)
def signal_term_handler(signal, fname):
    socket.close()
    sys.exit(0)
def main():
    signal.signal(signal.SIGTERM, signal_term_handler)
    socket.connect('tcp://16.160.163.27:8888')
    socket.setsockopt(zmq.SUBSCRIBE, '')
    print 'Waiting for a message'
    while True:
        (event, params) = socket.recv().split()
        # ... doing something with that data ...
if __name__ == '__main__':
    main()
When I Ctrl-C, I get the following errors:
Traceback (most recent call last):
  File "./nag.py", line 28, in <module>
    main()
  File "./nag.py", line 24, in main
    (event, params) = socket.recv().split()
  File "socket.pyx", line 628, in zmq.backend.cython.socket.Socket.recv (zmq/backend/cython/socket.c:5616)
  File "socket.pyx", line 662, in zmq.backend.cython.socket.Socket.recv (zmq/backend/cython/socket.c:5436)
  File "socket.pyx", line 139, in zmq.backend.cython.socket._recv_copy (zmq/backend/cython/socket.c:1771)
  File "checkrc.pxd", line 11, in zmq.backend.cython.checkrc._check_rc (zmq/backend/cython/socket.c:5863)
KeyboardInterrupt
I thought I had handled closing the socket properly when a termination signal arrives from the user, so why do I get these ugly messages? What am I missing?
Note: I have searched Google and Stack Overflow but haven't found anything that fixes this problem.
Thanks.
EDIT: To anyone who has gotten this far: user3666197 has suggested a robust way to handle termination or any exception during execution.

Event handling approach
While the demo code is small, real-world systems, and multi-host / multi-process communicating systems all the more so, typically handle all adversely impacting events in their main control loop.
try:
    context = zmq.Context()         # set up the central Context instance
    socket = ...                    # instantiate/configure all messaging archetypes
    # main control-loop ----------- # ----------------------------------------
    #
    # your app goes here, incl. all nested event-handling & failure-resilience
    # ----------------------------- # ----------------------------------------
except ...:
    pass  # handle IOErrors, context-raised exceptions
except KeyboardInterrupt:
    pass  # handle UI-SIG
except:
    pass  # handle other exceptions "un-handled" above
finally:
    # GRACEFUL TERMINATION
    # .setsockopt( zmq.LINGER, 0 )  # to avoid hanging infinitely
    # .close()                      # .close() for all sockets & devices
    #
    context.term()                  # terminate the Context before exit
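As one concrete reading of the template above, here is a minimal sketch, assuming pyzmq on Python 3. The function name run_subscriber, the endpoint, and the max_polls parameter are hypothetical and exist only so the loop can be bounded for demonstration. Polling with a timeout is a choice made for this sketch, not something the answer prescribes.

```python
import zmq


def run_subscriber(endpoint="tcp://127.0.0.1:5556", poll_ms=100, max_polls=None):
    """Receive until interrupted, then always release the ZeroMQ resources.

    endpoint and max_polls are illustrative assumptions, not part of the
    original answer. max_polls=None means "loop until interrupted".
    """
    context = zmq.Context()
    socket = context.socket(zmq.SUB)
    received = []
    try:
        socket.connect(endpoint)
        socket.setsockopt(zmq.SUBSCRIBE, b"")
        polls = 0
        while max_polls is None or polls < max_polls:
            polls += 1
            # poll() returns 0 on timeout, so control comes back to Python
            # regularly instead of staying inside a blocking recv().
            if socket.poll(poll_ms, zmq.POLLIN):
                received.append(socket.recv())
    except KeyboardInterrupt:
        print("Interrupted by user, shutting down")
    except zmq.ZMQError as exc:
        print("ZeroMQ error: %s" % exc)
    finally:
        socket.setsockopt(zmq.LINGER, 0)  # do not wait for unsent messages
        socket.close()
        context.term()                    # returns once all sockets are closed
    return received, socket.closed, context.closed
```

Calling run_subscriber() with the defaults loops until Ctrl-C; the finally block runs on every exit path, so the socket is closed before the context is terminated.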

Cleaning at exit
You might think the code below is needed to close the socket, but it is not: the sockets get closed automatically.
It does show how to do it manually, though.
I am also listing the background needed to understand what destroying, closing, and cleaning up involve.
try:
    context = zmq.Context()
    socket = context.socket(zmq.ROUTER)
    socket.bind(SOCKET_PATH)
    # ....
finally:
    context.destroy() # Or term() for graceful destroy
Error on KeyboardInterrupt and the fix
Before going further, why does this error appear?
Traceback (most recent call last):
  File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  ...
    msg = self.recv(flags)
  File "zmq/backend/cython/socket.pyx", line 781, in zmq.backend.cython.socket.Socket.recv
  File "zmq/backend/cython/socket.pyx", line 817, in zmq.backend.cython.socket.Socket.recv
  File "zmq/backend/cython/socket.pyx", line 186, in zmq.backend.cython.socket._recv_copy
  File "zmq/backend/cython/checkrc.pxd", line 13, in zmq.backend.cython.checkrc._check_rc
KeyboardInterrupt
It is simply the KeyboardInterrupt exception.
Catching it solves the problem.
For example:
try:
    context = zmq.Context()
    socket = context.socket(zmq.ROUTER)
    socket.bind(SOCKET_PATH)
    # ...
except KeyboardInterrupt:
    print('> User forced exit!')
With that, the traceback no longer appears.
There is no need to terminate the context either; that happens automatically.
Note too: if you don't catch KeyboardInterrupt and simply add a finally: block that calls context.term() alone, the process will hang forever.
Adding only a finally block such as
finally:
    socket.close() # assuming one socket within the context
    context.term()
or
finally:
    context.destroy()
still produces the same traceback, which shows that the error is just the KeyboardInterrupt, raised from within the library call and propagating up.
Only catching KeyboardInterrupt makes it go away:
except KeyboardInterrupt:
    print('> User forced exit!')
finally:
    context.destroy() # manual (not needed)
This works, but the finally block that manually destroys the context (closing the sockets and terminating) is redundant.
Let me explain why.
If you are in a hurry, skip to the section "In python no need to clean at exit" at the very end.
How termination work and why
From the zguide: Making-a-Clean-Exit
It states that we need to close all messages and all sockets; only then does termination unblock and let the program exit.
In C, the API for this is zmq_ctx_destroy(), together with closing the sockets and destroying the messages.
There are a number of things to know:
Memory leaks are one thing, but ZeroMQ is quite finicky about how you exit an application. The reasons are technical and painful, but the upshot is that if you leave any sockets open, the zmq_ctx_destroy() function will hang forever. And even if you close all sockets, zmq_ctx_destroy() will by default wait forever if there are pending connects or sends unless you set the LINGER to zero on those sockets before closing them.
The ZeroMQ objects we need to worry about are messages, sockets, and contexts. Luckily it’s quite simple, at least in simple programs:
Use zmq_send() and zmq_recv() when you can, as it avoids the need to work with zmq_msg_t objects.
If you do use zmq_msg_recv(), always release the received message as soon as you’re done with it, by calling zmq_msg_close().
If you are opening and closing a lot of sockets, that’s probably a sign that you need to redesign your application. In some cases socket handles won’t be freed until you destroy the context.
When you exit the program, close your sockets and then call zmq_ctx_destroy(). This destroys the context.
Python api for destroying the context and termination
In pyzmq, Context.term() makes the call to zmq_ctx_destroy().
Context.destroy(), on the other hand, is more than zmq_ctx_destroy(): it first closes all the sockets of the context and then calls Context.term(), which in turn calls zmq_ctx_destroy().
From the pyzmq documentation:
destroy()
Note: destroy() is not zmq_ctx_destroy(); term() is.
destroy() = close() on every socket of the context + term()
destroy(linger=None)
Close all sockets associated with this context and then terminate the context.
Warning
destroy involves calling zmq_close(), which is NOT threadsafe. If there are active sockets in other threads, this must not be called.
Parameters
linger (int, optional) – If specified, set LINGER on sockets prior to closing them.
term()
term()
Close or terminate the context.
Context termination is performed in the following steps:
Any blocking operations currently in progress on sockets open within context shall raise zmq.ContextTerminated. With the exception of socket.close(), any further operations on sockets open within this context shall raise zmq.ContextTerminated.
After interrupting all blocking calls, term shall block until the following conditions are satisfied:
All sockets open within context have been closed.
For each socket within context, all messages sent on the socket have either been physically transferred to a network peer, or the socket’s linger period set with the zmq.LINGER socket option has expired.
For further details regarding socket linger behaviour refer to libzmq documentation for ZMQ_LINGER.
This can be called to close the context by hand. If this is not called, the context will automatically be closed when it is garbage collected.
This is useful if you want to close things manually.
Which one to use depends on the behavior you want.
term() raises the zmq.ContextTerminated exception for operations on sockets that are still open. To force your way out, simply call destroy(). For a graceful exit, call term(), then close the sockets and do any other handling in the block that catches zmq.ContextTerminated, using socket.close() socket by socket. I have not tried calling destroy() at that point: the sockets would get closed, but it would then issue a second context.term() call, which may or may not be fine.
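The graceful path described above (term() interrupts a blocking call, the interrupted code closes its socket, and term() then returns) can be sketched as follows, assuming pyzmq on Python 3. The names worker and demo and the inproc endpoint are hypothetical. The socket is created and used in the worker thread only; just the context is shared between threads.

```python
import threading
import zmq


def worker(context, ready, log):
    # The socket lives entirely in this thread.
    socket = context.socket(zmq.PULL)
    socket.bind("inproc://term-demo")  # hypothetical endpoint name
    ready.set()
    try:
        socket.recv()  # blocks until a message arrives or the context terminates
    except zmq.ContextTerminated:
        log.append("context terminated")
    finally:
        socket.close()  # term() in the main thread is waiting for this


def demo():
    context = zmq.Context()
    ready = threading.Event()
    log = []
    thread = threading.Thread(target=worker, args=(context, ready, log))
    thread.start()
    ready.wait()    # make sure the socket exists before terminating
    context.term()  # interrupts the blocking recv(), then waits for the close
    thread.join()
    return log
```

This sketch only exercises term(); it does not answer what happens when destroy() is called from inside the ContextTerminated handler.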
LINGER
See the section titled "ZMQ_LINGER: Set linger period for socket shutdown" (use Ctrl+F) at:
http://api.zeromq.org/2-1:zmq-setsockopt
The ZMQ_LINGER option shall set the linger period for the specified socket. The linger period determines how long pending messages which have yet to be sent to a peer shall linger in memory after a socket is closed with zmq_close(3), and further affects the termination of the socket's context with zmq_term(3). The following outlines the different behaviours:
The default value of -1 specifies an infinite linger period. Pending messages shall not be discarded after a call to zmq_close(); attempting to terminate the socket's context with zmq_term() shall block until all pending messages have been sent to a peer.
The value of 0 specifies no linger period. Pending messages shall be discarded immediately when the socket is closed with zmq_close().
Positive values specify an upper bound for the linger period in milliseconds. Pending messages shall not be discarded after a call to zmq_close(); attempting to terminate the socket's context with zmq_term() shall block until either all pending messages have been sent to a peer, or the linger period expires, after which any pending messages shall be discarded.
Option value type: int
Option value unit: milliseconds
Default value: -1 (infinite)
Applicable socket types: all
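To see the zero-linger case from the list above in practice, here is a minimal sketch, assuming pyzmq on Python 3. The function name and the endpoint are hypothetical; the endpoint is assumed to have no listener, so the queued message stays pending.

```python
import time
import zmq


def close_without_lingering(endpoint="tcp://127.0.0.1:5557"):
    """Queue a message for a peer that may never appear, then exit promptly.

    The endpoint is an illustrative address, assumed to have no listener.
    Returns the number of seconds spent inside context.term().
    """
    context = zmq.Context()
    socket = context.socket(zmq.PUSH)
    socket.connect(endpoint)
    try:
        socket.send(b"never delivered", zmq.NOBLOCK)
    except zmq.Again:
        pass  # nothing could be queued; the shutdown below is unaffected
    socket.setsockopt(zmq.LINGER, 0)  # discard pending messages on close
    socket.close()
    started = time.monotonic()
    context.term()  # does not wait for the undelivered message
    return time.monotonic() - started
```

With LINGER set to 0, term() is expected to return almost immediately even though the message was never sent.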
In python no need to clean at exit
You only need destroy(), or a combination of term() and destroy(), if you want to destroy a context manually: for instance to do some handling on the zmq.ContextTerminated exception, or when working with multiple contexts that you create and close (even though we generally never do that), or for other reasons while the program is otherwise running normally.
Otherwise, as stated in the zguide:
This is at least the case for C development. In a language with automatic object destruction, sockets and contexts will be destroyed as you leave the scope. If you use exceptions you’ll have to do the clean-up in something like a “final” block, the same as for any resource.
And you can see it in the pyzmq doc at Context.term() above:
This can be called to close the context by hand. If this is not called, the context will automatically be closed when it is garbage collected.
When variables go out of scope they get destroyed, and destruction and exit are handled automatically. When the program exits, even after a finally block, all variables are destroyed, and so the cleanup happens there.
Again, if you are having problems, make sure they are not related to closing contexts, sockets, and messages, and make sure to use the latest version of pyzmq.

Use SIGINT instead of SIGTERM; that should fix it.
http://www.quora.com/Linux/What-is-the-difference-between-the-SIGINT-and-SIGTERM-signals-in-Linux
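Ctrl-C delivers SIGINT, which Python turns into KeyboardInterrupt by default, so a handler registered only for SIGTERM never runs on Ctrl-C. One way to cover both signals is a minimal sketch like the following (Python 3, standard library only). The function names are hypothetical, and converting SIGTERM into KeyboardInterrupt is a design choice made here so that a single except block handles both cases.

```python
import signal


def raise_keyboard_interrupt(signum, frame):
    # Turn the signal into the same exception Ctrl-C produces by default,
    # so a single `except KeyboardInterrupt` block covers both cases.
    raise KeyboardInterrupt("received signal %d" % signum)


def install_handlers():
    # Ctrl-C sends SIGINT; a plain `kill <pid>` sends SIGTERM.
    # signal.signal() must be called from the main thread.
    signal.signal(signal.SIGINT, raise_keyboard_interrupt)
    signal.signal(signal.SIGTERM, raise_keyboard_interrupt)
```

After install_handlers(), the receive loop can sit inside try / except KeyboardInterrupt / finally and close the socket in the finally block.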

Related

Is there a point to using a with statement for creating a stem.control.Controller object?

I have some python that talks to a tor daemon, here it tells the daemon to shut down.
from stem import Signal
from stem.control import Controller
def shutDownTor():
with Controller.from_port(port=portNum) as controller:
controller.signal(Signal.SHUTDOWN)
I'm using a with statement because the code I'm learning from does so too. The code works fine, but I'm wondering if there is any point to using the with statement.
I know that when you use with to open files it makes sure the file closes even if there's an Exception or interrupt. But in this case it seems like all with is doing is adding an un-necessary tab. The variable controller is even left inside the namespace.
The Controller class you import from stem is a wrapper for ControlSocket, which is itself a wrapper around a socket connection speaking the Tor control protocol. So when you use with in your code, you do so to open a connection on the given port. Just as a file is opened and closed, you will have to open and close the connection yourself if you want to get rid of with.
If you would like to get rid of the with statement, you will have to handle opening, closing, and exceptions on your own.
This results in:
import sys
import stem
from stem.control import Controller
try:
    controller = Controller.from_port()
except stem.SocketError as exc:
    print("Unable to connect to tor on port 9051: %s" % exc)
    sys.exit(1)
try:
    pass  # ... talk to the controller here ...
finally:
    controller.close()  # only reached once the connection exists
which results in the same, and I quote, "un-necessary tab".
You can skip all of it (handling the close, open, and exceptions) if you are aware of and ready for all the consequences.
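What with buys you can be seen with a stand-in class. This is a minimal sketch in Python 3; FakeController is hypothetical and not part of stem, it only mimics an object that must be closed.

```python
class FakeController(object):
    """Hypothetical stand-in for a connection object; not part of stem."""

    def __init__(self):
        self.is_open = True

    def close(self):
        self.is_open = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()   # runs on normal exit and when the body raises
        return False   # do not swallow the exception


def use_and_fail(controller):
    with controller:
        raise RuntimeError("something went wrong mid-conversation")
```

The name bound by with stays in the namespace after the block, as the question observes, but the object it refers to has already been closed, even when the body raised.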

How to poll zmq and variable?

I have python server that waits for a global flag to be set and exits.
In a few threads, I have code that waits using zmq.Poller for
a message. It times out, prints a heartbeat message, then waits on poller
for a new message:
def timed_recv(zock, msec=5000.0):
    poller = zmq.Poller()
    poller.register(zock, zmq.POLLIN)
    events = dict(poller.poll(msec))
    data = None
    if events and events.get(zock) == zmq.POLLIN:
        # if a message came in time, read it.
        data = zock.recv()
    return data
So in the above function, I wait for 5 seconds for a message to arrive. If none does, the function returns, the calling loop prints a message and waits for a new message:
while not do_exit():
    timed_recv(zock)
    print "Program still here!"
sys.exit()
do_exit() checks a global flag for exiting.
Now, if the flag is set, there can be a 5 second delay between it being set and the loop exiting. How can I poll for both zock input and the global flag being set, so that the loop exits quickly?
I thought I can add to the poller, a file descriptor that closes upon global flag being set. Does that seem reasonable? It seems kind of hackish.
Is there a better way to wait for global flag and POLLIN on zock?
(We are using zmq version 3.0 on debian.)
thanks.
The easiest way is to drop the flag and use another 0mq socket to convey a message. The poller can then wait on both 0mq sockets. The message could be just a single byte; its arrival in the poller is the message, not its content.
In doing that you're heading down the road to Actor Model programming.
It's a whole lot easier if a project sticks to one programming model; mixing things up (e.g. 0mq and POSIX condition variables) invites a lot of problems.
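A minimal sketch of that idea, assuming pyzmq on Python 3: the loop polls a data socket and a control socket together, and any message on the control socket means "stop". The function names, the inproc endpoint names, and the choice of a PAIR socket for control are all assumptions made for illustration.

```python
import threading
import zmq


def serve(context, data_endpoint, control_endpoint, log):
    data = context.socket(zmq.PULL)
    data.bind(data_endpoint)
    control = context.socket(zmq.PAIR)
    control.connect(control_endpoint)
    poller = zmq.Poller()
    poller.register(data, zmq.POLLIN)
    poller.register(control, zmq.POLLIN)
    try:
        while True:
            events = dict(poller.poll(5000))  # heartbeat interval in ms
            if control in events:
                control.recv()  # content is irrelevant; arrival means "stop"
                log.append("stop")
                break
            if data in events:
                log.append(data.recv())
            elif not events:
                log.append("heartbeat")
    finally:
        data.close(linger=0)
        control.close(linger=0)


def demo():
    context = zmq.Context()
    stopper = context.socket(zmq.PAIR)
    stopper.bind("inproc://control")  # hypothetical endpoint names
    log = []
    thread = threading.Thread(
        target=serve, args=(context, "inproc://data", "inproc://control", log))
    thread.start()
    stopper.send(b"x")  # replaces setting the global flag
    thread.join()
    stopper.close(linger=0)
    context.term()
    return log
```

Because the stop message wakes the poller immediately, the loop exits without waiting out the 5 second timeout.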

Stop pyzmq receiver by KeyboardInterrupt

Following this example in the ØMQ docs, I'm trying to create a simple receiver. The example uses an infinite loop. Everything works just fine. However, on MS Windows, when I hit CTRL+C to raise KeyboardInterrupt, the loop does not break. It seems that the recv() method somehow ignores the exception. However, I'd love to exit the process by hitting CTRL+C instead of killing it. Is that possible?
In response to @Cyclone's request, I suggest the following as a possible solution:
import signal
signal.signal(signal.SIGINT, signal.SIG_DFL)
# any pyzmq-related code, such as `reply = socket.recv()`
A zmq.Poller object seems to help:
def poll_socket(socket, timetick=100):
    poller = zmq.Poller()
    poller.register(socket, zmq.POLLIN)
    # wait up to 100msec
    try:
        while True:
            obj = dict(poller.poll(timetick))
            if socket in obj and obj[socket] == zmq.POLLIN:
                yield socket.recv()
    except KeyboardInterrupt:
        pass
    # Escape while loop if there's a keyboard interrupt.
Then you can do things like:
for message in poll_socket(socket):
    handle_message(message)
and the for-loop will automatically terminate on Ctrl-C. It looks like the translation from Ctrl-C to a Python KeyboardInterrupt only happens when the interpreter is active and Python has not yielded control to low-level C code; the pyzmq recv() call apparently blocks while in low-level C code, so Python never gets a chance to issue the KeyboardInterrupt. But if you use zmq.Poller then it will stop at a timeout and give the interpreter a chance to issue the KeyboardInterrupt after the timeout is complete.
I don't know if this is going to work on Windows, but on Linux I did something like this:
if signal.signal(signal.SIGINT, signal.SIG_DFL):
    sys.exit()
Try Ctrl+Break (the key above Page Up; I had to look it up, as I don't think I've ever touched that key before), as suggested near the bottom of this thread. I haven't done anything too fancy, but this seems to work well enough in the cases I've tried.

Why do the events of the old process stop running when a new process accepts on a listening socket shared by multiple processes?

The problem occurs in my proxy program. With G10K in mind, I use gevent, and I run all my functions through the low-level gevent.core.
Before I changed my program to use multiple processes, everything was OK. After the change, the problem appeared.
I found that when process No. 2 accepts the socket, the events of process No. 1 stop being dispatched. If I add a sleep(0.1) in my event, the problem surprisingly goes away, but when I lower the sleep time, it shows up again.
The problem has bothered me for weeks and I still have no solution. Could someone help me?
I use events like this:
core.init()
self.ent_s_send = core.event(core.EV_WRITE, self.conn.fileno(),
                             self.ser_send, [self.conn, self.body])
self.ent_s_send.add()
core.dispatch()
I think the problem is in your code, because the following code works fine with the same shared socket.
When you accept a socket with EV_READ, you must get the client socket and release control of the main socket; you must not write to it. You should use code similar to the following:
try:
    client_socket, address = sock.accept()
except socket.error, err:
    if err[0] == errno.EAGAIN:
        sys.exc_clear()
        return
    raise
core.event(core.EV_READ, client_socket.fileno(), callback)
core.event(core.EV_WRITE, client_socket.fileno(), callback)
core.event(core.EV_READ | core.EV_WRITE, client_socket.fileno(), callback)
After this, set READ and WRITE events for this socket.

Could not get out of python loop

I want to get out of the loop when there is no data, but the loop seems to stop at recvfrom:
image=''
while 1:
    data,address=self.socket.recvfrom(512)
    if data is None:break
    image=image+data
    count=count+1
    print str(count)+' packets received...'
Try setting the socket to non-blocking mode. You would do this before the loop starts. You can also try a socket with a timeout.
recvfrom may indeed stop (waiting for data) unless you've set your socket to non-blocking or timeout mode. Moreover, if the socket gets closed by your counterpart, the indication of "socket was closed, nothing more to receive" is not a value of None for data -- it's an empty string, ''. So you could change your test to if not data: break for more generality.
What is the blocking mode of your socket?
If you are in blocking mode (which I think is the default), your program would stop until data is available... You would then not get to the next line after the recv() until data is coming.
If you switch to non-blocking mode, however (see socket.setblocking(flag)), I think that it will raise an exception you would have to catch rather than null-check.
You might want to set socket.setdefaulttimeout(n) to get out of the loop if no data is returned after specified time period.
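The timeout approach suggested in these answers can be sketched as follows (Python 3, standard library only). The function name receive_all and its parameters are hypothetical. The sketch uses a per-socket settimeout() rather than the module-wide socket.setdefaulttimeout(), and treats "no datagram within the timeout" as the end of the transfer.

```python
import socket


def receive_all(sock, timeout=1.0, bufsize=512):
    """Collect datagrams until none arrives within `timeout` seconds."""
    sock.settimeout(timeout)
    chunks = []
    while True:
        try:
            data, address = sock.recvfrom(bufsize)
        except socket.timeout:
            break  # nothing arrived in time: stop waiting
        chunks.append(data)
    return b"".join(chunks), len(chunks)
```

In the question's code this would replace the while 1: loop, with the returned bytes taking the place of image and the returned count taking the place of count.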
