Watching sockets with Glib on Windows puts them in non-blocking mode

The following code does not work correctly on Windows (but does on Linux):
import socket
import gobject

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setblocking(True)
sock.connect(address)
gobject.io_add_watch(
    sock.fileno(),
    gobject.IO_OUT | gobject.IO_ERR | gobject.IO_HUP,
    callback)
Comments in various places in the GLib source, and elsewhere, mention that on Windows sockets are put into non-blocking mode during polling. As a result the callback is called constantly, and writing to the socket fails with this error message:
[Errno 10035] A non-blocking socket operation could not be completed immediately
Calling sock.setblocking(True) prior to writing does not seem to circumvent this. By lowering the priority of the polling and ignoring the error message, it works as expected, but it throws far too many events and consumes a lot of CPU. Is there a way around this limitation on Windows?
Update
I might point out that the whole point of polling for POLLOUT is that when you make the write call you won't get EAGAIN/EWOULDBLOCK. The strange error message that I'm getting is, I believe, the Windows equivalent of those two error codes. In other words, I'm getting gobject.IO_OUT events when the socket will not let me write successfully, and putting it into blocking mode still gives me this inappropriate error.
Another update
On Linux, where this works correctly, the socket is not switched to non-blocking mode, and I receive IO_OUT, when the socket will let me write without blocking, or throwing an error. It's this functionality I want to best emulate/restore under Windows.
Further notes
From man poll:
poll() performs a similar task to select(2): it waits for one of a set
of file descriptors to become ready to perform I/O.
POLLOUT
Writing now will not block.
From man select:
A file descriptor is considered ready if it is possible to perform the corre‐
sponding I/O operation (e.g., read(2)) without blocking.

Is there a problem with doing non-blocking I/O? It seems kind of strange to use polling loops if you're using blocking I/O.
When I write programs like this I tend to do the following:
Buffer the bytes I want to send to the file descriptor.
Only ask for IO_OUT (or the poll() equivalent, POLLOUT) events when said buffer is non-empty.
When poll() (or equivalent) has signaled that you're ready to write, issue the write. If you get EAGAIN/EWOULDBLOCK, remove the bytes you successfully wrote from the buffer and wait for the next time you get signaled. If you successfully wrote the entire buffer, then stop asking for POLLOUT so you don't spuriously wake up.
(My guess is that the Win32 bindings are using WSAEventSelect and WaitForMultipleObjects() to simulate poll(), but the result is the same...)
I'm not sure how your desired approach with blocking sockets would work. You are "waking up" constantly because you asked to be woken up whenever you can write. You only want to ask for that when you have data to write... But then, when it wakes you up, the system won't really tell you how much data you can write without blocking, so that's a good reason to use non-blocking I/O.
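The buffering approach described above could be sketched roughly as follows. This is a minimal Python 3 illustration, not code from the answer: the class name BufferedWriter and its methods are hypothetical, and returning True from on_writable to keep the watch simply mirrors the convention of io_add_watch callbacks.

```python
import socket


class BufferedWriter:
    """Hypothetical helper: queue outgoing bytes, write only when signalled."""

    def __init__(self, sock):
        sock.setblocking(False)      # never block inside the event loop
        self.sock = sock
        self.pending = bytearray()   # bytes waiting to be sent

    def queue(self, data):
        self.pending += data

    def wants_write(self):
        # Only ask the poller for IO_OUT / POLLOUT while this is True.
        return bool(self.pending)

    def on_writable(self):
        """Call when the poller reports the socket writable.

        Returns True if the write watch should stay installed.
        """
        try:
            sent = self.sock.send(self.pending)
        except BlockingIOError:      # EAGAIN / EWOULDBLOCK: try again later
            return True
        del self.pending[:sent]      # drop what was actually written
        return self.wants_write()
```

The caller would install the write watch when wants_write() becomes True and remove it (or let the callback return False) once the buffer drains, so the loop is not woken up with nothing to send.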

GIO contains GSocket, a "lowlevel network socket object" since 2.22. However this is yet to be ported to pygobject on Windows.

I'm not sure if this helps (I'm not proficient with the poll function or the MFC sockets and don't know whether polling is a requirement of your program structure), so take this with a grain of salt:
But to avoid blocking or EAGAIN on write, we use select: we add the socket to the write set that is passed to select, and if select() comes back with a positive return value (0 means it timed out), the socket will accept writes right away ...
The write loop we use in our app is (in pseudocode):
set_nonblocking.
count = 0.
do {
    FDSET writefds;
    add skt to writefds.
    call select with writefds and a reasonable timeout.
    if (select fails or times out) {
        die with some error;
    }
    howmany = send(skt, buf + count, total - count).
    if (howmany > 0) {
        count += howmany.
    }
} while (howmany > 0 && count < total);
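For comparison, the same loop might look like this in Python 3. It is a sketch under assumptions: send_all and its timeout parameter are names made up for illustration, and a timeout is reported by raising the built-in TimeoutError rather than "dying".

```python
import select
import socket


def send_all(sock, data, timeout=5.0):
    """Send all of data on a non-blocking socket, waiting with select()."""
    sock.setblocking(False)
    view = memoryview(data)
    sent_total = 0
    while sent_total < len(view):
        # Wait until the socket is reported writable, or give up.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            raise TimeoutError("socket not writable within timeout")
        try:
            sent_total += sock.send(view[sent_total:])
        except BlockingIOError:
            # Readiness can be stale by the time we send; just wait again.
            continue
    return sent_total
```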

You could use Twisted, which includes support for GTK (even on Windows) and will handle all the various error conditions that non-blocking sockets on Windows like to raise.

Related

Is there any way to terminate a running function from a thread?

I've recently been trying to write my own socket server in Python.
While I was writing a thread to handle server commands (a sort of command line in the server), I tried to implement code that restarts the server when raw_input() receives a specific command.
Basically, I want to restart the server as soon as the "Running" variable changes its state from True to False, and when it does, I would like to stop the function (the function that started the thread) from running (get back to the main function) and then run it again. Is there a way to do it?
Thank you very much, and I hope I was clear about my problem,
Idan :)

Communication between threads can be done with Events, Queues, Semaphores, etc. Check them out and choose the one that fits your problem best.
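As one illustration of the Event option, here is a minimal Python 3 sketch. The function names and the poll_interval parameter are hypothetical, and the loop body is a placeholder for real server work; it only works if the server loop checks the event regularly instead of blocking forever.

```python
import threading


def run_server(stop_event, poll_interval=0.05):
    """Hypothetical server loop that exits soon after stop_event is set."""
    iterations = 0
    while not stop_event.is_set():
        iterations += 1                  # placeholder for one unit of work
        stop_event.wait(poll_interval)   # sleeps, but wakes early on set()
    return iterations


def start_server_thread():
    """Start run_server in a thread; return the thread and its stop event."""
    stop_event = threading.Event()
    thread = threading.Thread(target=run_server, args=(stop_event,))
    thread.start()
    return thread, stop_event
```

The command thread would call stop_event.set(), join the server thread, and then call start_server_thread() again to restart.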

You can't abort a thread, or raise an exception into it asynchronously, in Python.
The standard Unix solution to this problem is to use a non-blocking socket, create a pipe with os.pipe, replace all your blocking sock.recv calls with a blocking r, _, _ = select.select([sock, pipe], [], []), and then have the other thread write to the pipe to wake up the waiting thread.
To make this portable to Windows you'll need to create a UDP localhost socket instead of a pipe, which makes things slightly more complicated, but it's still not hard.
Or, of course, you can use a higher-level framework, like asyncio in 3.4+, or twisted or another third-party lib, which will wrap this up for you. (Most of them are already running the equivalent of a loop around select to service lots of clients in one thread or a small thread pool, so it's trivial to toss in a stop pipe.)
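A rough Python 3 sketch of the wake-up idea follows. Note the substitution: instead of os.pipe or a UDP localhost socket it uses socket.socketpair() for the wake-up channel, which is an assumption made here for brevity (on Windows it needs Python 3.5 or later). The function name recv_or_stop is hypothetical.

```python
import select
import socket


def recv_or_stop(sock, wake_r, bufsize=4096):
    """Wait for data on sock, or for a wake-up byte on wake_r.

    Returns the received bytes, or None if another thread asked us to stop.
    """
    readable, _, _ = select.select([sock, wake_r], [], [])
    if wake_r in readable:
        wake_r.recv(1)        # consume the wake-up byte
        return None
    return sock.recv(bufsize)
```

The controlling thread keeps the other end of the pair (wake_w, say) and calls wake_w.send(b"x") when the worker should abandon its wait.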
Are there other alternatives? Yes, but all less portable and less good in a variety of other ways.
Most platforms have a way to asynchronously kill or signal another thread, which you can access via, e.g., ctypes. But this is a bad idea, because it will prevent Python from doing any normal cleanup. Even if you don't get a segfault, this could mean files never get flushed and end up with incomplete/garbage data, locks are left acquired to deadlock your program somewhere completely unrelated a short time later, memory gets leaked, etc.
If you're specifically trying to interrupt the main thread, and you only care about CPython on Unix, you can use a signal handler and the kill function. The signal will take effect on the next Python bytecode, and if the interpreter is blocked on any kind of I/O (or most other syscalls, e.g., inside a sleep), the system will return to the interpreter with an EINTR, allowing it to interrupt immediately. If the interpreter is blocked on something else, like a call to a C library that blocks signals or just does nothing but CPU work for 30 seconds, then you'll have to wait 30 seconds (although that doesn't come up that often, and you should know if it will in your case). Also, threads and signals don't play nice on some older *nix platforms. And signals don't work the same way on Windows, or in some other Python implementations like Jython.
On some platforms (including Windows, but not most modern *nix platforms), you can wake up a blocking socket call just by closing the socket out from under the waiting thread. On other platforms, this will not unblock the thread, or will do it sometimes but not other times (and theoretically it could even segfault your program or leave the socket library in an unusable state, although I don't think either of those will happen on any modern platform).

As far as I understand the documentation, and from some experiments I've done over the last few weeks, there is no way to really force another thread to 'stop' or 'abort', unless the function is aware of the possibility of being stopped and has a foolproof method of avoiding getting stuck in some of the I/O functions. Then you can use some communication method such as semaphores. The only exception is the specialized threading.Timer class, which has a cancel() method.
So, if you really want to stop the server thread forcefully, you might want to think about running it in a separate process, not a thread.
EDIT: I'm not sure why you want to restart the server - I just thought it was in case of a failure. Normal procedure in a server is to loop waiting for connections on the socket, and when a connection appears, attend it and return to that loop.
A better way is to use the GIO library (part of GLib) and connect methods to the connection event, so that the connection is handled asynchronously. This avoids the loop completely. I don't have any real code for this in Python, but here's an example of a client in Python (which uses GIO for reception events) and a server in C, which uses GIO for connections.
Use of GIO makes life so much easier...

Windows named pipes in practice

With Windows named pipes, what is the proper way to use the CreateNamedPipe, ConnectNamedPipe, DisconnectNamedPipe, and CloseHandle calls?
I am making a server app which is connecting to a client app which connects and disconnects to the pipe multiple times across a session.
When my writes fail because the client disconnected, should I call DisconnectNamedPipe, CloseHandle, or nothing on my handle?
Then, to accept a new connection, should I call CreateNamedPipe and then ConnectNamedPipe, or just ConnectNamedPipe?
I would very much like an explanation of the different states my pipe can be in as a result of these calls, because I have not found this elsewhere.
Additional info:
Language: Python using the win32pipe,win32file and win32api libraries.
Pipe settings: WAIT, no overlap, bytestream.

It is good practice to call DisconnectNamedPipe then CloseHandle, although CloseHandle should clean everything up.
The MSDN documentation is a little vague and their server example is pretty basic. As to whether you reuse pipe handles, it seems that it is your own choice. Documentation for DisconnectNamedPipe seems to indicate that you can re-use a pipe handle for a new client by calling ConnectNamedPipe again on that handle after disconnecting. The role of ConnectNamedPipe seems to be to assign a connecting client to a handle.
Make sure you are cleaning up pipes, though, as MSDN states the following:
Every time a named pipe is created, the system creates the inbound and/or outbound buffers using nonpaged pool, which is the physical memory used by the kernel. The number of pipe instances (as well as objects such as threads and processes) that you can create is limited by the available nonpaged pool. Each read or write request requires space in the buffer for the read or write data, plus additional space for the internal data structures.
I'd also bear the above in mind if you are creating/destroying a lot of pipes. My guess is that it would be better to operate a pool of pipe handles if there are many clients, and to have some grow/shrink mechanism for the pool.

I have managed to achieve what I wanted. I call CreateNamedPipe and CloseHandle exactly once per session, and I call DisconnectNamedPipe when my write fails, followed by another ConnectNamedPipe.
The trick is to only call DisconnectNamedPipe when the pipe was actually connected. I called it every time I tried to connect "just to be sure" and it gave me strange errors.
See also djgandy's answer for more information about pipes.
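The call sequence described above might be sketched like this with pywin32. It is an untested-on-your-setup illustration under assumptions: the function name, the messages_per_client argument, and the buffer sizes are hypothetical, and the pywin32 imports sit inside the function only so the definition loads on machines without pywin32.

```python
def serve_pipe_session(pipe_name, messages_per_client):
    """One CreateNamedPipe/CloseHandle per session, one connect per client.

    messages_per_client is a hypothetical iterable: for each expected
    client, an iterable of byte strings to write to it.
    """
    import pywintypes
    import win32file
    import win32pipe

    handle = win32pipe.CreateNamedPipe(
        pipe_name,
        win32pipe.PIPE_ACCESS_DUPLEX,
        win32pipe.PIPE_TYPE_BYTE | win32pipe.PIPE_READMODE_BYTE
        | win32pipe.PIPE_WAIT,
        1,        # max instances
        65536,    # out buffer size (arbitrary choice)
        65536,    # in buffer size (arbitrary choice)
        0,        # default timeout
        None,     # default security attributes
    )
    try:
        for messages in messages_per_client:
            # Blocks until a client opens the pipe.
            win32pipe.ConnectNamedPipe(handle, None)
            try:
                for message in messages:
                    win32file.WriteFile(handle, message)
            except pywintypes.error:
                pass  # write failed, most likely the client disconnected
            finally:
                # Only reached after a connect, so the pipe was connected.
                win32pipe.DisconnectNamedPipe(handle)
    finally:
        win32file.CloseHandle(handle)
```

A call could look like serve_pipe_session(r"\\.\pipe\example", [[b"hello"], [b"again"]]), where the pipe name is a made-up example.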

Do not exit python program, but keep running

I've seen a few of these questions, but haven't found a real answer yet.
I have an application that launches a gstreamer pipe, and then listens to the data it sends back.
The example application I based mine on ends with this piece of code:
gtk.main()
There is no gtk window, but this piece of code does cause it to keep running. Without it, the program exits.
Now, I have read about constructs using while True:, but they include the sleep command, and if I'm not mistaken that will cause my application to freeze during the time of the sleep so ...
Is there a better way, without using gtk.main()?

gtk.main() runs an event loop. It doesn't exit, and it doesn't just freeze up doing nothing, because inside it has code kind of like this:
while True:
    timeout = timers.earliest() - datetime.now()
    try:
        message = wait_for_next_gui_message(timeout)
    except TimeoutError:
        handle_any_expired_timers()
    else:
        handle_message(message)
That wait_for_next_gui_message function is a wrapper around different platform-specific functions that wait for X11, WindowServer, the unnamed thing in Windows, etc. to deliver messages like "user clicked your button" or "user hit ctrl-Q".
If you call http.serve_forever() or similar on a Twisted server, an HTTPServer, etc., it's doing exactly the same thing, except it's a wait_for_next_network_message(sources, timeout) function, which wraps something like select.select, where sources is a list of all of your sockets.
If you're listening on a gstreamer pipe, your sources can just be that pipe, and the wait_for_next function just select.select.
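A minimal Python 3 sketch of such a loop, assuming the data arrives on an ordinary OS-level file descriptor on Unix (select does not accept pipes on Windows). The names read_events, handle_data and on_idle are hypothetical; on_idle stands in for the timer handling in the loop shown earlier.

```python
import os
import select


def read_events(fd, handle_data, timeout=1.0, on_idle=None):
    """Event loop over a single readable file descriptor."""
    while True:
        readable, _, _ = select.select([fd], [], [], timeout)
        if not readable:
            if on_idle is not None:
                on_idle()         # nothing arrived within the timeout
            continue
        data = os.read(fd, 4096)
        if not data:              # empty read means the writer closed
            return
        handle_data(data)
```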
Or, of course, you could use a networking framework like twisted.
However, you don't need to design your app this way. If you don't need to wait for multiple sources, you can just block:
while True:
    data = pipe.read()
    handle_data(data)
Just make sure the pipe is not set to nonblocking. If you're not sure, you can use setblocking on a socket, fcntl on a Unix pipe, or something I can't remember off the top of my head on a Windows pipe to make sure.
In fact, even if you need to wait for multiple sources, you can do this, by putting a blocking loop for each source into a separate thread (or process). This won't work for thousands of sockets (although you can use greenlets instead of threads for that case), but it's fine for 3, or 30.
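The thread-per-source variant could look roughly like this in Python 3. The helper names pump and start_pumps are made up for illustration, and the sources are assumed to be plain file descriptors.

```python
import os
import threading


def pump(fd, handle_data):
    """Blocking read loop for one source; returns at end of file."""
    while True:
        data = os.read(fd, 4096)
        if not data:
            break
        handle_data(data)


def start_pumps(fds, handle_data):
    """Start one daemon thread per file descriptor and return the threads."""
    threads = [
        threading.Thread(target=pump, args=(fd, handle_data), daemon=True)
        for fd in fds
    ]
    for thread in threads:
        thread.start()
    return threads
```

Because handle_data is called from several threads, it has to be safe to call concurrently; putting the data on a queue.Queue for the main thread is one common choice.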

I've become a fan of the Cmd class. It gives you a shell prompt for your programs and will stay in the loop while waiting for input. Here's the link to the docs. It might do what you want.
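For a feel of what that looks like, here is a tiny sketch using the standard library's cmd.Cmd in Python 3. The class name and the two commands are invented for the example.

```python
import cmd


class ServerShell(cmd.Cmd):
    """Hypothetical command shell that keeps the program alive."""

    prompt = "(server) "

    def do_status(self, arg):
        """status: print a placeholder status line."""
        self.stdout.write("running\n")

    def do_quit(self, arg):
        """quit: leave the command loop."""
        return True   # a true return value ends cmdloop()
```

Calling ServerShell().cmdloop() then blocks at the prompt until quit is entered.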

python SocketServer stuck on waitpid() syscall

I am using Python (2.7) SocketServer with ForkingMixIn. It worked well.
However, sometimes under heavy usage (tons of rapidly connecting/disconnecting clients) the "server" gets stuck, consuming all the idle CPU (top shows 100% CPU). If I run strace on the process from the CLI, it shows an endless sequence of waitpid() syscalls. According to ps, though, there are no child processes at this point.
After this problem my server implementation becomes unusable and only restarting it helps :( Clients can connect but get no answer; I guess only the "backlog" queue on the OS side is used, and the Python code never accepts the connection.
It can be easily reproduced, e.g. with some primitive HTTP implementation and a browser (I used Chrome) with CTRL-R (reload) held down for something like 10 seconds. Of course the problem is also triggered without this "brutal" attempt, in normal usage, just more rarely, and it was quite hard to even come up with an idea of what the problem could be. I wrote my own implementation of something like SocketServer with os.fork() and socket functions, and it does not have this problem, but I would be happier with an "already ready", "standard" solution.
The problem: it is not a nice thing, as my script implementing a server can be DoS'ed very easily in this way.
What I could notice: I installed a signal handler for SIGCHLD. It seems that if I remove it, I can't reproduce the problem, but then I can see zombie processes (I guess since they are not wait()'ed). Even if I install the signal handler as signal.SIG_IGN, I experience this problem.
Can anybody help with what the problem can be and how I can solve it? I'd like to use a signal handler anyway, since it's also not so nice to leave many zombie processes, especially after a long run.
Thanks for any idea.

maybe related: What is the cost of many TIME_WAIT on the server side?
it is possible that you have all your max connections in a time_wait state.
check sysctl net.core.somaxconn for the maximum listen backlog (pending connections).
check sysctl net.ipv4 for other configuration details (e.g. tw
check ulimit -n for max open file descriptors (sockets included)
you can try: sysctl net.ipv4.tcp_tw_reuse=1 to quickly reuse those sockets (don't keep it enabled unless you know what you're doing.)
check for file handle leaks.
[not-so] stupid question: how is your SocketServer implementation different from the standard one + ForkingMixIn?
However, it is really easy to abuse a ForkingMixIn (fork bomb), you might want to use green threads, e.g. the eventlet library ( http://eventlet.net/doc/index.html )
this might be your problem.
this: http://bugs.python.org/issue7978
this: http://mail.python.org/pipermail/python-bugs-list/2010-April/095492.html
this: http://twistedmatrix.com/trac/ticket/733
you will see that a SIGCHLD handler is discouraged unless you take some extra measures (signal.siginterrupt(signal.SIGCHLD, False) in the handler, or using a wake-up fd in the select() call)
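As a rough Python 3 sketch of the siginterrupt measure, for a hand-rolled forking server on Unix: the function names and the reaped list are hypothetical, and the list exists only to make the effect observable. ForkingMixIn does its own waitpid bookkeeping, so combining it with a reaping handler like this may conflict with that bookkeeping.

```python
import os
import signal

reaped = []   # pids collected by the handler (illustration only)


def reap_children(signum, frame):
    """Collect every child that has already exited, without blocking."""
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return            # no child processes exist at all
        if pid == 0:
            return            # children exist, but none has exited yet
        reaped.append(pid)


def install_sigchld_handler():
    signal.signal(signal.SIGCHLD, reap_children)
    # signal.signal() resets the restart behaviour, so ask afterwards for
    # system calls to be restarted instead of failing with EINTR.
    signal.siginterrupt(signal.SIGCHLD, False)
```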

Avoid hang when writing to named pipe which disappears and comes back

I have a program which dumps information into a named pipe like this:
cmd=open(destination,'w')
cmd.write(data)
cmd.close()
This works pretty well until the pipe (destination) disappears while my program is writing to it. The problem is that it keeps hanging on the write part(?)
I was expecting some exception to happen, but that's not the case.
How can I avoid this situation?
Thanks,
Jay

If the process reading from the pipe is not reading as fast as you're writing, your script will block when it tries to write to the pipe. From the Wikipedia article:
"If the queue buffer fills up, the
sending program is suspended (blocked)
until the receiving program has had a
chance to read some data and make room
in the buffer. In Linux, the size of
the buffer is 65536 bytes."
Luckily you have a few options:
The signal module will allow you to set an alarm to break out of the write call. After the prescribed amount of time, a SIGALRM signal will be sent to your process; if your handler for the signal raises an exception, it will break you out of the write.
With threading, you can spawn a new thread to handle the writing, killing it if it blocks for too long.
You can also use the fcntl module to make the pipe nonblocking (meaning the call will not wait, it will fail immediately if the pipe is full): Non-blocking read on a subprocess.PIPE in python.
Finally, you can use the select module to check if the pipe is ready for writing before attempting your write. Just be careful: the check and the write are not atomic (e.g. the pipe could fill up between the check and the write).
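The non-blocking option might be sketched as follows in Python 3 on Unix. Instead of fcntl it opens the FIFO with O_NONBLOCK directly, which is a substitution made here for brevity; the function name try_write_fifo is hypothetical.

```python
import errno
import os


def try_write_fifo(path, data):
    """Write to a FIFO without ever blocking.

    Returns the number of bytes written, or None if there is no reader,
    the pipe is full, or the reader went away.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        if exc.errno == errno.ENXIO:   # nobody has the FIFO open for reading
            return None
        raise                          # e.g. the FIFO no longer exists
    try:
        return os.write(fd, data)
    except BlockingIOError:            # pipe buffer is full
        return None
    except BrokenPipeError:            # reader closed between open and write
        return None
    finally:
        os.close(fd)
```

A short count can come back when the buffer is nearly full, so a caller that must deliver everything needs to keep the unsent tail and retry later.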

I think that the signal module can help you. Check this example:
http://docs.python.org/library/signal.html#example
(The example solves a possibly non-finishing open() call, but can be trivially modified to do the same thing for your cmd.write() call.)
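Adapted to the write case, the idea could look roughly like this in Python 3 on Unix. The names WriteTimeout and write_with_timeout are hypothetical, and the sketch only works in the main thread, since that is where Python delivers signals.

```python
import signal


class WriteTimeout(Exception):
    """Raised when opening or writing the pipe takes too long."""


def _on_alarm(signum, frame):
    raise WriteTimeout()


def write_with_timeout(path, data, seconds):
    """Open path and write data, giving up after a whole number of seconds."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)
    try:
        with open(path, "wb") as pipe:   # may block until a reader appears
            pipe.write(data)             # may block while the pipe is full
    finally:
        signal.alarm(0)                  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)
```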
