When to use blocking and non-blocking functions? - python

I am making a python module to help manage some tasks in Linux (and BSD) - namely managing Linux Containers. I'm aware of a couple of the ways to run terminal commands from python, such as Popen(), call(), and check_call(). When should I use these specific functions? More specifically, when is it proper to use the blocking or non-blocking function?
I have functions which build the commands to be run, which then pass the command (a list) to another function to execute it, using Popen.
Passing a command such as:
['lxc-start', '-n', 'myContainer']
to
...
def executeCommand(command, blocking=False):
    try:
        if blocking:
            subprocess.check_call(command)
        else:
            (stdout, stderr) = Popen(command, stdout=PIPE).communicate()
            self.logSelf(stdout)
    except:
        as_string = ' '.join(command)
        logSelf("Could not execute :", as_string)  # logging function
    return
...
the code defaults to using Popen(), which is a non-blocking function. Under which kinds of cases should I override blocking and let the function perform check_call()?
My initial thought was to use blocking when the process is a one-time, temporary task, such as creating the container, and to use non-blocking when the process runs continuously, such as starting a container.
Am I understanding the purpose of these functions correctly?

To answer the wider question, I would suggest:
Use a blocking call when you are doing something that either:
You know will always be quick, regardless of whether it succeeds or fails; or
Is critical to your application, where it makes no sense for your application to do anything else unless and until that task is complete - for instance, connecting to or creating critical resources.
Use non-blocking calls in all other cases if you can, especially if:
The task could take a while, or
It would be useful to be doing something else while the task executes (even if that is just a GUI update to show progress) - a sketch of the two cases follows.
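The sketch below uses the lxc-start command from the question and an illustrative lxc-create call; note that Popen(...).communicate(), as used in the question's code, also waits for the process to exit, so a truly non-blocking launch keeps the Popen handle and polls it later.

import subprocess
from subprocess import Popen, PIPE

# Blocking: don't do anything else until the container exists (or creation fails).
subprocess.check_call(['lxc-create', '-n', 'myContainer', '-t', 'ubuntu'])

# Non-blocking: launch the container and carry on with other work.
proc = Popen(['lxc-start', '-n', 'myContainer'], stdout=PIPE, stderr=PIPE)
# ... do other things ...
if proc.poll() is None:
    print 'container process is still running'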

Related

How can I stop the execution of a Python function from outside of it?

So I have this library that I use, and within one of my functions I call a function from that library which happens to take a really long time. At the same time I have another thread running that checks for different conditions; if a condition is met, I want to cancel the execution of the library function.
Right now I'm checking the conditions at the start of the function, but if the conditions happen to change while the library function is running, I don't need its results, and want to return from it.
Basically this is what I have now.
def my_function():
    if condition_checker.condition_met():
        return
    library.long_running_function()
Is there a way to run the condition check every second or so and return from my_function when the condition is met?
I've thought about decorators and coroutines. I'm using 2.7, but if this can only be done in 3.x I'd consider switching; it's just that I can't figure out how to do it.
You cannot terminate a thread. Either the library supports cancellation by design, where it internally would have to check for a condition every once in a while to abort if requested, or you have to wait for it to finish.
What you can do is call the library in a subprocess rather than a thread, since processes can be terminated through signals. Python's multiprocessing module provides a threading-like API for spawning forks and handling IPC, including synchronization.
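A rough sketch of that approach, reusing library and condition_checker from the question (and assuming the library call can safely run in a child process):

import multiprocessing
import time

def run_library_call():
    # executes in a child process, which can be terminated from outside
    library.long_running_function()

def my_function():
    proc = multiprocessing.Process(target=run_library_call)
    proc.start()
    while proc.is_alive():
        if condition_checker.condition_met():
            proc.terminate()  # sends SIGTERM to the child
            proc.join()
            return
        time.sleep(1)  # re-check the condition every second
    proc.join()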
Or spawn a separate subprocess via subprocess.Popen if forking is too heavy on your resources (e.g. memory footprint through copying of the parent process).
I can't think of any other way, unfortunately.
Generally, I think you want to run your long_running_function in a separate thread, and have it occasionally report its information to the main thread.
This post gives a similar example within a wxpython program.
Presuming you are doing this outside of wxpython, you should be able to replace the wx.CallAfter and wx.Publisher with threading.Thread and PubSub.
It would look something like this:
import threading
import time

def myfunction():
    # subscribe to the long_running_function
    while True:
        # subscribe to the long_running_function and get the published data
        if condition_met:
            # publish a stop command
            break
        time.sleep(1)

def long_running_function():
    for loop in loops:
        # subscribe to main thread and check for stop command; if so, break
        # do an iteration
        # publish some data
        pass

# launches your long_running_function in the background without blocking flow
threading.Thread(target=long_running_function).start()
myfunction()
I haven't used pubsub a ton so I can't quickly whip up the code but it should get you there.
As an alternative, do you know the stop criteria before you launch the long_running_function? If so, you can just pass it as an argument and check whether it is met internally.
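If you can share a flag between the two sides, a threading.Event makes that internal check explicit - a minimal sketch, not the pubsub code from the answer above:

import threading
import time

stop_event = threading.Event()

def long_running_function(stop_event):
    for i in range(1000):
        if stop_event.is_set():
            break              # the main thread asked us to stop
        time.sleep(0.1)        # stand-in for one iteration of real work

worker = threading.Thread(target=long_running_function, args=(stop_event,))
worker.start()

# ... later, when the condition is met:
stop_event.set()
worker.join()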

Terminating an IronPython script

This may not specifically be an IronPython question, so a Python dev out there might be able to assist.
I want to run python scripts in my .Net desktop app using IronPython, and would like to give users the ability to forcibly terminate a script. Here's my test script (I'm new to Python so it might not be totally correct):-
import atexit
import time
import sys

@atexit.register
def cleanup():
    print 'doing cleanup/termination code'
    sys.exit()

for i in range(100):
    print 'doing something'
    time.sleep(1)
(Note that I might want to specify an "atexit" function in some scripts, allowing them to perform any cleanup during normal or forced termination).
In my .Net code I'm using the following code to terminate the script:
_engine.Runtime.Shutdown();
This results in the script's atexit function being called, but the script doesn't actually terminate - the for loop keeps going. A couple of other SO articles (here and here) say that sys.exit() should do the trick, so what am I missing?
It seems that it's not possible to terminate a running script - at least not in a "friendly" way. One approach I've seen is to run the IronPython engine in another thread, and abort the thread if you need to stop the script.
I wasn't keen on this brute-force approach, which would risk leaving any resources used by the script (e.g. files) open.
In the end, I created a C# helper class like this:-
public class HostFunctions
{
    public bool AbortScript { get; set; }

    // Other properties and functions that I want to expose to the script...
}
When the hosting application wants to terminate the script it sets AbortScript to true. This object is passed to the running script via the scope:-
_hostFunctions = new HostFunctions();
_scriptScope = _engine.CreateScope();
_scriptScope.SetVariable("HostFunctions", _hostFunctions);
In my scripts I just need to strategically place checks to see if an abort has been requested, and deal with it appropriately, e.g.:-
for i in range(100):
    print 'doing something'
    time.sleep(1)
    if HostFunctions.AbortScript:
        cleanup()
It seems that if you are using .NET 5 or higher, aborting the thread may not work as expected.
Thread.Abort() is not supported on .NET 5 or higher and throws PlatformNotSupportedException.
You can probably find a solution using Thread.Interrupt(), but it has slightly different behavior:
If your Python script does not have any Thread.Sleep() (i.e. it never sleeps or blocks), Thread.Interrupt() won't stop it;
It appears you cannot Abort a thread twice, but you can Interrupt it twice. So if your Python script uses finally blocks or context managers, you should be able to stop it by calling Thread.Interrupt() twice (with some delay between those calls).

Can I make a gunicorn worker stop (and restart) AFTER responding to a request?

I have a worker process that loads a special data structure at startup, and then periodically afterwards.
However, the data structure -- a third party C++ module -- has a memory leak.
I tried using the gunicorn "max_requests" setting to have the workers expire after so many requests, which clears out their resources and reloads the data structure. But there were some nit-picky problems with that that I won't go into.
I tried adding an os._exit(0), forcing the worker to halt (and reload), but it meant that request got an error response.
What I'd like to do is signal gunicorn to send the response and THEN kill the worker, just as if the "max_requests" flag had been triggered.
Is there a mechanism to do that?
The reload built-in function is useful for re-initializing modules, but in the documentation for that function (http://docs.python.org/2/library/functions.html#reload), you'll find this: "It is legal though generally not very useful to reload built-in or dynamically loaded modules, except for sys, __main__ and __builtin__. In many cases, however, extension modules are not designed to be initialized more than once, and may fail in arbitrary ways when reloaded." In other words, it might not solve the problem in your case.
However, if you are able to move the calls to your third party module into a subprocess it is easy to ensure that it is reloaded every time, as you are now using the module by executing a script in a separate process. Just make a separate py file, e.g. do_something.py, where you do all your stuff you need to do with the subprocess. Then you can run this file using:
p = subprocess.Popen(
    [sys.executable, 'do_something.py', arg1, arg2, ...],
    stdout=subprocess.PIPE
)
where 'arg1, arg2, ...' represents arguments that you may need to pass to your subprocess. If you need to read the output from your C++ module you can print it in do_something.py and read it in your application using:
p.stdout.read()
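A hypothetical do_something.py might look like the sketch below; leaky_cpp_module and do_work() are stand-in names for the third-party C++ module and whatever call you need from it:

# do_something.py - runs in its own process, so the C++ module's leaked
# memory is reclaimed by the OS as soon as this process exits
import sys
import leaky_cpp_module  # stand-in for the third-party C++ module

def main():
    args = sys.argv[1:]                        # arg1, arg2, ... from the parent
    result = leaky_cpp_module.do_work(*args)   # hypothetical call into the module
    print result                               # the parent reads this via p.stdout.read()

if __name__ == '__main__':
    main()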
I admit that this is not a particularly smooth solution, but I think it should work.

Use python subprocess module like a command line simulator

I am writing a test framework in Python for a command line application. The application will create directories, call other shell scripts in the current directory, and write output to stdout.
I am trying to treat the {Python-SubProcess, CommandLine} combo as equivalent to {Selenium, Browser}: the first component drives the second and checks whether the output is as expected. I am facing the following problem:
The Popen construct takes a command and returns only after that command has completed. What I want is a live handle to the process, so I can run further commands and verifications and finally close the shell once done.
I am okay with writing some infrastructure code to achieve this, since we have a lot of command line applications that need testing like this.
Here is a sample code that I am running
p = subprocess.Popen("/bin/bash", cwd=test_dir)
p.communicate(input="hostname")  # I expect the hostname to be printed out
p.communicate(input="time")      # I expect the current time to be printed out
but the process hangs, or maybe I am doing something wrong. Also, how do I "grab" the output of that subprocess so I can assert that something exists?
subprocess.Popen allows you to continue execution after starting a process. The Popen objects expose wait(), poll() and many other methods to communicate with a child process while it is running. Isn't that what you need?
See Popen constructor and Popen objects description for details.
Here is a small example that runs /bin/sh on Unix systems and executes a command:
from subprocess import Popen, PIPE

p = Popen(['/bin/sh'], stdout=PIPE, stderr=PIPE, stdin=PIPE)
sout, serr = p.communicate('ls\n')
print 'OUT:'
print sout
print 'ERR:'
print serr
UPD: communicate() waits for process termination. If you do not need that, you may use the appropriate pipes directly, though that usually gives you rather ugly code.
UPD2: You updated the question. Yes, you cannot call communicate twice for a single process. You may either give all commands you need to execute in a single call to communicate and check the whole output, or work with pipes (Popen.stdin, Popen.stdout, Popen.stderr). If possible, I strongly recommend the first solution (using communicate).
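For the first approach, here is a minimal Python 2 sketch that feeds a couple of illustrative commands in one communicate() call:

from subprocess import Popen, PIPE

p = Popen(['/bin/sh'], stdin=PIPE, stdout=PIPE, stderr=PIPE)
# send every command up front; communicate() closes stdin and waits for the shell to exit
out, err = p.communicate('hostname\ndate\n')
print out   # inspect or assert on the combined output here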
Otherwise you will have to write a command to the process's input and wait some time for the desired output. What you need is a non-blocking read, to avoid hanging when there is nothing to read. Here is a recipe for emulating non-blocking mode on pipes using threads. The code is ugly and strangely complicated for such a trivial purpose, but that's how it's done.
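That recipe boils down to something like the following sketch: a background thread does the blocking readline() calls and pushes lines onto a Queue, while the main thread polls the queue with a timeout (Python 2 names; use queue instead of Queue on Python 3):

from subprocess import Popen, PIPE
from threading import Thread
from Queue import Queue, Empty

def enqueue_output(pipe, queue):
    # runs in a background thread, so blocking readline() is fine here
    for line in iter(pipe.readline, ''):
        queue.put(line)
    pipe.close()

p = Popen(['/bin/sh'], stdin=PIPE, stdout=PIPE, stderr=PIPE)
q = Queue()
reader = Thread(target=enqueue_output, args=(p.stdout, q))
reader.daemon = True
reader.start()

p.stdin.write('hostname\n')
p.stdin.flush()
try:
    line = q.get(timeout=2)     # does not hang if the shell printed nothing
    print 'got:', line.rstrip()
except Empty:
    print 'no output yet'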
Another option could be using p.stdout.fileno() in a select.select() call, but that won't work on Windows (on Windows, select operates only on objects originating from WinSock). You may consider it if you are not on Windows.
Instead of using plain subprocess you might find the Python sh library very useful:
http://amoffat.github.com/sh/
Here is an example of how to build an asynchronous interaction loop with sh:
http://amoffat.github.com/sh/tutorials/2-interacting_with_processes.html
Another (old) library for solving this problem is pexpect:
http://www.noah.org/wiki/pexpect

Do not exit python program, but keep running

I've seen a few of these questions, but haven't found a real answer yet.
I have an application that launches a gstreamer pipe, and then listens to the data it sends back.
In the example application I based mine on, it ends with this piece of code:
gtk.main()
There is no GTK window, but this piece of code does cause it to keep running; without it, the program exits.
Now, I have read about constructs using while True:, but they include the sleep command, and if I'm not mistaken that will cause my application to freeze during the time of the sleep so ...
Is there a better way, without using gtk.main()?
gtk.main() runs an event loop. It doesn't exit, and it doesn't just freeze up doing nothing, because inside it has code kind of like this:
while True:
    timeout = timers.earliest() - datetime.now()
    try:
        message = wait_for_next_gui_message(timeout)
    except TimeoutError:
        handle_any_expired_timers()
    else:
        handle_message(message)
That wait_for_next_gui_message function is a wrapper around different platform-specific functions that wait for X11, WindowServer, the unnamed thing in Windows, etc. to deliver messages like "user clicked your button" or "user hit ctrl-Q".
If you call httpd.serve_forever() or similar on a Twisted reactor, an HTTPServer, etc., it's doing exactly the same thing, except it uses a wait_for_next_network_message(sources, timeout) function, which wraps something like select.select, where sources is a list of all of your sockets.
If you're listening on a gstreamer pipe, your sources can just be that pipe, and the wait_for_next function just select.select.
Or, of course, you could use a networking framework like twisted.
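A rough sketch of that kind of loop; pipe stands for whatever file object the gstreamer pipeline gives you, and handle_data is a hypothetical handler:

import os
import select

sources = [pipe]   # anything with a fileno() can go in this list

while True:
    # wait until at least one source has data, waking up once a second
    readable, _, _ = select.select(sources, [], [], 1.0)
    for source in readable:
        data = os.read(source.fileno(), 4096)   # read whatever is available
        if data:
            handle_data(data)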
However, you don't need to design your app this way. If you don't need to wait for multiple sources, you can just block:
while True:
    data = pipe.read()
    handle_data(data)
Just make sure the pipe is not set to nonblocking. If you're not sure, you can use setblocking on a socket, fcntl on a Unix pipe, or something I can't remember off the top of my head on a Windows pipe to make sure.
In fact, even if you need to wait for multiple sources, you can do this, by putting a blocking loop for each source into a separate thread (or process). This won't work for thousands of sockets (although you can use greenlets instead of threads for that case), but it's fine for 3, or 30.
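A sketch of that thread-per-source variant, assuming each source exposes a blocking read() and handle_data is your handler:

import threading

def read_loop(source):
    # each source gets its own thread, so a blocking read() is fine here
    while True:
        data = source.read()
        handle_data(data)

for source in (pipe, other_pipe):   # illustrative list of sources
    t = threading.Thread(target=read_loop, args=(source,))
    t.daemon = True   # don't keep the process alive just for these loops
    t.start()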
I've become a fan of the Cmd class. It gives you a shell prompt for your programs and will stay in the loop while waiting for input. Here's the link to the docs. It might do what you want.
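For reference, a minimal Cmd-based program looks roughly like this; cmdloop() blocks waiting for user input, which is what keeps the program running:

import cmd

class MyShell(cmd.Cmd):
    prompt = '(myapp) '

    def do_status(self, line):
        """Print a status message."""
        print 'pipeline is running'

    def do_quit(self, line):
        """Exit the shell."""
        return True   # returning True ends cmdloop()

MyShell().cmdloop()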
