How can I make Python wait when it cannot allocate memory?

Sometimes a Python program stops with an exception like the following, when there is not enough memory:
OSError: [Errno 12] Cannot allocate memory
Can I make it wait until memory is available again instead of dying unrecoverably?
Or at least freeze until the user sends a SIGCONT or something to it?
It's not my program, so I'd prefer not to modify its source code, but I think it would still be cool if I could do this by modifying only the outermost calling part.
Thank you!

You can catch the OSError exception, but this may not help your program continue where it left off.
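For example, a crude wrapper at the outermost calling level might look like the sketch below. This is only an illustration under the assumption that the target program exposes a main() you can call; as noted, it can only restart the work from scratch, not resume it.
import errno
import time

def run_with_retry(main, delay=30, max_retries=None):
    # main: the unmodified program's entry point (assumed callable from here)
    attempts = 0
    while True:
        try:
            return main()
        except (MemoryError, OSError) as exc:
            # Only retry on ENOMEM (errno 12); re-raise anything else.
            if isinstance(exc, OSError) and exc.errno != errno.ENOMEM:
                raise
            attempts += 1
            if max_retries is not None and attempts > max_retries:
                raise
            time.sleep(delay)  # wait and hope memory is freed elsewhere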
To do this well you need to interpose some code between Python and malloc. You can do that using LD_PRELOAD as per the details here: How can I limit memory acquired with `malloc()` without also limiting stack?
The idea is you implement a wrapper for malloc which calls the real malloc and waits to retry if it fails. If you prefer not to use LD_PRELOAD then building Python with your interposing code baked in is a possibility (but a bit more work).
The library you'll write for LD_PRELOAD will end up being usable with just about any program written in C or C++. You could even open-source it. :)

Related

Is there no way to parallelize calls to an executable in Windows?

I've got a list of files, and for each of them I'm calling sox. Because it takes a while, I thought I'd speed the process up by parallelizing it; each call to sox is independent of the others, so I assumed it would be a simple thing.
But it seems you cannot call the same executable from different processes, as that leads to a "The process cannot access the file because it is being used by another process." error.
I'm guessing that is the cause, because there's no other file I'm using across different processes. Still, I'm quite surprised by this: why would read-only access not be possible? And does that really mean there's absolutely no way for me to speed my program up?
Found the error. I had 2> $nul at the end of my sox command to suppress the output. That was of course causing issues, since every parallel call was redirecting to the same file. :D
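For reference, a minimal sketch of running the sox calls in parallel without a shared shell redirection; the sox arguments and the four-worker pool size are assumptions, not taken from the original post.
import subprocess
from concurrent.futures import ThreadPoolExecutor

def convert(in_path, out_path):
    # Suppress sox output per process instead of a shared "2> $nul" redirect.
    subprocess.run(
        ["sox", in_path, out_path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=True,
    )

def convert_all(pairs, workers=4):
    # Threads suffice here: each worker mostly waits on an external sox process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda p: convert(*p), pairs))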

Unable to call PETSc/MPI-based external code in parallel OpenMDAO

I am writing an OpenMDAO problem that calls a group of external codes in a parallel group. One of these external codes is a PETSc-based Fortran FEM code. I realize this is potentially problematic since OpenMDAO also utilizes PETSc. At the moment, I'm calling the external code in a component using Python's subprocess module.
If I run my OpenMDAO problem in serial (i.e. python2.7 omdao_problem.py), everything, including the external code, works just fine. When I try to run it in parallel, however (i.e. mpirun -np 4 python2.7 omdao_problem.py), it works up until the subprocess call, at which point I get the error:
*** Process received signal ***
Signal: Segmentation fault: 11 (11)
Signal code: Address not mapped (1)
Failing at address: 0xe3c00
[ 0] 0 libsystem_platform.dylib 0x00007fff94cb652a _sigtramp + 26
[ 1] 0 libopen-pal.20.dylib 0x00000001031360c5 opal_timer_darwin_bias + 15469
*** End of error message ***
I can't make much of this, but it seems reasonable to me that the problem would come from using an MPI-based Python code to call another MPI-enabled code. I've tried using a non-MPI "hello world" executable in the external code's place, and that can be called by the parallel OpenMDAO code without error. I do not need the external code to actually run in parallel, but I do need to use the PETSc solvers and such, hence the inherent reliance on MPI. (I guess I could consider having both an MPI-enabled and a non-MPI-enabled build of PETSc lying around? I would prefer not to do that if possible, as I can see it becoming a mess in a hurry.)
I found this discussion which appears to present a similar issue (and further states that using subprocess in an MPI code, as I'm doing, is a no-no). In that case, it looks like using MPI_Comm_spawn may be an option, even though it isn't intended for that use. Any idea if that would work in the context of OpenMDAO? Other avenues to pursue for getting this to work? Any thoughts or suggestions are greatly appreciated.
You don't need to call the external code as a subprocess. Wrap the Fortran code in Python using f2py and pass a comm object down into it. This docs example shows how to work with components that use a comm.
You could use an MPI spawn if you want to. That approach has been done, but it's far from ideal. You will be much more efficient if you can wrap the code in memory and let OpenMDAO pass you a comm.
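As a rough illustration of "pass a comm object down into it", here is a minimal sketch of handing an mpi4py communicator to an f2py-wrapped routine; my_fem and fem_solve are hypothetical names standing in for the wrapped FEM module and its entry point.
from mpi4py import MPI
import my_fem  # hypothetical f2py-built extension module

def run_fem(comm):
    # Fortran/PETSc expect an integer Fortran communicator handle,
    # so convert the mpi4py communicator with py2f() before passing it down.
    fcomm = comm.py2f()
    my_fem.fem_solve(fcomm)

if __name__ == "__main__":
    # In an OpenMDAO component this comm would come from the framework;
    # COMM_WORLD is used here only to keep the sketch self-contained.
    run_fem(MPI.COMM_WORLD)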

How to trace random MemoryError in python script?

I have a python script, which is used to perform a lab measurement using several devices. The whole setup is rather involved, including communication over serial devices, API calls as well as the use of self-written and commercial drivers. In the end, however, everything boils down to two nested loops, which vary some parameters, collect data and write it to a file.
My problem is that I observe random occurrences of a MemoryError, typically after about 10 hours, equivalent to ~15k runs of the loops. At the moment I have no idea where it comes from or how I can trace it further, so I would be happy for suggestions on how to approach my problem. My observations up to this point are as follows.
The error occurs at random states of the program. Different runs will throw the MemoryError at different lines of my script.
There is never any helpful error message. Python only says MemoryError without any error string. The traceback leads me to some point in the script where memory is needed (e.g. when building a list), but there appears to be no specific instruction that is the problem.
My RAM is far from full. The Python process in question typically consumes some tens of MB of RAM when viewed in the task manager. In addition, the RAM usage appears to be stable for hours. Usually it increases slowly for some time, just to drop back down to the previous level quickly, which I interpret as the garbage collector kicking in periodically.
So far I did not find any indications for a memory leak. I used memory_profiler to trace the memory usage of my functions and found it to be stable. In addition, I followed this blog entry to observe what the garbage collector does in detail. Again, I could not find any hints for undeleted objects.
I am stuck on Win7 x86 due to a driver that only works on a 32-bit system, so I cannot follow suggestions like this to go to a 64-bit version of Windows. Anyway, I do not see how this would help in my situation.
The iPython console from which the script is launched often behaves strangely after the error has occurred. Sometimes a new MemoryError is thrown even for very simple operations. Often the console is marked by Windows as "not responding" after some time. A menu pops up where, besides the usual options to wait for the process or to terminate it, there is a third option to "restore" the program (whatever that means). Doing so usually causes the console to work normally again.
At this point, I am somewhat out of ideas on how to proceed. The general recipe of commenting out parts of the script until it works is highly undesirable in my case. As stated above, each test run takes several hours, meaning a potential downtime of weeks for my lab equipment, so going in that direction appears unfeasible to me. Is there any more direct approach to learn what is crashing behind the scenes? How can I understand why Python apparently fails to malloc?
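If the loops can be instrumented, a minimal sketch of one way to capture allocation statistics right before the failure, assuming a Python version where tracemalloc is available (3.4+); measure_once is a hypothetical stand-in for one iteration of the nested loops.
import tracemalloc

def run_measurements(iterations, measure_once, report_every=1000):
    tracemalloc.start(25)  # keep up to 25 frames of traceback per allocation
    for i in range(iterations):
        try:
            measure_once(i)
        except MemoryError:
            # Dump the biggest allocation sites before re-raising.
            snapshot = tracemalloc.take_snapshot()
            for stat in snapshot.statistics("lineno")[:10]:
                print(stat)
            raise
        if i % report_every == 0:
            current, peak = tracemalloc.get_traced_memory()
            print("iteration %d: current=%d B, peak=%d B" % (i, current, peak))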

file.read() multiprocessing and the GIL

I've read that certain Python functions implemented in C, which I assume includes file.read(), can release the GIL while they're working and then get it back on completion, and by doing so make use of multiple cores if they're available.
I'm using multiprocessing to parallelize some code. Currently I've got three processes: the parent, one child that reads data from a file, and one child that generates a checksum from the data passed to it by the first child process.
Now, if I'm understanding this right, it seems that creating a new process to read the file as I'm currently doing is unnecessary and I should just call it in the main process. The question is: am I understanding this right, and will I get better performance with the read kept in the main process or in a separate one?
So given my function to read and pipe the data to be processed:
def read(file_path, pipe_out):
    with open(file_path, 'rb') as file_:
        while True:
            block = file_.read(block_size)
            if not block:
                break
            pipe_out.send(block)
    pipe_out.close()
I reckon that this will definitely make use of multiple cores, but also introduces some overhead:
multiprocessing.Process(target=read, args=args).start()
But now I'm wondering if just doing this will also use multiple cores, minus the overhead:
read(*args)
Any insights anybody has as to which one would be faster and for what reason would be much appreciated!
Okay, as came out in the comments, the actual question is:
Does (C)Python create threads on its own, and if so, how can I make use of that?
Short answer: No.
But the reason these C functions are nevertheless interesting for Python programmers is the following. By default, no two snippets of Python code running in the same interpreter can execute in parallel; this is due to the evil called the Global Interpreter Lock, aka the GIL. The GIL is held whenever the interpreter is executing Python code, which implies the statement above that no two pieces of Python code can run in parallel in the same interpreter.
Nevertheless, you can still make use of multithreading in Python, namely when you're doing a lot of I/O or making heavy use of external libraries like numpy, scipy, lxml and so on, which all know about the issue and release the GIL whenever they can (i.e. whenever they do not need to interact with the Python interpreter).
I hope that cleared up the issue a bit.
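To make that concrete, a small sketch (mine, not the answerer's) of reading several files concurrently with plain threads; file.read() blocks in C with the GIL released, so the reads can overlap even under CPython.
from concurrent.futures import ThreadPoolExecutor

def read_all(path):
    # file.read() releases the GIL while it blocks on the OS,
    # so several of these can be in flight at once.
    with open(path, 'rb') as f:
        return f.read()

def read_many(paths, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(read_all, paths))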
I think this is the main part of your question:
"The question is: am I understanding this right, and will I get better performance with the read kept in the main process or in a separate one?"
I assume your goal is to read and process the file as fast as possible. File reading is in any case I/O bound and not CPU bound. You cannot process data faster than you are able to read it. So file I/O clearly limits the performance of your software. You cannot increase the read data rate by using concurrent threads/processes for file reading. Also 'low level' CPython is not doing this. As long as you read the file in one process or thread (even in case of CPython with its GIL a thread is fine), you will get as much data per time as you can get from the storage device. It is also fine if you do the file reading in the main thread as long as there are no other blocking calls that would actually slow down the file reading.
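A minimal sketch of the layout this answer implies: read in the main process (the read is I/O bound anyway) and hand blocks to a single worker that computes the checksum. The block size and the use of SHA-256 are illustrative choices, not taken from the question.
import hashlib
from multiprocessing import Process, Pipe

BLOCK_SIZE = 1024 * 1024  # illustrative

def checksum_worker(pipe_in, result_pipe):
    digest = hashlib.sha256()
    while True:
        block = pipe_in.recv()
        if block is None:  # sentinel: no more data
            break
        digest.update(block)
    result_pipe.send(digest.hexdigest())

def checksum_file(path):
    parent_conn, child_conn = Pipe()
    result_parent, result_child = Pipe()
    worker = Process(target=checksum_worker, args=(child_conn, result_child))
    worker.start()
    with open(path, 'rb') as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            parent_conn.send(block)
    parent_conn.send(None)  # tell the worker we are done
    result = result_parent.recv()
    worker.join()
    return result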

Is there a simple way to launch a background task from a Python CGI script without waiting around for it to terminate?

In Windows, that is.
I think the answer to this question is that I need to create a Windows service. This seems ludicrously heavyweight for what I am trying to do.
I'm just trying to slap together a little prototype here for my manager, I'm not going to be responsible for productizing it... in fact, it may never even BE productized; it might just be something that a few researchers play around with.
I have a CGI script that receives a file for upload, stores it to a temporary location, then launches a background process to do some serious number-crunching on the file. Then some Javascript stuff sits around calling other CGI scripts to check on the status and update the page as needed.
All of this works, except the damn web server won't close the connection as long as the subprocess is running. I've done some searching, and it appears the answer on Unix is to make it a daemon, but I'm stuck on Windows right now, and I guess the answer there is to make it a Windows service?!? This seems incredibly heavyweight to just, you know, launch a damn process and then close the server connection.
That's really the only way?
Edit: Okay, found a nifty little hack over here (the choice (3) that the guy gives):
How to completely background a process in Perl CGI under IIS
I was able to modify this to make it even simpler, and although this is a klugey solution, it is perfect for the quick-and-dirty little prototype I am trying to make.
So I initially had my main script doing this:
subprocess.Popen("python.exe","myscript.py","arg1","arg2")
Which doesn't work, as I've described. Instead, I now have my main script emit this little bit of Javascript which runs after the document is fully loaded:
$("#somecrap").load("launchBackgroundProcess.py", {arg1:"foo",arg2:"bar"});
And then launchBackgroundProcess.py does the subprocess.Popen.
This solution would never scale, since it still leaves the browser connection open during the entire time the background task is running. But since this little thinger I am whipping up might someday have two simultaneous users at most (even then I doubt it) resources are not a concern. This allows the user to see the main page and get the Javascript updates even though there is still an http connection hanging open for no good reason.
Thanks for the answers! If I'm ever asked to productize this, I'll take a look at the resources Profane recommends.
If you haven't much experience with Windows programming and don't wish to peruse the MSDN docs-- I don't blame you-- you may want to pick up a copy of Mark Hammond's canonical guide to all things Python and Windows. It somehow never goes out of date on many of these sorts of recurring questions. Instead of launching the process with the every-platform solution, you'd probably be better off using the win32process module. Chapter 17 of the Hammond book covers this extensively, but you could probably get all you need by downloading the PythonWin IDE (I think it comes bundled in the Windows extensions, which you can download from PyPI) and looking through the help docs it has on Python's Windows API. Here's an example of using the API, from a project I was working on recently. It may in fact do some of what you want with a little adaptation. You'd probably want to focus on CreationFlags. In particular, win32process.DETACHED_PROCESS is "often used to execute console programs in the background." Many other flags are available and conveniently wrapped, however.
if subprocess.mswindows:
    su = subprocess.STARTUPINFO()
    su.dwFlags |= subprocess._subprocess.STARTF_USESHOWWINDOW
    process = subprocess.Popen(['program', 'flag', 'flag2'], bufsize=-1,
                               stdout=subprocess.PIPE, startupinfo=su)
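Building on that answer, a minimal sketch (not the answerer's code) of launching a detached child with win32process.CreateProcess directly, using the DETACHED_PROCESS flag the answer mentions; the command line shown is illustrative.
import win32process

startup = win32process.STARTUPINFO()
hProcess, hThread, pid, tid = win32process.CreateProcess(
    None,                                # appName: taken from the command line
    'python.exe myscript.py arg1 arg2',  # command line (illustrative)
    None, None,                          # process / thread security attributes
    0,                                   # bInheritHandles = FALSE
    win32process.DETACHED_PROCESS,       # creation flags
    None,                                # inherit the parent's environment
    None,                                # inherit the parent's working directory
    startup,
)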
The simplest, though not the most efficient, way would be to just run another Python executable:
from subprocess import Popen
Popen("python somescript.py")
You can just make a system call using the Windows "start" command. That way your Python script will not wait for the started program to complete.
CGI scripts are run with standard output redirected, either directly to the TCP socket or to a pipe. Typically, the connection won't close until the handle, and all copies of it, are closed. By default, the subprocess will inherit a copy of the handle.
There are two ways to prevent the connection from waiting on the subprocess. One is to prevent the subprocess from inheriting the handle, the other is for the subprocess to close its copy of the handle when it starts.
If the subprocess is in Perl, I think you could close the handle very simply:
close(STDOUT);
If you want to prevent the subprocess from inheriting the handle, you could use the SetHandleInformation function (if you have access to the Win32 API) or set bInheritHandles to FALSE in the call to CreateProcess. Alternatively, close the handle before launching the subprocess.
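For the Python side, a minimal sketch of the "don't let the subprocess inherit the handle" approach, assuming Python 3 and a worker script name of my choosing: the child's standard handles are redirected to NUL so it does not hold a copy of the CGI script's stdout (the socket or pipe) open, and the server can close the connection as soon as the CGI script exits.
import subprocess

DETACHED_PROCESS = 0x00000008          # Windows creation flag (subprocess.DETACHED_PROCESS on 3.7+)
CREATE_NEW_PROCESS_GROUP = 0x00000200

def launch_background(script, *args):
    # Redirect the child's standard handles so the CGI stdout handle is not inherited.
    return subprocess.Popen(
        ["python.exe", script] + list(args),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        creationflags=DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP,
    )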
