Unable to call PETSc/MPI-based external code in parallel OpenMDAO

I am writing an OpenMDAO problem that calls a group of external codes in a parallel group. One of these external codes is a PETSc-based Fortran FEM code. I realize this is potentially problematic since OpenMDAO also utilizes PETSc. At the moment, I'm calling the external code in a component using Python's subprocess.
If I run my OpenMDAO problem in serial (i.e. python2.7 omdao_problem.py), everything, including the external code, works just fine. When I try to run it in parallel, however (i.e. mpirun -np 4 python2.7 omdao_problem.py) then it works up until the subprocess call, at which point I get the error:
*** Process received signal ***
Signal: Segmentation fault: 11 (11)
Signal code: Address not mapped (1)
Failing at address: 0xe3c00
[ 0] 0 libsystem_platform.dylib 0x00007fff94cb652a _sigtramp + 26
[ 1] 0 libopen-pal.20.dylib 0x00000001031360c5 opal_timer_darwin_bias + 15469
*** End of error message ***
I can't make much of this, but it seems reasonable to me that the problem would come from using an MPI-based Python code to call another MPI-enabled code. I've tried using a non-MPI "hello world" executable in the external code's place, and that can be called by the parallel OpenMDAO code without error. I do not need the external code to actually run in parallel, but I do need to use the PETSc solvers and such, hence the inherent reliance on MPI. (I guess I could consider having both an MPI-enabled and a non-MPI-enabled build of PETSc lying around? I would prefer not to do that if possible, as I can see that becoming a mess in a hurry.)
I found this discussion which appears to present a similar issue (and further states that using subprocess in an MPI code, as I'm doing, is a no-no). In that case, it looks like using MPI_Comm_spawn may be an option, even though it isn't intended for that use. Any idea if that would work in the context of OpenMDAO? Other avenues to pursue for getting this to work? Any thoughts or suggestions are greatly appreciated.

You don't need to call the external code as a subprocess. Wrap the Fortran code in Python using F2py and pass a comm object down into it. This docs example shows how to work with components that use a comm.
You could use an MPI spawn if you want to. This approach has been done, but it's far from ideal. You will be much more efficient if you can wrap the code in memory and let OpenMDAO pass you a comm.
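A minimal sketch of what "pass a comm object down" can look like, assuming the communicator the component receives is an mpi4py communicator. mpi4py communicators provide py2f(), which returns the integer handle that Fortran MPI code expects. The names run_fem and fem_solve are hypothetical; fem_solve stands in for whatever entry point your F2py wrapper exposes, and its argument list is an assumption.

```python
def run_fem(comm, fem_solve, *args):
    """Call an F2py-wrapped Fortran routine with an MPI communicator.

    comm      -- an mpi4py communicator (assumed to be the one handed to the component)
    fem_solve -- hypothetical F2py-wrapped routine taking a Fortran comm handle first
    """
    # Fortran MPI code identifies a communicator by an integer handle;
    # mpi4py exposes that handle through Comm.py2f().
    fcomm = comm.py2f()
    return fem_solve(fcomm, *args)
```

Because the Fortran code then runs inside the same MPI job as the Python process, no second MPI program is launched from within the first one.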

Related

Starting process in Google Colab with Prefix "!" vs. "subprocess.Popen(..)"

I've been using Google Colab for a few weeks now and I've been wondering what the difference is between the two following commands (for example):
!ffmpeg ...
subprocess.Popen(['ffmpeg', ...
I was wondering because I ran into some issues when I started either of the commands above and then tried to stop execution midway. Both of them cancel on KeyboardInterrupt, but I noticed that afterwards the runtime needs a factory reset because it somehow got stuck. Checking ps aux in the Linux console listed a process [ffmpeg] <defunct>, which apparently was still running or at least blocking some resources.
I then did some research and came across some similar posts asking questions on how to terminate a subprocess correctly (1, 2, 3). Based on those posts I generally came to the conclusion that using the subprocess.Popen(..) variant obviously provides more flexibility when it comes to handling the subprocess: Defining different stdout procedures or reacting to different returncode etc. But I'm still unsure on what the first command above using the ! as prefix exactly does under the hood.
Using the first command is much easier and requires way less code to start this process. And assuming I don't need a lot of logic handling the process flow it would be a nice way to execute something like ffmpeg - if I were able to terminate it as expected. Even following the answers from the other posts using the 2nd command never got me to a point where I could terminate the process fully once started (even when using shell=False, process.kill() or process.wait() etc.). This got me frustrated, because restarting and re-initializing the Colab instance itself can take several minutes every time.
So, finally, I'd like to understand in more general terms what the difference is and was hoping that someone could enlighten me. Thanks!

! commands are executed by the notebook (or more specifically by the ipython interpreter), and are not valid Python commands. If the code you are writing needs to work outside of the notebook environment, you cannot use ! commands.
As you correctly note, you are unable to interact with the subprocess you launch via !, so it's also less flexible than an explicit subprocess call, though similar in this regard to subprocess.call.
Like the documentation mentions, you should generally avoid the bare subprocess.Popen unless you specifically need the detailed flexibility it offers, at the price of having to duplicate the higher-level functionality which subprocess.run et al. already implement. The code to run a command and wait for it to finish is simply
subprocess.check_call(['ffmpeg', ... ])
with variations for capturing its output (check_output) and the more modern run which can easily replace all three of the legacy high-level calls, albeit with some added verbosity.
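As a rough sketch of how subprocess.run can cover both check_call and check_output: check=True raises on a non-zero exit status, and capture_output=True collects the output. The helper name run_and_capture is hypothetical, and a trivial Python one-liner stands in for the ffmpeg command so the sketch runs anywhere.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd, wait for it to finish, and return its stdout as text.

    Raises subprocess.CalledProcessError on a non-zero exit status,
    as check_call and check_output do.
    """
    result = subprocess.run(
        cmd,
        check=True,           # raise on failure (check_call behavior)
        capture_output=True,  # collect stdout and stderr (check_output behavior)
        text=True,            # decode bytes to str
    )
    return result.stdout


# Stand-in command; substitute the real ffmpeg argument list here.
demo_cmd = [sys.executable, "-c", "print('done')"]
output = run_and_capture(demo_cmd)
```

Passing the command as a list, as above, avoids involving a shell at all.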

Python3 Search the virtual memory of a running windows process

begin TLDR;
I want to write a python3 script to scan through the memory of a running windows process and find strings.
end TLDR;
This is for a CTF binary. It's a typical Windows x86 PE file. The goal is simply to get a flag from the process's memory as it runs. This is easy with ProcessHacker: you can search through the strings in the memory of the running application and find the flag with a regex. Now, because I'm a masochistic geek, I strive to script out solutions for CTFs (for everything, really). Specifically, I want to use python3. C# is also an option, but I would really like to keep all of the solution scripts in Python.
Thought this would be a very simple task. You know... pip install some library written by someone that's already solved the problem and use it. Couldn't find anything that would let me do what I need for this task. Here are the libraries I tried out already.
ctypes - This was the first one I used, specifically ReadProcessMemory. Kept getting 299 errors which was because the buffer I was passing in was larger than that section of memory so I made a recursive function that would catch that exception, divide the buffer length by 2 until it got something THEN would read one byte at a time until it hit a 299 error. May have been on the right track there but I wasn't able to get the flag. I WAS able to find the flag only if I knew the exact address of the flag (which I'd get from process hacker). I may make a separate question on SO to address that, this one is really just me asking the community if something already exists before diving into this.
pymem - A nice wrapper for ctypes but had the same issues as above.
winappdbg - python2.x only. I don't want to use python 2.x.
haystack - Looks like this depends on winappdbg which depends on python 2.x.
angr - This is a possibility, Only scratched the surface with it so far. Looks complicated and it's on the to learn list but don't want to dive into something right now that's not going to solve the issue.
volatility - Looks like this is meant for working with full RAM dumps not for hooking into currently running processes and reading the memory.
My plan at the moment is to dive a bit more into angr to see if that will work, go back to pymem/ctypes and try more things. If all else fails ProcessHacker IS opensource. I'm not fluent in C so it'll take time to figure out how they're doing it. Really hoping there's some python3 library I'm missing or maybe I'm going about this the wrong way.
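For reference, the string-search step itself is independent of how the bytes are obtained. The sketch below assumes a region of process memory has already been read into a bytes object by some other means. It pulls out printable ASCII and UTF-16LE runs and filters them with a regex. The function names and the flag{...} format are hypothetical.

```python
import re


def find_strings(buf, min_len=4):
    """Return printable ASCII and UTF-16LE strings found in a bytes buffer."""
    # Runs of at least min_len printable ASCII bytes.
    ascii_pat = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Same characters stored as UTF-16LE: each byte followed by a zero byte.
    wide_pat = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_pat.finditer(buf)]
    found += [m.group().decode("utf-16-le") for m in wide_pat.finditer(buf)]
    return found


def grep_strings(buf, pattern):
    """Return the extracted strings that contain a match for pattern."""
    rx = re.compile(pattern)
    return [s for s in find_strings(buf) if rx.search(s)]


# Made-up buffer standing in for one region of process memory.
sample = b"\x00\x01flag{abc}\x00\xff" + "wide!".encode("utf-16-le")
hits = grep_strings(sample, r"flag\{[^}]*\}")
```

Scanning both encodings matters on Windows, where many strings are stored as UTF-16.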

I ended up writing the script using the frida library. I also have to give a shoutout to rootbsd, because his or her code in the fridump3 project helped greatly.

LLDB python debugger register read

I need to trace program execution, so I decided to write an infinite loop that reads the pc register and then makes a step.
Platform: iOS
In this way I want to trace the program's execution flow.
The question is: how should I get the $pc register through the LLDB Python API?

Your program will likely have more than one thread, and each thread will have a different PC. So you would start with your SBProcess object, which has a "threads" property for iterating over threads, each represented by an SBThread object. The SBThread has a "frames" property, which is an array of all the "SBFrames", and frames[0] is the bottom-most frame (the one currently executing). The SBFrame has a "pc" property, which is the pc. This table of the Python SB APIs might help you out:
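The property chain described above can be sketched as follows. The function name current_pcs is hypothetical; it only relies on the "threads", "frames", "id" and "pc" properties, so it is written against whatever SBProcess-like object is passed in.

```python
def current_pcs(process):
    """Map each thread's ID to the program counter of its frame 0."""
    pcs = {}
    for thread in process.threads:
        frames = thread.frames
        if frames:  # skip threads for which no frames are available
            pcs[thread.id] = frames[0].pc
    return pcs
```

Inside lldb's interactive script interpreter, the current process is available as lldb.process, so current_pcs(lldb.process) would be one way to call it.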
LLDB Python APIs
However, what you are trying to do won't work under Xcode - which is generally the only way to do debugging on iOS. Xcode and Python currently fight over who gets to control process execution, and at some point the wrong actor wins and execution stalls.
You can do this sort of thing using a stand-alone Python driver, an example of which is:
Process Events Example
But since you can't really attach to an iOS process from stand-alone lldb, this is hard to use for iOS development.
BTW, I've occasionally done what you are describing on Mac OS X, and it is also really really slow. You would only want to do this when you are desperate.
You can sometimes get the same effect by putting breakpoints on every function entry point, which you can do on the lldb command line using:
(lldb) break set -r .
and if you only care about tracing through some given modules, you can add the --shlib option one or more times to the "break set" line to restrict the breakpoints to those libraries. Then write a breakpoint command (which you can do in Python) to gather the requisite information. This will still be slow, but is closer to useable.
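A minimal sketch of such a breakpoint command, assuming it lives in a hypothetical module tracer.py. lldb calls Python breakpoint commands with the arguments (frame, bp_loc, internal_dict), and a return value of False tells it not to stop. The names trace_log and trace_entry are made up for illustration.

```python
# Collected (function name, pc) pairs, one per breakpoint hit.
trace_log = []


def trace_entry(frame, bp_loc, internal_dict):
    """Breakpoint command: record the function name and pc, then keep running."""
    trace_log.append((frame.GetFunctionName(), frame.pc))
    # Returning False tells lldb not to stop at this breakpoint hit.
    return False
```

After loading the module with "command script import tracer.py", it can be attached to the breakpoint with "breakpoint command add -F tracer.trace_entry".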

Accessing a Pantone Huey via Python

I have a Pantone Huey, a monitor calibration probe (a device you attach to the monitor that gives you colour readings). I want to get readings from the device in Python.
Having never written such a device driver before, I'm not sure where to start.
I've found two open-source C/C++ projects that interface with the Huey: ArgyllCMS and mcalib.
ArgyllCMS comes with a spotread command which returns readings from the device, although it only functions as an interactive command line tool, so running it via subprocess will not (easily) work.
The code ArgyllCMS uses to communicate with the device is in spectro/huey.c
Not tried it (only just found it while writing this question), but mcalib contains much less code, mainly just heuy.cpp - however it has a worrying number of FIXME comments and incomplete methods, and the code appears to have been automatically generated (unhelpful variable names)
There seem to be three options:
Modify spotread to work without any interactive prompts, call it via subprocess
Create a C-based Python module around huey.c or huey.cpp
Re-implement the interface using something like PyUSB
Being much more familiar with Python, I'm tempted to use PyUSB, but will this be substantially more work than wrapping existing code with the Python C API? Is there anything obvious in either of the C implementations that will not be easily doable in PyUSB?

Given the existence of spotread the easiest (though perhaps not the best) way to proceed would be to use pexpect. It allows you to interact with other command-line programs.

Embedded Python - Blocking operations in time module

I'm developing my own Python code interpreter using the Python C API, as described in the Python documentation. I've taken a look at the Python source code and tried to follow the same steps that are carried out in the standard interpreter when executing a .py file. These steps (a sequence of C API function calls) are basically:
PyRun_AnyFileExFlags()
PyRun_SimpleFileExFlags()
PyRun_FileExFlags()
PyArena_New()
PyParser_ASTFromFile()
run_mod()
PyAST_Compile()
PyEval_EvalCode()
PyEval_EvalCodeEx()
PyThreadState_GET()
PyFrame_New()
PyEval_EvalFrameEx()
The only difference in my code is that I manually do the AST compilation, frame creation, etc. and then call PyEval_EvalFrame.
With this, I am able to execute an arbitrary .py file with my program, as if it were the normal Python interpreter. My problem comes when the code that my program is executing makes use of the time module: all time module operations get blocked in the GIL! For example, if the Python code calls time.sleep(1), this call is blocked and never gets executed.
Obviously I am doing something wrong that blocks the GIL (and therefore blocks the time module), but I don't know how to correct it. The last statement in my code where I have control is in PyEval_EvalFrameEx, and from that point on, everything runs "as in the regular Python interpreter", I think.
Anybody had a similar problem? What am I doing wrong, so that I block the time module?
Hope somebody can help me...
Thanks for your time. Best regards,
R.

You need to provide more detail.
How does your interpreter's behavior differ from the standard interpreter?
If you just want to run arbitrary source files, why are you not calling one of the higher level interfaces, like PyRun_SimpleFile? Did your code call Py_Initialize?
