Python run binary and intercept file writes (using subprocess)

I have a simple command-line utility that produces output both on the console and on the filesystem. I know how to capture the console output, but I do not know how to also intercept the file, whose name I know in advance.
I would like to keep the execution "in memory" without touching the filesystem, because I immediately parse and delete the file that is created, and this is an unnecessary bottleneck (especially when I need to run the little tool millions of times).
So, to sum up, I am trying to achieve the following:
Run a binary using Python's subprocess
Capture both the tool's output AND the contents of a file it creates (in the current working directory, with a name known in advance)
Ideally, run it all without touching the filesystem.

Since you only need to support Linux, one possibility is to use named pipes. The idea is to pre-create the output file as a named pipe, and have your process read the tool's output from the pipe.
See, for example, Introduction to Named Pipes.
The Python API is os.mkfifo().
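
A minimal sketch of the named-pipe idea, assuming a Unix system. The helper name run_with_fifo, the file name result.txt, and the stand-in tool are all hypothetical and only serve the illustration. The pipe itself is still a directory entry, but the data written to it passes through kernel memory rather than being stored on disk.

```python
import os
import subprocess
import sys
import tempfile
import threading


def run_with_fifo(cmd, out_name):
    """Run cmd in a scratch directory where out_name is a named pipe.

    Returns (stdout_bytes, file_bytes).
    """
    with tempfile.TemporaryDirectory() as workdir:
        fifo_path = os.path.join(workdir, out_name)
        os.mkfifo(fifo_path)  # Unix only

        chunks = []

        def drain():
            # open() blocks until the child opens the pipe for writing;
            # read() returns once the child has closed it.
            with open(fifo_path, "rb") as fifo:
                chunks.append(fifo.read())

        reader = threading.Thread(target=drain)
        reader.start()
        proc = subprocess.run(cmd, cwd=workdir, stdout=subprocess.PIPE)
        reader.join()
        return proc.stdout, b"".join(chunks)


# Hypothetical stand-in for the real tool: prints to stdout and writes result.txt
fake_tool = [
    sys.executable, "-c",
    "print('console'); f = open('result.txt', 'w'); f.write('file data'); f.close()",
]
console, contents = run_with_fifo(fake_tool, "result.txt")
```

Two caveats with this sketch: if the tool never opens the file, the reader thread waits forever, and a tool that seeks within its output file or deletes and recreates it will not work with a pipe.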

Related

How can I call a python script with arguments from within Processing?

I have a Python script which outputs JSON when called with different arguments. I am looking for a way to call that script from within Processing and load the output using something like loadJSONObject().
The problem is that I don't know how to call the python script with arguments from within Processing.
Any tip will be appreciated, thanks!

One option, as pointed out in the comments, is to use open(), and then load the file that the script generates in the normal way.
Another, arguably much better, way is to not do this at all and to run your Python script as a service with a web interface instead, so that your script sits listening on http://localhost:1234, for instance, and your Processing sketch can simply load "http://localhost:1234/somefile?input=whatever" without caring what is actually generating the content.
Another upside is that the script can then run anywhere that is reachable by URL, and the sketch does not rely on Python being available as a local executable.
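
As a rough sketch of the service approach, the script could be wrapped with the standard library's http.server module. The names build_payload, JsonHandler and make_server are hypothetical, and build_payload is only a placeholder for whatever the real script computes from its arguments.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def build_payload(params):
    """Hypothetical stand-in for the real script's logic."""
    return {"input": params.get("input", [""])[0]}


class JsonHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Turn "?input=whatever" into {"input": ["whatever"]}
        query = parse_qs(urlparse(self.path).query)
        body = json.dumps(build_payload(query)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress the default per-request console logging


def make_server(port=1234):
    """Create (but do not start) a server bound to localhost."""
    return HTTPServer(("localhost", port), JsonHandler)
```

Calling make_server().serve_forever() starts the service; the sketch should then be able to pass the URL straight to loadJSONObject().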

Can I save a text file in python without closing it?

I am writing a program in which I would like to be able to view a log file before the program is complete. I have noticed that in Python (2.7 and 3), file.write() does not save the file, while file.close() does. I don't want to create a million little log files with unique names, but I would like to be able to view the updated log file before the program is finished. How can I do this?
Now, to be clear I am scripting using Ansys Workbench (trying to batch some CFX runs). Here's a link to a tutorial that shows what I'm talking about. They appear to have wrapped python, and by running the script I can send commands to the various modules. When the script is running there is no console onscreen and it appears to be eating all of the print statements, so the only way I can report what's happening is via a file. Also, I don't want to bring a console window up because eventually I will just run the program in batch mode (no interface). But the simulations take a long time to run and I can't wait for the program to finish before checking on what's happening.

You would need this:
import os
file.flush()
# flush() is usually enough; fsync() additionally forces the OS to write the data to disk
os.fsync(file.fileno())
Check this: http://docs.python.org/2/library/stdtypes.html#file.flush
file.flush()
Flush the internal buffer, like stdio's fflush(). This may be a no-op on some file-like objects.
Note flush() does not necessarily write the file’s data to disk. Use flush() followed by os.fsync() to ensure this behavior.
EDITED: See this question for detailed explanations: what exactly the python's file.flush() is doing?

Does file.flush() after each write help?
Hannu

This will write the file to disk immediately:
file.flush()
os.fsync(file.fileno())
According to the documentation https://docs.python.org/2/library/os.html#os.fsync
Force write of file with filedescriptor fd to disk. On Unix, this calls the native fsync() function; on Windows, the MS _commit() function.
If you’re starting with a Python file object f, first do f.flush(), and then do os.fsync(f.fileno()), to ensure that all internal buffers associated with f are written to disk.
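
Wrapped into a small helper, the flush-then-fsync pattern might look like the sketch below. The name log_line is hypothetical. Note that fsync() on every line can be slow, and flush() alone is usually enough for another program on the same machine to see the new lines.

```python
import os


def log_line(log_file, message):
    """Write one line and push it through to disk before returning."""
    log_file.write(message + "\n")
    log_file.flush()             # Python's buffer -> operating system
    os.fsync(log_file.fileno())  # operating system cache -> disk
```

The log file can then be opened once at the start of the run and kept open, with log_line(f, "...") called after each step.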

How to run multiple files successively in a single process using python

I am working in Windows, and just learning to use python (python 2.7).
I have a bunch of script files ("file1.script", "file2.script", "file3.script"....) that are executed in TheProgram.exe. Python has already given me the ability to automatically create these script files, but now I want to successively run each of these script files, back-to-back, in TheProgram.exe.
So far I have figured out how to use the subprocess module in python to start "TheProgram.exe" in a new process (child process?) and load the first script file as follows:
my_process = subprocess.Popen(["Path to TheProgram.exe", "Path to File1.script"])
As seen, simply "opening" the script file in TheProgram.exe, or passing it as an argument in this case, will execute it. Once File1.script is done, TheProgram.exe generates an output file, and then just sits there. It does not terminate. This is what I want, because now I would like to load File2.script in the same process without terminating it (File2.script depends on File1.script completing successfully), then File3.script, and so on.
Is this possible? And if so how? I cannot seem to find any documentation or anyone else who has had this problem. If I can provide other information please let me know, I am also new to posting to these forums. Thanks so much for any assistance.

Grabbing output FILE from Python Popen process?

I have written a python program to interface with a compiled program (call it ProgramX) that has some idiosyncrasies that are proving difficult to deal with. I need to feed many thousands of input files to ProgramX via my python program. What I would like to do is to grab the output file that ProgramX creates with each run, and rename it something sensible, like inputfilename.output.
The problem comes in the output file that is written by ProgramX -- it is named via an unpredictable method, which will write, and "mercilessly overwrite", the output file if it already exists (which is the case the majority of the time). The saving grace probably comes with the fact that there is a standard prefix to the output files: think ProgramX.notQuiteRandomNumber.
The only thing I can think to do is something like this in my bash shell:
PROGRAMXOUTPUT=$(ls -ltr ProgramX* | tail -n -1 | awk '{print $8}')
mv $PROGRAMXOUTPUT input.output
Which does 90% of what I need, but before I program all that bash into a series of Popen statements, is there a better way to do this? This feels like a problem for which people might have a much better solution than the one I have in mind.
Sidenote: I can grab the program's standard output without problems, however it's the output file that I need to grab.
Bonus: I was planning on running a bunch of instantiations of the program in the same directory, so my naive approach above may start to have unforeseen problems. So perhaps something fancy that watches the PID of ProgramX and follows its output.

To do what your shell script above does, assuming you've only got one ProgramX* in the current directory:
import glob, os
programxoutput = glob.glob('ProgramX*')[0]
os.rename(programxoutput, 'input.output')
If you need to sort by time, etc., there are ways to do that too (look at os.stat), but using the most recent modification date is a recipe for nasty race conditions if you'll be running multiple copies of ProgramX concurrently.
I'd suggest instead that you create and change to a new, perhaps temporary directory for each run of ProgramX, so the runs have no possibility of treading on each other. The tempfile module can help with this.
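
A minimal sketch of the one-directory-per-run idea, assuming ProgramX writes exactly one file with the ProgramX prefix into its working directory. The helper name run_isolated is hypothetical. Because the working directory changes, any input file path in cmd should be absolute.

```python
import glob
import os
import shutil
import subprocess
import tempfile


def run_isolated(cmd, dest_path, prefix="ProgramX"):
    """Run cmd in a fresh temporary directory and move its output file to dest_path."""
    with tempfile.TemporaryDirectory() as workdir:
        subprocess.run(cmd, cwd=workdir, check=True)
        # Only this run wrote here, so any match belongs to it
        matches = glob.glob(os.path.join(workdir, prefix + "*"))
        if len(matches) != 1:
            raise RuntimeError("expected one output file, found %d" % len(matches))
        shutil.move(matches[0], dest_path)
    return dest_path
```

Since every run gets its own directory, several copies can run concurrently without overwriting each other's output.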

Two options that I see:
You could use lsof to list the files that ProgramX currently has open, and so find the file it is writing.
A different approach would be to run ProgramX in a temporary directory (see tempfile for an easy way of setting up directories). Between runs of ProgramX, you can clean that directory, or keep requesting new temp directories if you are planning on running multiple copies of ProgramX at the same time.

If there is only one ProgramX* file, then what about just:
mv ProgramX* input.output

How do I get scons to invoke an external script?

I'm trying to use scons to build a latex document. In particular, I want to get scons to invoke a python program that generates a file containing a table that is \input{} into the main document. I've looked over the scons documentation but it is not immediately clear to me what I need to do.
What I wish to achieve is essentially what you would get with this makefile:
document.pdf: table.tex
	pdflatex document.tex
table.tex:
	python table_generator.py
How can I express this in scons?

Something along these lines should do -
env.Command ('document.tex', '', 'python table_generator.py')
env.PDF ('document.pdf', 'document.tex')
It declares that 'document.tex' is generated by calling the Python script, and requests a PDF document to be created from this generated 'document.tex' file.
Note that this is in spirit only. It may require some tweaking. In particular, I'm not certain what kind of semantics you would want for the generation of 'document.tex': should it be generated every time? Only when it doesn't exist? When some other file changes? (In that case you would want to add this dependency as the second argument to Command().)
In addition, the output of Command() can be used as input to PDF() if desired. For clarity, I didn't do that.
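
For comparison, a sketch of an SConstruct that follows the Makefile in the question more literally, where the script generates table.tex rather than document.tex. This is untested and assumes python and pdflatex can be found by SCons. Listing table_generator.py as the source is an assumption, so that the table is rebuilt when the script changes.

```python
# SConstruct (read by SCons, which provides Environment)
env = Environment()

# table.tex is produced by the script; rebuilt when table_generator.py changes
table = env.Command('table.tex', 'table_generator.py', 'python table_generator.py')

# document.pdf is built from document.tex and must wait for table.tex
pdf = env.PDF('document.pdf', 'document.tex')
env.Depends(pdf, table)
```

The explicit Depends() call states the table.tex prerequisite from the Makefile directly instead of relying on SCons to discover it.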

In this simple case, the easiest way is to just use the subprocess module:
from subprocess import call
# Pass each command as a list so it also works without a shell on POSIX systems
call(["python", "table_generator.py"])
call(["pdflatex", "document.tex"])
Regardless of where in your SConstruct file these lines are placed, they will happen before any of the compiling and linking performed by SCons.
The downside is that these commands will be executed every time you run SCons, rather than only when the files have changed, which is what would happen in your example Makefile. So if those commands take a long time to run, this wouldn't be a good solution.
If you really need to only run these commands when the files have changed, look at the SCons manual section Writing Your Own Builders.
