I am writing a check.py file which reads a file that maps Python files to their expected output. That file looks something like this:
001.py 233168
002.py 4613732
This means that 001.py, when run, should print 233168. What is the best way to capture stdout from the 00*.py scripts? Overriding stdout and using execfile, or using a subprocess?
I have never done anything like this before, but it seems like subprocess.check_output does exactly what I want. Is there a more appropriate way of doing this?
A subprocess. That way the script being executed cannot disrupt the calling script too badly.
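For what it's worth, here is a minimal sketch of what check.py could look like with subprocess.check_output. The mapping file name, answers.txt, is made up; substitute your own:

import subprocess

# Each line of the mapping file is "<script> <expected output>".
with open("answers.txt") as f:   # hypothetical file name
    for line in f:
        script, expected = line.split()
        # Run the script in its own interpreter and capture its stdout.
        actual = subprocess.check_output(["python", script]).decode().strip()
        print(script, "OK" if actual == expected else "FAIL (got %r)" % actual)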
I wrote a Python script that works. The first line of my script reads an HDF5 file:
readFile = h5py.File('FileName_00','r')
After reading the file, my script does several mathematical operations, all working successfully. The output is a function F.
Now, I want to repeat the same script for different files. Basically, I only need to change FileName_00 to FileName_01, ..., FileName_10. I was thinking of creating a script that calls this script!
I have never written a script that calls another script, so any advice would be appreciated.
One option: turn your existing code into a function which takes a filename as an argument:
def myfunc(filename):
    readFile = h5py.File(filename, 'r')
    ...
Now, after your existing code, call your function with the filenames you want to input:
myfunc('FileName_00')
myfunc('FileName_01')
myfunc('FileName_02')
...
Even more usefully, I definitely recommend looking into
if __name__ == '__main__':
and argparse (https://docs.python.org/3/library/argparse.html), as jkr noted.
Also, if you put your algorithm in a function like this, you can import it and use it in another Python script. Very useful!
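For illustration, a minimal sketch of how the __main__ guard and argparse could fit around the function (the argument name is my choice, not from the question):

import argparse
import h5py

def myfunc(filename):
    readFile = h5py.File(filename, 'r')
    ...  # your existing mathematical operations

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('filename', help='HDF5 file to process')
    args = parser.parse_args()
    myfunc(args.filename)

Now python myscript.py FileName_00, python myscript.py FileName_01, etc. each run the whole pipeline on one file.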
Although there are certainly many ways to achieve what you want without multiple Python scripts, as other answerers have shown, here's how you could do it.
In Python we have the function os.system (learn more about it here: https://docs.python.org/3/library/os.html#os.system). Simply put, you can use it like this:
os.system("INSERT COMMAND HERE")
Replacing INSERT COMMAND HERE with the command you use to run your Python script. For example, with a script named script.py you could conceivably (depending on your environment) include the following line of code in a secondary Python script:
os.system("python script.py")
Running the secondary Python script would run script.py as well. FWIW, I don't necessarily think this is the best way to accomplish your goal -- I tend to agree with DraftyHat's solution in most circumstances. But in case you were curious, this is certainly an option in Python. I've used this functionality in the past, albeit not to run other Python scripts, but to execute commands in the shell. Hope this helps!
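For completeness, assuming script.py was changed to accept the filename as a command-line argument (my assumption, as in the argparse sketch above), the secondary script could loop over the files:

import os

# Run script.py once per input file, FileName_00 through FileName_10.
# Assumes script.py takes the filename as its first argument.
for i in range(11):
    os.system("python script.py FileName_%02d" % i)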
I am using a 3rd-party Python module which is normally called through terminal commands. When called through terminal commands it has a verbose option which prints to the terminal in real time.
I then have another Python program which calls the 3rd-party program through subprocess. Unfortunately, when called through subprocess the terminal output no longer flushes, and is only returned on completion (the process takes many hours, so I would like real-time progress).
I can see the source code of the 3rd-party module, and it does not flush its printing (e.g. it never calls print('example', flush=True)). Is there a way to force the flushing from my module without editing the 3rd-party source code? Furthermore, can I send this output to a log file (again, in real time)?
Thanks for any help.
The issue is most likely that many programs work differently when run interactively in a terminal than when run as part of a pipeline (i.e. called using subprocess). It has very little to do with Python itself, and more with the Unix/Linux architecture.
As you have noted, it is possible to force a program to flush stdout even when run in a pipeline, but it requires changes to the source code, by manually adding stdout.flush() calls.
Another way to print to the screen is to "trick" the program into thinking it is working with an interactive terminal, using a so-called pseudo-terminal. There is a supporting module for this in the Python standard library, namely pty. Using that, you will not explicitly call subprocess.run (or Popen or ...). Instead you have to use the pty.spawn call:
import os
import pty

def prout(fd):
    # Drain the child's output from the pseudo-terminal until EOF.
    data = os.read(fd, 1024)
    while data:
        print(data.decode(), end="")
        data = os.read(fd, 1024)

pty.spawn("./callee.py", prout)
As can be seen, this requires a special function for handling stdout. Here above, I just print it to the terminal, but of course it is possible to do other things with the text as well (such as logging or parsing it).
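For instance, to also get the real-time log file you asked about, the callback could append each chunk to a file (the name callee.log is made up):

import os
import pty

log = open("callee.log", "w")

def prout(fd):
    data = os.read(fd, 1024)
    while data:
        text = data.decode()
        print(text, end="")   # live output to the terminal
        log.write(text)       # real-time copy into the log file
        log.flush()           # flush so the file updates immediately
        data = os.read(fd, 1024)

pty.spawn("./callee.py", prout)
log.close()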
Another way to trick the program is to use an external program called unbuffer. Unbuffer will take your script as input and make the program think (as with the pty call) that it is called from a terminal. This is arguably simpler if unbuffer is installed, or you are allowed to install it on your system (it is part of the expect package). All you have to do then is change your subprocess call to
p=subprocess.Popen(["unbuffer", "./callee.py"], stdout=subprocess.PIPE)
and then of course handle the output as usual, e.g. with some code like
for line in p.stdout:
    print(line.decode(), end="")
print(p.communicate()[0].decode(), end="")
or similar. But this last part I think you have already covered, as you seem to be doing something with the output.
I wrote a Python script which contains several print statements. The printed information helps me monitor the progress of the script. But when I qsub the bash script, which contains python my_script &> output, onto computing nodes, the output file contains nothing even while the script is running and printing. The output file only contains the output once the script is done. So how can I get the output in real time through the output file while the script is running?
Actually write to a file rather than piping, and flush after each write; or, after each write, call sys.stdout.flush(). But you are better off using a logger and replacing the prints with log calls.
From Comments:
A logger function is one that you call instead of print; it outputs the text somewhere, possibly timestamped and with other information. Loggers usually let you send varying amounts of information to various destinations, including stdout and files. See the Python 2 or 3 documentation for information on Python's built-in logging module.
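A minimal sketch with Python's built-in logging module (the file name and format are just examples):

import logging

# Log to both stderr and a file; each record is flushed as it is written,
# so the file updates while the script is still running.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(message)s",
    handlers=[logging.StreamHandler(), logging.FileHandler("progress.log")],
)

logging.info("step 1 done")   # instead of print("step 1 done")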
I like to write data to sys.stderr sometimes for this sort of thing. It obviates the need to flush so much. But if you're generating output for piping sometimes, you remain better off with sys.stdout.
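For example:

import sys

# stderr is not block-buffered the way a redirected stdout is,
# so this shows up immediately without an explicit flush.
print("processed 1000 records so far", file=sys.stderr)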
Okay, so, I just have a quick question regarding Python and Linux.
I have a program that collects and outputs data to stdout indefinitely. I need to parse this data, and I have a python program I wrote that will do just that. However, I cannot save this data to a file first, as it produces far too much output to save to disk. Is there any way to use redirects to somehow pipe this output into the program?
Example:
python parser.py < ./dataCollector.sh
Close, but you want an actual pipe not a shell redirect:
./dataCollector.sh | python parser.py
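parser.py then just reads standard input, e.g. something along these lines (process() stands in for your existing parsing code):

import sys

def process(line):
    ...  # your existing parsing logic goes here

# Consume the collector's output line by line as it arrives,
# without ever writing it to disk.
for line in sys.stdin:
    process(line)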
I have tests that can take up to 15 minutes at a time to run. During these 15 minutes, a log file is periodically written to. However, most of the content is useless.
In response to this I have a Python script that parses out the useless text and displays the relevant data.
What I'm trying to achieve is similar to tail -f log_file: constantly updating the terminal with the newest additions to the file. I was thinking that if a Python script ran as a process, it could parse the log file whenever the tests write to it, and then go to sleep until the log file is written to again.
Any ideas how one can achieve this?
I already have a script that does the parsing; I just don't know how to make it do so continually and efficiently.
You could just have the script filter standard input, and pipe tail -f through it. When you're waiting on stdin, your script will sleep, so it's plenty efficient.
E.g. (start the tests in the background so the tail runs alongside them):
python long_running_script.py &
tail -f log_file | python filter_logs.py
Your script can be something like
import sys

for line in sys.stdin:        # blocks (sleeps) while waiting for input
    if filter_line(line):     # filter_line() = your existing parsing logic
        print(line, end="")
Looks like you need something like pytailer:
http://code.google.com/p/pytailer/
While I never used it myself, the last example looks like what you want.
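Based on the example on that project page, usage would be something like:

import tailer

# Follow the log file like tail -f, yielding lines as the tests append them.
for line in tailer.follow(open('log_file')):
    print(line)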
"Any ideas how one can achieve this?"
This should be pretty easy to do. Most of what you want is already part of your OS.
python test.py | python log_parser.py
Be sure your tests write their log to stdout instead of some other file. This is often easy to do with small changes to the logging configuration.
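For example, if the tests use Python's logging module, pointing it at stdout can be a one-line configuration change (a sketch, assuming basicConfig is in play):

import logging
import sys

# Send log records to stdout so they can be piped straight into log_parser.py.
logging.basicConfig(stream=sys.stdout, level=logging.INFO)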
Having implemented almost this exact tool, I had great success using the inotify capability in Twisted.
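A rough Linux-only sketch of that approach with twisted.internet.inotify (the watched path and the callback body are placeholders):

from twisted.internet import inotify, reactor
from twisted.python.filepath import FilePath

def on_change(ignored, path, mask):
    # Called on every event for the watched file; re-run your parsing here.
    print("event %s on %s" % (', '.join(inotify.humanReadableMask(mask)), path.path))

notifier = inotify.INotify()
notifier.startReading()
notifier.watch(FilePath("log_file"), callbacks=[on_change])
reactor.run()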