How to grab files generated by a subprocess? - python

I want to run some command line scripts from within my Python program. These scripts generate some output files. I want to grab those output files from the subprocess call as objects in my Python program, without the files ever being written to disk. The problem is I don't know how to do that, or whether it is even possible.
A simple example would look like this:
#foo.py
fout1 = open("temp1.txt","w")
fout2 = open("temp2.txt","w")
fout1.write("fout1")
fout2.write("fout2")
fout1.close()
fout2.close()
#test.py
import subprocess
process = subprocess.Popen(["python","foo.py"], ????????) #what arguments to use to grab temp1.txt and temp2.txt
print(process.??????) #how to access those files
I am familiar with subprocess.Popen, so that is what the example code uses, but I am open to using other modules too if they can do it.
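As far as I know, Popen has no argument that hands a child's output files back as Python objects; the writes happen entirely on the child's side. One workaround, sketched below under the assumption that foo.py writes its files into whatever directory it is started in, is to point the child at a throwaway directory, read the files into memory, and let the directory (and the on-disk files) be deleted automatically:
# sketch of one possible test.py
import os
import subprocess
import tempfile
script = os.path.abspath("foo.py")
outputs = {}
with tempfile.TemporaryDirectory() as tmpdir:
    subprocess.run(["python", script], cwd=tmpdir, check=True)
    for name in os.listdir(tmpdir):
        with open(os.path.join(tmpdir, name)) as fh:
            outputs[name] = fh.read()  # keep the contents as in-memory strings
print(outputs)  # e.g. {'temp1.txt': 'fout1', 'temp2.txt': 'fout2'}
Once the with block exits, nothing is left on disk; the data only lives in the outputs dict.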

Related

Why does python3.7 pass argv elements alongside 'garbage'

I had this script working for me before I decided to rewrite everything and make it portable.
Without delving too much into the details, there's a central Bash script which calls 5 other Bash scripts in their own respective folders. I have no intention of porting to Windows anytime soon; for now this is Linux only.
The execution path of the central Bash script is:
dos.1/1-init.sh dos.1/
dos.2/1-trace-to-file.sh dos.2/ dos.1/
dos.3/1-recognize-categories.sh dos.3/
dos.4/1-ping-in-groups.sh dos.4/ dos.3/
dos.5/init.sh dos.5/ dos.4/
I run with ./init.sh
Before the script was 'portable' I was using explicit file paths inside each respective script. All was well and good. The program itself is a combination of Bash and Python, and writes to files in one directory, so that they can be manipulated in various ways, before being read back into different parts of the program.
I understand that the fastest way to do this would be to write a monolithic Python script, using subprocess calls for the Bash side of things... However, I am doing it this way to ease maintenance, and (before I started making it 'portable') it was lightning fast.
My issue now is this: each time I have to read text into Python (either from SQL or from file) there's always this added garbage. Up until this point, I have been using sed, awk and Python's .rstrip() function to manage this... Which is all well and good, but this one damn function will not play nice... And I feel there must be a better way.
In bash I call it with:
prog_dir=$1
data_dir=$2
$prog_dir/2fast-ping.py $data_dir/group0.txt > $prog_dir/group0_averages.txt
$prog_dir/2fast-ping.py $data_dir/group1.txt > $prog_dir/group1_averages.txt
...
Now I know that I could write to file from within Python, but in this instance I have other reasons not to.
The issue is that when the 2fast-ping.py script is run, it reads the text file in with commas and a newline char. I have rigorously checked and can confirm that the group#.txt files 100% do not contain commas. Here's the Python:
import sys
import subprocess
import select
from concurrent.futures import ThreadPoolExecutor
filename = sys.argv[1]
f = open(filename, "r")
ips = [elem.rstrip('\n') for elem in f]
print(ips)
f.close()
The script goes on to do some work on the IPs afterwards, but this is the painful part. If I call the script directly from the CLI: ./2fast-ping.py ../dos.3/group0.txt, the text is processed PROPERLY and the subsequent instructions actually work. But when it is called from the first init script, the program basically sh*ts itself because each line is read in with commas. It works until the point where it starts to use the processed info, then:
<actual IP would be here>
ping: ('##.###.###.###',): Name or service not known
Of course, the issue is the ('',) part. But Python is adding that in, and I don't know how to stop it :(
Any ideas?
The Python code was okay; I was just passing an additional / with the argument :(
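For what it's worth, if the stray separator really did come in through the directory argument, a small defensive tweak on the Python side makes the script indifferent to how the caller glues paths together: os.path.normpath collapses things like dos.3//group0.txt. A sketch of the top of 2fast-ping.py with that change:
import os
import sys
filename = os.path.normpath(sys.argv[1])  # collapses duplicate path separators
with open(filename) as f:
    ips = [line.rstrip('\n') for line in f]
print(ips)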

Linking multiple python scripts to run one after another

I have three Python scripts. One gathers data from the database (data_for_report.py), another generates a report from that data and creates an .xlsx file (report_gen.py), and the last one modifies the style of that Excel file (excel_style.py).
Right now all three files are in the same directory, and what I do is simply execute the scripts one after another in the interpreter to get the report. I want to make everything work with one click so the people who need this report can generate it themselves. I thought of creating an exe with pyinstaller, but I cannot think of a way to link my scripts together so that when data_for_report.py finishes its job, report_gen.py is started, and so on.
I tried to put
subprocess.call("report_gen.py", shell=True)
at the end of the first script, but nothing happens, I just get this:
Out[2]: 1
How could I do this?
Actually, this problem can be solved with a batch file: your Python files will run in sequence, one after the other. I am assuming all three Python files reside in a folder named ReportGenerator with the path C:\ReportGenerator, so adjust the path to match your system (and mind the \ vs / in the path to the folder holding the Python files).
The files that need to be executed are:
data_for_report.py
report_gen.py
excel_style.py
Now open Notepad and write the lines below.
cd C:/ReportGenerator
python data_for_report.py
python report_gen.py
python excel_style.py
PAUSE
Now save this file as file_Name.bat anywhere you want on the system and remember where. After saving, a batch file icon will appear.
Now open a Windows command prompt and just drag this batch file onto it.
Why not encapsulate the logic of each script in a function, make a new file which imports all three functions, and then run that script?
So if the scripts are
data_for_report.py
def f1():
    ...
report_gen.py
def f2():
    ...
excel_style.py
def f3():
    ...
Then the final script which you will run is:
from data_for_report import f1
from report_gen import f2
from excel_style import f3
f1()
f2()
f3()
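If you would rather keep the subprocess route (so each script also stays runnable on its own), a minimal sketch along the same lines, assuming all three scripts sit next to the runner, is:
# run_report.py (sketch)
import subprocess
import sys
from pathlib import Path
HERE = Path(__file__).resolve().parent
for script in ("data_for_report.py", "report_gen.py", "excel_style.py"):
    # check=True aborts the chain if any step exits with a non-zero code
    subprocess.run([sys.executable, str(HERE / script)], check=True)
The Out[2]: 1 you saw is simply the return code of subprocess.call; a non-zero value means the command failed rather than doing nothing, and invoking the scripts through sys.executable as above avoids relying on the shell to figure out how to run a .py file.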

From a Python script, run all Python scripts in a directory and stream output

My file structure looks like this:
runner.py
scripts/
    something_a/
        main.py
        other_file.py
    something_b/
        main.py
        anythingelse.py
    something_c/
        main.py
    ...
runner.py should look at all folders in scripts/ and run the main.py located there.
Right now I'm achieving this through subprocess.check_output. It works, but some of these scripts take a long time to run and I don't get to see any progress; it prints everything after the process has finished.
I'm hoping to find a solution that allows for 2 things to be done somewhat easily:
1) Stream the output instead of getting it all at the end
2) Doesn't prohibit running multiple scripts at once
Is this possible? A lot of the solutions I've seen for running a Python script from another require knowledge of the other script's name/location. I can also enforce that all the main.py's have a specific function if that helps.
You could use Popen to run each script and write its output to its own log file. Then you could read from these files in real time, while each one is being populated. :)
How you would translate the output into a more readable format is a little trickier. You could create another script which reads these log files and decide how you'd like that information presented in an understandable manner.
""" Use the same command as you would do for check_output """
cmd = ''
for filename in scriptList:
log = filename + ".log"
with io.open(filename, mode=log) as out:
subprocess.Popen(cmd, stdout=out, stderr=out)
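If you would rather see the progress directly in the terminal instead of tailing log files, a minimal sketch (assuming each main.py reports its progress on stdout) is to read each child's pipe line by line; running several at once then just means handing each stream() call to a ThreadPoolExecutor or similar.
# runner.py (sketch)
import subprocess
import sys
from pathlib import Path

def stream(main_py):
    # -u keeps the child unbuffered so lines show up as soon as they are printed
    proc = subprocess.Popen(
        [sys.executable, "-u", str(main_py)],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    for line in proc.stdout:
        print(f"[{main_py.parent.name}] {line}", end="")
    return proc.wait()

for main_py in sorted(Path("scripts").glob("*/main.py")):
    stream(main_py)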

Python multiple instances of program writing and then reading the same file

I wrote a piece of Python code that calls an external program to write an intermediate file, and my code then reads from it. I want to run multiple instances of my code simultaneously. Will there be any conflict if I code it like this?
args = ['/usr/bin/program', '-o', 'intermediate_file']
process = subprocess.Popen(args, shell=False)
process.wait()
if process.returncode == 0:
    fh = open('intermediate_file', 'r')
    handle_results(fh)  # placeholder for the code that consumes the intermediate file
    ...
Concurrent file access is handled by the operating system. There are several scenarios, depending on the OS and/or filesystem you use. Take a look at the Wikipedia article.
Take a look here: tempfile
You can use this module to avoid conflicts: temp files get random names.
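A minimal sketch of that idea, assuming /usr/bin/program will happily write to whatever path follows -o: each running instance gets its own uniquely named intermediate file, so concurrent instances can no longer step on each other.
import os
import subprocess
import tempfile
fd, path = tempfile.mkstemp(suffix='.out')   # unique path per instance
os.close(fd)                                 # the external program rewrites the file
try:
    subprocess.run(['/usr/bin/program', '-o', path], check=True)
    with open(path, 'r') as fh:
        data = fh.read()                     # no other instance uses this path
finally:
    os.remove(path)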

Python run command line (time)

I want to run the 'time' unix command from a Python script, to time the execution of a non-Python app. I would use the os.system method.
Is there any way to save the output of this in Python? My goal is to run the app several times, save their execution times and then do some statistics on them.
Thank You
You really should be using the subprocess module to run external commands. The Popen() constructor lets you specify a file object where stdout should go (note: it needs to be a real file with an underlying file descriptor, such as one returned by open(), rather than an arbitrary file-like object).
For instance:
import subprocess
log_file = open('/path/to/file', 'a')
return_code = subprocess.Popen(['/usr/bin/foo', 'arg1', 'arg2'], stdout=log_file).wait()
You can use the timeit module to time code snippets (say, a function that launches external commands using the subprocess module as described in the answer above) and save the data to a CSV file. You can do statistics on the CSV data using a stats module, or externally using Excel, LogParser, R, etc.
Another approach is to use the hotshot profiler, which does the profiling and also returns stats that you can either print with the print_stats() method or save to a file by iterating over them (note that hotshot only exists in Python 2; on Python 3, cProfile is the equivalent).
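If parsing the output of time turns out to be awkward, another option (a sketch, with /usr/bin/foo standing in for the non-Python app) is to take wall-clock timings from Python itself and feed them straight into the statistics module:
import statistics
import subprocess
import time
samples = []
for _ in range(10):
    start = time.perf_counter()
    subprocess.run(['/usr/bin/foo', 'arg1', 'arg2'], check=True)
    samples.append(time.perf_counter() - start)   # wall-clock seconds for this run
print(statistics.mean(samples), statistics.stdev(samples))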
