Subprocess, Popen Python

I want to run a program, let's say MATLAB or other FEA software, from Python, wait for it to finish, store the results, and later use them again in Python for further processing. I am not able to find a really basic example of how to do so. A simple code sample or any useful link will be highly appreciated. The help on the subprocess module seems a bit complicated.

I just spent a while trying to work this out from frustratingly vague documentation and examples and finally got it figured out.
Here's a really simple demo example:
How to run a MATLAB script from Python
(using subprocess.Popen, without having to install the matlab engine)
Step 1:
Create the MATLAB script you want to run. In this demo, I have two scripts, saved in the folder C:/Users/User/Documents/MATLABsubprocess:
triangle_area.m
b = 5;
h = 3;
a = 0.5*(b.* h);
save('a.txt','a', '-ASCII')
triangle_area_fun.m
function [a] = triangle_area_fun(b,h) %function name matches the file name
a = 0.5*(b.* h); %area
save('a.txt','a', '-ASCII')
end
Step 2:
Once these two .m files are created, the following Python script runs them using subprocess.Popen():
#Imports:
import subprocess as sp
import pandas as pd
#Set paths and options:
#note: paths need forward slashes, not backslashes (backslashes act as escape characters in Python strings)
program = 'C:/Program Files/MATLAB/R2017b/bin/matlab.exe' #path to MATLAB exe
folder = 'C:/Users/User/Documents/MATLABsubprocess' #path to MATLAB folder with scripts to run
script = 'triangle_area' #name of script to run
options = '-nosplash -nodesktop -wait' #optional run flags: -nosplash skips the splash screen, -nodesktop keeps MATLAB from opening a desktop window, -wait makes the matlab.exe launcher block until MATLAB is done (pair it with p.wait() after sp.Popen)
has_args = True #set whether the MATLAB script needs arguments (i.e. is it a function?)
#Optional: define arguments to feed to function
if has_args == True:
    script = 'triangle_area_fun' #select script version with arguments
    b = 5
    h = 3
    args = '({},{})'.format(b,h) #put all args into one string
#Set function string:
#Structure: """path_to_exe optional_arguments -r "cd(fullfile('path_to_folder')), script_name, exit" """
#Example: """C:/Program Files/MATLAB/R2017b/bin/matlab.exe -r "cd(fullfile('C:/Users/User/Documents/MATLABsubprocess')), triangle_area, exit" """
#basically, the command string needs the path to the program, then any optional settings; -r runs the MATLAB command that follows: cd changes to the directory with the script, the script name (possibly with arguments) runs it, and exit closes MATLAB
fun = """{} {} -r "cd(fullfile('{}')), {}, exit" """.format(program, options, folder, script) #create function string that tells subprocess what to do
if has_args == True:
    fun = """{} {} -r "cd(fullfile('{}')), {}{}, exit" """.format(program, options, folder, script, args)
print('command:', fun)
#Run MATLAB:
print('running MATLAB script...')
p = sp.Popen(fun) #open the subprocess & run the MATLAB script
p.wait() #wait until MATLAB is done before proceeding (this needs to be paired with -wait in options)
print('done') #if the run is successful, an output file named a.txt should appear in the folder with the MATLAB scripts
#Import MATLAB output files back into Python:
a = pd.read_csv('a.txt', header=None) #read text file using pandas
print(a)
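One refinement worth noting: because the path to matlab.exe contains a space, building the whole command as one string can be fragile. Here is a sketch of the same call passing Popen an argument list instead, so Windows handles the quoting (it reuses the program/folder/script variables from the script above):
#Alternative: pass the command as a list; each option is its own element,
#so the space in "Program Files" cannot break the command line
cmd = [program, '-nosplash', '-nodesktop', '-wait',
       '-r', "cd(fullfile('{}')), {}, exit".format(folder, script)]
p = sp.Popen(cmd)
p.wait()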

Related

How to pass command-line arguments to a python script from within another script?

I've got a Python script named code.py that takes two command-line arguments, --par_value xx --out_dir /path/to/dir, where par_value is a certain parameter and out_dir is the output directory. I want to run this script for different values of the input parameter.
So I want to use the array:
par_vector = np.array([1,2,3])
and have the output file for each named, say, "output_1", "output_2" and "output_3". So I'm trying to write a Python script where I create the array above and an array of output strings, say output_strings, so I can automatically run something like:
for i,j in zip(par_vector, output_strings):
    python code.py --par_value i --out_dir j
But this does not run, and I'm unable to figure out how to automate this rather than repeatedly calling code.py from the terminal. Any advice is welcome!
It might be a bit convoluted, but a way you could do it is to generate either .bat or .sh files (depending on your operating system) and call them with Popen from the subprocess library.
import os
from subprocess import Popen, PIPE
for i, j in zip(par_vector, output_strings):
    cmd = "python code.py --par_value {} --out_dir {}".format(i, j)
    temp_filename = "{}.bat".format(i) #swap .bat for .sh if not on Windows
    with open(temp_filename, 'w') as out_file:
        out_file.write(cmd)
    execution = Popen([temp_filename], stdout=PIPE, stderr=PIPE)
    results = execution.communicate()
    # add conditions based on the output in results if you want to use it to verify, etc.
    os.remove(temp_filename)
Obviously this needs to be changed to match your exact needs for the file name and location, etc.
I would use os.system for this if you're OK with the fact that os.system() is blocking. This would look something like:
import os
for i, j in zip(par_vector, output_strings):
    os.system(f"python code.py --par_value {i} --out_dir {j}")
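Another option that avoids both temporary script files and shell string building is subprocess.run with an argument list. A sketch, assuming par_vector and output_strings are defined as above:
import subprocess
for i, j in zip(par_vector, output_strings):
    # Each argument is a separate list element, so no manual quoting is
    # needed; check=True raises CalledProcessError if code.py fails.
    subprocess.run(["python", "code.py", "--par_value", str(i), "--out_dir", j],
                   check=True)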

Problem with broken backup and python script

Right up front, to be clear: I am not fluent in programming or Python, but can generally accomplish what I need to with some research. Please excuse any bad formatting or structure, as this is my first post to a board like this.
I recently updated my laptop from Ubuntu 18.04 to 20.04. I had created a full system backup with Dejadup which, due to a missing file, could not be restored. Research brought me to a post on here from 2019 about manually restoring these files. The process calls for two scripts, one to unpack and a second to reconstruct the files, both created by Hamish Downer.
The first,
"for f in duplicity-full.*.difftar.gz; do echo "$f"; tar xf "$f"; done"
seemed to work well and did unpack the files.
The second,
#!/usr/bin/env python3
import argparse
from pathlib import Path
import shutil
import sys"
is the start of the reconstructor script. Using a terminal from within the directory I am trying to rebuild, I enter the first line and press return.
When I enter the second line of code, the terminal just "hangs" with no activity, and only comes back to the prompt if I double-click the cursor. I receive no errors or warnings. When I enter the third line of code,
"from pathlib import Path"
and return I then get an error
from: can't read /var/mail/pathlib
The problem seems to originate with the "import argparse" command, and I assume it is due to a symlink.
argparse is located in /usr/local/lib/python3.8/dist-packages (1.4.0)
python3 is located in /usr/bin/
Python came with the Ubuntu 20.04 distribution package.
Any help with reconstructing these files would be greatly appreciated, especially in a batch, as this script is meant to do, versus trying to do them one file at a time.
Update: I have tried adding the "re-constructor" part of this script without success. This is a link to the script I want to use:
https://askubuntu.com/questions/1123058/extract-unencrypted-duplicity-backup-when-all-sigtar-and-most-manifest-files-are
Re-constructor script:
class FileReconstructor():
    def __init__(self, unpacked_dir, restore_dir):
        self.unpacked_path = Path(unpacked_dir).resolve()
        self.restore_path = Path(restore_dir).resolve()

    def reconstruct_files(self):
        for leaf_dir in self.walk_unpacked_leaf_dirs():
            target_path = self.target_path(leaf_dir)
            target_path.parent.mkdir(parents=True, exist_ok=True)
            with target_path.open('wb') as target_file:
                self.copy_file_parts_to(target_file, leaf_dir)

    def copy_file_parts_to(self, target_file, leaf_dir):
        file_parts = sorted(leaf_dir.iterdir(), key=lambda x: int(x.name))
        for file_part in file_parts:
            with file_part.open('rb') as source_file:
                shutil.copyfileobj(source_file, target_file)

    def walk_unpacked_leaf_dirs(self):
        """
        based on the assumption that all leaf files are named as numbers
        """
        seen_dirs = set()
        for path in self.unpacked_path.rglob('*'):
            if path.is_file():
                if path.parent not in seen_dirs:
                    seen_dirs.add(path.parent)
                    yield path.parent

    def target_path(self, leaf_dir_path):
        return self.restore_path / leaf_dir_path.relative_to(self.unpacked_path)

def parse_args(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument(
        'unpacked_dir',
        help='The directory with the unpacked tar files',
    )
    parser.add_argument(
        'restore_dir',
        help='The directory to restore files into',
    )
    return parser.parse_args(argv)

def main(argv):
    args = parse_args(argv)
    reconstuctor = FileReconstructor(args.media/jerry/ubuntu, args.media/jerry/Restored)
    return reconstuctor.reconstruct_files()

if __name__ == '__main__':
    sys.exit(main(sys.argv[1:]))
I think you are typing the commands into the shell instead of the Python interpreter. Check your prompt: Python (started with python3) shows >>>.
Linux has an import command (part of ImageMagick) and accepts import argparse, but it does something completely different:
import - saves any visible window on an X server and outputs it as an
image file. You can capture a single window, the entire screen, or any
rectangular portion of the screen.
This matches the described behaviour. import waits for a mouse click and then creates a large output file. Check if there is a new file named argparse.
An executable script contains instructions to be processed by an interpreter, and there are many possible interpreters: several shells (bash and alternatives), languages like Perl, Python, etc., and also some very specialized ones like nft for firewall rules.
If you execute a script from the command line, the shell reads its first line. If it starts with the characters #! (called a "shebang"), it uses the program listed on that line. (Note: /usr/bin/env there is just a helper to find the exact location of a program.)
But if you want to use an interpreter interactively, you need to start it explicitly. The shebang line has no special meaning in this situation, only as the very first line of a script. Otherwise it is just a comment and is ignored.
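In other words, save the whole reconstructor script to a single file and run it as one program, passing the two directories as arguments; note that main() should read them from args.unpacked_dir and args.restore_dir as defined in parse_args, rather than hard-coding paths into the attribute access. A sketch of the invocation, assuming the script is saved as reconstruct.py and using the paths from the question:
chmod +x reconstruct.py        # make it executable so the shebang takes effect
./reconstruct.py /media/jerry/ubuntu /media/jerry/Restored
# or, equivalently, start the interpreter explicitly:
python3 reconstruct.py /media/jerry/ubuntu /media/jerry/Restored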

batch file to convert .mp4 to .mp3 crashes half the time

I am using a batch file to access my portable VLC executable to convert an mp4 to an mp3:
set arg1=%1 REM -> arg1={my_mp4_full_path}
set arg2=%2 REM -> arg2={my_mp3_full_path}
echo %arg1%
echo %arg2%
REM batch file is in the same directory as "VLCPlayer" folder
"%~dp0\VLCPlayer\VLCPortable.exe" -I dummy %arg1% --sout=#transcode{acodec=mp3,ab=128,vcodec=dummy}:std{access="file",mux="raw",dst=%arg2%} vlc://quit
When I run this script the first time, VLC crashes and I get an unplayable mp3 file; however, when I run the script again, it works and I get a playable mp3. Is there a way to remedy this, or make it consistent? I don't see why running it twice would yield different outcomes.
No, I don't have ffmpeg on my computer; it is not recognized as an internal or external command.
Note that I face the same problem when using powershell to perform the same task, when I import my function from a .psm1 script:
function ConvertToMp3(
[switch] $inputObject,
[string] $vlc = '{PAth_TO_PORTABLE_VLC}\VLCPortable.exe')
{
PROCESS {
$codec = 'mp3';
$oldFile = $_;
$newFile = $oldFile.FullName.Replace($oldFile.Extension, ".$codec").Replace("'","");
&"$vlc" -I dummy "$oldFile" ":sout=#transcode{acodec=$codec,
vcodec=dummy}:standard{access=file,mux=raw,dst=`'$newFile`'}" vlc://quit | out-null;
# delete the original file
Remove-Item $oldFile;
}
}
I get the same erratic behavior: sometimes it works, sometimes it crashes.
Update:
I feel like I should add more info on how I use the batch file:
I have a Python script Convert.py, and I call my batch file inside it using os.system():
mp4_to_convert = arguments.file
full_path_mp4 = os.path.join(outdir,mp4_to_convert)
mp3_to_convert_to = mp4_to_convert.replace(".mp4",".mp3")
full_path_mp3 = os.path.join(outdir,mp3_to_convert_to)
command_string = """Convert_Script.bat \"{}\" \"{}\"""".format(full_path_mp4, full_path_mp3)
os.system(command_string)
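As an aside, the backslash-escaped quotes can be avoided by calling the batch file through subprocess.run with an argument list instead of os.system. A sketch using the same full_path_mp4 / full_path_mp3 variables:
import subprocess
# cmd /c runs the batch file; each argument is its own list element,
# so paths containing spaces need no hand-built quoting.
subprocess.run(["cmd", "/c", "Convert_Script.bat", full_path_mp4, full_path_mp3])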
This is the documentation of os.system():
os.system(command)
Execute the command (a string) in a subshell. This
is implemented by calling the Standard C function system(), and has
the same limitations. Changes to sys.stdin, etc. are not reflected in
the environment of the executed command. If command generates any
output, it will be sent to the interpreter standard output stream.
On Unix, the return value is the exit status of the process encoded in
the format specified for wait(). Note that POSIX does not specify the
meaning of the return value of the C system() function, so the return
value of the Python function is system-dependent.
On Windows, the return value is that returned by the system shell
after running command. The shell is given by the Windows environment
variable COMSPEC: it is usually cmd.exe, which returns the exit status
of the command run; on systems using a non-native shell, consult your
shell documentation.
Any pointers or suggestions would be helpful, thank you in advance for your help.

Running .vbs script inside Python doesn't do anything

Idea
Basically, what my script does is check C:/SOURCE for .txt files and add a timestamp to each. To replicate it you can basically make that folder and put some txt files in there. Then, it's supposed to run a .vbs file, which then runs a .bat file with some rclone commands which don't matter here. I did it like this because there won't be a CMD window opening when running the rclone command through the .vbs file.
Python code
import time, os, subprocess

while True:
    print("Beginning checkup")
    print("=================")
    timestamp = time.strftime('%d_%m_%H_%M') # only underscores: no naming issues
    the_dir = "C:/SOURCE"
    for fname in os.listdir(the_dir):
        if fname.lower().endswith(".txt"):
            print("found " + fname)
            time.sleep(0.1)
            new_name = "{}-{}.txt".format(os.path.splitext(fname)[0], timestamp)
            os.rename(os.path.join(the_dir, fname), os.path.join(the_dir, new_name))
            time.sleep(0.5)
    else:
        subprocess.call(['cscript.exe', "copy.vbs"])
    time.sleep(60)
VBScript code
Set WshShell = CreateObject("WScript.Shell" )
WshShell.Run Chr(34) & "copy.bat" & Chr(34), 0
Set WshShell = Nothing
The only important part for the Python script is below the very last else, where subprocess.call() is supposed to run the .vbs file. When I run the script, it shows the first two lines that always come up when running CMD, but then nothing happens.
How could I fix that? I tried:
subprocess.call("cscript copy.vbs")
subprocess.call("cmd /c copy.vbs")
both with the same outcome: it doesn't do anything.
Anyone have an idea?
Why are you invoking a VBScript to invoke a batch script from Python? You should be able to simply run whatever the batch script is doing directly from your Python code. But even if you wanted to keep the batch script, something like this should do just fine without VBScript as an intermediary:
subprocess.call(['cmd', '/c', 'copy.bat'])
You may want to give the full path of the batch file, though, to avoid issues like the working directory not being what you think it is.
If your batch script resides in the same directory as the Python script, you can build the path with something like this:
import os
import subprocess
scriptdir = os.path.dirname(__file__)
batchfile = os.path.join(scriptdir, 'copy.bat')
subprocess.call(['cmd', '/c', os.path.realpath(batchfile)])
It seems there is no operation here that could not be done in plain Python. Scan a directory, copy a file: Python has it all in the standard library. See the os.path and shutil modules.
Adding VB scripts and launching subprocesses makes your code complex and difficult to debug.
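For example, if copy.bat only copies the renamed files somewhere, the whole rename-and-copy job fits in one loop. A minimal sketch in plain Python (C:/DEST is a hypothetical destination, standing in for whatever copy.bat actually does):
import os, shutil, time

the_dir = "C:/SOURCE"
dest_dir = "C:/DEST"  # hypothetical destination folder

timestamp = time.strftime('%d_%m_%H_%M')
for fname in os.listdir(the_dir):
    if fname.lower().endswith(".txt"):
        new_name = "{}-{}.txt".format(os.path.splitext(fname)[0], timestamp)
        os.rename(os.path.join(the_dir, fname), os.path.join(the_dir, new_name))
        # copy the renamed file directly instead of shelling out to a script
        shutil.copy2(os.path.join(the_dir, new_name), os.path.join(dest_dir, new_name))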

Calling a subprocess within a script using mpi4py

I’m having trouble calling an external program from my python script in which I want to use mpi4py to distribute the workload among different processors.
Basically, I want to use my script such that each core prepares some input files for calculations in separate folders, then starts an external program in this folder, waits for the output, and then, finally, reads the results and collects them.
However, I simply cannot get the external program call to work. On my search for a solution to this problem I've found that the problems I'm facing seem to be quite fundamental. The following simple example makes this clear:
#!/usr/bin/env python
import subprocess
subprocess.call("EXTERNAL_PROGRAM", shell=True)
subprocess.call("echo test", shell=True)
./script.py works fine (both calls work), while mpirun -np 1 ./script.py only outputs test. Is there any workaround for this situation? The program is definitely in my PATH, but it also fails if I use the absolute path for the call.
This SO question seems to be related, sadly there are no answers...
EDIT:
In the original version of my question I’ve not included any code using mpi4py, even though I mention this module in the title. So here is a more elaborate example of the code:
#!/usr/bin/env python
import os
import subprocess
from mpi4py import MPI

def worker(parameter=None):
    """Make new folder, cd into it, prepare the config files and execute the
    external program."""
    cwd = os.getcwd()
    dir = "_calculation_" + parameter
    dir = os.path.join(cwd, dir)
    os.makedirs(dir)
    os.chdir(dir)
    # Write input for simulation & execute
    subprocess.call("echo {} > input.cfg".format(parameter), shell=True)
    subprocess.call("EXTERNAL_PROGRAM", shell=True)
    # After the program is finished, do something here with the output files
    # and return the data. I'm using the input parameter as a dummy variable
    # for the processed output.
    data = parameter
    os.chdir(cwd)
    return data

def run_parallel():
    """Iterate over job_args in parallel."""
    comm = MPI.COMM_WORLD
    size = comm.Get_size()
    rank = comm.Get_rank()
    if rank == 0:
        # Here should normally be a list with many more entries, subdivided
        # among all the available cores. I'll keep it simple here, so one has
        # to run this script with mpirun -np 2 ./script.py
        job_args = ["a", "b"]
    else:
        job_args = None
    job_arg = comm.scatter(job_args, root=0)
    res = worker(parameter=job_arg)
    results = comm.gather(res, root=0)
    print res
    print results

if __name__ == '__main__':
    run_parallel()
Unfortunately I cannot provide more details of the external executable EXTERNAL_PROGRAM other than that it is a C++ application which is MPI-enabled. As written in the comment section below, I suspect that this is the reason (or one of the reasons) why my external program call is basically ignored.
Please note that I'm aware nobody can reproduce my exact situation. Still, I was hoping that someone here has already run into similar problems and might be able to help.
For completeness, the OS is Ubuntu 14.04 and I’m using OpenMPI 1.6.5.
In your first example you might be able to do this:
#!/usr/bin/env python
import subprocess
subprocess.call("EXTERNAL_PROGRAM && echo test", shell=True)
The python script is only facilitating the MPI call. You could just as well write a bash script with command “EXTERNAL_PROGRAM && echo test” and mpirun the bash script; it would be equivalent to mpirunning the python script.
The second example will not work if EXTERNAL_PROGRAM is MPI-enabled. Using mpi4py initializes the MPI environment, and you cannot launch another MPI program once you have initialized the MPI environment in such a manner. You could spawn a child program using MPI_Comm_spawn or MPI_Comm_spawn_multiple and the -up option to mpirun. For mpi4py, refer to the Compute PI example for spawning (use MPI.COMM_SELF.Spawn).
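A minimal sketch of that spawning approach with mpi4py, assuming EXTERNAL_PROGRAM can be launched as a spawned MPI child (the program name and process count are placeholders from the question):
from mpi4py import MPI

# Spawn the MPI-enabled program as a child job instead of shelling out
# with subprocess; this stays inside the already-initialized MPI world.
child = MPI.COMM_SELF.Spawn('EXTERNAL_PROGRAM', args=[], maxprocs=1)

# ...exchange data with the child here if it expects any...

child.Disconnect()  # drop the intercommunicator once both sides are done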
