How to run multiple bash scripts from a Python master script in parallel?

I have a series of time-consuming, independent bash scripts that I want to run in parallel on 7 CPU cores from a Python master script. I tried to implement this using the multiprocessing.Pool.map() function, iterating over the numbers in xrange(1, 300), where each number is used to build the name of a directory containing the bash script to be executed. The issue is that the following script spawns only 7 processes for the bash script runs and finishes right after those are completed.
import multiprocessing
import os

a = os.getcwd()  # gets current path

def run(x):
    b = a + '/' + 'dir%r' % (x)  # appends the name of the targeted folder to the path
    os.chdir(b)  # switches to the targeted directory
    os.system('chmod +x run.sh')
    os.system('./run.sh')  # runs the time-consuming script

if __name__ == "__main__":
    procs = 7
    p = multiprocessing.Pool(procs)
    p.map(run, xrange(1, 300))
    print "====DONE===="
I expect the other 292 shell scripts to be run as well, so what fix or alternative implementation could help me?
Thank you!
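One commonly suggested alternative (not taken from the question itself) is to avoid os.chdir() entirely and let subprocess.run() scope the working directory to each child process via its cwd argument. A minimal Python 3 sketch, assuming the same dir1 … dir299 layout with a run.sh inside each directory:

```python
import glob
import multiprocessing
import os
import subprocess

def run(workdir):
    # cwd= confines the working directory to the child process, so no
    # worker ever calls os.chdir() and mutates its own global state
    subprocess.run(['bash', 'run.sh'], cwd=workdir, check=True)

if __name__ == "__main__":
    # pick up every dir* folder that actually contains a run.sh
    dirs = [d for d in sorted(glob.glob('dir*'))
            if os.path.isfile(os.path.join(d, 'run.sh'))]
    with multiprocessing.Pool(7) as pool:
        pool.map(run, dirs)
    print("====DONE====")
```

With check=True, a failing run.sh raises CalledProcessError instead of failing silently, which makes it easier to see whether all 299 runs really happened.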

Related

How do I run multiple commands with python subprocess( ) without waiting for the end of each command? [duplicate]

This question already has an answer here:
Blocking and Non Blocking subprocess calls
There are two Python scripts involved in this task.
My current task requires me to run a long process (the first Python script, which takes about a day or two each) in each of the 29 available regions on GCP instances. To finish the task as quickly as possible, I'm trying to run each process on each instance all at once, after spinning up the 29 VMs all at once.
Since manually running the first script by SSHing into each instance is cumbersome, I wrote a Python script (the second script) that SSHs into each region's VM and runs the first script mentioned above.
The issue with the second script is that it doesn't start the first script on the second region's VM until the first region's VM has finished running it, whereas I need the second script to start the first script in every region without waiting for the processes it starts to end.
I use the subprocess module in the second script to run the first script on each VM.
The following code is the second script:
for zone, instance in zipped_zone_instance:
    command = "gcloud compute ssh --zone " + zone + " " + instance + " --project cloud-000000 --command"
    command_lst = command.split(" ")
    command_lst.append("python3 /home/first_script.py")
    subprocess.run(command_lst)
I need the subprocess.run(command_lst) to run for every 29 zones at once rather than it running for the second zone only after the first zone's process ends.
The following code is the first script:
for idx, bucket in enumerate(bucket_lst):
    start = time.time()
    sync_src = '/home/' + 'benchmark-' + var_
    subprocess.run(['gsutil', '-m', '-o', 'GSUtil:parallel_composite_upload_threshold=40M', 'rsync', '-r', sync_src, bucket])
    end = time.time() - start
    time_lst.append(end)
    tput_lst.append(tf_record_disk_usage / end)
What can I fix in the second script or the first script to achieve what I want?
Switch out your subprocess.run(command_lst) with Popen(command_lst) (no shell=True is needed when you pass an argument list) in each of your scripts, and loop through the command list as in the example below to run the processes in parallel.
This is how you implement Popen to run processes in parallel, using arbitrary commands for simplicity.
from subprocess import Popen
commands = ['ls -l', 'date', 'which python']
processes = [Popen(cmd, shell=True) for cmd in commands]
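Because Popen returns immediately, the list comprehension above launches every command at once; to block until they have all finished, call wait() on each handle afterwards. A minimal sketch with placeholder commands (not the gcloud commands from the question):

```python
from subprocess import Popen

commands = ['echo first', 'echo second', 'echo third']
# Popen returns immediately, so all commands run concurrently
processes = [Popen(cmd, shell=True) for cmd in commands]
# wait() blocks until each process exits and returns its exit code
exit_codes = [p.wait() for p in processes]
print(exit_codes)
```

The total wall-clock time is then governed by the slowest command, not by the sum of all of them.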

Python: looping a python script in another python script with parameter passing

I am trying to run a .py script in a loop from another .py file, passing a parameter each time.
I am trying the following:
Script1:
import os

lst = [12, 23, 45, 67, 89]
age_lst = []
for i in lst:
    age_i = os.system("python script_to_run {0}".format(int(i)))
    age_lst.append(age_i)
Below is the code for script_to_run.py
Script2: script_to_run.py
import sys

def age(age: int):
    estimated_val = age + 2
    return estimated_val

if __name__ == "__main__":
    my_age = int(sys.argv[1])
    final_age = age(age=my_age)
    print(final_age)
Whenever I run Script 1, which calls Script 2 (script_to_run.py), it runs fine, but age_lst is being populated only with 2.
The expectation is:
age_lst = [14, 25, 47, 69, 91] <--- adding 2 to every element in lst
What am I missing?
Also, when I run Script1.py from cmd, I get the error: python: can't open file 'script_to_run': [Errno 2] No such file or directory
I am using Windows 10.
os.system runs the program and returns its exit code. Your script writes to standard output, which is a different beast entirely. Exactly what os.system returns is OS dependent. On Linux, for example, it's the exit code, limited to 0-255, shifted left, with signal information added. Messy.
But since you are converting the output to a string and printing it to stdout anyway, just have the parent process read that. The subprocess module has several functions that run programs; run is the modern way.
import subprocess as subp
import sys

lst = [12, 23, 45, 67, 89]
age_lst = []
for i in lst:
    proc = subp.run([sys.executable, "script_to_run.py", str(i)],
                    stdout=subp.PIPE)
    if proc.returncode != 0:
        print("script returned error")
    else:
        age_lst.append(int(proc.stdout))
print(age_lst)
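If only the printed value matters, subprocess.check_output shortens this further, since it raises CalledProcessError on a non-zero exit code instead of requiring a returncode test. A sketch using collect_ages, a helper name made up for this example:

```python
import subprocess
import sys

def collect_ages(script, values):
    # check_output captures stdout and raises CalledProcessError
    # automatically if the child exits with a non-zero code
    return [int(subprocess.check_output([sys.executable, script, str(v)]))
            for v in values]
```

With the script_to_run.py above, collect_ages('script_to_run.py', [12, 23, 45, 67, 89]) should yield the expected [14, 25, 47, 69, 91].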

Python script doesn't run background process, when called by cron

I have a Python script, run by cron:
"*/5 * * * * python /home/alex/scripts/checker > /dev/null &";
It has several purposes; one of them is to check for certain programs in the ps list and run them if they are not there. The problem is that when the script is run by cron, it does not execute the programs in the background correctly; all of them appear in the ps list like this:
/usr/bin/python /home/alex/exec/runnable
So they look like Python scripts. When I launch my Python script manually, it seems to execute runnable in the background correctly, but with cron nothing works.
Here's the example of code:
def run_program(file):  # note: 'exec' is a reserved word in Python 2 and cannot be used as a function name
    file = os.path.abspath(file)
    os.system("chmod +x " + file)
    cmd = file
    #os.system(cmd)
    #subprocess.Popen([cmd])
    subprocess.call([cmd])
I tried different approaches but nothing seems to work right.
Some code update:
pids = get_pids(program)
if pids is None:
    run_program(program)
    print 'Restarted'

How to use batch file to run multiple python scripts simultaneously

I have many Python scripts, and it is a pain to run each one of them individually by clicking them. How do I make a batch file to run them all at once?
Just make a script like this, backgrounding each task. On Windows:
start /B python script1.py
start /B python script2.py
start /B python script3.py
on *nix:
python script1.py &
python script2.py &
python script3.py &
This assumes none of your scripts require human interaction to run.
Use the start command to initiate a process.
@echo off
start "" foo.py
start "" bar.py
start "" baz.py
Re comment: “is there a way to start these minimized?”
You can always ask how a command works by typing the command name followed by /?. In this case, start /? tells us that its command-line options include:
MIN Start window minimized.
Hence, to start the application minimized, use:
start "" /MIN quux.py
Multiprocessing .py files simultaneously
Run as many .py files simultaneously as you want. Create a .bat file to start each .py file, and define all the .bat files in the list of lists. The second parameter in each inner list is a delay in seconds before starting that .bat file; don't use zero for the delay. It works fine. This way you leave the parallelism to the operating system, which is very fast and stable. Every .bat you start opens a command window that can be used to interact with the user.
from apscheduler.schedulers.background import BackgroundScheduler
import datetime as dt
from os import system
from time import sleep

parallel_tasks = [[r"Drive:\YourPath\First.bat", 1], [r"Drive:\YourPath\Second.bat", 3]]

def DatTijd():
    Nu = dt.datetime.now()
    return Nu

def GetStartTime(Nu, seconds):
    StartTime = (Nu + dt.timedelta(seconds=seconds)).strftime("%Y-%m-%d %H:%M:%S")
    return StartTime

len_li = len(parallel_tasks)
sleepTime = parallel_tasks[len_li - 1][1] + 3
Nu = DatTijd()
for x in range(0, len_li):
    parallel_tasks[x][0] = 'start cmd /C ' + parallel_tasks[x][0]
    # if you want the command window to stay open after the task finishes, use cmd /K in the line above
    delta = parallel_tasks[x][1]
    parallel_tasks[x][1] = GetStartTime(Nu, delta)

JobShedul = BackgroundScheduler()
JobShedul.start()
for x in range(0, len_li):
    JobShedul.add_job(system, 'date', run_date=parallel_tasks[x][1], misfire_grace_time=3, args=[parallel_tasks[x][0]])
sleep(sleepTime)
JobShedul.shutdown()
exit()
Example.bat
@echo off
Title Python is running [Your Python Name]
cls
echo "[Your Python Name] is starting up ..."
cd Drive:\YourPathToPythonFile
python YourPyFile.py
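The scheduler above boils down to "start each task after its own delay and let them overlap". The same idea can be sketched without apscheduler, using placeholder commands rather than the .bat paths from the answer:

```python
import subprocess
import time

# (delay_in_seconds, command) pairs -- placeholder commands,
# not the .bat files from the answer above
tasks = [(0, ['echo', 'first']), (1, ['echo', 'second'])]

procs = []
start = time.time()
for delay, cmd in sorted(tasks):
    # sleep only the remaining time until this task's start offset
    time.sleep(max(0, delay - (time.time() - start)))
    procs.append(subprocess.Popen(cmd))  # non-blocking: tasks overlap
for p in procs:
    p.wait()
```

As in the apscheduler version, the operating system is left to run the started processes in parallel; the Python side only staggers the launches.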

Running multiple python scripts in a sequence

I have scripts I would like to execute in sequence, with a time delay between each of them.
The intention is to run scripts which scan for a string in file names and import those files into a folder. The time delay is to give each script time to finish copying the files before moving on to the next one.
I have tried the questions already posed on Stackoverflow:
Running multiple Python scripts
Run a python script from another python script, passing in args
But I don't understand why the lines below don't work.
import time
import subprocess
subprocess.call(r'C:\Users\User\Documents\get summary into folder.py', shell=True)
time.sleep(100)
subprocess.call(r'C:\Users\User\Documents\get summaries into folder.py', shell=True)
time.sleep(100)
The script opens the files but doesn't run.
Couple of things: first of all, time.sleep accepts seconds as an argument, so you're waiting 100 seconds after you've spawned these 2 processes; I guess you meant 0.100. Anyway, if you just want to run your 2 scripts synchronously, you're better off using subprocess.Popen.wait; that way you won't have to wait longer than necessary. Example below:
import subprocess

test_cmd = "".join([
    "import time;",
    "print('starting script{}...');",
    "time.sleep(1);",
    "print('script{} done.')"
])

for i in range(2):
    subprocess.Popen(
        ["python", "-c", test_cmd.format(*[str(i)] * 2)]).wait()
    print('-' * 80)
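A direct fix for the question itself, on the assumption that Windows was opening the .py files with their associated program rather than executing them, is to invoke the interpreter explicitly via sys.executable. The helper name run_in_sequence is made up for this sketch; the paths in the usage comment are the ones from the question:

```python
import subprocess
import sys
import time

def run_in_sequence(scripts, delay):
    for script in scripts:
        # sys.executable runs the file with the current Python
        # interpreter instead of whatever Windows associates with .py
        subprocess.run([sys.executable, script], check=True)
        time.sleep(delay)

# e.g. run_in_sequence([r'C:\Users\User\Documents\get summary into folder.py',
#                       r'C:\Users\User\Documents\get summaries into folder.py'],
#                      delay=100)
```

check=True makes a failing script raise CalledProcessError instead of silently moving on to the next one.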
