I'm trying to use multiprocessing to run multiple scripts. At the start, I launch a loading animation, however I am unable to ever kill it. Below is an example...
Animation: foo.py
import sys
import time
import itertools
# Simple loading animation that runs infinitely.
for c in itertools.cycle(['|', '/', '-', '\\']):
    sys.stdout.write('\r' + c)
    sys.stdout.flush()
    time.sleep(0.1)
Useful script: bar.py
from time import sleep
# Stand-in for a script that does something useful.
sleep(5)
Attempt to run them both:
import multiprocessing
from multiprocessing import Process
import subprocess
pjt_dir = "/home/solebay/path/to/project" # Setup paths..
foo_path = pjt_dir + "/foo.py" # ..
bar_path = pjt_dir + "/bar.py" # ..
def run_script(path): # Simple function that..
    """Launches python scripts.""" # ..allows me to set a..
    subprocess.run(["python", path]) # ..script as a process.
foo_p = Process(target=run_script, args=(foo_path,)) # Define the processes..
bar_p = Process(target=run_script, args=(bar_path,)) # ..
foo_p.start() # start loading animation
bar_p.start() # start 'useful' script
bar_p.join() # Wait for useful script to finish executing
foo_p.kill() # Kill loading animation
I get no error messages, and the prompt (my_venv) solebay#computer:~$ comes back in my terminal, but the loading animation persists (overwriting my name and environment). How can I kill it?
I've run into a similar situation before where I couldn't terminate the program using Ctrl+C. The issue is (more or less) solved by using daemonic processes/threads (see the multiprocessing docs). To do this, you simply change
foo_p = Process(target=run_script, args=(foo_path,))
to
foo_p = Process(target=run_script, args=(foo_path,), daemon=True)
and similarly for any other child processes that you would like to create.
That said, I am not sure whether this is the correct remedy for a multiprocessing program that cannot be terminated, or just a side effect that happens to help. I would suggest this thread, which discusses daemon threads in more detail. From my understanding, daemonic threads and processes are terminated automatically when their parent process exits, whether or not they have finished. A non-daemonic child, by contrast, has to finish before the program can fully terminate.
You are creating too many processes. These two lines:
foo_p = Process(target=run_script, args=(foo_path,)) # Define the processes..
bar_p = Process(target=run_script, args=(bar_path,)) # ..
create two new processes. Let's call them "A" and "B". Each process consists of this function:
def run_script(path): # Simple function that..
    """Launches python scripts.""" # ..allows me to set a..
    subprocess.run(["python", path]) # ..script as a process.
which then creates another subprocess. Let's call those two processes "C" and "D". In all you have created 4 extra processes, instead of just the 2 that you need. It is actually process "C" that's producing the output on the terminal. This line:
bar_p.join()
waits for "B" to terminate, which implies that "D" has terminated. But this line:
foo_p.kill()
kills process "A" but orphans process "C". So the output to the terminal continues forever.
This is well documented - see the description of multiprocessing.Process.terminate() (kill() behaves the same way but uses SIGKILL), which says:
"Note that descendant processes of the process will not be terminated – they will simply become orphaned."
The following program works as you intended, exiting gracefully from the second process after the first one has finished. (I renamed "foo.py" to useless.py and "bar.py" to useful.py, and made small changes so I could run it on my computer.)
import subprocess
import os
def run_script(name):
    s = os.path.join(r"c:\pyproj310\so", name)
    return subprocess.Popen(["py", s])
if __name__ == "__main__":
    useless_p = run_script("useless.py")
    useful_p = run_script("useful.py")
    useful_p.wait() # Wait for useful script to finish executing
    useless_p.kill() # Kill loading animation
You can't use subprocess.run() to launch the new processes since that function will block the main script until the process completes. So I used Popen instead. Also I placed the running code under an if __name__ == "__main__" which is good practice (and maybe necessary on Windows).
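If the useful script fails or is interrupted, a kill() call placed after the wait may never be reached. Here is a minimal sketch of the same Popen idea wrapped in try/finally, so the animation is always stopped. The name run_with_spinner and its parameters are hypothetical, and sys.executable stands in for "py" or "python":

```python
import subprocess
import sys


def run_with_spinner(useful_cmd, spinner_cmd):
    """Run useful_cmd to completion while spinner_cmd runs alongside it."""
    spinner = subprocess.Popen(spinner_cmd)  # returns immediately
    try:
        # Blocks until the useful command exits, then hands back its exit code.
        return subprocess.run(useful_cmd).returncode
    finally:
        spinner.kill()  # the spinner is a direct child, so nothing is orphaned
        spinner.wait()  # reap it so no zombie process is left behind


if __name__ == "__main__":
    # File names follow the renamed scripts from the answer above.
    code = run_with_spinner(
        [sys.executable, "useful.py"],
        [sys.executable, "useless.py"],
    )
    print("\nuseful.py exited with", code)
```

Because the spinner is launched directly with Popen, there is no intermediate multiprocessing.Process that could be killed while leaving the real script running.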
I have created a (rather large) program that takes quite a long time to finish, and I started looking into ways to speed up the program.
I found that if I open task manager while the program is running only one core is being used.
After some research, I found this website:
Why does multiprocessing use only a single core after I import numpy? which gives a solution of os.system("taskset -p 0xff %d" % os.getpid()),
however this doesn't work for me, and my program continues to run on a single core.
I then found this:
is python capable of running on multiple cores?,
which pointed towards using multiprocessing.
So after looking into multiprocessing, I came across this documentation on how to use it: https://docs.python.org/3/library/multiprocessing.html#examples
I tried the code:
from multiprocessing import Process
def f(name):
    print('hello', name)
if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
a = input("Finished")
After running the code (not in IDLE) It said this:
Finished
hello bob
Finished
Note: after it said Finished the first time I pressed enter
So after this I am now even more confused and I have two questions
First: It still doesn't run with multiple cores (I have an 8 core Intel i7)
Second: Why does it print "Finished" before it has even run the code in the if statement (and it's not even finished yet!)
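To answer your second question first: "Finished" is printed early because a = input("Finished") is outside of your if __name__ == '__main__': block, so it is module-level code that runs every time the module is loaded. On Windows, multiprocessing starts a child by launching a fresh interpreter that imports your script again, so the child reaches input("Finished") before it ever calls f; that is the first "Finished". Once the child has printed hello bob and exited, p.join() returns and the parent runs the same line, which is the second "Finished".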
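To see module-level code running once per process, here is a small sketch that writes a throwaway script to a temporary file and runs it with the "spawn" start method (the Windows default, forced here so the behaviour is the same on other platforms). The script contents and the name run_demo are illustrative only:

```python
import os
import subprocess
import sys
import tempfile
import textwrap

SCRIPT = textwrap.dedent('''
    import multiprocessing as mp

    print("module level")        # runs on every import of this file

    def f():
        print("hello from child")

    if __name__ == "__main__":
        mp.set_start_method("spawn")   # forced for the demo
        p = mp.Process(target=f)
        p.start()
        p.join()
        print("parent done")     # guarded, so only the parent runs it
''')


def run_demo():
    """Run SCRIPT in a separate interpreter and return its output lines."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as fh:
        fh.write(SCRIPT)
        path = fh.name
    try:
        result = subprocess.run(
            [sys.executable, path], capture_output=True, text=True, check=True
        )
    finally:
        os.unlink(path)
    return result.stdout.splitlines()


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

The unguarded print shows up twice (once for the parent, once for the spawned child), while the guarded one shows up once. Moving input("Finished") inside the guard has the same effect.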
To answer the first question, you only created one process which you run and then wait to complete before continuing. This gives you zero benefits of multiprocessing and incurs overhead of creating the new process.
Because you want to create several processes, you need to create a pool via a collection of some sort (e.g. a python list) and then start all of the processes.
In practice, you need to be concerned with more than the number of processors (such as the amount of available memory, the ability to restart workers that crash, etc.). However, here is a simple example that completes your task above.
import datetime as dt
from multiprocessing import Process, current_process
import sys
def f(name):
    print('{}: hello {} from {}'.format(
        dt.datetime.now(), name, current_process().name))
    sys.stdout.flush()
if __name__ == '__main__':
    worker_count = 8
    worker_pool = []
    for _ in range(worker_count):
        p = Process(target=f, args=('bob',))
        p.start()
        worker_pool.append(p)
    for p in worker_pool:
        p.join() # Wait for all of the workers to finish.
    # Allow time to view results before program terminates.
    a = input("Finished") # raw_input(...) in Python 2.
Also note that if you join workers immediately after starting them, you are waiting for each worker to complete its task before starting the next worker. This is generally undesirable unless the ordering of the tasks must be sequential.
Typically Wrong
worker_1.start()
worker_1.join()
worker_2.start() # Must wait for worker_1 to complete before starting worker_2.
worker_2.join()
Usually Desired
worker_1.start()
worker_2.start() # Start all workers.
worker_1.join()
worker_2.join() # Wait for all workers to finish.
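The difference between the two orderings is easy to measure. The sketch below uses threading.Thread rather than Process, purely as an assumption to keep it quick to run; both classes share the start()/join() interface, and time.sleep stands in for real work. The function names are made up for the illustration:

```python
import threading
import time


def work(seconds):
    time.sleep(seconds)  # stand-in for a real task


def run_sequential(n, seconds):
    """start() then join() each worker before creating the next one."""
    began = time.perf_counter()
    for _ in range(n):
        t = threading.Thread(target=work, args=(seconds,))
        t.start()
        t.join()  # blocks here, so the next worker cannot start yet
    return time.perf_counter() - began


def run_concurrent(n, seconds):
    """start() every worker first, then join() them all."""
    began = time.perf_counter()
    workers = [threading.Thread(target=work, args=(seconds,)) for _ in range(n)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return time.perf_counter() - began


if __name__ == "__main__":
    print("sequential: {:.2f}s".format(run_sequential(3, 0.2)))
    print("concurrent: {:.2f}s".format(run_concurrent(3, 0.2)))
```

With three workers the sequential version takes roughly three sleeps back to back, while the concurrent version takes roughly one.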
For more information, please refer to the following links:
https://docs.python.org/3/library/multiprocessing.html
Dead simple example of using Multiprocessing Queue, Pool and Locking
https://pymotw.com/2/multiprocessing/basics.html
https://pymotw.com/2/multiprocessing/communication.html
https://pymotw.com/2/multiprocessing/mapreduce.html
I want to create a multi-process app. Here is a sample:
import threading
import time
from logs import LOG
def start_first():
    LOG.log("First thread has started")
    time.sleep(1000)
def start_second():
    LOG.log("second thread has started")
if __name__ == '__main__':
    ### call birthday daemon
    first_thread = threading.Thread(target=start_first())
    ### call billing daemon
    second_thread = threading.Thread(target=start_second())
    ### starting all daemons
    first_thread.start()
    second_thread.start()
In this code the second thread does not work. I guess that after the sleep function is called inside first_thread, the main process sleeps. I found this post, but there sleep was used within a class. When I ran the answer from it, all I got was "Process finished with exit code 0". Could anybody explain where I made a mistake?
I am using Python 3.* on Windows.
When creating your threads, you are actually invoking the functions instead of passing them as the target for the Thread. This means that when you try to create first_thread you are actually calling start_first, which includes the very long sleep. I imagine you then get frustrated that you don't see the output from the second thread and kill it, right?
Remove the parentheses from your target= arguments and you will get what you want:
first_thread = threading.Thread(target=start_first)
second_thread = threading.Thread(target=start_second)
first_thread.start()
second_thread.start()
This does what you are trying to do.
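To make the difference visible, here is a small sketch that records which thread the function actually runs in. The names where_am_i and compare are made up for the illustration:

```python
import threading


def where_am_i(log):
    """Record the name of the thread that executes this function."""
    log.append(threading.current_thread().name)


def compare():
    caller = threading.current_thread().name
    log = []

    # Wrong: the parentheses call where_am_i right here, in the caller's
    # thread, and Thread receives its return value (None) as the target.
    wrong = threading.Thread(target=where_am_i(log), name="worker")
    wrong.start()  # starts a thread that has nothing to do
    wrong.join()

    # Right: pass the function object; its arguments go in args=.
    right = threading.Thread(target=where_am_i, args=(log,), name="worker")
    right.start()
    right.join()
    return caller, log


if __name__ == "__main__":
    caller, log = compare()
    print("called from:", caller)
    print("ran in:", log)
```

The first entry in the log is the caller's own thread name, and only the second entry is "worker".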
My program needs a snmp trap listener. It needs to continuously receive snmp traps and perform further calculations on them. I am using multiprocessing module of python.
Right now, my program looks something like this:
import multiprocessing
from snmpListener import snmpCompare
def main():
    execute()
def execute():
    try:
        p=Process(target=snmpCompare)
        p.start()
        #my code that runs here
        #it is basically sending commands to my server
        #which sends snmp alerts as response to my commands
        p.join()
    except (KeyboardInterrupt,SystemExit):
        p.terminate()
In snmpListener.py,
import multiprocessing
def trapListener():
    snmpTrap= receiveSnmpTrap()
    q.put(snmpTrap)
def snmpCompare():
    f=open('Alerts.txt','w')
    q=Queue()
    p=Process(target=trapListener, args=(q,))
    p.daemon=True
    p.start()
    while True:
        alert= p.get()
        f.write(alert)
        #perform calculation on 'alert'
    p.join()
    f.close()
However, the child process created in the execute() function runs as soon as it is created, and only then are all the commands in my parent process executed on the server. The child process and parent process don't seem to be running simultaneously. The alerts corresponding to the commands are not being received, i.e., the file "Alerts.txt" is empty.
I haven't been using multiprocessing module of python for a long time. In fact, I have worked very little on multiprocessing. I don't know where I am going wrong and I am a little confused. Any advice would be welcome.
UPDATE: I am calling trapListener when creating a child process from execute function. My calculations are being done in trapListener itself. I have also made trapListener a daemon process. My code is working now. Also there was an error being generated in child process due to which it was getting terminated. Hence Alerts.txt was empty.
I need this urgently in my Django site, but because of the time constraint, I cannot do any heavy modifications. This is probably the cheapest in-place modification.
If we just focus on either build or run...
Now I get the id back from build (or run).
All the heavy work is now in a separate function.
import multiprocessing as mp
def main():
    id = get_build_id(....)
    work = mp.Process(target=heavy_build_fn)
    work.start()
    return id
If I run this in the shell (I have not tested this in the actual Django app), the terminal does not return until the work process is done with its job. As a web app, I need to return the id right away. Can I run work in the background without blocking?
Thanks.
I've read this How do I run another script in Python without waiting for it to finish?, but I want to know other ways to do it, for example, sticking with MP. The Popen solution may not be what I want actually.
import multiprocessing as mp
import time
def build():
    print 'I build things'
    with open('first.txt', 'w+') as f:
        f.write('')
    time.sleep(10)
    with open('myname.txt', 'w+') as f:
        f.write('3')
    return
def main():
    build_p = mp.Process(name='build process', target=build)
    build_p.start()
    build_p.join(2)
    return 18
if __name__ == '__main__':
    v = main()
    print v
    print 'done'
Console:
I build things
18
done
|
and wait
finally
user#user-P5E-VM-DO:~$ python mp3.py
I build things
18
done
user#user-P5E-VM-DO:~$
Remove the join() and you may have what you want.
join() waits for the process to end (or for its timeout to expire) before returning.
main() will then return its value before the child process finishes; however, your parent process will stay alive until the child processes complete. Not sure if that's an issue for you or not.
This code:
import multiprocessing as mp
import time
def build():
    print 'I build things'
    for i in range(10):
        with open('testfile{}.txt'.format(i), 'w+') as f:
            f.write('')
        time.sleep(5)
def main():
    build_p = mp.Process(name='build process', target=build)
    build_p.start()
    return 18
if __name__ == '__main__':
    v = main()
    print v
    print 'done'
Returns:
> python mptest.py
18
done
I build things
If you need to allow the process to end while the child process continues check out the answers here:
Run Process and Don't Wait
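For reference, a minimal sketch of the Popen route from that link. The names start_background and main are illustrative, the id 18 is the placeholder value from the example above, and the sleep command is a stand-in for a script that does the heavy build:

```python
import subprocess
import sys


def start_background(cmd):
    """Launch cmd and return at once; the caller never waits on it."""
    return subprocess.Popen(cmd)


def main():
    build_id = 18  # placeholder id, as in the example above
    worker = start_background(
        [sys.executable, "-c", "import time; time.sleep(2)"]
    )
    return build_id, worker  # returned while the worker is still running


if __name__ == "__main__":
    build_id, worker = main()
    print(build_id)
    print("worker still running:", worker.poll() is None)
```

Unlike a non-daemonic multiprocessing child, a Popen child is not waited for when the parent interpreter exits, so the parent can finish first.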
No. The easiest way to handle what you want is probably to use a message broker. Django Celery is a great solution. It lets you queue a task and return your view to the user right away. Your task will then be executed in the order it was queued.
I believe processes opened from Django are tied to the thread they were opened in, so your view will wait to return until your process is complete.
I have two functions, draw_ascii_spinner and findCluster(companyid).
I would like to:
Run findCluster(companyid) in the background and while it's processing....
Run draw_ascii_spinner until findCluster(companyid) finishes
How do I begin to try to solve for this (Python 2.7)?
Use threads:
import threading, time
def wrapper(func, args, res):
    res.append(func(*args))
res = []
t = threading.Thread(target=wrapper, args=(findCluster, (companyid,), res))
t.start()
while t.is_alive():
    # print next iteration of ASCII spinner
    t.join(0.2)
print res[0]
You can use multiprocessing. Or, if findCluster(companyid) has sensible stopping points, you can turn it into a generator along with draw_ascii_spinner, to do something like this:
for tick in findCluster(companyid):
    ascii_spinner.next()
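A fuller sketch of the generator idea, written for Python 3 (it relies on a generator's return value, which Python 2.7 does not support). Here find_cluster is a hypothetical stand-in that yields at each stopping point, and run_with_spinner is an illustrative name:

```python
import itertools
import sys


def find_cluster(company_id):
    """Hypothetical stand-in: does a chunk of work, yields, and repeats."""
    total = 0
    for step in range(5):
        total += step * company_id  # pretend this is expensive
        yield                       # a sensible stopping point
    return total


def run_with_spinner(gen, out=sys.stdout):
    """Advance gen step by step, drawing one spinner frame per step."""
    spinner = itertools.cycle("|/-\\")
    while True:
        try:
            next(gen)
        except StopIteration as stop:
            out.write("\r \r")  # wipe the last frame
            return stop.value   # the generator's return value
        out.write("\r" + next(spinner))
        out.flush()


if __name__ == "__main__":
    print(run_with_spinner(find_cluster(3)))
```

The spinner only advances when the work yields, so this suits tasks that have natural checkpoints and no long uninterrupted stretches.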
Generally, you will use Threads. Here is a simplistic approach which assumes, that there are only two threads: 1) the main thread executing a task, 2) the spinner thread:
#!/usr/bin/env python
import time
import thread
def spinner():
    while True:
        print '.'
        time.sleep(1)
def task():
    time.sleep(5)
if __name__ == '__main__':
    thread.start_new_thread(spinner, ())
    # as soon as task finishes (and so the program)
    # spinner will be gone as well
    task()
This can be done with threads. FindCluster runs in a separate thread and when done, it can simply signal another thread that is polling for a reply.
You'll want to do some research on threading. The general form is going to be this:
Create a new thread for findCluster and create some way for the program to know the method is running - simplest in Python is just a global boolean
Run draw_ascii_spinner in a while loop conditioned on whether it is still running. You'll probably want to have this thread sleep for a short period of time between iterations
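The steps above can be sketched as follows, in Python 3. Note the swap: a threading.Event is used in place of the global boolean, because waiting on it with a timeout also paces the spinner. The names run_with_spinner and slow_square are made up for the example:

```python
import itertools
import sys
import threading
import time


def run_with_spinner(task, *args, out=sys.stdout, interval=0.1):
    """Run task(*args) in a worker thread while drawing a spinner here."""
    done = threading.Event()  # plays the role of the "still running" flag
    result = []

    def worker():
        try:
            result.append(task(*args))
        finally:
            done.set()  # always signal, even if task raised

    t = threading.Thread(target=worker)
    t.start()
    for frame in itertools.cycle("|/-\\"):
        out.write("\r" + frame)
        out.flush()
        if done.wait(interval):  # returns True as soon as the task is done
            break
    t.join()
    out.write("\r \r")  # wipe the last frame
    # If task raised, result is empty and None is returned.
    return result[0] if result else None


def slow_square(x):
    time.sleep(0.3)  # stand-in for findCluster's real work
    return x * x


if __name__ == "__main__":
    print(run_with_spinner(slow_square, 4))
```

Using done.wait(interval) instead of time.sleep(interval) means the loop stops as soon as the task finishes rather than at the end of the current sleep.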
Here's a short tutorial in Python - http://linuxgazette.net/107/pai.html
Run findCluster() in a thread (the threading module makes this very easy), and then call draw_ascii_spinner until some condition is met.
Instead of using sleep() to set the pace of the spinner, you can call the thread's join() with a timeout.
Is it possible to have a working example? I am new to Python. I have 6 tasks to run in one Python program. These 6 tasks should be coordinated, meaning that one should start when another finishes. I saw the answers, but I couldn't adapt the code you shared to my program.
I used "time.sleep", but I know that is not good because I cannot know how long each task takes.
# Sending commands
for i in range(0,len(cmdList)): # port Sending commands
    cmd = cmdList[i]
    cmdFull = convert(cmd)
    port.write(cmd.encode('ascii'))
    # s = port.read(10)
    print(cmd)
# Terminate the command + close serial port
port.write(cmdFull.encode('ascii'))
print('Termination')
port.close()
# time.sleep(1*60)