I have two independent scripts, each running an infinite loop. I need to call both of them from a master script and have them run simultaneously, producing output at the same time.
Here are the scripts:
script1.py
y = 1000000000
while True:
    y = y - 1
    print("y is now: ", y)
script2.py
x = 0
while True:
    x = x + 1
    print("x is now: ", x)
The aim is to compile the master script with PyInstaller into one console application.
You can use Python's multiprocessing module.
import os
from multiprocessing import Process

def script1():
    # "python" may need to be "py" or a full interpreter path on your system;
    # a bare "script1.py" only works where .py files are directly executable
    os.system("python script1.py")

def script2():
    os.system("python script2.py")

if __name__ == '__main__':
    p = Process(target=script1)
    q = Process(target=script2)
    p.start()
    q.start()
    p.join()
    q.join()
Note that the print statements might not be an accurate way to check the parallelism of the processes.
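Since the question mentions bundling the master script with PyInstaller: when a script that uses multiprocessing is frozen on Windows, the entry point generally also needs multiprocessing.freeze_support() as the first call under the __main__ guard. A minimal sketch of the master script with that added (assuming script1.py and script2.py ship next to the executable and "python" is on PATH):
# master.py -- hedged sketch, not part of the answer above
import os
from multiprocessing import Process, freeze_support

def run_script1():
    # "python" may need to be "py" or an absolute interpreter path on your system
    os.system("python script1.py")

def run_script2():
    os.system("python script2.py")

if __name__ == '__main__':
    freeze_support()  # no-op when run from source; required in a frozen Windows executable
    p = Process(target=run_script1)
    q = Process(target=run_script2)
    p.start()
    q.start()
    p.join()
    q.join()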
Python scripts are executed when imported.
So if you really want to keep your two scripts untouched, you can import each one of them in a separate thread or process, like the following.
from threading import Thread
def one(): import script1
def two(): import script2
Thread(target=one).start()
Thread(target=two).start()
Analogously, if you want two processes instead of threads (with a __main__ guard so it stays safe on spawn-based platforms like Windows):
from multiprocessing import Process

def one(): import script1
def two(): import script2

if __name__ == '__main__':
    Process(target=one).start()
    Process(target=two).start()
Wrap each script's code in a function so that it can be imported:
def main():
    # script's code goes here
    ...
Use an if __name__ == '__main__' guard to keep the ability to run them as standalone scripts:
if __name__ == '__main__':
    main()
Then use multiprocessing or threading to run the created functions, for example as sketched below.
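A minimal launcher along those lines (a sketch assuming script1.py and script2.py have been restructured as above so that each exposes a main() function):
# launcher.py -- runs both scripts' main() functions in separate processes
from multiprocessing import Process

import script1
import script2

if __name__ == '__main__':
    processes = [Process(target=script1.main), Process(target=script2.main)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()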
If you really can't make your scripts importable, you can always use the subprocess module, but communication between the runner and your scripts (if needed) will be more complicated.
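A minimal sketch of that subprocess variant, assuming both files sit next to the launcher and sys.executable is the interpreter you want to run them with:
# launcher_subprocess.py -- each script runs in its own interpreter process
import subprocess
import sys

if __name__ == '__main__':
    procs = [
        subprocess.Popen([sys.executable, "script1.py"]),
        subprocess.Popen([sys.executable, "script2.py"]),
    ]
    for proc in procs:
        proc.wait()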
For a Raspberry Pi-based project I'm working on, I want to have a main program and a "status checker" secondary script. In other words, when the first program is started, I want it to start as a background service and kick me back out to Terminal, and then the secondary program can be used to check the first program's status/progress.
I need the main program to send variable values to the status checking script, which will then print them to Terminal. I found this old post, but it doesn't seem to work.
I modified the code a bit from the old post; here it is. The import from main doesn't just import the function, it seems to run all of main.py. I added the for loop in main.py as a placeholder for the stuff I would be doing in the main script.
#main.py
from multiprocessing import Process, Pipe
import time

def f(child_conn):
    msg = "Hello"
    child_conn.send(msg)
    child_conn.close()

for i in range(1000000):
    print(i)
    time.sleep(0.05)
#second.py
from multiprocessing import Process, Queue, Pipe
from main import f

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    print(parent_conn.recv())  # prints "Hello"
The problem is that when second.py imports f from main.py, it runs everything at main.py's global scope, including the loop. If you remove that loop, you can see that your process and pipe do work. If you want to keep that part as well, you can guard it like this:
if __name__ == '__main__':
    for i in range(1000000):
        print(i)
        time.sleep(0.05)
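Put together, main.py would then look roughly like this (a sketch of the fix described above, keeping the names from the question):
#main.py
import time

def f(child_conn):
    msg = "Hello"
    child_conn.send(msg)
    child_conn.close()

if __name__ == '__main__':
    # the placeholder work only runs when main.py is executed directly,
    # not when second.py does "from main import f"
    for i in range(1000000):
        print(i)
        time.sleep(0.05)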
Refer to this answer for an explanation of why this is the case.
I have a big program that takes a while to run its calculations. I decided to learn about multithreading and multiprocessing because only 20% of my processor was being used. After seeing no improvement with multithreading, I decided to try multiprocessing, but whenever I use it, it just shows a lot of errors, even on very simple code.
This is the code that I tested after I started having problems with my big, calculation-heavy code:
from concurrent.futures import ProcessPoolExecutor

def func():
    print("done")

def func_():
    print("done")

def main():
    executor = ProcessPoolExecutor(max_workers=3)
    p1 = executor.submit(func)
    p2 = executor.submit(func_)

main()
In the error message that I am getting, it says:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
    if __name__ == '__main__':
        freeze_support()
        ...
This is not the whole message because it is very long, but I think this part may be the helpful bit. Pretty much everything else in the error message is just lines like "error at line ... in ...".
In case it helps, the big code is at: https://github.com/nobody48sheldor/fuseeinator2.0 (it might not be the latest version).
I updated your code to show main being called. This is an issue with operating systems that spawn processes, like Windows. To test on my Linux machine I had to add a bit of code, but this crashes on my machine:
# Test code to make Linux spawn like Windows and generate the error.
# This code is not needed on Windows.
if __name__ == "__main__":
    import multiprocessing as mp
    mp.freeze_support()
    mp.set_start_method('spawn')

# test script
from concurrent.futures import ProcessPoolExecutor

def func():
    print("done")

def func_():
    print("done")

def main():
    executor = ProcessPoolExecutor(max_workers=3)
    p1 = executor.submit(func)
    p2 = executor.submit(func_)

main()
In a spawning system, Python can't just fork into a new execution context. Instead, it runs a new instance of the Python interpreter, imports the module, and pickles/unpickles enough state to make a child execution environment. This can be a very heavy operation.
But your script is not import-safe. Since main() is called at module level, the import in the child would run main again. That would create a grandchild subprocess which runs main again (and so on, until you hang your machine). Python detects this infinite loop and displays the message instead.
The top-level script always runs under the name "__main__". Put all of the code that should only run when the file is executed as a script inside an if guard. Then, if the module is imported, nothing harmful is run:
if __name__ == "__main__":
    main()
and the script will work.
There are code analyzers out there that import modules to extract doc strings, or other useful stuff. Your code shouldn't fire the missiles just because some tool did an import.
Another way to solve the problem is to move everything multiprocessing-related out of the script and into a module. Suppose I had a module with your code in it:
whatever.py
from concurrent.futures import ProcessPoolExecutor

def func():
    print("done")

def func_():
    print("done")

def main():
    executor = ProcessPoolExecutor(max_workers=3)
    p1 = executor.submit(func)
    p2 = executor.submit(func_)
myscript.py
#!/usr/bin/env python3
import whatever
whatever.main()
Now, since the pool is already in an imported module that doesn't do this crazy restart-itself thing, no if __name__ == "__main__": is necessary. It's a good idea to put it in myscript.py anyway, but it is not required.
I have two Python files, file 1 and file 2, that do two separate things. I want to run them together. I am using VS2017.
The pseudocode for file 1 is:
class A:
    def foo1():
        .
        .
    def foo2():
        if variable < 30:
            # do this
        else:
            subprocess.Popen('py file2.py')
        # rest of the code for foo2()

if __name__ == "__main__":
    A.foo2()
Currently, when I use this format, the subprocess does start file 2 and run it, but the rest of the code for foo2() after the if-else condition runs only when that process is terminated (by another condition that I have set up inside file 2).
I am trying to make it work so that file 2 starts running in the background once the if-else condition is met and gives its output in the command window, while the rest of file 1 keeps running without pausing until file 2 is done. If not with subprocess, is there another way to start both files simultaneously but control the output of file 2 by passing in the value of "variable"? I am trying to figure out a proper workaround.
I am new to Python.
EDIT 1:
I used the command:
process = subprocess.Popen('py file2.py', shell=True, stdin=None, stdout=None, stderr=None, close_fds=True)
Even if I use process.kill(), the subprocess still runs in the background. It won't quit even if I use the Task Manager.
I also wanted to pass a variable to the second file. I am looking into something like
variable = input("enter variable: ")
subprocess.Popen('py file2.py -a ' + variable, shell=True, stdin=None, stdout=None, stderr=None, close_fds=True)
But as far as I have looked, I was told that I can only pass strings through a subprocess. Is that true?
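For reference, command-line arguments are indeed passed as strings (file2.py would convert them back, for example via sys.argv or argparse), and passing the command as a list without shell=True means process.kill() terminates file2.py itself rather than an intermediate shell. A rough sketch of that variant:
# sketch: no shell in between, so process.kill() acts on file2.py directly
import subprocess
import sys

variable = input("enter variable: ")
process = subprocess.Popen([sys.executable, "file2.py", "-a", str(variable)])
# ... the rest of file 1 keeps running here; call process.wait() or process.kill() when appropriate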
I believe you can do this with both multithreading and multiprocessing. If you want to start them both right away and then monitor the variable, you can connect them with a pipe or queue.
starting when triggered:
from py_file2 import your_func
import threading

class A:
    def foo1():
        .
        .
    def foo2():
        if variable < 30:
            # do this
        else:
            # put something here to make sure it only starts once
            t = threading.Thread(target=your_func)
            t.start()
        # rest of the code for foo2()

if __name__ == "__main__":
    A.foo2()
starting right away:
from py_file2 import your_func
import threading
from queue import Queue

class A:
    def foo1():
        .
        .
    def foo2(your_queue):
        if variable < 30:
            # do this
        else:
            your_queue.put(variable)
        # rest of the code for foo2()

if __name__ == "__main__":
    your_queue = Queue()
    t = threading.Thread(target=your_func, args=(your_queue,))
    t.start()
    A.foo2(your_queue)
Let's say I have three modules:
mod1
mod2
mod3
where each of them runs infinitely long as soon as mod.launch() is called.
What are some elegant ways to launch all these infinite loops at once, without waiting for one to finish before calling the other?
Let's say I'd have a kind of launcher.py, where I'd try to:
import mod1
import mod2
import mod3

if __name__ == "__main__":
    mod1.launch()
    mod2.launch()
    mod3.launch()
This obviously doesn't work, as it will wait for mod1.launch() to finish before launching mod2.launch().
Any kind of help is appreciated.
If you would like to execute multiple functions in parallel, you can use either the multiprocessing library, or concurrent.futures.ProcessPoolExecutor. ProcessPoolExecutor uses multiprocessing internally, but has a simpler interface.
Depending on the nature of the work being done in each task, the answer varies.
If each task is mostly or all IO-bound, I would recommend multithreading.
If each task is CPU-bound, I would recommend multiprocessing (due to the GIL in Python).
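A minimal sketch with ProcessPoolExecutor, assuming each module exposes the launch() function described in the question:
# launcher.py -- run each module's launch() in its own worker process
from concurrent.futures import ProcessPoolExecutor

import mod1
import mod2
import mod3

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=3) as executor:
        futures = [executor.submit(mod.launch) for mod in (mod1, mod2, mod3)]
        # block until all three loops return (which, for truly infinite loops, is never)
        for future in futures:
            future.result()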
You can also use the threading module to run each module on a separate thread, but within the same process:
import threading

import mod1
import mod2
import mod3

if __name__ == "__main__":
    # make a list of all modules we want to run, for convenience
    mods = [mod1, mod2, mod3]
    # prepare a thread for each module to run the `launch()` method
    threads = [threading.Thread(target=mod.launch) for mod in mods]
    # run all threads
    for thread in threads:
        thread.start()
    # wait for all threads to finish
    for thread in threads:
        thread.join()
The multiprocessing module performs a very similar set of tasks and has a very similar API, but uses separate processes instead of threads, so you can use that too.
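For completeness, a sketch of the same launcher with multiprocessing.Process swapped in for threading.Thread:
import multiprocessing

import mod1
import mod2
import mod3

if __name__ == "__main__":
    mods = [mod1, mod2, mod3]
    # one process per module, each running its launch() function
    processes = [multiprocessing.Process(target=mod.launch) for mod in mods]
    for process in processes:
        process.start()
    for process in processes:
        process.join()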
I'd suggest using Ray, which is a library for parallel and distributed Python. It has some advantages over the standard threading and multiprocessing libraries.
The same code will run on a single machine or on multiple machines.
You can parallelize both functions and classes.
Objects are shared efficiently between tasks using shared memory.
To provide a simple runnable example, I'll use functions and classes instead of modules, but you can always wrap the module in a function or class.
Approach 1: Parallel functions using tasks.
import ray
import time

ray.init()

@ray.remote
def mod1():
    time.sleep(3)

@ray.remote
def mod2():
    time.sleep(3)

@ray.remote
def mod3():
    time.sleep(3)

if __name__ == '__main__':
    # Start the tasks. These will run in parallel.
    result_id1 = mod1.remote()
    result_id2 = mod2.remote()
    result_id3 = mod3.remote()
    # Don't exit the interpreter before the tasks have finished.
    ray.get([result_id1, result_id2, result_id3])
Approach 2: Parallel classes using actors.
import ray
import time

# Don't run this again if you've already run it.
ray.init()

@ray.remote
class Mod1(object):
    def run(self):
        time.sleep(3)

@ray.remote
class Mod2(object):
    def run(self):
        time.sleep(3)

@ray.remote
class Mod3(object):
    def run(self):
        time.sleep(3)

if __name__ == '__main__':
    # Create 3 actors.
    mod1 = Mod1.remote()
    mod2 = Mod2.remote()
    mod3 = Mod3.remote()
    # Start the methods; these will run in parallel.
    result_id1 = mod1.run.remote()
    result_id2 = mod2.run.remote()
    result_id3 = mod3.run.remote()
    # Don't exit the interpreter before the tasks have finished.
    ray.get([result_id1, result_id2, result_id3])
See the Ray documentation for more details.
I'm currently going through some pre-existing code with the goal of speeding it up. There are a few places that are extremely good candidates for parallelization. Since Python has the GIL, I thought I'd use the multiprocessing module.
However, from my understanding, the only way this will work on Windows is if I call the function that needs multiple processes from the highest-level script under the if __name__ == '__main__' safeguard. This particular program is meant to be distributed and imported as a module, though, so it'd be kind of clunky to have the user copy and paste that safeguard, and that is something I'd really like to avoid.
Am I out of luck or misunderstanding something as far as multiprocessing goes? Or is there any other way to do it with Windows?
For everyone still searching:
inside the module (data.py, imported below as data):
from multiprocessing import Process

def printing(a):
    print(a)

def foo(name):
    var = {"process": {}}
    if name == "__main__":
        for i in range(10):
            var["process"][i] = Process(target=printing, args=(str(i),))
            var["process"][i].start()
        for i in range(10):
            var["process"][i].join()
inside main.py
import data
name = __name__
data.foo(name)
output:
>>2
>>6
>>0
>>4
>>8
>>3
>>1
>>9
>>5
>>7
I am a complete noob so please don't judge the coding OR presentation but at least it works.
As explained in the comments, perhaps you could do something like this:
#client_main.py
from mylib.mpSentinel import MPSentinel

#client logic
if __name__ == "__main__":
    MPSentinel.As_master()

#mpsentinel.py
class MPSentinel(object):
    _is_master = False

    @classmethod
    def As_master(cls):
        cls._is_master = True

    @classmethod
    def Is_master(cls):
        return cls._is_master
It's not ideal in that it's effectively a singleton/global, but it would work around Windows' lack of fork. Still, you could use MPSentinel.Is_master() to enable multiprocessing optionally, and it should prevent Windows from process bombing.
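Inside the library, the spawning code could then be made conditional on that flag. A rough sketch (run_worker and its arguments are placeholders, not part of the answer above):
# somewhere inside mylib -- only spawn if the importing app declared itself the master
from multiprocessing import Process
from mylib.mpSentinel import MPSentinel

def run_worker(worker, *args):
    if MPSentinel.Is_master():
        p = Process(target=worker, args=args)
        p.start()
        return p
    # fall back to running in-process (e.g. in a spawned child, or when a tool merely imports mylib)
    worker(*args)
    return None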
On MS Windows, you should be able to import the main module of a program without side effects like starting a process.
When Python imports a module, it actually runs it.
So one way of guarding the process-starting code is the if __name__ == '__main__' block.
Another way is to do it from within a function.
The following won't work on MS Windows:
from multiprocessing import Process

def foo():
    print('hello')

p = Process(target=foo)
p.start()
This is because it tries to start a process when importing the module.
The following example from the programming guidelines is OK:
from multiprocessing import Process, freeze_support, set_start_method

def foo():
    print('hello')

if __name__ == '__main__':
    freeze_support()
    set_start_method('spawn')
    p = Process(target=foo)
    p.start()
Because the code in the if block doesn't run when the module is imported.
But putting it in a function should also work:
from multiprocessing import Process

def foo():
    print('hello')

def bar():
    p = Process(target=foo)
    p.start()
When this module is imported or run, it will just define two new functions, not run them.
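A hedged usage sketch: the caller still triggers the process from a guarded entry point (the_module is a hypothetical name for the module shown above):
# caller.py
import the_module  # hypothetical module containing foo() and bar() from above

if __name__ == '__main__':
    the_module.bar()  # the Process is only started here, never at import time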
I've been developing an Instagram image scraper, and in order to make the download & save operations run faster I've implemented multiprocessing in an auxiliary module. Note that this code is inside an auxiliary module and not inside the main module.
The solution I found is adding this line:
if __name__ != '__main__':
Pretty simple, but it's actually working!
import multiprocessing
import requests

def multi_proces(urls, profile):
    img_saved = 0
    if __name__ != '__main__':  # line needed for the sake of getting this NOT to crash
        processes = []
        for url in urls:
            try:
                process = multiprocessing.Process(target=download_save, args=[url, profile, img_saved])
                processes.append(process)
                img_saved += 1
            except:
                continue
        for proce in processes:
            proce.start()
        for proce in processes:
            proce.join()
    return img_saved

def download_save(url, profile, img_saved):
    file = requests.get(url, allow_redirects=True)  # Download
    open(f"scraped_data\{profile}\{profile}-{img_saved}.jpg", 'wb').write(file.content)  # Save