For a Raspberry Pi-based project I'm working on, I want to have a main program and a "status checker" secondary script. In other words, when the first program is started, I want it to start as a background service and kick me back out to Terminal, and then the secondary program can be used to check the first program's status/progress.
I need the main program to send variable values to the status checking script, which will then print them to Terminal. I found this old post, but it doesn't seem to work.
I modified the code from the old post a bit; here it is. The from main import f in second.py doesn't just import the function, it seems to run all of main.py. I added the for loop in main.py as a placeholder for the stuff I would be doing in the main script.
# main.py
from multiprocessing import Process, Pipe
import time

def f(child_conn):
    msg = "Hello"
    child_conn.send(msg)
    child_conn.close()

for i in range(1000000):
    print(i)
    time.sleep(0.05)
# second.py
from multiprocessing import Process, Queue, Pipe
from main import f

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    print(parent_conn.recv())  # prints "Hello"
The problem is that when second.py imports f from main.py, it runs everything at module level (the global scope), including the for loop. If you remove those module-level lines, you can see that your process and pipe do work. If you want to keep that part as well, you can do something like:
if __name__ == '__main__':
    for i in range(1000000):
        print(i)
        time.sleep(0.05)
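Putting it together, main.py could look like this (the same code, just with the placeholder loop moved under the guard so that importing f from second.py doesn't run it):

# main.py
from multiprocessing import Process, Pipe
import time

def f(child_conn):
    msg = "Hello"
    child_conn.send(msg)
    child_conn.close()

if __name__ == '__main__':
    # placeholder work; runs only when main.py is executed directly,
    # not when second.py imports f
    for i in range(1000000):
        print(i)
        time.sleep(0.05)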
Refer to this answer for an explanation of why this is the case.
I have a big program that takes a while to do its calculations. I decided to learn about multithreading and multiprocessing because only 20% of my processor was being used for the calculation. After not getting any improvement with multithreading, I decided to try multiprocessing, but whenever I try to use it, it just shows a lot of errors, even on very simple code.
This is the code that I tested after I started having problems with my big, calculation-heavy code:
from concurrent.futures import ProcessPoolExecutor

def func():
    print("done")

def func_():
    print("done")

def main():
    executor = ProcessPoolExecutor(max_workers=3)
    p1 = executor.submit(func)
    p2 = executor.submit(func_)

main()
and the error message that I am getting says:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
    if __name__ == '__main__':
        freeze_support()
        ...
This is not the whole message because it is very long, but I think this part may be helpful. Pretty much everything else in the error message is just lines like "error at line ... in ...".
In case it is helpful, the big code is at: https://github.com/nobody48sheldor/fuseeinator2.0 (it might not be the latest version).
I updated your code to show main being called. This is an issue with operating systems that spawn new processes, like Windows. To test on my Linux machine I had to add a bit of code, but this crashes on my machine:
# Test code to make linux spawn like Windows and generate the error. This
# code is not needed on Windows.
if __name__ == "__main__":
    import multiprocessing as mp
    mp.freeze_support()
    mp.set_start_method('spawn')

# test script
from concurrent.futures import ProcessPoolExecutor

def func():
    print("done")

def func_():
    print("done")

def main():
    executor = ProcessPoolExecutor(max_workers=3)
    p1 = executor.submit(func)
    p2 = executor.submit(func_)

main()
In a spawning system, Python can't just fork into a new execution context. Instead, it runs a new instance of the Python interpreter, imports the module, and pickles/unpickles enough state to make a child execution environment. This can be a very heavy operation.
But your script is not import safe. Since main() is called at module level, the import in the child would run main again. That would create a grandchild subprocess which runs main again (and so on, until you hang your machine). Python detects this infinite loop and displays the message instead.
Top-level scripts are always named "__main__". Put all of the code that should run only at the script level inside an if block. Then, if the module is imported, nothing harmful is run.
if __name__ == "__main__":
    main()
and the script will work.
There are code analyzers out there that import modules to extract doc strings, or other useful stuff. Your code shouldn't fire the missiles just because some tool did an import.
Another way to solve the problem is to move everything multiprocessing related out of the script and into a module. Suppose I had a module with your code in it
whatever.py

from concurrent.futures import ProcessPoolExecutor

def func():
    print("done")

def func_():
    print("done")

def main():
    executor = ProcessPoolExecutor(max_workers=3)
    p1 = executor.submit(func)
    p2 = executor.submit(func_)
myscript.py

#!/usr/bin/env python3
import whatever
whatever.main()
Now, since the pool is already in an imported module that doesn't do this crazy restart-itself thing, no if __name__ == "__main__": is necessary. It's a good idea to put it in myscript.py anyway, but not required.
I have two Python files, file 1 and file 2, that do two separate things. I want to run them together. I am using VS2017.
The pseudo code for file 1 is:
class A:
    def foo1():
        .
        .
    def foo2():
        if variable < 30:
            # do this
        else:
            subprocess.Popen('py file2.py')
        # rest of the code for foo2()

if __name__ == "__main__":
    A.foo2()
Currently, when I use this format, the subprocess does start file 2 and run it, but the rest of the code for foo2() after the if-else condition only runs once the subprocess is terminated (by another condition that I have set up inside file 2).
I am trying to make it work so that file 2 starts running in the background once the if-else condition is met and prints its output in the command window, while the rest of file 1 keeps running, without pausing file 1 until file 2 is done. If subprocess can't do this, is there another way to start both files simultaneously but control the output of file 2 by passing the value of the "variable"? I am trying to figure out a proper workaround.
I am new to Python.
EDIT 1:
I used the command:
process = subprocess.Popen('py file2.py' ,shell=True,stdin=None, stdout=None, stderr=None, close_fds=True)
Even if I use process.kill(), the subprocess still runs in the background. It won't quit even if I use the task manager.
I also wanted to pass a variable to the second file. I am looking into something like
variable = input("enter variable")
subprocess.Popen('py file2.py -a ' + variable, shell=True, stdin=None, stdout=None, stderr=None, close_fds=True)
But from what I have read, it seems I can only pass strings to a subprocess. Is that true?
I believe you can do this with both multithreading and multiprocessing. If you want to start them both right away and then monitor the variable, you can connect them with a pipe or queue.
starting when triggered:
from py_file2 import your_func
import threading

class A:
    def foo1():
        .
        .
    def foo2():
        if variable < 30:
            # do this
        else:
            # put something here to make sure it only starts once
            t = threading.Thread(target=your_func)
            t.start()
        # rest of the code for foo2()

if __name__ == "__main__":
    A.foo2()
starting right away:
from py_file2 import your_func
import threading
from queue import Queue

class A:
    def foo1():
        .
        .
    def foo2(your_queue):
        if variable < 30:
            # do this
        else:
            your_queue.put(variable)
        # rest of the code for foo2()

if __name__ == "__main__":
    your_queue = Queue()
    t = threading.Thread(target=your_func, args=(your_queue,))
    t.start()
    A.foo2(your_queue)
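On the edit about passing a value to the second file with subprocess: command-line arguments are indeed passed as strings, so the child script has to parse them back into whatever type it needs. A rough sketch (the -a option and the parsing in file2.py are hypothetical, not from your code):

# file 1 side: pass the value as a separate string argument
import subprocess
variable = input("enter variable")
subprocess.Popen(['py', 'file2.py', '-a', str(variable)])

# file2.py side: arguments arrive as strings, convert back as needed
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-a', dest='variable')
args = parser.parse_args()
variable = int(args.variable)  # convert if a number is expected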
I have two independent scripts that each run an infinite loop. I need to call both of them from another master script and make them run simultaneously, producing results at the same time.
Here are some scripts
script1.py

y = 1000000000
while True:
    y = y - 1
    print("y is now: ", y)
script2.py

x = 0
while True:
    x = x + 1
    print("x is now: ", x)
The aim is to compile the master script with PyInstaller into one console application.
You can use Python's multiprocessing module.
import os
from multiprocessing import Process

def script1():
    os.system("script1.py")

def script2():
    os.system("script2.py")

if __name__ == '__main__':
    p = Process(target=script1)
    q = Process(target=script2)
    p.start()
    q.start()
    p.join()
    q.join()
Note that print statements might not be an accurate way to check the parallelism of the processes.
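If you want a rough way to see that the two scripts really run at the same time, one option (just a sketch, not part of the original answer) is to tag each printed line with the process id and a timestamp, so the interleaving in the shared console becomes visible:

import os
import time

def tagged_print(msg):
    # prefix every line with the pid and a timestamp so output from the
    # two processes can be told apart and ordered in time
    print(f"[pid {os.getpid()} t={time.time():.3f}] {msg}")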
Python scripts are executed when imported.
So if you really want to keep your two scripts untouched, you can import each of them in a separate thread or process, like the following.
from threading import Thread
def one(): import script1
def two(): import script2
Thread(target=one).start()
Thread(target=two).start()
Analogously, if you want two processes instead of threads:
from multiprocessing import Process

def one(): import script1
def two(): import script2

if __name__ == '__main__':
    Process(target=one).start()
    Process(target=two).start()
Wrap scripts' code in functions so that they can be imported.
def main():
    # script's code goes here
    ...
Use "if main" to keep ability to run as scripts.
if __name__ == '__main__':
    main()
Use multiprocessing or threading to run the created functions, as in the sketch below.
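A minimal runner along those lines might look like this (assuming script1.py and script2.py each expose a main() as described above):

# runner.py - start both wrapped scripts as separate processes
from multiprocessing import Process
import script1
import script2

if __name__ == '__main__':
    p = Process(target=script1.main)
    q = Process(target=script2.main)
    p.start()
    q.start()
    p.join()
    q.join()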
If you really can't make your scripts importable, you can always use the subprocess module, but communication between the runner and your scripts (if needed) will be more complicated.
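For completeness, a sketch of the subprocess approach (no changes to the scripts needed, but no easy way to exchange data with them):

# runner.py - launch both unmodified scripts as child processes
import subprocess
import sys

if __name__ == '__main__':
    p1 = subprocess.Popen([sys.executable, 'script1.py'])
    p2 = subprocess.Popen([sys.executable, 'script2.py'])
    p1.wait()
    p2.wait()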
I'm currently going through some pre-existing code with the goal of speeding it up. There are a few places that are extremely good candidates for parallelization. Since Python has the GIL, I thought I'd use the multiprocessing module.
However, from my understanding, the only way this will work on Windows is if I call the function that needs multiple processes from the highest-level script, inside the if __name__ == '__main__' safeguard. However, this particular program is meant to be distributed and imported as a module, so it would be kind of clunky to have the user copy and paste that safeguard, and that is something I'd really like to avoid.
Am I out of luck or misunderstanding something as far as multiprocessing goes? Or is there any other way to do it with Windows?
For everyone still searching:
inside the module (data.py):
from multiprocessing import Process

def printing(a):
    print(a)

def foo(name):
    var = {"process": {}}
    if name == "__main__":
        for i in range(10):
            var["process"][i] = Process(target=printing, args=(str(i),))
            var["process"][i].start()
        for i in range(10):
            var["process"][i].join()
inside main.py
import data
name = __name__
data.foo(name)
output:
>>2
>>6
>>0
>>4
>>8
>>3
>>1
>>9
>>5
>>7
I am a complete noob so please don't judge the coding OR presentation but at least it works.
As explained in the comments, perhaps you could do something like:
# client_main.py
from mylib.mpSentinel import MPSentinel

# client logic

if __name__ == "__main__":
    MPSentinel.As_master()
# mpsentinel.py
class MPSentinel(object):
    _is_master = False

    @classmethod
    def As_master(cls):
        cls._is_master = True

    @classmethod
    def Is_master(cls):
        return cls._is_master
It's not ideal in that it's effectively a singleton/global, but it would work around Windows' lack of fork. Still, you could use MPSentinel.Is_master() to enable multiprocessing optionally, and it should prevent Windows from process bombing.
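A sketch of how library code might consult that flag (process_all and its arguments are hypothetical, not part of the original answer):

# somewhere inside mylib: spawn workers only when the real entry point
# has declared itself master; otherwise fall back to a serial loop
from multiprocessing import Pool
from mylib.mpSentinel import MPSentinel

def process_all(worker, items):
    if MPSentinel.Is_master():
        with Pool() as pool:
            return pool.map(worker, items)
    return [worker(item) for item in items]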
On ms-windows, you should be able to import the main module of a program without side effects like starting a process.
When Python imports a module, it actually runs it.
So one way of doing that is to put the process-starting code in the if __name__ == '__main__' block.
Another way is to do it from within a function.
The following won't work on ms-windows:
from multiprocessing import Process

def foo():
    print('hello')

p = Process(target=foo)
p.start()
This is because it tries to start a process when importing the module.
The following example from the programming guidelines is OK:
from multiprocessing import Process, freeze_support, set_start_method

def foo():
    print('hello')

if __name__ == '__main__':
    freeze_support()
    set_start_method('spawn')
    p = Process(target=foo)
    p.start()
Because the code in the if block doesn't run when the module is imported.
But putting it in a function should also work:
from multiprocessing import Process

def foo():
    print('hello')

def bar():
    p = Process(target=foo)
    p.start()
When this module is run, it will define two new functions, not run them.
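A short usage sketch, assuming the module above is saved as mymodule.py (the name is hypothetical): nothing is spawned at import time, and the process only starts when bar() is called from a guarded entry point.

# myscript.py
import mymodule  # importing only defines foo() and bar()

if __name__ == '__main__':
    mymodule.bar()  # the process is started here, not at import time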
I've been developing an Instagram image scraper, so in order to make the download & save operations run faster I've implemented multiprocessing in one auxiliary module. Note that this code is inside an auxiliary module and not inside the main module.
The solution I found is adding this line:
if __name__ != '__main__':
Pretty simple, but it actually works!
import multiprocessing
import requests

def multi_proces(urls, profile):
    img_saved = 0
    if __name__ != '__main__':  # line needed for the sake of getting this NOT to crash
        processes = []
        for url in urls:
            try:
                process = multiprocessing.Process(target=download_save, args=[url, profile, img_saved])
                processes.append(process)
                img_saved += 1
            except:
                continue
        for proce in processes:
            proce.start()
        for proce in processes:
            proce.join()
    return img_saved

def download_save(url, profile, img_saved):
    file = requests.get(url, allow_redirects=True)  # Download
    open(f"scraped_data\\{profile}\\{profile}-{img_saved}.jpg", 'wb').write(file.content)  # Save
I have a module named multi.py. If I simply wanted to execute multi.py as a script, then the workaround to avoid crashing on Windows (spawning an infinite number of processes) is to put the multiprocessing code under:
if __name__ == '__main__':
However, I am trying to import it as a module from another script and call multi.start(). How can this be accomplished?
# multi.py
import multiprocessing

def test(x):
    x **= 2

def start():
    pool = multiprocessing.Pool(processes=multiprocessing.cpu_count() - 2)
    pool.map(test, (i for i in range(1000 * 1000)))
    pool.terminate()
    print('done.')

if __name__ == '__main__':
    print('runs as a script,', __name__)
else:
    print('runs as imported module,', __name__)
This is my test.py I run:
# test.py
import multi
multi.start()
I don't quite get what you're asking. You don't need to do anything to prevent this from spawning infinitely many processes. I just ran it on Windows XP --- imported the file and ran multi.start() --- and it completed fine in a couple seconds.
The reason you have to do the if __name__=="__main__" protection is that, on Windows, multiprocessing has to import the main script in order to run the target function, which means top-level module code in that file will be executed. The problem only arises if that top-level module code itself tries to spawn a new process. In your example, the top level module code doesn't use multiprocessing, so there's no infinite process chain.
Edit: Now I get what you're asking. You don't need to protect multi.py. You need to protect your main script, whatever it is. If you're getting a crash, it's because in your main script you are doing multi.start() in the top level module code. Your script needs to look like this:
import multi

if __name__ == "__main__":
    multi.start()
The "protection" is always needed in the main script.