Python parallel threads

Here is the code, which downloads 3 files and does something with them.
But before starting Thread2, it waits until Thread1 is finished. How can I make them run together?
Please give some examples with commentary. Thanks.
import threading
import urllib.request

def testForThread1():
    print('[Thread1]::Started')
    resp = urllib.request.urlopen('http://192.168.85.16/SOME_FILE')
    data = resp.read()
    # Do something with it
    return 'ok'

def testForThread2():
    print('[Thread2]::Started')
    resp = urllib.request.urlopen('http://192.168.85.10/SOME_FILE')
    data = resp.read()
    # Do something with it
    return 'ok'

if __name__ == "__main__":
    t1 = threading.Thread(name="Hello1", target=testForThread1())
    t1.start()
    t2 = threading.Thread(name="Hello2", target=testForThread2())
    t2.start()

    print(threading.enumerate())

    t1.join()
    t2.join()
    exit(0)

You are executing the target function for the thread during thread instance creation.
if __name__ == "__main__":
    t1 = threading.Thread(name="Hello1", target=testForThread1())  # <<-- here
    t1.start()
This is equivalent to:
if __name__ == "__main__":
    result = testForThread1()  # == 'ok', this is the blocking execution
    t1 = threading.Thread(name="Hello1", target=result)
    t1.start()
It's Thread.start()'s job to execute that function in the new thread (note that a plain Thread discards the return value; it gives you no direct way to reclaim it). As you can see, the previous form was executing the blocking function in the main thread, preventing you from parallelizing anything: that call had to finish before execution even reached the line that creates the second thread.
The proper way to set up the thread in a non-blocking fashion would be:
if __name__ == "__main__":
    t1 = threading.Thread(name="Hello1", target=testForThread1)  # tell thread what the target function is
    # notice no function call braces for the function "testForThread1"
    t1.start()  # tell the thread to execute the target function
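Putting it together, a minimal sketch of the whole script with both downloads running concurrently (reusing the question's example URLs; the download helper below is just the two test functions merged into one):
import threading
import urllib.request

def download(name, url):
    print('[{}]::Started'.format(name))
    resp = urllib.request.urlopen(url)
    data = resp.read()
    # Do something with it
    print('[{}]::Finished'.format(name))

if __name__ == "__main__":
    # Pass the function object plus its arguments; start() runs it in the new thread.
    t1 = threading.Thread(name="Hello1", target=download,
                          args=("Thread1", "http://192.168.85.16/SOME_FILE"))
    t2 = threading.Thread(name="Hello2", target=download,
                          args=("Thread2", "http://192.168.85.10/SOME_FILE"))
    t1.start()
    t2.start()  # starts right away; it does not wait for t1
    t1.join()
    t2.join()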

For this we could use threading, but it's not efficient here: since you want to download files, the total time will be equal to the sum of the download times of all the files.
If you have good internet speed, then multiprocessing is the best way.
import multiprocessing

def test_function():
    for i in range(199999998):
        pass

t1 = multiprocessing.Process(target=test_function)
t2 = multiprocessing.Process(target=test_function)
t1.start()
t2.start()
This is the fastest solution. You can check this using the following command:
time python3 filename.py
You will get output like this:
real 0m6.183s
user 0m12.277s
sys 0m0.009s
Normally, real = user + sys, where user is the CPU time spent executing the Python code.
But you can see that the formula doesn't hold here: each function takes approximately 6.14 seconds on its own, yet with multiprocessing both together finish in about 6.18 seconds of real time, so running them in parallel roughly halves the total time.
You can read more about it here.
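If you prefer to measure from inside the script rather than with the shell's time command, here is a rough sketch using time.perf_counter (same test_function as above, with joins added so the measurement covers both processes):
import multiprocessing
import time

def test_function():
    for i in range(199999998):
        pass

if __name__ == "__main__":
    start = time.perf_counter()
    p1 = multiprocessing.Process(target=test_function)
    p2 = multiprocessing.Process(target=test_function)
    p1.start()
    p2.start()
    p1.join()  # wait for both workers before stopping the clock
    p2.join()
    # Wall-clock time is roughly one function's duration, not two,
    # because the two processes run on separate cores.
    print("parallel wall time:", time.perf_counter() - start)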

Related

Python Multi-Threading use data from a thread in another thread

I'm new to Python threading and what I'm trying to do is:
1. A thread with a while loop that executes a GET request to an API every N seconds to refresh the data.
2. A second thread with a while loop that uses that data (IP addresses) to ping targets every N seconds.
So I was looking for a way to start the first thread, start the second one only after the first API call, and then share the data with the second thread so it can execute its logic.
Can anyone help me please? Thanks.
As per your requirements, here is some simple boilerplate code you might want to try:
import time
import threading

available = False

def thread1():
    global available
    while True:
        # TODO: call API
        # --------------
        available = True  # set available True after API call
        time.sleep(5)  # perform API calls after every 5 seconds

def thread2():
    while True:
        # TODO: perform ping
        # --------------
        # perform ping request after every 5 seconds
        time.sleep(5)

if __name__ == "__main__":
    t1 = threading.Thread(target=thread1, name="thread1")
    t2 = threading.Thread(target=thread2, name="thread2")
    t1.start()
    while not available:
        time.sleep(0.1)
    else:
        t2.start()
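If you also need to hand the fetched data (the IP addresses) over to the second thread, one common pattern is a threading.Event to signal "first fetch done" plus a lock-protected shared variable. A minimal sketch; fetch_ips and the ping body are placeholders for your real API call and ping logic:
import time
import threading

ready = threading.Event()   # signals that the first API call has completed
lock = threading.Lock()     # protects access to shared_ips
shared_ips = []

def fetch_ips():
    # Placeholder for the real GET request
    return ["10.0.0.1", "10.0.0.2"]

def refresher():
    global shared_ips
    while True:
        ips = fetch_ips()
        with lock:
            shared_ips = ips
        ready.set()         # unblock the pinger after the first fetch
        time.sleep(5)

def pinger():
    ready.wait()            # don't start until data is available
    while True:
        with lock:
            targets = list(shared_ips)
        for ip in targets:
            pass            # TODO: ping each target here
        time.sleep(5)

if __name__ == "__main__":
    threading.Thread(target=refresher, daemon=True).start()
    threading.Thread(target=pinger, daemon=True).start()
    while True:             # keep the main thread alive
        time.sleep(1)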

running two scripts simultaneously from a master script when each script has multiple threads within it in python

I want to run two or more Python scripts simultaneously from a master script. Each of these scripts already has threads within it which run in parallel. For example, I run
script1.py
if __name__ == '__main__':
    pid_vav = PID_VAV('B2')
    t1 = threading.Thread(target=pid_vav.Controls)
    t1.daemon = False
    t1.start()
    t2 = threading.Thread(target=pid_vav.mqttConnection)
    t2.daemon = False
    t2.start()
script2.py
if __name__ == '__main__':
    pid_vav = PID_VAV('B4')
    t1 = threading.Thread(target=pid_vav.Controls)
    t1.daemon = False
    t1.start()
    t2 = threading.Thread(target=pid_vav.mqttConnection)
    t2.daemon = False
    t2.start()
I am running script1.py and script2.py separately. The only difference is the parameter I am passing to the class. Is it possible to have a master script such that if I just run that, both of these scripts will run?
Thanks
Assuming you want the output of both scripts to be shown when you run the master script, you can make use of the subprocess module to call each Python file and the threading module to start the calls in separate threads:
from threading import Thread
import subprocess
t1 = Thread(target=subprocess.run, args=(["python", "script1.py"],))
t2 = Thread(target=subprocess.run, args=(["python", "script2.py"],))
t1.start()
t2.start()
t1.join()
t2.join()
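If you would rather collect each script's output (instead of letting the two interleave on the console), subprocess.run can capture it. A sketch, assuming Python 3.7+ for capture_output and using sys.executable so the children run under the same interpreter:
from threading import Thread
import subprocess
import sys

results = {}

def run_script(path):
    # Each thread runs one script and stores the completed process object
    results[path] = subprocess.run([sys.executable, path],
                                   capture_output=True, text=True)

threads = [Thread(target=run_script, args=(s,)) for s in ("script1.py", "script2.py")]
for t in threads:
    t.start()
for t in threads:
    t.join()

for path, proc in results.items():
    print(path, "exited with", proc.returncode)
    print(proc.stdout)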
If you want to trigger two scripts from a master script, you can use the method below.
It starts both scripts as separate processes, and each of those processes can still spawn whatever threads it needs based on the called scripts. You can even let the scripts run independently of the master.
import subprocess
import sys

pid1 = subprocess.Popen([sys.executable, "script1.py"])
pid2 = subprocess.Popen([sys.executable, "script2.py"])
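Since Popen returns immediately, the master script can end while the children are still running. If you want it to wait for them, wait() blocks until each one finishes; a small sketch with the same two hypothetical scripts:
import subprocess
import sys

# Both children start immediately and run in parallel with each other
pid1 = subprocess.Popen([sys.executable, "script1.py"])
pid2 = subprocess.Popen([sys.executable, "script2.py"])

# wait() blocks the master until each child exits and returns its exit code
rc1 = pid1.wait()
rc2 = pid2.wait()
print("exit codes:", rc1, rc2)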
Yes, of course.
script_master.py:
from os import system
system('start script1.py && start script2.py')
But I think you could use this code:
script_together.py:
import threading

# assumes PID_VAV is defined or imported in this file
if __name__ == '__main__':
    todo = []
    todo.append(threading.Thread(target=lambda: PID_VAV('B2').Controls(), daemon=False))
    todo.append(threading.Thread(target=lambda: PID_VAV('B4').mqttConnection(), daemon=False))
    for th in todo:
        th.start()
    for th in todo:
        th.join()
If you're happy to have the code for both live in one file, you can use multiprocessing to run them concurrently on different CPU cores.
import multiprocessing as mp
from threading import Thread

def start_process(pid_vav_label):
    pid_vav, threads = PID_VAV(pid_vav_label), []
    threads.append(Thread(target=pid_vav.Controls))
    threads.append(Thread(target=pid_vav.mqttConnection))
    for thread in threads:
        thread.start()
    # Join if necessary
    for thread in threads:
        thread.join()

if __name__ == '__main__':
    processes = []
    for label in ['B2', 'B4']:
        processes.append(mp.Process(target=start_process, args=(label,)))
        processes[-1].start()
    # Again, can join if necessary
    for process in processes:
        process.join()

Using multiple threads to unblock network calls

For simplification purposes, let's suppose I'm downloading multiple large files from S3 to my local machine.
def get_file(name):
    # pull from S3 and returns DataFrame
    return df

if __name__ == "__main__":
    df1 = get_file("large_file_1.csv")
    df2 = get_file("large_file_2.csv")
    df3 = get_file("large_file_3.csv")
and I want to refactor this code to make these calls non-blocking (i.e. start pulling all of them from S3 at once and wait for them to finish). My first instinct is to use the threading module with something like
from threading import Thread

if __name__ == "__main__":
    t1 = Thread(target=get_file, args=("large_file_1.csv",))
    t2 = Thread(target=get_file, args=("large_file_2.csv",))
    t3 = Thread(target=get_file, args=("large_file_3.csv",))
    t1.start()
    t2.start()
    t3.start()
    t1.join()
    t2.join()
    t3.join()
However, Thread doesn't expose a way to assign the return value of the target function to a variable. What's the preferred way of going about this in Python?
A simple way to do the work concurrently, and get a response back from each thread, is to use a ThreadPoolExecutor:
from concurrent.futures import ThreadPoolExecutor

def get_file(f):
    # Do real work here
    return f + "1"  # Return a real result here

l = ["large_file_1.csv", "large_file_2.csv", "large_file3.csv"]

pool = ThreadPoolExecutor(3)
out = pool.map(get_file, l)
print(list(out))
Output:
['large_file_1.csv1', 'large_file_2.csv1', 'large_file3.csv1']
You could also keep using Thread directly, and use a Queue to get the results back, but ThreadPoolExecutor is abstracting that away for you, so there's really no need.
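For completeness, a rough sketch of that Thread-plus-Queue variant (get_file is a stand-in here, as above):
from queue import Queue
from threading import Thread

def get_file(name):
    # Stand-in for the real S3 download
    return name + "1"

def worker(name, out_queue):
    # Put (name, result) on the queue instead of returning it
    out_queue.put((name, get_file(name)))

if __name__ == "__main__":
    files = ["large_file_1.csv", "large_file_2.csv", "large_file_3.csv"]
    results_q = Queue()
    threads = [Thread(target=worker, args=(f, results_q)) for f in files]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    results = dict(results_q.get() for _ in files)
    print(results)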

os._exit() called on another thread does not exit a program

I just start a new thread:
self.thread = ThreadedFunc()
self.thread.start()
After something happens, I want to exit my program, so I'm calling os._exit():
os._exit(1)
The program still works. Everything is functional, and it just looks like os._exit() didn't execute.
Is there a different way to exit the whole program from a different thread? How can I fix this?
EDIT: Added a more complete code sample.
self.thread = DownloadThread()
self.thread.data_downloaded.connect(self.on_data_ready)
self.thread.data_progress.connect(self.on_progress_ready)
self.progress_initialized = False
self.thread.start()
class DownloadThread(QtCore.QThread):
    def run(self):
        # downloading stuff etc.
        sleep(1)
        subprocess.call(os.getcwd() + "\\another_process.exe")
        sleep(2)
        os._exit(1)
EDIT 2: SOLVED! There are quit(), terminate() and exit() functions which just stop the thread. It was that easy. Just look at the docs.
Calling os._exit(1) works for me.
You should use the standard-library threading module.
I guess you are using multiprocessing, which is a process-based "threading" interface: it uses a similar API to threading but creates a child process instead of a child thread, so os._exit(1) only exits the child process and does not affect the main process.
Also, you should ensure you have called join() in the main thread. Otherwise, it is possible that the operating system schedules the main thread to run to the end before anything in the child thread gets to run.
sys.exit() does not work because it is the same as raising a SystemExit exception; raising an exception in a thread only exits that thread, rather than the entire process.
Sample code, tested under Ubuntu with python3 thread.py; echo $?.
The return code is 1, as expected.
import os
import sys
import time
import threading

# Python Threading Example for Beginners

# First Method
def greet_them(people):
    for person in people:
        print("Hello Dear " + person + ". How are you?")
        os._exit(1)
        time.sleep(0.5)

# Second Method
def assign_id(people):
    i = 1
    for person in people:
        print("Hey! {}, your id is {}.".format(person, i))
        i += 1
        time.sleep(0.5)

people = ['Richard', 'Dinesh', 'Elrich', 'Gilfoyle', 'Gevin']

t = time.time()

# Created the Threads
t1 = threading.Thread(target=greet_them, args=(people,))
t2 = threading.Thread(target=assign_id, args=(people,))

# Started the threads
t1.start()
t2.start()

# Joined the threads
t1.join()  # Cannot remove this join() for this example
t2.join()

# Possible to reach here if join() removed
print("I took " + str(time.time() - t))
Credit: Sample code is copied and modified from https://www.simplifiedpython.net/python-threading-example/
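For contrast, a tiny sketch of the sys.exit() behaviour mentioned above: the SystemExit it raises only ends the worker thread, and the rest of the process keeps running:
import sys
import threading

def worker():
    print("worker: calling sys.exit(1)")
    sys.exit(1)  # raises SystemExit, which ends only this thread

t = threading.Thread(target=worker)
t.start()
t.join()
print("main thread is still alive")  # this line is still reached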

Threading an endless while loop in Python 2

I'm not sure why this does not work. The thread starts as soon as it is defined and seems to not be in an actual thread... Maybe I'm missing something.
import threading
import time

def endless_loop1():
    while True:
        print('EndlessLoop1:' + str(time.time()))
        time.sleep(2)

def endless_loop2():
    while True:
        print('EndlessLoop2:' + str(time.time()))
        time.sleep(1)

print('Here1')
t1 = threading.Thread(name='t1', target=endless_loop1(), daemon=True)
print('Here2')
t2 = threading.Thread(name='t2', target=endless_loop2(), daemon=True)
print('Here3')
t1.start()
print('Here4')
t2.start()
Outputs:
Here1
EndlessLoop1:1446675282.8
EndlessLoop1:1446675284.8
EndlessLoop1:1446675286.81
You need to give target= a callable object.
target=endless_loop1()
Here you're actually calling endless_loop1(), so it gets executed in your main thread right away. What you want to do is:
target=endless_loop1
which passes your Thread the function object so it can call it itself.
Also, in Python 2 daemon isn't actually an init parameter; you need to set it separately before calling start:
t1 = threading.Thread(name='t1', target=endless_loop1)
t1.daemon = True
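Putting both fixes together, a corrected sketch of the script (Python 2 style, so daemon is set as an attribute rather than passed to the constructor):
import threading
import time

def endless_loop1():
    while True:
        print('EndlessLoop1:' + str(time.time()))
        time.sleep(2)

def endless_loop2():
    while True:
        print('EndlessLoop2:' + str(time.time()))
        time.sleep(1)

t1 = threading.Thread(name='t1', target=endless_loop1)  # no parentheses: pass the function itself
t1.daemon = True
t2 = threading.Thread(name='t2', target=endless_loop2)
t2.daemon = True

t1.start()
t2.start()

# Daemon threads die as soon as the main thread exits, so keep it alive
while True:
    time.sleep(1)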
