Running multiple threads concurrently - python

I am trying to run code with multiple threads, where the user can decide how many threads to run. I tried doing it with the threading module in Python 3.7.
My code is shown below, but my problem is that instead of running all the threads together, it runs them one after the other...
import threading

x = int(input("Enter number of threads: "))

def main():
    print("My main function")
    print("Does some stuff...")

while x > 0:
    print("Starting Threads.")
    x = x - 1  # x gets decremented on every pass, so the loop stops once it hits 0
    t1 = threading.Thread(target=main)  # a new thread gets created on every pass
    t1.start()  # and the thread starts here
Now I need to find out how I can make them run at the same time, not one after the other. Thanks

Your code already runs the threads in parallel (note: just on one single core though; that is a Python limitation due to the global interpreter lock).
To make it more obvious, change your main function a little; the way it is right now it just finishes too quickly. I suggest:
from time import sleep
from random import random

def main():
    print("main starting")
    sleep(random())
    print("main done")
This will output something like:
Enter number of threads: 4
Starting Threads.
main starting
Starting Threads.
main starting
Starting Threads.
main starting
Starting Threads.
main starting
main done
main done
main done
main done
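To see that they really do run together, and to make the main program wait until every thread has finished, keep the Thread objects in a list and join them. A minimal sketch building on the question's x and main (not part of the original code):

import threading

threads = []
for _ in range(x):
    t = threading.Thread(target=main)
    t.start()          # all threads run concurrently from here
    threads.append(t)

for t in threads:
    t.join()           # block until each thread has finished
print("All threads done")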


Execute threads in certain order

When we launch threads, do we know for SURE which thread will be executed first, or is it unpredictable?
I ask this because square is always called first and then cube.
import threading

def print_cube(num):
    # function to print cube of given num
    print("Cube: {}".format(num * num * num))

def print_square(num):
    # function to print square of given num
    print("Square: {}".format(num * num))

if __name__ == "__main__":
    # creating threads
    cuadrado = threading.Thread(target=print_square, args=(10,))
    cubo = threading.Thread(target=print_cube, args=(10,))
    # starting thread 1
    cuadrado.start()
    # starting thread 2
    cubo.start()
    print("Done!")
I would like to understand the method threading.Thread.start().
Does the order of calling start() matter?
But if I make the threads sleep for the same amount of time, then the order is random:
import threading
import time

def print_cube(num):
    # function to print cube of given num
    time.sleep(3)
    print("Cube: {}".format(num * num * num))

def print_square(num):
    # function to print square of given num
    time.sleep(3)
    print("Square: {}".format(num * num))

if __name__ == "__main__":
    # creating threads
    cuadrado = threading.Thread(target=print_square, args=(10,))
    cubo = threading.Thread(target=print_cube, args=(10,))
    # starting thread 1
    cuadrado.start()
    # starting thread 2
    cubo.start()
    # both threads completely executed
    print("Done!")
I would like to understand the method threading.Thread.start().
A Python threading.Thread object is not the same thing as a thread. A thread is an object in the operating system—separate from your code. I like to think of a thread as an agent who executes your target function.
The purpose of the Python Thread class is to provide a platform-independent interface to the native thread APIs of the various operating systems. One peculiarity of Python's Thread is that it does not actually create the operating system thread until you call its start() method. That's what start() does: it creates the underlying OS thread.
Does the order of calling Threading start() matter?
Depends what you mean. Your program definitely always starts the cuadrado thread before it starts the cubo thread, but the whole point of threads is to provide a means to achieve concurrency in your program; and what "concurrency" means is that the things happening in different threads are not required to happen in any definite order. By calling print_cube() and print_square() in different threads, you effectively are telling Python (and the OS) that you don't care which one prints first.
Maybe print_square() will always be called first on your computer. Maybe print_cube() will always be called first on somebody else's computer. Maybe it will be unpredictable which one goes first on a third computer.
Sounds a little chaotic, but the reason why we like concurrency is that it gives the OS and the Python system more freedom to get things done in the most efficient order. E.g., if one thread is waiting for some network packet to arrive, some other thread can be allowed to do some useful work. So long as the "useful work" doesn't need the packet that the other thread was waiting for, that's a Good Thing.
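If you do need a definite order, you have to impose it yourself, at the cost of the concurrency. A minimal sketch reusing the question's print_square and print_cube, where join() forces the square to print before the cube:

import threading

cuadrado = threading.Thread(target=print_square, args=(10,))
cubo = threading.Thread(target=print_cube, args=(10,))

cuadrado.start()
cuadrado.join()  # wait until the square has been printed
cubo.start()
cubo.join()
print("Done!")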

how to start multiple jobs in python and communicate with the main job

I am a novice user of python multithreading/multiprocessing, so please bear with me.
I would like to solve the following problem and I need some help/suggestions in this regard.
Let me describe in brief:
I would like to start a Python script which does something sequentially in the beginning.
After the sequential part is over, I would like to start some jobs in parallel.
Assume that there are four parallel jobs I want to start.
I would like to also start these jobs on some other machines using "lsf" on the computing cluster. My initial script is also running on an "lsf" machine.
The four jobs started on four machines will perform two logical steps, A and B, one after the other.
When a job starts, it begins with logical step A and finishes it.
After every job (4 jobs) has finished step A, each should notify the first job that started them. In other words, the main job is waiting for confirmation from these four jobs.
Once the main job receives confirmation from all four jobs, it should notify them to do logical step B.
Logical step B will automatically terminate the jobs after finishing the task.
The main job waits for all jobs to finish and then continues with the sequential part.
An example scenario would be:
A Python script running on an "lsf" machine in the cluster starts four Tcl shells on four "lsf" machines.
In each Tcl shell, a script is sourced to do logical step A.
Once step A is done, they should somehow inform the Python script, which is waiting for the acknowledgement.
Once the acknowledgement is received from all four, the Python script informs them to do logical step B.
Logical step B is also a script which is sourced in their Tcl shell; this script will also close the Tcl shell at the end.
Meanwhile, the Python script is waiting for all four jobs to finish.
After all four jobs are finished, it should continue with the sequential part again and finish later on.
Here are my questions:
I am confused about whether I should use multithreading or multiprocessing. Which one suits this better?
In fact, what is the difference between these two? I read about them but I wasn't able to reach a conclusion.
What is the Python GIL? I also read somewhere that at any one point in time only one thread will execute.
I need some explanation here. It gives me the impression that I can't use threads.
Any suggestions on how I could solve my problem systematically and in a more Pythonic way?
I am looking for a verbal step-by-step explanation and some pointers to read on each step.
Once the concepts are clear, I would like to code it myself.
Thanks in advance.
In addition to roganjosh's answer, I would include some signaling to start step B after A has finished:
import multiprocessing as mp
import sys

def func_A(process_number, queue, proceed):
    print "Process {} has been created".format(process_number)
    print "Process {} has ended step A".format(process_number)
    sys.stdout.flush()
    queue.put((process_number, "done"))
    proceed.wait()  # wait for the signal to do the second part
    print "Process {} has ended step B".format(process_number)
    sys.stdout.flush()

def multiproc_master():
    queue = mp.Queue()
    proceed = mp.Event()
    processes = [mp.Process(target=func_A, args=(x, queue, proceed)) for x in range(4)]
    for p in processes:
        p.start()
    # block=True waits until there is something available
    results = [queue.get(block=True) for p in processes]
    proceed.set()  # set continue-flag
    for p in processes:  # wait for all to finish (also on Windows)
        p.join()
    return results

if __name__ == '__main__':
    split_jobs = multiproc_master()
    print split_jobs
1) From the options you listed in your question, you should probably use multiprocessing in this case to leverage multiple CPU cores and compute things in parallel.
2) Going further from point 1: the Global Interpreter Lock (GIL) means that only one thread can actually execute code at any one time.
A simple example for multithreading that pops up often here is having a prompt for user input for, say, an answer to a maths problem. In the background, they want a timer to keep incrementing at one second intervals to register how long the person took to respond. Without multithreading, the program would block whilst waiting for user input and the counter would not increment. In this case, you could have the counter and the input prompt run on different threads so that they appear to be running at the same time. In reality, both threads are sharing the same CPU resource and are constantly passing an object backwards and forwards (the GIL) to grant them individual access to the CPU. This is hopeless if you want to properly process things in parallel. (Note: In reality, you'd just record the time before and after the prompt and calculate the difference rather than bothering with threads.)
3) I have made a really simple example using multiprocessing. In this case, I spawn 4 processes that compute the sum of squares for a randomly chosen range. These processes do not have a shared GIL and therefore execute independently unlike multithreading. In this example, you can see that all processes start and end at slightly different times, but we can aggregate the results of the processes into a single queue object. The parent process will wait for all 4 child processes to return their computations before moving on. You could then repeat the code for func_B (not included in the code).
import multiprocessing as mp
import time
import random
import sys

def func_A(process_number, queue):
    start = time.time()
    print "Process {} has started at {}".format(process_number, start)
    sys.stdout.flush()
    my_calc = sum([x**2 for x in xrange(random.randint(1000000, 3000000))])
    end = time.time()
    print "Process {} has ended at {}".format(process_number, end)
    sys.stdout.flush()
    queue.put((process_number, my_calc))

def multiproc_master():
    queue = mp.Queue()
    processes = [mp.Process(target=func_A, args=(x, queue)) for x in xrange(4)]
    for p in processes:
        p.start()
    # Uncomment the below if you run on Linux (Windows and Linux treat multiprocessing
    # differently, as Windows lacks os.fork())
    #for p in processes:
    #    p.join()
    results = [queue.get() for p in processes]
    return results

if __name__ == '__main__':
    split_jobs = multiproc_master()
    print split_jobs
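As an aside, the same fan-out/aggregate pattern can be written with multiprocessing.Pool, which manages the queue and the joining for you. A minimal Python 3 sketch of the same computation (an alternative, not from the original answer):

import multiprocessing as mp
import random

def sum_of_squares(process_number):
    # same randomly sized workload as above
    n = random.randint(1000000, 3000000)
    return process_number, sum(x**2 for x in range(n))

if __name__ == '__main__':
    with mp.Pool(4) as pool:  # four worker processes
        results = pool.map(sum_of_squares, range(4))
    print(results)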

Has the script achieved what it was supposed to? Please check, I am a beginner

I am just a beginner in Python. What I tried to achieve is making two threads and calling different functions in different threads. The function in thread 1 should execute a command for 60 seconds while thread 2 executes simultaneously, and the main thread should wait for 70 seconds. When thread 1 exits it should also exit the second thread, and finally control should come back to the main thread; then the calls to thread 1 and thread 2 should go out again and the same procedure repeat.
I tried achieving it with the script below, but I think I was not able to.
I have made a script in which I have started two threads, named thread 1 and thread 2.
In thread 1 a function named func1 will run, and in thread 2 a function named func2 will run.
Thread 1 will execute a command and wait for 60 seconds.
Thread 2 will run only while thread 1 is running.
After that the same process continues in a while loop, after a break of 80 seconds.
I am a beginner in Python.
Please suggest what I have done wrong and how to correct it.
#!/usr/bin/python
import threading
import time
import subprocess
import datetime
import os
import thread

thread.start_new_thread( print_time, (None, None))
thread.start_new_thread( print_time1, (None, None))

command = "strace -o /root/Desktop/a.txt -c ./server"
final_dir = "/root/Desktop"
exitflag = 0

# Define a function for the thread
def print_time(*args):
    os.chdir(final_dir)
    print "IN first thread"
    proc = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    proc.wait(70)
    exitflag = 1

def print_time1(*args):
    print "In second thread"
    global exitflag
    while exitflag:
        thread.exit()
    #proc = subprocess.Popen(command1, shell=True, stdout=subprocess.PIPE, sterr=subprocess.PIPE)

# Create two threads as follows
try:
    while (1):
        t1 = threading.Thread(target=print_time)
        t1.start()
        t2 = threading.Thread(target=print_time1)
        t2 = start()
        time.sleep(80)
        z = t1.isAlive()
        z1 = t2.isAlive()
        if z:
            z.exit()
        if z1:
            z1.exit()
        threading.Thread(target=print_time1).start()
        threading.Thread(target=print_time1).start()
        print "In try"
except:
    print "Error: unable to start thread"
I can't get the example to run; I need to change the function definitions to
def print_time(*args)
and the thread call to
thread.start_new_thread( print_time, (None, None))
Then you have a number of problems:
1. You are currently not waiting for the exitflag to be set in the second thread; it just runs to completion.
2. To share variables between threads you need to declare them global in the thread, otherwise you get a local variable.
3. thread.exit() in the print_time1 function generates an error.
4. Your timings in the problem description and in the code do not match.
So, to solve issues 1-3, declare print_time1 like this (removing the exit from the end):
def print_time1(*args):
    global exitflag
    while exitflag == 0:  # busy-wait for print_time to set the flag
        pass
    # Do stuff when the thread is finalizing
But, check the doc for the thread module (https://docs.python.org/2/library/thread.html), "[...] however, you should consider using the high-level threading module instead."
import threading
...
while(1):
    threading.Thread(target=print_time).start()
    threading.Thread(target=print_time1).start()
    time.sleep(80)
One final thought about the code: you should check that the threads have actually finished before starting new ones. Right now two new threads are started every 80 seconds, regardless of whether the old threads have run to completion or not. Unless this is the wanted behaviour, I would add a check for that in the while loop (see the sketch below). Also, while you are at it, move the try clause as close as possible to where the exception might be raised, i.e. where the threads are created. The way you have it now, with the try encapsulating a while loop, is not very common and in my opinion not very Pythonic (it increases the complexity of the code).
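Putting these points together, a minimal sketch of the intended behaviour using the threading module and an Event instead of a global flag (the function bodies are stand-ins, not the original command):

import threading
import time

def worker(stop_event):
    # runs until the main thread signals it to stop
    while not stop_event.is_set():
        time.sleep(0.1)

def main_task(stop_event):
    time.sleep(60)    # stands in for the real 60-second command
    stop_event.set()  # tell the worker to finish

while True:
    stop = threading.Event()
    t1 = threading.Thread(target=main_task, args=(stop,))
    t2 = threading.Thread(target=worker, args=(stop,))
    t1.start()
    t2.start()
    t1.join()  # wait for the threads to actually finish
    t2.join()  # before starting the next pair
    time.sleep(80)  # the 80-second break between rounds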

Python: run one function until another function finishes

I have two functions, draw_ascii_spinner and findCluster(companyid).
I would like to:
Run findCluster(companyid) in the background, and while it's processing...
Run draw_ascii_spinner until findCluster(companyid) finishes.
How do I begin to solve this (Python 2.7)?
Use threads:
import threading, time

def wrapper(func, args, res):
    res.append(func(*args))

res = []
t = threading.Thread(target=wrapper, args=(findCluster, (companyid,), res))
t.start()
while t.is_alive():
    # print next iteration of ASCII spinner
    t.join(0.2)
print res[0]
You can use multiprocessing. Or, if findCluster(companyid) has sensible stopping points, you can turn it into a generator along with draw_ascii_spinner, to do something like this:
for tick in findCluster(companyid):
    ascii_spinner.next()
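For this to work, findCluster has to be rewritten as a generator that yields at its sensible stopping points, and ascii_spinner has to be an iterator of spinner frames. A rough sketch under those assumptions (the loop body is a stand-in for the real work):

import itertools

def findCluster(companyid):
    for step in range(100):  # stands in for the real units of work
        # ... do one chunk of the clustering work here ...
        yield step  # a sensible stopping point

ascii_spinner = itertools.cycle(['|', '/', '-', '\\'])

for tick in findCluster(42):
    print next(ascii_spinner)  # draw the next spinner frame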
Generally, you will use threads. Here is a simplistic approach which assumes that there are only two threads: 1) the main thread executing a task, and 2) the spinner thread:
#!/usr/bin/env python
import time
import thread

def spinner():
    while True:
        print '.'
        time.sleep(1)

def task():
    time.sleep(5)

if __name__ == '__main__':
    thread.start_new_thread(spinner, ())
    # as soon as task finishes (and so does the program),
    # the spinner will be gone as well
    task()
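With the higher-level threading module, the same "spinner dies when the program ends" behaviour needs a daemon thread. A small sketch of that variant (my addition, not from the original answer):

import threading
import time

def spinner():
    while True:
        print '.'
        time.sleep(1)

def task():
    time.sleep(5)

t = threading.Thread(target=spinner)
t.daemon = True  # daemon threads are killed when the main thread exits
t.start()
task()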
This can be done with threads. FindCluster runs in a separate thread and when done, it can simply signal another thread that is polling for a reply.
You'll want to do some research on threading, the general form is going to be this
Create a new thread for findCluster and create some way for the program to know the method is running - simplest in Python is just a global boolean
Run draw_ascii_spinner in a while loop conditioned on whether it is still running, you'll probably want to have this thread sleep for a short period of time between iterations
Here's a short tutorial in Python - http://linuxgazette.net/107/pai.html
Run findCluster() in a thread (the threading module makes this very easy), and then run draw_ascii_spinner until some condition is met.
Instead of using sleep() to set the pace of the spinner, you can wait on the thread's join() with a timeout.
Is it possible to have a working example? I am new to Python. I have 6 tasks to run in one Python program. These 6 tasks should work in coordination, meaning that one should start when another finishes. I saw the answers, but I couldn't adapt the code you shared to my program.
I used time.sleep, but I know that it is not good because I cannot know how much time each step takes.
# Sending commands
for i in range(0, len(cmdList)):  # port sending commands
    cmd = cmdList[i]
    cmdFull = convert(cmd)
    port.write(cmd.encode('ascii'))
    # s = port.read(10)
    print(cmd)

# Terminate the command + close serial port
port.write(cmdFull.encode('ascii'))
print('Termination')
port.close()
# time.sleep(1*60)
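If each task must start only when the previous one has finished, no timing guesses are needed: start each one in a thread and join() it before starting the next. A minimal sketch (task1 ... task6 are hypothetical stand-ins for your 6 tasks):

import threading

tasks = [task1, task2, task3, task4, task5, task6]  # hypothetical functions
for task in tasks:
    t = threading.Thread(target=task)
    t.start()
    t.join()  # block until this task is done before starting the next

At that point plain function calls would behave the same; the threads only pay off if something else, like the spinner above, has to keep running in the meantime.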

How to execute a function asynchronously every 60 seconds in Python?

I want to execute a function every 60 seconds on Python but I don't want to be blocked meanwhile.
How can I do it asynchronously?
import threading
import time

def f():
    print("hello world")
    threading.Timer(3, f).start()

if __name__ == '__main__':
    f()
    time.sleep(20)
With this code, the function f is executed every 3 seconds within the 20 seconds of time.sleep(20).
At the end it gives an error, and I think that it is because the threading.Timer has not been canceled.
How can I cancel it?
You could try the threading.Timer class: http://docs.python.org/library/threading.html#timer-objects.
import threading

def f(f_stop):
    # do something here ...
    if not f_stop.is_set():
        # call f() again in 60 seconds
        threading.Timer(60, f, [f_stop]).start()

f_stop = threading.Event()
# start calling f now and every 60 sec thereafter
f(f_stop)

# stop the thread when needed
# f_stop.set()
The simplest way is to create a background thread that runs something every 60 seconds. A trivial implementation is:
import time
from threading import Thread

class BackgroundTimer(Thread):
    def run(self):
        while 1:
            time.sleep(60)
            # do something

# ... SNIP ...
# Inside your main thread
# ... SNIP ...

timer = BackgroundTimer()
timer.start()
Obviously, if the "do something" takes a long time, then you'll need to account for it in your sleep statement. But 60 seconds serves as a good approximation.
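A sketch of that accommodation, timing the work and sleeping only for the remainder of the interval (my variation on the class above):

import time
from threading import Thread

class BackgroundTimer(Thread):
    def run(self):
        while True:
            started = time.time()
            # do something
            elapsed = time.time() - started
            time.sleep(max(0, 60 - elapsed))  # keep roughly a 60-second cadence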
I googled around and found the Python circuits Framework, which makes it possible to wait
for a particular event.
The .callEvent(self, event, *channels) method of circuits contains fire-and-suspend-until-response functionality; the documentation says:
Fire the given event to the specified channels and suspend execution
until it has been dispatched. This method may only be invoked as
argument to a yield on the top execution level of a handler (e.g.
"yield self.callEvent(event)"). It effectively creates and returns
a generator that will be invoked by the main loop until the event has
been dispatched (see :func:circuits.core.handlers.handler).
I hope you find it as useful as I do :)
./regards
It depends on what you actually want to do in the meantime. Threads are the most general and least preferred way of doing it; you should be aware of the issues with threading when you use it: not all (non-Python) code allows access from multiple threads simultaneously, communication between threads should be done using thread-safe data structures like Queue.Queue, you won't be able to interrupt the thread from outside it, and terminating the program while the thread is still running can lead to a hung interpreter or spurious tracebacks.
Often there's an easier way. If you're doing this in a GUI program, use the GUI library's timer or event functionality. All GUIs have this. Likewise, if you're using another event system, like Twisted or another server-process model, you should be able to hook into the main event loop to call your function regularly. The non-threading approaches do cause your program to be blocked while the function is pending, but not between function calls (see the sketch below).
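For example, with Python 3's tkinter, the toolkit's own after() method schedules a callback on the event loop with no threads involved. A minimal sketch (assuming the repeated work fits in a function f):

import tkinter as tk

def f():
    print("hello world")  # the work to repeat
    root.after(60000, f)  # re-schedule f in 60,000 ms

root = tk.Tk()
root.after(60000, f)  # first call in 60 seconds
root.mainloop()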
Why don't you create a dedicated thread in which you put a simple sleeping loop:
#!/usr/bin/env python
import time

while True:
    # Your code here
    time.sleep(60)
I think the right way to run a function repeatedly on a timer is the following:
import threading
import time

def f():
    global myThread
    print("hello world")  # your code here
    myThread = threading.Timer(3, f)  # re-arm the timer for the next run
    myThread.start()

if __name__ == '__main__':
    myThread = threading.Timer(3, f)  # timer is set to 3 seconds
    myThread.start()
    time.sleep(10)  # it can be a loop or other time-consuming code here
    if myThread.is_alive():
        myThread.cancel()
With this code, the function f is executed every 3 seconds within the 10 seconds of time.sleep(10). At the end, the pending timer is canceled.
If you want to invoke the method "on the clock" (e.g. every hour on the hour), you can integrate the following idea with whichever threading mechanism you choose:
import time

def wait(n):
    '''Wait until the next increment of n seconds'''
    x = time.time()
    time.sleep(n - (x % n))
    print(time.asctime())
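A usage sketch for running something "on the clock" with it (do_hourly_work is a hypothetical stand-in for your function):

while True:
    wait(3600)        # returns at the top of the next hour
    do_hourly_work()  # hypothetical function to run every hour on the hour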
[snip. removed non async version]
To do this with async you would use trio. I recommend trio to everyone who asks about async Python; it is much easier to work with, especially sockets. With sockets I have a nursery with one read and one write function: the write function writes data from a deque, where it is placed by the read function while waiting to be sent. The following app works by using trio.run(function, parameters) and then opening a nursery where the program functions run in loops, with an await trio.sleep(60) between each loop to give the rest of the app a chance to run. This runs the program in a single process, but your machine can handle 1500 TCP connections instead of just 255 with the non-async method.
I have not yet mastered the cancellation statements, but I put in a move_on_after(70), which means the code will wait up to 10 seconds longer than the 60-second sleep before moving on to the next loop.
import trio

async def execTimer():
    '''This function gets executed in a nursery simultaneously with the rest of the program'''
    while True:
        with trio.move_on_after(70):  # cancel scope: give up after 70 seconds
            await trio.sleep(60)
            print('60 Second Loop')

async def announce():
    print('do the rest of the program simultaneously')

async def OneTime_OneMinute():
    '''This function gets run by trio.run to start the entire program'''
    async with trio.open_nursery() as nursery:  # open_nursery is an async context manager
        nursery.start_soon(execTimer)
        nursery.start_soon(announce)  # start_soon needs an async function, not a plain print

def start():
    '''You may have only one trio.run in the entire application'''
    trio.run(OneTime_OneMinute)

if __name__ == '__main__':
    start()
This will run any number of functions simultaneously in the nursery. You can use any of the cancellable statements for checkpoints where the rest of the program gets to continue running. All trio statements are checkpoints, so use them a lot. I did not test this app, so if there are any questions just ask.
As you can see trio is the champion of easy-to-use functionality. It is based on using functions instead of objects but you can use objects if you wish.
Read more at: https://trio.readthedocs.io/en/stable/reference-core.html
