Python thread.join(timeout) not timing out

Python thread.join(timeout) not timing out - python

I am using threading python module. I want to execute a function which runs an expression entered by a user. I want to wait for it to finish execution or until a timeout period is reached. The following code should timeout after 5 second, but it never times out.
def longFunc():
# this expression could be entered by the user
return 45 ** 10 ** 1000
thread = threading.Thread(target=longFunc, args=(), daemon=True)
thread.start()
thread.join(5.0)
print("end") # never reaches this point :(
Why is this and how can I fix this behaviour? Should I try to use multiprocessing instead?

I suspect in this case you're hitting an issue where the join can't execute while the global interpreter lock is held by the very long-running calculation, which I believe would happen as a single atomic operation. If you change longFunc to something that happens over multiple instructions, such as a busy loop, e.g.
def longFunc():
while True:
pass
then it works as expected. Is the single expensive calculation realistic for your situation or did the example just happen to hit a very bad case?
Using the multiprocessing module does appear to resolve this issue:
from multiprocessing import Process
def longFunc():
# this expression could be entered by the user
return 45 ** 10 ** 1000
if __name__ == "__main__":
thread = Process(target=longFunc, args=(), daemon=True)
thread.start()
thread.join(5.0)
print("end")
This prints "end" as expected.

Related

Execute threads in certain order

When we launch threads, is it known for SURE which thread will be executed first or is it something not predictable ?
I say this because square is always called first and then cube.
import threading
def print_cube(num):
# function to print cube of given num
print("Cube: {}".format(num * num * num))
def print_square(num):
# function to print square of given num
print("Square: {}".format(num * num))
if __name__ == "__main__":
# creating thread
cuadrado = threading.Thread(target=print_square, args=(10,))
cubo = threading.Thread(target=print_cube, args=(10,))
# starting thread 1
cuadrado.start()
# starting thread 2
cubo.start()
print("Done!")
I would like to understand the method Threading.start()
Does the order of calling Threading start() matter?
But, if I sleep the yarns with the same time, then it is random order
import threading
import time
def print_cube(num):
# function to print cube of given num
time.sleep(3)
print("Cube: {}".format(num * num * num))
def print_square(num):
# function to print square of given num
time.sleep(3)
print("Square: {}".format(num * num))
if __name__ == "__main__":
# creating thread
cuadrado = threading.Thread(target=print_square, args=(10,))
cubo = threading.Thread(target=print_cube, args=(10,))
# starting thread 1
cuadrado.start()
# starting thread 2
cubo.start()
# both threads completely executed
print("Done!")

I would like to understand the method Threading.start()
A Python threading.Thread object is not the same thing as a thread. A thread is an object in the operating system—separate from your code. I like to think of a thread as an agent who executes your target function.
The purpose of the Python Thread class is, to provide a platform-independent interface to the various different thread APIs of various different operating systems. One peculiarity of Python's Thread is that it does not actually create the operating system thread until you call its start() method. That's what start() does: It creates the underlying OS thread.
Does the order of calling Threading start() matter?
Depends what you mean. Your program definitely always starts the cuadrado thread before it starts the cubo thread, but the whole point of threads is to provide a means to achieve concurrency in your program; and what "concurrency" means is that the things happening in different threads are not required to happen in any definite order. By calling print_cube() and print_square() in different threads, you effectively are telling Python (and the OS) that you don't care which one prints first.
Maybe print_square() will always be called first on your computer. Maybe print_cube() will always be called first on somebody else's computer. Maybe it will be unpredictable which one goes first on a third computer.
Sounds a little chaotic, but the reason why we like concurrency is that it gives the OS and the Python system more freedom to get things done in the most efficient order. E.g., if one thread is waiting for some network packet to arrive, some other thread can be allowed to do some useful work. So long as the "useful work" doesn't need the packet that the other thread was waiting for, that's a Good Thing.

Why doesn't eventlet GreenPool call func after spawn_n unless waitall()?

This code prints nothing:
def foo(i):
print i
def main():
pool = eventlet.GreenPool(size=100)
for i in xrange(100):
pool.spawn_n(foo, i)
while True:
pass
But this code prints numbers:
def foo(i):
print i
def main():
pool = eventlet.GreenPool(size=100)
for i in xrange(100):
pool.spawn_n(foo, i)
pool.waitall()
while True:
pass
The only difference is pool.waitall(). In my mind, waitall() means wait until all greenthreads in the pool are finished working, but an infinite loop waits for every greenthread, so pool.waitall() is not necessary.
So why does this happen?
Reference: http://eventlet.net/doc/modules/greenpool.html#eventlet.greenpool.GreenPool.waitall

The threads created in an eventlet GreenPool are green threads. This means that they all exist within one thread at the operating-system level, and the Python interpreter handles switching between them. This switching can only happen when one thread either yields (deliberately provides an opportunity for other threads to run) or is waiting for I/O.
When your code runs:
while True:
pass
… that thread of execution is blocked – stuck on that code – and no other green threads can get scheduled.
When you instead run:
pool.waitall()
… eventlet makes sure that it yields while waiting.
You could emulate this same behaviour by modifying your while loop slightly to call the eventlet.sleep function, which yields:
while True:
eventlet.sleep()
This could be useful if you wanted to do something else in the while True: loop while waiting for the threads in your pool to complete. Otherwise, just use pool.waitall() – that’s what it’s for.

Return whichever expression returns first

I have two different functions f, and g that compute the same result with different algorithms. Sometimes one or the other takes a long time while the other terminates quickly. I want to create a new function that runs each simultaneously and then returns the result from the first that finishes.
I want to create that function with a higher order function
h = firstresult(f, g)
What is the best way to accomplish this in Python?
I suspect that the solution involves threading. I'd like to avoid discussion of the GIL.

I would simply use a Queue for this. Start the threads and the first one which has a result ready writes to the queue.
Code
from threading import Thread
from time import sleep
from Queue import Queue
def firstresult(*functions):
queue = Queue()
threads = []
for f in functions:
def thread_main():
queue.put(f())
thread = Thread(target=thread_main)
threads.append(thread)
thread.start()
result = queue.get()
return result
def slow():
sleep(1)
return 42
def fast():
return 0
if __name__ == '__main__':
print firstresult(slow, fast)
Live demo
http://ideone.com/jzzZX2
Notes
Stopping the threads is an entirely different topic. For this you need to add some state variable to the threads which needs to be checked in regular intervals. As I want to keep this example short I simply assumed that part and assumed that all workers get the time to finish their work even though the result is never read.
Skipping the discussion about the Gil as requested by the questioner. ;-)

Now - unlike my suggestion on the other answer, this piece of code does exactly what you are requesting:
from multiprocessing import Process, Queue
import random
import time
def firstresult(func1, func2):
queue = Queue()
proc1 = Process(target=func1,args=(queue,))
proc2 = Process(target=func2, args=(queue,))
proc1.start();proc2.start()
result = queue.get()
proc1.terminate(); proc2.terminate()
return result
def algo1(queue):
time.sleep(random.uniform(0,1))
queue.put("algo 1")
def algo2(queue):
time.sleep(random.uniform(0,1))
queue.put("algo 2")
print firstresult(algo1, algo2)

Run each function in a new worker thread, the 2 worker threads send the result back to the main thread in a 1 item queue or something similar. When the main thread receives the result from the winner, it kills (do python threads support kill yet? lol.) both worker threads to avoid wasting time (one function may take hours while the other only takes a second).
Replace the word thread with process if you want.

You will need to run each function in another process (with multiprocessing) or in a different thread.
If both are CPU bound, multithread won help much - exactly due to the GIL -
so multiprocessing is the way.
If the return value is a pickleable (serializable) object, I have this decorator I created that simply runs the function in background, in another process:
https://bitbucket.org/jsbueno/lelo/src
It is not exactly what you want - as both are non-blocking and start executing right away. The tirck with this decorator is that it blocks (and waits for the function to complete) as when you try to use the return value.
But on the other hand - it is just a decorator that does all the work.

Python: run one function until another function finishes

I have two functions, draw_ascii_spinner and findCluster(companyid).
I would like to:
Run findCluster(companyid) in the backround and while its processing....
Run draw_ascii_spinner until findCluster(companyid) finishes
How do I begin to try to solve for this (Python 2.7)?

Use threads:
import threading, time
def wrapper(func, args, res):
res.append(func(*args))
res = []
t = threading.Thread(target=wrapper, args=(findcluster, (companyid,), res))
t.start()
while t.is_alive():
# print next iteration of ASCII spinner
t.join(0.2)
print res[0]

You can use multiprocessing. Or, if findCluster(companyid) has sensible stopping points, you can turn it into a generator along with draw_ascii_spinner, to do something like this:
for tick in findCluster(companyid):
ascii_spinner.next()

Generally, you will use Threads. Here is a simplistic approach which assumes, that there are only two threads: 1) the main thread executing a task, 2) the spinner thread:
#!/usr/bin/env python
import time
import thread
def spinner():
while True:
print '.'
time.sleep(1)
def task():
time.sleep(5)
if __name__ == '__main__':
thread.start_new_thread(spinner, ())
# as soon as task finishes (and so the program)
# spinner will be gone as well
task()

This can be done with threads. FindCluster runs in a separate thread and when done, it can simply signal another thread that is polling for a reply.

You'll want to do some research on threading, the general form is going to be this
Create a new thread for findCluster and create some way for the program to know the method is running - simplest in Python is just a global boolean
Run draw_ascii_spinner in a while loop conditioned on whether it is still running, you'll probably want to have this thread sleep for a short period of time between iterations
Here's a short tutorial in Python - http://linuxgazette.net/107/pai.html

Run findCluster() in a thread (the Threading module makes this very easy), and then draw_ascii_spinner until some condition is met.
Instead of using sleep() to set the pace of the spinner, you can wait on the thread's wait() with a timeout.

It is possible to have a working example? I am new in Python. I have 6 tasks to run in one python program. These 6 tasks should work in coordinations, meaning that one should start when another finishes. I saw the answers , but I couldn't adopted the codes you shared to my program.
I used "time.sleep" but I know that it is not good because I cannot know how much time it takes each time.
# Sending commands
for i in range(0,len(cmdList)): # port Sending commands
cmd = cmdList[i]
cmdFull = convert(cmd)
port.write(cmd.encode('ascii'))
# s = port.read(10)
print(cmd)
# Terminate the command + close serial port
port.write(cmdFull.encode('ascii'))
print('Termination')
port.close()
# time.sleep(1*60)

How to execute a function asynchronously every 60 seconds in Python?

I want to execute a function every 60 seconds on Python but I don't want to be blocked meanwhile.
How can I do it asynchronously?
import threading
import time
def f():
print("hello world")
threading.Timer(3, f).start()
if __name__ == '__main__':
f()
time.sleep(20)
With this code, the function f is executed every 3 seconds within the 20 seconds time.time.
At the end it gives an error and I think that it is because the threading.timer has not been canceled.
How can I cancel it?

You could try the threading.Timer class: http://docs.python.org/library/threading.html#timer-objects.
import threading
def f(f_stop):
# do something here ...
if not f_stop.is_set():
# call f() again in 60 seconds
threading.Timer(60, f, [f_stop]).start()
f_stop = threading.Event()
# start calling f now and every 60 sec thereafter
f(f_stop)
# stop the thread when needed
#f_stop.set()

The simplest way is to create a background thread that runs something every 60 seconds. A trivial implementation is:
import time
from threading import Thread
class BackgroundTimer(Thread):
def run(self):
while 1:
time.sleep(60)
# do something
# ... SNIP ...
# Inside your main thread
# ... SNIP ...
timer = BackgroundTimer()
timer.start()
Obviously, if the "do something" takes a long time, then you'll need to accommodate for it in your sleep statement. But, 60 seconds serves as a good approximation.

I googled around and found the Python circuits Framework, which makes it possible to wait
for a particular event.
The .callEvent(self, event, *channels) method of circuits contains a fire and suspend-until-response functionality, the documentation says:
Fire the given event to the specified channels and suspend execution
until it has been dispatched. This method may only be invoked as
argument to a yield on the top execution level of a handler (e.g.
"yield self.callEvent(event)"). It effectively creates and returns
a generator that will be invoked by the main loop until the event has
been dispatched (see :func:circuits.core.handlers.handler).
I hope you find it as useful as I do :)
./regards

It depends on what you actually want to do in the mean time. Threads are the most general and least preferred way of doing it; you should be aware of the issues with threading when you use it: not all (non-Python) code allows access from multiple threads simultaneously, communication between threads should be done using thread-safe datastructures like Queue.Queue, you won't be able to interrupt the thread from outside it, and terminating the program while the thread is still running can lead to a hung interpreter or spurious tracebacks.
Often there's an easier way. If you're doing this in a GUI program, use the GUI library's timer or event functionality. All GUIs have this. Likewise, if you're using another event system, like Twisted or another server-process model, you should be able to hook into the main event loop to cause it to call your function regularly. The non-threading approaches do cause your program to be blocked while the function is pending, but not between functioncalls.

Why dont you create a dedicated thread, in which you put a simple sleeping loop:
#!/usr/bin/env python
import time
while True:
# Your code here
time.sleep(60)

I think the right way to run a thread repeatedly is the next:
import threading
import time
def f():
print("hello world") # your code here
myThread.run()
if __name__ == '__main__':
myThread = threading.Timer(3, f) # timer is set to 3 seconds
myThread.start()
time.sleep(10) # it can be loop or other time consuming code here
if myThread.is_alive():
myThread.cancel()
With this code, the function f is executed every 3 seconds within the 10 seconds time.sleep(10). At the end running of thread is canceled.

If you want to invoke the method "on the clock" (e.g. every hour on the hour), you can integrate the following idea with whichever threading mechanism you choose:
import time
def wait(n):
'''Wait until the next increment of n seconds'''
x = time.time()
time.sleep(n-(x%n))
print(time.asctime())

[snip. removed non async version]
To use asyncing you would use trio. I recommend trio to everyone who asks about async python. It is much easier to work with especially sockets. With sockets I have a nursery with 1 read and 1 write function and the write function writes data from an deque where it is placed by the read function; and waiting to be sent. The following app works by using trio.run(function,parameters) and then opening an nursery where the program functions in loops with an await trio.sleep(60) between each loop to give the rest of the app a chance to run. This will run the program in a single processes but your machine can handle 1500 TCP connections insead of just 255 with the non async method.
I have not yet mastered the cancellation statements but I put at move_on_after(70) which is means the code will wait 10 seconds longer than to execute a 60 second sleep before moving on to the next loop.
import trio
async def execTimer():
'''This function gets executed in a nursery simultaneously with the rest of the program'''
while True:
trio.move_on_after(70):
await trio.sleep(60)
print('60 Second Loop')
async def OneTime_OneMinute():
'''This functions gets run by trio.run to start the entire program'''
with trio.open_nursery() as nursery:
nursery.start_soon(execTimer)
nursery.start_soon(print,'do the rest of the program simultaneously')
def start():
'''You many have only one trio.run in the entire application'''
trio.run(OneTime_OneMinute)
if __name__ == '__main__':
start()
This will run any number of functions simultaneously in the nursery. You can use any of the cancellable statements for checkpoints where the rest of the program gets to continue running. All trio statements are checkpoints so use them a lot. I did not test this app; so if there are any questions just ask.
As you can see trio is the champion of easy-to-use functionality. It is based on using functions instead of objects but you can use objects if you wish.
Read more at:
[1]: https://trio.readthedocs.io/en/stable/reference-core.html

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.