Is it an issue to schedule too many tiny threads? - python

Is there any issue with scheduling many tiny threads that never overlap?
For instance, what "unexpected" behavior should be expected from something like this in python:
from threading import Timer

def print_num_thread(wait_for=0.1, num_thread=0):
    t = Timer(wait_for, print_num_thread, [wait_for, num_thread + 1])
    # do something really quick (takes << wait_for), like:
    print(f'thread number {num_thread}')
    t.start()

print_num_thread()
What are the advantages of this compared to something like:
from time import sleep

def print_num_while(wait_for=0.1, num_while=0):
    while True:
        # do something really quick like:
        print(f'loop_number {num_while}')
        num_while += 1
        sleep(wait_for)

print_num_while()
And how about stopping the threads in the first scenario without setting t.daemon = True if using some resources?

And how about stopping the threads in the first scenario without setting t.daemon = True if using some resources?
If you set t.daemon to True the thread becomes a daemon thread, which means it is stopped abruptly when the main program exits; it is non-daemon threads that keep the process alive. Python threads have no kill() method, but a pending Timer can be stopped with t.cancel() before it fires.
To compare: start a daemon thread in your Python REPL, start a non-daemon one, exit the REPL, and check which one keeps the process running.
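For illustration, a minimal sketch of stopping a pending Timer with cancel() (the tick callback here is invented):
from threading import Timer

def tick():
    print('tick')

t = Timer(5.0, tick)
t.start()
# Before the 5 seconds elapse, cancel() stops the Timer
# so tick() is never called.
t.cancel()
t.join()  # the Timer thread exits promptly after cancel()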
What are the advantages of this compared to something like:
...
The first example will not block the calling thread, while the second one will.
To compare, try:
print_num_thread()
print("I GOT HERE")
vs:
print_num_while()
print("I GOT HERE")

Related

Python: Timer, how to stop thread when program ends?

I have a function I'm calling every 5 seconds, like so:
def check_buzz(super_buzz_words):
    print 'Checking buzz'
    t = Timer(5.0, check_buzz, args=(super_buzz_words,))
    t.dameon = True
    t.start()
    buzz_word = get_buzz_word()
    if buzz_word is not 'fail':
        super_buzz_words.put(buzz_word)

main()
check_buzz()
I'm exiting the script by either catching a KeyboardInterrupt or by catching a SystemExit and calling this:
sys.exit('\nShutting Down\n')
I'm also restarting the program every so often by calling:
execv(sys.executable, [sys.executable] + sys.argv)
My question is, how do I get that timer thread to shut off? If I keyboard interrupt, the timer keeps going.
I think you just spelled daemon wrong, it should have been:
t.daemon = True
Then sys.exit() should work
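For reference, a sketch of the relevant lines with the spelling fixed (the rest of check_buzz stays as in the question):
from threading import Timer

def check_buzz(super_buzz_words):
    t = Timer(5.0, check_buzz, args=(super_buzz_words,))
    t.daemon = True  # spelled correctly, so the timer dies when the program exits
    t.start()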
Expanding on the answer from notorious.no, and the comment asking:
How can I call t.cancel() if I have no access to t outside the function?
Give the Timer thread a distinct name when you first create it:
import threading
from threading import Timer

def check_buzz(super_buzz_words):
    print 'Checking buzz'
    t = Timer(5.0, check_buzz, args=(super_buzz_words,))
    t.daemon = True
    t.name = "check_buzz_daemon"
    t.start()
Although the local variable t soon goes out of scope, the Timer thread that t pointed to still exists and still retains the name assigned to it.
Your atexit-registered method can then identify this thread by its name and cancel it:
from atexit import register

def all_done():
    for thr in threading.enumerate():
        if thr.name == "check_buzz_daemon":
            if thr.is_alive():
                thr.cancel()
                thr.join()

register(all_done)
Calling join() after calling cancel() is based on a StackOverflow answer by Cédric Julien.
HOWEVER, your thread is set to be a daemon. According to this StackOverflow post, daemon threads do not need to be explicitly terminated.
from atexit import register

def all_done():
    if t.is_alive():
        # do something that will close your thread gracefully
        pass

register(all_done)
Basically, when your code is about to exit, it fires one last function, and this is where you check whether your thread is still running. If it is, do something that will either cancel the transaction or otherwise exit gracefully. In general, it's best to let threads finish by themselves, but if the thread is not doing anything important (please note the emphasis) then you can just call t.cancel(). Design your code so that threads will finish on their own if possible.
Another way would be to use the queue module to send and receive info from a thread, calling .put() outside the thread and .get() inside the thread.
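A sketch of that approach (Python 3 syntax; the 'stop' sentinel is an arbitrary choice):
import queue
import threading

q = queue.Queue()

def worker():
    while True:
        msg = q.get()        # .get() inside the thread blocks until a message arrives
        if msg == 'stop':    # 'stop' is an arbitrary sentinel chosen here
            break
        print('got', msg)

t = threading.Thread(target=worker)
t.start()
q.put('hello')  # .put() outside the thread
q.put('stop')   # tell the worker to finish
t.join()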
What you can also do is create a text file and make the program write to it when you exit, and put an if statement in the thread function to check the file after each iteration (this is not a really good solution, but it also works).
I would have put a code example, but I am writing from mobile, sorry.
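A rough sketch of that idea (Python 3 syntax; the file name stop.txt is arbitrary):
import os
import threading
import time

STOP_FILE = 'stop.txt'  # arbitrary flag-file name

def worker():
    while not os.path.exists(STOP_FILE):  # check the flag file each iteration
        print('working...')
        time.sleep(1)

t = threading.Thread(target=worker)
t.start()

# elsewhere, on exit, create the file to stop the thread:
open(STOP_FILE, 'w').close()
t.join()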

Return whichever expression returns first

I have two different functions, f and g, that compute the same result with different algorithms. Sometimes one or the other takes a long time while the other terminates quickly. I want to create a new function that runs each simultaneously and then returns the result from the first that finishes.
I want to create that function with a higher order function
h = firstresult(f, g)
What is the best way to accomplish this in Python?
I suspect that the solution involves threading. I'd like to avoid discussion of the GIL.
I would simply use a Queue for this. Start the threads and the first one which has a result ready writes to the queue.
Code
from threading import Thread
from time import sleep
from Queue import Queue

def firstresult(*functions):
    queue = Queue()
    threads = []
    for f in functions:
        # bind f as a default argument so each thread calls its own
        # function rather than the loop's final value of f
        def thread_main(f=f):
            queue.put(f())
        thread = Thread(target=thread_main)
        threads.append(thread)
        thread.start()
    result = queue.get()
    return result

def slow():
    sleep(1)
    return 42

def fast():
    return 0

if __name__ == '__main__':
    print firstresult(slow, fast)
Live demo
http://ideone.com/jzzZX2
Notes
Stopping the threads is an entirely different topic. For this you need to add some state variable to the threads which needs to be checked at regular intervals. As I want to keep this example short, I simply left that part out and assumed that all workers get the time to finish their work even though the result is never read.
Skipping the discussion about the GIL as requested by the questioner. ;-)
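To illustrate the stopping note above, a sketch (not part of the original answer) using threading.Event as the state variable the workers check at regular intervals:
import threading
import time

stop_requested = threading.Event()  # shared state variable

def worker():
    while not stop_requested.is_set():  # checked at regular intervals
        time.sleep(0.1)                 # stand-in for a chunk of real work

t = threading.Thread(target=worker)
t.start()
stop_requested.set()  # ask the worker to stop
t.join()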
Now - unlike my suggestion on the other answer, this piece of code does exactly what you are requesting:
from multiprocessing import Process, Queue
import random
import time

def firstresult(func1, func2):
    queue = Queue()
    proc1 = Process(target=func1, args=(queue,))
    proc2 = Process(target=func2, args=(queue,))
    proc1.start(); proc2.start()
    result = queue.get()
    proc1.terminate(); proc2.terminate()
    return result

def algo1(queue):
    time.sleep(random.uniform(0, 1))
    queue.put("algo 1")

def algo2(queue):
    time.sleep(random.uniform(0, 1))
    queue.put("algo 2")

print firstresult(algo1, algo2)
Run each function in a new worker thread; the 2 worker threads send the result back to the main thread in a 1-item queue or something similar. When the main thread receives the result from the winner, it kills (do Python threads support kill yet? lol) both worker threads to avoid wasting time (one function may take hours while the other only takes a second).
Replace the word thread with process if you want.
You will need to run each function in another process (with multiprocessing) or in a different thread.
If both are CPU bound, multithreading won't help much - exactly due to the GIL -
so multiprocessing is the way.
If the return value is a pickleable (serializable) object, I have this decorator I created that simply runs the function in background, in another process:
https://bitbucket.org/jsbueno/lelo/src
It is not exactly what you want - as both are non-blocking and start executing right away. The trick with this decorator is that it blocks (and waits for the function to complete) only when you try to use the return value.
But on the other hand - it is just a decorator that does all the work.

Python: run one function until another function finishes

I have two functions, draw_ascii_spinner and findCluster(companyid).
I would like to:
Run findCluster(companyid) in the background, and while it's processing...
Run draw_ascii_spinner until findCluster(companyid) finishes
How do I begin to solve this (Python 2.7)?
Use threads:
import threading

def wrapper(func, args, res):
    res.append(func(*args))

res = []
t = threading.Thread(target=wrapper, args=(findCluster, (companyid,), res))
t.start()
while t.is_alive():
    # print next iteration of ASCII spinner
    t.join(0.2)
print res[0]
You can use multiprocessing. Or, if findCluster(companyid) has sensible stopping points, you can turn it into a generator along with draw_ascii_spinner, to do something like this:
for tick in findCluster(companyid):
    ascii_spinner.next()
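For instance, findCluster could be restructured to yield between chunks of work; a sketch in Python 3 syntax, with the chunking invented for illustration:
import itertools

def findCluster(companyid):
    for step in range(100):  # stand-in for the algorithm's work chunks
        pass                 # ... do one chunk of real work here ...
        yield step           # hand control back to the caller

ascii_spinner = itertools.cycle('|/-\\')
for tick in findCluster('some-company'):
    print(next(ascii_spinner), end='\r')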
Generally, you will use Threads. Here is a simplistic approach which assumes that there are only two threads: 1) the main thread executing a task, 2) the spinner thread:
#!/usr/bin/env python
import time
import thread

def spinner():
    while True:
        print '.'
        time.sleep(1)

def task():
    time.sleep(5)

if __name__ == '__main__':
    thread.start_new_thread(spinner, ())
    # as soon as task finishes (and so the program)
    # spinner will be gone as well
    task()
This can be done with threads. findCluster runs in a separate thread and, when done, it can simply signal another thread that is polling for a reply.
You'll want to do some research on threading; the general form is going to be this (see the sketch after the tutorial link below):
Create a new thread for findCluster and create some way for the program to know the method is running - simplest in Python is just a global boolean.
Run draw_ascii_spinner in a while loop conditioned on whether it is still running; you'll probably want to have this thread sleep for a short period of time between iterations.
Here's a short tutorial in Python - http://linuxgazette.net/107/pai.html
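A minimal sketch of that global-boolean approach (Python 3 syntax; findCluster and the spinner output are stand-ins):
import threading
import time

finished = False  # global flag the spinner polls

def findCluster(companyid):
    global finished
    time.sleep(3)  # stand-in for the real computation
    finished = True

threading.Thread(target=findCluster, args=('some-company',)).start()
while not finished:
    print('.', end='', flush=True)  # draw_ascii_spinner stand-in
    time.sleep(0.2)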
Run findCluster() in a thread (the Threading module makes this very easy), and then draw_ascii_spinner until some condition is met.
Instead of using sleep() to set the pace of the spinner, you can wait on the thread's join() with a timeout.
Is it possible to have a working example? I am new to Python. I have 6 tasks to run in one Python program. These 6 tasks should work in coordination, meaning that one should start when another finishes. I saw the answers, but I couldn't adapt the code you shared to my program.
I used time.sleep, but I know that it is not good because I cannot know how much time each task takes.
# Sending commands
for i in range(0, len(cmdList)):  # port Sending commands
    cmd = cmdList[i]
    cmdFull = convert(cmd)
    port.write(cmd.encode('ascii'))
    # s = port.read(10)
    print(cmd)

# Terminate the command + close serial port
port.write(cmdFull.encode('ascii'))
print('Termination')
port.close()
# time.sleep(1*60)
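A hedged sketch of that kind of coordination (task names are placeholders): if each task is a plain function, running them in order inside one worker thread makes each start exactly when the previous one finishes, with no time.sleep guessing.
import threading

def task1(): print('task 1 done')
def task2(): print('task 2 done')
def task3(): print('task 3 done')

def run_in_order(tasks):
    for task in tasks:
        task()  # each task starts only after the previous one returns

# run the whole sequence in the background so the main thread stays free
worker = threading.Thread(target=run_in_order, args=([task1, task2, task3],))
worker.start()
worker.join()  # wait for the entire sequence instead of sleeping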

How to interrupt python multithreaded app?

I'm trying to run the following code (it is simplified a bit):
def RunTests(self):
    from threading import Thread
    import signal

    global keep_testing
    keep_testing = True
    signal.signal(signal.SIGINT, stop_running)
    for i in range(0, NumThreads):
        thread = Thread(target=foo)
        self._threads.append(thread)
        thread.start()
    # wait for all threads to finish
    for t in self._threads:
        t.join()

def stop_running(signl, frme):
    global keep_testing
    keep_testing = False
    print "Interrupted by the Master. Goodbye!"
    return 0

def foo(self):
    global keep_testing
    while keep_testing:
        DO_SOME_WORK()
I expect that when the user presses Ctrl+C, the program will print the goodbye message and exit. However, it doesn't work. Where is the problem?
Thanks
Unlike regular processes, Python doesn't appear to handle signals in a truly asynchronous manner. The 'join()' call is somehow blocking the main thread in a manner that prevents it from responding to the signal. I'm a bit surprised by this since I don't see anything in the documentation indicating that this can/should happen. The solution, however, is simple. In your main thread, add the following loop prior to calling 'join()' on the threads:
while keep_testing:
    signal.pause()
From the threading docs:
A thread can be flagged as a “daemon thread”. The significance of this flag is that the entire Python program exits when only daemon threads are left. The initial value is inherited from the creating thread. The flag can be set through the daemon property.
You could try setting thread.daemon = True before calling start() and see if that solves your problem.
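A small sketch of that suggestion (Python 3 syntax; the worker body stands in for DO_SOME_WORK): daemon threads die with the process, so Ctrl+C only has to interrupt the main thread.
import threading
import time

def foo():
    while True:
        time.sleep(0.1)  # stand-in for DO_SOME_WORK()

for _ in range(4):
    t = threading.Thread(target=foo)
    t.daemon = True  # set before start(); daemon threads die with the process
    t.start()

try:
    while True:
        time.sleep(1)  # keep the main thread interruptible instead of join()
except KeyboardInterrupt:
    print("Interrupted by the Master. Goodbye!")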

Python: Pass or Sleep for long running processes?

I am writing a queue processing application which uses threads for waiting on and responding to queue messages delivered to the app. For the main part of the application, it just needs to stay active. For a code example like:
while True:
    pass
or
while True:
    time.sleep(1)
Which one will have the least impact on a system? What is the preferred way to do nothing, but keep a python app running?
I would imagine time.sleep() will have less overhead on the system. Using pass will cause the loop to immediately re-evaluate and peg the CPU, whereas using time.sleep will allow the execution to be temporarily suspended.
EDIT: just to prove the point, if you launch the python interpreter and run this:
>>> while True:
...     pass
...
You can watch Python start eating up 90-100% CPU instantly, versus:
>>> import time
>>> while True:
...     time.sleep(1)
...
Which barely even registers on the Activity Monitor (using OS X here but it should be the same for every platform).
Why sleep? You don't want to sleep, you want to wait for the threads to finish.
So
# store the threads you start in a your_threads list, then
for a_thread in your_threads:
    a_thread.join()
See: Thread.join
If you are looking for a short, zero-CPU way to loop forever until a KeyboardInterrupt, you can use:
from threading import Event
Event().wait()
Note: Due to a bug, this only works on Python 3.2+. In addition, it appears to not work on Windows. For this reason, while True: sleep(1) might be the better option.
For some background, Event objects are normally used for waiting for long running background tasks to complete:
from threading import Event, Thread
from time import sleep

def do_task():
    sleep(10)
    print('Task complete.')
    event.set()

event = Event()
Thread(target=do_task).start()
event.wait()
print('Continuing...')
Which prints:
Task complete.
Continuing...
signal.pause() is another solution, see https://docs.python.org/3/library/signal.html#signal.pause
Cause the process to sleep until a signal is received; the appropriate handler will then be called. Returns nothing. Not on Windows. (See the Unix man page signal(2).)
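A minimal illustration (Unix only; the alarm is just a convenient way to send a signal to the process itself):
import signal

def on_alarm(signum, frame):
    print('alarm received')

signal.signal(signal.SIGALRM, on_alarm)  # install a handler
signal.alarm(2)                          # deliver SIGALRM in 2 seconds
signal.pause()                           # sleep until a signal arrives
print('woke up')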
I've always seen/heard that using sleep is the better way to do it. Using sleep will keep your Python interpreter's CPU usage from going wild.
You don't give much context to what you are really doing, but maybe Queue could be used instead of an explicit busy-wait loop? If not, I would assume sleep would be preferable, as I believe it will consume less CPU (as others have already noted).
[Edited according to additional information in comment below.]
Maybe this is obvious, but anyway: what you could do in a case where you are reading information from blocking sockets is to have one thread read from the socket and post suitably formatted messages into a Queue, and then have the rest of your "worker" threads read from that queue; the workers will then block on reading from the queue without the need for either pass or sleep.
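A sketch of that arrangement (Python 3 syntax; the socket setup and message handling are invented for illustration):
import queue
import socket
import threading

msg_queue = queue.Queue()

def socket_reader(sock):
    # one thread reads from the socket and posts messages to the queue
    while True:
        data = sock.recv(4096)
        if not data:
            break
        msg_queue.put(data)

def worker():
    # workers block on the queue; no pass or sleep needed
    while True:
        msg = msg_queue.get()
        print('processing', msg)

# illustrative wiring; the address is a placeholder:
# sock = socket.create_connection(('example.com', 1234))
# threading.Thread(target=socket_reader, args=(sock,), daemon=True).start()
# threading.Thread(target=worker, daemon=True).start()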
Running a method as a background thread with sleep in Python:
import threading
import time

class ThreadingExample(object):
    """ Threading example class

    The run() method will be started and it will run in the background
    until the application exits.
    """

    def __init__(self, interval=1):
        """ Constructor

        :type interval: int
        :param interval: Check interval, in seconds
        """
        self.interval = interval

        thread = threading.Thread(target=self.run, args=())
        thread.daemon = True  # Daemonize thread
        thread.start()        # Start the execution

    def run(self):
        """ Method that runs forever """
        while True:
            # Do something
            print('Doing something important in the background')
            time.sleep(self.interval)

example = ThreadingExample()
time.sleep(3)
print('Checkpoint')
time.sleep(2)
print('Bye')
