I want to run a function continuously in parallel with my main process. How do I do it in Python? With multiprocessing, threading, or the thread module?
I am new to Python. Any help is much appreciated.
If the aim is to capture stderr and do some action, you can simply replace sys.stderr with a custom object:
>>> import sys
>>> class MyLogger(object):
...     def __init__(self, callback):
...         self._callback = callback
...     def write(self, text):
...         if 'log' in text:
...             self._callback(text)
...         sys.__stderr__.write(text)  # continue writing to normal stderr
...
>>> def the_callback(s):
...     print('Stderr: %r' % s)
...
>>> sys.stderr = MyLogger(the_callback)
>>> sys.stderr.write('Some log message\n')
Stderr: 'Some log message\n'
Some log message
>>>
>>> sys.stderr.write('Another message\n')
Another message
If you want to handle tracebacks and exceptions you can use sys.excepthook.
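As a rough illustration of the sys.excepthook idea, here is a minimal sketch assuming Python 3. The names my_excepthook and captured are made up for the example; a real program would probably log the text or send it somewhere instead of keeping it in a list.

```python
import sys
import traceback

captured = []  # hypothetical store for the formatted tracebacks


def my_excepthook(exc_type, exc_value, exc_tb):
    # Format the traceback the same way the interpreter would print it.
    text = "".join(traceback.format_exception(exc_type, exc_value, exc_tb))
    captured.append(text)
    # Hand over to the default hook so the error still reaches stderr.
    sys.__excepthook__(exc_type, exc_value, exc_tb)


# From now on, uncaught exceptions in the main thread go through the hook.
sys.excepthook = my_excepthook
```

Note that the hook only sees exceptions nobody caught; anything handled by a try/except never reaches it.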
If you want to capture logs created by the logging module you can implement your own Handler class similar to the above Logger but reimplementing the emit method.
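A custom Handler along those lines might look like the sketch below (Python 3 assumed). CallbackHandler, seen and the logger name "callback_demo" are illustrative names, not part of the logging module.

```python
import logging


class CallbackHandler(logging.Handler):
    """Hypothetical handler that passes each formatted record to a callback."""

    def __init__(self, callback, level=logging.NOTSET):
        super().__init__(level)
        self._callback = callback

    def emit(self, record):
        try:
            self._callback(self.format(record))
        except Exception:
            # Standard way for a handler to report its own failures.
            self.handleError(record)


seen = []
logger = logging.getLogger("callback_demo")  # demo logger name, an assumption
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output away from the root logger
logger.addHandler(CallbackHandler(seen.append))

logger.info("Some log message")
logger.debug("below the INFO threshold, so never emitted")
```

Only the INFO message reaches the callback, because the logger's level filters out the DEBUG call before any handler runs.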
A more interesting but less practical solution would be to use some kind of scheduler and generators to simulate parallel execution without actually creating threads (searching the internet will turn up some nice results about this).
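To give an idea of what the generator approach looks like, here is a toy round-robin scheduler (Python 3 assumed). countdown and run_round_robin are hypothetical names for this sketch; each task is a generator that gives up control at every yield.

```python
from collections import deque


def countdown(name, n, log):
    # A cooperative task: do one step of work, then yield to the scheduler.
    while n > 0:
        log.append((name, n))
        n -= 1
        yield


def run_round_robin(tasks):
    # Resume each task in turn until all of them are exhausted.
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, do not reschedule it
        queue.append(task)


log = []
run_round_robin([countdown("a", 2, log), countdown("b", 3, log)])
print(log)  # [('a', 2), ('b', 3), ('a', 1), ('b', 2), ('b', 1)]
```

Nothing runs truly in parallel here; the tasks only interleave at the points where they yield, so a task that never yields blocks all the others.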
It definitely depends on your aim, but I'd suggest looking at the threading module. There are many good StackOverflow questions on the use of threading and multithreading (e.g., Multiprocessing vs Threading Python).
Here's a brief skeleton from one of my projects:
import threading  # Threading module itself
import Queue  # A handy way to pass tasks to your thread (Python 2; named queue in Python 3)
from time import sleep

job_queue = Queue.Queue()
job_queue.put('one job to do')

# This is the function that we want to keep running while our program does its thing
def function_to_run_in_background():
    # Do something...here is one form of flow control
    while True:
        job_to_do = job_queue.get()  # Get the task from the Queue
        print job_to_do  # Print what it was we fetched
        job_queue.task_done()  # Signal that we've finished with that queue item

# Launch the thread... (add args=(...) if your function takes parameters)
t = threading.Thread(target=function_to_run_in_background)
t.daemon = True  # YOU MAY NOT WANT THIS: Only use this line if you want the program to exit without waiting for the thread to finish
t.setName('threadName')  # Makes it easier to interact with the thread later
t.start()  # Starts the thread

# Do other stuff
sleep(5)
print "I am still here..."
job_queue.put('Here is another job for the thread...')

# Wait for everything in job_queue to finish. Since the thread is a daemon, the program will now exit, killing the thread.
job_queue.join()
If you just want to run a function in the background in the same process, do:
import thread  # Python 2 only; renamed _thread in Python 3
def function(a):
    pass
thread.start_new(function, (1,))  # a is 1 then
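The thread module used above exists only in Python 2. On Python 3 the same idea is usually written with threading.Thread; a minimal sketch follows, where results and worker are names chosen for the example.

```python
import threading

results = []


def function(a):
    # Stand-in for the real background work.
    results.append(a * 2)


worker = threading.Thread(target=function, args=(1,))  # a is 1 then
worker.start()
worker.join()  # only needed if the main code depends on the result
```

Unlike thread.start_new, this gives you a Thread object back, so you can join it or ask whether it is still alive.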
I found that a client-server architecture was the solution for me: run a server and spawn many clients that talk to the server and directly to each other, something like a messenger.
Communication can go over the network or through a text file kept in memory (to speed things up and spare the hard drive).
Bakuriu gave you a good tip about the logging module.
I've got a Python script that sometimes displays images to the user. The images can, at times, be quite large, and they are reused often. Displaying them is not critical, but displaying the message associated with them is. I've got a function that downloads the image needed and saves it locally. Right now it's run inline with the code that displays a message to the user, but that can sometimes take over 10 seconds for non-local images. Is there a way I could call this function when it's needed, but run it in the background while the code continues to execute? I would just use a default image until the correct one becomes available.
Do something like this:
def function_that_downloads(my_args):
    # do some long download here
    pass
then inline, do something like this:
import threading
def my_inline_function(some_args):
    # do some stuff
    # note: args must be a tuple (or list) of positional arguments
    download_thread = threading.Thread(target=function_that_downloads, name="Downloader", args=some_args)
    download_thread.start()
    # continue doing stuff
You may want to check if the thread has finished before going on to other things by calling download_thread.is_alive() (the older isAlive() spelling was removed in Python 3.9).
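Here is one way the "default image until the real one is ready" part could look, as a sketch assuming Python 3. BackgroundDownload, DEFAULT_IMAGE and fake_download are hypothetical names; fake_download stands in for the real download function and just waits on an event so the example is predictable.

```python
import threading

DEFAULT_IMAGE = "default.png"  # hypothetical placeholder file name


class BackgroundDownload:
    """Runs download_func(*args) in a thread and remembers its return value."""

    def __init__(self, download_func, *args):
        self.result = None
        self._thread = threading.Thread(
            target=self._run, args=(download_func,) + args, daemon=True)
        self._thread.start()

    def _run(self, download_func, *args):
        self.result = download_func(*args)

    def image_or_default(self):
        # Fall back to the default while the download is still running.
        if self._thread.is_alive() or self.result is None:
            return DEFAULT_IMAGE
        return self.result

    def wait(self, timeout=None):
        self._thread.join(timeout)


release = threading.Event()


def fake_download(name):
    release.wait()  # stands in for a slow network transfer
    return name     # pretend this is the local path of the saved image


job = BackgroundDownload(fake_download, "big.png")
before = job.image_or_default()  # download still pending
release.set()
job.wait()
after = job.image_or_default()   # download finished
```

The message can be shown right after the job is created; the image code just calls image_or_default() whenever it needs something to display.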
Typically the way to do this would be to use a thread pool and queue downloads which would issue a signal, a.k.a an event, when that task has finished processing. You can do this within the scope of the threading module Python provides.
To perform said actions, I would use event objects and the Queue module.
However, a quick and dirty demonstration of what you can do using a simple threading.Thread implementation can be seen below:
import os
import threading
import time
import urllib2

class ImageDownloader(threading.Thread):
    def __init__(self, function_that_downloads):
        threading.Thread.__init__(self)
        self.runnable = function_that_downloads
        self.daemon = True

    def run(self):
        self.runnable()

def downloads():
    with open('somefile.html', 'w+') as f:
        try:
            f.write(urllib2.urlopen('http://google.com').read())
        except urllib2.HTTPError:
            f.write('sorry no dice')

print 'hi there user'
print 'how are you today?'
thread = ImageDownloader(downloads)
thread.start()
while not os.path.exists('somefile.html'):
    print 'i am executing but the thread has started to download'
    time.sleep(1)
print 'look ma, thread is not alive: ', thread.is_alive()
It would probably make sense to not poll like I'm doing above. In which case, I would change the code to this:
import os
import threading
import time
import urllib2

class ImageDownloader(threading.Thread):
    def __init__(self, function_that_downloads):
        threading.Thread.__init__(self)
        self.runnable = function_that_downloads

    def run(self):
        self.runnable()

def downloads():
    with open('somefile.html', 'w+') as f:
        try:
            f.write(urllib2.urlopen('http://google.com').read())
        except urllib2.HTTPError:
            f.write('sorry no dice')

print 'hi there user'
print 'how are you today?'
thread = ImageDownloader(downloads)
thread.start()
# show message
thread.join()
# display image
Notice that there's no daemon flag set here.
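For the thread pool approach mentioned at the start of this answer, Python 3 ships concurrent.futures, which handles the queueing and the "finished" notification for you. A small sketch follows; download, on_done and the example.com URLs are placeholders, not real downloads.

```python
from concurrent.futures import ThreadPoolExecutor


def download(url):
    # Placeholder for the real transfer; returns a made-up local path.
    return "/tmp/" + url.rsplit("/", 1)[-1]


finished = []


def on_done(future):
    # Called once the corresponding download has completed.
    finished.append(future.result())


urls = ("http://example.com/a.png", "http://example.com/b.png")
with ThreadPoolExecutor(max_workers=4) as pool:
    for url in urls:
        pool.submit(download, url).add_done_callback(on_done)
    # show the message here while the pool works in the background
# leaving the with block waits for all submitted downloads
```

The callback runs in whichever thread completes the future, so keep it short or have it just put the result on a queue.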
I prefer to use gevent for this sort of thing:
import gevent
from gevent import monkey; monkey.patch_all()
greenlet = gevent.spawn( function_to_download_image )
display_message()
# ... perhaps interaction with the user here
# this will wait for the operation to complete (optional)
greenlet.join()
# alternatively if the image display is no longer important, this will abort it:
#greenlet.kill()
Everything runs in one thread, but whenever a kernel operation blocks, gevent switches contexts when there are other "greenlets" running. Worries about locking, etc are much reduced, as there is only one thing running at a time, yet the image will continue to download whenever a blocking operation executes in the "main" context.
Depending on how much, and what kind of thing, you want to do in the background, this can be either better or worse than threading-based solutions; certainly, it is much more scalable (i.e., you can do many more things in the background), but that might not be of concern in the current situation.
import threading
import os
import keyboard  # third-party package (pip install keyboard)

def killme():
    if keyboard.read_key() == "q":
        print("Bye ..........")
        os._exit(0)

threading.Thread(target=killme, name="killer").start()
If you want to react to more keys, add more functions like killme and start one thread per function. It looks crude, but it works much better than more complex code.
I have a piece of code below that creates a few threads to perform a task, which works perfectly well on its own. However I'm struggling to understand why the print statements I call in my function do not execute until all threads complete and the print 'finished' statement is called. I would expect them to be called as the thread executes. Is there any simple way to accomplish this, and why does this work this way in the first place?
import multiprocessing
import time

def func(param):
    time.sleep(.25)
    print param*2

if __name__ == '__main__':
    print 'starting execution'
    launchTime = time.clock()
    params = range(10)
    pool=multiprocessing.Pool(processes=100) #use N processes to download the data
    _=pool.map(func,params)
    print 'finished'
For Python 3 you can now use the flush parameter, like this:
print('Your text', flush=True)
This happens due to stdout buffering. You can still flush the buffers yourself:
import sys
print 'starting'
sys.stdout.flush()
You can find more info on this issue here and here.
Having run into plenty of issues around this and garbled output (especially under Windows when adding colours to the output), my solution has been to have an exclusive printing thread which consumes a queue.
If this still doesn't work, also add flush=True to your print statement(s), as suggested by @Or Duan.
Further, you may find that the "most correct" but heavy-handed approach to displaying messages with threading is to use the logging library, which can wrap a queue (and write to many places asynchronously, including stdout) or write to a system-level queue (outside Python; availability depends greatly on OS support).
import threading
from queue import Queue

def display_worker(display_queue):
    while True:
        line = display_queue.get()
        if line is None:  # simple termination logic, other sentinels can be used
            break
        print(line, flush=True)  # remove flush if slow or using Python2

def some_other_worker(display_queue, other_args):
    # NOTE accepts queue reference as an argument, though it could be a global
    display_queue.put("something which should be printed from this thread")

def main():
    display_queue = Queue()  # synchronizes console output
    screen_printing_thread = threading.Thread(
        target=display_worker,
        args=(display_queue,),
    )
    screen_printing_thread.start()
    ### other logic ###
    display_queue.put(None)  # end screen_printing_thread
    screen_printing_thread.join()  # Thread has no stop(); wait for the worker to exit

if __name__ == "__main__":
    main()
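For the logging-based variant mentioned above, the standard library has logging.handlers.QueueHandler and QueueListener (Python 3.2+). The sketch below is only an outline of how they fit together; ListHandler, worker and the logger name "queue_demo" are names invented for the example.

```python
import logging
import logging.handlers
import queue
import sys
import threading


class ListHandler(logging.Handler):
    """Hypothetical second destination that keeps formatted lines in memory."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


log_queue = queue.Queue()
memory = ListHandler()
# The listener thread is the only one that touches stdout.
listener = logging.handlers.QueueListener(
    log_queue, logging.StreamHandler(sys.stdout), memory)
listener.start()

logger = logging.getLogger("queue_demo")  # demo logger name, an assumption
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.handlers.QueueHandler(log_queue))


def worker(n):
    logger.info("message from worker %d", n)  # only enqueues, never prints


threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
listener.stop()  # processes what is left in the queue, then joins the listener
```

The worker threads never write to the console themselves, which is the same design as the printing thread above, just with the logging machinery doing the queue handling.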
I am writing a queue processing application which uses threads for waiting on and responding to queue messages delivered to the app. The main part of the application just needs to stay active. Given code like:
while True:
    pass
or
while True:
    time.sleep(1)
Which one will have the least impact on a system? What is the preferred way to do nothing, but keep a Python app running?
I would imagine time.sleep() will have less overhead on the system. Using pass will cause the loop to immediately re-evaluate and peg the CPU, whereas using time.sleep will allow the execution to be temporarily suspended.
EDIT: just to prove the point, if you launch the python interpreter and run this:
>>> while True:
...     pass
...
You can watch Python start eating up 90-100% CPU instantly, versus:
>>> import time
>>> while True:
...     time.sleep(1)
...
Which barely even registers on the Activity Monitor (using OS X here but it should be the same for every platform).
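If you would rather measure this than watch a system monitor, time.process_time() (Python 3.3+) reports the CPU time the process has used. A rough sketch follows; cpu_used, busy_wait and sleepy_wait are names made up for the example, and the exact numbers will vary by machine.

```python
import time


def cpu_used(func):
    # CPU seconds consumed by this process while func runs.
    start = time.process_time()
    func()
    return time.process_time() - start


def busy_wait(duration=0.2):
    # Spin until the wall-clock deadline, like `while True: pass`.
    end = time.monotonic() + duration
    while time.monotonic() < end:
        pass


def sleepy_wait(duration=0.2):
    time.sleep(duration)


busy_cpu = cpu_used(busy_wait)
sleep_cpu = cpu_used(sleepy_wait)
print("busy loop: %.3fs CPU, sleep: %.3fs CPU" % (busy_cpu, sleep_cpu))
```

Both functions take about the same wall-clock time; the busy loop should show CPU time close to that duration, while the sleeping version should show close to none.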
Why sleep? You don't want to sleep, you want to wait for the threads to finish.
So
# store the threads you start in a your_threads list, then
for a_thread in your_threads:
    a_thread.join()
See: thread.join
If you are looking for a short, zero-cpu way to loop forever until a KeyboardInterrupt, you can use:
from threading import Event
Event().wait()
Note: Due to a bug, this only works on Python 3.2+. In addition, it appears to not work on Windows. For this reason, while True: sleep(1) might be the better option.
For some background, Event objects are normally used for waiting for long running background tasks to complete:
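from threading import Event, Thread
from time import sleep

def do_task():
    sleep(10)
    print('Task complete.')
    event.set()

event = Event()
Thread(target=do_task).start()  # target must be passed by keyword
event.wait()
print('Continuing...')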
Which prints:
Task complete.
Continuing...
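A middle ground between Event().wait() and while True: sleep(1) is to wait on an event with a timeout, so the main thread wakes up regularly but still exits as soon as something sets the event. This is a sketch assuming Python 3; stop_event, ticks and background_task are illustrative names, and the claim that the timeout helps with interrupts on platforms where an untimed wait cannot be interrupted is my understanding rather than something I have tested everywhere.

```python
import threading

stop_event = threading.Event()
ticks = []


def background_task():
    ticks.append("working")  # stand-in for the real work
    stop_event.set()         # tell the main thread it may stop waiting


threading.Thread(target=background_task, daemon=True).start()

# wait() returns True once the event is set and False on timeout,
# so the loop ends shortly after background_task signals.
while not stop_event.wait(timeout=0.5):
    pass  # periodic housekeeping could go here
```

Between wake-ups the main thread is blocked, not spinning, so CPU usage should stay near zero just like with sleep.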
signal.pause() is another solution, see https://docs.python.org/3/library/signal.html#signal.pause
Cause the process to sleep until a signal is received; the appropriate handler will then be called. Returns nothing. Not on Windows. (See the Unix man page signal(2).)
I've always seen/heard that using sleep is the better way to do it. Using sleep will keep your Python interpreter's CPU usage from going wild.
You don't give much context to what you are really doing, but maybe Queue could be used instead of an explicit busy-wait loop? If not, I would assume sleep would be preferable, as I believe it will consume less CPU (as others have already noted).
[Edited according to additional information in comment below.]
Maybe this is obvious, but anyway, what you could do in a case where you are reading information from blocking sockets is to have one thread read from the socket and post suitably formatted messages into a Queue, and then have the rest of your "worker" threads read from that queue; the workers will then block on reading from the queue, with no need for either pass or sleep.
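Roughly, that layout could look like the sketch below (Python 3 assumed). reader, worker and the list of fake messages are stand-ins; in the real application the reader would loop over recv() calls on the socket instead of a list.

```python
import queue
import threading


def reader(source, messages, n_workers):
    # `source` stands in for the blocking socket; here it is any iterable.
    for raw in source:
        messages.put(raw.strip().upper())  # "suitably formatted" message
    for _ in range(n_workers):
        messages.put(None)  # one sentinel per worker so each one exits


def worker(messages, results):
    while True:
        item = messages.get()  # blocks without burning CPU
        if item is None:
            break
        results.append(item)  # stand-in for real processing


messages = queue.Queue()
results = []
n_workers = 2
workers = [threading.Thread(target=worker, args=(messages, results))
           for _ in range(n_workers)]
for w in workers:
    w.start()

reader(["ping\n", "pong\n", "stop\n"], messages, n_workers)
for w in workers:
    w.join()
```

The main thread can then simply join the workers, which also answers the "how do I keep the app alive" part without any busy loop.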
Running a method as a background thread with sleep in Python:
import threading
import time

class ThreadingExample(object):
    """ Threading example class
    The run() method will be started and it will run in the background
    until the application exits.
    """

    def __init__(self, interval=1):
        """ Constructor
        :type interval: int
        :param interval: Check interval, in seconds
        """
        self.interval = interval
        thread = threading.Thread(target=self.run, args=())
        thread.daemon = True  # Daemonize thread
        thread.start()  # Start the execution

    def run(self):
        """ Method that runs forever """
        while True:
            # Do something
            print('Doing something important in the background')
            time.sleep(self.interval)

example = ThreadingExample()
time.sleep(3)
print('Checkpoint')
time.sleep(2)
print('Bye')
(I'm using the pyprocessing module in this example, but replacing processing with multiprocessing should probably work if you run python 2.6 or use the multiprocessing backport)
I currently have a program that listens to a unix socket (using a processing.connection.Listener), accepts connections and spawns a thread handling each request. At a certain point I want to quit the process gracefully, but the accept() call is blocking and I see no way of cancelling it in a nice way. I have one way that works here (OS X) at least: setting a signal handler and signalling the process from another thread, like so:
import processing
from processing.connection import Listener
import threading
import time
import os
import signal
import socket
import errno

# This is actually called by the connection handler.
def closeme():
    time.sleep(1)
    print 'Closing socket...'
    listener.close()
    os.kill(processing.currentProcess().getPid(), signal.SIGPIPE)

oldsig = signal.signal(signal.SIGPIPE, lambda s, f: None)
listener = Listener('/tmp/asdf', 'AF_UNIX')
# This is a thread that handles one already accepted connection, left out for brevity
threading.Thread(target=closeme).start()
print 'Accepting...'
try:
    listener.accept()
except socket.error, e:
    if e.args[0] != errno.EINTR:
        raise
# Cleanup here...
print 'Done...'
The only other way I've thought about is reaching deep into the connection (listener._listener._socket) and setting the non-blocking option...but that probably has some side effects and is generally really scary.
Does anyone have a more elegant (and perhaps even correct!) way of accomplishing this? It needs to be portable to OS X, Linux and BSD, but Windows portability etc is not necessary.
Clarification:
Thanks all! As usual, ambiguities in my original question are revealed :)
I need to perform cleanup after I have cancelled the listening, and I don't always want to actually exit that process.
I need to be able to access this process from other processes not spawned from the same parent, which makes Queues unwieldy
The reasons for threads are that:
They access a shared state. Actually more or less a common in-memory database, so I suppose it could be done differently.
I must be able to have several connections accepted at the same time, but the actual threads are blocking for something most of the time. Each accepted connection spawns a new thread; this in order to not block all clients on I/O ops.
Regarding threads vs. processes, I use threads for making my blocking ops non-blocking and processes to enable multiprocessing.
Isn't that what select is for?
Only call accept on the socket if select indicates that it will not block.
select takes a timeout, so you can break out occasionally to check whether it's time to shut down.
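To make the select-with-timeout idea concrete, here is a sketch using a plain socket rather than a processing Listener (Python 3 assumed). The TCP loopback address, serve, on_message and the other names are choices made for the example; the question's Unix socket would work the same way with AF_UNIX.

```python
import select
import socket
import threading


def serve(listener, stop_event, on_message, poll_interval=0.1):
    """Accept connections until stop_event is set, then clean up."""
    while not stop_event.is_set():
        # Wait at most poll_interval seconds for a pending connection.
        readable, _, _ = select.select([listener], [], [], poll_interval)
        if listener in readable:
            conn, _addr = listener.accept()  # select reported a pending connection
            with conn:
                on_message(conn.recv(1024))
    listener.close()  # cleanup happens here; the process itself keeps running


# Demo wiring; every name below is illustrative.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(5)

stop_event = threading.Event()
handled = threading.Event()
received = []


def on_message(data):
    received.append(data)
    handled.set()


server = threading.Thread(target=serve, args=(listener, stop_event, on_message))
server.start()

with socket.create_connection(listener.getsockname()) as client:
    client.sendall(b"hello")

handled.wait(timeout=5)  # give the server a moment to read the message
stop_event.set()         # ask the accept loop to finish
server.join()
```

The shutdown delay is bounded by poll_interval, so a smaller value reacts faster at the cost of waking up more often.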
I thought I could avoid it, but it seems I have to do something like this:
from processing import connection
connection.Listener.fileno = lambda self: self._listener._socket.fileno()
import select
l = connection.Listener('/tmp/x', 'AF_UNIX')
r, w, e = select.select((l, ), (), ())
if l in r:
    print "Accepting..."
    c = l.accept()
    # ...
I am aware that this breaks the law of demeter and introduces some evil monkey-patching, but it seems this would be the most easy-to-port way of accomplishing this. If anyone has a more elegant solution I would be happy to hear it :)
I'm new to the multiprocessing module, but it seems to me that mixing the processing module and the threading module is counter-intuitive. Aren't they targeted at solving the same problem?
Anyway, how about wrapping your listen functions into a process itself? I'm not clear how this affects the rest of your code, but this may be a cleaner alternative.
from multiprocessing import Process
from multiprocessing.connection import Listener
class ListenForConn(Process):
    def run(self):
        listener = Listener('/tmp/asdf', 'AF_UNIX')
        listener.accept()
        # do your other handling here
listen_process = ListenForConn()
listen_process.start()
print listen_process.is_alive()
listen_process.terminate()
listen_process.join()
print listen_process.is_alive()
print 'No more listen process.'
Probably not ideal, but you can release the block by sending the socket some data from the signal handler or the thread that is terminating the process.
EDIT: Another way to implement this might be to use the Connection Queues, since they seem to support timeouts (apologies, I misread your code in my first read).
I ran into the same issue. I solved it by sending a "stop" command to the listener. In the listener's main thread (the one that processes the incoming messages), every time a new message is received, I just check to see if it's a "stop" command and exit out of the main thread.
Here's the code I'm using:
def start(self):
    """
    Start listening
    """
    # set the command being executed
    self.command = self.COMMAND_RUN
    # startup the 'listener_main' method as a daemon thread
    self.listener = Listener(address=self.address, authkey=self.authkey)
    self._thread = threading.Thread(target=self.listener_main, daemon=True)
    self._thread.start()

def listener_main(self):
    """
    The main application loop
    """
    while self.command == self.COMMAND_RUN:
        # block until a client connection is received
        with self.listener.accept() as conn:
            # receive the subscription request from the client
            message = conn.recv()
            # if it's a shut down command, return to stop this thread
            if isinstance(message, str) and message == self.COMMAND_STOP:
                return
            # process the message

def stop(self):
    """
    Stops the listening thread
    """
    self.command = self.COMMAND_STOP
    client = Client(self.address, authkey=self.authkey)
    client.send(self.COMMAND_STOP)
    client.close()
    self._thread.join()
I'm using an authentication key to prevent would-be hackers from shutting down my service by sending a stop command from an arbitrary client.
Mine isn't a perfect solution. It seems a better solution might be to revise the code in multiprocessing.connection.Listener, and add a stop() method. But, that would require sending it through the process for approval by the Python team.