I'm trying to multithread my Python application. This is how I thought the application would work:
A list of IPv4 addresses is created by the user
For each IPv4 address, the application establishes an SSH connection and logs in. This part would benefit from multithreading, since each device takes about 10 seconds to complete. The SSH bit is all handled by my ConfDumper class.
In each thread, a bit of data is fetched from the network device and should be returned to the main thread (where there is a list of devices)
Once all threads are done, a result is presented.
Being new to Python and having no experience with multithreading, I've tried something like this:
import threading
import confDumper


class MyThread(threading.Thread):
    device = None

    # A device object is sent as an argument
    def __init__(self, device):
        threading.Thread.__init__(self)
        self.device = device

    def run(self):
        print "Starting scan..."
        self.sshscan()
        print "Exiting thread"

    def sshscan(self):
        s = confDumper.ConfDumper(self.device.mgmt_ip, self.device.username,
                                  self.device.password, self.device.enable_password)
        t = s.getConf()
        if t:
            # We got the conf, return it to the main thread, somehow...
It seems to work when I debug the code and step through the lines one by one, but once the thread has finished, all results from the thread are lost. How do I return the result to the main thread?
You can use a Queue:
import Queue
import threading
import random
import time


class Worker(threading.Thread):
    def __init__(self, queue):
        super(Worker, self).__init__()
        self._queue = queue

    def run(self):
        time.sleep(5.0 * random.random())
        self._queue.put(str(self))


queue = Queue.Queue()
workers = [Worker(queue) for _ in xrange(10)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

while queue.qsize():
    print queue.get()
This was much easier than I thought. As far as I can see, you don't have to return anything: the object passed to the thread is the same object the main thread still holds, so whatever the thread stores on it is visible from the main thread.
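So instead of pushing results through a Queue, the worker can simply attach what it found to the shared device object, e.g. by setting self.device.conf = t inside sshscan(). A minimal sketch of the main-thread side (the devices list and the conf attribute are assumptions for illustration, not part of the original code):

threads = [MyThread(device) for device in devices]
for t in threads:
    t.start()
for t in threads:
    t.join()              # wait for every scan to finish

for device in devices:
    print device.conf     # set inside the thread, readable here after join()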
Related
I am writing a multiprocess program. There are four classes: Main, Worker, Request and Ack. The Main class is the entry point of the program. It creates a sub-process called Worker to do some jobs. The main process puts a Request into a JoinableQueue, and then the Worker gets the request from the queue. When the Worker has finished the request, it puts an Ack into the queue. Part of the code is shown below:
Main:
class Main():
    def __init__(self):
        self.cmd_queue = JoinableQueue()
        self.worker = Worker(self.cmd_queue)

    def call_worker(self, cmd_code):
        if self.cmd_queue.empty() is True:
            request = Request(cmd_code)
            self.cmd_queue.put(request)
            self.cmd_queue.join()
            ack = self.cmd_queue.get()
            self.cmd_queue.task_done()
            if ack.value == 0:
                return True
            else:
                return False
        else:
            # TODO: Error Handling.
            pass

    def run_worker(self):
        self.worker.start()
Worker:
class Worker(Process):
    def __init__(self, cmd_queue):
        super(Worker, self).__init__()
        self.cmd_queue = cmd_queue
        ...

    def run(self):
        while True:
            ack = Ack(0)
            try:
                request = self.cmd_queue.get()
                if request.cmd_code == ReqCmd.enable_handler:
                    self.enable_handler()
                elif request.cmd_code == ReqCmd.disable_handler:
                    self.disable_handler()
                else:
                    pass
            except Exception:
                ack.value = -1
            finally:
                self.cmd_queue.task_done()
                self.cmd_queue.put(ack)
                self.cmd_queue.join()
It usually works fine, but sometimes the Main process gets stuck at self.cmd_queue.join(), and sometimes the Worker gets stuck at its self.cmd_queue.join(). It is so weird! Does anyone have any ideas? Thanks
There's nothing weird about this issue: you shouldn't call the queue's join() inside a typical single worker process's activity, because
Queue.join()
Blocks until all items in the queue have been gotten and
processed.
Such calls, placed where they are in your current implementation, will make the processing pipeline wait.
Usually queue.join() is called in the main (supervisor) thread after initiating/starting all threads/workers.
https://docs.python.org/3/library/queue.html#queue.Queue.join
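A minimal sketch of that usual layout, using a separate response queue so the worker never calls join() itself (the function and queue names here are illustrative, not taken from the question):

from multiprocessing import Process, JoinableQueue, Queue

def worker(requests, responses):
    while True:
        cmd = requests.get()        # block until the supervisor sends work
        try:
            responses.put(('ok', cmd))
        finally:
            requests.task_done()    # the worker only marks items done; it never joins

if __name__ == '__main__':
    requests = JoinableQueue()
    responses = Queue()
    p = Process(target=worker, args=(requests, responses))
    p.daemon = True
    p.start()

    requests.put('enable_handler')
    requests.join()                 # only the supervisor blocks here
    print(responses.get())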
I am trying to implement a Python (2.6.x/2.7.x) thread pool that would check for network connectivity (ping or whatever); the entire pool of threads must be killed/terminated once the check is successful.
So I am thinking of creating a pool of, let's say, 10 worker threads. If any one of them is successful in pinging, the main thread should terminate all the rest.
How do I implement this?
This is not compilable code; it is just to give you an idea of how to make threads communicate.
Inter-process or inter-thread communication happens through queues, pipes and a few other mechanisms; here I'm using queues.
It works like this: I send IP addresses into in_queue and add each response to out_queue; the main thread monitors out_queue and, once it gets the desired result, marks all the threads to terminate.
Below is the pinger thread definition:
import threading
from Queue import Queue, Empty


# A thread that pings an ip.
class Pinger(threading.Thread):
    def __init__(self, kwargs=None, name=None):
        threading.Thread.__init__(self, name=name)
        self.kwargs = kwargs
        self.stop_pinging = False

    def run(self):
        ip_queue = self.kwargs.get('in_queue')
        out_queue = self.kwargs.get('out_queue')
        while not self.stop_pinging:
            try:
                data = ip_queue.get(timeout=1)
                # This is pseudo code, you have to take care of
                # your own ping.
                ping_status = ping(data)
                if ping_status:
                    out_queue.put('success')
                    # you can even break here if you don't want to
                    # continue after one success
                else:
                    out_queue.put('failure')
                if ip_queue.empty():
                    break
            except Empty:
                pass
Here is the main thread block:
# Create the shared queues and launch the thread pool
in_queue = Queue()
out_queue = Queue()

ip_list = ['ip1', 'ip2', '....']

# This is to add all the ips to the queue or you can
# customize to add through some producer way.
for ip in ip_list:
    in_queue.put(ip)

pinger_pool = []
for i in xrange(1, 10):
    pinger_worker = Pinger(kwargs={'in_queue': in_queue, 'out_queue': out_queue}, name=str(i))
    pinger_pool.append(pinger_worker)
    pinger_worker.start()

while 1:
    if out_queue.get() == 'success':
        for pinger in pinger_pool:
            pinger.stop_pinging = True
        break
Note: this is pseudo code; adapt it into something workable as you like.
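For the ping(data) placeholder, one possible stand-in (my own assumption, not part of the answer) is a subprocess call to the system ping command; the flags below are Linux-style and need adjusting on other platforms:

import os
import subprocess

def ping(host):
    # Send one echo request with a 1-second timeout; return True on a reply.
    with open(os.devnull, 'w') as devnull:
        return subprocess.call(['ping', '-c', '1', '-W', '1', host],
                               stdout=devnull, stderr=devnull) == 0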
I am trying to start multiple processes in a Python program, using multiprocessing.Queue to share data between them.
My code is shown below. TestClass is the process that receives packets from a zmq socket and feeds them into the queue. There is another process (I took it out of the code) that keeps fetching messages from the queue. I also have a script running that publishes messages to this zmq channel.
from multiprocessing import Process, Queue
import zmq
import time


class TestClass(Process):
    def __init__(self, queue):
        super(TestClass, self).__init__()
        # Setting up connections
        self.context = zmq.Context()
        self.socket = self.context.socket(zmq.SUB)
        self.socket.connect("tcp://192.168.0.6:8577")
        self.socket.setsockopt(zmq.SUBSCRIBE, b'')
        self.queue = queue

    def run(self):
        while True:
            msg = self.socket.recv()
            self.queue.put(msg)


queue = Queue()
c = TestClass(queue)
c.run()
# Do something else
If I use c.run() to start the process, it runs fine, but it is not started as a separate Process, because it blocks the following statement.
Then I switched to c.start() to start the process, but it got stuck at the line socket.recv() and never received any incoming messages. Can anybody please explain this and suggest a good solution? Thanks
The issue is that you're creating the zmq socket in the parent process, but then trying to use it in the child. Something in the forking process is breaking the socket, so it's not working when you try using it. You can fix it by simply creating the socket in the child, rather than the parent. This has no negative side effects, since you're not trying to use the socket in the parent to begin with.
from multiprocessing import Process, Queue
import zmq
import time


class TestClass(Process):
    def __init__(self, queue):
        super(TestClass, self).__init__()
        self.queue = queue

    def run(self):
        # Setting up connections
        self.context = zmq.Context()
        self.socket = self.context.socket(zmq.SUB)
        self.socket.connect("tcp://192.168.0.6:8577")
        self.socket.setsockopt(zmq.SUBSCRIBE, b'')
        while True:
            msg = self.socket.recv()
            self.queue.put(msg)


if __name__ == "__main__":
    queue = Queue()
    c = TestClass(queue)
    c.start()  # Don't use run()
    # Do something else
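If the parent wants to consume what the child receives, the "# Do something else" part can simply read from the shared queue; a minimal sketch (this consumer loop is an assumption, not part of the original answer):

    # Still inside the __main__ block, after c.start():
    while True:
        msg = queue.get()   # blocks until the child puts a message
        print(msg)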
I'm looking for a Python class (preferably part of the standard language, rather than a 3rd party library) to manage asynchronous 'broadcast style' messaging.
I will have one thread which puts messages on the queue (the 'putMessageOnQueue' method must not block) and then multiple other threads which will all be waiting for messages, having presumably called some blocking 'waitForMessage' function. When a message is placed on the queue I want each of the waiting threads to get its own copy of the message.
I've looked at the built-in Queue class, but I don't think this is suitable because consuming messages seems to involve removing them from the queue, so only 1 client thread would see each one.
This seems like it should be a common use-case, can anyone recommend a solution?
I think the typical approach to this is to use a separate message queue for each thread, and push the message onto every queue which has previously registered an interest in receiving such messages.
Something like this ought to work, but it's untested code...
from time import sleep
from threading import Thread
from Queue import Queue


class DispatcherThread(Thread):
    def __init__(self, *args, **kwargs):
        super(DispatcherThread, self).__init__(*args, **kwargs)
        self.interested_threads = []

    def run(self):
        while 1:
            if some_condition:
                self.dispatch_message(some_message)
            else:
                sleep(0.1)

    def register_interest(self, thread):
        self.interested_threads.append(thread)

    def dispatch_message(self, message):
        for thread in self.interested_threads:
            thread.put_message(message)


class WorkerThread(Thread):
    def __init__(self, *args, **kwargs):
        super(WorkerThread, self).__init__(*args, **kwargs)
        self.queue = Queue()

    def run(self):
        # Tell the dispatcher thread we want messages
        dispatcher_thread.register_interest(self)
        while 1:
            # Wait for next message
            message = self.queue.get()
            # Process message
            # ...

    def put_message(self, message):
        self.queue.put(message)


dispatcher_thread = DispatcherThread()
dispatcher_thread.start()

worker_threads = []
for i in range(10):
    worker_thread = WorkerThread()
    worker_thread.start()
    worker_threads.append(worker_thread)

dispatcher_thread.join()
I think this is a more straightforward example (taken from the Queue example in the Python library documentation):
from threading import Thread
from Queue import Queue

num_worker_threads = 2

def worker():
    while True:
        item = q.get()
        do_work(item)
        q.task_done()

q = Queue()
for i in range(num_worker_threads):
    t = Thread(target=worker)
    t.daemon = True
    t.start()

for item in source():
    q.put(item)

q.join()  # block until all tasks are done
I was reading an article on Python multithreading using Queues and have a basic question.
Based on the print statements, 5 threads are started as expected. So how does the queue work?
1. The threads are started initially; when the queue is populated with an item, does a thread get restarted and start processing that item?
2. If we use the queue system and the threads process the queue item by item, where is the improvement in performance? Is it not similar to serial processing, i.e. one by one?
import Queue
import threading
import urllib2
import datetime
import time

hosts = ["http://yahoo.com", "http://google.com", "http://amazon.com",
         "http://ibm.com", "http://apple.com"]

queue = Queue.Queue()


class ThreadUrl(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        print 'threads are created'
        self.queue = queue

    def run(self):
        while True:
            # grabs host from queue
            print 'thread starting to run'
            now = datetime.datetime.now()
            host = self.queue.get()

            # grabs urls of hosts and prints first 1024 bytes of page
            url = urllib2.urlopen(host)
            print 'host=%s ,threadname=%s' % (host, self.getName())
            print url.read(20)

            # signals to queue job is done
            self.queue.task_done()


start = time.time()
if __name__ == '__main__':
    # spawn a pool of threads, and pass them queue instance
    print 'program start'
    for i in range(5):
        t = ThreadUrl(queue)
        t.setDaemon(True)
        t.start()

    # populate queue with data
    for host in hosts:
        queue.put(host)

    # wait on the queue until everything has been processed
    queue.join()

print "Elapsed Time: %s" % (time.time() - start)
A queue is similar to a list container, but with internal locking to make it a thread-safe way to communicate data.
What happens when you start all of your threads is that they all block on the self.queue.get() call, waiting to pull an item from the queue. When an item is put into the queue from your main thread, one of the threads will become unblocked and receive the item. It can then continue to process it until it finishes and returns to a blocking state.
All of your threads can run concurrently because they are all able to receive items from the queue. This is where you see the improvement in performance: while urlopen and read in one thread are waiting on I/O, another thread can do work. The queue object's job is simply to manage the locking and hand items off to the callers.
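A rough way to see the effect (a sketch of my own, not from the article; the sleep stands in for the network I/O that urlopen/read spend their time on): with five worker threads, five one-second "downloads" finish in roughly one second rather than five.

import time
import threading
import Queue

q = Queue.Queue()

def worker():
    while True:
        q.get()
        time.sleep(1)      # stands in for urlopen/read blocking on network I/O
        q.task_done()

for _ in range(5):
    t = threading.Thread(target=worker)
    t.daemon = True
    t.start()

start = time.time()
for job in range(5):
    q.put(job)
q.join()
print "Elapsed Time: %.1fs" % (time.time() - start)   # roughly 1s, not 5s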