For server automation, we're developing a tool that can execute many tasks on different servers. We send each task and the target server's hostname into a queue. A requester consumes the queue and passes the information to the Ansible API. To execute more than one task at once, we use threading.
Now we're stuck on acknowledging the messages...
What we have done so far:
requester.py consumes the queue and starts a thread in which the Ansible task runs. The result is then sent to another queue. So each new message creates a new thread, and the thread dies when the task is finished.
Now comes the difficult part. We have to make the messages persistent in case our server dies, so each message should be acknowledged only after the result from Ansible has been sent back.
Our problem is that when we acknowledge the message in the thread itself, no work is done concurrently any more, because pika's consume waits for the acknowledgement. How can we make the consumer keep consuming messages without waiting for the acknowledgement? Or how can we work around this or improve our little program?
requester.py
#!/bin/python
from worker import *
import ansible.inventory
import ansible.runner
import threading
class Requester(Worker):
    def __init__(self):
        Worker.__init__(self)
        self.connection(self.selfhost, self.from_db)
        self.receive(self.from_db)
    def send(self, result, ch, method):
        self.channel.basic_publish(exchange='',
                                   routing_key=self.to_db,
                                   body=result,
                                   properties=pika.BasicProperties(
                                       delivery_mode=2,
                                   ))
        print "[x] Sent \n" + result
        ch.basic_ack(delivery_tag = method.delivery_tag)
    def callAnsible(self, cmd, ch, method):
        #call ansible api pre 2.0
        result = json.dumps(result, sort_keys=True, indent=4, separators=(',', ': '))
        self.send(result, ch, method)
    def callback(self, ch, method, properties, body):
        print(" [x] Received by requester %r" % body)
        t = threading.Thread(target=self.callAnsible, args=(body,ch,method,))
        t.start()
worker.py
import pika
import ConfigParser
import json
import os
class Worker(object):
    def __init__(self):
        #read some config files
    def callback(self, ch, method, properties, body):
        raise Exception("Call method in subclass")
    def receive(self, queue):
        self.channel.basic_qos(prefetch_count=1)
        self.channel.basic_consume(self.callback,queue=queue)
        self.channel.start_consuming()
    def connection(self,server,queue):
        self.connection = pika.BlockingConnection(pika.ConnectionParameters(
            host=server,
            credentials=self.credentials))
        self.channel = self.connection.channel()
        self.channel.queue_declare(queue=queue, durable=True)
We're working with Python 2.7 and pika 0.10.0.
And yes, we noticed in the pika FAQ: http://pika.readthedocs.io/en/0.10.0/faq.html
that pika is not thread safe.
Disable auto-acknowledge and set the prefetch count to something bigger than 1, depending on how many messages you would like your consumer to take at once.
Here is how to set the prefetch count (replace 1 with the number of messages the consumer should hold unacknowledged at once):
channel.basic_qos(prefetch_count=1)
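With a prefetch count above 1 the broker keeps delivering while earlier messages are still being processed, but the acknowledgement still has to be sent from the thread that owns the pika connection. The following is a minimal sketch of one way to arrange that on an old pika such as 0.10.0, not a drop-in fix: worker threads put their results on a standard-library queue, and the consuming thread publishes and acknowledges them between calls to connection.process_data_events(). The names start_task, drain_results and consume_forever are hypothetical, run_task stands in for the Ansible call, and the sketch is written for Python 3 (on Python 2.7 the queue module is called Queue). Persistent delivery properties are left out for brevity.

```python
import queue
import threading


def start_task(done, delivery_tag, body, run_task):
    """Run run_task(body) in a worker thread and report the outcome via `done`.

    The worker never touches pika objects; it only puts a tuple on a
    thread-safe queue.
    """
    def target():
        done.put((delivery_tag, run_task(body)))

    thread = threading.Thread(target=target)
    thread.start()
    return thread


def drain_results(channel, done, result_queue_name):
    """Publish and acknowledge every finished task.

    Must be called from the thread that owns the pika connection.
    Returns the number of results handled.
    """
    handled = 0
    while True:
        try:
            delivery_tag, result = done.get_nowait()
        except queue.Empty:
            return handled
        channel.basic_publish(exchange='',
                              routing_key=result_queue_name,
                              body=result)
        channel.basic_ack(delivery_tag=delivery_tag)
        handled += 1


def consume_forever(connection, channel, done, result_queue_name,
                    keep_running=lambda: True):
    """Stand-in for channel.start_consuming(): pump I/O, then drain results."""
    while keep_running():
        # Dispatches pending consumer callbacks for up to one second.
        connection.process_data_events(time_limit=1)
        drain_results(channel, done, result_queue_name)
```

In this arrangement the consumer callback would call start_task(done, method.delivery_tag, body, run_task) and return immediately, and consume_forever(...) would take the place of channel.start_consuming().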
We have several tasks that we consume from a message queue. The runtimes of those tasks are dependent on fetching some data from a database. Therefore we would like to work with Gevent to not block the program if some database requests take a long time. We are trying to couple it with the Pika client, which has some asynchronous adapters, one of them for gevent: pika.adapters.gevent_connection.GeventConnection.
I set up some toy code that consumes tasks consisting of integers from a message queue and publishes them to another queue, sleeping for 4 seconds on each odd number:
# from gevent import monkey
# # Monkeypatch core python libraries to support asynchronous operations.
# monkey.patch_time()
import pika
from pika.adapters.gevent_connection import GeventConnection
from datetime import datetime
import time
def handle_delivery(unused_channel, method, header, body):
    """Called when we receive a message from RabbitMQ"""
    print(f"Received: {body} at {datetime.now()}")
    channel.basic_ack(method.delivery_tag)
    num = int(body)
    print(num)
    if num % 2 != 0:
        time.sleep(4)
    channel.basic_publish(
        exchange='my_test_exchange2',
        routing_key='my_test_queue2',
        body=body
    )
    print("Finished processing")
def on_connected(connection):
    """Called when we are fully connected to RabbitMQ"""
    # Open a channel
    connection.channel(on_open_callback=on_channel_open)
def on_channel_open(new_channel):
    """Called when our channel has opened"""
    global channel
    channel = new_channel
    channel.basic_qos(prefetch_count=1)
    channel.queue_declare(queue="my_queue_gevent5")
    channel.exchange_declare("my_test_exchange2")
    channel.queue_declare(queue="my_test_queue2")
    channel.queue_bind(exchange="my_test_exchange2", queue="my_test_queue2")
    channel.basic_consume("my_queue_gevent5", handle_delivery)
def start_loop(i):
    conn = GeventConnection(pika.ConnectionParameters('localhost'), on_open_callback=on_connected)
    conn.ioloop.start()
start_loop(1)
If I run it without the monkey.patch_time() call it works OK and it publishes results on the my_test_queue2, but it works sequentially. The expected behaviour after adding monkey.patch_time() patch would be that it still works but concurrently. However, the code gets stuck (nothing happens anymore) after it comes to the call time.sleep(4). It processes and publishes the first integer, which is 0, and then gets stuck at 1, when the if clause gets triggered. What am I doing wrong?
With the help of ChatGPT I managed to make it work. There was a gevent.spawn() call missing:
import gevent  # needed for gevent.spawn
def handle_delivery(unused_channel, method, header, body):
    print("Handling delivery")
    # Hand the message to a new greenlet so this callback returns at once.
    gevent.spawn(process_message, method, body)
def process_message(method, body):
    print(f"Received: {body} at {datetime.now()}")
    channel.basic_ack(method.delivery_tag)
    num = int(body)
    print(num)
    if num % 2 != 0:
        time.sleep(4)
    channel.basic_publish(
        exchange='my_test_exchange2',
        routing_key='my_test_queue2',
        body=body
    )
    print("Finished processing")
I have attempted to follow guidance given here: Handling long running tasks in pika / RabbitMQ and here: https://github.com/pika/pika/issues/753#issuecomment-318124510 on how to run long tasks in a separate thread to avoid interrupting the connection heartbeat. I'm a beginner to threading and still struggling to understand this solution.
For my final use case, I need to make function calls that are several minutes long, represented in the example code below by the long_function(). I've found that if the sleep call in long_function() exceeds the length of the heartbeat timeout, I lose connection (presumably because this function is blocking thread #2 from receiving/acknowledging the heartbeat messages from thread #1) and I get this message in the logs: ERROR: Unexpected connection close detected: StreamLostError: ("Stream connection lost: RxEndOfFile(-1, 'End of input stream (EOF)')",). A sleep call of the same length in the target function of thread #2 does not lead to a StreamLostError.
What's the proper solution for overcoming the StreamLostError here? Do I launch all subsequent function calls in their own threads to avoid blocking thread #2? Do I increase the heartbeat to be longer than long_function()? If this is the solution, what was the point of running my long task in a separate thread? Why not just make the heartbeat timeout in the main thread long enough to accommodate the whole message being processed? Thanks!
import functools
import logging
import pika
import threading
import time
import os
import ssl
from common_utils.rabbitmq_utils import send_message_to_queue, initialize_rabbitmq_channel
import json
import traceback
logging.basicConfig(format='%(asctime)s %(levelname)s: %(message)s',
                    level=logging.INFO,
                    datefmt='%Y-%m-%d %H:%M:%S')
def send_message_to_queue(channel, queue_name, body):
    channel.basic_publish(exchange='',
                          routing_key=queue_name,
                          body=json.dumps(body),
                          properties=pika.BasicProperties(delivery_mode=2)
                          )
    logging.info("RabbitMQ publish to queue {} confirmed".format(queue_name))
def initialize_rabbitmq_channel(timeout=5*60):
    credentials = pika.PlainCredentials(os.environ.get("RABBITMQ_USER"), os.environ.get("RABBITMQ_PASSWORD"))
    context = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
    params = pika.ConnectionParameters(port=5671, host=os.environ.get("RABBITMQ_HOST"), credentials=credentials,
                                       ssl_options=pika.SSLOptions(context), virtual_host="/", heartbeat=timeout)
    connection = pika.BlockingConnection(params)
    return connection.channel(), connection
def long_function():
    logging.info("Long function starting...")
    time.sleep(5)
    logging.info("Long function finished.")
def ack_message(channel, delivery_tag):
    """
    Note that `channel` must be the same pika channel instance via which
    the message being ACKed was retrieved (AMQP protocol constraint).
    """
    if channel.is_open:
        channel.basic_ack(delivery_tag)
        logging.info("Message {} acknowledged".format(delivery_tag))
    else:
        logging.error("Channel is closed and message acknowledgement will fail")
        pass
def do_work(connection, channel, delivery_tag, body):
    thread_id = threading.get_ident()
    fmt1 = 'Thread id: {} Delivery tag: {} Message body: {}'
    logging.info(fmt1.format(thread_id, delivery_tag, body))
    # Simulating work including a call to another function that exceeds heartbeat timeout
    time.sleep(5)
    long_function()
    send_message_to_queue(channel, "test_inactive", json.loads(body))
    cb = functools.partial(ack_message, channel, delivery_tag)
    connection.add_callback_threadsafe(cb)
def on_message(connection, channel, method, property, body):
    t = threading.Thread(target=do_work, args=(connection, channel, method.delivery_tag, body))
    t.start()
    t.join()
if __name__ == "__main__":
    channel, connection = initialize_rabbitmq_channel(timeout=3)
    channel.basic_qos(prefetch_count=1)
    channel.basic_consume(queue="test_queue",
                          auto_ack=False,
                          on_message_callback=lambda channel, method, property, body: on_message(connection, channel, method, property, body)
                          )
    channel.start_consuming()
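One thing that stands out in the code above is the t.join() in on_message: it keeps the connection's thread blocked until the worker finishes, so that thread cannot service heartbeats in the meantime. The code also publishes from the worker thread, although pika channels are not meant to be shared across threads. Below is a minimal sketch of an alternative, assuming pika 1.x (where add_callback_threadsafe is available): the callback returns right after starting the worker, and both the publish and the ack are handed back to the connection's thread. The helper publish_and_ack and the work parameter (standing in for long_function()) are hypothetical names, and message properties are omitted.

```python
import functools
import json
import threading


def publish_and_ack(channel, delivery_tag, queue_name, payload):
    """Runs on the connection's thread: publish the result, then ack the input."""
    if channel.is_open:
        channel.basic_publish(exchange='',
                              routing_key=queue_name,
                              body=json.dumps(payload))
        channel.basic_ack(delivery_tag)


def do_work(connection, channel, delivery_tag, body, work):
    """Runs on a worker thread; `work` stands in for the long-running call."""
    result = work(json.loads(body))
    # Never call channel methods here: schedule them on the connection's thread.
    connection.add_callback_threadsafe(
        functools.partial(publish_and_ack, channel, delivery_tag,
                          "test_inactive", result))


def on_message(channel, method, properties, body, connection, threads, work):
    """Consumer callback: start the worker and return immediately (no join)."""
    t = threading.Thread(target=do_work,
                         args=(connection, channel, method.delivery_tag, body, work))
    t.start()
    threads.append(t)
```

The callback could be registered with functools.partial(on_message, connection=connection, threads=threads, work=long_function), and the collected threads joined only after start_consuming() has returned.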
I have two separate RabbitMQ instances. I'm trying to find the best way to listen to events from both.
For example, I can consume events on one with the following:
credentials = pika.PlainCredentials(user, password)
connection = pika.BlockingConnection(pika.ConnectionParameters(host="host1", credentials=credentials))
channel = connection.channel()
result = channel.queue_declare(exclusive=True)
channel.queue_bind(exchange="my-exchange", queue=result.method.queue, routing_key='*.*.*.*.*')
channel.basic_consume(callback_func, result.method.queue, no_ack=True)
channel.start_consuming()
I have a second host, "host2", that I'd like to listen to as well. I thought about creating two separate threads to do this, but from what I've read, pika isn't thread safe. Is there a better way? Or would creating two separate threads, each listening to a different Rabbit instance (host1, and host2) be sufficient?
The answer to "what is the best way" depends heavily on your usage pattern of queues and what you mean by "best". Since I can't comment on questions yet, I'll just try to suggest some possible solutions.
In each example I'm going to assume exchange is already declared.
Threads
You can consume messages from two queues on separate hosts in a single process using pika.
You are right: as its own FAQ states, pika is not thread safe, but it can be used in a multi-threaded manner by creating a separate connection to a RabbitMQ host in each thread. Running this example in threads using the threading module looks as follows:
import pika
import threading
class ConsumerThread(threading.Thread):
    def __init__(self, host, *args, **kwargs):
        super(ConsumerThread, self).__init__(*args, **kwargs)
        self._host = host
    # Not necessarily a method.
    def callback_func(self, channel, method, properties, body):
        print("{} received '{}'".format(self.name, body))
    def run(self):
        credentials = pika.PlainCredentials("guest", "guest")
        connection = pika.BlockingConnection(
            pika.ConnectionParameters(host=self._host,
                                      credentials=credentials))
        channel = connection.channel()
        result = channel.queue_declare(exclusive=True)
        channel.queue_bind(result.method.queue,
                           exchange="my-exchange",
                           routing_key="*.*.*.*.*")
        channel.basic_consume(self.callback_func,
                              result.method.queue,
                              no_ack=True)
        channel.start_consuming()
if __name__ == "__main__":
    threads = [ConsumerThread("host1"), ConsumerThread("host2")]
    for thread in threads:
        thread.start()
I've declared callback_func as a method purely to use ConsumerThread.name while printing message body. It might as well be a function outside the ConsumerThread class.
Processes
Alternatively, you can always just run one process with consumer code per queue you want to consume events from.
import pika
import sys
def callback_func(channel, method, properties, body):
    print(body)
if __name__ == "__main__":
    credentials = pika.PlainCredentials("guest", "guest")
    connection = pika.BlockingConnection(
        pika.ConnectionParameters(host=sys.argv[1],
                                  credentials=credentials))
    channel = connection.channel()
    result = channel.queue_declare(exclusive=True)
    channel.queue_bind(result.method.queue,
                       exchange="my-exchange",
                       routing_key="*.*.*.*.*")
    channel.basic_consume(callback_func, result.method.queue, no_ack=True)
    channel.start_consuming()
and then run by:
$ python single_consume.py host1
$ python single_consume.py host2 # e.g. on another console
If the work you do on messages is CPU-heavy and your CPU has at least as many cores as you have consumers, this approach is generally better, unless your queues are empty most of the time and the consumers would not use that CPU time*.
Async
Another alternative is to use an asynchronous framework (for example Twisted) and run the whole thing in a single thread.
You can no longer use BlockingConnection in asynchronous code; fortunately, pika has an adapter for Twisted:
from pika.adapters.twisted_connection import TwistedProtocolConnection
from pika.connection import ConnectionParameters
from twisted.internet import protocol, reactor, task
from twisted.python import log
class Consumer(object):
    def on_connected(self, connection):
        d = connection.channel()
        d.addCallback(self.got_channel)
        d.addCallback(self.queue_declared)
        d.addCallback(self.queue_bound)
        d.addCallback(self.handle_deliveries)
        d.addErrback(log.err)
    def got_channel(self, channel):
        self.channel = channel
        return self.channel.queue_declare(exclusive=True)
    def queue_declared(self, queue):
        self._queue_name = queue.method.queue
        self.channel.queue_bind(queue=self._queue_name,
                                exchange="my-exchange",
                                routing_key="*.*.*.*.*")
    def queue_bound(self, ignored):
        return self.channel.basic_consume(queue=self._queue_name)
    def handle_deliveries(self, queue_and_consumer_tag):
        queue, consumer_tag = queue_and_consumer_tag
        self.looping_call = task.LoopingCall(self.consume_from_queue, queue)
        return self.looping_call.start(0)
    def consume_from_queue(self, queue):
        d = queue.get()
        return d.addCallback(lambda result: self.handle_payload(*result))
    def handle_payload(self, channel, method, properties, body):
        print(body)
if __name__ == "__main__":
    consumer1 = Consumer()
    consumer2 = Consumer()
    parameters = ConnectionParameters()
    cc = protocol.ClientCreator(reactor,
                                TwistedProtocolConnection,
                                parameters)
    d1 = cc.connectTCP("host1", 5672)
    d1.addCallback(lambda protocol: protocol.ready)
    d1.addCallback(consumer1.on_connected)
    d1.addErrback(log.err)
    d2 = cc.connectTCP("host2", 5672)
    d2.addCallback(lambda protocol: protocol.ready)
    d2.addCallback(consumer2.on_connected)
    d2.addErrback(log.err)
    reactor.run()
This approach gets better the more queues you consume from and the less CPU-bound the consumers' work is*.
Python 3
Since you've mentioned pika, I've restricted myself to Python 2.x-based solutions, because pika had not yet been ported to Python 3 at the time of writing.
But in case you want to move to >=3.3, one possible option is to use asyncio with one of the libraries for AMQP (the protocol you speak with RabbitMQ), e.g. asynqp or aioamqp.
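To give a feel for the asyncio route, here is a minimal standard-library sketch of two consumers sharing one thread. It deliberately does not use the asynqp or aioamqp APIs: the two asyncio.Queue objects are hypothetical stand-ins for the queues on host1 and host2, and a None sentinel ends each consumer so that the example terminates. It needs Python 3.7+ for asyncio.run.

```python
import asyncio


async def consume(name, source, handled):
    """Handle messages from one source until a None sentinel arrives."""
    while True:
        body = await source.get()
        if body is None:
            break
        handled.append((name, body))


async def main():
    # Hypothetical stand-ins for the queues on host1 and host2.
    host1, host2 = asyncio.Queue(), asyncio.Queue()
    handled = []
    for body in ("a", "b", None):
        host1.put_nowait(body)
    for body in ("x", None):
        host2.put_nowait(body)
    # Both consumers are scheduled on the same event loop, in one thread.
    await asyncio.gather(consume("host1", host1, handled),
                         consume("host2", host2, handled))
    return handled


handled = asyncio.run(main())
```

With a real AMQP client, each consume coroutine would await deliveries from its own connection instead of a local queue.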
* - please note that these are very shallow tips - in most cases choice is not that obvious; what will be the best for you depends on queues "saturation" (messages/time), what work do you do upon receiving these messages, what environment you run your consumers in etc.; there's no way to be sure other than to benchmark all implementations
Below is an example of how I use one rabbitmq instance to listen to 2 queues at the same time:
import pika
import threading
threads=[]
def client_info(channel):
    channel.queue_declare(queue='proxy-python')
    print (' [*] Waiting for client messages. To exit press CTRL+C')
    def callback(ch, method, properties, body):
        print (" Received %s" % (body))
    channel.basic_consume(callback, queue='proxy-python', no_ack=True)
    channel.start_consuming()
def scenario_info(channel):
    channel.queue_declare(queue='savi-virnet-python')
    print (' [*] Waiting for scenrio messages. To exit press CTRL+C')
    def callback(ch, method, properties, body):
        print (" Received %s" % (body))
    channel.basic_consume(callback, queue='savi-virnet-python', no_ack=True)
    channel.start_consuming()
def manager():
    connection1= pika.BlockingConnection(pika.ConnectionParameters
                                         (host='localhost'))
    channel1 = connection1.channel()
    connection2= pika.BlockingConnection(pika.ConnectionParameters
                                         (host='localhost'))
    channel2 = connection2.channel()
    t1 = threading.Thread(target=client_info, args=(channel1,))
    t1.daemon = True
    threads.append(t1)
    t1.start()
    t2 = threading.Thread(target=scenario_info, args=(channel2,))
    t2.daemon = True
    threads.append(t2)
    t2.start()
    for t in threads:
        t.join()
manager()
import asyncio
import tornado.ioloop
import tornado.web
from aio_pika import connect_robust, Message
tornado.ioloop.IOLoop.configure("tornado.platform.asyncio.AsyncIOLoop")
io_loop = tornado.ioloop.IOLoop.current()
asyncio.set_event_loop(io_loop.asyncio_loop)
QUEUE = asyncio.Queue()
class SubscriberHandler(tornado.web.RequestHandler):
    async def get(self):
        message = await QUEUE.get()
        self.finish(message.body)
class PublisherHandler(tornado.web.RequestHandler):
    async def post(self):
        connection = self.application.settings["amqp_connection"]
        channel = await connection.channel()
        try:
            await channel.default_exchange.publish(
                Message(body=self.request.body), routing_key="test",
            )
        finally:
            await channel.close()
        print('ok')
        self.finish("OK")
async def make_app():
    amqp_connection = await connect_robust()
    channel = await amqp_connection.channel()
    queue = await channel.declare_queue("test", auto_delete=True)
    await queue.consume(QUEUE.put, no_ack=True)
    return tornado.web.Application(
        [(r"/publish", PublisherHandler), (r"/subscribe", SubscriberHandler)],
        amqp_connection=amqp_connection,
    )
if __name__ == "__main__":
    app = io_loop.asyncio_loop.run_until_complete(make_app())
    app.listen(8888)
    tornado.ioloop.IOLoop.current().start()
You can use aio-pika asynchronously, as in the code above.
More examples here:
https://buildmedia.readthedocs.org/media/pdf/aio-pika/latest/aio-pika.pdf
Happy coding :)
Pika can be used in a multithreaded consumer. The only requirement is to have one Pika connection per thread.
Pika Github repository has an example here.
A snippet from basic_consumer_threaded.py:
def on_message(ch, method_frame, _header_frame, body, args):
    (conn, thrds) = args
    delivery_tag = method_frame.delivery_tag
    t = threading.Thread(target=do_work, args=(conn, ch, delivery_tag, body))
    t.start()
    thrds.append(t)
threads = []
on_message_callback = functools.partial(on_message, args=(connection, threads))
channel.basic_consume('standard', on_message_callback)
This is a long one.
I have a list of usernames and passwords. For each one I want to log in to the account and do some things. I want to use several machines to do this faster. My plan is to have a main machine whose only job is a cron task that checks from time to time whether the rabbitmq queue is empty. If it is, it reads the list of usernames and passwords from a file and sends it to the rabbitmq queue. A bunch of machines subscribed to that queue then each receive a user/pass, do stuff with it, acknowledge it, and move on to the next one until the queue is empty, at which point the main machine fills it up again. So far I think I have everything down.
Now comes my problem. I have checked that the things to be done with each user/passes aren't so intensive and so I could have each machine doing three of them simultaneously using python's threading. In fact for a single machine I have implemented this where I load the user/passes into a python Queue() and then have three threads consume that Queue(). Now I want to do something similar, but instead of consuming from a python Queue(), each thread of each machine should consume from a rabbitmq queue. This is where I'm stuck. To run tests I started by using rabbitmq's tutorial.
send.py:
import pika, sys
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='hello')
message = ' '.join(sys.argv[1:])
channel.basic_publish(exchange='',
routing_key='hello',
body=message)
connection.close()
worker.py
import time, pika
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='hello')
def callback(ch, method, properties, body):
    print ' [x] received %r' % (body,)
    time.sleep( body.count('.') )
    ch.basic_ack(delivery_tag = method.delivery_tag)
channel.basic_qos(prefetch_count=1)
channel.basic_consume(callback, queue='hello', no_ack=False)
channel.start_consuming()
For the above you can run two worker.py which will subscribe to the rabbitmq queue and consume as expected.
My threading without rabbitmq is something like this:
runit.py
import threading
import Queue
class Threaded_do_stuff(threading.Thread):
    def __init__(self, user_queue):
        threading.Thread.__init__(self)
        self.user_queue = user_queue
    def run(self):
        while True:
            login = self.user_queue.get()
            # `pass` is a reserved word, so the keyword argument is named `password`
            do_stuff(user=login[0], password=login[1])
            self.user_queue.task_done()
user_queue = Queue.Queue()
for i in range(3):
    td = Threaded_do_stuff(user_queue)
    td.setDaemon(True)
    td.start()
## fill up the queue
for user in list_users:
    user_queue.put(user)
## go!
user_queue.join()
This also works as expected: you fill up the queue and have 3 threads subscribe to it. Now what I want to do is something like runit.py but instead of using a python Queue(), using something like worker.py where the queue is actually a rabbitmq queue.
Here's something which I tried and didn't work (and I don't understand why)
rabbitmq_runit.py
import time, threading, pika
class Threaded_worker(threading.Thread):
    def callback(self, ch, method, properties, body):
        print ' [x] received %r' % (body,)
        time.sleep( body.count('.') )
        ch.basic_ack(delivery_tag = method.delivery_tag)
    def __init__(self):
        threading.Thread.__init__(self)
        self.connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
        self.channel = self.connection.channel()
        self.channel.queue_declare(queue='hello')
        self.channel.basic_qos(prefetch_count=1)
        self.channel.basic_consume(self.callback, queue='hello')
    def run(self):
        print 'start consuming'
        self.channel.start_consuming()
for _ in range(3):
    print 'launch thread'
    td = Threaded_worker()
    td.setDaemon(True)
    td.start()
I would expect this to launch three threads, each of which blocks in .start_consuming() and just stays there waiting for the rabbitmq queue to send it something. Instead, the program starts, prints a few lines, and exits. The pattern of the prints is weird too:
launch thread
launch thread
start consuming
launch thread
start consuming
In particular notice there is one "start consuming" missing.
What's going on?
EDIT: One answer I found to a similar question is here
Consuming a rabbitmq message queue with multiple threads (Python Kombu)
and the answer is to "use celery", whatever that means. I don't buy it, I shouldn't need anything remotely as sophisticated as celery. In particular, I'm not trying to set up an RPC and I don't need to read replies from the do_stuff routines.
EDIT 2: The print pattern that I expected would be the following. I do
python send.py first message......
python send.py second message.
python send.py third message.
python send.py fourth message.
and the print pattern would be
launch thread
start consuming
[x] received 'first message......'
launch thread
start consuming
[x] received 'second message.'
launch thread
start consuming
[x] received 'third message.'
[x] received 'fourth message.'
The problem is that you're making the thread daemonic:
td = Threaded_worker()
td.setDaemon(True) # Shouldn't do that.
td.start()
Daemonic threads will be terminated as soon as the main thread exits:
A thread can be flagged as a “daemon thread”. The significance of this
flag is that the entire Python program exits when only daemon threads
are left. The initial value is inherited from the creating thread. The
flag can be set through the daemon property.
Leave out setDaemon(True) and you should see it behave the way you expect.
Also, the pika FAQ has a note about how to use it with threads:
Pika does not have any notion of threading in the code. If you want to
use Pika with threading, make sure you have a Pika connection per
thread, created in that thread. It is not safe to share one Pika
connection across threads.
This suggests you should move everything you're doing in __init__() into run(), so that the connection is created in the same thread you're actually consuming from the queue in.
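A minimal sketch of that restructuring, with some assumptions: the keyword arguments to basic_consume follow pika 1.x (the question uses the older argument order), and the connect and on_message parameters are hypothetical hooks. connect is any zero-argument callable returning a connection, for example lambda: pika.BlockingConnection(pika.ConnectionParameters('localhost')); passing it in keeps the connection from being created before run() executes in the new thread.

```python
import threading


class ConsumerThread(threading.Thread):
    """Non-daemon consumer thread that opens its own connection inside run()."""

    def __init__(self, connect, queue_name, on_message):
        # daemon=False: the program stays alive while this thread is consuming.
        super().__init__(daemon=False)
        self._connect = connect
        self._queue_name = queue_name
        self._on_message = on_message

    def run(self):
        # Everything below executes in this thread, as the FAQ asks.
        connection = self._connect()
        channel = connection.channel()
        channel.queue_declare(queue=self._queue_name)
        channel.basic_qos(prefetch_count=1)
        channel.basic_consume(queue=self._queue_name,
                              on_message_callback=self._on_message)
        channel.start_consuming()
```

Starting three of these, each with its own connect callable, gives three independent consumers on the same queue.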
I want to transmit data from a Queue using Twisted. I currently use a push producer to poll the queue for items and write to the transport.
class Producer:
    implements(interfaces.IPushProducer)
    def __init__(self, protocol, queue):
        self.queue = queue
        self.protocol = protocol
    def resumeProducing(self):
        self.paused = False
        while not self.paused:
            try:
                data = self.queue.get_nowait()
                logger.debug("Transmitting: '%s'", repr(data))
                data = cPickle.dumps(data)
                self.protocol.transport.write(data + "\r\n")
            except Empty:
                pass
    def pauseProducing(self):
        logger.debug("Transmitter paused.")
        self.paused = True
    def stopProducing(self):
        pass
The problem is, that the data are sent very irregularly and if only one item was in the queue, the data is never going to be sent. It seems that Twisted waits until the data to be transmitted has grown to a specific value until it transmits it. Is the way I implemented my producer the right way? Can I force Twisted to transmit data now?
I've also tried using a pull producer, but Twisted does not call its resumeProducing() method at all. Do I have to call resumeProducing() from outside when using a pull producer?
It's hard to say why your producer doesn't work well without seeing a complete example (that is, without also seeing the code that registers it with a consumer and the code which is putting items into that queue).
However, one problem you'll likely have is that if your queue is empty when resumeProducing is called, then you will write no bytes at all to the consumer. And when items are put into the queue, they'll sit there forever, because the consumer isn't going to call your resumeProducing method again.
And this generalizes to any other case where the queue does not have enough data in it to cause the consumer to call pauseProducing on your producer. As a push producer, it is your job to continue to produce data on your own until the consumer calls pauseProducing (or stopProducing).
For this particular case, that probably means that whenever you're going to put something in that queue - stop: check to see if the producer is not paused, and if it is not, write it to the consumer instead. Only put items in the queue when the producer is paused.
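The rule in the last paragraph can be sketched without any Twisted imports. This is only an illustration under stated assumptions: BufferingProducer and its send method are hypothetical names, consumer is anything with a write(data) method (such as a transport), and the class neither declares IPushProducer nor registers itself. It is also not thread safe, so send would have to be called from the reactor thread.

```python
from collections import deque


class BufferingProducer:
    """Push-producer logic: write immediately unless paused, otherwise buffer."""

    def __init__(self, consumer):
        self.consumer = consumer  # anything with write(data)
        self.paused = False
        self.pending = deque()

    def send(self, data):
        # Only queue items while the consumer has asked us to pause.
        if self.paused:
            self.pending.append(data)
        else:
            self.consumer.write(data)

    def pauseProducing(self):
        self.paused = True

    def resumeProducing(self):
        self.paused = False
        # Flush the backlog, stopping if the consumer pauses us again.
        while self.pending and not self.paused:
            self.consumer.write(self.pending.popleft())

    def stopProducing(self):
        # Nothing more will be written, so drop whatever is still buffered.
        self.pending.clear()
```

Because items are written as soon as they arrive while the producer is unpaused, a single item no longer sits in the buffer waiting for a resumeProducing call that never comes.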
Here are two possible solutions:
1) Periodically poll your local application to see if you have additional data to send.
NB. This relies on a periodic async callback from the deferLater method in twisted. If you need a responsive application that sends data on demand, or a long running blocking operation (eg. ui that uses its own event loop) it may not be appropriate.
Code:
from twisted.internet.protocol import Factory
from twisted.internet.endpoints import TCP4ServerEndpoint
from twisted.internet.interfaces import IPushProducer
from twisted.internet.task import deferLater, cooperate
from twisted.internet.protocol import Protocol
from twisted.internet import reactor
from zope.interface import implementer
import time
# Deferred action
def periodically_poll_for_push_actions_async(reactor, protocol):
    while True:
        protocol.send(b"Hello World\n")
        yield deferLater(reactor, 2, lambda: None)
# Push protocol
@implementer(IPushProducer)
class PushProtocol(Protocol):
    def connectionMade(self):
        self.transport.registerProducer(self, True)
        gen = periodically_poll_for_push_actions_async(self.transport.reactor, self)
        self.task = cooperate(gen)
    def dataReceived(self, data):
        self.transport.write(data)
    def send(self, data):
        self.transport.write(data)
    def pauseProducing(self):
        print('Workload paused')
        self.task.pause()
    def resumeProducing(self):
        print('Workload resumed')
        self.task.resume()
    def stopProducing(self):
        print('Workload stopped')
        self.task.stop()
    def connectionLost(self, reason):
        print('Connection lost')
        try:
            self.task.stop()
        except Exception:
            # The task may already have been stopped by stopProducing()
            pass
# Push factory
class PushFactory(Factory):
    def buildProtocol(self, addr):
        return PushProtocol()
# Run the reactor that serves everything
endpoint = TCP4ServerEndpoint(reactor, 8089)
endpoint.listen(PushFactory())
reactor.run()
2) Manually keep track of Protocol instances and use reactor.callFromThread() from a different thread. Lets you get away with a long blocking operation in the other thread (eg. ui event loop).
Code:
from twisted.internet.protocol import Factory
from twisted.internet.endpoints import TCP4ServerEndpoint
from twisted.internet.interfaces import IPushProducer
from twisted.internet.task import deferLater, cooperate
from twisted.internet.protocol import Protocol
from twisted.internet import reactor, threads
import time
import random
import threading
# Connection
protocol = None
# Some other thread that does whatever it likes.
class SomeThread(threading.Thread):
    def run(self):
        while True:
            print("Thread loop")
            time.sleep(random.randint(0, 4))
            if protocol is not None:
                # Schedule the write on the reactor thread
                reactor.callFromThread(self.dispatch)
    def dispatch(self):
        global protocol
        protocol.send(b"Hello World\n")
# Push protocol
class PushProtocol(Protocol):
    def connectionMade(self):
        global protocol
        protocol = self
    def dataReceived(self, data):
        self.transport.write(data)
    def send(self, data):
        self.transport.write(data)
    def connectionLost(self, reason):
        print('Connection lost')
# Push factory
class PushFactory(Factory):
    def buildProtocol(self, addr):
        return PushProtocol()
# Start thread
other = SomeThread()
other.start()
# Run the reactor that serves everything
endpoint = TCP4ServerEndpoint(reactor, 8089)
endpoint.listen(PushFactory())
reactor.run()
Personally, I find that the need for a periodic callback makes IPushProducer and IPullProducer less useful. Others disagree... shrug. Take your pick.