Python Tornado - How to create background process? - python

Is it possible to let Python Tornado run some long background process, but concurrently, it is also serving all the handlers?
I have a Tornado Webapp that serves some webpages. But I also have a message queue, and I want Tornado to poll the message queue as a subscriber. Can this be done in Tornado?
I've searched around the user guide, and there seems to be something called a periodic_call_back I can use within the ioloop. It sounds like I can use a callback function that reads a message queue. However, is there a way to create a co-routine that never stops?
Any help is appreciated, thanks!

To Read from Zero-MQ:
Install Zero-MQ Python Library
Install the IOLoop before application.listen()
Use an executor (For python2, you can install executor libraries from python3) to execute a message queue listener, which setups tornado to listen to a message queue, and then it will utilize callbacks when it recieves data.
Example (main.py):
# Import tornado libraries
import tornado.ioloop
import tornado.web
# Import URL mappings
from url import application
# Import zeroMQ libraries
from zmq.eventloop import ioloop
# Import zeroMQ.py functions
from zeroMQ import startListenToMessageQueue
# Import zeroMQ settings
import zeroMQ_settings
# Import our executor
import executors
# Import our db_settings
import db_settings
# main.py is the main access point of the tornado app, to run the application, just run "python main.py"
# What this will do is listen to port 8888, and then we can access the app using
# http://localhost:8888 on any browser, or using python requests library
if __name__ == "__main__":
# Install PyZMQ's IOLoop
ioloop.install()
# Set the application to listen to port 8888
application.listen(8888)
# Get the current IOLoop
currentIOLoop = tornado.ioloop.IOLoop.current()
# Execute ZeroMQ Subscriber for our topics
executors.executor.submit(startListenToMessageQueue(zeroMQ_settings.server_subscriber_ports,
zeroMQ_settings.server_subscriber_IP,
zeroMQ_settings.server_subscribe_list))
# Test if the connection to our database is successful before we start the IOLoop
db_settings.testForDatabase(db_settings.database)
# Start the IOLoop
currentIOLoop.start()
Example (zeroMQ.py):
# Import our executor
import executors
# Import zeroMQ libraries
import zmq
from zmq.eventloop import ioloop, zmqstream
# Import db functions to process the message
import db
# zeroMQ.py deals with the communication between a zero message queue
def startListenToMessageQueue(subscribe_ports, subscribe_IP, subscribe_topic):
# Usage:
# This function starts the subscriber for our application that will listen to the
# address and ports specified in the zeroMQ_settings.py, it will spawn a callback when we
# received anything relevant to our topic.
# Arguments:
# None
# Return:
# None
# Get zmq context
context = zmq.Context()
# Get the context socket
socket_sub = context.socket(zmq.SUB)
# Connect to multiple subscriber ports
for ports in subscribe_ports:
socket_sub.connect("tcp://"+str(subscribe_IP)+":"+str(ports))
# Subscribe to our relevant topics
for topic in subscribe_topic:
socket_sub.setsockopt(zmq.SUBSCRIBE, topic)
# Setup ZMQ Stream with our socket
stream_sub = zmqstream.ZMQStream(socket_sub)
# When we recieve our data, we will process the data by using a callback
stream_sub.on_recv(processMessage)
# Print the Information to Console
print "Connected to publisher with IP:" + \
str(subscribe_IP) + ", Port" + str(subscribe_ports) + ", Topic:" + str(subscribe_topic)
def processMessage(message):
# Usage:
# This function processes the data using a callback format. The on_recv will call this function
# and populate the message variable with the data that we recieved through the message queue
# Arguments:
# message: a string containing the data that we recieved from the message queue
# Return:
# None
# Process the message with an executor, and use the addData function in our db to process the message
executors.executor.submit(db.addData, message)
Example (executors.py):
# Import futures library
from concurrent import futures
# executors.py will create our threadpools, and this can be shared around different python files
# which will not re-create 10 threadpools when we call it.
# we can a handful of executors for running synchronous tasks
# Create a 10 thread threadpool that we can use to call any synchronous/blocking functions
executor = futures.ThreadPoolExecutor(10)
Example (zeroMQ_settings.py):
# zeroMQ_settings.py keep the settings for zeroMQ, for example port, IP, and topics that
# we need to subscribe
# Set the Port to 5558
server_subscriber_ports = ["5556", "5558"]
# Set IP to localhost
server_subscriber_IP = "localhost"
# Set Message to Subscribe: metrics.dat
server_subscriber_topic_metrics = "metrics.dat"
# Set Message to Subscribe: test-010
server_subscribe_topics_test_010 = "test-010"
# List of Subscriptions
server_subscribe_list = [server_subscriber_topic_metrics, server_subscribe_topics_test_010]
Extra thanks to #dano

Related

Why is ZeroMQ multipart sending/reciving wrong messages?

In python I'm creating an application also using ZeroMQ. I'm using the PUSH/PULL method to send the loading status of one script to another. The message received on the PULL script runs inside of a Thread. The PULL script looks like this:
import time
from threading import Thread
import threading
import os
import zmq
import sys
context = zmq.Context()
zmqsocket = context.socket(zmq.PULL)
zmqsocket.bind("tcp://*:5555")
class TaskstatusUpdater(Thread):
def __init__(self):
Thread.__init__(self)
def run(self):
while True:
# Wait for next request from client
task_id = int(zmqsocket.recv_multipart()[0])
taskcolorstat = int(zmqsocket.recv_multipart()[1])
taskstatus = zmqsocket.recv_multipart()[2]
time.sleep(0.1)
print(task_id, taskstatus, taskcolorstat)
thread = TaskstatusUpdater()
thread.start()
The PUSH part sends constantly updates about the status of the other script. It looks something like this:
import time
import sys
import zmq
# zmq - client startup and connecting
try:
context = zmq.Context()
print("Connecting to server…")
zmqsocket = context.socket(zmq.PUSH)
zmqsocket.connect("tcp://localhost:5555")
print("succesful")
except:
print('error could not connect to service')
# zmq - client startup and connecting
for i in range(10):
zmqsocket.send_multipart([b_task_id, b"0", b"first message"])
time.sleep(3)# doing stuff
zmqsocket.send_multipart([b_task_id, b"1", b"second message"])
b_task_id is generated earlier in the program and is a simple binary value created out of an integer. There are multiple of those PUSH scripts running at the same time and thru the b_task_id I can define which script is responding to the PULL.
It is now often the case that those multipart messages get mixed up between each other. Can somebody explain to me why that is and how I can fix this problem?
For example, sometimes the output is:
2 b'second message' 0
The output that I was expecting is:
2 b'second message' 1

The gRPC object creates extra processes that don't close

I developed the application with gRPC servicer. The point of my application is:
gRPC servicer (class DexFxServicer in the code below) has Transmit method which is called by gRPC client outside.
Transmit method creates multiple channels and stubs for the different hosts from hostList.
Further application creates the process pool and launches it.
Each child process calls gRPC method SendHostListAndGetMetrics for its own stub and receives response iterator.
This code works well, the application invokes Transmit method and receive all needed results from the process pool. But I noticed when outside gRPC client calls Transmit method multiple times, this code didn't close some of its child processes. And it leads to extra nonclosing processes creation as htop shows.
When I try to close gRPC channels by channel.close() method, extra processes are being created more intensively.
Python 2.7.12
grpcio==1.16.1
grpcio-tools==1.16.1
Ubuntu 16.04.6 LTS 4.4.0-143-generic
from concurrent import futures
import sleep
import grpc
import sys
import cascade_pb2
import cascade_pb2_grpc
import metrics_pb2
import metrics_pb2_grpc
from multiprocessing import Pool
class DexFxServicer(cascade_pb2_grpc.DexFxServicer):
def __init__(self, args):
self.args = args
def Transmit(self, request, context):
entrypoint = request.sender.host_address # entrypoint is a string
hostList = [] # hostList is a list of strings
for rec in request.sender.receiver:
hostList.append(rec.host_address)
channels = {}
stubs = {}
for host in hostList:
try:
channels[host] = grpc.insecure_channel('%s:%d' % (host, self.args.cascadePort))
except Exception as e:
print(e)
sys.exit(0)
else:
stubs[host] = metrics_pb2_grpc.MetricsStub(channels[host])
def collect_metrics(host):
mtrx = []
hosts = (metrics_pb2.Host(hostname = i) for i in hostList + [entrypoint])
for i in stubs[host].SendHostListAndGetMetrics(hosts):
mtrx.append(i.mtrx)
return mtrx
pool = Pool(len(hostList))
results = pool.map(collect_metrics, hostList)
pool.close()
pool.terminate()
pool.join()
# Return the iterator of the results
I expect to see the code which doesn't create extra nonclosing processes. Please, suggest me what to do in this case.
The problem was solved by means of update grpcio version to 1.23.0. gRPC issue

ZeroMQ REQ .recv() hangs with messages larger than ~1kB if run inside Docker

I'm working on a relatively simple Python / ZeroMQ based work distribution system, using REQ/ROUTER sockets. The system is distributed and worker nodes are geographically distributed on different continents.
The ROUTER, responsible for distributing work, .bind()-s a ROUTER socket. Workers .connect() to it over TCP using a REQ socket.
In the process of setting up a new worker node, I've noticed that while smaller messages (up to 1kB) do the trip with no issues, replies of ~2kB and up, sent by the ROUTER-end are never received by the worker into their REQ-socket - when I call recv(), the socket just hangs.
The worker code runs inside Docker containers, and I was able to work around the issue when running the same image with --net=host - it seems to not happen if Docker is using the host network.
I'm wondering if this is something in the network stack configuration on the host machine or in Docker, or maybe something that can be prevented in my code?
Here is a simplified version of my code that reproduces this issue:
Worker
import sys
import zmq
import logging
import time
READY = 'R'
def worker(connect_to):
ctx = zmq.Context()
socket = ctx.socket(zmq.REQ)
socket.connect(connect_to)
log = logging.getLogger(__name__)
while True:
socket.send_string(READY)
log.debug("Send READY message, waiting for reply")
message = socket.recv()
log.debug("Got reply of %d bytes", len(message))
time.sleep(5)
if __name__ == '__main__':
logging.basicConfig(level=logging.DEBUG)
worker(sys.argv[1])
Router
import sys
import zmq
import logging
REPLY_SIZE = 1024 * 8
def router(bind_to):
ctx = zmq.Context()
socket = ctx.socket(zmq.ROUTER)
socket.bind(bind_to)
poller = zmq.Poller()
poller.register(socket, zmq.POLLIN)
log = logging.getLogger(__name__)
while True:
socks = dict(poller.poll(5000))
if socks.get(socket) == zmq.POLLIN:
message = socket.recv_multipart()
log.debug("Received message of %d parts", len(message))
identity, _ = message[:2]
res = handle_message(message[2:])
log.debug("Sending %d bytes back in response on socket", len(res))
socket.send_multipart([identity, '', res])
def handle_message(parts):
log = logging.getLogger(__name__)
log.debug("Got message: %s", parts)
return 'A' * REPLY_SIZE
if __name__ == '__main__':
logging.basicConfig(level=logging.DEBUG)
router(sys.argv[1])
FWIW I was able to reproduce this on Ubuntu 16.04 (both router and worker) with Docker 17.09.0-ce, libzmq 4.1.5 and PyZMQ 15.4.0.
No, sir, the socket does not hang at all:
Why?
The issue is, that you have instructed the Socket()-instance to enter into an infinitely blocking state, once having called .recv() method, without specifying a zmq.NOBLOCK flag ( the ZMQ_DONTWAIT flag in the ZeroMQ original API ).
This is the cause, that upon other circumstances reported yesterday, moves the code into infinite blocking, as there seem to be other issues that prevent Docker-container to properly deliver any first message to the hands of the Worker's Docker-embedded-ZeroMQ-Context() I/O-engine and to the hands of the REQ-access-point. As the REQ-archetype uses a strict two-step Finite-State-Automaton - strictly striding ( .send()->.recv()->.send()-> ... ad infimum )
This cause->effect reversing is wrong and misleading -
the issue of "socket just hangs"
is un-decideable
from an issue Docker does not deliver a single message ( to allow .recv() to return )
Next steps:
may use .poll() in REQ-side to sniff without blocking for any already arrived message in the Worker.
Once there are none such, focus on Docker first + next may benefit from ZeroMQ Context()-I/O-engine performance and link-level tweaking configuration options.

Understanding Autobahn and Twisted integration

I am trying to understand the examples given here: https://github.com/tavendo/AutobahnPython/tree/master/examples/twisted/wamp/basic/pubsub/basic
I built this script which is supposed to handle multiple pub/sub websocket connections and also open a tcp port ( 8123 ) for incoming control messages. When a message comes on the 8123 port, the application should broadcast to all the connected subscribers the message received on port 8123. How do i make NotificationProtocol or NotificationFactory talk to the websocket and make the websocket server broadcast a message.
Another thing that i do not understand is the url. The client javascript connects to the url http://:8080/ws . Where does the "ws" come from ?
Also can someone explain the purpose of RouterFactory, RouterSessionFactory and this bit:
from autobahn.wamp import types
session_factory.add( WsNotificationComponent(types.ComponentConfig(realm = "realm1" )))
my code is below:
import sys, time
from twisted.internet import reactor
from twisted.internet.protocol import Protocol, Factory
from twisted.internet.defer import inlineCallbacks
from autobahn.twisted.wamp import ApplicationSession
from autobahn.twisted.util import sleep
class NotificationProtocol(Protocol):
def __init__(self, factory):
self.factory = factory
def dataReceived(self, data):
print "received new data"
class NotificationFactory(Factory):
protocol = NotificationProtocol
class WsNotificationComponent(ApplicationSession):
#inlineCallbacks
def onJoin(self, details):
counter = 0
while True:
self.publish("com.myapp.topic1", "test %d" % counter )
counter += 1
yield sleep(1)
## we use an Autobahn utility to install the "best" available Twisted reactor
##
from autobahn.twisted.choosereactor import install_reactor
reactor = install_reactor()
## create a WAMP router factory
##
from autobahn.wamp.router import RouterFactory
router_factory = RouterFactory()
## create a WAMP router session factory
##
from autobahn.twisted.wamp import RouterSessionFactory
session_factory = RouterSessionFactory(router_factory)
from autobahn.wamp import types
session_factory.add( WsNotificationComponent(types.ComponentConfig(realm = "realm1" )))
from autobahn.twisted.websocket import WampWebSocketServerFactory
transport_factory = WampWebSocketServerFactory(session_factory)
transport_factory.setProtocolOptions(failByDrop = False)
from twisted.internet.endpoints import serverFromString
## start the server from an endpoint
##
server = serverFromString(reactor, "tcp:8080")
server.listen(transport_factory)
notificationFactory = NotificationFactory()
reactor.listenTCP(8123, notificationFactory)
reactor.run()
"How do i make NotificationProtocol or NotificationFactory talk to the websocket and make the websocket server broadcast a message":
Check out one of my other answers on SO: Persistent connection in twisted. Jump down to the example code and model your websocket logic like the "IO" logic and you'll have a good fit (You might also want to see the follow-on answer about the newer endpoint calls from one of the twisted core-team too)
"Where does the "ws" come from ?"
Websockets are implemented by retasking http connections, which by their nature have to have a specific path on the request. That "ws" path typically would map to a special http handler that autobahn is building for you to process websockets (or at least that's what your javascript is expecting...). Assuming thing are setup right you can actually point your web-browswer at that url and it should print back an error about the websocket handshake (Expected WebSocket Headers in my case, but I'm using cyclones websockets not autobahn).
P.S. one of the cool side-effects from "websockets must have a specific path" is that you can actually mix websockets and normal http content on the same handler/listen/port, this gets really handy when your trying to run them all on the same SSL port because your trying to avoid the requirement of a proxy front-ending your code.

pyzmq non-blocking socket

Can someone point me to an example of a REQ/REP non-blocking ZeroMQ (0MQ) with Python bindings? Perhaps my understanding of ZMQ is faulty but I couldn't find an example online.
I have a server in Node.JS that sends work from multiple clients to the server. The idea is that the server can spin up a bunch of jobs that operate in parallel instead of processing data for one client followed by the next
You can use for this goal both zmq.Poller (many examples you can find in zguide repo, eg rrbroker.py) or gevent-zeromq implementation (code sample).
The example provided in the accepted answer gives the gist of it, but you can get away with something a bit simpler as well by using zmq.device for the broker while otherwise sticking to the "Extended Request-Reply" pattern from the guide. As such, a hello worldy example for the server could look something like the following:
import time
import threading
import zmq
context = zmq.Context()
def worker():
socket = context.socket(zmq.REP)
socket.connect('inproc://workers')
while True:
msg = socket.recv_string()
print(f'Received request: [{msg}]')
time.sleep(1)
socket.send_string(msg)
url_client = 'tcp://*:5556'
clients = context.socket(zmq.ROUTER)
clients.bind(url_client)
workers = context.socket(zmq.DEALER)
workers.bind('inproc://workers')
for _ in range(4):
thread = threading.Thread(target=worker)
thread.start()
zmq.device(zmq.QUEUE, clients, workers)
Here we're letting four workers handle incoming requests in parallel. Now, you're using Node on the client side, but just to keep the example complete, one can use the Python client below to see that this works. Here, we're creating 10 requests which will then be handled in 3 batches:
import zmq
import threading
context = zmq.Context()
def make_request(a):
socket = context.socket(zmq.REQ)
socket.connect('tcp://localhost:5556')
print(f'Sending request {a} ...')
socket.send_string(str(a))
message = socket.recv_string()
print(f'Received reply from request {a} [{message}]')
for a in range(10):
thread = threading.Thread(target=make_request, args=(a,))
thread.start()

Categories

Resources