I'm trying to build a small site with server push functionality on the Flask micro web framework, but I don't know whether there is a framework that supports this directly.
I used Juggernaut, but it no longer seems to work with the current version of redis-py, and Juggernaut was recently deprecated.
Does anyone have a suggestion for my case?
Have a look at Server-Sent Events (SSE). SSE is a
browser API that lets you keep a socket open to your server, subscribing to a
stream of updates. For more information, read Alex MacCaw's (the author of
Juggernaut) post on why he killed Juggernaut and why the simpler
Server-Sent Events are in many cases a better tool for the job than
WebSockets.
The protocol is really simple. Just add the mimetype text/event-stream to your
response. The browser will keep the connection open and listen for updates. An event
sent from the server is a line of text starting with data:, terminated by a blank line.
data: this is a simple message
<blank line>
If you want to exchange structured data, just dump your data as JSON and send the JSON over the wire.
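For illustration, here is a minimal sketch of a Flask view that pushes JSON-encoded events (the route name and payload are made up for this example, not part of the chat application below):

import json
import time

import flask

app = flask.Flask(__name__)

@app.route('/json-stream')
def json_stream():
    def generate():
        for n in range(3):
            payload = json.dumps({'count': n})
            # one SSE event per JSON document
            yield 'data: {}\n\n'.format(payload)
            time.sleep(1)
    return flask.Response(generate(), mimetype='text/event-stream')

On the client, JSON.parse(event.data) turns each event back into an object.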
An advantage is that you can use SSE in Flask without the need for an extra
server. There is a simple chat application example on GitHub which
uses Redis as a pub/sub backend.
import datetime

import flask
import redis

app = flask.Flask(__name__)
red = redis.StrictRedis()

def event_stream():
    pubsub = red.pubsub()
    pubsub.subscribe('chat')
    for message in pubsub.listen():
        print(message)
        yield 'data: %s\n\n' % message['data']

@app.route('/post', methods=['POST'])
def post():
    message = flask.request.form['message']
    user = flask.session.get('user', 'anonymous')
    now = datetime.datetime.now().replace(microsecond=0).time()
    red.publish('chat', u'[%s] %s: %s' % (now.isoformat(), user, message))

@app.route('/stream')
def stream():
    return flask.Response(event_stream(),
                          mimetype="text/event-stream")
You do not need to use gunicorn to run the
example app. Just make sure to use threading when running the app, because
otherwise the SSE connection will block your development server:
if __name__ == '__main__':
    app.debug = True
    app.run(threaded=True)
On the client side you just need a JavaScript handler function which will be called when a new
message is pushed from the server.
var source = new EventSource('/stream');
source.onmessage = function (event) {
    alert(event.data);
};
Server-Sent Events are supported by recent Firefox, Chrome and Safari browsers.
Internet Explorer does not yet support Server-Sent Events, but is expected to support them in
version 10. There are two recommended polyfills to support older browsers:
EventSource.js
jquery.eventsource
Redis is overkill: use Server-Sent Events (SSE)
Late to the party (as usual), but IMHO using Redis may be overkill.
As long as you're working in Python+Flask, consider using generator functions as described in this excellent article by Panisuan Joe Chasinga. The gist of it is:
In your client index.html
var targetContainer = document.getElementById("target_div");
var eventSource = new EventSource("/stream")
eventSource.onmessage = function(e) {
    targetContainer.innerHTML = e.data;
};
...
<div id="target_div">Watch this space...</div>
In your Flask server:
import time

from flask import Flask, Response, render_template

app = Flask(__name__)

def get_message():
    '''this could be any function that blocks until data is ready'''
    time.sleep(1.0)
    s = time.ctime(time.time())
    return s

@app.route('/')
def root():
    return render_template('index.html')

@app.route('/stream')
def stream():
    def eventStream():
        while True:
            # wait for source data to be available, then push it
            yield 'data: {}\n\n'.format(get_message())
    return Response(eventStream(), mimetype="text/event-stream")
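For a quick sanity check without a browser, here is a hedged sketch using the requests library (it assumes the development server runs on localhost:5000):

import requests

with requests.get('http://localhost:5000/stream', stream=True) as response:
    for line in response.iter_lines(decode_unicode=True):
        if line and line.startswith('data:'):
            print(line)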
As a follow-up to @peter-hoffmann's answer, I've written a Flask extension specifically to handle server-sent events. It's called Flask-SSE, and it's available on PyPI. To install it, run:
$ pip install flask-sse
You can use it like this:
from flask import Flask
from flask_sse import sse

app = Flask(__name__)
app.config["REDIS_URL"] = "redis://localhost"
app.register_blueprint(sse, url_prefix='/stream')

@app.route('/send')
def send_message():
    sse.publish({"message": "Hello!"}, type='greeting')
    return "Message sent!"
And to connect to the event stream from JavaScript, it works like this:
var source = new EventSource("{{ url_for('sse.stream') }}");
source.addEventListener('greeting', function(event) {
    var data = JSON.parse(event.data);
    // do what you want with this data
}, false);
Documentation is available on ReadTheDocs. Note that you'll need a running Redis server to handle pub/sub.
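Note that the snippet above relies on Jinja's url_for, so it has to be delivered through a rendered template; a minimal sketch (the route and template name are assumptions, not from the Flask-SSE docs):

from flask import render_template

@app.route('/')
def index():
    # index.html contains the EventSource snippet above
    return render_template('index.html')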
As a committer of https://github.com/WolfgangFahl/pyFlaskBootstrap4 I ran into the same need and created a Flask blueprint for Server-Sent Events that has no dependency on Redis.
This solution builds on the other answers that have been given here in the past.
https://github.com/WolfgangFahl/pyFlaskBootstrap4/blob/main/fb4/sse_bp.py has the source code (see also sse_bp.py below).
There are unit tests at https://github.com/WolfgangFahl/pyFlaskBootstrap4/blob/main/tests/test_sse.py
The idea is that you can use different modes to create your SSE stream:
by providing a function
by providing a generator
by using a PubSub helper class
by using the PubSub helper class together with pydispatch (see the usage sketch after the sse_bp.py listing below).
As of 2021-02-12 this is alpha code which I want to share nevertheless. Please comment here or file issues in the project.
There is a demo at http://fb4demo.bitplan.com/events and a description of the example use, e.g. for a progress bar or time display, at: http://wiki.bitplan.com/index.php/PyFlaskBootstrap4#Server_Sent_Events
Example client JavaScript/HTML code:
<div id="event_div">Watch this space...</div>
<script>
function fillContainerFromSSE(id,url) {
var targetContainer = document.getElementById(id);
var eventSource = new EventSource(url)
eventSource.onmessage = function(e) {
targetContainer.innerHTML = e.data;
};
};
fillContainerFromSSE("event_div","/eventfeed");
</script>
Example server-side code:
def getTimeEvent(self):
    '''
    get the next time stamp
    '''
    time.sleep(1.0)
    s = datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S')
    return s

def eventFeed(self):
    '''
    create a Server Sent Event Feed
    '''
    sse = self.sseBluePrint
    # stream from the given function
    return sse.streamFunc(self.getTimeEvent)
sse_bp.py
'''
Created on 2021-02-06

@author: wf
'''
import logging
from queue import Queue

from flask import Blueprint, Response, request, abort, stream_with_context
from pydispatch import dispatcher


class SSE_BluePrint(object):
    '''
    a blueprint for server-sent events
    '''
    def __init__(self, app, name: str, template_folder: str = None, debug=False, withContext=False):
        '''
        Constructor
        '''
        self.name = name
        self.debug = debug
        self.withContext = withContext
        if template_folder is not None:
            self.template_folder = template_folder
        else:
            self.template_folder = 'templates'
        self.blueprint = Blueprint(name, __name__, template_folder=self.template_folder)
        self.app = app
        app.register_blueprint(self.blueprint)

        @self.app.route('/sse/<channel>')
        def subscribe(channel):
            def events():
                yield from PubSub.subscribe(channel)
            return self.streamGen(events())

    def streamSSE(self, ssegenerator):
        '''
        stream the Server Sent Events for the given SSE generator
        '''
        response = None
        if self.withContext:
            if request.headers.get('accept') == 'text/event-stream':
                response = Response(stream_with_context(ssegenerator), content_type='text/event-stream')
            else:
                response = abort(404)
        else:
            response = Response(ssegenerator, content_type='text/event-stream')
        return response

    def streamGen(self, gen):
        '''
        stream the results of the given generator
        '''
        ssegen = self.generateSSE(gen)
        return self.streamSSE(ssegen)

    def streamFunc(self, func, limit=-1):
        '''
        stream a generator based on the given function

        Args:
            func: the function to convert to a generator
            limit(int): optional limit of how often the generator should be applied, -1 for endless

        Returns:
            an SSE Response stream
        '''
        gen = self.generate(func, limit)
        return self.streamGen(gen)

    def generate(self, func, limit=-1):
        '''
        create an SSE generator from a given function

        Args:
            func: the function to convert to a generator
            limit(int): optional limit of how often the generator should be applied, -1 for endless

        Returns:
            a generator for the function
        '''
        count = 0
        while limit == -1 or count < limit:
            # wait for source data to be available, then push it
            count += 1
            result = func()
            yield result

    def generateSSE(self, gen):
        for result in gen:
            yield 'data: {}\n\n'.format(result)

    def enableDebug(self, debug: bool):
        '''
        set my debugging

        Args:
            debug(bool): True if debugging should be switched on
        '''
        self.debug = debug
        if self.debug:
            logging.basicConfig(level=logging.DEBUG, format='%(asctime)s.%(msecs)03d %(levelname)s:\t%(message)s', datefmt='%Y-%m-%d %H:%M:%S')

    def publish(self, message: str, channel: str = 'sse', debug=False):
        """
        Publish data as a server-sent event.

        Args:
            message(str): the message to send
            channel(str): If you want to direct different events to different
                clients, you may specify a channel for this event to go to.
                Only clients listening to the same channel will receive this event.
                Defaults to "sse".
            debug(bool): if True enable debugging
        """
        return PubSub.publish(channel=channel, message=message, debug=debug)

    def subscribe(self, channel, limit=-1, debug=False):
        def stream():
            for message in PubSub.subscribe(channel, limit, debug=debug):
                yield str(message)
        return self.streamGen(stream())


class PubSub:
    '''
    redis pubsub duck replacement
    '''
    pubSubByChannel = {}

    def __init__(self, channel: str = 'sse', maxsize: int = 15, debug=False, dispatch=False):
        '''
        Args:
            channel(str): the channel name
            maxsize(int): the maximum size of the queue
            debug(bool): whether debugging should be switched on
            dispatch(bool): if True use the pydispatch library - otherwise only a queue
        '''
        self.channel = channel
        self.queue = Queue(maxsize=maxsize)
        self.debug = debug
        self.receiveCount = 0
        self.dispatch = dispatch
        if dispatch:
            dispatcher.connect(self.receive, signal=channel, sender=dispatcher.Any)

    @staticmethod
    def reinit():
        '''
        reinitialize the pubSubByChannel dict
        '''
        PubSub.pubSubByChannel = {}

    @staticmethod
    def forChannel(channel):
        '''
        return a PubSub for the given channel

        Args:
            channel(str): the id of the channel

        Returns:
            PubSub: the PubSub for the given channel
        '''
        if channel in PubSub.pubSubByChannel:
            pubsub = PubSub.pubSubByChannel[channel]
        else:
            pubsub = PubSub(channel)
            PubSub.pubSubByChannel[channel] = pubsub
        return pubsub

    @staticmethod
    def publish(channel: str, message: str, debug=False):
        '''
        publish a message via the given channel

        Args:
            channel(str): the id of the channel to use
            message(str): the message to publish/send

        Returns:
            PubSub: the pub sub for the channel
        '''
        pubsub = PubSub.forChannel(channel)
        pubsub.debug = debug
        pubsub.send(message)
        return pubsub

    @staticmethod
    def subscribe(channel, limit=-1, debug=False):
        '''
        subscribe to the given channel

        Args:
            channel(str): the id of the channel to use
            limit(int): limit the maximum amount of messages to be received
            debug(bool): if True debugging info is printed
        '''
        pubsub = PubSub.forChannel(channel)
        pubsub.debug = debug
        return pubsub.listen(limit)

    def send(self, message):
        '''
        send the given message
        '''
        sender = object()
        if self.dispatch:
            dispatcher.send(signal=self.channel, sender=sender, msg=message)
        else:
            self.receive(sender, message)

    def receive(self, sender, message):
        '''
        receive a message
        '''
        if sender is not None:
            self.receiveCount += 1
            if self.debug:
                logging.debug("received %d:%s" % (self.receiveCount, message))
            self.queue.put(message)

    def listen(self, limit=-1):
        '''
        listen to my channel

        this is a generator for the queue content of received messages

        Args:
            limit(int): limit the maximum amount of messages to be received

        Return:
            generator: received messages to be yielded
        '''
        # yield received messages until the optional limit is reached
        while limit == -1 or self.receiveCount <= limit:
            yield self.queue.get()

    def unsubscribe(self):
        '''
        unsubscribe me
        '''
        if self.dispatch:
            dispatcher.disconnect(self.receive, signal=self.channel)
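For illustration, here is a hedged usage sketch of the PubSub mode (the Flask wiring, route, and channel name are assumptions for this example, not taken from the project's docs):

from flask import Flask

app = Flask(__name__)
sse = SSE_BluePrint(app, 'sse')

@app.route('/notify')
def notify():
    # anybody streaming GET /sse/progress receives this message
    sse.publish('step 1 done', channel='progress')
    return 'published'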
Related
I'm trying to understand whether a gRPC server using streams is able to wait for all client messages to be read in before sending responses.
I have a trivial application where I send in several numbers I'd like to add and return.
I've set up a basic proto file to test this:
syntax = "proto3";
message CalculateRequest{
int64 x = 1;
int64 y = 2;
};
message CalculateReply{
int64 result = 1;
}
service Svc {
rpc CalculateStream (stream CalculateRequest) returns (stream CalculateReply);
}
On my server side I have implemented the following code, which returns an answer message as each message is received:
import logging
from concurrent import futures

import grpc

import contracts_pb2
import contracts_pb2_grpc


class CalculatorServicer(contracts_pb2_grpc.SvcServicer):
    def CalculateStream(self, request_iterator, context):
        for request in request_iterator:
            resultToOutput = request.x + request.y
            yield contracts_pb2.CalculateReply(result=resultToOutput)


def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    contracts_pb2_grpc.add_SvcServicer_to_server(
        CalculatorServicer(), server)
    server.add_insecure_port('localhost:9000')
    server.start()
    server.wait_for_termination()


if __name__ == '__main__':
    print("We're up")
    logging.basicConfig()
    serve()
I'd like to tweak this to first read in all the numbers and then send these out at a later stage - something like the following:
class CalculatorServicer(contracts_pb2_grpc.SvcServicer):
    def CalculateStream(self, request_iterator, context):
        listToReturn = []
        for request in request_iterator:
            listToReturn.append(request.x + request.y)
        # ...
        # do some other stuff first before returning
        for item in listToReturn:
            yield contracts_pb2.CalculateReply(result=item)
Currently, my implementation to write out later doesn't work, as the code at the bottom is never reached. Is it by design that the connection seems to "close" before reaching it?
The grpc.io website suggests that this should be possible with BiDirectional streaming:
for example, the server could wait to receive all the client messages before writing its responses, or it could alternately read a message then write a message, or some other combination of reads and writes.
Thanks in advance for any help :)
The issue here is the definition of "all client messages." At the transport level, the server has no way of knowing whether the client has finished, independent of the client closing its connection.
You need to add some indication that the client has finished sending requests to the protocol: either add a bool field to the existing CalculateRequest, or add a top-level oneof with one of the options being something like a StopSendingRequests message.
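To illustrate the first option, here is a hedged sketch of the servicer, assuming the proto gains a field like bool done = 3; on CalculateRequest (the field name is an assumption, not from the question):

class CalculatorServicer(contracts_pb2_grpc.SvcServicer):
    def CalculateStream(self, request_iterator, context):
        results = []
        for request in request_iterator:
            if request.done:
                # the client has announced it is finished sending
                break
            results.append(request.x + request.y)
        # all inputs are in; now write the buffered responses
        for result in results:
            yield contracts_pb2.CalculateReply(result=result)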
I want to use grpc-python in the following scenario, but I don't know how to realize it.
The scenario is that the Python server uses a class to calculate and update an instance's state, then sends that state to the corresponding client; on the client side, more than one client needs to communicate with the server, each getting its own result without interference from the others.
Specifically, suppose there is a class with initial value self.i = 0; each time a client calls the class's update function, it does self.i = self.i + 1 and returns self.i. In fact, two clients may call the update function simultaneously, e.g. client1 calls update for the third time while client2 calls it for the first time.
I think this could be solved by creating a thread for each client to avoid conflicts: if a new client calls, a new thread is created; if an existing client calls, its existing thread is used. But I don't know how to realize this.
Hope you can help me. Thanks in advance.
I think I solved this problem by myself. If you have any better solutions, you can post them here.
I edited the helloworld example from the grpc-python introduction to explain my aim.
For helloworld.proto:
syntax = "proto3";
option java_multiple_files = true;
option java_package = "io.grpc.examples.helloworld";
option java_outer_classname = "HelloWorldProto";
option objc_class_prefix = "HLW";
package helloworld;
// The greeting service definition.
service Greeter {
// Sends a greeting
rpc SayHello (HelloRequest) returns (HelloReply) {}
rpc Unsubscribe (HelloRequest) returns (HelloReply) {}
}
// The request message containing the user's name.
message HelloRequest {
string name = 1;
}
// The response message containing the greetings
message HelloReply {
string message = 1;
}
I added an Unsubscribe function to allow one specific client to disconnect from the server.
In hello_server.py:
import grpc
import helloworld_pb2
import helloworld_pb2_grpc
import threading
from threading import RLock
import time
from concurrent import futures
import logging


class Calculate:
    def __init__(self):
        self.i = 0

    def add(self):
        self.i += 1
        return self.i


class PeerSet(object):
    def __init__(self):
        self._peers_lock = RLock()
        self._peers = {}
        self.instances = {}

    def connect(self, peer):
        with self._peers_lock:
            if peer not in self._peers:
                print("Peer {} connecting".format(peer))
                self._peers[peer] = 1
                a = Calculate()
                self.instances[peer] = a
                output = a.add()
                return output
            else:
                self._peers[peer] += 1
                a = self.instances[peer]
                output = a.add()
                return output

    def disconnect(self, peer):
        print("Peer {} disconnecting".format(peer))
        with self._peers_lock:
            if peer not in self._peers:
                raise RuntimeError("Tried to disconnect peer '{}' but it was never connected.".format(peer))
            del self._peers[peer]
            del self.instances[peer]

    def peers(self):
        with self._peers_lock:
            return self._peers.keys()


class Greeter(helloworld_pb2_grpc.GreeterServicer):
    def __init__(self):
        self._peer_set = PeerSet()

    def _record_peer(self, context):
        return self._peer_set.connect(context.peer())

    def SayHello(self, request, context):
        output = self._record_peer(context)
        print("[thread {}] Peers: {}, output: {}".format(threading.currentThread().ident, self._peer_set.peers(), output))
        time.sleep(1)
        return helloworld_pb2.HelloReply(message='Hello, {}, {}!'.format(request.name, output))

    def Unsubscribe(self, request, context):
        self._peer_set.disconnect(context.peer())
        return helloworld_pb2.HelloReply(message='{} disconnected!'.format(context.peer()))


def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    helloworld_pb2_grpc.add_GreeterServicer_to_server(Greeter(), server)
    server.add_insecure_port('[::]:50051')
    server.start()
    server.wait_for_termination()


if __name__ == '__main__':
    logging.basicConfig()
    serve()
The use of context.peer() is adapted from Richard Belleville's answer in this post. You can change the add() function to any other function that updates the instance's state.
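For instance, here is a hedged sketch of an alternative per-peer state class (purely illustrative, not from the referenced post) that could replace Calculate:

class Accumulator:
    '''sums up the values reported by one peer'''
    def __init__(self):
        self.total = 0

    def add(self, value=1):
        self.total += value
        return self.total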
In hello_client.py
from __future__ import print_function
import logging

import grpc
import helloworld_pb2
import helloworld_pb2_grpc


def run():
    # NOTE(gRPC Python Team): .close() is possible on a channel and should be
    # used in circumstances in which the with statement does not fit the needs
    # of the code.
    with grpc.insecure_channel('localhost:50051') as channel:
        stub = helloworld_pb2_grpc.GreeterStub(channel)
        response = stub.SayHello(helloworld_pb2.HelloRequest(name='you'))
        print("Greeter client received: " + response.message)
        response = stub.SayHello(helloworld_pb2.HelloRequest(name='Tom'))
        print("Greeter client received: " + response.message)
        response = stub.SayHello(helloworld_pb2.HelloRequest(name='Jerry'))
        print("Greeter client received: " + response.message)
        stub.Unsubscribe(helloworld_pb2.HelloRequest(name="end"))


if __name__ == '__main__':
    logging.basicConfig()
    run()
If we run several hello_client.py instances simultaneously, the server can distinguish the different clients and send the correct corresponding info to each of them.
I have a web app written in CherryPy: a user uploads a file, then some lengthy operation begins, passing through several stages. I want notifications for these stages to be pushed to all the connected clients. But I don't know how to communicate between processes. I guess I would have to launch the lengthy operation in a separate process, but then I don't know how to pass the "advanced to stage N" messages to the "server-sending function".
Conceptually, it would be something like this:
SSEtest.py:
from pathlib import Path
from time import sleep

import cherrypy


def lengthy_operation(name, stream):
    for stage in range(10):
        print(f'stage {stage}... ', end='')
        sleep(2)
        print('done')
    print('finished')


class SSETest():
    @cherrypy.expose
    def index(self):
        return Path('SSEtest.html').read_text()

    @cherrypy.expose
    def upload(self, file):
        name = file.filename.encode('iso-8859-1').decode('utf-8')
        lengthy_operation(name, file.file)
        return 'OK'

    @cherrypy.expose
    def stage(self):
        cherrypy.response.headers['Content-Type'] = 'text/event-stream;charset=utf-8'

        def lengthy_operation():
            for stage in range(5):
                yield f'data: stage {stage}... \n\n'
                sleep(2)
                yield 'data: done\n\n'
            yield 'data: finished\n\n'

        return lengthy_operation()

    stage._cp_config = {'response.stream': True, 'tools.encode.encoding': 'utf-8'}


cherrypy.quickstart(SSETest())
SSEtest.html:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>SSE Test</title>
</head>
<body>
    <h1>SSE Test</h1>
    <div>
        <form id="load_file_form" action="" enctype="multipart/form-data">
            <label for="load_file">Load a file: </label>
            <input type="file" id="load_file" name="load_file">
            <progress max="100" value="0" id="progress_bar"></progress>
        </form>
    </div>
    <div id="status_messages">
        <h3>Stages:</h3>
    </div>
    <script>
        const load_file = document.getElementById('load_file');
        const progress_bar = document.getElementById('progress_bar');

        function update_progress_bar(event) {
            if (event.lengthComputable) {
                progress_bar.value = Math.round((event.loaded / event.total) * 100);
            }
        }

        load_file.onchange = function (event) {
            let the_file = load_file.files[0];
            let formData = new FormData();
            let connection = new XMLHttpRequest();
            formData.append('file', the_file, the_file.name);
            connection.open('POST', 'upload', true);
            connection.upload.onprogress = update_progress_bar;
            connection.onload = function (event) {
                if (connection.status != 200) {
                    alert('Error! ' + event);
                }
            };
            connection.send(formData);
        };

        const status_messages = document.getElementById("status_messages");
        const sse = new EventSource("stage");

        sse.onopen = function (event) {
            let new_message = document.createElement("p");
            new_message.innerHTML = "Connection established: " + event.type;
            status_messages.appendChild(new_message);
        };

        sse.onmessage = function (event) {
            let new_message = document.createElement("p");
            new_message.innerHTML = event.data;
            status_messages.appendChild(new_message);
        };

        sse.onerror = function (event) {
            let new_message = document.createElement("p");
            if (event.readyState == EventSource.CLOSED) {
                new_message.innerHTML = "Connections closed";
            } else {
                new_message.innerHTML = "Error: " + event.type;
            }
            status_messages.appendChild(new_message);
        };
    </script>
</body>
</html>
I need lengthy_operation() to be called only once, when the file is uploaded, and the messages it generates to be sent to all the clients. Right now it works with the local function, which is not what I want. How can I use the outer function and pass its messages into the stage() method?
I want notifications for these stages to be pushed to all the connected clients.
I suspect in the end you will want more control than that, but I will answer your question as it was asked. Later, you may want to build on the example below and filter the broadcasted notifications based on the user's session, or based on a certain starting timestamp, or some other relevant concept.
Each "connected client" is effectively hanging on a long-running request to /stage which the server will use to stream events to the client. In your example, each client will begin that request immediately and leave it open until the server terminates the stream. You can also close the stream from the client using close() on the EventSource.
Basic Solution
You asked how to have the /stage handler broadcast or mirror its events to all of the currently-connected clients. There are many ways you could accomplish this, but in a nutshell you want the lengthy_operation function to either post events to all /stage handler readers or to a persistent shared location from which all /stage handlers read. I will show a way to encapsulate the first idea described above.
Consider a generic stream event class that serializes to data: <some message>:
class StreamEvent:
    def __init__(self, message: str):
        self.message = message

    def serialize(self) -> bytes:
        return f'data: {self.message}\n\n'.encode('utf-8')
and a more specific derived case for file-related stream events:
class FileStreamEvent(StreamEvent):
    def __init__(self, message: str, name: str):
        super().__init__(message)
        self.name = name

    def serialize(self) -> bytes:
        return f'data: file: {self.name}: {self.message}\n\n'.encode('utf-8')
You can create an extremely primitive publish/subscribe type of container where /stage can then subscribe listeners and lengthy_operation() can publish StreamEvent instances to all listeners:
class StreamSource:
    def __init__(self):
        self.listeners: List[Queue] = []

    def put(self, event: StreamEvent):
        for listener in self.listeners:
            listener.put_nowait(event)

    def get(self):
        listener = Queue()
        self.listeners.append(listener)
        try:
            while True:
                event = listener.get()
                yield event.serialize()
        finally:
            self.listeners.remove(listener)
In StreamSource.get(), you likely want to create an end case (e.g. check for a "close" or "finish" event) to exit from the generic while True and you likely want to set a timeout on the blocking Queue.get() call. But for the sake of this example, I kept everything basic.
Now, lengthy_operation() just needs a reference to a StreamSource:
def lengthy_operation(events: StreamSource, name: str, stream: BinaryIO):
    for stage in range(10):
        events.put(FileStreamEvent(f'stage {stage}: begin', name))
        sleep(2)
        events.put(FileStreamEvent(f'stage {stage}: end', name))
    events.put(FileStreamEvent('finished', name))
SSETest can then provide a shared instance of StreamSource to each lengthy_operation() call and SSETest.stage() can use StreamSource.get() to register a listener on this shared instance:
class SSETest:
    _stream_source: StreamSource = StreamSource()

    @cherrypy.expose
    def index(self):
        return Path('SSETest.html').read_text()

    @cherrypy.expose
    def upload(self, file):
        name = file.filename.encode('iso-8859-1').decode('utf-8')
        lengthy_operation(self._stream_source, name, file.file)
        return 'OK'

    @cherrypy.expose
    def stage(self):
        cherrypy.response.headers['Cache-Control'] = 'no-cache'
        cherrypy.response.headers['Content-Type'] = 'text/event-stream'

        def stream():
            yield from self._stream_source.get()

        return stream()

    stage._cp_config = {'response.stream': True}
This is a complete[1] example of how to resolve your immediate question but you will most likely want to adapt this as you work closer to the final user experience you probably have in mind.
[1]: I left out the imports for readability, so here they are:
from dataclasses import dataclass
from pathlib import Path
from queue import Queue
from time import sleep
from typing import BinaryIO, List

import cherrypy
Follow-on Exit Conditions
Since you are using cherrypy.quickstart(), in the minimal viable solution above you will have to forcefully exit the SSETest service as I did not assume any graceful "stop" behaviors for you. The first solution explicitly points this out but offers no solution for the sake of readability.
Let's look at a couple ways to provide some initial graceful "stop" conditions:
Add a stop condition to StreamSource
First, at least add a reasonable stop condition to StreamSource. For instance, add a running attribute that allows the StreamSource.get() while loop to exit gracefully. Next, set a reasonable Queue.get() timeout so the loop can periodically test this running attribute between processing messages. Next, ensure at least some relevant CherryPy bus messages trigger this stop behavior. Below, I have rolled all of this behavior into the StreamSource class but you could also register a separate application level CherryPy plugin to handle calling into StreamSource.stop() rather than making StreamSource a plugin. I will demonstrate what that looks like when I add a separate signal handler.
from queue import Empty, Queue

from cherrypy.process import plugins, wspbus


class StreamSource(plugins.SimplePlugin):
    def __init__(self, bus: wspbus.Bus):
        super().__init__(bus)
        self.subscribe()
        self.running = True
        self.listeners: List[Queue] = []

    def graceful(self):
        self.stop()

    def exit(self):
        self.stop()

    def stop(self):
        self.running = False

    def put(self, event: StreamEvent):
        for listener in self.listeners:
            listener.put_nowait(event)

    def get(self):
        listener = Queue()
        self.listeners.append(listener)
        try:
            while self.running:
                try:
                    event = listener.get(timeout=1.0)
                    yield event.serialize()
                except Empty:
                    pass
        finally:
            self.listeners.remove(listener)
Now, SSETest will need to initialize StreamSource with a bus value since the class is now a SimplePlugin:
_stream_source: StreamSource = StreamSource(cherrypy.engine)
You will find that this solution gets you much closer to what you likely want in terms of user experience. Issue a keyboard interrupt and CherryPy will begin stopping the system, but the first graceful keyboard interrupt will not publish a stop message; for that you need to send a second keyboard interrupt.
Add a SIGINT handler to capture keyboard interrupts
Due to the way cherrypy.quickstart works with signal handlers, you may then want to register a SIGINT handler as a CherryPy-compatible SignalHandler plugin to gracefully stop the StreamSource at the first keyboard interrupt.
Here is an example:
class SignalHandler(plugins.SignalHandler):
    def __init__(self, bus: wspbus.Bus, sse):
        super().__init__(bus)
        self.handlers = {
            'SIGINT': self.handle_SIGINT,
        }
        self.sse = sse

    def handle_SIGINT(self):
        self.sse.stop()
        raise KeyboardInterrupt()
Note that in this case I am demonstrating a generic application level handler which you can then configure and initialize by altering your startup cherrypy.quickstart() logic as follows:
sse = SSETest()
SignalHandler(cherrypy.engine, sse).subscribe()
cherrypy.quickstart(sse)
For this example, I expose a generic application SSETest.stop method to encapsulate the desired behavior:
class SSETest:
    _stream_source: StreamSource = StreamSource(cherrypy.engine)

    def stop(self):
        self._stream_source.stop()
Wrap-up analysis
I am not a CherryPy user and I only started looking at it for the first time yesterday just to answer your question, so I will leave "CherryPy best practices" up to your discretion.
In reality, your problem is a very generic combination of the following Python questions:
how can I implement a simple publish/subscribe pattern? (answered with Queue);
how can I create an exit condition for the subscriber loop? (answered with Queue.get()'s timeout parameter and a running attribute)
how can I influence the exit condition with keyboard interrupts? (answered with a CherryPy-specific signal handler, but this merely sits on top of concepts you will find in Python's built in signal module)
You can solve all of these questions in many ways and some lean more toward generic "Pythonic" solutions (my preference where it makes sense) while others leverage CherryPy-centric concepts (and that makes sense in cases where you want to augment CherryPy behavior rather than rewrite or break it).
As an example, you could use CherryPy bus messages to convey stream messages, but to me that entangles your application logic a bit too much in CherryPy-specific features, so I would probably find a middle ground where you handle your application features generically (so as not to tie yourself to CherryPy) as seen in how my StreamSource example uses a standard Python Queue pattern. You could choose to make StreamSource a plugin so that it can respond to certain CherryPy bus messages directly (as I show above), or you could have a separate plugin that knows to call into the relevant application-specific domains such as StreamSource.stop() (similar to what I show with SignalHandler).
Last, all of your questions are great, but they have all likely been answered before on SO as generic Python questions, so while I am tying the answers here to your CherryPy problem space I also want to help you (and future readers) realize how to think about these particular problems more abstractly beyond CherryPy.
I'm trying to create a simple Kafka producer based on confluent_kafka. My code is the following:
#!/usr/bin/env python
import json

from confluent_kafka import Producer


def delivery_report(err, msg):
    """Called once for each message produced to indicate delivery result.
    Triggered by poll() or flush().
    see https://github.com/confluentinc/confluent-kafka-python/blob/master/README.md"""
    if err is not None:
        print('Message delivery failed: {}'.format(err))
    else:
        print('Message delivered to {} [{}]'.format(
            msg.topic(), msg.partition()))


class MySource:
    """Kafka producer"""
    def __init__(self, kafka_hosts, topic):
        """
        :kafka_hosts list(str): hostnames or 'host:port' of Kafka
        :topic str: topic to produce messages to
        """
        self.topic = topic
        # see https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md
        config = {
            'metadata.broker.list': ','.join(kafka_hosts),
            'group.id': 'mygroup',
        }
        self.producer = Producer(config)

    @staticmethod
    def main():
        topic = 'my-topic'
        message = json.dumps({
            'measurement': [1, 2, 3]})
        mys = MySource(['kafka'], topic)
        mys.producer.produce(
            topic, message, on_delivery=delivery_report)
        mys.producer.flush()


if __name__ == "__main__":
    MySource.main()
The first time I use a topic (here: "my-topic"), Kafka reacts with "Auto creation of topic my-topic with 1 partitions and replication factor 1 is successful (kafka.server.KafkaApis)". However, the callback function (on_delivery=delivery_report) is never called, and the producer hangs at flush() (it terminates if I set a timeout for flush), neither the first time nor on subsequent runs. The Kafka logs do not show anything if I use an existing topic.
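One way to make the failure visible is to poll() the producer so the delivery callback can run, and to bound flush(), which returns the number of messages still undelivered. A hedged sketch reusing the classes above (not a confirmed fix):

import json

message = json.dumps({'measurement': [1, 2, 3]})
mys = MySource(['kafka'], 'my-topic')      # 'kafka' must resolve to a reachable broker
mys.producer.produce('my-topic', message, on_delivery=delivery_report)
mys.producer.poll(1.0)                     # serve the delivery callback queue
remaining = mys.producer.flush(10.0)       # give up after 10 seconds
if remaining:
    print('{} message(s) still undelivered - check broker connectivity'.format(remaining))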
I'm using Flask and Tweepy to search for live tweets. On the front-end I have a user text input, and button called "Search". Ideally, when a user gives a search-term into the input and clicks the "Search" button, the Tweepy should listen for the new search-term and stop the previous search-term stream. When the "Search" button is clicked it executes this function:
@app.route('/search', methods=['POST'])
# gets search-keyword and starts stream
def streamTweets():
    search_term = request.form['tweet']
    search_term_hashtag = '#' + search_term
    # instantiate listener
    listener = StdOutListener()
    # stream object uses listener we instantiated above to listen for data
    stream = tweepy.Stream(auth, listener)
    if stream is not None:
        print("Stream disconnected...")
        stream.disconnect()
    stream.filter(track=[search_term or search_term_hashtag], async=True)
    redirect('/stream')  # execute '/stream' sse
    return render_template('index.html')
The /stream route that is executed in the second to last line in above code is as follows:
@app.route('/stream')
def stream():
    # we will use a Pub/Sub process to send real-time tweets to the client
    def event_stream():
        # instantiate pubsub
        pubsub = red.pubsub()
        # subscribe to tweet_stream channel
        pubsub.subscribe('tweet_stream')
        # initiate server-sent events on messages pushed to channel
        for message in pubsub.listen():
            yield 'data: %s\n\n' % message['data']
    return Response(stream_with_context(event_stream()), mimetype="text/event-stream")
My code works fine, in the sense that it starts a new stream and searches for a given term whenever the "Search" button is clicked, but it does not stop the previous search. For example, if my first search term was "NYC" and then I wanted to search for a different term, say "Los Angeles", it will give me results for both "NYC" and "Los Angeles", which is not what I want. I want just "Los Angeles" to be searched. How do I fix this? In other words, how do I stop the previous stream? I looked through other previous threads, and I know I have to use stream.disconnect(), but I'm not sure how to implement this in my code. Any help or input would be greatly appreciated. Thanks so much!!
Below is some code that will cancel old streams when a new stream is created. It works by adding new streams to a global list, and then calling stream.disconnect() on all streams in the list whenever a new stream is created.
diff --git a/app.py b/app.py
index 1e3ed10..f416ddc 100755
--- a/app.py
+++ b/app.py
@@ -23,6 +23,8 @@ auth.set_access_token(access_token, access_token_secret)
 
 app = Flask(__name__)
 red = redis.StrictRedis()
+# Add a place to keep track of current streams
+streams = []
 
 @app.route('/')
 def index():
@@ -32,12 +34,18 @@ def index():
 @app.route('/search', methods=['POST'])
 # gets search-keyword and starts stream
 def streamTweets():
+    # cancel old streams
+    for stream in streams:
+        stream.disconnect()
+
     search_term = request.form['tweet']
     search_term_hashtag = '#' + search_term
     # instantiate listener
     listener = StdOutListener()
     # stream object uses listener we instantiated above to listen for data
     stream = tweepy.Stream(auth, listener)
+    # add this stream to the global list
+    streams.append(stream)
     stream.filter(track=[search_term or search_term_hashtag],
                   async=True)  # make sure stream is non-blocking
     redirect('/stream')  # execute '/stream' sse
What this does not solve is the problem of session management. With your current setup a search by one user will affect the searches of all users. This can be avoided by giving your users some identifier and storing their streams along with their identifier. The easiest way to do this is likely to use Flask's session support. You could also do this with a requestId as Pierre suggested. In either case you will also need code to notice when a user has closed the page and close their stream.
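A hedged sketch of that idea using Flask's session (it reuses auth and StdOutListener from the question, and assumes you store a user_id in the session at login):

streams = {}  # one active Stream per user_id

@app.route('/search', methods=['POST'])
def streamTweets():
    user_id = flask.session['user_id']  # assumed to be set elsewhere
    old_stream = streams.pop(user_id, None)
    if old_stream is not None:
        old_stream.disconnect()  # cancel only this user's previous search
    listener = StdOutListener()
    stream = tweepy.Stream(auth, listener)
    streams[user_id] = stream
    # is_async in newer Tweepy releases; older ones called the flag async
    stream.filter(track=[request.form['tweet']], is_async=True)
    return render_template('index.html')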
Disclaimer: I know nothing about Tweepy, but this appears to be a design issue.
Are you trying to add state to a RESTful API? You may have a design problem.
As JRichardSnape answered, your API shouldn't be the one taking care of canceling a request; it should be done in the front-end. What I mean here is: in the JavaScript / AJAX / etc. code calling this function, add another call, to a new function
@app.route('/cancelSearch', methods=['POST'])
With the "POST" that has the search terms. So long as you don't have state, you can't really do this safely in an async call: Imagine someone else makes the same search at the same time then canceling one will cancel both (remember, you don't have state so you don't know who you're canceling). Perhaps you do need state with your design.
If you must keep using this and don't mind breaking the "stateless" rule, then add a "state" to your request. In this case it's not so bad because you could launch a thread and name it with the userId, then kill the thread every new search
def streamTweets():
    search_term = request.form['tweet']
    userId = request.form['userId']  # If your limit is one request per user at a time. If multiple windows can be opened and you want to follow this limit, store userId in a cookie.
    # Look for any requests currently running with this ID, and cancel them
Alternatively, you could return a requestId, which you would then keep in the front-end and use to call cancelSearch?requestId=$requestId. In cancelSearch, you would have to find the pending request (it sounds like that's in Tweepy, since you're not using your own threads) and disconnect it.
Out of curiosity, I just watched what happens when you search on Google, and it uses a GET request. Have a look (debug tools -> Network; then enter some text and watch the autofill). Google sends a token with every request (every time you type something). It doesn't mean it's used for this, but that's basically what I described. If you don't want a session, then use a unique identifier.
Well, I solved it by using a timer method, but I'm still looking for a more Pythonic way.
from streamer import StreamListener


def stream():
    hashtag = input
    # assign each user an ID (for pubsub)
    StreamListener.userid = random_user_id

    def handler(signum, frame):
        print("Forever is over")
        raise Exception("end of time")

    def main_stream():
        stream = tweepy.Stream(auth, StreamListener())
        stream.filter(track=track, async=True)
        redirect(url_for('map_stream'))

    def close_stream():
        # this closes the client list in redis, but I don't know whether it's working
        obj = redis.client_list(tweet_stream)
        redis_client_list = obj[0]['addr']
        redis.client_kill(redis_client_list)
        stream = tweepy.Stream(auth, StreamListener())
        stream.disconnect()

    import signal
    signal.signal(signal.SIGALRM, handler)
    signal.alarm(300)
    try:
        main_stream()
    except Exception:
        close_stream()
        print("function terminate")