How to abort a client request - python

I have this code in my application, which uses the Django framework:
import os, time

def get(self, request, id=None):
    pid = os.fork()
    if pid == 0:
        self.run()
        time.sleep(600)
    else:
        time.sleep(20)
        return write_response()
I create a child process that produces the data to be returned. I really need to do the work in a child process: the run function uses an external program to calculate the data, and if I don't create a new process, only the first request succeeds (a constraint of the external software).
The child process takes about 10 seconds to do the work. The parent waits 20 seconds and then returns a response using the data calculated by the child. For the client everything works, but on the server I get an exception (broken pipe): by the time the child continues executing, the client has already closed the socket, so an exception is raised. What should I do to fix my problem?
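One minimal sketch of a possible fix (my own assumption, not something stated in the question): let the child do its work and then call os._exit() so it never falls back into Django's request/response handling, which is what ends up writing to the socket the client has already closed.

import os, time

def get(self, request, id=None):
    pid = os.fork()
    if pid == 0:
        # Child: do the work, then exit immediately. os._exit() skips
        # Django's response machinery, so the child never touches the
        # client socket (assumption: nothing after run() is needed here).
        try:
            self.run()
        finally:
            os._exit(0)
    else:
        # Parent: give the child time to produce its data, then respond.
        time.sleep(20)
        return write_response()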

Related

Django + multiprocessing.dummy.Pool + sleep = weird result

In my Django application, I want to do some work in the background when a certain view is requested. To that end, I created a multiprocessing.dummy.Pool of workers, and whenever that URL is called, I start a new process on it. The task executed in the background may have to do some retries, with a certain timeout between them.
Since this whole thing is executed, so to speak, not on a UI thread, I thought I'd use sleep for the timeouts. When I unit-test this arrangement, everything works fine, but when it runs in Django, the thread gets to the sleep statement and never wakes up. When I restart the Django app, the thread gets past the sleep statement and is then immediately killed by the restart. I know I could schedule retries using Timers, but I wanted a simpler solution.
Here's a simplified version of my code:
from multiprocessing.dummy import Pool
from time import sleep
import logging

# settings, Statuses, Response and get_arg_from_request come from elsewhere in the app
POOL = Pool(settings.POOL_WORKERS)

def background_task(arg):
    refresh = True
    try:
        for i in range(settings.GET_RETRY_LIMIT):
            status, result = get_result(arg, refresh=refresh)  # hypothetical name; the actual call is elided in the original
            refresh = False
            if status is Statuses.OK:
                return result
            if i < settings.GET_RETRY_LIMIT - 1:
                sleep(settings.GET_SLEEP_TIME)
    except Exception as e:
        logging.error(e)
    return []

def do_background_work(arg):
    POOL.apply_async(
        background_task,
        (arg,)
    )

def my_view(request):
    arg = get_arg_from_request(request)
    do_background_work(arg)
    return Response("Ok")
UPD: By the way, it turns out that the workers are most probably being killed by uWSGI's harakiri timeout.
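For reference, a rough sketch of the Timer-based retry scheduling mentioned above (the settings names and Statuses mirror the snippet; get_result is a hypothetical stand-in for the elided fetch call):

import threading

def schedule_task(arg, attempt=0, refresh=True):
    # Re-schedule instead of sleeping inside a pool worker, so no thread
    # is parked in sleep() when the app server restarts or reaps workers.
    status, result = get_result(arg, refresh=refresh)  # hypothetical fetch call
    if status is Statuses.OK or attempt >= settings.GET_RETRY_LIMIT - 1:
        return
    timer = threading.Timer(settings.GET_SLEEP_TIME,
                            schedule_task, args=(arg, attempt + 1, False))
    timer.daemon = True
    timer.start()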

Python Tornado - Asynchronous Request is blocking

The request handlers are as follows:
import threading
import time
import tornado.web

class TestHandler(tornado.web.RequestHandler):  # localhost:8888/test
    @tornado.web.asynchronous
    def get(self):
        t = threading.Thread(target=self.newThread)
        t.start()

    def newThread(self):
        print "new thread called, sleeping"
        time.sleep(10)
        self.write("Awake after 10 seconds!")
        self.finish()

class IndexHandler(tornado.web.RequestHandler):  # localhost:8888/
    def get(self):
        self.write("It is not blocked!")
        self.finish()
When I GET localhost:8888/test, the page loads for 10 seconds and then shows "Awake after 10 seconds!"; while it is loading, if I open localhost:8888/index in a new browser tab, the index page is not blocked and loads instantly. This fits my expectations.
However, while /test is loading, if I open another /test in a new browser tab, it is blocked. The second /test only starts processing after the first has finished.
What mistakes have I made here?
What you are seeing is actually a browser limitation, not an issue with your code. I added some extra logging to your TestHandler to make this clear:
class TestHandler(tornado.web.RequestHandler):  # localhost:8888/test
    @tornado.web.asynchronous
    def get(self):
        print "Thread starting %s" % time.time()
        t = threading.Thread(target=self.newThread)
        t.start()

    def newThread(self):
        print "new thread called, sleeping %s" % time.time()
        time.sleep(10)
        self.write("Awake after 10 seconds! %s" % time.time())
        self.finish()
If I open two curl sessions to localhost/test simultaneously, I get this on the server side:
Thread starting 1402236952.17
new thread called, sleeping 1402236952.17
Thread starting 1402236953.21
new thread called, sleeping 1402236953.21
And this on the client side:
Awake after 10 seconds! 1402236962.18
Awake after 10 seconds! 1402236963.22
Which is exactly what you expect. However in Chromium, I get the same behavior as you. I think that Chromium (perhaps all browsers) will only allow one connection at a time to be opened to the same URL. I confirmed this by making IndexHandler run the same code as TestHandler, except with slightly different log messages. Here's the output when opening two browser windows, one to /test, and one to /index:
index Thread starting 1402237590.03
index new thread called, sleeping 1402237590.03
Thread starting 1402237592.19
new thread called, sleeping 1402237592.19
As you can see both ran concurrently without issue.
I think you picked the "wrong" test for checking parallel GET requests, because you're using a blocking function in your test: time.sleep(). Its effect doesn't really come up when you simply render an HTML page.
What happens is that get() (which handles all GET requests) is blocked while time.sleep() runs, so it cannot process any new GET requests and effectively puts them in a kind of queue.
So if you really want to test sleep(), use Tornado's non-blocking function: tornado.gen.sleep()
Example:
import tornado.web
from tornado import gen

class TestHandler(tornado.web.RequestHandler):
    @gen.coroutine
    def get(self):
        yield self.time_wait()

    @gen.coroutine
    def time_wait(self):
        yield gen.sleep(15)
        self.write("done")
Open multiple tabs in your browser, and you'll see that all requests are processed as they arrive, without "queueing" the new requests that come in.
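If the blocking call can't simply be replaced with gen.sleep() (e.g. real CPU- or IO-bound work), a common alternative, sketched here as an assumption rather than taken from the answers above, is to push it onto an executor so the IOLoop stays free:

from concurrent.futures import ThreadPoolExecutor
import time

import tornado.web
from tornado import gen
from tornado.concurrent import run_on_executor

class TestHandler(tornado.web.RequestHandler):
    executor = ThreadPoolExecutor(max_workers=4)

    @run_on_executor
    def blocking_work(self):
        # Runs on a worker thread of the executor, not on the IOLoop thread.
        time.sleep(10)  # stand-in for the real blocking work
        return "Awake after 10 seconds!"

    @gen.coroutine
    def get(self):
        result = yield self.blocking_work()
        self.write(result)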

Stop processing Flask route if request aborted

I have a Flask REST endpoint that does some CPU-intensive image processing and takes a few seconds to return. Often, this endpoint gets called and then aborted by the client. In these situations I would like to cancel processing. How can I do this in Flask?
In node.js, I would do something like:
req.on('close', function(){
//some handler
});
I was expecting Flask to have something similar, or a synchronous method (request.isClosed()) that I could check at certain points during my processing and return early if it's closed, but I can't find one.
I thought about sending something to test that the connection is still open, and catching the exception if it fails, but it seems Flask buffers all outputs so the exception isn't thrown until the processing completes and tries to return the result:
An established connection was aborted by the software in your host machine
How can I cancel my processing half way through if the client aborts their request?
There is a potentially... hacky solution to your problem. Flask has the ability to stream content back to the user via a generator. The hacky part would be streaming blank data as a check that the connection is still open, and then, when your content is finished, having the generator produce the actual image. Your generator could check whether processing is done and yield None or "" or whatever while it's not finished.
from flask import Response

@app.route('/image')
def generate_large_image():
    def generate():
        while True:
            if not processing_finished():
                yield ""
            else:
                yield get_image()
                return
    return Response(generate(), mimetype='image/jpeg')
I don't know what exception you'll get if the client closes the connection, but I'm willing to bet it's error: [Errno 32] Broken pipe
As far as I know, you can't tell whether the connection was closed by the client during execution, because the server doesn't check whether the connection is still open while your view is running. I do know that you can create a custom request_handler in your Flask application to detect, after the request has been processed, that the connection was "dropped".
For example:
from flask import Flask
from time import sleep
from werkzeug.serving import WSGIRequestHandler
app = Flask(__name__)
class CustomRequestHandler(WSGIRequestHandler):
def connection_dropped(self, error, environ=None):
print 'dropped, but it is called at the end of the execution :('
#app.route("/")
def hello():
for i in xrange(3):
print i
sleep(1)
return "Hello World!"
if __name__ == "__main__":
app.run(debug=True, request_handler=CustomRequestHandler)
Maybe you want to investigate a bit more: since your custom request_handler is created when a request comes in, you could create a thread in its __init__ that checks the status of the connection every second, and when it detects that the connection is closed (check this thread), stop the image processing. But I think this is a bit complicated :(.
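A very rough sketch of that polling idea, purely as an assumption about how it could look (the class, attribute, and event names below are made up for illustration, and peeking at the client socket may not be safe in every setup):

import select
import socket
import threading

from werkzeug.serving import WSGIRequestHandler

class WatchdogRequestHandler(WSGIRequestHandler):
    # Hypothetical sketch: a daemon thread peeks at the client socket once
    # per second; if the peer has closed it, an event is set that the
    # image-processing code could poll and react to.
    def setup(self):
        WSGIRequestHandler.setup(self)
        self.client_disconnected = threading.Event()
        watcher = threading.Thread(target=self._watch_connection)
        watcher.daemon = True
        watcher.start()

    def _watch_connection(self):
        while not self.client_disconnected.is_set():
            readable, _, _ = select.select([self.connection], [], [], 1.0)
            if readable:
                try:
                    if self.connection.recv(1, socket.MSG_PEEK) == b'':
                        self.client_disconnected.set()  # peer closed the socket
                except (OSError, socket.error):
                    self.client_disconnected.set()

Wiring the event through to the view (for example via the WSGI environ) is the part that makes this complicated, as noted above.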
I was just attempting to do this same thing in a project, and I found that with my stack of uWSGI and nginx, when a streaming response was interrupted on the client's end, the following errors occurred:
SIGPIPE: writing to a closed pipe/socket/fd (probably the client disconnected) on request
uwsgi_response_write_body_do(): Broken pipe [core/writer.c line 404] during GET
IOError: write error
and that I could just use a regular old try/except like the one below:
try:
    for chunk in iter(process.stdout.readline, ''):
        yield chunk
    process.wait()
except:
    app.logger.debug('client disconnected, killing process')
    process.terminate()
    process.wait()
This gave me:
Instant streaming of data using Flask's generator functionality
No zombie processes on cancelled connection

pyzmq create a process with its own socket

I have some code that monitors some other changing files. What I would like to do is start that code, which uses ZeroMQ, with a different socket. The way I'm doing it now seems to cause assertions to fail somewhere in libzmq, since I may be reusing the same socket. How do I ensure that when I create a new process from the Monitor class, the context will not be reused? That's what I think is going on; if you can tell there is some other stupidity on my part, please advise.
Here is some code:
import zmq
from zmq.eventloop import ioloop
from zmq.eventloop.zmqstream import ZMQStream

class Monitor(object):
    def __init__(self):
        self.context = zmq.Context()
        self.socket = self.context.socket(zmq.DEALER)
        self.socket.connect("tcp://127.0.0.1:5055")
        self.stream = ZMQStream(self.socket)
        self.stream.on_recv(self.somefunc)

    def initialize(self, id):
        self._id = id

    def somefunc(self, something):
        """work here and send back results if any"""
        import json
        jdecoded = json.loads(something)
        if self._id == jdecoded['_id']:
            # good, I'm the right monitor for you
            work = jdecoded['message']
            results = algorithm(work)
            self.socket.send(json.dumps(results))
        else:
            # let some other process deal with it, not mine
            pass

class Prefect(object):
    def __init__(self, id):
        self.context = zmq.Context()
        self.socket = self.context.socket(zmq.DEALER)
        self.socket.bind("tcp://127.0.0.1:5055")
        self.stream = ZMQStream(self.socket)
        self.stream.on_recv(self.check_if)
        self._id = id
        self.monitors = []

    def check_if(self, message):
        """find out from the message's id whether we have
        started a process for it previously"""
        import json
        jdecoded = json.loads(message)
        this_id = jdecoded['_id']
        if this_id in self.monitors:
            pass
        else:
            # start a new process for it; it should have its own socket
            new = Monitor()
            from multiprocessing import Process
            newp = Process(target=new.initialize, args=(this_id,))
            newp.start()
            self.monitors.append(this_id)  # ensure it's remembered
What is going on is that I want all the monitor processes and a single Prefect process listening on the same port, so when the Prefect sees a request it hasn't seen before, it starts a process for it; all the processes that already exist should probably listen too, but ignore messages not meant for them.
As it stands, if I do this I get a crash, possibly related to concurrent access to the same zmq socket by something (I tried threading.Thread, it still crashes). I read somewhere that concurrent access to a zmq socket from different threads is not possible. How would I ensure that new processes get their own zmq sockets?
EDIT:
The main deal in my app is that a request comes in via a zmq socket, and the process(es) listening react to the message as follows:
1. If the message is directed at that process (judged by the _id field), do some reading on a file and reply, since one of the monitors matches the message's _id; if none match, then:
2. If the message's _id is not recognized, all monitors ignore it, but the Prefect creates a process to handle that _id and all future messages for that id.
3. I want all the messages to be seen by the monitor processes as well as the Prefect process; that seems easiest.
4. All the messages are very small, on average ~4096 bytes.
5. The monitor does some non-blocking reads, and on each ioloop iteration it sends what it has found out.
More edit: the Prefect process binds now, and it will receive messages and echo them so they can be seen by the monitors. This is what I have in mind as the architecture, but it's not final.
All the messages arrive from remote users via a browser that lets the server know what a client wants, and the server sends the message to the backend via zmq (I did not show this, but it is not hard), so in production they might not bind/connect to localhost.
I chose DEALER since it allows async / unlimited messages in either direction (see point 5), and DEALER can bind to DEALER, with the initial request/reply able to arrive from either side. The other pairing that can do this is possibly DEALER/ROUTER.
You are correct that you cannot keep using the same socket in a subprocess (multiprocessing usually uses fork to create subprocesses). In general, what this means is that you don't want to create the socket that will be used in the subprocess until after the subprocess starts.
Since, in your case, the socket is an attribute on the Monitor object, you probably don't want to create the Monitor in the main process at all. That would look something like this:
def start_monitor(this_id):
    monitor = Monitor()
    monitor.initialize(this_id)
    # run the eventloop, or this will return immediately and destroy the monitor

# ... inside Prefect.check_if():
    proc = Process(target=start_monitor, args=(this_id,))
    proc.start()
    self.monitors.append(this_id)
rather than your example, where the only thing the subprocess does is assign an ID and then kill the process, ultimately having no effect.
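To make that sketch concrete, here is a hedged completion of the child-side function, assuming the pyzmq ioloop imported in the question is what keeps the monitor alive:

from multiprocessing import Process
from zmq.eventloop import ioloop

def start_monitor(this_id):
    # Everything zmq-related (context, socket, stream) is created here,
    # after the fork, so nothing is shared with the parent process.
    monitor = Monitor()
    monitor.initialize(this_id)
    ioloop.IOLoop.instance().start()  # keep the child alive and serving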

Dbus/GLib Main Loop, Background Thread

I'm starting out with DBus and event driven programming in general. The service that I'm trying to create really consists of three parts but two are really "server" things.
1) The actual DBus server talks to a remote website over HTTPS, manages sessions, and conveys info to the clients.
2) The other part of the service calls a keep alive page every 2 minutes to keep the session active on the external website
3) The clients make calls to the service to retrieve info from the service.
I found some simple example programs and I'm trying to adapt them to prototype #1 and #2. Rather than building separate programs for both, I thought I could run them in a single, two-threaded process.
The problem that I'm seeing is that I call time.sleep(X) in my keep alive thread. The thread goes to sleep, but won't ever wake up. I think that the GIL isn't released by the GLib main loop.
Here's my thread code:
class Keepalive(threading.Thread):
    def __init__(self, interval=60):
        super(Keepalive, self).__init__()
        self.interval = interval
        bus = dbus.SessionBus()
        self.remote = bus.get_object("com.example.SampleService", "/SomeObject")

    def run(self):
        while True:
            print('sleep %i' % self.interval)
            time.sleep(self.interval)
            print('sleep done')
            reply_status = self.remote.keepalive()
            if reply_status:
                print('Keepalive: Success')
            else:
                print('Keepalive: Failure')
From the print statements, I know that the sleep starts, but I never see "sleep done."
Here is the main code:
if __name__ == '__main__':
    try:
        dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)
        session_bus = dbus.SessionBus()
        name = dbus.service.BusName("com.example.SampleService", session_bus)
        object = SomeObject(session_bus, '/SomeObject')
        mainloop = gobject.MainLoop()
        ka = Keepalive(15)
        ka.start()
        print('Begin main loop')
        mainloop.run()
    except Exception as e:
        print(e)
    finally:
        ka.join()
Some other observations:
I see the "begin main loop" message, so I know it's getting control. Then, I see "sleep %i," and after that, nothing.
If I ^C, then I see "sleep done." After ~20 seconds, I get an exception from self.run() that the remote application didn't respond:
DBusException: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
What's the best way to run my keep alive code within the server?
Thanks,
You have to explicitly enable multithreading when using gobject by calling gobject.threads_init(). See the PyGTK FAQ for background info.
Apart from that, for the purpose you're describing, timeouts seem to be a better fit. Use them as follows:
# Enable timer
self.timer = gobject.timeout_add(time_in_ms, self.remote.keepalive)
# Disable timer
gobject.source_remove(self.timer)
This calls the keepalive function every time_in_ms (milli)seconds. Further details, again, can be found at the PyGTK reference.
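Putting both suggestions together, a small sketch (the keepalive_tick wrapper is hypothetical, and remote stands for the D-Bus proxy created in the question's Keepalive class; note that a GLib timeout callback must return True or it only fires once):

import gobject

gobject.threads_init()  # enable threading support before starting the main loop

def keepalive_tick():
    # Hypothetical wrapper around the proxy call from the question.
    reply_status = remote.keepalive()
    print('Keepalive: Success' if reply_status else 'Keepalive: Failure')
    return True  # returning True keeps the timeout installed; False removes it

# Fire every 2 minutes, matching the keep-alive interval described above.
timer_id = gobject.timeout_add(2 * 60 * 1000, keepalive_tick)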
