I have an application using authorize.net, and we seem to have connection issues from time to time. When that happens, I don't think a response is ever returned here:
response = createtransactioncontroller.getresponse()
Looking at the code for the authorizenet library, I noticed that no timeout is being set:
self._httpResponse = requests.post(self.endpoint, data=xmlRequest, headers=constants.headers, proxies=proxyDictionary)
If that's the case, how should I handle a response that never gets returned? Patching a timeout into the authorizenet library itself seems like a hack, since the library is maintained outside the scope of my project, but rolling my own timer doesn't seem like best practice either. What solution would you recommend?
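One workaround, offered here as a hedged sketch rather than a vetted fix: set Python's global default socket timeout before invoking the SDK. Since the library's requests.post call passes no explicit timeout, the global default should apply to the sockets it opens. The controller variable below stands in for your createtransactioncontroller instance, and the exact exception raised on a timeout can vary, so the except clause is broad:

import socket
import requests

socket.setdefaulttimeout(30)  # seconds; applies process-wide to new sockets

try:
    controller.execute()
    response = controller.getresponse()
except (socket.timeout, requests.exceptions.RequestException):
    response = None  # treat as indeterminate: log it, then retry or alert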
This is my first time working with SimpleHTTPServer, and honestly my first time working with web servers in general, and I'm having a frustrating problem. I'll start up my server (via SSH) and then I'll go try to access it and everything will be fine. But I'll come back a few hours later and the server won't be running anymore. And by that point the SSH session has disconnected, so I can't see if there were any error messages. (Yes, I know I should use something like screen to save the shell messages -- trying that right now, but I need to wait for it to go down again.)
I thought it might just be that my code was throwing an exception, since I had no error handling, but I added what should be a pretty catch-all try/except block, and I'm still experiencing the issue. (I feel like this is probably not the best method of error handling, but I'm new at this, so let me know if there's a better way.)
class MyRequestHandler(SimpleHTTPServer.SimpleHTTPRequestHandler):
    # (this is the only function my request handler has)
    def do_GET(self):
        if 'search=' in self.path:
            try:
                pass  # (my code that does stuff)
            except Exception as e:
                # (log the error to a file)
                return
        else:
            SimpleHTTPServer.SimpleHTTPRequestHandler.do_GET(self)
Does anyone have any advice for things to check, or ways to diagnose the issue? Most likely, I guess, is that my code is just crashing somewhere else... but if there's anything in particular I should know about the way SimpleHTTPServer operates, let me know.
I've never had SimpleHTTPServer running for an extended period of time; usually I just use it to transfer a couple of files in an ad-hoc manner. But I suppose it wouldn't be so bad, as long as your security constraints are enforced elsewhere (i.e., a firewall) and you don't need much scale.
The SSH session is ending, which is killing your tasks (both foreground and background). There are two solutions to this:

1. As you've already mentioned, use a utility such as screen to prevent your session from ending.
2. If you really want this to run for an extended period of time, look into your operating system's documentation on how to start/stop/enable services (nowadays most of the cool kids are using systemd, but you might also find yourself using SysVinit or some other init system); see the sketch below.
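As a rough illustration of the service route, a minimal systemd unit might look like the following. The unit name, interpreter, and paths are assumptions for illustration, not anything from the question:

[Unit]
Description=Simple Python HTTP server (hypothetical example)
After=network.target

[Service]
# Adjust the interpreter and script path to your setup
ExecStart=/usr/bin/python /home/you/server.py
WorkingDirectory=/home/you
Restart=on-failure

[Install]
WantedBy=multi-user.target

Saved as, e.g., /etc/systemd/system/myserver.service, it can be enabled with systemctl enable myserver and started with systemctl start myserver. Restart=on-failure would also restart the server after a crash, though you'd still want to track down the underlying exception.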
EDIT:
This link is in the comments, but I thought I should put it here as it answers this question pretty well
I am trying to do some logging in Django (mod_wsgi) of a view. However, I want to do this so that the client is not held up, similar to the PerlCleanupHandler phase available in mod_perl. Note the line "It is used to execute some code immediately after the request has been served (the client went away)". This is exactly what I want.

I want the client to be serviced first, and then I want to do the logging. Is there a good insertion point for the code in mod_wsgi or Django? I looked into suggestions here and here. However, in both cases, when I put a simple time.sleep(10) and do a curl/wget on the url, the curl doesn't return for 10 secs.
I even tried to put the time.sleep in __del__ method in the HttpResponse Object as suggested in one of the comments, but still no dice.
I am aware that I can probably put the logging data onto a queue and do some background processing to store the logs, but I would like to avoid that approach if there is another, simpler approach.

Any suggestions?
See the documentation at:

http://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode

for a WSGI-specific (not mod_wsgi-specific) way.

Django, as pointed out by others, may have its own ways of doing this as well, although whether its hook fires after all the response content is written back to the client, I don't know.
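For reference, the pattern that page describes is, roughly, a WSGI middleware that wraps the response iterable and runs a callback from its close() method, which the server calls only after the response content has been written out. A condensed sketch from memory of that page (class names may differ from the original):

class Generator(object):
    def __init__(self, iterable, callback, environ):
        self._iterable = iterable
        self._callback = callback
        self._environ = environ
    def __iter__(self):
        for item in self._iterable:
            yield item
    def close(self):
        # The WSGI server calls close() after the response is fully written
        try:
            if hasattr(self._iterable, 'close'):
                self._iterable.close()
        finally:
            self._callback(self._environ)

class ExecuteOnCompletion(object):
    def __init__(self, application, callback):
        self._application = application
        self._callback = callback
    def __call__(self, environ, start_response):
        try:
            result = self._application(environ, start_response)
        except Exception:
            self._callback(environ)
            raise
        return Generator(result, self._callback, environ)

Wrapping the Django WSGI application as ExecuteOnCompletion(application, log_request) (log_request being a name made up here) should then let the logging run after the client has been serviced, so a time.sleep(10) inside the callback shouldn't hold up curl.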
I'm implementing a little service that fetches web pages from various servers. I need to be able to configure different types of timeouts. I've tried mucking around with the settimeout method of sockets, but it's not exactly what I'd like. Here are the problems:

1. I need to specify a timeout for the initial DNS lookup. I understand this is done when I instantiate the HTTPConnection at the beginning.
2. My code is written so that I first .read() a chunk of data (around 10 MB), and if the entire payload fits in it, I move on to other parts of the code. If it doesn't fit, I stream the payload directly out to a file rather than into memory. When that happens, I do an unbounded .read() to get the data, and if the remote side sends me, say, one byte every second, the connection just keeps waiting, receiving one byte per second. I want to be able to disconnect with a "you're taking too long". A thread-based solution would be the last resort.
httplib is too simplistic for what you are looking for.

I would recommend taking a look at http://pycurl.sourceforge.net/ and the http://curl.haxx.se/libcurl/c/curl_easy_setopt.html#CURLOPTTIMEOUT option.

The http://curl.haxx.se/libcurl/c/curl_easy_setopt.html#CURLOPT_NOSIGNAL option also sounds interesting:
Consider building libcurl with c-ares support to enable asynchronous DNS lookups, which enables nice timeouts for name resolves without signals.
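Putting those options together, a hedged sketch (the URL and thresholds are placeholders) that bounds the connect phase, caps the whole transfer, and aborts the drip-feed case:

import pycurl
from io import BytesIO

buf = BytesIO()
c = pycurl.Curl()
c.setopt(pycurl.URL, 'http://example.com/')
c.setopt(pycurl.WRITEFUNCTION, buf.write)
c.setopt(pycurl.CONNECTTIMEOUT, 10)     # bounds DNS lookup + TCP connect
c.setopt(pycurl.TIMEOUT, 300)           # hard cap on the entire transfer
c.setopt(pycurl.LOW_SPEED_LIMIT, 1024)  # abort if the rate stays below this many bytes/s...
c.setopt(pycurl.LOW_SPEED_TIME, 30)     # ...for this many seconds
c.setopt(pycurl.NOSIGNAL, 1)            # avoid signal-based timeouts (thread-safe)
try:
    c.perform()
except pycurl.error as e:
    print("giving up: %s" % (e,))       # the "you're taking too long" case
finally:
    c.close()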
Have you tried requests?
You can set timeouts conveniently; see http://docs.python-requests.org/en/latest/user/quickstart/#timeouts:
>>> requests.get('http://github.com', timeout=0.001)
EDIT:
I missed part 2 of the question. For that, you could use this:
import signal
import requests

class TimeoutException(Exception):
    pass

def get_timeout(url, dns_timeout=10, load_timeout=60):
    def timeout_handler(signum, frame):
        raise TimeoutException()
    signal.signal(signal.SIGALRM, timeout_handler)
    signal.alarm(load_timeout)  # trigger the alarm in load_timeout seconds
    try:
        response = requests.get(url, timeout=dns_timeout)
    except TimeoutException:
        return "you're taking too long"
    finally:
        signal.alarm(0)  # cancel the alarm once the request finishes
    return response
and in your code use the get_timeout function. Note that SIGALRM works only on Unix, and only from the main thread.

If you need the timeout to be available for other functions, you could create a decorator; a sketch follows below.

The above code is adapted from http://pguides.net/python-tutorial/python-timeout-a-function/.
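A minimal sketch of such a decorator, reusing the TimeoutException and SIGALRM approach above (so the same Unix/main-thread caveats apply; the 60-second value and fetch example are illustrative):

import signal
import functools

def timeout(seconds):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            def handler(signum, frame):
                raise TimeoutException()
            old_handler = signal.signal(signal.SIGALRM, handler)
            signal.alarm(seconds)
            try:
                return func(*args, **kwargs)
            finally:
                signal.alarm(0)  # cancel the alarm
                signal.signal(signal.SIGALRM, old_handler)  # restore the old handler
        return wrapper
    return decorator

@timeout(60)
def fetch(url):
    return requests.get(url, timeout=10)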
Basically, what I'm trying to do is fetch a couple of websites using proxies and process the data. The problem is that the requests rarely fail in a clean way; setting socket timeouts wasn't very helpful either, because they often didn't work.
So what I did is:
q = Queue()
s = ['google.com', 'ebay.com']  # And so on
for item in s:
    q.put(item)

def worker():
    item = q.get()
    data = fetch(item)  # This is the buggy part
    # Process the data, yadayada

for i in range(workers):
    t = InterruptableThread(target=worker)
    t.start()

# Somewhere else
if WorkerHasLivedLongerThanTimeout:
    worker.terminate()
(InterruptableThread class)
The problem is that I only want to kill threads that are still stuck on the fetching part. Also, I want the item to be returned to the queue, i.e.:
def worker():
    self.status = 0
    item = q.get()
    data = fetch(item)  # This is the buggy part
    self.status = 1  # Don't kill me now, bro!
    # Process the data, yadayada

# Somewhere else
if WorkerHasLivedLongerThanTimeout and worker.status != 1:
    q.put(worker.item)
    worker.terminate()
How can this be done?
Edit: breaking news; see the update below.
I decided recently that I wanted to do something pretty similar, and what came out of it was the pqueue_fetcher module. It ended up being mainly a learning endeavour: I learned, among other things, that it's almost certainly better to use something like twisted than to try to kill Python threads with any sort of reliability.
That being said, there's code in that module that more or less answers your question. It basically consists of a class whose objects can be set up to get locations from a priority queue and feed them into a fetch function that's supplied at object instantiation. If the location's resources get successfully received before their thread is killed, they get forwarded on to the results queue; otherwise they're returned to the locations queue with a downgraded priority. Success is determined by a passed-in function that defaults to bool.
Along the way I ended up creating the terminable_thread module, which just packages the most mature variation I could find of the code you linked to as InterruptableThread. It also adds a fix for 64-bit machines, which I needed in order to use that code on my ubuntu box. terminable_thread is a dependency of pqueue_fetcher.
Probably the biggest stumbling block I hit is that raising an asynchronous exception, as terminable_thread and the InterruptableThread you mentioned both do, can have some weird results. In the test suite for pqueue_fetcher, the fetch function blocks by calling time.sleep. I found that if a thread is terminate()d while blocking like that, and the sleep call is the last (or not even the last) statement in a nested try block, execution will actually bounce to the except clause of the outer try block, even if the inner one has an except matching the raised exception. I'm still sort of shaking my head in disbelief, but there's a test case in pqueue_fetcher that reenacts this. I believe "leaky abstraction" is the correct term here.
I wrote a hacky workaround that just does some random thing (in this case getting a value from a generator) to break up the "atomicity" (not sure if that's actually what it is) of that part of the code. This workaround can be overridden via the fission parameter to pqueue_fetcher.Fetcher. It (i.e. the default one) seems to work, but certainly not in any way that I would consider particularly reliable or portable.
So my call, after discovering this interesting piece of data, was to henceforth avoid using this technique (i.e. calling ctypes.pythonapi.PyThreadState_SetAsyncExc) altogether.

In any case, this still won't work if you need to guarantee that any request whose entire data set has been received (and, e.g., acknowledged to the server) gets forwarded on to results. To be sure of that, you have to guarantee that the bit doing the last network transaction and the forwarding is guarded from interruption, without guarding the entire retrieval operation from interruption (since that would prevent timeouts from working). And to do that, you basically need to rewrite the retrieval operation (i.e. the socket code) to be aware of whichever exception you're going to raise with terminable_thread.Thread.raise_exc.

I've yet to learn twisted, but as the Premier Python Asynchronous Networking Framework©™®, I expect it must have some elegant, or at least workable, way of dealing with such details. I'm hoping it provides a parallel way to implement fetching from non-network sources (e.g. a local filestore, a DB, etc.), since I'd like to build an app that can glean data from a variety of sources in a medium-agnostic way.
Anyhow, if you're still intent on trying to work out a way to manage the threads yourself, you can perhaps learn from my efforts. Hope this helps.
This just in:
I've realized that the tests that I thought had stabilized have actually not, and are giving inconsistent results. This appears to be related to the issues mentioned above with exception handling and the use of the fission function. I'm not really sure what's going on with it, and don't plan to investigate in the immediate future unless I end up having a need to actually do things this way.
So I have the handler below:
class PublishHandler(BaseHandler):
    def post(self):
        message = self.get_argument("message")
        some_function(message)
        self.write("success")
The problem I'm facing is that some_function() takes some time to execute, and I would like the post request to return straight away, with some_function() executed in another thread/process if possible.
I'm using Berkeley DB as the database, and what I'm trying to do is relatively simple.

I have a database of users, each with a filter. If a filter matches the message, the server will send the message to that user. Currently I'm testing with thousands of users, so each publication of a message via a post request iterates through thousands of users to find a match. This is my naive implementation, hence my question: how do I do this better?
You might be able to accomplish this by using your IOLoop's add_callback method like so:
loop.add_callback(lambda: some_function(message))
Tornado will execute the callback in the next IOLoop pass, which may (I'd have to dig into Tornado's guts to know for sure, or alternatively test it) allow the request to complete before that code gets executed.
The drawback is that the long-running code you've written will still take time to execute, and this may end up blocking another request. That's not ideal if you have a lot of these requests coming in at once.
The more foolproof solution is to run it in a separate thread or process. The best way with Python is to use a process, due to the GIL (I'd highly recommend reading up on that if you're not familiar with it). However, on a single-processor machine the threaded implementation will work just fine, and may be simpler to implement.
If you're going the threaded route, you can build a nice "async executor" module with a mutex, a thread, and a queue. Check out the multiprocessing module if you want to go the route of using a separate process.
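A minimal sketch of that threaded executor idea (run_async and the module-level queue are illustrative names, not Tornado API; the Queue module's internal locking plays the role of the mutex):

import threading
from Queue import Queue  # named "queue" on Python 3

_tasks = Queue()

def _worker():
    while True:
        func, args = _tasks.get()
        try:
            func(*args)
        except Exception:
            pass  # log this somewhere in real code
        finally:
            _tasks.task_done()

_thread = threading.Thread(target=_worker)
_thread.daemon = True  # don't keep the process alive just for pending tasks
_thread.start()

def run_async(func, *args):
    # Enqueue the call and return immediately; the worker thread runs it later.
    _tasks.put((func, args))

The handler would then call run_async(some_function, message) before self.write("success"), so the post returns while the filter matching proceeds on the worker thread.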
I've tried this, and I believe the request does not complete before the callbacks are called.
I think a dirty hack would be to call two levels of add_callback, e.g.:
def get(self):
    ...
    def _defered():
        ioloop.add_callback(<whatever you want>)
    ioloop.add_callback(_defered)
    ...
But these are hacks at best. I'm looking for a better solution right now; I'll probably end up with some message queue or a simple thread solution.