urllib2.urlopen fails in deployed app - python

I wonder whether something changed today in the App Engine proxy that serves URL Fetch requests and made it more restrictive. For example, this URL, http://www.google.com/ig/calculator?q=1eur=?usd, was working without a hitch until a few hours ago. This is the error I'm getting now:
http://www.google.com/ig/calculator?q=1eur=?usd returned an error: HTTP Error 503: Service Unavailable
Note that in the SDK (which uses my local internet connection) the code below works. Also, 'curl http://www.google.com/ig/calculator?q=1eur=?usd' works, so I don't think Google is rejecting the request because it doesn't come from an end-user browser (i.e. no vanilla user agent). It's probably something that changed a few hours ago in the App Engine infrastructure.
import urllib2

url = 'http://www.google.com/ig/calculator?q=1eur=?usd'
request = urllib2.Request(url=url, data=None)
try:
    response = urllib2.urlopen(request)
except urllib2.URLError as e:
    # HTTPError (such as the 503 above) is a subclass of URLError, so it is caught here too
    raise Exception("%s returned an error: %s" % (url, e))

As noted in the comments, it's very likely you are being throttled. iGoogle hosts a number of private (but not secret) APIs for use by Google-authored gadgets that run on the page (the weather API is another widely used example). However, they're not really intended for consumption by non-Google gadgets or applications, and their implementation can (and does) change without notice.
Furthermore, iGoogle is a deprecated product. I would expect that those utility APIs will go away simultaneously with the iGoogle shutdown (Nov 1, 2013). If you don't want your application to break when iGoogle goes away, I'd advise finding a different source for this information.

Related

Control API: Service unavailable (503)

Good morning,
I want to query households (my first query and generally first experience with the Sonos API) and have authenticated successfully. I got an access token and query the Control API like this:
headers = {"Content-Type": "application/json",
           "Authorization": "Bearer " + token["access_token"]}
resp = re.get('http://api.ws.sonos.com/control/api/v1/househoulds', headers=headers)
It returns a response with the error code "503: Service unavailable":
Service Unavailable
Service Unavailable - Zero size object
The server is temporarily unable to service your request. Please try again
later.
Reference XXXXX
(I cut out the reference because I am not sure whether it contains credentials.) I remember that when I intentionally changed my access token to a wrong one yesterday, I got an error code back saying that I was not authorized. But now, when I change it to an invalid one, I still get this same page back (503: Service unavailable).
Does anyone have the same problem? Might it be some security mechanism because I authorized many times in a short time or is the control API just currently down? I tried yesterday and today and don't see a blog post stating a downtime.
I see two issues with the code snippet you provided:
Issue 1: Your API URL has a typo. You used "househoulds" instead of "households".
Issue 2: Your URL needs to use https://, not http://
If you fix those two issues and are indeed using a valid access token, your request should work.
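With both fixes applied, the request could be built roughly as follows. This is only a sketch using Python 3's standard library rather than the library from the question; `build_households_request` is a hypothetical helper name, and the access token is assumed to come from the OAuth step described above.

```python
import urllib.request

# Corrected endpoint: https scheme and "households" spelled correctly.
HOUSEHOLDS_URL = "https://api.ws.sonos.com/control/api/v1/households"


def build_households_request(access_token):
    """Build (but do not send) the households request with a bearer token.

    `access_token` is assumed to be a valid token from the authorization flow.
    """
    return urllib.request.Request(
        HOUSEHOLDS_URL,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + access_token,
        },
        method="GET",
    )
```

Passing the result to `urllib.request.urlopen` sends it. Building the request in one place makes it easy to print the final URL and headers before sending, which is how a typo like the one above tends to get spotted.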

'CORS request did not succeed' when uploading an image and Flask raises error (Firefox only)

I am trying to debug a CORS issue with my app. Specifically, it fails only in Firefox and, it seems, only with somewhat bigger files.
I am using flask on the backend and I am trying to upload a "faulty" image to my service. When I say faulty, I mean that the backend should reject the image with a 400 (only accept PNG, not JPG). Uploading a PNG of any size works ok. However, when I reject the JPG file, the browser request fails with Network error and I cannot capture the 400-error to display a user-friendly message. From the backend's side, everything is the same, always same headers returned, be it accepted or rejected request, POST or OPTIONS.
However, I have noticed that it only fails with somewhat bigger files. If I send a JPG of a few KBs, it works. If I send a JPG of a few MBs, it fails.
I have looked at everything I could think of:
- curl-ing the backend gives all the right headers
- there are no OPTIONS requests logged by the browsers, but if there were, I've also checked those with curl for the right headers
- I'm only using HTTP (not HTTPS), so no problems with certificates
- I have disabled all extensions, so no possible blocking from the browser
- maybe other things that I cannot remember
What could possibly be the cause? Note that everything works as expected in browsers other than Firefox.
Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at http://localhost:8083/api/image. (Reason: CORS request did not succeed).
Well, after a couple of hours of trials, it appears this has nothing to do with CORS. This is probably the most confusing error message. To cite from Firefox's documentation on this (emphasis mine):
The HTTP request which makes use of CORS failed because the HTTP connection failed at either the network or protocol level. The error is not directly related to CORS, but is a fundamental network error of some kind. In many cases, it is caused by a browser plugin (e.g. an ad blocker or privacy protector) blocking the request.
So, this should actually indicate that the problem is on the backend, although it is very subtle.
Since my code rejects the request based on the transmitted filename, I never read the content of the request if the name ends with .jpg. Instead, I reject it immediately. This is a problem with Flask's development server, which does not empty the input stream in such cases (a known issue).
So, if you want to deal with this while keeping the development server, you should consume the input. In my case, I added a custom error handler, like so:
from flask import jsonify, request

class BadRequestError(ValueError):
    """Raised when a request does not conform to the protocol"""
    pass

# `app` is the existing Flask application object
@app.errorhandler(BadRequestError)
def bad_request_handler(error):
    # throw away the request data to avoid closing the connection before receiving all of it
    # http://flask.pocoo.org/snippets/47/
    _ = request.data
    _ = request.form
    response = jsonify(str(error))
    response.status_code = 400
    return response
and then, in the code, I always raise BadRequestError('...'), instead of just returning a 400-response.
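The underlying idea, reading the whole request body before sending the rejection, can also be shown without Flask. The following is a minimal sketch using only Python 3's standard library, not the code from my app: `RejectJpegHandler`, the `X-Filename` header (standing in for the multipart filename) and the `upload` helper are all made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class RejectJpegHandler(BaseHTTPRequestHandler):
    """Rejects .jpg uploads, but only after reading the whole request body."""

    def do_POST(self):
        remaining = int(self.headers.get("Content-Length", 0))
        # Drain the body in chunks even though the upload will be rejected,
        # so the connection is not closed while the client is still sending.
        while remaining > 0:
            chunk = self.rfile.read(min(remaining, 64 * 1024))
            if not chunk:
                break
            remaining -= len(chunk)

        # Hypothetical header carrying the file name for this sketch.
        filename = self.headers.get("X-Filename", "")
        rejected = filename.lower().endswith(".jpg")
        body = b"only PNG accepted" if rejected else b"ok"
        self.send_response(400 if rejected else 200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def upload(filename, payload):
    """Start a throwaway local server, POST the payload, return the status code."""
    server = HTTPServer(("127.0.0.1", 0), RejectJpegHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=10)
        conn.request("POST", "/api/image", body=payload,
                     headers={"X-Filename": filename})
        status = conn.getresponse().status
        conn.close()
        return status
    finally:
        server.shutdown()
        server.server_close()
```

Calling `upload("photo.jpg", b"x" * (2 * 1024 * 1024))` sends a body of a couple of megabytes, the size range where the original failure showed up, and the client still receives the 400 because the server consumed the body first.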

Appengine urlfetch issue (Python)

I'm trying to use urlfetch to make a request to my own application (the same application that is sending the request), but it doesn't work.
My code is as follows:
uploadurl = 'http://myapp.appspot.com/posturl'
result = urlfetch.fetch(
    url=uploadurl,
    payload=data,
    method=urlfetch.POST,
    headers={'Content-Type': 'application/x-www-form-urlencoded'})
There is no error at all when I call this, and everything seems to work correctly; however, the request never arrives. For debugging purposes, I changed the uploadurl to a different application that I own, and it worked fine. Any ideas why I can't send requests using urlfetch to the same application?
The full (real) url that I would call is made by
session = str(os.urandom(16).encode('hex'))
uploadurl = blobstore.create_upload_url('/process?session=' + session)
So I can't understand how that could be incorrect as the url is made for me.
Thanks.
I don't know how you're verifying that the request "never arrives". The blobstore URLs are not handled by your application's actual code, but by the App Engine runtime itself, so if you're looking in the logs you won't see that request there.
I think this is not possible; it is disallowed in order to prevent endless loops. From the urlfetch API documentation page:
To prevent an app from causing an endless recursion of requests, a
request handler is not allowed to fetch its own URL. It is still
possible to cause an endless recursion with other means, so exercise
caution if your app can be made to fetch requests for URLs supplied by
the user.

Web2py ticket invalid links

I started playing around with web2py the other day for a new project. I really like the structure and the whole concept which feels like a breath of fresh air after spending a few years with PHP frameworks.
The only thing (currently) that is bothering me is the ticketing system. Each time I make a mistake, a page with a link to a ticket is presented. I guess I could live with that if the link worked. It currently points to an admin page with http as the protocol instead of https. I've done a bit of reading, and the forced https for admin seems to be a security measure, but this makes debugging a pain.
What's the standard solution here? Alter the error page, allow http for admin, or use logs for debugging?
Best regards
Fredrik
I was in the same boat as you, I did not like the default mechanism. Luckily, customized exception handling with web2py is very straightforward. Take a look at routes.py in the root of your web2py directory. I've added the following to mine:
routes_onerror = [('application_name/*','/application_name/error/index')]
This routes any exceptions to my error handler controller (application_name/controllers/error.py) in which I defined my def index as:
import cPickle

def index():
    if request.vars.code == '400':
        return(dict(app=request.application,
                    ticket=None,
                    traceback="A 400 error was raised, this is controller/method path not found",
                    code=None,
                    layer=None,
                    wasEmailed=False))
    elif request.vars.code == '404':
        return(dict(app=request.application,
                    ticket=None,
                    traceback="A 404 error was raised, this is bad.",
                    code=None,
                    layer=None,
                    wasEmailed=False))
    else:
        # request.vars.ticket has the form "<app>/<ticket id>"; load the pickled ticket from disk
        fH = open('applications/%s/errors/%s' % (request.application, request.vars.ticket.split("/")[1]), 'rb')
        e = cPickle.load(fH)
        fH.close()
        # __sendEmail is my own helper (not shown here)
        __sendEmail(request.application, e['layer'], e['traceback'], e['code'])
        return(dict(app=request.application,
                    ticket=request.vars.ticket,
                    traceback=e['traceback'],
                    code=e['code'],
                    layer=e['layer'],
                    wasEmailed=True))
As you can see, for errors other than 400 and 404, I'm emailing the traceback to myself and then invoking the corresponding views/error/index.html. In production, this view gives a generic "I'm sorry an error has occurred, developers have been emailed". On my development server, it displays the formatted traceback.
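The ticket-loading step in the final branch can be pulled out into two small functions. This is only a sketch in Python 3, not part of web2py: `ticket_file` and `load_ticket` are hypothetical names, the ticket value is assumed to have the `<app>/<ticket id>` form that the `split("/")[1]` call above relies on, and the file is assumed to have been pickled by the same Python version that reads it.

```python
import os
import pickle


def ticket_file(app_name, ticket, web2py_root="."):
    """Map a ticket value such as 'myapp/<ticket id>' to its file on disk."""
    _, _, ticket_id = ticket.partition("/")
    # Refuse ids that are empty or that would point outside the errors folder.
    if (not ticket_id or ticket_id in (".", "..")
            or os.path.basename(ticket_id) != ticket_id):
        raise ValueError("unexpected ticket value: %r" % ticket)
    return os.path.join(web2py_root, "applications", app_name, "errors", ticket_id)


def load_ticket(path):
    """Unpickle a ticket file and keep only the fields the error view uses."""
    with open(path, "rb") as fh:
        data = pickle.load(fh)
    return {key: data.get(key) for key in ("layer", "traceback", "code")}
```

Because the ticket id arrives in a request variable, checking it before building a path seems prudent. Unpickling is acceptable here only because the ticket files are written by the framework itself and not by users.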
Normally, I just use http://127.0.0.1/ (if you are working locally or over ssh) or edit/navigate using https://...
That way, you log in to the admin app the first time, and the tickets will always be shown after that.

Web/Screen Scraping with Google App Engine - Code works in python interpreter but not GAE

I want to do some web scraping with GAE (the Infinite Campus Student Information Portal, FYI). This service requires you to log in to access the website.
I had some code that worked using mechanize in normal python. When I learned that I couldn't use mechanize in Google App Engine I ended up using urllib2 + ClientForm. I couldn't get it to login to the server, so after a few hours of fiddling with cookie handling I ran the exact same code in a normal python interpreter, and it worked. I found the log file and saw a ton of messages about stripping out the 'host' header in my request... I found the source file on Google Code and the host header was in an 'untrusted' list and removed from all requests by user code.
Apparently GAE strips out the host header, which I.C. requires to determine which school system to log you in to, which is why it appeared that I couldn't log in.
How would I get around this problem? I can't specify anything else in my fake form submission to the target site. Why would this be a "security hole" in the first place?
App Engine does not strip out the Host header: it forces it to be an accurate value based on the URI you are requesting. Assuming that URI's absolute, the server isn't even allowed to consider the Host header anyway, per RFC2616:
If Request-URI is an absoluteURI, the host is part of the Request-URI. Any Host header field value in the request MUST be ignored.
...so I suspect you're misdiagnosing the cause of your problem. Try directing the request to a "dummy" server that you control (e.g. another very simple App Engine app of yours) so you can look at all the headers and the body of the request as it comes from your GAE app, versus how it comes from your "normal python interpreter". What do you observe this way?
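Such a "dummy" server can be very small. Below is a minimal sketch of a WSGI application in Python 3 that echoes back the method, path, headers and body it received; `echo_app` is a hypothetical name and the JSON layout is just one possible choice, not anything App Engine prescribes.

```python
import json


def echo_app(environ, start_response):
    """WSGI app that returns the incoming request's details as JSON."""
    # WSGI exposes request headers as HTTP_* keys; turn them back into header names.
    headers = {
        key[5:].replace("_", "-").title(): value
        for key, value in environ.items()
        if key.startswith("HTTP_")
    }
    # Content-Type and Content-Length are passed without the HTTP_ prefix.
    for key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
        if environ.get(key):
            headers[key.replace("_", "-").title()] = environ[key]

    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0
    body = environ["wsgi.input"].read(length) if length else b""

    payload = json.dumps(
        {
            "method": environ.get("REQUEST_METHOD"),
            "path": environ.get("PATH_INFO"),
            "headers": headers,
            "body": body.decode("utf-8", "replace"),
        },
        indent=2,
        sort_keys=True,
    ).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(payload)))])
    return [payload]
```

It can be served locally with `wsgiref.simple_server.make_server("", 8000, echo_app).serve_forever()`. Sending the same login request from both environments and diffing the two JSON responses should show exactly which headers or cookies differ.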
