AppEnginePlatformWarning - reason to use sockets? - python

In the Google App Engine standard environment, if you use urllib to make HTTPS requests, you'll get an AppEnginePlatformWarning which says you're using urlfetch instead of sockets.
I found the warning annoying, so I disabled it.
import requests
import requests_toolbelt.adapters.appengine
# Use the App Engine Requests adapter. This makes sure that Requests uses
# URLFetch.
requests_toolbelt.adapters.appengine.monkeypatch()
# Squelch the warning.
requests.packages.urllib3.disable_warnings(
    requests.packages.urllib3.contrib.appengine.AppEnginePlatformWarning
)
My question is: is there a good reason to switch to sockets? Specifically, what is wrong with using urlfetch?

There's nothing wrong with using urlfetch; in fact, it is the recommended method for issuing outbound HTTP(S) requests on GAE. From Issuing HTTP(S) Requests (emphasis on the requests-related note mine):
App Engine uses the URL Fetch service to issue outbound HTTP(S)
requests.
For details about how the URL Fetch service is implemented and which
headers are sent in a URL Fetch request, see Outbound Requests.
Issuing an HTTP request
To issue an outbound HTTP request, use the urlfetch.fetch
method. For improved code portability, you can also use the Python
standard libraries urllib, urllib2, or httplib to issue HTTP
requests. When you use these libraries in App Engine, they perform
HTTP requests using App Engine's URL Fetch service. You can also use
the third-party requests library as long as you configure it to use
URLFetch.
Sockets support is the more problematic option in GAE: it comes with a fairly long list of limitations and restrictions. See Sockets Python API Overview, in particular the Limitations and restrictions section.
The warning you see does not come from GAE. It comes from the 3rd-party requests library you use, which is why I highlighted the note in the quote above. IMHO it's safe to simply ignore or mask the warning in a GAE context.
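To show what masking one warning category amounts to, here is a minimal sketch using only the standard warnings module. As far as I know, urllib3's disable_warnings(category) is a thin wrapper around warnings.simplefilter("ignore", category). PlatformWarning and OtherWarning are hypothetical stand-ins for AppEnginePlatformWarning and any unrelated warning, so the sketch runs without GAE or requests installed.

```python
import warnings


class PlatformWarning(UserWarning):
    """Hypothetical stand-in for urllib3's AppEnginePlatformWarning."""


class OtherWarning(UserWarning):
    """Hypothetical unrelated warning that should stay visible."""


def emit_both():
    warnings.warn("using URLFetch instead of sockets", PlatformWarning)
    warnings.warn("something else worth seeing", OtherWarning)


def categories_after_squelch():
    # record=True collects warnings in a list instead of printing them.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")                   # show everything ...
        warnings.simplefilter("ignore", PlatformWarning)  # ... except this one category
        emit_both()
    return [w.category for w in caught]
```

Only the named category is hidden; every other warning still gets through, which is why a targeted filter is preferable to silencing all warnings.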

Related

is "from flask import request" identical to "import requests"?

In other words, is the flask request class identical to the requests library?
I consulted:
http://flask.pocoo.org/docs/0.11/api/
http://docs.python-requests.org/en/master/
but cannot tell for sure. I see code examples where people seem to use them interchangeably.
No. These are not only completely different libraries; they serve completely different purposes.
Flask is a web framework which clients make requests to. The Flask request object contains the data that the client (e.g. a browser) has sent to your app, i.e. the URL parameters, any POST data, etc.
The requests library is for your app to make HTTP requests to other sites, usually APIs. It makes an outgoing request and returns the response from the external site.
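The two directions can be sketched with the standard library alone, so it runs without Flask or requests installed. This is an illustration under stated assumptions, not how either library is implemented: the WSGI environ stands in for what Flask's request object wraps (incoming), and urllib.request stands in for the requests library (outgoing). The names app, QuietHandler and call_own_server are made up for this example.

```python
import threading
import urllib.request
from urllib.parse import parse_qs
from wsgiref.simple_server import WSGIRequestHandler, make_server


def app(environ, start_response):
    # Incoming side: `environ` describes the request a client sent to us.
    # Flask's `request` object is a friendlier wrapper around this same data.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]
    body = ("hello " + name).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


class QuietHandler(WSGIRequestHandler):
    def log_message(self, *args):
        pass  # keep the demo output clean


def call_own_server(name):
    # Port 0 asks the OS for any free port.
    server = make_server("127.0.0.1", 0, app, handler_class=QuietHandler)
    thread = threading.Thread(target=server.handle_request, daemon=True)
    thread.start()
    url = "http://127.0.0.1:%d/?name=%s" % (server.server_port, name)
    # Outgoing side: our code is now the client, the job `requests` does.
    # An empty ProxyHandler bypasses any proxy set in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url) as response:
        text = response.read().decode("utf-8")
    thread.join()
    server.server_close()
    return text
```

The same process plays both roles here, which is also why the two names show up side by side in real code: a Flask view reads flask.request and may then call requests.get to talk to another service.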

Forward WSGI cookies to Requests

I am developing a WSGI middleware application (Python 2.7) using Werkzeug. This app works within a SAML SSO environment and needs a SAML token to be accessed.
The middleware also performs requests to other applications in the same SAML environment, acting on behalf of the logged in user. In order to do that without the need of user feedback, I need to forward the SAML session cookie that I can get from the WSGI environment to requests that I am performing using the Requests library.
My issue is that the cookies that I get from WSGI/Werkzeug can only be parsed as http.cookies.SimpleCookie, while Requests accepts cookielib.CookieJar instances.
I have not found a way to cleanly forward these session cookies without resorting to shameful hacks such as parsing the raw content of the set-cookie headers.
Any suggestions?
Thanks,
gm
Cookies are just HTTP headers. Just pull the cookie value from http.cookies.SimpleCookie and add it to your requests session's cookie jar.
Not a hack. :)
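A minimal sketch of that suggestion, assuming the incoming cookies arrive in the standard WSGI HTTP_COOKIE key. The helper name cookies_from_environ and the cookie name saml_session are made up for illustration; substitute whatever your SSO setup actually uses.

```python
from http.cookies import SimpleCookie


def cookies_from_environ(environ, names=None):
    """Return {name: value} for cookies in a WSGI environ's Cookie header.

    If `names` is given, only those cookies are kept, so unrelated
    cookies are not leaked to the downstream application.
    """
    jar = SimpleCookie()
    jar.load(environ.get("HTTP_COOKIE", ""))
    return {
        name: morsel.value
        for name, morsel in jar.items()
        if names is None or name in names
    }
```

The resulting plain dict can be handed to Requests directly, e.g. requests.get(url, cookies=forwarded), or merged into a session with session.cookies.update(forwarded). Requests builds the cookie jar for you, so there is no need to construct a CookieJar by hand.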

How to use SSL in Python?

I want to use SSL on Google App Engine. Is there a 3rd-party Python module I must use or can I just use the Google SDK?
It should work just fine out of the box; see:
https://code.google.com/appengine/docs/python/config/appconfig.html#Secure_URLs
"Use" SSL for what? Joachim has answered regarding serving your pages over SSL.
If you want an SSL client, then urlfetch allows https URLs. It gives you no control other than the "validate_certificate" boolean parameter, and I don't immediately see any documentation of what CAs/certificates it trusts. Of course it doesn't support any protocol other than HTTPS, but that's in keeping with the fact that in general, GAE does not allow free use of sockets.

making urllib request in Python from the client side

I've written a Python application that makes web requests using the urllib2 library and then scrapes the data. I could deploy this as a web application, which means all urllib2 requests go through my web server. This leads to the danger of the server's IP being banned due to the high number of web requests for many users. The other option is to create a desktop application, which I don't want to do. Is there any way I could deploy my application so that the web requests are made from the client side? One way was to use Jython to create an applet, but I've read that Java applets can only make web requests to the server they are deployed on, and the only way to circumvent this is to create a server-side proxy, which leads us back to the problem of the server's IP getting banned.
This might sound like an impossible situation and I'll probably end up creating a desktop application, but I thought I'd ask if anyone knew of an alternate solution.
Thanks.
You can use a signed Java applet, they can use the Java security mechanism to enable access to any site.
This tutorial explains exactly what you have to do: http://www-personal.umich.edu/~lsiden/tutorials/signed-applet/signed-applet.html
The same might be possible from a Flash applet. Javascript is also restricted to the published site and doesn't allow being signed or security exceptions like this, AFAIK.
You can probably use AJAX requests made from JavaScript running on the client side.
Use server → client communication to give commands and necessary data to make a request
…and use AJAX communication from client to 3rd party server then.
This depends on the form of "scraping" you intend to do:
You might run into problems running an AJAX call to a third-party site. Please see Screen scraping through AJAX and javascript.
An alternative would be to do it server-side, but to cache the results so that you don't hit the third-party server unnecessarily.
Check out diggstripper on google code.
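The server-side caching suggestion above could look roughly like this. It is only a sketch: TTLCache is a made-up name, the fetch function is injected (in the question's setup it would wrap a urllib2 call), and the five-minute default lifetime is an arbitrary assumption.

```python
import time


class TTLCache:
    """Cache fetched pages for a fixed time so repeat requests
    do not hit the third-party server again."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch      # callable: url -> body
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the cache can be tested
        self._store = {}         # url -> (expires_at, body)

    def get(self, url):
        now = self._clock()
        hit = self._store.get(url)
        if hit is not None and hit[0] > now:
            return hit[1]        # still fresh: no outbound request
        body = self._fetch(url)  # missing or expired: fetch and remember
        self._store[url] = (now + self._ttl, body)
        return body
```

With many users asking for the same pages, the third-party site sees at most one request per URL per TTL window rather than one per user. The cache lives in a single process and is not thread-safe as written.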

How do I implement secure authentication using xml-rpc in python?

I have a basic xml-rpc web service running.
What is the simplest way (I'm a newbie) to implement secure authentication?
I just need some direction.
You could check out this code for a simple XML-RPC server over HTTPS. Authentication can work in any way you wish: for example, clients could authenticate with some credentials and you provide a cookie for the rest of the session.
The Python docs for xmlrpc include details of using the HTTP 'Authorization' header for passing in credentials.
Here is some code that uses Twisted to implement an XML-RPC auth mechanism, which could easily use HTTPS instead of HTTP.
This guy has written a HTTPS XML-RPC setup with authorization which you can download.
There are tons of resources, and ways of doing this which are easily googleable. This all depends on if you are using mod_wsgi for example, or writing a standalone server using Twisted.
Bottom line:
a) Use SSL for communication
b) Use the HTTP authorization mechanism
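Point (b) can be sketched with the standard library's xmlrpc modules. This is one possible shape under stated assumptions, not the code from the links above: the credentials, BasicAuthHandler, start_server and make_proxy are invented for the example. Point (a) is left out because it needs a certificate, so this demo speaks plain HTTP on localhost, and Basic credentials are only safe once the connection is wrapped in SSL.

```python
import base64
import hmac
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCRequestHandler, SimpleXMLRPCServer

# Hypothetical credentials for illustration; load real ones from secure config.
USERNAME = "alice"
PASSWORD = "s3cret"


class BasicAuthHandler(SimpleXMLRPCRequestHandler):
    """Reject any XML-RPC call that lacks valid HTTP Basic credentials."""

    def _authorized(self):
        header = self.headers.get("Authorization", "")
        scheme, _, encoded = header.partition(" ")
        if scheme.lower() != "basic":
            return False
        try:
            supplied = base64.b64decode(encoded)
        except ValueError:
            return False
        expected = ("%s:%s" % (USERNAME, PASSWORD)).encode("utf-8")
        # Constant-time comparison avoids leaking how many bytes matched.
        return hmac.compare_digest(supplied, expected)

    def do_POST(self):
        if not self._authorized():
            # Drain the request body before answering.
            self.rfile.read(int(self.headers.get("Content-Length", 0)))
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="xmlrpc"')
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        super().do_POST()


def add(a, b):
    return a + b


def start_server():
    # Port 0 asks the OS for a free port; read it from server.server_address.
    server = SimpleXMLRPCServer(("127.0.0.1", 0),
                                requestHandler=BasicAuthHandler,
                                logRequests=False)
    server.register_function(add, "add")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def make_proxy(port, user, password):
    # xmlrpc.client turns user:password@ in the URL into a Basic auth header.
    url = "http://%s:%s@127.0.0.1:%d/" % (user, password, port)
    return xmlrpc.client.ServerProxy(url)
```

A call through make_proxy with the right password returns the result as usual; with a wrong one the client raises xmlrpc.client.ProtocolError carrying errcode 401.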
