HTTP METHOD categorization cancel vs. delete

In an API, what HTTP method should be used for a cancel operation?
I imagine this wouldn't be a DELETE request, because the resource is not being disposed of. In that case, should it be a POST or a PUT? Here is some documentation, but the distinction is still not clear to me from it: http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html

This isn't a DELETE. Cancelling an operation is a state change and a state change means an update.
I would personally use a PUT because you normally know the URI of the resource you are updating.
Also see this post for more details: PUT vs POST in REST.
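To make the distinction concrete, here is a minimal sketch of a tiny in-memory API in plain Python. The `orders` store, the `handle` function, and the `status` field are hypothetical names invented for this illustration; they do not come from any framework.

```python
# Hypothetical in-memory resource store: order id -> representation.
orders = {42: {"item": "book", "status": "active"}}


def handle(method, order_id, body=None):
    """Dispatch a request on /orders/<order_id> and return (status_code, body)."""
    if order_id not in orders:
        return 404, None
    if method == "GET":
        return 200, orders[order_id]
    if method == "PUT":
        # PUT replaces the representation at a URI the client already knows.
        # Cancelling is just a new value for the "status" field.
        orders[order_id] = dict(body)
        return 200, orders[order_id]
    if method == "DELETE":
        # DELETE disposes of the resource itself.
        del orders[order_id]
        return 204, None
    return 405, None


# Cancel: the order still exists afterwards, with a different state.
handle("PUT", 42, {"item": "book", "status": "cancelled"})
print(handle("GET", 42))   # (200, {'item': 'book', 'status': 'cancelled'})

# Delete: the order is gone.
handle("DELETE", 42)
print(handle("GET", 42))   # (404, None)
```

After the PUT the resource can still be fetched and shows its cancelled state; after the DELETE it cannot be fetched at all, which is why a cancel is an update and not a deletion.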

Related

Problem with detecting if link is invalid

Is there any way to detect if a link is invalid using webbot?
I need to tell the user that the link they provided was unreachable.

The only way to be completely sure that a URL sends you to a valid page is to fetch that page and check that it works. You could try making a request other than GET (for example HEAD) to avoid wasting bandwidth on downloading the page, but not all servers will respond: the only way to be absolutely sure is to GET and see what happens. Something like:
    import requests
    from requests.exceptions import RequestException
    def check_url(url):
        try:
            # A 1-second timeout keeps a slow server from stalling the check.
            r = requests.get(url, timeout=1)
            return r.status_code == 200
        except RequestException:
            # Covers connection errors, timeouts, and malformed URLs.
            return False
Is this a good idea? It's only a GET request, and GET is supposed to be safe and idempotent, so you shouldn't cause anybody any harm. On the other hand, what if a user sets up a script to add a new link every second pointing to the same website? Then you're DDoSing that website. So when you allow users to cause your server to do things like this, you need to think about how you might protect it. (In this case: you could keep a cache of valid links expiring every n seconds, and only look up a link if the cache doesn't hold it.)
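A minimal sketch of such an expiring cache might look like the following. `LinkCache` and its parameters are hypothetical names chosen for this example; the `check` argument is assumed to be any function that takes a URL and returns True or False, such as a link checker.

```python
import time


class LinkCache:
    """Remember link-check results for ttl seconds to avoid repeated fetches."""

    def __init__(self, check, ttl=60.0, clock=time.monotonic):
        self._check = check      # callable: url -> bool
        self._ttl = ttl
        self._clock = clock      # injectable so the cache can be tested
        self._entries = {}       # url -> (expires_at, result)

    def is_valid(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]      # still fresh: no network traffic
        result = self._check(url)
        self._entries[url] = (now + self._ttl, result)
        return result
```

With this in place, a user who submits the same link every second triggers at most one outgoing request per ttl window. Note that this sketch never evicts old entries, so a real service would probably also want a size limit.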
Note that if you just want to check the link points to a valid domain it's a bit easier: you can just do a dns query. (The same point about caching and avoiding abuse probably applies.)
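One way to do the DNS check with only the standard library is `socket.getaddrinfo`. This is a sketch under the assumption that "valid domain" just means "the name resolves"; `domain_resolves` is a hypothetical helper name.

```python
import socket
from urllib.parse import urlsplit


def domain_resolves(url_or_host):
    """Return True if the host part of a URL (or a bare host name) resolves."""
    host = urlsplit(url_or_host).hostname or url_or_host
    try:
        socket.getaddrinfo(host, None)
        return True
    except (socket.gaierror, UnicodeError):
        # gaierror: the lookup failed; UnicodeError: the name is malformed.
        return False
```

Keep in mind that `getaddrinfo` blocks and takes no timeout argument, so the advice about running the check in the background applies here as well.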
Note that I used requests because it is easy, but you likely want to do this in the background, either with requests in a thread, or with one of the asyncio HTTP libraries and an asyncio event loop. Otherwise your code can block for up to the timeout while it waits for the server.
(Another attack: this gets the whole page. What if a user links to a massive page? See this question for a discussion of protecting from oversize responses. For your use case you likely just want to get a few bytes. I've deliberately not complicated the example code here because I wanted to illustrate the principle.)
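As a rough sketch of "just get a few bytes", the function below uses the standard library's urllib instead of requests and reads a bounded prefix of the body. `fetch_prefix` and its default limits are assumptions made for this illustration.

```python
import urllib.request


def fetch_prefix(url, max_bytes=1024, timeout=1.0):
    """Open url and read at most max_bytes of the body, ignoring the rest."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # read(n) returns at most n bytes; leaving the with-block closes the
        # response, so the remainder of a huge page is never read.
        return response.read(max_bytes)
```

`urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses and `urllib.error.URLError` for connection failures, so a link checker built on this would catch those and report the link as unreachable.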
Note that this just checks that something is available at that URL. What if it's one of the many dead links that redirect to a domain-parking website? You could enforce 'no redirects', but then some redirects are valid. (Likewise, you could try to detect redirects up to the main domain or to a blacklist of vendors' domains, but this will always be imperfect.) There is a tradeoff here that depends on your concrete use case, and it's worth being aware of.

You could try sending an HTTP request and comparing the response status against a list of known error codes (404, etc.). This is easy to implement in Python, and it is efficient and quick. Be warned that sometimes (quite rarely) a website might detect your scraper and deliberately return an error code to confuse you.

Do I need to use methods=['GET', 'POST'] in @app.route()?

My forms send the age parameter via GET, and it worked with just this:
    @app.route("/foo")
    def foo():
        age = request.args['age']
I did not bother with
    @app.route('/foo', methods=['GET', 'POST'])
Does it matter?

It does not matter, in the sense that it will work. Usually, however, you want the same URL to do different things depending on the method: for example, POST to /foo adds an element, GET to /foo retrieves the element(s), and DELETE to /foo deletes an element.

If you don't specify a methods argument to app.route(), then the default is to only accept GET and HEAD requests (*).
You only need to explicitly set methods if you need to accept other HTTP methods, such as POST. Otherwise, Flask will respond with a 405 Method Not Allowed HTTP response code when a client uses an HTTP method you didn't list, and your route function is simply not called.
So if your route should handle both GET and POST requests, but you forgot to add methods=['GET', 'POST'] to @app.route(), then you have a bug, as POST requests result in a 405 response instead of your route handling the request.
In your case, however, you should not use methods=['GET', 'POST'], and instead let clients that try to use POST anyway know your route doesn't handle that method. Better to be explicit about the error than let it silently pass.
(*) HEAD is added whenever you register a route that handles GET, and in case of a HEAD request, your route is called and only the headers are then served to the client. Flask automatically handles OPTIONS for you; the route is not called in that case.
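You can see this behaviour for yourself with Flask's built-in test client. This is a small sketch that assumes Flask is installed; the `/foo` route mirrors the one in the question, and the fallback value "unknown" is made up for the example.

```python
from flask import Flask, request

app = Flask(__name__)


@app.route("/foo")  # no methods argument: GET (plus HEAD and OPTIONS) only
def foo():
    return "age=" + request.args.get("age", "unknown")


# Flask's built-in test client sends requests without starting a server.
client = app.test_client()
print(client.get("/foo?age=30").status_code)   # 200
print(client.post("/foo").status_code)         # 405
```

The POST never reaches `foo()`: Flask rejects it during routing and answers with 405 Method Not Allowed.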

As always, the answer is: it depends.
If you don't provide "methods" arguments, then Flask assumes the HTTP method is GET (and also accepts HEAD). So long as that assumption is valid, your code will work just fine.
If, however, the client sends the request with the POST method (or DELETE, etc.), Flask will reject it with a 405 Method Not Allowed response, because that method is not allowed for the route.
Think of this requirement as a redundancy check. Flask could have been written to adapt to whatever method is used in the HTTP request. Instead, Flask insists that you specify the method as a signal that the form of communication is intentional. This requirement makes the Flask implementation a little simpler at the cost of imposing the responsibility of coordinating the client-server interface on the programmer.

Invalidate an old session in Flask

How do I create a new clean session and invalidate the current one in Flask?
Do I use make_null_session() or open_session()?

I do this by calling session.clear().
EDIT:
After reading your comment in another answer, I see that you're trying to prevent a replay attack that might be made using a cookie that was issued in the past. I solved that problem as much as possible* with this approach:
Override SecureCookieSessionInterface.save_session(), copying the code from the overridden version rather than calling it.
When the overridden version of save_session() calls save_cookie(), make it pass a session_expires argument 30 minutes in the future. This causes cookies more than 30 minutes old to be considered invalid.
Make the overridden version of save_session() update a session variable every so often, to make sure the cookie and its session_expires time get rewritten regularly. (I name this session variable '_refresh' and store the current time in it, then rewrite it only if more than a few seconds have passed since the last-stored time. This optimization avoids rewriting the cookie on every HTTP request.)
Duplicating Flask code in the custom save_session() makes this approach a bit ugly and brittle, but it is necessary in order to change the arguments passed to save_cookie(). It would be nice if Flask made this easier, or at least implemented its own safeguard against replay attacks.
*WARNING: This approach by itself will not stop replay attacks that might happen during a session cookie's valid lifetime. This fundamental problem with cookie-based sessions is discussed in RFC 6896 and A Secure Cookie Protocol by Liu, Kovacs, Huang, Gouda.

If you have security concerns (and everyone should), here is the answer:
This is not REALLY possible.
Flask uses cookie-based sessions. When you edit or delete the session, you send a request to the client to remove the cookie, and normal clients (browsers) will comply. But if the session has been hijacked by an attacker, the attacker's copy of the cookie remains valid.

You can add an after_request callback to remove the session cookie if a particular flag is set:
    @app.after_request
    def remove_if_invalid(response):
        if "__invalidate__" in session:
            # The cookie name comes from the SESSION_COOKIE_NAME config key.
            response.delete_cookie(app.config["SESSION_COOKIE_NAME"])
        return response
Then you simply set that session key whenever you want to invalidate the session:
    @app.route("/logout")
    def logout():
        session["__invalidate__"] = True
        return redirect(url_for("index"))
See also: http://werkzeug.pocoo.org/docs/wrappers/#werkzeug.wrappers.BaseResponse.delete_cookie

If you use the default Flask sessions and set app.permanent_session_lifetime, then a replayed session cookie will be rejected once the session has expired. If you look at the source code for open_session, you will find these lines:
    max_age = total_seconds(app.permanent_session_lifetime)
    try:
        data = s.loads(val, max_age=max_age)
        return self.session_class(data)
    except BadSignature:
        return self.session_class()
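To show why an expired cookie cannot be replayed, here is a simplified stand-in for a timed, signed token written with the standard library's hmac module. It is not Flask's actual implementation (Flask delegates signing to the itsdangerous library); `SECRET`, `sign`, and `load` are hypothetical names for this sketch.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"  # hypothetical signing key


def sign(payload, now=None):
    """Return 'payload.timestamp.signature' with the issue time baked in."""
    timestamp = str(int(time.time() if now is None else now))
    message = f"{payload}.{timestamp}".encode()
    signature = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    return f"{payload}.{timestamp}.{signature}"


def load(token, max_age, now=None):
    """Return the payload, or None if the token is forged or too old."""
    try:
        payload, timestamp, signature = token.rsplit(".", 2)
        issued_at = int(timestamp)
    except ValueError:
        return None
    message = f"{payload}.{timestamp}".encode()
    expected = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    age = (time.time() if now is None else now) - issued_at
    return payload if age <= max_age else None
```

Because the timestamp is covered by the signature, a client cannot extend a cookie's life by editing it: changing the timestamp breaks the signature, and leaving it alone makes the token fail the max_age check. As the other answers point out, a copy of the cookie can still be replayed until it expires.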

GAE unsubscribe from a user's presence

Is there a way to unsubscribe from a user's presence? I no longer want to receive updates on /_ah/xmpp/presence/... for a particular user. I can't seem to find a simple API call to do that.
After digging around the XMPP protocol I found this, which seems to indicate that doing a send_presence with a presence type of 'unsubscribe' should work. Unfortunately, digging into GAE's xmpp API, it appears that it defines
    _VALID_PRESENCE_TYPES = frozenset([PRESENCE_TYPE_AVAILABLE,
                                       PRESENCE_TYPE_UNAVAILABLE,
                                       PRESENCE_TYPE_PROBE])
which means I can't even do a send_presence(user_to_remove, status="", presence_type="unsubscribe"). (PRESENCE_TYPE_AVAILABLE and the others are just strings like "available", as per the XMPP specification.)
Has anyone come across this issue, or does anyone know how to achieve this?

It seems that you can't. The docs (and the docstring) confirm that presence_type accepts a subset of the types defined in RFC 3921.
You can submit this as a feature request to the issue tracker.

As an experiment, you could re-implement your own "send_presence" that does the same thing as the existing function, without the check for valid presence types. Not officially sanctioned but worth a try.
One thing to note is that this won't stop clients from re-subscribing to your bot, and badly-behaved clients may simply ignore the unsubscribe.
And as Drew mentioned, please do submit an issue on the issue tracker.

What can I attach to pylons.request in Pylons?

I want to keep track of a unique identifier for each browser that connects to my web application (which is written in Pylons). I keep a cookie on the client to keep track of this, but if the cookie isn't present, I want to generate a new unique identifier that will be sent back to the client with the response. I may also want to access this value from other code used to generate the response.
Is attaching this value to pylons.request safe? Or do I need to do something like use threading_local to make a thread local that I reset when each new request is handled?

Why do you want a unique identifier? Basically every visitor already gets a unique identifier: their session. Beaker, Pylons' session and caching middleware, does all the work and tracks visitors, usually with a session cookie. So don't worry about tracking users yourself; just use the session for what it's made for, which is storing whatever user-specific data you have.
    from pylons import session
    session["something"] = whatever()
    session.save()
    # some time later
    something = session["something"]

Whatever you set on the request will only survive for the duration of that request. The problem you are describing is more appropriately handled with a session, as TCH4k has said. It's already enabled in the middleware, so go ahead.
