According to the beaker documentation:
Beaker does not automatically delete expired or old cookies on any of its back-ends. This task is left up to the developer based on how sessions are being used, and on what back-end.
Using mcinspect I've found that my memcached instance does seem to be persisting session records for much longer than the session is valid/in use.
What would be the best approach to remove deleted/expired/old/invalid beaker sessions from memcached?
What exactly are you asking? Do you mean invalid/old in the sense of your own application logic, or have the entries actually expired?
If they have really expired in memcached, then they are 'gone' for all practical purposes: they will not be returned, and the space is free for new sessions. No need to do anything.
If they are old/invalid in some other sense, and you cannot check for that, you might need to flush your memcached, but that will remove all your entries.
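To illustrate the first point, here is a minimal sketch of memcached's native expiry behaviour (not Beaker-specific; assumes the python-memcached client and a local memcached on the default port): an entry written with a TTL becomes unreadable once expired, and memcached reclaims the memory itself.

```python
# Demonstrates memcached's native expiry: an expired entry is
# effectively 'gone' even if it still occupies memory briefly.
import time
import memcache  # python-memcached, assumed installed

mc = memcache.Client(['127.0.0.1:11211'])
mc.set('demo-session', {'user': 'alice'}, time=2)  # 2-second TTL
print(mc.get('demo-session'))  # {'user': 'alice'}
time.sleep(3)
print(mc.get('demo-session'))  # None -- expired, space reusable
```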
Related
So, in order to avoid the "no one best answer" problem, I'm going to ask, not for the best way, but for the standard or most common way to handle sessions when using the Tornado framework. That is, if we're not using third-party authentication (OAuth, etc.), but rather we want to have our own Users table with secure cookies in the browser but most of the session info stored on the server, what is the most common way of doing this? I have seen some people using Redis, some people using their normal database (MySQL or Postgres or whatever), some people using memcached.
The application I'm working on won't have millions of users at a time, or probably even thousands. It will need to eventually get some moderately complex authorization scheme, though. What I'm looking for is to make sure we don't do something "weird" that goes down a different path than the general Tornado community, since authentication and authorization, while it is something we need, isn't something that is at the core of our product and so isn't where we should be differentiating ourselves. So, we're looking for what most people (who use Tornado) are doing in this respect, hence I think it's a question with (in theory) an objectively true answer.
The ideal answer would point to example code, of course.
Here's how it seems other micro frameworks handle sessions (CherryPy, Flask for example):
Create a table holding session_id and whatever other fields you'll want to track on a per session basis. Some frameworks will allow you to just store this info in a file on a per user basis, or will just store things directly in memory. If your application is small enough, you may consider those options as well, but a database should be simpler to implement on your own.
When a request is received (RequestHandler initialize() function I think?) and there is no session_id cookie, set a secure session-id using a random generator. I don't have much experience with Tornado, but it looks like setting a secure cookie should be useful for this. Store that session_id and associated info in your session table. Note that EVERY user will have a session, even those not logged in. When a user logs in, you'll want to attach their status as logged in (and their username/user_id, etc) to their session.
In your RequestHandler initialize function, if there is a session_id cookie, read in what ever session info you need from the DB and perhaps create your own Session object to populate and store as a member variable of that request handler.
Keep in mind sessions should expire after a certain amount of inactivity, so you'll want to check for that as well. If you want a "remember me" type log-in situation, you'll have to use a secure cookie to signal that (read up on this at OWASP to make sure it's as secure as possible, though again it looks like Tornado's secure_cookie might help with that), and upon receiving a timed-out session you can re-authenticate the user by creating a new session and transferring whatever associated info into it from the old one. A sketch of the cookie handling from the steps above follows below.
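Here is a minimal sketch of that flow, assuming hypothetical create_session / load_session helpers over your session table; only the cookie calls (get_secure_cookie / set_secure_cookie) are real Tornado APIs, and they require cookie_secret in the Application settings.

```python
import uuid
import tornado.web

class BaseHandler(tornado.web.RequestHandler):
    def prepare(self):
        # Runs before every request. Requires cookie_secret in the
        # Application settings for the signed cookie to work.
        raw = self.get_secure_cookie("session_id")
        if raw is None:
            session_id = uuid.uuid4().hex
            self.set_secure_cookie("session_id", session_id)
            self.session = create_session(session_id)  # hypothetical: INSERT a row
        else:
            self.session = load_session(raw.decode())  # hypothetical: SELECT the row
```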
Tornado is designed to be stateless and doesn't have session support out of the box.
Use secure cookies to store sensitive information like user_id (sketched below).
Use standard cookies to store non-critical information.
For storing large objects, use the standard scheme: MySQL + memcache.
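A hedged sketch of the secure-cookie half of that advice (set_secure_cookie / get_secure_cookie are real Tornado APIs; the credential check is left out):

```python
import tornado.web

class AuthHandler(tornado.web.RequestHandler):
    def get_current_user(self):
        # Tornado calls this lazily; the signed cookie carries only user_id.
        user_id = self.get_secure_cookie("user_id")
        return user_id.decode() if user_id else None

    def post(self):
        # After verifying credentials (not shown), remember the user
        # for 30 days via a signed cookie.
        self.set_secure_cookie("user_id", "42", expires_days=30)
```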
The key issue with sessions is not where to store them, but how to expire them intelligently. Regardless of where sessions are stored, as long as the number of stored sessions is reasonable (i.e. only active sessions plus some surplus are stored), all this data is going to fit in RAM and be served fast. If there is a lot of old junk, you may expect unpredictable delays (the need to hit the disk to load the session).
There isn't anything built directly into Tornado for this purpose. As others have commented already, Tornado is designed to be a very fast async framework. It is lean by design. However, it is possible to hook in your own session management capability. You need to add a preamble section to each handler that would create or grab a session container. You will need to store the session ID in a cookie. If you are not strictly HTTPS then you will want to use a secure cookie. The session persistence can be any technology of your choosing such as Redis, Postgres, MySQL, a file store, etc...
There is a Github project that provides session management for Tornado. Even if you decide not to use it, it can provide insight into how to structure your own session management. The Github project is called dustdevil. Full disclosure - we created this several years ago but find it very easy to use and have it in active use today.
I am using pyramid to create a web application. I am then using pyramid-beaker to interface beaker into pyramid's session management system.
Two values affect the duration of a user's session.
The session cookie timeout
The actual session's lifetime on disk/memcache/rdbms/etc.
I currently have the cookie defaulted (via the standard beaker config) to delete when the browser closes. I have the session data set to clear out after 2 hours. This works perfectly.
What I need to know is how to override the cookie's timeout and the session timeout so that both are 30 days or some other arbitrary value.
Changing the timeout isn't supported by beaker. If you are trying to make a session stick around that long, you should probably just put it into a separate cookie. A common use-case is the "remember me" checkbox on login. This helps you track who the user is, but generally the actual session shouldn't be sticking around that long and gets recreated.
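As an illustration of that "remember me" approach, here is a hedged sketch in a Pyramid view: set_cookie is the real WebOb/Pyramid response API, while authenticate and make_remember_token are hypothetical helpers you'd back with your own storage.

```python
THIRTY_DAYS = 30 * 24 * 60 * 60

def login_view(request):
    user = authenticate(request)  # hypothetical credential check
    if request.params.get('remember'):
        token = make_remember_token(user)  # hypothetical; store server-side
        request.response.set_cookie('remember_me', token,
                                    max_age=THIRTY_DAYS, httponly=True)
    # The beaker session itself stays short-lived and gets recreated.
    request.session['user_id'] = user.id
    return {'ok': True}
```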
I have a solution. It's old but works.
I'm using sessions in Django to store logged-in user information as well as some other information. I've been reading through the Django session documentation and still have a few questions.
From the Django website:
By default, Django stores sessions in your database (using the model django.contrib.sessions.models.Session). Though this is convenient, in some setups it's faster to store session data elsewhere, so Django can be configured to store session data on your filesystem or in your cache.
Also:
For persistent, cached data, set SESSION_ENGINE to django.contrib.sessions.backends.cached_db. This uses a write-through cache – every write to the cache will also be written to the database. Session reads only use the database if the data is not already in the cache.
Is there a good rule of thumb for which one to use? cached_db seems like it would always be a better choice because, best case, the data is in the cache, and worst case it's in the database, where it would be anyway. The one downside is that I have to set up memcached.
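For reference, on recent Django versions the settings involved look roughly like this (the LOCATION value is an assumption for a local memcached on the default port):

```python
# settings.py sketch for the write-through cached_db session backend.
SESSION_ENGINE = 'django.contrib.sessions.backends.cached_db'

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
    },
}
```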
By default, SESSION_EXPIRE_AT_BROWSER_CLOSE is set to False, which means session cookies will be stored in users' browsers for as long as SESSION_COOKIE_AGE. Use this if you don't want people to have to log in every time they open a browser.
Is it possible to have both: the session expiring at browser close AND a maximum age?
If value is an integer, the session will expire after that many seconds of inactivity. For example, calling request.session.set_expiry(300) would make the session expire in 5 minutes.
What is considered "inactivity"?
If you're using the database backend, note that session data can accumulate in the django_session database table and Django does not provide automatic purging. Therefore, it's your job to purge expired sessions on a regular basis.
So that means, even if the session is expired, there are still records in my database. Where exactly would one put the code to "purge the db"? I feel like you would need a separate thread to just go through the db every once in a while (every hour?) and delete any expired sessions.
Is there a good rule of thumb for which one to use?
No.
Cached_db seems like it would always be a better choice ...
That's fine.
In some cases, there are many Django (and Apache) processes querying a common database. mod_wsgi allows a lot of scalability this way. The cache doesn't help much because the sessions are distributed randomly among the Apache (and Django) processes.
Is it possible to have both, the session expire at the browser close AND give an age?
Don't see why not.
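Django lets you mix the two per session with set_expiry(), which is a real API: 0 means "expire at browser close", an integer means seconds of inactivity, and None falls back to the global SESSION_COOKIE_AGE policy. A sketch:

```python
def login(request):
    # ... authenticate the user first (not shown) ...
    if request.POST.get('remember_me'):
        request.session.set_expiry(1209600)  # two weeks of inactivity
    else:
        request.session.set_expiry(0)        # expire at browser close
```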
What is considered "inactivity"?
I assume you're kidding. "activity" is -- well -- activity. You know. Stuff happening in Django. A GET or POST request that Django can see. What else could it be?
Where exactly would one put code to "purge the db"?
Put it in crontab or something similar.
I feel like you would need a separate thread to just go through the db every once in a while (every hour?)
Forget threads (please). It's a separate process. Once a day is fine. How many sessions do you think you'll have?
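For example, a daily crontab entry would do it. Django ships a management command for exactly this: clearsessions on Django 1.5+ (cleanup on older releases); the paths below are placeholders for your own setup.

```
# Purge expired sessions once a day at 04:00.
0 4 * * * /path/to/virtualenv/bin/python /path/to/project/manage.py clearsessions
```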
I'm building a website that doesn't require a database because a REST API "is the database". (Except you don't want to be putting site-specific things in there, since the API is used by mostly mobile clients)
However there's a few things that normally would be put in a database, for example the "jobs" page. You have master list view, and the detail views for each job, and it should be easy to add new job entries. (not necessarily via a CMS, but that would be awesome)
e.g. example.com/careers/ and example.com/careers/77/
I could just hardcode this stuff in templates, but that's not DRY: you have to update the master template and the detail template every time.
What do you guys think? Maybe a YAML file? Or any better ideas?
Thanks
Why not still keep it in a database? Your remote REST store is all well and funky, but if you've got local data, there's nothing (unless there's a spec saying so) to stop you storing some stuff in a local db. It doesn't have to be anything very glamorous: it could be sqlite, or you could have some fun with redis, etc.
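A minimal sqlite3 sketch of that idea (table and column names are illustrative), with one table feeding both the /careers/ list view and the /careers/77/ detail view:

```python
import sqlite3

conn = sqlite3.connect('site.db')
conn.execute('CREATE TABLE IF NOT EXISTS jobs '
             '(id INTEGER PRIMARY KEY, title TEXT, description TEXT)')
conn.execute('INSERT INTO jobs (title, description) VALUES (?, ?)',
             ('Backend developer', 'Build things behind the REST API'))
conn.commit()

# Master list view: every job, id + title only.
jobs = conn.execute('SELECT id, title FROM jobs').fetchall()

# Detail view: one job by primary key, e.g. /careers/1/.
job = conn.execute('SELECT * FROM jobs WHERE id = ?', (1,)).fetchone()
```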
You could use Memcachedb via the Django cache interface.
For example:
Set the cache backend as memcached in your django settings, but install/use memcachedb instead.
Django can't tell the difference between the two because they provide the same interface (at least the last time I checked).
Memcachedb is persistent, safe for multithreaded Django servers, and won't lose data during server restarts, but it's just a key-value store, not a complete database.
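A hedged settings.py sketch of that swap, assuming memcachedb is listening on its default port 21201 (adjust LOCATION for your daemon):

```python
# settings.py -- point Django's memcached backend at memcachedb,
# which speaks the same protocol but persists to disk.
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:21201',  # memcachedb's default port (check yours)
    },
}

# Anywhere else in the project, the usual cache API now persists:
#   from django.core.cache import cache
#   cache.set('jobs', jobs_list, timeout=None)
#   cache.get('jobs')
```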
Some alternatives built into the Python library are listed in the Data Persistence chapter of the documentation. Still another is JSON.
I'm writing a simple app with AppEngine, using Python. After a successful insert by a user and redirect, I'd like to display a flash confirmation message on the next page.
What's the best way to keep state between one request and the next? Or is this not possible because AppEngine is distributed? I guess, the underlying question is whether AppEngine provides a persistent session object.
Thanks
Hannes
No session support is included in App Engine itself, but you can add your own session support.
GAE Utilities is one library made specifically for this; a more heavyweight alternative is to use django sessions through App Engine Patch.
The ways to reliably keep state between requests are memcache, the datastore, or going through the user (cookies or post/get).
You can use the runtime cache too, but this is very unreliable, as you don't know whether a request will end up in the same runtime, and the runtime can drop its entire cache whenever it feels like it.
I really wouldn't use the runtime cache except for very specific situations; for example, I use it to cache the serialization of objects to JSON, as that is pretty slow, and if the cache is gone I can regenerate the result easily.
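Building on the memcache option, here is a hedged sketch of the flash message from the original question (google.appengine.api.memcache is the real App Engine API; the session_id is assumed to come from a cookie you already set):

```python
from google.appengine.api import memcache

def set_flash(session_id, message):
    # Written just before the redirect; 60s TTL so stale flashes vanish.
    memcache.set('flash:' + session_id, message, time=60)

def pop_flash(session_id):
    # Read-and-delete on the next request so the message shows only once.
    message = memcache.get('flash:' + session_id)
    if message is not None:
        memcache.delete('flash:' + session_id)
    return message
```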