I'm using django to develop a website. On the server side, I need to transfer some data that must be processed on the second server (on a different machine). I then need a way to retrieve the processed data. I figured that the simplest would be to send back to the Django server a POST request, that would then be handled on a view dedicated for that job.
But I would like to add some minimum security to this process: When I transfer the data to the other machine, I want to join a randomly generated token to it. When I get the processed data back, I expect to also get back the same token, otherwise the request is ignored.
My problem is the following: How do I store the generated token on the Django server?
I could use a global variable, but I had the impression browsing here and there on the web, that global variables should not be used for safety reason (not that I understand why really).
I could store the token on disk/database, but it seems to be an unjustified waste of performance (even if in practice it would probably not change much).
Is there third solution, or a canonical way to do such a thing using Django?
You can store your token in django cache, it will be faster from database or disk storage in most of the cases.
Another approach is to use redis.
You can also calculate your token:
save some random token in settings of both servers
calculate token based on current timestamp rounded to 10 seconds, for example using:
token = hashlib.sha1(secret_token)
token.update(str(rounded_timestamp))
token = token.hexdigest()
if token generated on remote server when POSTing request match token generated on local server, when getting response, request is valid and can be processed.
The simple obvious solution would be to store the token in your database. Other possible solutions are Redis or something similar. Finally, you can have a look at distributed async tasks queues like Celery...
Related
What would be best way to solve following problem with Python ?
I have real-time data stream coming to my object-oriented storage from user application (json files being stored into S3 storage in Amazon).
Upon receiving of each JSON file, I have to within certain time (1s in this instance) process data in the file and generate response that is send back to the user. This data is being processed by simple Python script.
My issue is, that the real-time data stream can at the same time generate even few hundreds JSON files from user applications that I need to run trough my Python script and I don't know how to approach this the best way.
I understand, that way to tackle this would be to use trigger based Lambdas that would execute job on the top of every file once uploaded from real-time stream in server-less environment, however this option is quite expensive compared to have single server instance running and somehow triggering jobs inside.
Any advice is appreciated. Thanks.
Serverless can actually be cheaper than using a server. It is much cheaper when there are periods of no activity because you don't need to pay for a server doing nothing.
The hardest part of your requirement is sending the response back to the user. If an object is uploaded to S3, there is no easy way to send back a response and it isn't even obvious who is the user that sent the file.
You could process the incoming file and then store a response back in a similarly-named object, and the client could then poll S3 for the response. That requires the upload to use a unique name that is somehow generated.
An alternative would be for the data to be sent to AWS API Gateway, which can trigger an AWS Lambda function and then directly return the response to the requester. No server required, automatic scaling.
If you wanted to use a server, then you'd need a way for the client to send a message to the server with a reference to the JSON object in S3 (or with the data itself). The server would need to be running a web server that can receive the request, perform the work and provide back the response.
Bottom line: Think about the data flow first, rather than the processing.
If I have a server end point that say does a simple task like initializes an API with a token that I generate client side, and then print the users account info.
Can I initialize() the API globally so the user can do other tasks after printing account info?
and
How does that affect other users initializing() and printing info if they do it at the same time?
I don't understand how this server works and any help would be great. Thank you!
If I'm understanding you correctly, it sounds like the easiest way to accomplish what you're trying to do is to use the flask-login module.
You will want to create an endpoint / Flask route (for example, '/login') that the user will send a POST request to with a username (and usually also a password). When the login is successful, the user's browser will have a cookie set with a token that will allow them to access Flask routes that have a #login_required decorator attached.
In your Python code for those routes, you will be able to access the "current_user", which will allow you to tailor your response to the particular user sending the request.
This way of doing things will allow you to deal with your first question, which seems to be about saving a user's identity across requests, and it will also allow you to deal with your second question, which seems to be about the problem of having different users accessing the same endpoint while expecting different responses.
Nathan's answer points you in the right direction, but I'll make a comment about state and scope because you asked about multiple users.
Firstly, the normal way to do this would be for the server to generate the random token and send it back to the user.
client request 1 > init > create token > store along with init data > return token to user
client request 2 > something(token) > find data related to token > return data to user
When you run Flask (and most webapps) you try to make it so that each request is independent. In your case, the user calls an endpoint and identifies themselves by passing a token. If you make a record of that token somewhere (normally in a database) then you can identify that user's data by looking up the same token they pass on subsequent requests.
Where you choose to store that information is important. As I say a database is the normal recommended approach as it's built to be accessed safely by multiple people at the same time.
You might be tempted to not do the database thing and actually store the token / data information in a global variable inside python. Here's the reason why that's (probably) not going to work:
Flask is a wsgi server, and how it behaves when up and running depends on how it's configured. I generally use uwsgi with several different processes. Each process will have its own state that can't see one another. In this (standard / common) configuration, you don't know which process is going to receive any given request.
request 1 > init(token) > process 1 > store token in global vars
request 2 > other(token) > process 2 > can't find token in global vars
That's why we use a database to store all shared information:
request 1 > init(token) > process 1 > store token in db
request 2 > other(token) > process 2 > can find token db
Assume a Flask application that allows to build an object (server-side) through a number of steps (wizard-like ; client-side).
I would like to create an initial object server-side an build it up step by step given the client-side input, keeping the object 'alive' throughout the whole build-process. A unique id will be associated with the creation of each new object / wizard.
Serving the Flask application with the use of WSGI on Apache, requests can go through multiple instance of the Flask application / multiple threads.
How do I keep this object alive server-side, or in other words how to keep some kind of global state?
I like to keep the object in memory, not serialize/deserialize it to/from disk. No cookies either.
Edit:
I'm aware of the Flask.g object but since this is on per request basis this is not a valid solution.
Perhaps it is possible to use some kind of cache layer, e.g.:
from werkzeug.contrib.cache import SimpleCache
cache = SimpleCache()
Is this a valid solution? Does this layer live across multiple app instances?
You're looking for sessions.
You said you don't want to use cookies, but did you mean you didn't want to store the data as a cookie or are you avoiding cookies entirely? For the former case, take a look at server side sessions, e.g. Flask-KVSession
Instead of storing data on the client, only a securely generated ID is stored on the client, while the actual session data resides on the server.
I'm using pyramid web framework. I was confused by the relationship between the cookie and session. After looked up in wikipedia, did I know that session is an abstract concept and cookie may just be an kind of approach (on the client side).
So, my question is, what's the most common implementation (on both the client and server)? Can somebody give some example (maybe just description) codes? (I wouldn't like to use the provided session support inside the pyramid in order to learn)
The most common implementation of sessions is to use a cookie.
A cookie provides a way to store an arbitrary piece of text, which is usually used as a session identifier. When the cookie gets sent along with a HTTP request, the server (technically the code running on it) can use the cookie text (if it exists) to recognise that it has seen a client before. Text in a cookie usually provides enough information to retrieve extra information from the database about this client.
For example, a very naive implementation might store the primary key to the shopping_cart table in a database, so that when the server receives the cookie text it can directly use it to access the appropriate shopping cart for that particular client.
(And it's a naive approach because a user can do something like change their own cookie to a different primary key and access someone else's cart that way. Choosing a proper session id isn't as simple as it seems, which is why it's almost always better to use an existing implementation of sessions.)
An alternate approach is to store a session identifier is to use a GET parameter in the url (for example, in something like http://example.com/some/page?sid=4s6da4sdasd48, then the sid GET param serves the same function as the cookie string). In this approach, all links to other pages on the site have the GET param appended to them.
In general, the cookie stored with the client is just a long, hard-to-guess hash code string that can be used as a key into a database. On the server side, you have a table mapping those session hashes to primary keys (a session hash should never be a primary key) and expiration timestamps.
So when you get a request, first thing you do is look for the cookie. If there isn't one, create a session entry (cookie + expiration timestamp) in the database table. If there is one, look it up and make sure it hasn't expired; if it has, make a new one. In either case, if you made a new cookie, you might want to pass that fact down to later code so it knows if it needs to ask for a login or something. If you didn't need to make a new cookie, reset the expiration timestamp so you don't expire the session too soon.
While handling the view code and generating a response, you can use that session primary key to index into other tables that have data associated with the session. Finally, in the response sent back to the client, set the cookie to the session key hash.
If someone has cookies disabled, then their session cookie will always be new, and any session-based features won't work.
A session is (usually) a cookie that has a unique value. This value maps to a value in a database or held in memory that then tells you what session to load. PHP has an alternate method where it appends a unique value to the end of every URL (if you've ever seen PHPSESSID in a URL you now know why) but that has security implications (in theory).
Of course, since cookies are sent back and forth with every request unless you're talking over HTTPS you are sending the only way to know (reliably) that the client you are talking to now is the same one you logged in ten seconds ago to anyone on the same wireless network. See programs like Firesheep for reasons why switching to HTTPS is a good idea.
Finally, if you do want to build your own I, was given some advice on the matter by a university professor. Give out a new token on every page load and invalidate all a users tokens if an invalid token is used. This just means that if an attacker does get a token and logs in to it whilst it is still valid when the victim clicks a link both parties get logged out.
I have a webapp with some functionality that I'd like to be made accessible via an API or webservice. My problem is that I want to control where my API can be accessed from, that is, I only want the apps that I create or approve to have access to my API. The API would be a web-based REST service. My users do not login, so there is no authentication of the user. The most likely use case, and the one to work with now, is that the app will be an iOS app. The API will be coded with django/python.
Given that it is not possible to view the source-code of an iOS app (I think, correct me if I'm wrong), my initial thinking is that I could just have some secret key that is passed in as a parameter to the API. However, anyone listening in on the connection would be able to see this key and just use it from anywhere else in the world.
My next though is that I could add a prior step. Before the app gets to use API it must pass a challenge. On first request, my API will create a random phrase and encrypt it with some secret key (RSA?). The original, unencrypted phrase will be sent to the app, which must also encrypt the phrase with the same secret key and send back the encrypted text with their request. If the encryptions match up, the app gets access but if not they don't.
My question is: Does this sound like a good methodology and, if so, are there any existing libraries out there that can do these types of things? I'll be working in python server-side and objective-c client side for now.
The easiest solution would be IP whitelisting if you expect the API consumer to be requesting from the same IP all the time.
If you want to support the ability to 'authenticate' from anywhere, then you're on the right track; it would be a lot easier to share an encryption method and then requesting users send a request with an encrypted api consumer handle / password / request date. Your server decodes the encrypted value, checks the handle / password against a whitelist you control, and then verifies that the request date is within some timeframe that is valid; aka, if the request date wasnt within 1 minute ago, deny the request (that way, someone intercepts the encrypted value, it's only valid for 1 minute). The encrypted value keeps changing because the request time is changing, so the key for authentication keeps changing.
That's my take anyways.
In addition to Tejs' answer, one known way is to bind the Product ID of the OS (or another unique ID of the client machine) with a specific password that is known to the user, but not stored in the application, and use those to encrypt/decrypt messages. So for example, when you get the unique no. of the machine from the user, you supply him with password, such that they complete each other to create a seed X for RC4 for example and use it for encryption / decryption. this seed X is known to the server as well, and it also use it for encryption / decryption. I won't tell you this is the best way of course, but assuming you trust the end-user (but not necessarily any one who has access to this computer), it seems sufficient to me.
Also, a good python library for cryptography is pycrypto
On first request, my API will create a random phrase and encrypt it with some secret key (RSA?)
Read up on http://en.wikipedia.org/wiki/Digital_signature to see the whole story behind this kind of handshake.
Then read up on
http://en.wikipedia.org/wiki/Lamport_signature
And it's cousin
http://en.wikipedia.org/wiki/Hash_tree
The idea is that a signature can be used once. Compromise of the signature in your iOS code doesn't matter since it's a one-use-only key.
If you use a hash tree, you can get a number of valid signatures by building a hash tree over the iOS binary file itself. The server and the iOS app both have access to the same
file being used to generate the signatures.