A client wants to ensure that I cannot read sensitive data from their site, which will still be administered by me. In practice, this means that I'll have database access, but it can't be possible for me to read the contents of certain Model Fields. Is there any way to make the data inaccessible to me, but still decrypted by the server to be browsed by the client?
This is possible with public key encryption. I have done something similar before in PHP but the idea is the same for a Django app:
All data on this website was stored encrypted using a public key held by the system software. The corresponding private key, needed to decrypt the data, was held only by the client in a text file.
When the client wanted to access their data, they pasted the private key into an authorisation form (the key was held in the session), which unlocked the data.
When done, they deauthorised their session.
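Here is a minimal sketch of that scheme in Python, assuming the third-party cryptography package is installed; in a real app you would typically encrypt a per-record symmetric key with RSA rather than the field itself, and the key handling shown here is illustrative only:

    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.hazmat.primitives.asymmetric import rsa, padding

    OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    # One-time setup: the client keeps private_pem, the app keeps the public key.
    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    private_pem = private_key.private_bytes(
        serialization.Encoding.PEM,
        serialization.PrivateFormat.PKCS8,
        serialization.NoEncryption())
    public_key = private_key.public_key()

    # The app can always encrypt on write...
    ciphertext = public_key.encrypt(b"sensitive field value", OAEP)

    # ...but decrypting requires the key the client pastes into their session.
    client_key = serialization.load_pem_private_key(private_pem, password=None)
    assert client_key.decrypt(ciphertext, OAEP) == b"sensitive field value"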
This protected the information against unauthorised access to the web app (so it is safe against weak usernames/passwords) and also against leaks at the database level.
This is still not completely secure: anyone with root access to the machine can capture the key as it is uploaded, or inspect the session data. The cure for that would be to run the reading software on the client's machine and access the database through an API.
I realise this is an old question but I thought I'd clarify that it is indeed possible.
No, it's not possible for data to be simultaneously in a form you can't decrypt and in a form the server can decrypt to show to the client. The best you can do is reversible encryption of the content, so that at least if your server is compromised their data is safe.
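For the reversible-encryption part, a minimal sketch using the Fernet recipe from the third-party cryptography package (the key handling is illustrative; in practice the key must live somewhere other than the database it protects):

    from cryptography.fernet import Fernet

    key = Fernet.generate_key()        # store this outside the database
    f = Fernet(key)
    token = f.encrypt(b"client's sensitive value")   # safe to store in the DB
    assert f.decrypt(token) == b"client's sensitive value"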
Take a look at Django-fields
You might find Django Encrypted Fields useful.
You and your client could agree on the values being obscured. A simple XOR operation or something similar will make the values unreadable in the admin, and they can be decoded just before they are needed on the site.
This way you can safely administer the site without "accidentally" reading something.
Make sure your client understands that it is technically possible for you to get the actual contents but that it would require active effort.
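As an illustration only (this is obfuscation, not encryption), a minimal XOR scheme might look like this; the key and field values are made up:

    import base64

    KEY = b"not-a-secret"  # shared obfuscation key; illustrative only

    def obscure(text):
        data = text.encode("utf-8")
        xored = bytes(b ^ KEY[i % len(KEY)] for i, b in enumerate(data))
        return base64.b64encode(xored).decode("ascii")

    def reveal(encoded):
        xored = base64.b64decode(encoded)
        data = bytes(b ^ KEY[i % len(KEY)] for i, b in enumerate(xored))
        return data.decode("utf-8")

    # XOR is its own inverse, so a second pass restores the original
    assert reveal(obscure("secret note")) == "secret note"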
Some other issues to consider: the web application will not be able to sort or easily query on the encrypted fields. It would be helpful to know what administrative functions the client wants you to have. Another approach would be a separate app / access channel that hides the critical data but still lets you perform your admin functions.
So, in order to avoid the "no one best answer" problem, I'm going to ask, not for the best way, but the standard or most common way to handle sessions when using the Tornado framework. That is, if we're not using 3rd party authentication (OAuth, etc.), but rather want to have our own Users table with secure cookies in the browser but most of the session info stored on the server, what is the most common way of doing this? I have seen some people using Redis, some using their normal database (MySQL or Postgres or whatever), and some using memcached.
The application I'm working on won't have millions of users at a time, or probably even thousands. It will need to eventually get some moderately complex authorization scheme, though. What I'm looking for is to make sure we don't do something "weird" that goes down a different path than the general Tornado community, since authentication and authorization, while it is something we need, isn't something that is at the core of our product and so isn't where we should be differentiating ourselves. So, we're looking for what most people (who use Tornado) are doing in this respect, hence I think it's a question with (in theory) an objectively true answer.
The ideal answer would point to example code, of course.
Here's how it seems other micro frameworks handle sessions (CherryPy, Flask for example):
Create a table holding session_id and whatever other fields you'll want to track on a per session basis. Some frameworks will allow you to just store this info in a file on a per user basis, or will just store things directly in memory. If your application is small enough, you may consider those options as well, but a database should be simpler to implement on your own.
When a request is received (RequestHandler initialize() function I think?) and there is no session_id cookie, set a secure session-id using a random generator. I don't have much experience with Tornado, but it looks like setting a secure cookie should be useful for this. Store that session_id and associated info in your session table. Note that EVERY user will have a session, even those not logged in. When a user logs in, you'll want to attach their status as logged in (and their username/user_id, etc) to their session.
In your RequestHandler initialize function, if there is a session_id cookie, read in whatever session info you need from the DB and perhaps create your own Session object to populate and store as a member variable of that request handler.
Keep in mind that sessions should expire after a certain amount of inactivity, so you'll want to check for that as well. If you want a "remember me" type of login, you'll have to use a secure cookie to signal that (read up on this at OWASP to make sure it's as secure as possible, though again it looks like Tornado's secure cookies might help with that), and upon receiving a timed-out session you can re-authenticate the user by creating a new session and transferring the associated info into it from the old one.
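A minimal sketch of the steps above, using Tornado's prepare() hook (which runs before each request, similar in role to the initialize() idea mentioned) and an in-memory dict standing in for the session table; all names here are illustrative:

    import uuid
    import tornado.ioloop
    import tornado.web

    SESSIONS = {}  # in-memory stand-in; use MySQL/Redis/memcached in practice

    class BaseHandler(tornado.web.RequestHandler):
        def prepare(self):
            sid = self.get_secure_cookie("session_id")
            sid = sid.decode("utf-8") if sid else None
            if sid is None or sid not in SESSIONS:
                sid = uuid.uuid4().hex            # random session id
                self.set_secure_cookie("session_id", sid)
                SESSIONS[sid] = {}                # every visitor gets a session
            self.session = SESSIONS[sid]

    class MainHandler(BaseHandler):
        def get(self):
            self.session["hits"] = self.session.get("hits", 0) + 1
            self.write("hits this session: %d" % self.session["hits"])

    app = tornado.web.Application([(r"/", MainHandler)],
                                  cookie_secret="replace-with-a-long-random-value")

    if __name__ == "__main__":
        app.listen(8888)
        tornado.ioloop.IOLoop.current().start()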
Tornado is designed to be stateless and doesn't have session support out of the box.
Use secure cookies to store sensitive information like user_id.
Use standard cookies to store not critical information.
For storing large objects, use the standard scheme: MySQL + memcached.
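For example, the secure/standard cookie split might look like this inside a handler (the handler and argument names are made up):

    import tornado.web

    class LoginHandler(tornado.web.RequestHandler):
        def post(self):
            user_id = self.get_argument("user_id")
            # signed with the app's cookie_secret, so the client can't forge it
            self.set_secure_cookie("user_id", user_id)
            # a plain cookie is fine for non-critical preferences
            self.set_cookie("language", "en")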
The key issue with sessions is not where to store them but how to expire them intelligently. Regardless of where sessions are stored, as long as the number of stored sessions is reasonable (i.e. only active sessions plus some surplus), all this data will fit in RAM and be served fast. If there is a lot of old junk you can expect unpredictable delays (the need to hit the disk to load the session).
There isn't anything built directly into Tornado for this purpose. As others have commented already, Tornado is designed to be a very fast async framework. It is lean by design. However, it is possible to hook in your own session management capability. You need to add a preamble section to each handler that would create or grab a session container. You will need to store the session ID in a cookie. If you are not strictly HTTPS then you will want to use a secure cookie. The session persistence can be any technology of your choosing such as Redis, Postgres, MySQL, a file store, etc...
There is a GitHub project that provides session management for Tornado. Even if you decide not to use it, it can provide insight into how to structure your own session management. The GitHub project is called dustdevil. Full disclosure: we created this several years ago, but we find it very easy to use and have it in active use today.
In my case I'm using the Dropbox API. Currently I'm storing the key and secret in a JSON file, just so that I can gitignore it and keep it out of the Github repo, but obviously that's no better than having it in the code from a security standpoint. There have been lots of questions about protecting/obfuscating Python before (usually for commercial reasons) and the answer is always "Don't, Python's not meant for that."
Thus, I'm not looking for a way of protecting the code but just a solution that will let me distribute my app without disclosing my API details.
Plain text. Any obfuscation attempt is futile if the code gets distributed.
Don't know if this is feasible in your case, but you can access the API via a proxy that you host.
The requests from the Python app go to the proxy, and the proxy makes the requests to the Dropbox API and returns the response to the Python app. This way your API key stays on the proxy that you host. Access to the proxy can be controlled by any means you prefer (for example, a username and password).
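A minimal sketch of that proxy, assuming Flask and the requests package; the endpoint, environment variables, and shared-password check are all illustrative, not a finished design:

    import os
    import requests
    from flask import Flask, abort, request

    app = Flask(__name__)
    API_TOKEN = os.environ["DROPBOX_TOKEN"]    # lives only on the proxy host

    @app.route("/proxy/list_folder", methods=["POST"])
    def list_folder():
        # naive shared-secret check; use real per-user auth in practice
        if request.headers.get("X-App-Password") != os.environ["PROXY_PASSWORD"]:
            abort(401)
        resp = requests.post(
            "https://api.dropboxapi.com/2/files/list_folder",
            headers={"Authorization": "Bearer " + API_TOKEN},
            json={"path": request.json.get("path", "")})
        return resp.text, resp.status_code, {"Content-Type": "application/json"}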
There are two ways depending on your scenario:
If you are developing a web application for end users, just host it in a way that your API key is never disclosed. Keeping it gitignored in a separate file and only uploading it to your server should be fine (as long as there is no breach of your server); see the sketch after this list. Any obfuscation will not add any practical benefit; it will just give a false feeling of security.
If you are developing a framework/library for developers or a client application for end users, ask them to generate an API key on their own.
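For the first case, loading the key from the environment with a gitignored file as a fallback might look like this; the file and variable names are made up:

    import json
    import os

    def load_api_key(path="dropbox_keys.json"):
        # prefer an environment variable so nothing secret lives in the repo
        key = os.environ.get("DROPBOX_APP_KEY")
        if key:
            return key
        with open(path) as fh:      # gitignored JSON file as a fallback
            return json.load(fh)["app_key"]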
What's the best way to protect a symmetric key that needs to be used in code within Google Appengine?
Our application uses Python 2.7
EDIT: we have some database fields that we want protected; they need to be accessed in the code, but there is no reason to leave them in the database in plain text. Obviously I'd like to make it as hard as possible to retrieve the key (understanding that it is never impossible).
There is no way to absolutely protect a key if you don't trust the environment that the code is running in. You could store (part of) the key in a trusted location and only accept queries for the key from the domain/IP of your app. But then it would still be in that appengine instance's memory.
The best solution for outgoing messages is to use public-key crypto. Let your code use the public key of the remote party, since public keys don't have to be kept secret. The message can then only be decrypted with the remote party's private key.
If you can't trust the appengine's environment, you can't decrypt incoming public-key messages because that would require your secret key to be available to the application.
Edit: Since you've added that you want to protect some database fields, have you thought about hashing them?
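If hashing is enough (i.e. you only ever need to check values, never read them back), something like this works; note that hashlib.pbkdf2_hmac requires Python 2.7.8+ or 3.4+, and the iteration count here is illustrative:

    import hashlib
    import os

    def hash_value(value, salt=None):
        # one-way: you can verify a candidate against the stored digest,
        # but neither you nor an attacker can recover the original text
        salt = salt if salt is not None else os.urandom(16)
        digest = hashlib.pbkdf2_hmac("sha256", value.encode("utf-8"),
                                     salt, 100000)
        return salt, digest

    salt, stored = hash_value("4111-1111-1111-1111")
    assert hash_value("4111-1111-1111-1111", salt)[1] == stored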
I have a program that I wrote in Python that collects data. I want to be able to store the data on the internet somewhere and allow another user to access it from another computer somewhere else, anywhere in the world with an internet connection. My original idea was to use an e-mail client, such as Gmail, to store the data by sending pickled strings to the address. This would allow anyone to access the address and simply read the newest e-mail to get the data. It worked perfectly, but the program requires a new e-mail to be sent every 5-30 seconds. So the method fell through because of the limit Gmail places on e-mails, among other reasons, such as being unable to completely delete old e-mails.
Now I want to try a different idea, but I do not know very much about network programming with Python. I want to set up a webpage with essentially nothing on it. The "master" program, the program actually collecting the data, will send a pickled string to the webpage. Then any of the "remote" programs will be able to read the string. I will also need the master program to delete old strings as it updates the webpage. It would be preferable to store multiple strings, so there is no chance of the master updating while a remote is reading.
I do not know if this is a feasible task in python, but any and all ideas are welcome. Also, if you have an ideas on how to do this a different way, I am all ears, well eyes in this case.
I would suggest taking a look at setting up a simple site on Google App Engine. It's free, and you can use Python to build the site. Then it would just be a matter of creating a simple RESTful service that you could send a POST to with your pickled data and store it in a database. Then just create a simple web front end onto the database.
Another option in addition to what Casey already provided:
Set up a remote MySQL database somewhere that has user access levels allowing remote connections. Your Python program could then simply access the database and INSERT the data you're trying to store centrally (e.g. through the MySQLdb or pyodbc package). Your users could then either read the data through a client that supports MySQL, or you could write a simple front end in Python or PHP that displays the data from the database.
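The INSERT side might look like the following sketch, assuming the MySQLdb package and a hypothetical readings table (created with something like CREATE TABLE readings (ts DATETIME, payload TEXT)); the connection details are placeholders:

    import json
    import MySQLdb

    conn = MySQLdb.connect(host="db.example.com", user="collector",
                           passwd="change-me", db="telemetry")

    def store_reading(data):
        # JSON instead of pickle: safe for untrusted readers
        # (see the warning about pickle below)
        cur = conn.cursor()
        cur.execute("INSERT INTO readings (ts, payload) VALUES (NOW(), %s)",
                    (json.dumps(data),))
        conn.commit()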
Adding this as an answer so that OP will be more likely to see it...
Make sure you consider security! If you just blindly accept pickled data, it can open you up to arbitrary code execution.
I suggest you use good middleware like ZeroC Ice, Pyro4, or Twisted.
Note that Pyro4 uses pickle to serialize data.
For my final year project I plan to code a cloud in Python. The client will be written in Java by the other member of my team. The client will have a tabbed interface and it will provide a text editor, a media player, a couple of small Java based games and a maybe a few more services.
The server will work like this:
1) Validate the user.
2) Send a file, called "dump" to the user. Dump will contain all the file names and file types that the user created by himself or the files which the user can read/write. This info will be fetched from the database.
3) The tabs in the client will display the file types associated with the tab's application, e.g. the media tab will only select and show the media files from the dump readable by the user. The text editor tab will show only the txt files from the dump readable by the user.
4) A request to open the file will send the file back to client, which the associated application will open.
5) All the changes made to the files and all the actions (overwriting, saving, deleting etc.) will be sent back to the server along with the new object. Something similar will be done to the newly created objects.
My Questions are:
What are the best approaches for the communication between the client and the server. For the dump I plan to use some sort of encrypted XML file. For the other way round, I don't have a clue :/.
For easy integration with the database, I was planning to use Django (which I started few days back). How can I send my requests from the client to the server (without Django I'd use SQL queries) and the files from the server to the client? Maybe GET and POST will work for the former problem? Any other suggestions?
Q1: how should I transfer data between client/server securely
A: HTTPS for encryption and JSON to serialise objects between languages (Python/Java) seems the most natural. You could experiment with XML-RPC over SSL or TLS if you want to be creative.
Q2: How do I send queries to the server's db?
A: My first response is to say: talk to the person coding the server and see what's easiest on that end. However, I think your client should stick to HTTP. The server developer would ensure the server supports RESTful URIs. Then your client need only access a URI and have the results processed by the server.
At its most raw, this could be implemented like this:
https://www.example.com/db?q="SELECT * FROM docs"
There are smarter (and safer; exposing raw SQL like this invites injection) ways to do it, but you get the idea.
If you're going to use a web framework on the server, it makes sense to use an HTTP-based protocol. The downside is that only the client can initiate a connection (e.g., the client needs to first ask for the "dump" file), but a simple GET request will suffice (remember, the server can send anything in the HTTP response, including your XML file).
Regarding encryption, it's best to use an existing protocol like HTTPS. There are well-vetted libraries that will correctly establish a secure connection between your client and the server.
Overall, I'm advocating the highest-level protocols that are appropriate for your application. HTTP(S) goes hand-in-hand with your web-based architecture, so make use of it.
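For concreteness, the client side of that flow could be as small as this (shown in Python for brevity; the Java client would do the equivalent with HttpsURLConnection). The /dump endpoint and JSON payload are assumptions, not a fixed API:

    import json
    import urllib.request

    def fetch_dump(base_url="https://cloud.example.com"):
        # one HTTPS GET; the server answers with the file listing
        with urllib.request.urlopen(base_url + "/dump") as resp:
            return json.loads(resp.read())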
Stick to Django. It's really productive. I would use JSON instead of XML; it's more convenient (import json). This should help you with client-server communication.
Also, "cloud computing" is just a recent word thrown around for client + server + some services. By the way, everything you want to do can be done completely in Django itself; no need to go to Java.
Django is Cool :)
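As an example of how little code the server side needs, a Django view serving the "dump" as JSON might look like this; the UserFile model and its fields are hypothetical:

    import json
    from django.http import HttpResponse
    from myapp.models import UserFile  # hypothetical model

    def dump(request):
        files = [{"name": f.name, "type": f.file_type}
                 for f in UserFile.objects.filter(owner=request.user)]
        return HttpResponse(json.dumps(files), content_type="application/json")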