Securing/sanitizing remote calls to server by untrusted clients

I'm building an API which will expose (among other things) the following calls:
Upload a file to the remote server.
Perform various computations (drawn from some set of possible functions) on the remotely uploaded file.
I'm trying to do this in Python. What are the best practices when the client is untrusted, meaning that they can upload arbitrarily crafted files?
What's the standard procedure nowadays? RPC, REST, something else?
I do not need to worry about authentication and/or encryption, requests can be anonymous and in the clear. MITM is not a concern either.

You should treat any client as untrusted, so your case calls for the general approach described in OWASP ASVS (V16: files and resources verification requirements). REST is fine for this purpose.
The main points are:
store files outside of the webroot (so that they can't be served by the static file server)
avoid setting the execute bit on uploaded files (on Linux)
if possible, limit file types to known-good ones (e.g. validate against a whitelist; validate file types by extension AND by file signature)
check that files have an acceptable size before accepting the request and reading the file into memory (you can check the HTTP Content-Length header and filter on it before the request reaches the app)
if possible, scan files with a server-side antivirus
if files are served back to a user, ensure that the appropriate headers (Content-Type, X-Content-Type-Options: nosniff) are set. If they are not, some XSS scenarios are possible
verify that filenames are sanitized so they can't trick your program into serving other files (e.g. a filename of "../../../../../../etc/passwd" could end up serving the actual /etc/passwd file). Reject the request if the filename contains ../ or / sequences.
never build paths by concatenating a folder path with a user-supplied filename, because that leads to the same issue
if computations are run by calling the command line, beware of command-line injection (this issue and the 2 previous ones can be solved by specifying a filename format to users, e.g. accept only alphanumeric names without spaces or special characters, and reject any request that doesn't fit the pattern)
if you can, limit the number of requests per IP
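A few of the points above (filename pattern, size check, server-chosen storage names) could look roughly like this in Python. This is only a sketch: the pattern, the 10 MB limit, and the function names are assumptions made for illustration, not values taken from ASVS.

```python
import os
import re
import uuid

# Assumed policy: alphanumeric base name plus one optional alphanumeric extension.
SAFE_NAME = re.compile(r"[A-Za-z0-9]{1,64}(\.[A-Za-z0-9]{1,8})?")
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed limit, tune for your service


def validate_client_filename(filename):
    """Return the filename unchanged if it fits the pattern, else raise ValueError."""
    if not SAFE_NAME.fullmatch(filename):
        raise ValueError("filename does not match the allowed pattern")
    return filename


def size_is_acceptable(content_length_header):
    """Check a Content-Length header value before reading the request body."""
    try:
        length = int(content_length_header)
    except (TypeError, ValueError):
        return False
    return 0 < length <= MAX_UPLOAD_BYTES


def new_storage_path(upload_dir):
    """Build a storage path from a server-generated name, never from client input."""
    return os.path.join(os.path.realpath(upload_dir), uuid.uuid4().hex)
```

The client-supplied name is only validated and kept as metadata; the path on disk comes from a random identifier, so traversal sequences never reach the filesystem or a command line.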

Related

Is there a way we can find the remaining space in FTP server using ftplib?

I know we can use the size method from the FTP class to find the size of the files in the FTP server. But here I want to find the total remaining space in the server after files are uploaded so that we could guess how many more files we can upload. Is there a way to do this?

This is not really a Python/ftplib question. It's more about what information your FTP server provides.
Some servers support commands like AVBL (for example Serv-U) or XQUOTA (for example WS_FTP).
Microsoft IIS can be configured to add free disk space to the output of the LIST command (see "Is it possible to determine the amount of free space on a remote FTP server without using scripts?").
All of these can be used from ftplib. But in general, most FTP servers do not provide any of them.
See also "How to check free space in a FTP Server?"
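For a server that does support AVBL, sending the command from ftplib might look like the sketch below. The function name is made up for illustration, and the reply format (a status code followed by a byte count) is an assumption; check what your server actually returns before relying on the parsing.

```python
import ftplib
import re


def free_space_bytes(ftp):
    """Ask the server for free space via AVBL; return an int, or None if unavailable."""
    try:
        reply = ftp.sendcmd("AVBL")
    except ftplib.error_perm:
        # 5xx reply: the server does not know or does not allow the command.
        return None
    # Assumption: the reply looks like "<3-digit code> <bytes>"; formats vary by server.
    match = re.match(r"\d{3}[ -](\d+)", reply)
    return int(match.group(1)) if match else None
```

Here ftp is expected to be a connected and logged-in ftplib.FTP instance. A None result means the question cannot be answered over FTP for that server.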

Python/Twisted-- Render Paramiko SFTPFile as if it were twisted.web.static.File

Before I pose the question, some background: I'm creating a web management tool that, among other things, allows the user to download, tail, email, and move files between predefined directories via the management panel. Many of these directories are local to the server, but some are actually located on remote hosts and accessed via SSH--however, this is transparent to the user. I've used Twisted to create a pseudo-REST API for the client to access, but since I want to avoid revealing actual server paths to the client, it requests downloads of files using a POST with an arbitrary ID to the API, as such: "http://XXXX:8880/api/transfer/download"
with POST params similar to this: {"srckey":"5","srcfile":"solar2-windows-1.10.zip"}. The idea being the client only knows the key of the directory and filename.
Pardon the excessive background--I'm hoping it will make my question more clear: The issue I have is I'm trying to allow users to download a copy of a file from one of the "remote" hosts via the management server that hosts the web panel, all without caching the file locally. I've used Twisted's File() object to stream large static files before, but since the file resides on another server, I'm trying to accomplish the same using a file object provided by Paramiko's "open()" method.
I've tried setting up a consumer/producer system similar to that used in the render methods of twisted.web.static.File, plugging in the file pointer provided by Paramiko in the appropriate places, but only the smallest text files transfer successfully--all other cases cause Paramiko to throw this error:
socket.error: Socket is closed
The contents of the relevant python files are here:
serve-project.py: http://pastebin.com/YcjsQHu3
WrapSSH.py: http://pastebin.com/XaKXJwxb
In a nutshell, I'm trying to stream the data from a Paramiko SFTPFile to an HTTP client. I suspect that my approach is majorly faulty, due to my minimal familiarity with Twisted. Anyone have suggestions on a more intelligent way to accomplish this?

Fetching images from URL and saving on server and/or Table (ImageField)

I'm not seeing much documentation on this. I'm trying to get an image onto the server from a URL. Ideally I'd like to keep things simple, but I'm in two minds as to whether using an ImageField is the best way or whether it is simpler to store the file on the server and display it as a static file. I'm not uploading any files, so I need to fetch them in. Can anyone suggest any decent code examples before I try to re-invent the wheel?
Given a URL, say http://www.xyx.com/image.jpg, I'd like to download that image to the server and put it into a suitable location after renaming it. My question is general, as I'm looking for examples of what people have already done. So far I just see examples relating to uploading images, but that doesn't apply. This should be a simple case and I'm looking for a canonical example that might help.
This is for uploading an image from the user: Django: Image Upload to the Server
So are there any examples out there that just deal with the process of fetching an image and storing it on the server and/or in an ImageField?

Well, just fetching an image and storing it into a file is straightforward:
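import urllib2  # Python 2; on Python 3 use urllib.request instead

# Open in binary mode so the image bytes are written unchanged
with open('/path/to/storage/' + make_a_unique_name(), 'wb') as f:
    f.write(urllib2.urlopen(your_url).read())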
Then you need to configure your Web server to serve files from that directory.
But this comes with security risks.
A malicious user could come along and type a URL that points nowhere. Or that points to their own evil server, which accepts your connection but never responds. This would be a typical denial of service attack.
A naive fix could be:
urllib2.urlopen(your_url, timeout=5)
But then the adversary could build a server that accepts a connection and writes out a line every second indefinitely, never stopping. The timeout doesn’t cover that.
So a proper solution is to run a task queue, also with timeouts, and a carefully chosen number of workers, all strictly independent of your Web-facing processes.
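Inside such a worker, the slow-drip case can also be bounded by reading in small steps and checking both a size cap and an overall deadline. The following is a minimal Python 3 sketch under those assumptions; the function name and limits are illustrative, and it complements rather than replaces the separate task queue.

```python
import io
import time


def read_limited(stream, max_bytes, max_seconds, chunk_size=8192):
    """Read a binary stream, aborting on too much data or too much total time."""
    deadline = time.monotonic() + max_seconds
    chunks, total = [], 0
    while True:
        if time.monotonic() > deadline:
            raise TimeoutError("download exceeded the overall time budget")
        # read1() returns after at most one underlying read, so a server that
        # trickles data cannot keep a single call blocked until the chunk fills.
        chunk = stream.read1(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise ValueError("download exceeded the size limit")
        chunks.append(chunk)
    return b"".join(chunks)


# Possible use with the standard library (not executed here):
#   with urllib.request.urlopen(your_url, timeout=5) as response:
#       data = read_limited(response, 5 * 1024 * 1024, 30)
```

The per-socket timeout still limits each individual read, while the deadline limits the whole transfer, which is the part the plain timeout argument does not cover.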
Another kind of attack is to point your server at something private. Suppose, for the sake of example, that you have an internal admin site that is running on port 8000, and it is not accessible to the outside world, but it is accessible to your own processes. Then I could type http://localhost:8000/path/to/secret/stats.png and see all your valuable secret graphs, or even modify something. This is known as server-side request forgery or SSRF, and it’s not trivial to defend against. You can try parsing the URL and checking the hostname against a blacklist, or explicitly resolving the hostname and making sure it doesn’t point to any of your machines or networks (including 127.0.0.0/8).
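The "resolve and check" idea could be sketched in Python 3 as follows. The function names are invented for this example, and the check is deliberately conservative: it rejects a URL if any resolved address is private, loopback, link-local, multicast, reserved, or unspecified. It does not protect against the hostname resolving differently when the real request is made, nor against redirects, so treat it as a first filter only.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def ip_is_public(ip_text):
    """Return True if the address is not in a private or otherwise special range."""
    ip = ipaddress.ip_address(ip_text.split("%")[0])  # drop any IPv6 zone id
    return not (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_multicast or ip.is_reserved or ip.is_unspecified)


def url_looks_external(url):
    """Return True only for http(s) URLs whose host resolves solely to public IPs."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parts.hostname, None)
    except socket.gaierror:
        return False
    # Each entry's sockaddr starts with the resolved address as a string.
    return bool(infos) and all(ip_is_public(info[4][0]) for info in infos)
```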
Then of course, there is the problem of validating that the file you receive is actually an image, not an HTML file or a Windows executable. But this is common to the upload scenario as well.
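A first-pass check on the received bytes can compare the start of the data against well-known file signatures. The sketch below covers only PNG, JPEG, and GIF, and the names are illustrative; a matching signature says nothing about the rest of the file, so fully decoding the image with an imaging library is a stronger test.

```python
# Leading bytes ("magic numbers") of a few common image formats.
IMAGE_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def sniff_image_type(data):
    """Return 'png', 'jpeg' or 'gif' based on the leading bytes, else None."""
    for magic, kind in IMAGE_SIGNATURES.items():
        if data.startswith(magic):
            return kind
    return None
```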

Security issues storing config file in json/CPickle

I'm trying to figure out the most secure and flexible way to store database connection credentials and other private information in a config file.
This is inside a Python module that logs the history of user activity in the system to different handlers (MongoDB, MySQLdb, files, etc.).
The logging module is attached to a handler, and that is where I need to load the config file for each handler, i.e. database, user, password, table, etc.
After some research on the web and Stack Overflow, I mainly found comparisons of the security risks of JSON and cPickle, which focus on the eval method and type restrictions rather than on the config file storage issue.
I was wondering if storing credentials in JSON is a good idea, given the security risks involved in having a .json config file on the server (from which the logging handler will read the data). I know that this .json file could be retrieved by an HTTP request. If the parameters are stored in a Python object inside .py code, I guess there is more security, because any request for this file will be interpreted first by the server, but I am losing the flexibility of modularization and easy modification of this data.
What would you suggest for this kind of Security issues while storing this kind of config files in the server and accessed by some Python class?
Thanks in advance,
Luchux.

I'd think about encrypting the credentials file. The process that uses it will need a key/password to decrypt it, and you can store that somewhere else-- or even enter it interactively on server start-up. That way you don't have a single point of failure (though of course a determined intruder can eventually put the pieces together).
(Naturally you should also try to secure the server so that your credentials can't just be fetched by http request)
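On the last point, one small safeguard is to keep the JSON file outside the web root and have the loader refuse files that other local users can read. The sketch below assumes a POSIX system; the function name is hypothetical, and it does not implement the encryption suggested above.

```python
import json
import os
import stat


def load_credentials(path):
    """Load JSON credentials, refusing files accessible by group or others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError("credentials file must be accessible by its owner only")
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

The path itself can come from an environment variable or a command-line option, so the location of the secrets is not hard-coded in the module.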

Security around user uploaded files

I have a client's python website which runs a dropbox-like feature that allows uploading of files.
I want to make sure that uploading files does not open up the server to vulnerabilities.
So, I store all uploaded files as blobs in a postgres database and do not trust the file name and extension of the file, I let the application determine that for itself.
I ran into problems when trying to let the application decide the file format itself, so my question boils down to:
Is it necessary, for security, to limit what file formats are allowed to be uploaded?
If yes, how, if not using something like libmagic, can I determine the file format in the best way?
Are there other measures I need to take in order to remain safe when allowing publicly uploaded files?
Thanks.

The referenced "bug" question (which chains to this) doesn't refer to a bug; it says that some MS Office file types are, like Java jars, packaged and compressed as zipfiles. If you rename a .xlsx file to .zip, you can view the contents - I found 13 .xml files and a .bin printer settings file in a simple example.
For security you can't "trust" mime-type and file extension provided by the user, but you can in principle use them to validate that the contents are valid for the claimed file type. The first level of checking would ensure that the claimed Office files are in fact valid zipfiles, the second would check that the contents conforms to what is expected by the Office application. Not being an Office developer I don't know of a process to inspect a zip archive, determine which Office application it is for, and validate that the application can open it, but I'm sure it exists somewhere on MSDN.
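The first level of checking can be sketched with the standard zipfile module. The code below treats the upload as an Office Open XML package if it is a readable zip that contains [Content_Types].xml, and guesses the application from the conventional location of the main part. Those part names are an assumption about typical files, and the function name is made up; this does not validate that the application can actually open the file.

```python
import io
import zipfile

# Conventional locations of the main part in typical Office Open XML files.
OOXML_MAIN_PARTS = {
    "word/document.xml": "docx",
    "xl/workbook.xml": "xlsx",
    "ppt/presentation.xml": "pptx",
}


def guess_ooxml_kind(data):
    """Return 'docx', 'xlsx' or 'pptx' for a plausible OOXML package, else None."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return None
    try:
        with zipfile.ZipFile(buffer) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        return None
    if "[Content_Types].xml" not in names:
        return None
    for part, kind in OOXML_MAIN_PARTS.items():
        if part in names:
            return kind
    return None
```

The result can then be compared with the extension or MIME type the client claimed, rejecting the upload when they disagree.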
More fundamentally, what do you mean by "security is important to my application"? Security prevents unwanted events - you need to define what you want to prevent. Do you want users to only be able to upload files for whitelisted applications? Should they be prevented from uploading blacklisted file types (like .exe)? Is it OK with you if a user uploaded 10MB of random bits and called it a .xyz file?
