Security issues storing config files in JSON/cPickle - Python

I'm trying to figure out the most secure and flexible way to store credentials for database connections and other private information in a config file.
This is for a Python module that logs the history of user activity in the system to different handlers (MongoDB, MySQL, files, etc.).
The logging module is attached to a handler, and that is where I need to load the config file for each handler, i.e. database, user, password, table, etc.
After some research on the web and Stack Overflow, I mostly found comparisons of the security risks of JSON versus cPickle, but these concern the eval method and type restrictions rather than the config file storage issue.
I was wondering whether storing credentials in JSON is a good idea, given the security risks of keeping a .json config file on the server (from which the logging handler will read the data); I know this .json file could be retrieved by an HTTP request. If the parameters are stored in a Python object inside a .py file, I suppose there is more security, since any request for that file will first be interpreted by the server, but I lose the flexibility of modularization and easy modification of the data.
What would you suggest for this kind of security issue, when config files like this are stored on the server and accessed by a Python class?
Thanks in advance,
Luchux.

I'd think about encrypting the credentials file. The process that uses it will need a key/password to decrypt it, and you can store that somewhere else-- or even enter it interactively on server start-up. That way you don't have a single point of failure (though of course a determined intruder can eventually put the pieces together).
(Naturally, you should also try to secure the server so that your credentials can't simply be fetched with an HTTP request.)
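A minimal sketch of this idea, assuming the cryptography package (pip install cryptography); the file names are illustrative, and in practice the key would live somewhere else entirely or be entered interactively at start-up:

import json
from cryptography.fernet import Fernet

def encrypt_credentials(creds, key, path):
    # serialize the credentials to JSON and write them encrypted to disk
    token = Fernet(key).encrypt(json.dumps(creds).encode("utf-8"))
    with open(path, "wb") as f:
        f.write(token)

def load_credentials(key, path):
    # read the encrypted file and decrypt it back into a dict
    with open(path, "rb") as f:
        return json.loads(Fernet(key).decrypt(f.read()))

# one-time setup: generate a key and store it somewhere other than the server docroot
key = Fernet.generate_key()
encrypt_credentials({"user": "dbuser", "password": "s3cret"}, key, "creds.enc")
print(load_credentials(key, "creds.enc"))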

Related

How to make a large file accessible to external APIs?

I'm new to web development, and I have a use case where a user sends a large file (e.g., a video file) to the API, and this file then needs to be accessible to other APIs (possibly on different servers) for further processing.
I'm using FastAPI for the backend, defining a file parameter with a type of UploadFile to receive and store the files. But what would be the best way to make this file accessible to other APIs? Is there a way I can get a publicly accessible URL out of the saved file, which other APIs can use to download the file?
Returning a File Response
First, to return a file that is saved on disk from a FastAPI backend, you could use FileResponse (in case the file was already fully loaded into memory, see here). For example:
from fastapi import FastAPI
from fastapi.responses import FileResponse

some_file_path = "large-video-file.mp4"
app = FastAPI()

@app.get("/")
def main():
    return FileResponse(some_file_path)
In case the file is too large to fit into memory (you may not have enough memory to handle the file data; with 16 GB of RAM, for example, you can't load a 100 GB file), you could use StreamingResponse. That way, you don't have to read the whole file into memory first; instead, it is loaded in chunks, and the data is processed one chunk at a time. An example is given below. If you find yield from f rather slow when using StreamingResponse, you could instead create a custom generator, as described in this answer.
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

some_file_path = "large-video-file.mp4"
app = FastAPI()

@app.get("/")
def main():
    def iterfile():
        with open(some_file_path, mode="rb") as f:
            yield from f

    return StreamingResponse(iterfile(), media_type="video/mp4")
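For reference, a minimal sketch of such a chunked-read generator; the 1 MB chunk size below is an illustrative choice, not a recommendation:

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

CHUNK_SIZE = 1024 * 1024  # read 1 MB at a time instead of line by line
some_file_path = "large-video-file.mp4"
app = FastAPI()

@app.get("/")
def main():
    def iterfile():
        with open(some_file_path, mode="rb") as f:
            # read fixed-size chunks until EOF
            while chunk := f.read(CHUNK_SIZE):
                yield chunk

    return StreamingResponse(iterfile(), media_type="video/mp4")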
Exposing the API to the public
As for exposing your API to the public—i.e., external APIs, users, developers, etc.—you can use ngrok (or expose, as suggested in this answer).
Ngrok is a cross-platform application that enables developers to expose a local development server to the Internet with minimal effort. To embed the ngrok agent into your FastAPI application, you could use pyngrok—as suggested here (see here for a FastAPI integration example). If you would like to run and expose your FastAPI app through Google Colab (using ngrok), instead of your local machine, please have a look at this answer (plenty of tutorials/examples can also be found on the web).
If you are looking for a more permanent solution, you may want to have a look at cloud platforms—more specifically, a Platform as a Service (PaaS)—such as Heroku. I would strongly recommend you thoroughly read FastAPI's Deployment documentation. Have a closer look at About HTTPS and Deployments Concepts.
Important to note
By exposing your API to the outside world, you are also exposing it to various forms of attack. Before exposing your API to the public—even if it’s for free—you need to make sure you are offering secure access (use HTTPS), as well as authentication (verify the identity of a user) and authorisation (verify their access rights; in other words, verify what specific routes, files and data a user has access to)—take a look at 1. OAuth2 and JWT tokens, 2. OAuth2 scopes, 3. Role-Based Access Control (RBAC), 4. Get Current User and How to Implement Role based Access Control With FastAPI.
Additionally, if you expose your API for public use, you may want to limit its usage because of expensive computation, limited resources, DDoS attacks, brute-force attacks, web scraping, or simply the monthly cost of a fixed number of requests. You can do that at the application level using, for instance, slowapi (related post here), or at the platform level by setting a rate limit through your hosting service (if permitted).
Furthermore, you would need to make sure that the files uploaded by users have a permitted file extension, e.g., .mp4, and are not files with, for instance, a .exe extension that are potentially harmful to your system.
Finally, you would also need to ensure that uploaded files do not exceed a predefined MAX_FILE_SIZE limit (based on your needs and your system's resources), so that authenticated users, or an attacker, are prevented from uploading extremely large files that consume server resources to the point where the application may crash. You shouldn't rely on the Content-Length header being present in the request to do that, as it can easily be altered, or even removed, by the client. You should instead use an approach similar to this answer (have a look at the "Update" section), which uses request.stream() to process the incoming data in chunks as they arrive, rather than loading the entire file into memory first. Using a simple counter, e.g., total_len += len(chunk), you can check whether the file size has exceeded MAX_FILE_SIZE and, if so, raise an HTTPException with the HTTP_413_REQUEST_ENTITY_TOO_LARGE status code (see this answer as well for more details and code examples). A rough sketch is given below.
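A rough sketch of that size check; the endpoint path and the MAX_FILE_SIZE value are illustrative, and the linked answers have fuller versions:

from fastapi import FastAPI, HTTPException, Request, status

MAX_FILE_SIZE = 100 * 1024 * 1024  # 100 MB, chosen purely for illustration
app = FastAPI()

@app.post("/upload")
async def upload(request: Request):
    total_len = 0
    chunks = []
    # process the body in chunks as it arrives, instead of loading it all at once
    async for chunk in request.stream():
        total_len += len(chunk)
        if total_len > MAX_FILE_SIZE:
            raise HTTPException(
                status_code=status.HTTP_413_REQUEST_ENTITY_TOO_LARGE,
                detail="File exceeds the maximum allowed size.",
            )
        chunks.append(chunk)  # or write each chunk to disk instead
    return {"received_bytes": total_len}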
Read more on FastAPI's Security documentation and API Security on Cloudflare.

How to secure passwords and secret keys in Django settings file

A Django settings file includes sensitive information such as the SECRET_KEY, the password for database access, etc., which is unsafe to keep hard-coded in the settings file. I have come across various suggestions as to how this information can be stored more securely, including putting it into environment variables or separate configuration files. The bottom line seems to be that this protects the keys from version control (and adds convenience when working in different environments), but that on a compromised system this information can still be accessed by a hacker.
Is there any extra benefit from a security perspective if sensitive settings are kept in a data vault / password manager and then retrieved at run-time when settings are loaded?
For example, to include in the settings.py file (when using the pass password manager):
import subprocess

SECRET_KEY = subprocess.check_output("pass SECRET_KEY", shell=True).strip().decode("utf-8")
This spawns a new shell process and returns output to Django. Is this more secure than setting through environment variables?
I think a data vault/password manager solution is a matter of transferring responsibility, but the risk is still there. When deploying Django in production, the server should be treated as carefully as a data vault: firewall in place, fail2ban, OS kept up to date, and so on must all be in place. Then, in my opinion, there is nothing wrong or less secure about having a settings.py file with a config parser reading a config.ini file (declared in your .gitignore!) where all your sensitive information is kept.
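A minimal sketch of that arrangement; the file path and the section/key names are illustrative:

# settings.py (excerpt)
import configparser
from pathlib import Path

config = configparser.ConfigParser()
# config.ini sits next to settings.py and is listed in .gitignore
config.read(Path(__file__).resolve().parent / "config.ini")

SECRET_KEY = config["django"]["secret_key"]
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": config["database"]["name"],
        "USER": config["database"]["user"],
        "PASSWORD": config["database"]["password"],
        "HOST": config["database"]["host"],
        "PORT": config["database"]["port"],
    }
}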

Securing/sanitizing remote calls to server by untrusted clients

I'm building an API which will expose (among other things) the following calls:
Upload file to remote server.
Perform various computations (from some set of possible functions) on the remotely uploaded file.
I'm trying to do this in Python. What are the best practices when the client is untrusted, meaning that they can upload arbitrarily crafted files?
What's the standard procedure nowadays? RPC, REST, something else?
I do not need to worry about authentication and/or encryption, requests can be anonymous and in the clear. MITM is not a concern either.
You should treat any client as untrusted, so your case needs a general approach, which can be found in OWASP ASVS (v16: files and resources verification requirements). REST is fine for this purpose.
The main points are:
store files outside of the webroot (so they can't be served by the static page server)
avoid setting the execution bit (on Linux)
if possible, limit file types to known-good ones (e.g. validate against a whitelist; validate file types by extension AND by file signature)
check that files have an appropriate size before accepting the request and putting the file into variables (you can check the HTTP Content-Length header and filter on it before passing the request to the app)
if possible, check files with a server-side antivirus
if files are served back to a user, ensure that the appropriate headers (Content-Type, nosniff) are set; if they are not, some XSS scenarios are possible
verify that filenames are sanitized so they won't trick your program into serving other files (e.g. a filename like "../../../../../../etc/passwd" could serve the actual /etc/passwd file); reject the request if the filename contains ../ or / sequences (a sketch follows this list)
never concatenate folder paths with filenames directly, as it can lead to the same issue
if computations will be made by calling the command line, beware of command-line injection (this issue and the two previous ones can be solved by specifying a filename format for users, e.g. accept only alphanumeric names without spaces or special characters, and reject any request that doesn't fit the pattern)
if you can, limit the number of requests per IP
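A sketch of the filename and extension checks from the list above; the allowed extensions, name pattern, and upload directory are illustrative:

import os
import re

UPLOAD_DIR = "/srv/uploads"  # outside the webroot
ALLOWED_EXTENSIONS = {".png", ".jpg", ".pdf"}  # known-good whitelist
SAFE_NAME = re.compile(r"^[A-Za-z0-9_-]{1,64}$")  # alphanumeric names only

def safe_upload_path(filename):
    name, ext = os.path.splitext(filename)
    if ext.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError("file type not allowed")
    if not SAFE_NAME.match(name):
        raise ValueError("invalid characters in filename")
    path = os.path.realpath(os.path.join(UPLOAD_DIR, name + ext.lower()))
    # belt and braces: ensure the resolved path is still inside UPLOAD_DIR
    if not path.startswith(os.path.realpath(UPLOAD_DIR) + os.sep):
        raise ValueError("path traversal attempt")
    return path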

Python/Twisted-- Render Paramiko SFTPFile as if it were twisted.web.static.File

Before I pose the question, some background: I'm creating a web management tool that, among other things, allows the user to download, tail, email, and move files between predefined directories via the management panel. Many of these directories are local to the server, but some are actually located on remote hosts and accessed via SSH--however, this is transparent to the user. I've used Twisted to create a pseudo-REST API for the client to access, but since I want to avoid revealing actual server paths to the client, it requests file downloads using a POST with an arbitrary ID to the API, as such: "http://XXXX:8880/api/transfer/download"
with POST params similar to this: {"srckey":"5","srcfile":"solar2-windows-1.10.zip"}. The idea being the client only knows the key of the directory and filename.
Pardon the excessive background--I hope it makes my question clearer: the issue is that I'm trying to allow users to download a copy of a file from one of the "remote" hosts via the management server that hosts the web panel, all without caching the file locally. I've used Twisted's File() object to stream large static files before, but since the file resides on another server, I'm trying to accomplish the same thing using the file object provided by Paramiko's open() method.
I've tried setting up a consumer/producer system similar to the one used in the render methods of twisted.web.static.File, plugging in the file pointer provided by Paramiko in the appropriate places, but only the smallest text files transfer successfully; all other cases cause Paramiko to throw this error:
socket.error: Socket is closed
The contents of the relevant python files are here:
serve-project.py: http://pastebin.com/YcjsQHu3
WrapSSH.py:
http://pastebin.com/XaKXJwxb
In a nutshell, I'm trying to stream the data from a Paramiko SFTPFile to an HTTP client. I suspect that my approach is seriously flawed, due to my minimal familiarity with Twisted. Does anyone have suggestions for a more intelligent way to accomplish this?
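For reference, the basic shape of reading a Paramiko SFTPFile in chunks (the pattern the question is trying to feed into a Twisted producer); the host, credentials, paths, and chunk size are all placeholders:

import paramiko

CHUNK_SIZE = 32768  # placeholder chunk size

transport = paramiko.Transport(("remote-host", 22))
transport.connect(username="user", password="password")
sftp = paramiko.SFTPClient.from_transport(transport)

with sftp.open("/remote/path/file.bin", "rb") as f:
    f.prefetch()  # pipeline read requests to improve throughput
    while True:
        chunk = f.read(CHUNK_SIZE)
        if not chunk:
            break
        # hand `chunk` to the HTTP consumer here

transport.close()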

Fetching images from URL and saving on server and/or Table (ImageField)

I'm not seeing much documentation on this. I'm trying to get an image fetched from a URL and uploaded onto the server. Ideally I'd like to keep things simple, but I'm in two minds as to whether using an ImageField is the best way, or whether it's simpler to store the file on the server and display it as a static file. I'm not uploading any files, so I need to fetch them in. Can anyone suggest any decent code examples before I try to re-invent the wheel?
Given a URL, say http://www.xyx.com/image.jpg, I'd like to download that image to the server and put it into a suitable location after renaming it. My question is general, as I'm looking for examples of what people have already done. So far I've only seen examples relating to uploading images, which doesn't apply here. This should be a simple case, and I'm looking for a canonical example that might help.
This is for uploading an image from the user: Django: Image Upload to the Server
So are there any examples out there that just deal with the process of fetching an image and storing it on the server and/or in an ImageField?
Well, just fetching an image and storing it into a file is straightforward:
import urllib2
with open('/path/to/storage/' + make_a_unique_name(), 'wb') as f:  # binary mode for image data
    f.write(urllib2.urlopen(your_url).read())
Then you need to configure your Web server to serve files from that directory.
But this comes with security risks.
A malicious user could come along and type a URL that points nowhere. Or that points to their own evil server, which accepts your connection but never responds. This would be a typical denial of service attack.
A naive fix could be:
urllib2.urlopen(your_url, timeout=5)
But then the adversary could build a server that accepts a connection and writes out a line every second indefinitely, never stopping. The timeout doesn’t cover that.
So a proper solution is to run a task queue, also with timeouts, and a carefully chosen number of workers, all strictly independent of your Web-facing processes.
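To make the slow-server point concrete, here is a sketch of a download with an overall time budget rather than only a per-operation timeout (written for Python 3's urllib.request; the byte limit and chunk size are illustrative):

import time
import urllib.request

MAX_SECONDS = 10  # overall deadline for the whole download
MAX_BYTES = 5 * 1024 * 1024  # cap the size while we're at it

def fetch_with_deadline(url, dest_path):
    deadline = time.monotonic() + MAX_SECONDS
    total = 0
    # the timeout below only bounds each individual socket operation
    with urllib.request.urlopen(url, timeout=5) as resp, open(dest_path, "wb") as out:
        while True:
            if time.monotonic() > deadline:
                raise TimeoutError("overall download deadline exceeded")
            chunk = resp.read(65536)
            if not chunk:
                break
            total += len(chunk)
            if total > MAX_BYTES:
                raise ValueError("file too large")
            out.write(chunk)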
Another kind of attack is to point your server at something private. Suppose, for the sake of example, that you have an internal admin site that is running on port 8000, and it is not accessible to the outside world, but it is accessible to your own processes. Then I could type http://localhost:8000/path/to/secret/stats.png and see all your valuable secret graphs, or even modify something. This is known as server-side request forgery or SSRF, and it’s not trivial to defend against. You can try parsing the URL and checking the hostname against a blacklist, or explicitly resolving the hostname and making sure it doesn’t point to any of your machines or networks (including 127.0.0.0/8).
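A sketch of the resolve-and-check idea; the blocked networks below are illustrative, not an exhaustive SSRF defence (DNS rebinding, for instance, is not handled here):

import ipaddress
import socket
from urllib.parse import urlparse

BLOCKED_NETWORKS = [
    ipaddress.ip_network("127.0.0.0/8"),     # loopback
    ipaddress.ip_network("10.0.0.0/8"),      # private
    ipaddress.ip_network("172.16.0.0/12"),   # private
    ipaddress.ip_network("192.168.0.0/16"),  # private
    ipaddress.ip_network("169.254.0.0/16"),  # link-local / cloud metadata
]

def url_is_safe(url):
    host = urlparse(url).hostname
    if host is None:
        return False
    # resolve the hostname and reject anything pointing at our own networks
    for info in socket.getaddrinfo(host, None):
        addr = ipaddress.ip_address(info[4][0])
        if addr.is_loopback or addr.is_private or addr.is_link_local:
            return False
        if any(addr in net for net in BLOCKED_NETWORKS):
            return False
    return True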
Then of course, there is the problem of validating that the file you receive is actually an image, not an HTML file or a Windows executable. But this is common to the upload scenario as well.
