Python web based interpreter security issues

Python web based interpreter security issues - python

I am making a web based python interpreter which will take code executes it on Linux based python3 interpreter and give output on the same web page. But this has some serious loop holes like someone can execute bash script using python's os module, can check directory for source code of the web application and a lot more.
Can anyone suggest me how to prevent this kind of mishaps in my application
Regards

Short answer: there is no easy "python-only" solution for this.
Some details:
user can always try to call os, sys, with open(SENSITIVE_PATH, 'rw') as f: ..., etc, and it's hard to detect all those cases simply by analyzing the code
If you allow ANY third-party, then things become even more complicated, for example some third-party package may locally create an alias to os.execv (os_ex = os.execv), and after this it will be possible to write a script like from thirdparty.some_internals import os_ex; os_ex(...).
The more or less reliable solution is to use "external sandboxing" solutions:
Run interpreter in the unprivileged docker container. For example:
write untrusted script to some file that will be exposed through volume in the docker container
execute that script in docker:
a. subprocess.call(['docker', 'exec', 'CONTAINER_ID', '/usr/bin/python', 'PATH_TO_SCRIPT'])
b. subprocess.call(['docker', 'exec', 'CONTAINER_ID', '/usr/bin/python', '-c', UNTRUSTED_SCRIPT_TEXT])
Use PyPy-s sandbox.
Search for some "secure" IPython kernel for Jupyter notebook server. Or write your own. Note: existing kernels are not guaranteed to be secure and may allow to call subprocess.check_output, os.rm and others. So for "default kernel" it's still better to run Jupyter server in the isolated environment.
Run interpreter in chroot using unprivileged user. Different implementations have different level of "safety".
Use Jython with finely tuned permissions.
Some exotic solutions like "client-side JS python implementation": brython, pyjs
In any case, even if you manage to implement or reuse existing "sandbox" you still will get many potential problems:
If multiprocessing or multithreading is allowed then you might want to monitor how CPU resources are utilized, because
some scripts might want to use EVERYTHING. Even with GIL it's possible for multi-threading to utilize all kernels (all the user has to do is to call functions that use c-libraries in the threads)
You might want to monitor memory usage, because some scripts might leak or simply use a lot of memory
Other candidates for monitoring: Disk IO usage, Network usage, open file descriptors usage, execution time, etc...
Also you should always check for security updates of your "sandboxing solution", because even docker sometimes is vulnerable and makes it possible to execute code on host machine
Recommended read: https://softwareengineering.stackexchange.com/questions/191623/best-practices-for-execution-of-untrusted-code

Related

Python Script Sandboxing using Docker

If I build a container using a base image like Python 3 Alpine, and I'll follow the Hardening indicated into the docker documentation, is it secure to inject and execute a Python script?
I mean, if a user will write something dangerous (like sudo rm -R using a Python function), only the container will be affected of those problems, right?
Is this a good practice? I need to execute some small code snippets with limited access to the system, modules, etc...

I would not treat Docker as a security “silver bullet” here; you want to have at least some notion that the code you’re running is “trustworthy” before unleashing it on your system, even under Docker.
Remember that you need to have root privileges to run docker anything at all, or else you can trivially gain them (docker run -v /:/host -u root ... will let you freely edit the host filesystem). If your application really is dealing in untrusted code, consider whether you want a privileged process to be able to deal with it.
Beyond that, Docker containers share the host’s kernel and various physical resources. If there’s a kernel privilege escalation bug, something running in a container could exploit it. If your untrusted code makes outbound TCP calls to shuffle data around that you wouldn’t want on your network, that’s not limited by default. If it’s “merely” using your CPU cycles to mine Bitcoin, you can’t control that.
If all of this sounds like an acceptable level of risk to you, then running somewhat-trusted code under Docker is certainly better than not: you do get some protection against changing files on the host and host-level settings like network configuration, especially if you believe the code you’re running isn’t actively malicious.

Forbid Python from writing anything to disk

Are there any command-line options or configurations that forbids Python from writing to disk?
I know I can hack open but it doesn't sound very safe.
I've hosted some Python tutorials I wrote myself on my website for friends who want to learn Python, and I want them to have access to a Python console so they can try as they learn. This is done by creating a Python subprocess from the http server.
However, I do not want them to accidentally or intentionally damage my server, so I need to forbid the Python process from writing anything to disk.
Also I'm running the server on Ubuntu Linux so doing it Python-wise or system-wise are both OK.

I doubt there's a way to do this in the interpreter itself: there are way too many things to patch (open, subprocess, os.system, file, and probably others). I'd suggest looking into a way of containerizing the python runtime via something like Docker. The containerization gives some guarantees restricting access, though not as much as virtualization. See here for more discussion about the security implications.
Running a jupyter/ipython notebook in the docker container would probably be the easiest way to expose a web-frontend. jupyter provides a collection of docker containers for this purpose: see https://github.com/jupyter/tmpnb and https://github.com/jupyter/docker-stacks

How to build a web service with one sandboxed Python (VM) per request

As part of an effort to make the scikit-image examples gallery interactive, I would like to build a web service that receives a Python code snippet, executes it, and provides me with the generated output image.
For safety, the Python instances launched should be sandboxed and resource controlled, so I was thinking of using LXC containers.
Is this a good way to approach the problem? If so, what is the recommended way of launching one Python VM per request?

Stefan, perhaps "Docker" could be of use? I get the impression that you could constrain the VM that the application is run in -- an example web service:
http://docs.docker.io/en/latest/examples/python_web_app/
You could try running the application on Digital Ocean, like so:
https://www.digitalocean.com/community/articles/how-to-install-and-use-docker-getting-started

[disclaimer: I'm an engineer at Continuum working on Wakari]
Wakari Enterprise (http://enterprise.wakari.io) is aiming to do exactly this, and we're hoping to back-port the functionality into Wakari Cloud (http://wakari.io) so "published" IPython Notebooks can have some knobs on them for variable input control, then they can be "invoked" in a sandboxed state, and then the output given back to the user.
However for things that exist now, you should look at Sage Notebook. A few years ago several people worked hard on a Sage Notebook Cell Server that could do exactly what you were asking for: execute small code snippets. I haven't followed it since then, but it seems it is still alive and well from a quick search:
http://sagecell.sagemath.org/?q=ejwwif
http://sagecell.sagemath.org
http://www.sagemath.org/eval.html
For the last URL, check out Graphics->Mandelbrot and you can see that Sage already has some great capabilities for UI widgets that are tied to the "cell execution".

I think docker is the way to go for this. The instances are very light weight, and docker is designed to spawn 100s of instances at a time (Spin up time is fractions of a second vs traditional VMs couple of seconds). Configured correctly I believe it also gives you a complete sandboxed environment. Then it matters not about trying to sandbox python :-D

I'm not sure if you really have to go as far as setting up LXC containers:
There is seccomp-nurse, a Python sandbox that leverages the seccomp feature of the Linux kernel.
Another option would be to use PyPy, which has explicit support for sandboxing out of the box.
In any case, do not use pysandbox, it is broken by design and has severe security risks.

Deploying a Python Script to Windows and Linux

I have a python server that I need to run in both a Linux and Windows environment, and my question is about deployment. What is the best method for deploying the solution instead of just double clicking on the file and running it?
Since I use the server_forever() on the server, I can just run the script from command line, but this keeps the python window open. If I log off the machine, naturally the process will stop. So what is the best method for deploying a python script that needs to keep running if the user is logged in or off a machine.
Since I am going to be using multiple environment, Linux and Windows, can you please be specific in what OS you are talking about?
For windows, I was thinking of running the script 'At Startup' using the Windows scheduler. But I wanted to see if anyone had a better option. For linux, I really don't know what to create. I am assuming a CRON job?
Deployment does refer to coding, so using serve_forever() on a multiprocessing job manager keeps the python window open upon execution. Is there a way to hide this window through code? Would you recommend using a conversion tool like py2exe instead?

This is the subject matter of a whole library of books, so I will just give an introduction here :-)
You can basically start scripts directly and then have multiple options to do this in a way that they keep running in the background.
If you have certain functionality that needs to run on regular moments, you would do this by scheduling it:
Windows: Windows Scheduler or specific scheduling tools
Linux: Cron
If your problem is that you want to start a script without it closing on you while SSH'ing into Linux, you want to look into the "screen" or "tmux" tools.
If you want to have it started automatically this could be done by using the "At Startup" as you point out and Linux has similar functionalities, but the preferred and more robust way would be to set up a service that is better integrated with the OS.
Windows: Windows Service
Linux: Daemon
Even more capabilities can be yielded by using an application server such a Django
Tomcat (see comment) is an option, but definitely not the standard one; you'll have a hard time finding support both from Tomcat people running Python or Python people running their stuff on Tomcat. That being said, I imagine you could enable CGI and have it run a Python command with your script.
Yet, instead of just starting a Python script I would strongly encourage you to have a look at different Python options that are probably available for your specific use case. From lightweight web solutions like Flask over a versatile networking engine like Twisted to a full blown web framework like Django.
They all have rather well-thought-out deployment solutions available. Look up WSGI for more background.

faking a filesystem / virtual filesystem

I have a web service to which users upload python scripts that are run on a server. Those scripts process files that are on the server and I want them to be able to see only a certain hierarchy of the server's filesystem (best: a temporary folder on which I copy the files I want processed and the scripts).
The server will ultimately be a linux based one but if a solution is also possible on Windows it would be nice to know how.
What I though of is creating a user with restricted access to folders of the FS - ultimately only the folder containing the scripts and files - and launch the python interpreter using this user.
Can someone give me a better alternative? as relying only on this makes me feel insecure, I would like a real sandboxing or virtual FS feature where I could run safely untrusted code.

Either a chroot jail or a higher-order security mechanism such as SELinux can be used to restrict access to specific resources.

You are probably best to use a virtual machine like VirtualBox or VMware (perhaps even creating one per user/session).
That will allow you some control over other resources such as memory and network as well as disk
The only python that I know of that has such features built in is the one on Google App Engine. That may be a workable alternative for you too.

This is inherently insecure software. By letting users upload scripts you are introducing a remote code execution vulnerability. You have more to worry about than just modifying files, whats stopping the python script from accessing the network or other resources?
To solve this problem you need to use a sandbox. To better harden the system you can use a layered security approach.
The first layer, and the most important layer is a python sandbox. User supplied scripts will be executed within a python sandbox. This will give you the fine grained limitations that you need. Then, the entire python app should run within its own dedicated chroot. I highly recommend using the grsecurity kernel modules which improve the strength of any chroot. For instance a grsecuirty chroot cannot be broken unless the attacker can rip a hole into kernel land which is very difficult to do these days. Make sure your kernel is up to date.
The end result is that you are trying to limit the resources that an attacker's script has. Layers are a proven approach to security, as long as the layers are different enough such that the same attack won't break both of them. You want to isolate the script form the rest of the system as much as possible. Any resources that are shared are also paths for an attacker.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.