How can I simulate a Python shell most effectively and securely?

How can I simulate a Python shell most effectively and securely? - python

For to offer interactive examples about data analysis, I'd like to embed an interactive python shell. It does not necessarily have to be a real Python shell. Users shall be given tasks that they can execute in the shell. This is similar to existing tutorials, as seen on, e.g., http://www.codecademy.org, but I'd like to work with libraries that those solutions do not offer, as far as I understood.
In order to get a real shell on the website, I think of two approaches:
I found projects like http://www.repl.it, but it seems rather difficult to include the necessary libraries like SciPy, NumPy, and Pandas. In addition, user input has to be validated and I'm not sure whether that works with those shells I found.
I could pipe the commands through a web applications to a Python installation on my server, but I'm scared of using eval() on foreign, arbitrary code. Is there a safe mode for Python? I found http://www.pypy.org. Although they offer a Python sandbox, unfortunately, they do not support the libraries I need.
Alternatively, I thought of just embedding a "fake shell", which I build to copy the behaviour of the functions that I want to explain. Of course, this would result in more work, as I would have to write a fake interface, but for now it seems to be the only possibility.
I hope that this question is not too generic; I'm looking for either a good HTML/JS library that helps me put a fake shell on my website or a library/service/software that can embed a real Python shell with the required modules installed.

There is no way to run untrusted Python safely; Python's dynamic nature allows for too many ways to break through any protective layers you could care to think of.
Instead, run each session on a new virtual machine, properly locked down (firewalled, unprivileged user), which you shut down after a hard time limit. New sessions get a new, clean virtual machine.
This isolates you from any malicious code that might run and try to break out of a sandbox; a good virtual machine is hardware-isolated by the processor from the host OS, something a Python-only layer could never achieve.

This process is sometimes called sandboxing.
You can find some good information on the python wiki
There are basically three options available:
machine-level mechanisms (such as a VM, as Martijn Pieters suggested)
OS-level mechanisms (such as a chroot or SELinux)
custom interpreters, such as pypy (which has sandboxing capabilities, as you mentioned), or Jython, where you may be able to use the Java security manager or applet mechanisms.
You may also want to check Restricted Python, which is especially useful for very restricted environments, but security will depend on its configuration.
Ultimately, your choice of solution will depend on what you want to restrict:
Filesystem access? Block everything, or allow certain directories?
Network access, such as sockets?
Arbitrary system calls?

Related

What is the recommended practice in django to execute external scripts?

I'm planning to build a WebApp that will need to execute scripts based on the argument that an user will provide in a text-field or in the Url.
possible solutions that I have found:
create a lib directory in the root directory of the project, and put the scripts there, and import it from views.
using subprocess module to directly run the scripts in the following way:
subprocess.call(['python', 'somescript.py', argument_1,...])
argument_1: should be what an end user provides.

I'm planning to build a WebApp that will need to execute scripts
Why should it "execute scripts" ? Turn your "scripts" into proper modules, import the relevant functions and call them. The fact that Python can be used as a "scripting language" doesn't mean it's not a proper programming language.

Approach (1) should be the default approach. Never subprocess unless you absolutely have to.
Disadvantages of subprocessing:
Depends on the underlying OS and in your case Python (i.e. is python command the same as the Python that runs the original script?).
Potentially harder to make safe.
Harder to pass values, return results and report errors.
Eats more memory and cpu (a side effect is that you can utilize all cpu cores but since you are writing a web app it is likely you do that anyway).
Generally harder to code and maintain.
Advantages of subprocessing:
Isolates the runtime. This is useful if for example scripts are uploaded by users. You don't want them to mess with your application.
Related to 1: potentially easier to dynamically add scripts. Not that you should do that anyway. Also becomes harder when you have more then 1 server and you need to synchronize them.
Well, you can run non-python code that way. But it doesn't apply to your case.

Execution permissions in Python

I need to send code to remote clients to be executed in them but security is a concern for me right now. I don't want unsafe code to be executed there so I would like to control what a program is doing. I mean for example, know if is making connections, where is connecting to, if is reading local files, etc. Is this possible with Python?
EDIT: I'm thinking in something similar to Android permission system. I want to know what a code will do and if it does something different, stop it.

You could use a different Python runtime:
if you run your script using Jython; you can exploit Java's permission system
with Pypy's sandboxed version you can choose what is allowed to run in your controller script

There used to be a module in Python called bastian, but that was deprecated as it wasn't that secure. There's also I believe something called RPython, but I don't know too much about that.
I would in this case use Pyro and write the code on the target server. That way you know clients can only execute written and tested code.
edit - it's probably worth noting that Pyro also supports http://en.wikipedia.org/wiki/Privilege_separation - although I've not had to use it for that.

I think you are looking for a sandboxed python. There used to be an effort to implement this, but it has been abolished a couple of years ago.
Sandboxed python in the python wiki offers a nice overview of possible options for your usecase.

The most rigourous (but probably the slowest) way is to run Python on a bare OS in an emulator.
Depending on the OS you use, there are several ways of running programs with restrictions, but without the overhead of an emulator:
FreeBSD has a nice integrated solution in the form of jails.
These grew out of the chroot system call.
Linux-VServer aims to do more or less the same on Linux.

Differences between subprocess module, envoy, sarge and pexpect?

I am thinking about making a program that will need to send input and take output from the various aircrack-ng suite tools. I know of a couple of python modules like subprocess, envoy, sarge and pexpect that would provide the necessary functionality. Can anyone advise on what I should be using or not using, especially as I'm new to python.
Thanks

As the maintainer of sarge, I can tell you that its goals are broadly similar to envoy (in terms of ease of use over subprocess) and there is (IMO) more functionality in sarge with respect to:
Cross-platform support for bash-like syntax (e.g.use of &&, ||, & in command lines)
Better support for capturing subprocess output streams and working with them asynchronously
More documentation, especially about the internals and peripheral issues like threading+forking in the context of using subprocess
Support for prevention of shell injection attacks
Of course YMMV, but you can check out the docs, they're reasonably comprehensive.

pexpect
In 2015, pexpect does not work on windows. Rumored to add "experimental" support in the next version, but this has been a rumor for a long time (I'm not holding my breath).
Having written many applications using pexpect (and loving it), I am now sorry because one of the things I love about python (that it is cross platform) is not true for my applications.
If you plan to ever add windows support, for the moment, avoid pexpect.
envoy
Not much activity in the last year. And few commits (12 total) since 2012. Not very promising for its future.
Internally it uses shlex in a way that is not compatible with windows paths (the commands must use '/' not '\' for directory separators). A workaround (when using pathlib) is to call as_posix() on path objects before passing them as commands. See this answer.
Getting access to the internal streams (i.e. I want to parse the output to have some updating scrollbars), seems possible but is not documented.
sarge
Works on windows out-of-the-box and has an expect() method that should provide functionality similar to pexpect (allowing me to update a scrollbar). Recent activity, but it is hosted on gitlab and bitbucket (very confusing).
Personal Conclusion
I'm moving from pexpect to sarge for future development. Seems to provide similar feature set to pexpect and supports windows.

subprocess - is a standard library module, so it'll be available with python installation. But it has a reputation of hard to use since it's api is non-intuitive.
envoy - is a third party module that wraps around subprocess. It was written to be an easy to use alternative to subprocess. The author of envoy Kenneth Reitz is famous for his Python for Humans philosophy.
I'm not familiar with the other two.

Limiting the features of an embedded python instance

Is there a way to limit the abilities of python scripts running under an embedded interpretor? Specifically I wish to prevent the scripts from doing things like the following:
Importing python extension modules (ie .pyd modules), except those specifically allowed by the application.
Manipulating processes in any way (ie starting new processes, or terminating the application).
Any kind of networking.
Manipulating the file system (eg creating, modifying and deleting files).

No. There's no easy way to prevent those things on CPython. Your options are:
Edit CPython source code and remove things you don't want - provide mocking methods for all those things. Very error-prone and hard to do. This is the approach of Google's App Engine.
Use Restricted Python. However, with it you can't prevent your user from exhausting the memory available or running infinite eat-all-cpu loops.
Use another python implementation. PyPy has a sandbox mode you can use. Jython runs under java and I guess java can be sandboxed.

Maybe this can be helpful. You have an example provided on how to work with the ast.

What you want it Google's Unladen Swallow project that Python version of App Engine runs on.
Modules are severely restricted, ctypes are not allowed, sockets are matched against some policy or other, in other words you get a sandboxed version of Python, in line with their Java offering.
I'd like to point out that this makes the system almost useless. Well useless for anything cooler than yet another [App Engine] App. Forget monkey-patching system modules, and even access to own stack is restricted. Totally un-dynamic-like.
OT: games typically embed LUA for scripting, perhaps you should check it out.

Safe Python Environment in Linux

Is it possible to create an environment to safely run arbitrary Python scripts under Linux? Those scripts are supposed to be received from untrusted people and may be too large to check them manually.
A very brute-force solution is to create a virtual machine and restore its initial state after every launch of an untrusted script. (Too expensive.)
I wonder if it's possible to restrict Python from accessing the file system and interacting with other programs and so on.

Consider using a chroot jail. Not only is this very secure, well-supported and tested but it also applies to external applications you run from python.

There are 4 things you may try:
As you already mentioned, using a virtual machine or some other form of virtualisation (perhaps solaris zones are lightweight enough?). If the script breaks the OS there then you don't care.
Using chroot, which puts a shell session into a virtual root directory, separate from the main OS root directory.
Using systrace. Think of this as a firewall for system calls.
Using a "jail", which builds upon systrace, giving each jail it's own process table etc.
Systrace has been compromised recently, so be aware of that.

You could run jython and use the sandboxing mechanism from the JVM. The sandboxing in the JVM is very strong very well understood and more or less well documented. It will take some time to define exactly what you want to allow and what you dnt want to allow, but you should be able to get a very strong security from that ...
On the other side, jython is not 100% compatible with cPython ...

Try searching for "sandboxing python", e.g.:
http://wiki.python.org/moin/SandboxedPython
http://wiki.python.org/moin/How%20can%20I%20run%20an%20untrusted%20Python%20script%20safely%20(i.e.%20Sandbox)

could you not just run as a user which has no access to anything but the scripts in that directory?

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.