I am still somewhat new to Python, and have begun learning to use the Twisted framework so that I may set up an asynchronous web server. The details about storing stateful information in the Session object is pretty straightforward, but there is something that is lacking in the documentation that is throwing me off. The first line in the script on this tutorial reads:
cache()
...rest of the script goes here
This is something that only works in what is called an rpy script - more about that here. The problem is, I don't really want to use an rpy script, and it allegedly is not a requirement. The page I referenced describes rpy scripts as being mainly for experimenting with new ideas AND NOT MUCH ELSE.
My issue is that when I try to run a non-rpy version of my script, I get this error:
NameError: name 'cache' is not defined
Some additional research has told me that cache() is a part of the globals for every rpy script, so there is no need to import. However, the documentation doesn't describe how to use cache() in a non-rpy file. So, there is my question - how can I use cache() in a non-rpy file? I am pretty sure it is just a matter of knowing which module to import, which I do not. Any help will be appreciated.
A distinctive feature of the handling of rpy scripts by Twisted Web is that the source code is re-evaluated on each request.
cache is an API specifically for rpy scripts to tell the runtime not to re-evaluate the source again. If cache is called, the results of evaluating the source are saved and used to satisfy the next request for that resource.
Since this feature is unique to handling of rpy scripts, there is no need or value in using cache when defining resources for Twisted Web in a different way.
Apparently, if you aren't using an rpy file, you simply don't need to use cache(). I simply removed that line from the code and it seems to be working fine. Any additional input on this is still appreciated, because the documentation is lacking.
Related
I'm developing a web game in pure Python, and want some simple scripting available to allow for more dynamic game content. Game content can be added live by privileged users.
It would be nice if the scripting language could be Python. However, it can't run with access to the environment the game runs on since a malicious user could wreak havoc which would be bad. Is it possible to run sandboxed Python in pure Python?
Update: In fact, since true Python support would be way overkill, a simple scripting language with Pythonic syntax would be perfect.
If there aren't any Pythonic script interpreters, are there any other open source script interpreters written in pure Python that I could use? The requirements are support for variables, basic conditionals and function calls (not definitions).
This is really non-trivial.
There are two ways to sandbox Python. One is to create a restricted environment (i.e., very few globals etc.) and exec your code inside this environment. This is what Messa is suggesting. It's nice but there are lots of ways to break out of the sandbox and create trouble. There was a thread about this on Python-dev a year ago or so in which people did things from catching exceptions and poking at internal state to break out to byte code manipulation. This is the way to go if you want a complete language.
The other way is to parse the code and then use the ast module to kick out constructs you don't want (e.g. import statements, function calls etc.) and then to compile the rest. This is the way to go if you want to use Python as a config language etc.
Another way (which might not work for you since you're using GAE), is the PyPy sandbox. While I haven't used it myself, word on the intertubes is that it's the only real sandboxed Python out there.
Based on your description of the requirements (The requirements are support for variables, basic conditionals and function calls (not definitions)) , you might want to evaluate approach 2 and kick out everything else from the code. It's a little tricky but doable.
Roughly ten years after the original question, Python 3.8.0 comes with auditing. Can it help? Let's limit the discussion to hard-drive writing for simplicity - and see:
from sys import addaudithook
def block_mischief(event,arg):
if 'WRITE_LOCK' in globals() and ((event=='open' and arg[1]!='r')
or event.split('.')[0] in ['subprocess', 'os', 'shutil', 'winreg']): raise IOError('file write forbidden')
addaudithook(block_mischief)
So far exec could easily write to disk:
exec("open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')", dict(locals()))
But we can forbid it at will, so that no wicked user can access the disk from the code supplied to exec(). Pythonic modules like numpy or pickle eventually use the Python's file access, so they are banned from disk write, too. External program calls have been explicitly disabled, too.
WRITE_LOCK = True
exec("open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')", dict(locals()))
exec("open('/tmp/FILE','a').write('pwned by l33t h4xx0rz')", dict(locals()))
exec("numpy.savetxt('/tmp/FILE', numpy.eye(3))", dict(locals()))
exec("import subprocess; subprocess.call('echo PWNED >> /tmp/FILE', shell=True)", dict(locals()))
An attempt of removing the lock from within exec() seems to be futile, since the auditing hook uses a different copy of locals that is not accessible for the code ran by exec. Please prove me wrong.
exec("print('muhehehe'); del WRITE_LOCK; open('/tmp/FILE','w')", dict(locals()))
...
OSError: file write forbidden
Of course, the top-level code can enable file I/O again.
del WRITE_LOCK
exec("open('/tmp/FILE','w')", dict(locals()))
Sandboxing within Cpython has proven extremely hard and many previous attempts have failed. This approach is also not entirely secure e.g. for public web access:
perhaps hypothetical compiled modules that use direct OS calls cannot be audited by Cpython - whitelisting the safe pure pythonic modules is recommended.
Definitely there is still the possibility of crashing or overloading the Cpython interpreter.
Maybe there remain even some loopholes to write the files on the harddrive, too. But I could not use any of the usual sandbox-evasion tricks to write a single byte. We can say the "attack surface" of Python ecosystem reduces to rather a narrow list of events to be (dis)allowed: https://docs.python.org/3/library/audit_events.html
I would be thankful to anybody pointing me to the flaws of this approach.
EDIT: So this is not safe either! I am very thankful to #Emu for his clever hack using exception catching and introspection:
#!/usr/bin/python3.8
from sys import addaudithook
def block_mischief(event,arg):
if 'WRITE_LOCK' in globals() and ((event=='open' and arg[1]!='r') or event.split('.')[0] in ['subprocess', 'os', 'shutil', 'winreg']):
raise IOError('file write forbidden')
addaudithook(block_mischief)
WRITE_LOCK = True
exec("""
import sys
def r(a, b):
try:
raise Exception()
except:
del sys.exc_info()[2].tb_frame.f_back.f_globals['WRITE_LOCK']
import sys
w = type('evil',(object,),{'__ne__':r})()
sys.audit('open', None, w)
open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')""", dict(locals()))
I guess that auditing+subprocessing is the way to go, but do not use it on production machines:
https://bitbucket.org/fdominec/experimental_sandbox_in_cpython38/src/master/sandbox_experiment.py
AFAIK it is possible to run a code in a completely isolated environment:
exec somePythonCode in {'__builtins__': {}}, {}
But in such environment you can do almost nothing :) (you can not even import a module; but still a malicious user can run an infinite recursion or cause running out of memory.) Probably you would want to add some modules that will be the interface to you game engine.
I'm not sure why nobody mentions this, but Zope 2 has a thing called Python Script, which is exactly that - restricted Python executed in a sandbox, without any access to filesystem, with access to other Zope objects controlled by Zope security machinery, with imports limited to a safe subset.
Zope in general is pretty safe, so I would imagine there are no known or obvious ways to break out of the sandbox.
I'm not sure how exactly Python Scripts are implemented, but the feature was around since like year 2000.
And here's the magic behind PythonScripts, with detailed documentation: http://pypi.python.org/pypi/RestrictedPython - it even looks like it doesn't have any dependencies on Zope, so can be used standalone.
Note that this is not for safely running arbitrary python code (most of the random scripts will fail on first import or file access), but rather for using Python for limited scripting within a Python application.
This answer is from my comment to a question closed as a duplicate of this one: Python from Python: restricting functionality?
I would look into a two server approach. The first server is the privileged web server where your code lives. The second server is a very tightly controlled server that only provides a web service or RPC service and runs the untrusted code. You provide your content creator with your custom interface. For example you if you allowed the end user to create items, you would have a look up that called the server with the code to execute and the set of parameters.
Here's and abstract example for a healing potion.
{function_id='healing potion', action='use', target='self', inventory_id='1234'}
The response might be something like
{hp='+5' action={destroy_inventory_item, inventory_id='1234'}}
Hmm. This is a thought experiment, I don't know of it being done:
You could use the compiler package to parse the script. You can then walk this tree, prefixing all identifiers - variables, method names e.t.c. (also has|get|setattr invocations and so on) - with a unique preamble so that they cannot possibly refer to your variables. You could also ensure that the compiler package itself was not invoked, and perhaps other blacklisted things such as opening files. You then emit the python code for this, and compiler.compile it.
The docs note that the compiler package is not in Python 3.0, but does not mention what the 3.0 alternative is.
In general, this is parallel to how forum software and such try to whitelist 'safe' Javascript or HTML e.t.c. And they historically have a bad record of stomping all the escapes. But you might have more luck with Python :)
I think your best bet is going to be a combination of the replies thus far.
You'll want to parse and sanitise the input - removing any import statements for example.
You can then use Messa's exec sample (or something similar) to allow the code execution against only the builtin variables of your choosing - most likely some sort of API defined by yourself that provides the programmer access to the functionality you deem relevant.
I want to allow users to make their own Python "mods" for my game, by placing their scripts in a special folder which the game "scans" for Python modules and imports.
What would be the simplest way to prevent "dangerous" scripts from being imported? I don't want people complaining to me that they used someone's mod and it erased their hard drive.
Things I would like to limit is accessing/modifying/creating any files outside of their folder and connecting to the internet/downloading/sending data. If you can thik of anything else, let me know.
So how can this be done?
Restricted Python seems to able to restrict functionality for code in a clean way and is compatible with python up to 2.7.
http://pypi.python.org/pypi/RestrictedPython/
e.g.
By supplying a different __builtins__ dictionary, we can rule out unsafe operations, such as opening files [...]
The obvious way to do it is to load the module as a string and exec it. This has just as many security risks, but might be easier to block by using custom globals and locals. Have a look at this question - it gives some really good guidance on this. As pointed out in Delnan's comments, this isn't completely secure though.
You could also try this. I haven't used it, but it seems to provide a safe environment for unsafe scripts.
There are some serious shortcomings for sandboxed python execution. aquavitae's answer links to some good discussion on the matter, especially this blog post. Read that first.
There is a kernel of secure execution within cPython. The fundamental idea is to replace the __builtins__ global (Note: not the __builtin__ module), which informs python to turn on some security features; making a handful of attributes on certain objects inaccessible, and removing most of the implementation objects from the interpreter when evaulating that bit of code.
You'll then need to write an actual implementation; in such a way that the protected modules are not the leaked into the sandbox. A fairly tested "file" replacement is provided in the linked blog. Getting a look on that might give you an idea of how involved and complex this problem is.
So now that you have understood that this is a challenge in python; you should take a look at languages with sandbox execution as a core feature, such as Lua, which is very popular in games.
Giving them python execution and trying to limit what they do is asking for trouble. See this SO question for discussion and a pointer to a good article. (You would presumably disable "eval", but it wouldn't make much difference in practice.
My suggestion: Turn the question around. Your goal is to provide them with scripting facilities so they can enhance the game. Find or define an interpreter for a suitable scripting language that has the features you need, and use it to execute their scripts. For example, you could support data persistence in a simple keystore model, without giving them file creation access. Or give them a command to create files but ensure it only accepts a path-less filename. The essential thing is to ensure that there is NO way for them to execute python commands directly.
I'm making a wxpython app that I will compile with the various freezing utility out there to create an executable for multiple platforms.
the program will be a map editer for a tile-based game engine
in this app I want to provide a scripting system so that advanced users can modify the behavior of the program such as modifying project data, exporting the project to a different format ect.
I want the system to work like so.
the user will place the python script they wish to run into a styled textbox and then press a button to execute the script.
I'm good with this so far thats all really simple stuff.
obtain the script from the text-box as a string compile it to a cod object with the inbuilt function compile() then execute the script with an exec statment
script = textbox.text #bla bla store the string
code = compile(script, "script", "exec") #make the code object
eval(code, globals())
the thing is, I want to make sure that this feature can't cause any errors or bugs
say if there is an import statement in the script. will this cause any problems taking into account that the code has been compiled with something like py2exe or py2app?
how do I make sure that the user can't break critical part of the program like modifying part of the GUI while still allowing them to modify the project data (the data is held in global properties in it's own module)? I think that this would mean modifying the globals dict that is passed to the eval function.
how to I make sure that this eval can't cause the program to hang due to a long or infinite loop?
how do I make sure that an error raised inside the user's code can't crash the whole app?
basically, how to I avoid all those problems that can arise when allowing the user to run their own code?
EDIT: Concerning the answers given
I don't feel like any of the answers so far have really answered my questions
yes they have been in part answered but not completely. I'm well aware the it is impossible to completely stop unsafe code. people are just too clever for one man (or even a teem) to think of all the ways to get around a security system and prevent them.
in fact I don't really care if they do. I'm more worried about some one unintentional breaking something they didn't know about. if some one really wanted to they could tear the app to shreds with the scripting functionality, but I couldn't care less. it will be their instance and all the problem they create will be gone when they restart the app unless they have messed with files on the HD.
I want to prevent the problems that arise when the user dose something stupid.
things like IOError's, SystaxErrors, InfiniteLoopErrors ect.
now the part about scope has been answered. I now understand how to define what functions and globals can be accessed from the eval function
but is there a way to make sure that the execution of their code can be stopped if it is taking too long?
a green thread system perhaps? (green because it would be eval to make users worry about thread safety)
also if a users uses an import module statement to load a module from even the default library that isn't used in the rest of the class. could this cause problems with the app being frozen by Py2exe, Py2app, or Freeze? what if they call a modal out side of the standard library? would it be enough that the modal is present in the same directory as the frozen executable?
I would like to get these answers with out creating a new question but I will if I must.
Easy answer: don't.
You can forbid certain keywords (import) and operations, and accesses to certain data structures, but ultimately you're giving your power users quite a bit of power. Since this is for a rich client that runs on the user's machine, a malicious user can crash or even trash the whole app if they really feel like it. But it's their instance to crash. Document it well and tell people what not to touch.
That said, I've done this sort of thing for web apps that execute user input and yes, call eval like this:
eval(code, {"__builtins__":None}, {safe_functions})
where safe_functions is a dictionary containing {"name": func} type pairs of functions you want your users to be able to access. If there's some essential data structure that you're positive your users will never want to poke at, just pop it out of globals before passing them in.
Incidentally, Guido addressed this issue on his blog a while ago. I'll see if I can find it.
Edit: found.
Short Answer: No
Is using eval in Python a bad practice?
Other related posts:
Safety of Python 'eval' For List Deserialization
It is not easy to create a safety net. The details too many and clever hacks are around:
Python: make eval safe
On your design goals:
It seems you are trying to build an extensible system by providing user to modify a lot of behavior and logic.
Easiest option is to ask them to write a script which you can evaluate (eval) during the program run.
How ever, a good design describes , scopes the flexibility and provides scripting mechanism through various design schemes ranging from configuration, plugin to scripting capabilities etc. The scripting apis if well defined can provide more meaningful extensibility. It is safer too.
I'd suggest providing some kind of plug-in API and allowing users to provide plug-ins in the form of text files. You can then import them as modules into their own namespace, catching syntax errors in the process, and call the various functions defined in the plug-in module, again checking for errors. You can provide an API module that defines the functions/classes from your program that the plug-in module has access to. That gives you the freedom to make changes to your application's architecture without breaking plug-ins, since you can just adapt the API module to expose the functionality in the same way.
If you have the option to switch to Tkinter you can use the bundled tcl interpreter to process your script. For that matter you can probably do that with a wxpython app if you don't start the tk event loop; just use the tcl interpreter without creating any windows.
Since the tcl interpreter is a separate thing it should be nearly impossible to crash the python interpreter if you are careful about what commands you expose to tcl. Plus, tcl makes creating DSLs very easy.
Python - the only scripting language with a built-in scripting engine :-).
I'm writing tests for a Django application that uses an external data source. Obviously, I'm using fake data to test all the inner workings of my class but I'd like to have a couple of tests for the actual fetcher as well. One of these will need to verify that the external source is still sending the data in the format my application expects, which will mean making the request to retrieve that information in the test.
Obviously, I don't want our CI to come down when there is a network problem or the data provider has a spot of downtime. In this case, I would like to throw a warning that skips the rest of that test method and doesn't contribute to an overall failure. This way, if the data arrives successfully I can check it for consistency (failing if there is a problem) but if the data cannot be fetched it logs a warning so I (or another developer) know to quickly check that the data source is ok.
Basically, I would like to test my external source without being dependant on it!
Django's test suite uses Python's unittest module (at least, that's how I'm using it) which looks useful, given that the documentation for it describes Skipping tests and expected failures. This feature is apparently 'new in version 2.7', which explains why I can't get it to work - I've checked the version of unittest I have installed on from the console and it appears to be 1.63!
I can't find a later version of unittest in pypi so I'm wondering where I can get hold of the unittest version described in that document and whether it will work with Django (1.2).
I'm obviously open to recommendations / discussion on whether or not this is the best approach to my problem :)
[EDIT - additional information / clarification]
As I said, I'm obviously mocking the dependancy and doing my tests on that. However, I would also like to be able to check that the external resource (which typically is going to be an API) still matches my expected format, without bringing down CI if there is a network problem or their server is temporarily down. I basically just want to check the consistency of the resource.
Consider the following case...
If you have written a Twitter application, you will have tests for all your application's methods and behaviours - these will use fake Twitter data. This gives you a complete, self-contained set of tests for your application. The problem is that this doesn't actually check that the application works because your application inherently depends on the consistency of Twitter's API. If Twitter were to change an API call (perhaps change the URL, the parameters or the response) the application would stop working even though the unit tests would still pass. (Or perhaps if they were to completely switch off basic authentication!)
My use case is simpler - I have a single xml resource that is used to import information. I have faked the resource and tested my import code but I would like to have a test that checks the format of that xml resource has not changed.
My question is about skipping tests in Django's test runner so I can throw a warning if the resource is unavailable without the tests failing, specifically getting a version of Python's unittest module that supports this behaviour. I've given this much background information to allow anyone with experience in this area to offer alternative suggestions.
Apologies for the lengthy question, I'm aware most people won't read this now.
I've 'bolded' the important bits to make it easier to read.
I created a separate answer since your edit invalidated my last answer.
I assume you're running on Python version 2.6 - I believe the changes that you're looking for in unittest are available in Python version 2.7. Since unittest is in the standard library, updating to Python 2.7 should make those changes available to you. Is that an option that will work for you?
One other thing that I might suggest is to maybe separate the "external source format verification" test(s) into a separate test suite and run that separately from the rest of your unit tests. That way your core unit tests are still fast and you don't have to worry about the external dependencies breaking your main test suites. If you're using Hudson, it should be fairly easy to create a separate job that will handle those tests for you. Just a suggestion.
The new features in unittest in 2.7 have been backported to 2.6 as unittest2. You can just pip install and substitute unittest2 for unittest and your tests will work as thyey did plus you get the new features without upgrading to 2.7.
What are you trying to test? The code in your Django application(s) or the dependency? Can you just Mock whatever that external dependency is? If you're just wanting to test your Django application, then I would say Mock the external dependency, so your tests are not dependent upon that external resource's availability.
If you can post some code of your "actual fetcher" maybe you will get some tips on how you could use mocks.
I'm trying to create a construct in Python 3 that will allow me to easily execute a function on a remote machine.
Assuming I've already got a python tcp server that will run the functions it receives, running on the remote server, I'm currently looking at using a decorator like
#execute_on(address, port)
This would create the necessary context required to execute the function it is decorating and then send the function and context to the tcp server on the remote machine, which then executes it. Firstly, is this somewhat sane? And if not could you recommend a better approach? I've done some googling but haven't found anything that meets these needs.
I've got a quick and dirty implementation for the tcp server and client so fairly sure that'll work. I can get a string representation the function (e.g. func) being passed to the decorator by
import inspect
string = inspect.getsource(func)
which can then be sent to the server where it can be executed. The problem is, how do I get all of the context information that the function requires to execute? For example, if func is defined as follows,
import MyModule
def func():
result = MyModule.my_func()
MyModule will need to be available to func either in the global context or funcs local context on the remote server. In this case that's relatively trivial but it can get so much more complicated depending on when and how import statements are used. Is there an easy and elegant way to do this in Python? The best I've come up with at the moment is using the ast library to pull out all import statements, using the inspect module to get string representations of those modules and then reconstructing the entire context on the remote server. Not particularly elegant and I can see lots of room for error.
Thanks for your time
The approach you outline is extremely risky unless the remote server is somehow very strongly protected or "extremely sandboxed" (e.g a BSD "jail") -- anybody who can send functions to it would be able to run arbitrary code there.
Assuming you have an authentication system that you trust entirely, comes the "fragility" problem that you realized -- the function can depend on any globals defined in its module at the moment of execution (which can be different from those you can detect by inspection: determining the set of imported modules, and more generally of globals, at execution time, is a Turing-complete problem).
You can deal with the globals problem by serializing the function's globals, as well as the function itself, at the time you send it off for remote execution (whether you serialize all this stuff in readable string form, or otherwise, is a minor issue). But that still leaves you with the issue of imports performed inside the function.
Unless you're willing to put some limitations on the "remoted" function, such as "no imports inside the function (and functions called from it)", I'm thinking you could have the server override __import__ (the built-in function that is used by all import statements and is designed to be overridden for peculiar needs, such as yours;-) to ask for the extra module from the sending client (of course, that requires that said client also have "server-like" capabilities, in that it must be able to respond to such "module requests" from the server).
Can't you impose some restrictions on functions that are remoted, to bring this task back into the domain of sanity...?
You may interested in the execnet project.
execnet provides carefully tested means to easily interact with Python interpreters across version, platform and network barriers. It has a minimal and fast API targetting the following uses:
distribute tasks to local or remote CPUs
write and deploy hybrid multi-process applications
write scripts to administer a bunch of exec environments
http://codespeak.net/execnet/example/test_info.html#get-information-from-remote-ssh-account
I've seen a demo of it. But never used it myself.
Its' not clear from your question whether there is some system limitation/requirement for you to solve your problem in this way. If not there may be much easier and quicker ways of doing this using some sort of messaging infrastructure.
For example you might consider whether [Celery][1]
[1]: http://ask.github.com/celery/getting-started/introduction.html will meet your needs.
What's your end goal with this? From your description I can see no reason why you can't just create a simple messaging class and send instances of that to command the remote machine to do 'stuff'.. ?
Whatever you do, your remote machine is going to need the Python source to execute so why not distribute the code there and then run it? You could create a simple server which would accept some python source files, extract them import the relevent modules and then run a command?
This will probably be hard to do and you'll run into issues with security, arguments etc.
Maybe you can just run a Python interpreter remotely that can be given code using a socket or an HTTP service rather than do it on a function by function level?