I am writing a game engine in Python, and I am not sure how to handle external scripts (think Source engine mods, or Lua). Every scene or entity in a game can have a custom script attached to it, but the game engine is not aware of those scripts until the scene is loaded. For example, there could be a script that animates a cutscene and is used in only one scene.
So what I want to know is: what's the best way to handle those scripts? I know I could run them with exec or eval, but someone said that's not safe. Why? I could also create a scripting language that would be parsed at runtime, but I don't see the point, considering that Python is a scripting language itself. Any help would be greatly appreciated.
You can use __import__() for this. Say your entity has a script attribute that can be either None or the name of a script (that is in a known location, which is on sys.path), you could use the following code to run the main() function from that script:
if entity.script is not None:
    # level 0 requests an absolute import; the old value -1 is Python 2 only
    custom_script = __import__(entity.script, globals(), locals(), [], 0)
    custom_script.main()
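On Python 3 the same idea is usually written with importlib.import_module. The following is only a minimal sketch under the answer's assumptions: the script name refers to a module on sys.path that defines main(). The helper name run_entity_script is made up for illustration.

```python
import importlib


def run_entity_script(script_name):
    """Import the module named `script_name` and call its main().

    Assumes (as in the answer above) that the module is on sys.path and
    defines a main() function. Returns whatever main() returns.
    """
    if script_name is None:
        return None
    module = importlib.import_module(script_name)
    return module.main()
```

The engine would call run_entity_script(entity.script) while loading a scene. Note that importing runs the script with full privileges, so this only suits scripts you trust.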
Yes, exec and eval are not safe for user-supplied data, like handling expressions or Web input. But if you specifically want to give users the full power of Python, and you understand the risks (users can do anything – erase/read files on the computer, crash Python, enter an infinite loop, etc.), using exec is perfectly fine.
If you trust your game script designer(s), exec away. If the levels can come from random people on the Internet, at least let your players be aware of the risk.
You can run your external script either by importing it with __import__() or by using the exec/execfile functions. It is good to know that:
import compiles the script to byte-code and saves a .pyc file on first use;
import searches for the module and creates a module object with its own namespace, in contrast to exec/execfile, which just execute the code in a string/file;
it is possible to sandbox your code a bit by tampering with the globals and builtins passed as parameters to __import__/execfile, although this method is not very safe; see my answer to this question for an example.
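execfile no longer exists in Python 3. One possible replacement for running a script file in its own namespace is runpy.run_path from the standard library. This is a sketch only; the helper name run_script_file and the api parameter are made up for illustration.

```python
import runpy


def run_script_file(path, api=None):
    """Execute the file at `path` and return its resulting namespace.

    `api` is an optional dict of names to pre-populate in the script's
    globals (a hypothetical way to hand engine objects to the script).
    """
    return runpy.run_path(path, init_globals=api or {})
```

The returned dict holds every global the script defined, so the engine can look up callbacks in it. The script still runs with full privileges.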
I'm developing a web game in pure Python, and want some simple scripting available to allow for more dynamic game content. Game content can be added live by privileged users.
It would be nice if the scripting language could be Python. However, it can't run with access to the environment the game runs on since a malicious user could wreak havoc which would be bad. Is it possible to run sandboxed Python in pure Python?
Update: In fact, since true Python support would be way overkill, a simple scripting language with Pythonic syntax would be perfect.
If there aren't any Pythonic script interpreters, are there any other open source script interpreters written in pure Python that I could use? The requirements are support for variables, basic conditionals and function calls (not definitions).
This is really non-trivial.
There are two ways to sandbox Python. One is to create a restricted environment (i.e., very few globals etc.) and exec your code inside this environment. This is what Messa is suggesting. It's nice but there are lots of ways to break out of the sandbox and create trouble. There was a thread about this on Python-dev a year ago or so in which people did things from catching exceptions and poking at internal state to break out to byte code manipulation. This is the way to go if you want a complete language.
The other way is to parse the code and then use the ast module to kick out constructs you don't want (e.g. import statements, function calls etc.) and then to compile the rest. This is the way to go if you want to use Python as a config language etc.
Another way (which might not work for you since you're using GAE), is the PyPy sandbox. While I haven't used it myself, word on the intertubes is that it's the only real sandboxed Python out there.
Based on your description of the requirements (support for variables, basic conditionals and function calls, but not definitions), you might want to evaluate the second approach and reject everything else in the code. It's a little tricky but doable.
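To make the second approach concrete, here is a rough sketch assuming Python 3.8 or later. It parses the script with the ast module, rejects every node type not on a small whitelist, allows calls only to names the host supplies, and then compiles and runs what is left. The names ALLOWED_NODES, compile_restricted and run_restricted are invented for this example, and the whitelist has not been audited, so treat it as a starting point rather than a proven sandbox.

```python
import ast

# Node types the mini-language accepts; anything else is rejected.
ALLOWED_NODES = (
    ast.Module, ast.Expr, ast.Assign, ast.If, ast.Pass,
    ast.Name, ast.Load, ast.Store, ast.Constant,
    ast.BinOp, ast.Add, ast.Sub, ast.Mult, ast.Div,
    ast.Compare, ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
    ast.BoolOp, ast.And, ast.Or, ast.UnaryOp, ast.Not, ast.USub,
    ast.Call,
)


def compile_restricted(source, allowed_functions):
    """Parse `source`, reject anything off the whitelist, return a code object."""
    tree = ast.parse(source, mode="exec")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("construct not allowed: " + type(node).__name__)
        if isinstance(node, ast.Name) and node.id.startswith("__"):
            raise ValueError("dunder names are not allowed")
        if isinstance(node, ast.Call):
            # Only direct calls to whitelisted names, e.g. heal(5).
            func = node.func
            if not (isinstance(func, ast.Name) and func.id in allowed_functions):
                raise ValueError("call not allowed")
    return compile(tree, "<script>", "exec")


def run_restricted(source, api):
    """Run a checked script with `api` (name -> callable) as its only globals."""
    code = compile_restricted(source, api)
    env = {"__builtins__": {}}
    env.update(api)
    exec(code, env)
    return env
```

Attribute access, imports, loops and definitions are all missing from the whitelist, so a script such as "import os" or "().__class__" fails at the checking stage, before anything is executed.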
Roughly ten years after the original question, Python 3.8.0 comes with auditing. Can it help? Let's limit the discussion to hard-drive writing for simplicity - and see:
from sys import addaudithook
def block_mischief(event,arg):
    if 'WRITE_LOCK' in globals() and ((event=='open' and arg[1]!='r')
            or event.split('.')[0] in ['subprocess', 'os', 'shutil', 'winreg']):
        raise IOError('file write forbidden')
addaudithook(block_mischief)
So far exec could easily write to disk:
exec("open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')", dict(locals()))
But we can forbid it at will, so that no wicked user can access the disk from the code supplied to exec(). Pythonic modules like numpy or pickle eventually use the Python's file access, so they are banned from disk write, too. External program calls have been explicitly disabled, too.
WRITE_LOCK = True
exec("open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')", dict(locals()))
exec("open('/tmp/FILE','a').write('pwned by l33t h4xx0rz')", dict(locals()))
exec("numpy.savetxt('/tmp/FILE', numpy.eye(3))", dict(locals()))
exec("import subprocess; subprocess.call('echo PWNED >> /tmp/FILE', shell=True)", dict(locals()))
An attempt to remove the lock from within exec() seems to be futile, since the auditing hook uses a different copy of locals that is not accessible to the code run by exec. Please prove me wrong.
exec("print('muhehehe'); del WRITE_LOCK; open('/tmp/FILE','w')", dict(locals()))
...
OSError: file write forbidden
Of course, the top-level code can enable file I/O again.
del WRITE_LOCK
exec("open('/tmp/FILE','w')", dict(locals()))
Sandboxing within CPython has proven extremely hard, and many previous attempts have failed. This approach is also not entirely secure, e.g. for public web access:
Compiled modules that make direct OS calls may not be audited by CPython, so whitelisting safe pure-Python modules is recommended.
There is definitely still the possibility of crashing or overloading the CPython interpreter.
Some loopholes for writing files to the hard drive may remain, too. But I could not use any of the usual sandbox-evasion tricks to write a single byte. We can say the "attack surface" of the Python ecosystem is reduced to a rather narrow list of events to be (dis)allowed: https://docs.python.org/3/library/audit_events.html
I would be thankful to anybody pointing me to the flaws of this approach.
EDIT: So this is not safe either! I am very thankful to #Emu for his clever hack using exception catching and introspection:
#!/usr/bin/python3.8
from sys import addaudithook
def block_mischief(event,arg):
    if 'WRITE_LOCK' in globals() and ((event=='open' and arg[1]!='r') or event.split('.')[0] in ['subprocess', 'os', 'shutil', 'winreg']):
        raise IOError('file write forbidden')
addaudithook(block_mischief)
WRITE_LOCK = True
exec("""
import sys
def r(a, b):
    try:
        raise Exception()
    except:
        del sys.exc_info()[2].tb_frame.f_back.f_globals['WRITE_LOCK']
import sys
w = type('evil',(object,),{'__ne__':r})()
sys.audit('open', None, w)
open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')""", dict(locals()))
I guess that auditing+subprocessing is the way to go, but do not use it on production machines:
https://bitbucket.org/fdominec/experimental_sandbox_in_cpython38/src/master/sandbox_experiment.py
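The subprocess half of that idea can be sketched on its own. The snippet below is not the code from the linked repository; it only shows, under my own assumptions, how to run a script in a separate interpreter with a time limit, so that an infinite loop or a crash cannot take the main program down. The name run_in_child is made up, and the child is not sandboxed in any other way.

```python
import subprocess
import sys


def run_in_child(source, timeout_s=5.0):
    """Run `source` in a separate Python process; return (ok, output)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", source],  # -I: isolated mode
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    if proc.returncode != 0:
        return False, proc.stderr
    return True, proc.stdout
```

For example, run_in_child("while True: pass", timeout_s=1) comes back with (False, "timed out") instead of hanging the caller.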
AFAIK it is possible to run code in a completely isolated environment:
exec(somePythonCode, {'__builtins__': {}}, {})
But in such an environment you can do almost nothing :) (you cannot even import a module; still, a malicious user can trigger infinite recursion or exhaust memory.) You would probably want to add some objects that serve as the interface to your game engine.
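As a rough illustration of adding an interface object to that empty environment: GameAPI and run_script below are hypothetical names, and the sketch assumes Python 3. Known introspection tricks can still escape an environment like this, so it guards against accidents rather than attackers.

```python
class GameAPI:
    """Hypothetical facade exposing only what scripts are meant to touch."""

    def __init__(self):
        self.messages = []

    def say(self, text):
        self.messages.append(str(text))


def run_script(source, api):
    """Execute `source` with no builtins and a single global named `game`."""
    env = {"__builtins__": {}, "game": api}
    exec(source, env, {})
```

With this setup a script can call game.say('hello'), while a plain "import os" fails with ImportError because __import__ is not among the (empty) builtins.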
I'm not sure why nobody mentions this, but Zope 2 has a thing called Python Script, which is exactly that - restricted Python executed in a sandbox, without any access to filesystem, with access to other Zope objects controlled by Zope security machinery, with imports limited to a safe subset.
Zope in general is pretty safe, so I would imagine there are no known or obvious ways to break out of the sandbox.
I'm not sure how exactly Python Scripts are implemented, but the feature was around since like year 2000.
And here's the magic behind PythonScripts, with detailed documentation: http://pypi.python.org/pypi/RestrictedPython - it even looks like it doesn't have any dependencies on Zope, so can be used standalone.
Note that this is not for safely running arbitrary python code (most of the random scripts will fail on first import or file access), but rather for using Python for limited scripting within a Python application.
This answer is from my comment to a question closed as a duplicate of this one: Python from Python: restricting functionality?
I would look into a two-server approach. The first server is the privileged web server where your code lives. The second server is a very tightly controlled server that only provides a web service or RPC service and runs the untrusted code. You provide your content creators with your custom interface. For example, if you allowed the end user to create items, you would have a lookup that called the server with the code to execute and the set of parameters.
Here's an abstract example for a healing potion.
{function_id='healing potion', action='use', target='self', inventory_id='1234'}
The response might be something like
{hp='+5' action={destroy_inventory_item, inventory_id='1234'}}
Hmm. This is a thought experiment, I don't know of it being done:
You could use the compiler package to parse the script. You can then walk the resulting tree, prefixing all identifiers (variables, method names, also hasattr/getattr/setattr invocations and so on) with a unique preamble so that they cannot possibly refer to your variables. You could also ensure that the compiler package itself is not invoked, and perhaps block other blacklisted things such as opening files. You then emit the Python code for this and compiler.compile it.
The docs note that the compiler package is not in Python 3.0, but do not mention what the 3.0 alternative is.
In general, this parallels how forum software and the like try to whitelist 'safe' JavaScript or HTML. Historically they have a bad record of catching all the escapes. But you might have more luck with Python :)
I think your best bet is going to be a combination of the replies thus far.
You'll want to parse and sanitise the input - removing any import statements for example.
You can then use Messa's exec sample (or something similar) to allow the code execution against only the builtin variables of your choosing - most likely some sort of API defined by yourself that provides the programmer access to the functionality you deem relevant.
I want to allow users to make their own Python "mods" for my game, by placing their scripts in a special folder which the game "scans" for Python modules and imports.
What would be the simplest way to prevent "dangerous" scripts from being imported? I don't want people complaining to me that they used someone's mod and it erased their hard drive.
The things I would like to limit are accessing/modifying/creating any files outside of their folder, and connecting to the internet/downloading/sending data. If you can think of anything else, let me know.
So how can this be done?
RestrictedPython seems to be able to restrict functionality for code in a clean way and is compatible with Python up to 2.7.
http://pypi.python.org/pypi/RestrictedPython/
e.g.
By supplying a different __builtins__ dictionary, we can rule out unsafe operations, such as opening files [...]
The obvious way to do it is to load the module as a string and exec it. This has just as many security risks, but might be easier to block by using custom globals and locals. Have a look at this question - it gives some really good guidance on this. As pointed out in Delnan's comments, this isn't completely secure though.
You could also try this. I haven't used it, but it seems to provide a safe environment for unsafe scripts.
There are some serious shortcomings for sandboxed python execution. aquavitae's answer links to some good discussion on the matter, especially this blog post. Read that first.
There is a kernel of secure execution within CPython. The fundamental idea is to replace the __builtins__ global (note: not the __builtin__ module), which tells Python to turn on some security features: it makes a handful of attributes on certain objects inaccessible and removes most of the implementation objects from the interpreter when evaluating that bit of code.
You'll then need to write an actual implementation, in such a way that the protected modules are not leaked into the sandbox. A fairly well-tested "file" replacement is provided in the linked blog. Taking a look at it might give you an idea of how involved and complex this problem is.
So now that you understand that this is a challenge in Python, you should take a look at languages with sandboxed execution as a core feature, such as Lua, which is very popular in games.
Giving them Python execution and trying to limit what they do is asking for trouble. See this SO question for discussion and a pointer to a good article. (You would presumably disable "eval", but it wouldn't make much difference in practice.)
My suggestion: Turn the question around. Your goal is to provide them with scripting facilities so they can enhance the game. Find or define an interpreter for a suitable scripting language that has the features you need, and use it to execute their scripts. For example, you could support data persistence in a simple keystore model, without giving them file creation access. Or give them a command to create files but ensure it only accepts a path-less filename. The essential thing is to ensure that there is NO way for them to execute python commands directly.
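The "path-less filename" command could look roughly like this. It is only a sketch: open_mod_file and mod_dir are names invented here, and it does not protect against symlinks that already exist inside the mod folder.

```python
import os


def open_mod_file(mod_dir, filename, mode="r"):
    """Open `filename` inside `mod_dir`, rejecting anything that contains a path."""
    if (not filename
            or filename in (".", "..")
            or "/" in filename
            or "\\" in filename
            or os.path.basename(filename) != filename):
        raise ValueError("only plain file names are allowed: %r" % (filename,))
    return open(os.path.join(mod_dir, filename), mode)
```

The scripting interpreter would expose this function (with mod_dir fixed by the game) in place of a general file-opening command.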
I'm making a wxPython app that I will compile with the various freezing utilities out there to create an executable for multiple platforms.
The program will be a map editor for a tile-based game engine.
In this app I want to provide a scripting system so that advanced users can modify the behavior of the program, such as modifying project data, exporting the project to a different format, etc.
I want the system to work like so:
The user will place the Python script they wish to run into a styled text box and then press a button to execute the script.
I'm good with this so far; that's all really simple stuff.
Obtain the script from the text box as a string, compile it to a code object with the built-in function compile(), then execute the script with an exec statement:
script = textbox.text #bla bla store the string
code = compile(script, "script", "exec") #make the code object
eval(code, globals())
The thing is, I want to make sure that this feature can't cause any errors or bugs.
Say there is an import statement in the script. Will this cause any problems, taking into account that the code has been compiled with something like py2exe or py2app?
How do I make sure that the user can't break critical parts of the program, like modifying part of the GUI, while still allowing them to modify the project data (the data is held in global properties in its own module)? I think that this would mean modifying the globals dict that is passed to the eval function.
How do I make sure that this eval can't cause the program to hang due to a long or infinite loop?
How do I make sure that an error raised inside the user's code can't crash the whole app?
Basically, how do I avoid all those problems that can arise when allowing the user to run their own code?
EDIT: Concerning the answers given
I don't feel like any of the answers so far have really answered my questions.
Yes, they have been answered in part, but not completely. I'm well aware that it is impossible to completely stop unsafe code. People are just too clever for one man (or even a team) to think of all the ways to get around a security system and prevent them.
In fact, I don't really care if they do. I'm more worried about someone unintentionally breaking something they didn't know about. If someone really wanted to, they could tear the app to shreds with the scripting functionality, but I couldn't care less. It will be their instance, and all the problems they create will be gone when they restart the app, unless they have messed with files on the HD.
I want to prevent the problems that arise when the user does something stupid.
Things like IOErrors, SyntaxErrors, infinite loops, etc.
Now the part about scope has been answered. I now understand how to define which functions and globals can be accessed from the eval function.
But is there a way to make sure that the execution of their code can be stopped if it is taking too long?
A green thread system, perhaps? (Green because it would be evil to make users worry about thread safety.)
Also, what if a user uses an import statement to load a module, even one from the standard library, that isn't used in the rest of the program? Could this cause problems with the app being frozen by py2exe, py2app, or Freeze? What if they import a module outside of the standard library? Would it be enough that the module is present in the same directory as the frozen executable?
I would like to get these answers without creating a new question, but I will if I must.
Easy answer: don't.
You can forbid certain keywords (import) and operations, and accesses to certain data structures, but ultimately you're giving your power users quite a bit of power. Since this is for a rich client that runs on the user's machine, a malicious user can crash or even trash the whole app if they really feel like it. But it's their instance to crash. Document it well and tell people what not to touch.
That said, I've done this sort of thing for web apps that execute user input and yes, call eval like this:
eval(code, {"__builtins__": None}, safe_functions)
where safe_functions is a dictionary containing {"name": func} type pairs of functions you want your users to be able to access. If there's some essential data structure that you're positive your users will never want to poke at, just pop it out of globals before passing them in.
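A small sketch of that call, with an invented whitelist and an invented helper name safe_eval. An empty dict is used for __builtins__ here so that a name outside the whitelist fails with a plain NameError. As the rest of this thread points out, this protects against accidents, not against determined attackers.

```python
import math

# Hypothetical whitelist: the only names user expressions may use.
safe_functions = {"sqrt": math.sqrt, "abs": abs, "max": max}


def safe_eval(expression):
    """Evaluate `expression` with no builtins and only the whitelisted names."""
    return eval(expression, {"__builtins__": {}}, safe_functions)
```

Here safe_eval("max(sqrt(16), abs(-7))") works as expected, while safe_eval("open('x')") raises NameError because open is not in the whitelist.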
Incidentally, Guido addressed this issue on his blog a while ago. I'll see if I can find it.
Edit: found.
Short Answer: No
Is using eval in Python a bad practice?
Other related posts:
Safety of Python 'eval' For List Deserialization
It is not easy to create a safety net. There are too many details, and clever hacks are around:
Python: make eval safe
On your design goals:
It seems you are trying to build an extensible system by allowing users to modify a lot of behavior and logic.
The easiest option is to ask them to write a script which you can evaluate (eval) during the program run.
However, a good design describes and scopes the flexibility, and provides extension mechanisms through various schemes ranging from configuration and plug-ins to scripting capabilities. A well-defined scripting API can provide more meaningful extensibility. It is safer, too.
I'd suggest providing some kind of plug-in API and allowing users to provide plug-ins in the form of text files. You can then import them as modules into their own namespace, catching syntax errors in the process, and call the various functions defined in the plug-in module, again checking for errors. You can provide an API module that defines the functions/classes from your program that the plug-in module has access to. That gives you the freedom to make changes to your application's architecture without breaking plug-ins, since you can just adapt the API module to expose the functionality in the same way.
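A minimal sketch of that plug-in scheme follows. The names load_plugin and call_hook, and the convention that the plug-in sees the application through a global called api, are assumptions made for this example. Errors are trapped at load time and at call time, so a broken plug-in reports a traceback instead of crashing the app.

```python
import traceback
import types


def load_plugin(name, source, api):
    """Compile plug-in text into its own module; return (module, error_text)."""
    module = types.ModuleType(name)
    module.api = api  # the only handle the plug-in gets on the application
    try:
        code = compile(source, name, "exec")
        exec(code, module.__dict__)
    except Exception:  # includes SyntaxError raised by compile()
        return None, traceback.format_exc()
    return module, None


def call_hook(module, hook, *args):
    """Call module.<hook>(*args) if it exists; return (result, error_text)."""
    func = getattr(module, hook, None)
    if func is None:
        return None, None
    try:
        return func(*args), None
    except Exception:
        return None, traceback.format_exc()
```

The error text can be shown in the GUI. This does not stop infinite loops or restrict what the plug-in can import; it only keeps ordinary mistakes from propagating.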
If you have the option to switch to Tkinter you can use the bundled tcl interpreter to process your script. For that matter you can probably do that with a wxpython app if you don't start the tk event loop; just use the tcl interpreter without creating any windows.
Since the tcl interpreter is a separate thing it should be nearly impossible to crash the python interpreter if you are careful about what commands you expose to tcl. Plus, tcl makes creating DSLs very easy.
Python - the only scripting language with a built-in scripting engine :-).
When I first started reading about Python, all of the tutorials have you use Python's Interactive Mode. It is difficult to save, write long programs, or edit your existing lines (for me at least). It seems like a far more difficult way of writing Python code than opening up a code.py file and running the interpreter on that file.
python code.py
I am coming from a Java background, so I have ingrained expectations of writing and compiling files for programs. I also know that a feature would not be so prominent in Python documentation if it were not somehow useful. So what am I missing?
Let's see:
If you want to know how something works, you can just try it. There is no need to write up a file. I almost always scratch write my programs in the interpreter before coding them. It's not just for things that you don't know how they work in the programming language. I never remember what the correct arguments to range are to create, for example, [-2, -1, 0, 1]. I don't need to. I just have to fire up the interpreter and try stuff until I figure out it is range(-2, 2) (did that just now, actually).
You can use it as a calculator.
Python is a very introspective programming language. If you want to know anything about an object, you can just do dir(object). If you use IPython, you can even do object.<TAB> and it will tab-complete the methods and attributes of that object. That's way faster than looking stuff up in documentation or even in code.
help(anything) for documentation. It's way faster than any web interface.
Again, you have to use IPython (highly recommended), but you can time stuff. %timeit func1() and %timeit func2() is a common idiom to determine what is faster.
How often have you wanted to write a program to use once, and then never again. The fastest way to do this is to just do it in the Python interpreter. Sure, you have to be careful writing loops or functions (they must have the correct syntax the first time), but most stuff is just line by line, and you can play around with it.
Debugging. You don't need to put selective print statements in code to see what variables are when you write it in the interpreter. You just have to type >>> a, and it will show what a is. It's also nice for checking that you constructed something correctly. The built-in Python debugger pdb also uses the interpreter functionality, so you can not only see what a variable is when debugging, but also manipulate or even change it without halting debugging.
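A few of the points above, written out as one would type them in a session. The two timed functions are made up for the demonstration, and the standard timeit module is used as a rough stand-in for IPython's %timeit.

```python
import timeit

# The range() experiment from the first point.
print(list(range(-2, 2)))  # [-2, -1, 0, 1]

# Introspection: list the str methods whose names start with "is".
print([name for name in dir(str) if name.startswith("is")])


def with_loop():
    total = 0
    for i in range(1000):
        total += i
    return total


def with_builtin():
    return sum(range(1000))


# Seconds taken by 1000 calls of each function.
loop_seconds = timeit.timeit(with_loop, number=1000)
builtin_seconds = timeit.timeit(with_builtin, number=1000)
print(loop_seconds, builtin_seconds)
```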
When people say that Python is faster to develop in, I guarantee that this is a big part of what they are talking about.
Commenters: anything I am forgetting?
REPL Loops (like Python's interactive mode) provide immediate feedback to the programmer. As such, you can rapidly write and test small pieces of code, and assemble those pieces into a larger program.
You're talking about running Python in the console by simply typing "python"? That's just for little tests and for practicing with the language. It's very useful when learning the language and testing out other modules.
Of course any real software project is written in .py files and later executed by the interpreter!
The Python interpreter is a least common denominator: you can run it on multiple platforms, and it acts the same way (modulo platform-specific modules), so it's pretty easy to get a newbie going with.
It's a lot easier to tell a newbie to launch the interpreter and "do this" than to have them open a file, type in some code, save it, make it executable, make sure python is in your PATH, or use a #! line, etc etc. Scrap all of that and just launch the interpreter. For simple examples, you can't beat it. It was never meant for long programs, so if you were using it for that, you probably missed the part of the tutorial that told you "longer scripts go in a file". :)
you use the interactive interpreter to test snippets of your code before you put them into your script.
As already mentioned, the Python interactive interpreter gives a quick and dirty way to test simple Python functions and/or code snippets.
I personally use the Python shell as a very quick way to perform simple numerical operations (provided by the math module). I have my environment set up so that the math module is automatically imported whenever I start a Python shell. In fact, it's a good way to "market" Python to non-Pythonistas. Show them how they can use Python as a neat scientific calculator, and for simple mathematical prototyping.
One thing I use interactive mode for that others haven't mentioned: to see if a module is installed. Just fire up Python and try to import the module; if it dies, then your PYTHONPATH is broken or the module is not installed.
This is a great first step for "Hey, it's not working on my machine" or "Which Python did that get installed in, anyway" bugs.
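The same check can be done without actually importing the module, which also tells you which interpreter you are asking. This is a small sketch using importlib.util.find_spec from the standard library; the helper name module_status is made up.

```python
import importlib.util
import sys


def module_status(name):
    """Report whether top-level module `name` is importable by this interpreter."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "%s: not importable from %s" % (name, sys.executable)
    return "%s: found at %s" % (name, spec.origin)
```

Running module_status("numpy") in each installed Python quickly shows which one the package ended up in.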
I find the interactive interpreter very, very good for testing quick code, or to show others the Power of Python. Sometimes I use the interpreter as a handy calculator, too. It's amazing what you can do in a very short amount of time.
Aside from the built-in console, I also have to recommend Pyshell. It has auto-completion, and a decent syntax highlighting. You can also edit multiple lines of code at once. Of course, it's not perfect, but certainly better than the default python console.
When coding in Java, you almost always will have the API open in some browser window. However with the python interpreter, you can always import any module that you are thinking about using and check what it offers. You can also test the behavior of new methods that you are unsure of, to eliminate the "Oh! so THAT's how it works" as a source of bugs.
Interactive mode makes it easy to test code snippets before incorporating them into a larger program. If you use IDLE there's syntax highlighting and argument pop-ups to help you out. It's also a quick way of checking that you've figured out how to use a module without having to write a test program.
This question is semi-based on this one:
How can you profile a python script?
I thought that this would be a great idea to run on some of my programs. Although profiling from a batch file as explained in the aforementioned answer is possible, I think it would be even better to have this option in Eclipse. At the same time, making my entire program a function and profiling it would mean I have to alter the source code?
How can I configure eclipse such that I have the ability to run the profile command on my existing programs?
Any tips or suggestions are welcomed!
If you follow the common Python idiom of making all your code, even the "existing programs", importable as modules, you could do exactly what you describe without any additional hassle.
Here is the specific idiom I am talking about, which turns your program's flow "upside-down", since the __name__ == '__main__' check is placed at the bottom of the file, once all your defs are done:
# program.py file
def foo():
    """ analogous to a main(). do something here """
    pass
# ... fill in rest of function def's here ...
# here is where the code execution and control flow will
# actually originate for your code, when program.py is
# invoked as a program. a very common Pythonism...
if __name__ == '__main__':
    foo()
In my experience, it is quite easy to retrofit any existing scripts you have to follow this form, probably a couple minutes at most.
Since there are other benefits to having your program also be a module, you'll find most Python scripts out there actually do it this way. One benefit of doing it this way: anything you write in Python is potentially usable in module form, including cProfile-ing of your foo().
You can always make separate modules that do just profiling specific stuff in your other modules. You can organize modules like these in a separate package. That way you don't change your existing code.
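Such a separate profiling module could be as small as the sketch below. profile_call is a made-up helper name; the cProfile and pstats calls are from the standard library.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func(*args, **kwargs) under cProfile; return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()
```

With the program.py layout shown earlier, a profiling script would import program and call profile_call(program.foo), which can be run from Eclipse like any other script and leaves program.py untouched.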