I understand that letting any anonymous user upload any sort of file in general can be dangerous, especially if it's code. However, I have an idea to let users upload custom AI scripts to my website. I would provide the template so that the user could compete with other AI's in an online web game I wrote in Python. I either need a solution to ensure a user couldn't compromise any other files or inject malicious code via their uploaded script or a solution for client-side execution of the game. Any suggestions? (I'm looking for a solution that will work with my Python scripts)
I am in no way associated with this site and I'm only linking it because it tries to achieve what you are getting after: jailing of python. The site is code pad.
According to the about page it is ran under geordi and traps all sys calls with ptrace. In addition to be chroot'ed they are on a virtual machine with firewalls in place to disallow outbound connections.
Consider it a starting point but I do have to chime in on the whole danger thing. Gotta CYA myself. :)
Using PyPy you can create a python sandbox. The sandbox is a separate and supposedly secure python environment where you can execute their scripts. More info here
http://codespeak.net/pypy/dist/pypy/doc/sandbox.html
"In theory it's impossible to do anything bad or read a random file on the machine from this prompt."
"This is safe to do even if script.py comes from some random untrusted source, e.g. if it is done by an HTTP server."
Along with other safeguards, you can also incorporate human review of the code. Assuming part of the experience is reviewing other members' solutions, and everyone is a python developer, don't allow new code to be activated until a certain number of members vote for it. Your users aren't going to approve malicious code.
Yes.
Allow them to script their client, not your server.
PyPy is probably a decent bet on the server side as suggested, but I'd look into having your python backend provide well defined APIs and data formats and have the users implement the AI and logic in Javascript so it can run in their browser. So the interaction would look like: For each match/turn/etc, pass data to the browser in a well defined format, provide a javascript template that receives the data and can implement logic, and provide web APIs that can be invoked by the client (browser) to take the desired actions. That way you don't have to worry about security or server power.
Have an extensive API for the users and strip all other calls upon upload (such as import statements). Also, strip everything that has anything to do with file i/o.
(You might want to do multiple passes to ensure that you didn't miss anything.)
Related
I have to setup a program which reads in some parameters from a widget/gui, calculates some stuff based on database values and the input, and finally sends some ascii files via ftp to remote servers.
In general, I would suggest a python program to do the tasks. Write a Qt widget as a gui (interactively changing views, putting numbers into tables, setting up check boxes, switching between various layers - never done something as complex in python, but some experience in IDL with event handling etc), set up data classes that have unctions, both to create the ascii files with the given convention, and to send the files via ftp to some remote server.
However, since my company is a bunch of Windows users, each sitting at their personal desktop, installing python and all necessary libraries on each individual machine would be a pain in the ass.
In addition, in a future version the program is supposed to become smart and do some optimization 24/7. Therefore, it makes sense to put it to a server. As I personally rather use Linux, the server is already set up using Ubuntu server.
The idea is now to run my application on the server. But how can the users access and control the program?
The easiest way for everybody to access something like a common control panel would be a browser I guess. I have to make sure only one person at a time is sending signals to the same units at a time, but that should be doable via flags in the database.
After some google-ing, next to QtWebKit, django seems to the first choice for such a task. But...
Can I run a full fledged python program underneath my web application? Is django the right tool to do so?
As mentioned previously, in the (intermediate) future ( ~1 year), we might have to implement some computational expensive tasks. Is it then also possible to utilize C as it is within normal python?
Another question I have is on the development. In order to become productive, we have to advance in small steps. Can I first create regular python classes, which later on can be imported to my web application? (Same question applies for widgets / QT?)
Finally: Is there a better way to go? Any standards, any references?
Django is a good candidate for the website, however:
It is not a good idea to run heavy functionality from a website. it should happen in a separate process.
All functions should be asynchronous, I.E. You should never wait for something to complete.
I would personally recommend writing a separate process with a message queue and the website would only ask that process for statuses and always display a result immediatly to the user
You can use ajax so that the browser will always have the latest result.
ZeroMQ or Celery are useful for implementing the functionality.
You can implement functionality in C pretty easily. I recomment however that you write that functionality as pure c with a SWIG wrapper rather that writing it as an extension module for python. That way the functionality will be portable and not dependent on the python website.
I understand that letting any anonymous user upload any sort of file in general can be dangerous, especially if it's code. However, I have an idea to let users upload custom AI scripts to my website. I would provide the template so that the user could compete with other AI's in an online web game I wrote in Python. I either need a solution to ensure a user couldn't compromise any other files or inject malicious code via their uploaded script or a solution for client-side execution of the game. Any suggestions? (I'm looking for a solution that will work with my Python scripts)
I am in no way associated with this site and I'm only linking it because it tries to achieve what you are getting after: jailing of python. The site is code pad.
According to the about page it is ran under geordi and traps all sys calls with ptrace. In addition to be chroot'ed they are on a virtual machine with firewalls in place to disallow outbound connections.
Consider it a starting point but I do have to chime in on the whole danger thing. Gotta CYA myself. :)
Using PyPy you can create a python sandbox. The sandbox is a separate and supposedly secure python environment where you can execute their scripts. More info here
http://codespeak.net/pypy/dist/pypy/doc/sandbox.html
"In theory it's impossible to do anything bad or read a random file on the machine from this prompt."
"This is safe to do even if script.py comes from some random untrusted source, e.g. if it is done by an HTTP server."
Along with other safeguards, you can also incorporate human review of the code. Assuming part of the experience is reviewing other members' solutions, and everyone is a python developer, don't allow new code to be activated until a certain number of members vote for it. Your users aren't going to approve malicious code.
Yes.
Allow them to script their client, not your server.
PyPy is probably a decent bet on the server side as suggested, but I'd look into having your python backend provide well defined APIs and data formats and have the users implement the AI and logic in Javascript so it can run in their browser. So the interaction would look like: For each match/turn/etc, pass data to the browser in a well defined format, provide a javascript template that receives the data and can implement logic, and provide web APIs that can be invoked by the client (browser) to take the desired actions. That way you don't have to worry about security or server power.
Have an extensive API for the users and strip all other calls upon upload (such as import statements). Also, strip everything that has anything to do with file i/o.
(You might want to do multiple passes to ensure that you didn't miss anything.)
Is it possible to make python run on your homepage? I know, this is a really stupid question but please don't pick on me for my stupidity :)
If it is possible, how? Do you have to upload/install the executing part of Python to you website using FTP? or...?
Edit: Just found out my provider does not support python and that shell access is completely restricted. Problem solved :)
Everything depends on the hosting provider you use for your homepage -- do they offer Python among their services, and, if so, what version, and how do you write server-side scripts to use it (is it CGI-only, or...?) -- if not, or the version / deployment options disappoint, what do they allow in terms of giving you shell access and running long-time processes?
It's impossible for us to judge any of these aspects, because every single one of them depends on your hosting provider, and absolutely none of them depends on Python itself!-)
Yes, you can. I don't know exactly how but I know it is possible. Mabye look into this website:
https://trinket.io/
This website lets you do this. I sent them a message to see how they do it so I will update this to let you know after they respond.
Python is a scripting language, though it is used gracefully for building back end web applications.
I created a nice RSS application in Python. It took a while and most of the code just does heavy work, like formatting XML, downloading feeds, etc. To the application itself requires very little user interaction, just a initial list of RSS feeds and some parameters.
What would be really nice, is if I was able to have a web front-end which allowed me to have the user edit their feeds and parameters, then they could click a create button and it runs.
I don't really want to have to rewrite the thing in a web framework. Is there anything that will allow me to build a nice front-end allowing it to interact with the normal Python underneath?
It depends on your needs, free time, etc.
I recommend two solutions:
Django - a very rich framework which allows you to create full featured sites using only accessible components (in most cases they are good enough)
http://werkzeug.pocoo.org/ - collections of tools if you want to have possibility to control everything from the low level
web.py is a very lightweight 'library' (not framework) that you can put as a front end to your app. Just import your app within the main controller and use it as you would.
The Python standard library also includes a builting SimpleHTTPServer module which might be what you need to create a front end for your app.
You may also either deploy your Python code as CGI script on a webserver of your choice, e.g. Tomcat:
The CGI (Common Gateway Interface) defines a way for a web server to
interact with external content-generating programs, which are often
referred to as CGI programs or CGI scripts.
According to a Qura-question this might be appropriate only for small projects, but I do not say anything wrong with that since it worked well for me for perl-scripts. The same source suggests a Python WSGI (web-service gateway) service like uwsgi another service dedicated to running Python code.
Last but not least, there is the solution to encapsulate your Python into Java-code: I stumbled upon the Quora-question "How do I run Java and Python in Tomcat?" which refered to using Jython and plyJy, the latter project is not alive anymore. However, there is also a related question on the topic of bundling Python and Java..
I would like to enable students to submit python code solutions to a few simple python problems. My applicatoin will be running in GAE. How can I limit the risk from malicios code that is sumitted? I realize that this is a hard problem and I have read related Stackoverflow and other posts on the subject. I am curious if the restrictions aleady in place in the GAE environment make it simpler to limit damage that untrusted code could inflict. Is it possible to simply scan the submitted code for a few restricted keywords (exec, import, etc.) and then ensure the code only runs for less than a fixed amount of time, or is it still difficult to sandbox untrusted code even in the resticted GAE environment? For example:
# Import and execute untrusted code in GAE
untrustedCode = """#Untrusted code from students."""
class TestSpace(object):pass
testspace = TestSpace()
try:
#Check the untrusted code somehow and throw and exception.
except:
print "Code attempted to import or access network"
try:
# exec code in a new namespace (Thanks Alex Martelli)
# limit runtime somehow
exec untrustedCode in vars(testspace)
except:
print "Code took more than x seconds to run"
#mjv's smiley comment is actually spot-on: make sure the submitter IS identified and associated with the code in question (which presumably is going to be sent to a task queue), and log any diagnostics caused by an individual's submissions.
Beyond that, you can indeed prepare a test-space that's more restrictive (thanks for the acknowledgment;-) including a special 'builtin' that has all you want the students to be able to use and redefines __import__ &c. That, plus a token pass to forbid exec, eval, import, __subclasses__, __bases__, __mro__, ..., gets you closer. A totally secure sandbox in a GAE environment however is a real challenge, unless you can whitelist a tiny subset of the language that the students are allowed.
So I would suggest a layered approach: the sandbox GAE app in which the students upload and execute their code has essentially no persistent layer to worry about; rather, it "persists" by sending urlfetch requests to ANOTHER app, which never runs any untrusted code and is able to vet each request very critically. Default-denial with whitelisting is still the holy grail, but with such an extra layer for security you may be able to afford a default-acceptance with blacklisting...
You really can't sandbox Python code inside App Engine with any degree of certainty. Alex's idea of logging who's running what is a good one, but if the user manages to break out of the sandbox, they can erase the event logs. The only place this information would be safe is in the per-request logging, since users can't erase that.
For a good example of what a rathole trying to sandbox Python turns into, see this post. For Guido's take on securing Python, see this post.
There are another couple of options: If you're free to choose the language, you could run Rhino (a Javascript interpreter) on the Java runtime; Rhino is nicely sandboxed. You may also be able to use Jython; I don't know if it's practical to sandbox it, but it seems likely.
Alex's suggestion of using a separate app is also a good one. This is pretty much the approach that shell.appspot.com takes: It can't prevent you from doing malicious things, but the app itself stores nothing of value, so there's no harm if you do.
Here's an idea. Instead of running the code server-side, run it client-side with Skuplt:
http://www.skulpt.org/
This is both safer, and easier to implement.