I'm writing a web-application in Python, I haven't decided if I want to use Flask, web.py or something else yet, and I want to be able to do profile on the live application.
There seems to be very little information on how you go about implementing the instrumentation to do performance measurement, short of doing a lot of print datetime.now() everywhere.
What is the best way of going about instrumenting your Python application to allow good measurements to be made. I guess I'm looking for something similar to the Stackoverflow teams mvc-mini-profiler.
You could simply run cProfile tool that comes with Python:
python -m cProfile script.py
Of course, you would have to create the script.py file that would execute the parts of the code that you want to test. If you had some unit tests, you could also use that.
Or you couse use:
import cProfile
cProfile.run('foo()')
to profile it from foo entry point.
Amir Salihefendic wrote a short (150 LOC) RequestProfiler, which is described in this blog post:
http://amix.dk/blog/post/19359
I haven't tried it, but since it is a WSGI middleware, it should be somewhat pluggable.
You can just use a general purpose web application performance tool, such as httpperf. This works using an external client and works with any framework since it works against a standard interface (HTTP). Therefore it tests the full stack performance.
Use New Relic's Free monitoring system. You simply install an agent on the server and point to your flask init.py file. Once you run the application with proper agent setup, you will start seeing application metrics in see New Relic's online dashboard called APM.
By default it will show you graphs of your application's throughput (QPS/RPM), app response time, top transactions, error rate, error stack trace if any(eg for 500 error), calls to external services etc. In addition you can monitor your System stats too.
Related
Okay, so basically I am creating a website. The data I need to display on this website is delivered twice daily, where I need to read the delivered data from a file and store this new data in the database (instead of the old data).
I have created the python functions to do this. However, I would like to know, what would be the best way to run this script, while my flask application is running? This may be a very simple answer, but I have seen some answers saying to incorporate the script into the website design (however these answers didn't explain how), and others saying to run it separately. The script needs to run automatically throughout the day with no monitoring or input from me.
TIA
Generally it's a really bad idea to put a webserver to handle such tasks, that is the flask application in your case. There are many reasons for it so just to name a few:
Python's Achilles heel - GIL.
Sharing system resources of the application between users and other operations.
Crashes - it happens, it could be unlikely but it does. And if you are not careful, the web application goes down along with it.
So with that in mind I'd advise you to ditch this idea and use crontabs. Basically write a script that does whatever transformations or operations it needs to do and create a cron job at a desired time.
I understand that letting any anonymous user upload any sort of file in general can be dangerous, especially if it's code. However, I have an idea to let users upload custom AI scripts to my website. I would provide the template so that the user could compete with other AI's in an online web game I wrote in Python. I either need a solution to ensure a user couldn't compromise any other files or inject malicious code via their uploaded script or a solution for client-side execution of the game. Any suggestions? (I'm looking for a solution that will work with my Python scripts)
I am in no way associated with this site and I'm only linking it because it tries to achieve what you are getting after: jailing of python. The site is code pad.
According to the about page it is ran under geordi and traps all sys calls with ptrace. In addition to be chroot'ed they are on a virtual machine with firewalls in place to disallow outbound connections.
Consider it a starting point but I do have to chime in on the whole danger thing. Gotta CYA myself. :)
Using PyPy you can create a python sandbox. The sandbox is a separate and supposedly secure python environment where you can execute their scripts. More info here
http://codespeak.net/pypy/dist/pypy/doc/sandbox.html
"In theory it's impossible to do anything bad or read a random file on the machine from this prompt."
"This is safe to do even if script.py comes from some random untrusted source, e.g. if it is done by an HTTP server."
Along with other safeguards, you can also incorporate human review of the code. Assuming part of the experience is reviewing other members' solutions, and everyone is a python developer, don't allow new code to be activated until a certain number of members vote for it. Your users aren't going to approve malicious code.
Yes.
Allow them to script their client, not your server.
PyPy is probably a decent bet on the server side as suggested, but I'd look into having your python backend provide well defined APIs and data formats and have the users implement the AI and logic in Javascript so it can run in their browser. So the interaction would look like: For each match/turn/etc, pass data to the browser in a well defined format, provide a javascript template that receives the data and can implement logic, and provide web APIs that can be invoked by the client (browser) to take the desired actions. That way you don't have to worry about security or server power.
Have an extensive API for the users and strip all other calls upon upload (such as import statements). Also, strip everything that has anything to do with file i/o.
(You might want to do multiple passes to ensure that you didn't miss anything.)
cprofile OR python-profiler are used to do profiling in python. I have done it for a single function or method. But I want to do profiling for a whole Django project. I want that on every call the result of profiling saves in a File. Is it possible?
What about runsnakerun GUI tool available for profiling? Is it helpful?
Check out Django Live Profiler.
I've been doing a lot of searching on this topic very recently and this is the best I've found if you want to gather information not only about the whole application but across multiple requests too. Set this up and fire some requests at your dev server with ab for example. They say it's low-overhead enough to use in production, but I haven't looked into that yet really.
For debugging single requests in a quick and dirty way, for example to see what SQL queries are being run, Django Debug Toolbar is nice; it's not exactly profiling but it's a good complement.
I created a nice RSS application in Python. It took a while and most of the code just does heavy work, like formatting XML, downloading feeds, etc. To the application itself requires very little user interaction, just a initial list of RSS feeds and some parameters.
What would be really nice, is if I was able to have a web front-end which allowed me to have the user edit their feeds and parameters, then they could click a create button and it runs.
I don't really want to have to rewrite the thing in a web framework. Is there anything that will allow me to build a nice front-end allowing it to interact with the normal Python underneath?
It depends on your needs, free time, etc.
I recommend two solutions:
Django - a very rich framework which allows you to create full featured sites using only accessible components (in most cases they are good enough)
http://werkzeug.pocoo.org/ - collections of tools if you want to have possibility to control everything from the low level
web.py is a very lightweight 'library' (not framework) that you can put as a front end to your app. Just import your app within the main controller and use it as you would.
The Python standard library also includes a builting SimpleHTTPServer module which might be what you need to create a front end for your app.
You may also either deploy your Python code as CGI script on a webserver of your choice, e.g. Tomcat:
The CGI (Common Gateway Interface) defines a way for a web server to
interact with external content-generating programs, which are often
referred to as CGI programs or CGI scripts.
According to a Qura-question this might be appropriate only for small projects, but I do not say anything wrong with that since it worked well for me for perl-scripts. The same source suggests a Python WSGI (web-service gateway) service like uwsgi another service dedicated to running Python code.
Last but not least, there is the solution to encapsulate your Python into Java-code: I stumbled upon the Quora-question "How do I run Java and Python in Tomcat?" which refered to using Jython and plyJy, the latter project is not alive anymore. However, there is also a related question on the topic of bundling Python and Java..
I would like to enable students to submit python code solutions to a few simple python problems. My applicatoin will be running in GAE. How can I limit the risk from malicios code that is sumitted? I realize that this is a hard problem and I have read related Stackoverflow and other posts on the subject. I am curious if the restrictions aleady in place in the GAE environment make it simpler to limit damage that untrusted code could inflict. Is it possible to simply scan the submitted code for a few restricted keywords (exec, import, etc.) and then ensure the code only runs for less than a fixed amount of time, or is it still difficult to sandbox untrusted code even in the resticted GAE environment? For example:
# Import and execute untrusted code in GAE
untrustedCode = """#Untrusted code from students."""
class TestSpace(object):pass
testspace = TestSpace()
try:
#Check the untrusted code somehow and throw and exception.
except:
print "Code attempted to import or access network"
try:
# exec code in a new namespace (Thanks Alex Martelli)
# limit runtime somehow
exec untrustedCode in vars(testspace)
except:
print "Code took more than x seconds to run"
#mjv's smiley comment is actually spot-on: make sure the submitter IS identified and associated with the code in question (which presumably is going to be sent to a task queue), and log any diagnostics caused by an individual's submissions.
Beyond that, you can indeed prepare a test-space that's more restrictive (thanks for the acknowledgment;-) including a special 'builtin' that has all you want the students to be able to use and redefines __import__ &c. That, plus a token pass to forbid exec, eval, import, __subclasses__, __bases__, __mro__, ..., gets you closer. A totally secure sandbox in a GAE environment however is a real challenge, unless you can whitelist a tiny subset of the language that the students are allowed.
So I would suggest a layered approach: the sandbox GAE app in which the students upload and execute their code has essentially no persistent layer to worry about; rather, it "persists" by sending urlfetch requests to ANOTHER app, which never runs any untrusted code and is able to vet each request very critically. Default-denial with whitelisting is still the holy grail, but with such an extra layer for security you may be able to afford a default-acceptance with blacklisting...
You really can't sandbox Python code inside App Engine with any degree of certainty. Alex's idea of logging who's running what is a good one, but if the user manages to break out of the sandbox, they can erase the event logs. The only place this information would be safe is in the per-request logging, since users can't erase that.
For a good example of what a rathole trying to sandbox Python turns into, see this post. For Guido's take on securing Python, see this post.
There are another couple of options: If you're free to choose the language, you could run Rhino (a Javascript interpreter) on the Java runtime; Rhino is nicely sandboxed. You may also be able to use Jython; I don't know if it's practical to sandbox it, but it seems likely.
Alex's suggestion of using a separate app is also a good one. This is pretty much the approach that shell.appspot.com takes: It can't prevent you from doing malicious things, but the app itself stores nothing of value, so there's no harm if you do.
Here's an idea. Instead of running the code server-side, run it client-side with Skuplt:
http://www.skulpt.org/
This is both safer, and easier to implement.