mod_python caching of variables

I'm using mod_python to run Trac in Apache. I'm developing a plugin and am not sure how global variables are stored/cached.
I am new to Python and have googled the subject and found that mod_python caches Python modules (I think). However, I would expect that cache to be reset when the web service is restarted, but it doesn't appear to be. I'm saying this because I have a global variable that is a list; I test the list to see if a value exists and, if it doesn't, I add it. The first time I ran this, it added three entries to the list. Subsequently, the list has three entries from the start.
For example:
    globalList = []

    class globalTest:
        def addToTheList(self, itemToAdd):
            # Show how many entries the module-level list already holds
            print(len(globalList))
            if itemToAdd not in globalList:
                globalList.append(itemToAdd)

        def doSomething(self):
            self.addToTheList("I am new entry one")
            self.addToTheList("I am new entry two")
            self.addToTheList("I am new entry three")
The code above is just an example of what I'm doing, not the actual code ;-). But essentially the doSomething() method is called by Trac. The first time it ran, it added all three entries. Now - even after restarting the web server the len(globalList) command is always 3.
I suspect the answer may be that my session (and therefore the global variable) is being cached, because Trac remembers my login details when I refresh the page in Trac after the web server restart. If that's the case, how do I force the cache to be cleared? Note that I don't want to reset the globalList variable manually, i.e. globalList.length = 0.
Can anyone offer any insight as to what is happening?
Thank you

Obligatory:
Switch to wsgi using mod_wsgi.
Don't use mod_python.
Help is available for configuring mod_wsgi with Trac.

Read the mod_python FAQ. It says:
"Global objects live inside mod_python for the life of the apache process, which in general is much longer than the life of a single request. This means if you expect a global variable to be initialised every time you will be surprised...."
See http://www.modpython.org/FAQ/faqw.py?req=show&file=faq03.005.htp
So the question is: why do you want to use a global variable?
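The behaviour described in the FAQ can be reproduced in plain Python, without mod_python. The following is only a minimal sketch: the function names are hypothetical, and each function call stands in for one request served by the same long-lived process.

```python
# Module-level state: created once, when the module is first imported,
# and kept for as long as the process lives.
seen_items = []

def handle_request_with_global(item):
    """Simulates a request handler that records items in a module-level list."""
    if item not in seen_items:
        seen_items.append(item)
    return len(seen_items)

def handle_request_with_local(items):
    """Simulates a request handler that keeps its state local to the call."""
    seen = []  # rebuilt on every call, so nothing leaks between requests
    for item in items:
        if item not in seen:
            seen.append(item)
    return len(seen)
```

Calling handle_request_with_global repeatedly keeps growing the same list, while handle_request_with_local starts from an empty list every time. If per-request state is what is wanted, keeping it local to the handler avoids the surprise.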

Related

Safest place for initialization code

My application has a datastore entry that needs to be initialized with some default values when the app is first deployed. I have a page that lets administrators of the app edit those values later, so it's a problem if the initialization code runs again and overwrites those edits.
I initially tried putting code in appengine_config.py, but that's clearly not correct, as any new values for the entity were overwritten after a few page loads. I thought about putting it in main.py, before the call to run_wsgi_app(), but it's my understanding that main.py is run whenever App Engine creates a new instance of the application. Warmup requests seem to have the same problem as appengine_config.py.
Is there a way to do what I'm trying to do?

Typically you could use appengine_config.py or an explicit handler.
If you use appengine_config.py, your code should check whether the values exist, and define a default only when no value exists.
My main concern with putting one-time initialisation code in appengine_config.py is that the check for these initial values will be performed on every instance startup. If there is a lot to check, that's an overhead on warm starts that you may not want.
Whatever strategy you adopt for initialisation code in a new instance, you will have this problem of checking for existence, that is, "ensuring that whatever process initialises the default values runs at most once".
Personally, I would have a specific handler that you call only once. In case it is called again, it checks whether it should run before taking any action.
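The "check before writing" idea can be sketched as follows. This is an illustration only: a plain dict stands in for the datastore, and the names fake_datastore, DEFAULTS and ensure_defaults are hypothetical. On App Engine itself the check and the write would need to happen atomically, for example with the datastore model's get_or_insert or inside a transaction.

```python
# Hypothetical stand-in for the datastore; a real app would use a model class.
fake_datastore = {}

# Hypothetical default values for the configuration entity.
DEFAULTS = {"site_title": "My App", "items_per_page": 20}

def ensure_defaults(store, key="app_config"):
    """Create the config entity only if it is missing.

    Returns True if the defaults were written, False if an entity
    already existed and was left untouched.
    """
    if key in store:
        return False
    store[key] = dict(DEFAULTS)  # copy, so later edits don't alter DEFAULTS
    return True
```

Because the function does nothing when the entity already exists, it is safe to call it from a handler more than once: edits made by administrators are not overwritten.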

Django : Call a method only once when the django starts up

I want to initialize some variables (from the database) when Django starts.
I am able to get the data from the database, but the problem is how I should call the initialize method. It should be called only once.
I tried looking at other pages, but couldn't find an answer.
The code currently looks something like this:
    def get_latest_dbx(request, ....):
        #get the data from database

    def get_latest_x(request):
        get_latest_dbx(request,x,...)

    def startup(request):
        get_latest_x(request)

Some people suggest (see "Execute code when Django starts ONCE only?") calling the initialization from the top-level urls.py, which looks unusual, since urls.py is supposed to handle URL patterns. Another workaround is to write a middleware: see "Where to put Django startup code?".
But I believe most people are waiting for the ticket to be resolved.
UPDATE:
Since the OP has updated the question, the middleware approach seems better, because he actually needs a request object in startup. All startup code can be put in a custom middleware's process_request method, which receives the request object as its first argument. After the startup code has executed, a flag can be set to avoid rerunning it later (raising the MiddlewareNotUsed exception only works in __init__, which doesn't receive a request argument).
BTW, the OP's requirement looks a bit odd. On one hand, he needs to initialize some variables when Django starts; on the other hand, he needs a request object in the initialization. But when Django starts, there may be no incoming request at all. Even if there is one, it doesn't make much sense to tie startup to it. I guess what he actually needs may be some initialization for each session or user.
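The flag-based middleware idea might look roughly like the sketch below. It is written as a plain class so that it runs without Django installed; the names StartupMiddleware, run_startup and startup_calls are hypothetical, and the class follows the old-style middleware shape with a process_request method. The flag is per process, so with several worker processes the startup code would run once in each.

```python
import threading

# Records which request triggered startup (for illustration only).
startup_calls = []

def run_startup(request):
    """Hypothetical startup code that needs a request object."""
    startup_calls.append(request)

class StartupMiddleware(object):
    """Runs run_startup on the first request this process sees, then never again."""
    _lock = threading.Lock()
    _started = False

    def process_request(self, request):
        if not StartupMiddleware._started:
            # The lock keeps two threads from both running the startup code.
            with StartupMiddleware._lock:
                if not StartupMiddleware._started:
                    run_startup(request)
                    StartupMiddleware._started = True
        return None  # returning None lets request processing continue
```

To try it outside Django, create one instance and call process_request twice with any two objects; only the first call reaches run_startup.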

There are some tricks for this. The general approach is to put the initialization code in a file that is loaded when the server starts up, so that the code runs as part of loading that file.
Have you ever tried putting print 'haha' in settings.py? :)
Note: be aware that settings.py runs twice during start-up.

Django view alter global variable

My django app contains a loop, which is launched by the following code in urls.py:
    def start_serial():
        rfmon = threading.Thread(target=rf_clicker.RFMonitor().run)
        rfmon.daemon = True
        rfmon.start()

    start_serial()
The loop inside this subthread references a global variable defined in global_vars.py. I would like to change the value of this variable in a view, but it doesn't seem to work.
From views.py:

    import global_vars

    def my_view(request):
        global_vars.myvar = 2
        return httpResponse...

How can I let the function inside the loop know that this view has been called?
The loop listens for a signal from a remote, and based on button presses may save data to the database. There are several views in the web interface, which change the settings for the remotes. While these settings are being changed the state inside the loop needs to be such that data will not be saved.

I agree with Ignacio Vazquez-Abrams, don't use globals.
Especially in your use case. The problem with this approach is that, when you deploy your app to a wsgi container or what have you, you will have multiple instances of your app running in different processes, so changing a global variable in one process won't change it in others.
I would also not recommend using threads. If you need a long-running process that handles tasks asynchronously (which seems to be the case), consider looking at Celery (http://celeryproject.org/). It's really good at that.
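For completeness: if the monitor loop and the views do happen to run in the same process, a threading.Event is a thread-safe way to pass a flag between them. The sketch below is an assumption-laden illustration (all names are hypothetical) and does not solve the multi-process problem described above, because each process would have its own Event.

```python
import threading

# Hypothetical flag: set while remote settings are being edited in the web UI.
settings_being_edited = threading.Event()

# Stand-in for the database the monitor loop writes to.
saved = []

def handle_button_press(data):
    """Called from the monitor loop; skips saving while settings are being edited."""
    if settings_being_edited.is_set():
        return False
    saved.append(data)
    return True

def start_editing_view():
    """Stand-in for the view that opens the settings page."""
    settings_being_edited.set()

def finish_editing_view():
    """Stand-in for the view that saves the settings."""
    settings_being_edited.clear()
```

The loop only has to call is_set() before saving, so no module attribute needs to be reassigned from the view.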

I will admit to having no experience leveraging them, but if you haven't looked at Django's signaling capabilities, it seems like they would be a prime candidate for this kind of activity (and more appropriate than global variables).
https://docs.djangoproject.com/en/dev/ref/signals/

Creating an in memory constant when django starts

I have this kind of setup :
Overridden BaseRunserverCommand that adds another option (--token) that would get a string token.
Store it in the app called "vault" as a global variable
Then continue executing the BaseRunserverCommand
Later, when I try to read this global variable after the server has started, I cannot see the value. Is it going out of scope? How can I store this one-time token, which is entered before Django starts?

Judging by the name of the command line option, this sounds like a configuration variable -- why not put it in settings.py like all the other configurations?
If it's a secure value that you don't want checked in to version control, one pattern I've seen is to put secure or environment-sensitive (i.e. only makes sense in production or development) configurations in a local_settings.py file which is not checked in to version control, then add to the end of your settings.py:
    try:
        from local_settings import *
    except ImportError:
        # if you require a local_settings to be present,
        # you could let this exception rise, or raise a
        # more specific exception here
        pass
(Note that some people like to invert the import relationship and then use --settings on the command line with runserver -- this works, too, but requires that you always remember to use --settings.)

There's no such thing as a "global variable" in Python that is available from everywhere. You need to import all names before you can use them.
Even if you did this, though, it wouldn't actually work on production. Not only is there no command that you run to start up the production server (it's done automatically by Apache or whatever server software you're using), the server usually runs in multiple processes, so the variable would have to be set in each of them.
You should use a setting, as dcrosta suggests.
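One way a setting can reach every server process is through the environment. This is an assumption on my part rather than something suggested in the answers, and the variable name VAULT_TOKEN and the helper load_token are hypothetical. A fragment like this could sit in settings.py:

```python
import os

def load_token(environ=os.environ):
    """Return the token from the environment, or None if it is not set."""
    return environ.get("VAULT_TOKEN")

# Evaluated once per process when the settings module is imported, so each
# worker process picks up the value from its own environment.
VAULT_TOKEN = load_token()
```

The environ parameter exists only to make the function easy to try with a plain dict instead of the real environment.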

In Python in GAE, what is the best way to limit the risk of executing untrusted code?

I would like to enable students to submit Python code solutions to a few simple Python problems. My application will be running in GAE. How can I limit the risk from malicious code that is submitted? I realize that this is a hard problem and I have read related Stack Overflow and other posts on the subject. I am curious whether the restrictions already in place in the GAE environment make it simpler to limit the damage that untrusted code could inflict. Is it possible to simply scan the submitted code for a few restricted keywords (exec, import, etc.) and then ensure the code only runs for less than a fixed amount of time, or is it still difficult to sandbox untrusted code even in the restricted GAE environment? For example:
    # Import and execute untrusted code in GAE
    untrustedCode = """#Untrusted code from students."""

    class TestSpace(object):
        pass

    testspace = TestSpace()

    try:
        # Check the untrusted code somehow and throw an exception.
        pass
    except:
        print("Code attempted to import or access network")

    try:
        # exec code in a new namespace (Thanks Alex Martelli)
        # limit runtime somehow
        exec(untrustedCode, vars(testspace))
    except:
        print("Code took more than x seconds to run")

@mjv's smiley comment is actually spot-on: make sure the submitter IS identified and associated with the code in question (which presumably is going to be sent to a task queue), and log any diagnostics caused by an individual's submissions.
Beyond that, you can indeed prepare a test-space that's more restrictive (thanks for the acknowledgment;-) including a special 'builtin' that has all you want the students to be able to use and redefines __import__ &c. That, plus a token pass to forbid exec, eval, import, __subclasses__, __bases__, __mro__, ..., gets you closer. A totally secure sandbox in a GAE environment however is a real challenge, unless you can whitelist a tiny subset of the language that the students are allowed.
So I would suggest a layered approach: the sandbox GAE app in which the students upload and execute their code has essentially no persistent layer to worry about; rather, it "persists" by sending urlfetch requests to ANOTHER app, which never runs any untrusted code and is able to vet each request very critically. Default-denial with whitelisting is still the holy grail, but with such an extra layer for security you may be able to afford a default-acceptance with blacklisting...
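The restricted builtins plus a scanning pass could be sketched as below. This is an illustration in Python 3 under my own assumptions, not a secure sandbox: the names FORBIDDEN_NAMES, FORBIDDEN_ATTRS, SAFE_BUILTINS, check_source and run_submission are hypothetical, the lists are deliberately short, the scan uses the standard ast module instead of a token scan, and no runtime limit is enforced.

```python
import ast

# Hypothetical blacklist of names and attributes the scan rejects.
FORBIDDEN_NAMES = {"exec", "eval", "__import__", "open", "compile"}
FORBIDDEN_ATTRS = {"__subclasses__", "__bases__", "__mro__", "__class__", "__globals__"}

# Hypothetical whitelist of builtins that submissions may use.
SAFE_BUILTINS = {"len": len, "range": range, "sum": sum, "min": min, "max": max}

def check_source(source):
    """Return a list of reasons to reject the source (empty if none were found)."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append("import statement")
        elif isinstance(node, ast.Name) and node.id in FORBIDDEN_NAMES:
            problems.append("forbidden name: " + node.id)
        elif isinstance(node, ast.Attribute) and node.attr in FORBIDDEN_ATTRS:
            problems.append("forbidden attribute: " + node.attr)
    return problems

def run_submission(source):
    """Scan the source, then exec it in a namespace with only SAFE_BUILTINS."""
    problems = check_source(source)
    if problems:
        raise ValueError("rejected: " + ", ".join(problems))
    namespace = {"__builtins__": SAFE_BUILTINS}
    exec(source, namespace)
    return namespace
```

A blacklist like this is easy to bypass, which is why the layered design described above, where the app that runs submissions holds nothing of value, matters more than the scan itself.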

You really can't sandbox Python code inside App Engine with any degree of certainty. Alex's idea of logging who's running what is a good one, but if the user manages to break out of the sandbox, they can erase the event logs. The only place this information would be safe is in the per-request logging, since users can't erase that.
For a good example of what a rathole trying to sandbox Python turns into, see this post. For Guido's take on securing Python, see this post.
There are another couple of options: If you're free to choose the language, you could run Rhino (a Javascript interpreter) on the Java runtime; Rhino is nicely sandboxed. You may also be able to use Jython; I don't know if it's practical to sandbox it, but it seems likely.
Alex's suggestion of using a separate app is also a good one. This is pretty much the approach that shell.appspot.com takes: It can't prevent you from doing malicious things, but the app itself stores nothing of value, so there's no harm if you do.

Here's an idea. Instead of running the code server-side, run it client-side with Skulpt:
http://www.skulpt.org/
This is both safer, and easier to implement.
