I have a multi-threaded xmlrpc service running which stores a huge amount of data (~2 GB) in memory. Currently, if I want to update a method the server exposes, I have to restart the service. The problem is that after a restart the service has to load all of that data back into memory from a database or from shelved data.
I am using methods like this:
def xmlrpc_getUser(self, uid):
    return self.users[uid]
What I was hoping I could do is just use these methods as a proxy to another module, so my methods would look more like this
def xmlrpc_getUser(self, uid):
    return self.proxy.getUser(uid)
This way I could update code on the development server, then simply copy my updated proxy module to the production server without the need for a restart.
I tried adding
import service_proxy
to the constructor of my xmlrpc service controller, but I think the module is cached and won't reload.
Is there a good way to do this? Thanks.
You could use the built-in reload function (importlib.reload in Python 3). You would need to write some code that checks the last-modified time of the module's file.
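A minimal sketch of that idea, assuming Python 3, where reload lives in importlib. The ModuleReloader class and its get method are hypothetical names introduced only for illustration; the wrapper compares the source file's modification time on each access and reloads the module when it has changed.

```python
import importlib
import os


class ModuleReloader:
    """Hypothetical helper: reload a module when its source file changes."""

    def __init__(self, module):
        self.module = module
        self.mtime = os.path.getmtime(module.__file__)

    def get(self):
        # Compare the file's current modification time with the last one seen.
        current = os.path.getmtime(self.module.__file__)
        if current != self.mtime:
            # importlib.reload re-executes the module and returns it.
            self.module = importlib.reload(self.module)
            self.mtime = current
        return self.module
```

In the service from the question, self.proxy could be ModuleReloader(service_proxy), and each xmlrpc method would call self.proxy.get().getUser(uid). Objects created before a reload keep the old code, so this works best when the proxy module holds only functions and the data stays in the service.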
If reload doesn't work, you could try twisted.python.rebuild; your application need not be written in Twisted to use this twisted.python utility.
I also recently saw this livecoding thing ("a code reloading library for Python"), but it talks about a custom module system and I don't know what's going on there.
Multiple flask processes (managed by gunicorn) serve the frontend and have to use a shared resource: A data structure that allows reads and updates and therefore needs to be protected by a simple (or RW) lock.
What options do I have regarding the communication between web frontend and data structure? I already had a look at the following libraries:
pyZMQ. I'm held back by the problem that arises when the service is restarting while the client is expecting data. Also I would need to implement method calling, de-/serialization and the like.
https://github.com/0rpc/zerorpc-python This is an additional layer around pyZMQ and works around this issue but seems not very actively developed and I don't want to be forced to use gevent.
Pyro. Seems to provide the functionality I need (using a single instance or python threads for the service). Might be a bit heavyweight for my needs.
socketserver. Pretty lowlevel but might also do what I want as long as I implement method calling, de-/serialization, ...
Are there better options?
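For comparison, here is a rough sketch of what the socketserver route might involve, using only the standard library. Everything in it is an assumption made for illustration: the SharedStore and JsonLineHandler classes, the make_server helper, and the one-JSON-object-per-line wire format. A plain lock guards the data structure, and method calling is limited to an explicit allow-list.

```python
import json
import socketserver
import threading


class SharedStore:
    """A dict guarded by a lock so concurrent handlers see consistent data."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def set(self, key, value):
        with self._lock:
            self._data[key] = value
        return True


class JsonLineHandler(socketserver.StreamRequestHandler):
    """Reads one JSON request per line: {"method": ..., "params": [...]}."""

    allowed = {"get", "set"}

    def handle(self):
        for line in self.rfile:
            call = json.loads(line)
            if call["method"] not in self.allowed:
                reply = {"error": "unknown method"}
            else:
                method = getattr(self.server.store, call["method"])
                reply = {"result": method(*call["params"])}
            self.wfile.write(json.dumps(reply).encode("utf-8") + b"\n")


def make_server(host="127.0.0.1", port=0):
    # Port 0 lets the operating system pick a free port.
    server = socketserver.ThreadingTCPServer((host, port), JsonLineHandler)
    server.daemon_threads = True
    server.store = SharedStore()
    return server
```

Each gunicorn worker would open a socket to this service and exchange JSON lines. Reconnecting after a service restart is still left to the client, which is the same problem noted for pyZMQ above.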
Ever since I read
A untested application is broken
in the flask documentation about testing here
I have been working down my list of things to make for some of my applications.
I currently have a Flask web app. When I write a new route, I just write a requests call such as requests.get('https://api.github.com/user', auth=('user', 'pass')), or the post, put, etc. equivalent, to test the route.
Is this a decent alternative? Or should I try and do tests via what flask's documentation says, and if so why?
Fundamentally it's the same concept, you are running functionality tests as they do. However, you have a prerequisite, a live application running somewhere (if I got it right). They create a fake application (aka mock) so you can test it without being live, e.g. you want to run tests in a CI environment.
In my opinion it's a better alternative than a live system. Your current approach consumes more resources on your local machine, since you are required to run the whole system to test something (i.e. at least a DB and the application itself). In their approach they don't, the fake instance does not need to have real data, thus no connection to a DB or any other external dependency.
I suggest you switch to their way of testing; in the end you will like it.
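A small sketch of what that looks like, assuming Flask is installed. The /user route and its JSON payload are made up for illustration; app.test_client() is Flask's built-in test client, which sends requests to the application in-process, so no server has to be running.

```python
from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/user")
def user():
    # Hypothetical route used only for this example.
    return jsonify(login="example")


def test_user_route():
    # The test client calls the app directly; no live server or network needed.
    client = app.test_client()
    response = client.get("/user")
    assert response.status_code == 200
    assert response.get_json() == {"login": "example"}
```

A test runner such as pytest would pick up test_user_route automatically, which is what makes this approach convenient in a CI environment.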
Is it possible to use the python reload command (or similar) on a single module in a standalone cherrypy web app? I have a CherryPy based web application that often is under continual usage. From time to time I'll make an "important" change that only affects one module. I would like to be able to reload just that module immediately, without affecting the rest of the web application. A full restart is, admittedly, fast, however there are still several seconds of downtime that I would prefer to avoid if possible.
Reloading modules is very, very hard to do in a sane way. It leads to the potential of stale objects in your code with impossible-to-interrogate state and subtle bugs. It's not something you want to do.
What real web applications tend to do is to have a server that stays alive in front of their application, such as Apache with mod_proxy, to serve as a reverse proxy. You start your new app server, change your reverse proxy's routing, and only then kill the old app server.
No downtime. No insane, undebuggable code.
I am a PHP programmer learning Python whenever I get a chance.
I read that Python web applications stay active between requests.
Meaning that data stays in memory and is available between requests, right?
I am wondering how that works.
In php we place a cookie with a unique token, and save data in sessions.
Sessions are arrays, saved on disk or database.
Between requests, the session functions restore the correct session array based on the cookie with the unique token. That means each browser gets its own unique session, and the session has a preset expiration time. If the user is inactive and the expiration gets triggered, the session gets purged. A new session has to be created when the user comes back.
My understanding is Python doesn't need this, because the application stays active between requests.
Doesn't each request get a unique thread in Python?
How does it distinguish between requests, who the requester is?
Is there a handling method to separate vars between users and application?
Let's say I have a dict saved. Is this dict globally available between all requests from any browser, or only to that one browser?
When and how does the memory get cleared. If everything stays in the memory. What if the app is running for a couple years without a restart. There must be some kind of expiration setting or memory handling?
One commenter says it depends on the web app. So I am using Bottle.py to learn.
I would assume the answer would depend on which web application framework you are using within python. Some of them have session management pieces in them that track a user across requests. But if you just had a basic port listener that responded with http, you would have to build any cookie support or session management yourself.
The other big difference is that in PHP, you have a module installed on the server that the actual HTTP server delegates to in order to generate a response. PHP doesn't handle the routing or the actual serving of the responses, whereas Python can be both the server and the resource that generates the response. It depends on how Python is installed/accessed on the machine where the server is running. So in that sense you can do whatever you want within a Python web application.
If you are interested, you should look at some available python web frameworks.
Edit: I see you mentioned bottle.py, and out of the box, it does not provide authentication and session management because it's a micro framework for fast prototyping and not necessarily suitable for a large scale application (although not impossible, just a lot of work).
Yes and no. If you check out this question you get an idea how it could work for a Django application.
However, the way you state it, it will not work. A dict defined during one request is not available in later requests unless you store it somewhere that outlives the request. So yes, you have the option to do this, but it's not the case out of the box!
I was able to persist an object in Python between requests before using Twisted's web server. I have not tried seeing for myself if it persists across browsers though but I have a feeling it does. Here's a code snippet from the documentation:
Twisted includes an event-driven web server. Here's a sample web application; notice how the resource object persists in memory, rather than being recreated on each request:
from twisted.web import server, resource
from twisted.internet import reactor

class HelloResource(resource.Resource):
    isLeaf = True
    numberRequests = 0

    def render_GET(self, request):
        # The same HelloResource instance serves every request, so the counter persists.
        self.numberRequests += 1
        request.setHeader(b"content-type", b"text/plain")
        # On Python 3, Twisted expects the response body as bytes.
        return ("I am request #" + str(self.numberRequests) + "\n").encode("ascii")

reactor.listenTCP(8080, server.Site(HelloResource()))
reactor.run()
First of all you should understand the difference between local and global variables in python, and also how thread local storage works.
This is a (very) short explanation:
global variables are those declared at module scope and are shared by all threads. They live as long as the process is running, unless explicitly removed
local variables are those declared inside a function and instantiated for each call of that function. They are deleted when the function is over unless it is still referenced somewhere else.
thread local storage enables defining global variables that are specific to the current thread. They live as long as the current thread is running, unless explicitly removed.
And now I'll try to answer your original questions (the answers are specific to bottle.py, but this is the most common approach in Python web servers)
Doesn't each request get a unique thread in Python?
Each concurrent request will have a separate thread; future requests might reuse previous threads.
How does it distinguish between requests, who the requester is?
bottle.py uses thread local storage to access the current request
Is there a handling method to separate vars between users and application?
Sounds like you are looking for a session. If so, there is no standard way of doing it, because different implementations have different advantages and disadvantages. For example, this is a bottle.py middleware for sessions.
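To make the idea concrete, here is a bare-bones sketch of what a session layer does, independent of any framework. SessionStore and its methods are hypothetical names for illustration: the token would travel in a cookie, and each token maps to its own dict, much like a PHP session array.

```python
import secrets
import threading


class SessionStore:
    """Hypothetical in-memory session store: one dict per browser token."""

    def __init__(self):
        self._lock = threading.Lock()
        self._sessions = {}

    def create(self):
        # A random token identifies the browser; it would be sent as a cookie.
        token = secrets.token_hex(16)
        with self._lock:
            self._sessions[token] = {}
        return token

    def load(self, token):
        # Returns the session dict for this token, or None if it is unknown.
        with self._lock:
            return self._sessions.get(token)
```

The store itself is shared by every request, but each browser only reaches the dict that belongs to its own token. This sketch keeps sessions forever and loses them on restart, which a real middleware would have to address.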
Lets say I have a dict saved, is this dict globally available between
all requests from any browser, or only to that one browser. When and
how does the memory get cleared.
If everything stays in the memory. What if the app is running for a
couple years without a restart. There must be some kind of expiration
setting or memory handling?
Exactly, there must be an expiration setting. Since you are using a custom dict you need a timer that checks each entry in the dict for expiration.
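One way such a check could look, as a sketch rather than a finished implementation. ExpiringDict, its ttl_seconds parameter and its purge method are names made up for this example; purge is the part a periodic timer (for instance threading.Timer) would call.

```python
import time


class ExpiringDict:
    """Hypothetical dict wrapper whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._items = {}  # key -> (expires_at, value)

    def set(self, key, value):
        self._items[key] = (self.clock() + self.ttl, value)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self.clock() >= expires_at:
            # Expired entries are dropped as soon as they are noticed.
            del self._items[key]
            return default
        return value

    def purge(self):
        """Check every entry, remove the expired ones, return how many were removed."""
        now = self.clock()
        expired = [key for key, (expires_at, _) in self._items.items() if now >= expires_at]
        for key in expired:
            del self._items[key]
        return len(expired)
```

The clock parameter exists only so the behaviour can be tested without waiting. In a multi-threaded server the dict would also need a lock around these methods.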
I've got a Python application which is daemonized and running on a server 24/7. I'd like to be able to give an incredibly simple web interface so that I can monitor the changing values of a few variables within the program.
I'm using Tornado, and I'm up and running with the simple 'Hello, world' that you can find on the Tornado homepage. However, as soon as tornado.ioloop.IOLoop.instance().start() is called, it enters into the loop and doesn't return. My existing program is (essentially) an infinite loop as well, but I want to integrate the two.
So, my question is: how can I construct my program so that I can monitor variables inside my infinite loop by using Tornado to provide a web interface?
Is it possible to use the threading package and run Tornado inside of its own thread?
Edit:
The threading module documentation at http://docs.python.org/library/threading.html has more details, but I am imagining something like this:
import threading
import tornado.ioloop

# Run Tornado's IOLoop in a background thread so the main loop keeps running.
t = threading.Thread(target=tornado.ioloop.IOLoop.instance().start)
t.start()
Let me know if that works!
I believe that the best (read: easiest) approach would be to have your daemon app write the particular variables you want to monitor out to a shared space that your tornado app can access. This could be a file, a socket, a database, or a key-value store. Some ideas that come to mind are to use your existing database (if there is one), sqlite, or even memcached. Then you would simply have your tornado application read those values from wherever you stored them.
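A sketch of the sqlite variant using the standard library's sqlite3 module. The monitor table and the open_store, publish and read_all helpers are assumptions made for this example; the daemon would call publish inside its loop, and a Tornado handler would call read_all.

```python
import sqlite3


def open_store(path):
    """Open (and if needed create) the shared database file."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS monitor (name TEXT PRIMARY KEY, value TEXT)"
    )
    conn.commit()
    return conn


def publish(conn, name, value):
    """Called by the daemon: store the latest value of a monitored variable."""
    conn.execute(
        "INSERT OR REPLACE INTO monitor (name, value) VALUES (?, ?)",
        (name, str(value)),
    )
    conn.commit()


def read_all(conn):
    """Called by the web app: return all monitored variables as a dict."""
    return dict(conn.execute("SELECT name, value FROM monitor"))
```

Values are stored as text here to keep the sketch short, so the web side sees strings. Each process opens its own connection to the same file.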
You are correct in that once you run tornado.ioloop.IOLoop.instance().start() tornado's control flow never returns from that loop. From that point forward, your application's control will stay within the Application and RequestHandlers that you defined.
Another less elegant solution would be to utilize yaml to serialize the objects periodically from your main app, and have the web app read those in. You can even dump objects into yaml, so you could see the different states of those.
You could try using http://www.zeromq.org/ as a means of communication between the two processes / threads.