Run Python Pyramid server in production w/o Apache - python

So we used to run our Pyramid server with Apache in production. But we are moving to Docker containerization for easier prod deployments etc, and we want to adhere to the philosophy of "one process per container"..so instead of running Apache in the container + 4 python procs, we just want 1 python proc.
So my question is - is there a way to run a Pyramid server in production directly? Without using WSGI+Apache?
https://www.digitalocean.com/community/tutorials/how-to-use-the-pyramid-framework-to-build-your-python-web-app-on-ubuntu
My understanding is that pserve is for development only?
Create an application.py file and fill it with the following contents:
from wsgiref.simple_server import make_server
from pyramid.config import Configurator
from pyramid.response import Response
def hello_world(request):
return Response('<h1>Hello world!</h1>')
if __name__ == '__main__':
config = Configurator()
config.add_view(hello_world)
app = config.make_wsgi_app()
server = make_server('0.0.0.0', 8080, app)
server.serve_forever()
Will the above work as a production-grade server?

The latest official recommendation is one concern per container. From the Docker docs (emphasis my own):
Each container should have only one concern. Decoupling applications
into multiple containers makes it easier to scale horizontally and
reuse containers. For instance, a web application stack might consist
of three separate containers, each with its own unique image, to
manage the web application, database, and an in-memory cache in a
decoupled manner.
Limiting each container to one process is a good rule of thumb, but it
is not a hard and fast rule. For example, not only can containers be
spawned with an init process, some programs might spawn additional
processes of their own accord. For instance, Celery can spawn multiple
worker processes, and Apache can create one process per request.
In your case, your web application server is a single concern. Running Apache+WSGI is totally fine. Don't worry about the processes—That's Apache's job.
For a better understanding of the "one concern" rule, this post is a good overview of what problems its trying to solve.

You can use Waitress, which, according to their documentation, is
... meant to be a production-quality pure-Python WSGI server with very
acceptable performance. It has no dependencies except ones which live
in the Python standard library.
Waitress is a part of the Pylons Project just like Pyramid is.

It looks like Bjoern is a solid choice when it comes to running Python directly, where the Python server has WSGI bindings:
https://www.appdynamics.com/blog/engineering/a-performance-analysis-of-python-wsgi-servers-part-2/
https://github.com/jonashaag/bjoern

Related

Enable multithreading of my web app using python Bottle framework

I have a web app written with Bottle framework. It have a global somedict list accessed by multiple HTTP query.
After some researching, I find that the Bottle framework only support 1 thread in 1 process mode to run my app(I don't believe it is true, perhaps migrating it to other frameworks like Flask is a good idea.).
1 To enable multi-threading, I find WSGI solution but it does not support multiple processs(1 threads for each process) accessing global variable like somedict in my app, because process will re-init the list every time a query gets handled. How can I handle this issue?
2 Is there any other solutions except WSGI that solve the problem to enable this app to serve multiple HTTP query at once?
from bottle import request, route
import threading
somedict = {}
somedict_lock = threading.Lock()
#route("/read")
def read():
with somedict_lock:
return somedict
#route("/write", method="POST")
def write():
with somedict_lock:
somedict[request.forms.get("key1")] = request.forms.get("value1")
somedict[request.forms.get("key2")] = request.forms.get("value2")
It's best to serve a WSGI app via a server like gunicorn or waitress, which will handle your concurrency needs, but almost no matter what you do for concurrency your global queue in memory will not work the way you want it to. You need to use an external memory store like memcached, redis, etc. Static data is one thing, but mutable state should never be shared between web app processes. That's contrary to Python web server idioms and the typical execution model of Python web apps.
I'm not saying it's literally impossible to do in Python, but it's not the way Python solves this problem.
You can process incoming requests asynchronously, currently Celery seems very suitable for running asynchronous tasks. Read how Celery can do this.

Synchronous reading of a multipart upload alongside Django

I have a Django application which needs to have access to reading multipart file uploads as file-like objects as they're uploaded, which means that I need more or less synchronous access to the request object and a way to unpack it in chunks to binary data. Django unfortunately handles uploads by moving them directly into memory or to temporary files, which won't work for my use case.
Some one recommended that I use gevent/greenlet to handle the upload, but I'm not sure how that plays into the equation and what setup is required alongside Django to make it work. Plus, running something outside of Django would mean that I would have to implement a database connection layer to validate that the upload is allowed (using a ticket id).
With this said, how can I set this up? Django should be running in a WSGI application, and someone had also recommended writing a second WSGI application to capture a single URL path for uploads. I'd like to essentially take as much advantage of the Django framework as I can, while being able to read uploads synchronously?
(I just became familiar with the requests Python library and have to say I'm a pretty big fan, though I wouldn't know the first thing about using it in a server context.)
I believe a lot of these suggestions are overcomplicating things.
You need to change the way Django handles uploaded files? Simply modify the upload handler.
The base class is relatively straightforward, and gives you lots of great hooks. You should be able to extend it to do what you want.
While I can't write all the code for you (it's complex), here is my recommended setup.
Use Tornado + Django: Tornado can embed WSGI processes, so this gives you the ability to have a single process host both Django, and this one off Tornado handler. Here's a quick sample from one of my active projects (though this using Tornado as a Socket.io handler, it should give you the gist of the solution (:
# Socket Server
db = momoko.Pool(DB_DSN, **DB_CONFIG)
router = tornadio2.TornadioRouter(QueryRouter, user_settings={'db':db})
sock_app = tornado.web.Application(router.urls, flash_policy_port = FL_PORT, flash_policy_file = path.join(PROJECT_ROOT, 'assets/xml/flashpolicy.xml'), socket_io_port = WS_PORT, debug=True)
# Django Server
os.environ['DJANGO_SETTINGS_MODULE'] = 'sever.settings'
application = django.core.handlers.wsgi.WSGIHandler()
container = tornado.wsgi.WSGIContainer(application)
# Start the web servers
if __name__ == "__main__":
try:
import logging
tornado.options.parse_command_line()
logging.getLogger().setLevel(logging.INFO)
logging.info('Server started')
tornado.locale.set_default_locale('us_US')
http_server = tornado.httpserver.HTTPServer(container)
http_server.listen(8000)
tornadio2.SocketServer(sock_app, auto_start=True)
tornado.ioloop.IOLoop.instance().start()
except KeyboardInterrupt:
tornado.ioloop.IOLoop.instance().stop()
logging.info("Stopping servers.")
This could easily be converted to two server instances running on two different ports, with 80 being reserved for Django and 8080 being used for your upload handler.
I'm recommending Tornado because it supports streaming request body, and is very well suited to this type of use. Here's a gist that might help you.
Your proxying setup will matter. If you're using NGINX, make sure to turn proxy_buffering off.
I wouldn't use a database for the ticket/upload check. Redis or memcache would probably be a much faster way to handle this. A cache would also be great way to bass upload progress back and forth between Django and Tornado, since the overhead for setting/getting a new value would be so small.
This is a big hairy problem that will serious engineering to come up with something elegant, but it's more than doable.

uWSGI, cherrypy and threading

preface: I would like to separate these problems into smaller questions, but apparently, I am missing some pieces of the puzzle and it seems impossible to me.
I developed my cherrypy application using cherrypy's built in WSGI server. I naively assumed that when the time comes, I will be able to use created WSGI Application class and deploy it using any WSGI compliant server.
I used this blog post to create my own (but very similar) cherrypy Plugin and Tool to connect to database using SQLAlchemy during http requests.
I expected that any server will somehow work like cherrypy's built in server:
main process will spawn X threads to satisfy X concurrent requests
my engine Plugin will create SQLalchemy engine with connection pool = X (so any request will have its connection)
on request arrival, my Tool will supply sql alchemy connection from pool
This flow does not match with uWSGI (as long as I understand it).
I assign my application.py in uWSGI configuration. This file looks something like this:
cherrypy.tools.db = DbConnectorTool()
cherrypy.engine.dbengine = DbEnginePlugin(cherrypy.engine, settings.database)
cherrypy.config.update({
'engine.dbengine.on': True
})
from myapp.application import Application
root = Application(settings)
application = cherrypy.Application(root, script_name='', config=settings)
I was using this application.py to mount my application into cherrypy's built in server when I was developing and testing it.
The problems are that uWSGI does not create any threads itself and my SQLAlchemy plugin is not working with it, because no cherrypy.engine is created.
Does uWSGI support threading in the meaning of using threads to serve multiple concurrent requests? Can I start these threads in my application.py? Will uWSGI understand it and use these threads for concurrent requests? And how can this be done? I think cherrypy can be used somehow, or not?
And what about my SQLAlchemy Plugin, how can I start cherrypy.engine when using only WSGI cherrypy.Application?
Any help or information that could help me will be appreciated.
Edit:
My uWSGI configuration:
<uwsgi>
<socket>127.0.0.1:9001</socket>
<master/>
<daemonize>/var/log/uwsgi/app.log</daemonize>
<logdate/>
<threads/>
<pidfile>/home/web/uwsgi.pid</pidfile>
<uid>uwsgi</uid>
<gid>uwsgi</gid>
<workers>2</workers>
<harakiri>90</harakiri>
<harakiri-verbose/>
<home>/home/web/</home>
<pythonpath>/home/web/instance</pythonpath>
<module>core.application</module>
<no-orphans/>
<touch-reload>/home/web/uwsgi-reload-web</touch-reload>
</uwsgi>
uWSGI uses worker processes, not threads. It's worth noting that it means that the globals are not shared between all requests any more. You can use SharedArea for global data.
The processes are forked by default, so make sure you're ok with that or adjust settings (see Things to know).
Get Cherrypy's WSGI application with cherrypy.tree.mount(root, config=settings) call.
If your DB plugin does not have threading / shared data issues, chances are it will work. Like you say, you may need cherrypy.engine.start(), but definitely not cherrypy.engine.block(), since your main thread is now uWSGI worker.
You should post your uWSGI config, otherwise it will be hard to understand what is going on.
By the way to spawn additional threads (per worker) you simply need to add --threads N

Python: moving from dev server to production server

I am developing an application with the bottlepy framework. I am using the standard library WSGIRefServer() to run a development server. It is a single threaded server.
Now when going into production, I will want to move to a multi-threaded production server, and there are many choices. Let's say I choose CherryPy.
Now, in my code, I am initializing a single wsgi application. Other than that, I am also initializing other things...
Memcached connection
Mako templates
MongoDB connection
Since standard library wsgiref is a single threaded server, and I am creating only a single wsgi app (wsgi callable), everything works just fine.
What I want to know is that when I move to the multi-threaded server, how will my wsgi app, initialization code, connections to different server, etc. behave.
Will a multi-threaded server create a separate instance of wsgi app for every thread. And will a new thread be spawned for each new request (which then means a new wsgi app for each request)?
Will my connections to memcached, mongoDB, etc, be shared across threads or not. What else will be shared between threads
Please explain the request-response cycle for a threaded server
In general your application is using wsgi compliant framework and you shouldn't be afraid of multi-threaded / single-threaded server side. It's meant to work transparent and has to react same way despite of what kind of server is it, as long as it is wsgi compliant.
Every code block before bottle.run() will be run only once. As so, every connection (database, memcached) will be instantiated only once and shared.
When you call bottle.run() bottlepy starts wsgi server for you. Every request to that server fires some wsgi callable inside bottlepy framework. You are not really interested if it is single or multi -threaded environment, as long as you don't do something strange.
For strange i mean for instance synchronizing something through global variables. (Exception here is global request object for which bottlepy ensures that it contains proper request in proper context).
And in response to first question on the list: request may be computed in newly spawned thread or thread from the pool of threads (CherryPy is thread-pooled)

Replace AppEngine Devserver With Spawning (BaseHTTPRequestHandler as WSGI)

I'm looking to replace AppEngine's devserver with spawning. Spawning handles standard wsgi handlers, just like appengine, so running your app on it is easy.
But the devserver takes into account your app.yaml file that has url redirects etc. I've been going through the devserver code and it is pretty easy to get the BaseHTTPRequestHandler like this:
from google.appengine.tools.dev_appserver import CreateRequestHandler
dev = CreateRequestHandler(os.path.dirname(__file__), '', require_indexes=False, static_caching=True)
But the BaseHTTPRequestHandler is not a WSGI app, so my guess is I need to put something around it to make it work. Any hints?
I don't think you're going to be able to pull out a part of the dev_appserver and use it in a custom WSGI server quite so easily. The dev_appserver does a lot of 'magic', and it isn't really structured to be pulled out and used as a WSGI wrapper in another server (more's the pity).
You may want to check out TwistedAE, which is working on creating an alternate serving environment; if you really want to use spawning, you can probably use TwistedAE's work as a basis.
That said, if you do want to do it yourself, there's a couple of options:
You can write your own shim to interface WSGI with the class returned by CreateRequestHandler. In that case, you need to replicate the interface in BaseHTTPServer.BaseHTTPRequestHandler from the Python SDK. Converting WSGI to that, just so the dev_appserver code can convert it back seems a bit perverse, though.
You can rip out the code from the _HandleRequest method of DevAppServerRequestHandler, modify it to work with WSGI, and create a WSGI app from that (probably your best bet if you want to DIY).
You can start from scratch, which I believe is the approach taken by TwistedAE.
One thing to bear in mind whatever you do: App Engine explicitly expects a single-threaded environment for its apps. Don't use a multithreaded approach if you want apps to work the same locally as they do in production or on the dev_appserver!

Categories

Resources