I'm really confused about multithreading in mod_wsgi, even after reading this document.
The main problem is: how does mod_wsgi call my Python scripts?
To make my question clear, I'll condense it into the questions below.
Suppose I have a server configured like this:
WSGIDaemonProcess XXX.com processes=12 threads=20 display-name=%{GROUP}
WSGIProcessGroup XXX.com
WSGIScriptAlias /sign /XXX/Http/upload.wsgi
And I'm using the prefork MPM.
My /XXX/Http/upload.wsgi looks like this:
import shutil
import tempfile

class app(object):
    def __init__(self):
        pass

    def __call__(self, environ, start_response):
        temp_unsign = tempfile.mkdtemp()
        temp_signed = tempfile.mkdtemp()
        try:
            response_headers = [('Content-type', 'text/plain;charset=UTF-8'),
                                ('Content-Length', str(len('123')))]
            status = '200 OK'
            start_response(status, response_headers)
            return ['123']
        finally:
            shutil.rmtree(temp_unsign)
            shutil.rmtree(temp_signed)

application = app()
Then my questions are:
If there are 5 requests from different IPs at the same time, how many app instances will there be? One instance, or one instance per request?
What if there are 5 requests from the same IP and the same computer at the same time?
Do all 5 __call__ invocations run in 5 different threads?
I ask because I always get a "No such directory" error when shutil.rmtree executes.
And of course, I can't understand how mod_wsgi actually works with respect to multithreading.
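One way to observe the process/thread behaviour directly is a stripped-down WSGI app (plain WSGI, nothing mod_wsgi-specific; the names here are purely illustrative) that reports which process and thread served each request:

```python
import os
import threading

def application(environ, start_response):
    # Report which OS process and which thread handled this request
    body = "pid=%d thread=%s" % (os.getpid(), threading.current_thread().name)
    body = body.encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

With processes=12 threads=20, repeated requests against such an app should show up to 12 distinct pids, each serving requests on up to 20 threads.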
I'm trying to run Python on the web to do some CSS/JS extraction from websites. I'm using mod_wsgi as my interface for Python. I've been following this website to get an idea of how to get started.
Below is their sample code.
#!/usr/bin/env python
# Our tutorial's WSGI server
from wsgiref.simple_server import make_server

def application(environ, start_response):
    # Sorting and stringifying the environment key, value pairs
    response_body = ['%s: %s' % (key, value)
                     for key, value in sorted(environ.items())]
    response_body = '\n'.join(response_body)
    status = '200 OK'
    response_headers = [('Content-Type', 'text/plain'),
                        ('Content-Length', str(len(response_body)))]
    start_response(status, response_headers)
    return [response_body]

# Instantiate the WSGI server.
# It will receive the request, pass it to the application
# and send the application's response to the client
httpd = make_server(
    'localhost',  # The host name.
    8051,         # A port number where to wait for the request.
    application   # Our application object name, in this case a function.
)

# Wait for a single request, serve it and quit.
httpd.handle_request()
While this runs fine with Python 2.7, I can't get it to run on Python 3. For my CSS/JS extraction, I have modified the above code and added my own functionality, which uses BeautifulSoup and urllib3. I need Python 3 for those modules, but the WSGI code only works for me under Python 2.7, so I can't merge the two. When I run BeautifulSoup and urllib under Python 3 I get an error, and when I try to run the WSGI code with Python 3 the web page simply fails to load.
Any help would be greatly appreciated! Any workarounds, or suggestions as well.
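For what it's worth, the usual reason this particular sample breaks on Python 3 is that wsgiref there requires the response body to be bytes rather than str. A sketch of the application function adapted for Python 3 (the make_server/handle_request lines can stay as they are) might look like:

```python
def application(environ, start_response):
    response_body = '\n'.join(
        '%s: %s' % (key, value) for key, value in sorted(environ.items())
    )
    # On Python 3, wsgiref needs the body (and thus Content-Length)
    # computed from bytes, not str
    response_body = response_body.encode('utf-8')
    status = '200 OK'
    response_headers = [('Content-Type', 'text/plain'),
                        ('Content-Length', str(len(response_body)))]
    start_response(status, response_headers)
    return [response_body]
```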
I have a question regarding the performance of my Flask app after I incorporated uWSGI and nginx.
My app.view file looks like this:
import app.lib.test_case as test_case
from app import app
import time

@app.route('/<int:test_number>')
def test_case_match(test_number):
    rubbish = test_case.test(test_number)
    return "rubbish World!"
My app.lib.test_case file looks like this:
import time

def test(test_number):
    time.sleep(30)
    return None
And my config.ini for my uwsgi looks like this:
[uwsgi]
socket = 127.0.0.1:8080
chdir = /home/ubuntu/test
module = app:app
master = true
processes = 2
daemonize = /tmp/uwsgi_daemonize.log
pidfile = /tmp/process_pid.pid
Now if I run this test case purely through the Flask development server, without uWSGI + nginx, the ab benchmark reports a response in 31 seconds, which is expected owing to the sleep call. What I don't get is that when I run the app through uWSGI + nginx, the response time is 38 seconds, an overhead of around 25%. Can anyone enlighten me?
time.sleep() is not guaranteed to sleep for exactly the requested time.
From the documentation of time.sleep(secs):
[…] Also, the suspension time may be longer than requested by an arbitrary amount because of the scheduling of other activity in the system.
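A quick way to see this scheduling jitter for yourself is to time the call directly (the function name and parameters below are just for illustration; numbers will vary with machine and load):

```python
import time

def measure_sleep_overhead(requested=0.05, rounds=20):
    """Return the worst-case overshoot of time.sleep() over several rounds."""
    worst = 0.0
    for _ in range(rounds):
        start = time.perf_counter()
        time.sleep(requested)
        elapsed = time.perf_counter() - start
        # elapsed is at least `requested`; the difference is scheduling overhead
        worst = max(worst, elapsed - requested)
    return worst
```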
I have used Python's built-in WSGI server to create a web server that can be called locally. Now I've found out I need to change it to run through the Apache server (via mod_wsgi). I'm hoping to do this without rewriting my whole script.
import cgi
from wsgiref.simple_server import make_server

class FileUploadApp(object):
    firstcult = ""

    def __init__(self, root):
        self.root = root

    def __call__(self, environ, start_response):
        if environ['REQUEST_METHOD'] == 'POST':
            post = cgi.FieldStorage(
                fp=environ['wsgi.input'],
                environ=environ,
                keep_blank_values=True
            )
        body = u"""
        <html><body>
        <head><title>title</title></head>
        <h3>text</h3>
        <form enctype="multipart/form-data" action="http://localhost:8088" method="post">
        </body></html>
        """
        return self.__bodyreturn(environ, start_response, body)

    def __bodyreturn(self, environ, start_response, body):
        body = body.encode('utf8')  # encode first so Content-Length matches
        start_response(
            '200 OK',
            [
                ('Content-type', 'text/html; charset=utf8'),
                ('Content-Length', str(len(body))),
            ]
        )
        return [body]

def main():
    PORT = 8080
    print "port:", PORT
    ROOT = "/home/user/"
    httpd = make_server('', PORT, FileUploadApp(ROOT))
    print "Serving HTTP on port %s..." % (PORT)
    httpd.serve_forever()  # Respond to requests until process is killed

if __name__ == "__main__":
    main()
I am hoping to find a way to avoid running my own server process, and to make it possible to run multiple instances of my script.
The documentation at:
http://code.google.com/p/modwsgi/wiki/ConfigurationGuidelines
explains what mod_wsgi is expecting to be given.
If you also read:
http://blog.dscpl.com.au/2011/01/implementing-wsgi-application-objects.html
you will learn about the various ways that WSGI application entry points can be constructed.
From that you should see that FileUploadApp fits one of the described ways of defining a WSGI application, and thus you only need to satisfy mod_wsgi's requirement that the WSGI application object be accessible under the name 'application'.
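As a sketch of what that means in practice, the whole main()/make_server machinery can be dropped and the .wsgi file reduced to the class plus one module-level assignment. The handler body below is a trimmed stand-in for the FileUploadApp from the question, and the paths are hypothetical:

```python
# fileupload.wsgi -- mod_wsgi only needs a module-level object
# named 'application'; no server setup code is required.

class FileUploadApp(object):
    # trimmed stand-in for the FileUploadApp class from the question
    def __init__(self, root):
        self.root = root

    def __call__(self, environ, start_response):
        body = b"ok"
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]

# This module-level name is what mod_wsgi looks for:
application = FileUploadApp("/home/user/")
```

The matching Apache directive would then be something along the lines of WSGIScriptAlias /upload /path/to/fileupload.wsgi (path hypothetical).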
I've got the following minimal code for a CGI-handling HTTP server, derived from several examples on the inner-tubes:
#!/usr/bin/env python
import BaseHTTPServer
import CGIHTTPServer
import cgitb; cgitb.enable()  # Error reporting
server = BaseHTTPServer.HTTPServer
handler = CGIHTTPServer.CGIHTTPRequestHandler
server_address = ("", 8000)
handler.cgi_directories = [""]
httpd = server(server_address, handler)
httpd.serve_forever()
Yet, when I execute the script and try to run a test script in the same directory via CGI using http://localhost:8000/test.py, I see the text of the script rather than the results of the execution.
Permissions are all set correctly, and the test script itself is not the problem (as I can run it fine using python -m CGIHTTPServer, when the script resides in cgi-bin). I suspect the problem has something to do with the default CGI directories.
How can I get the script to execute?
My suspicions were correct. The examples from which this code is derived showed the wrong way to set the default directory to be the same directory in which the server script resides. To set the default directory in this way, use:
handler.cgi_directories = ["/"]
Caution: This opens up potentially huge security holes if you're not behind any kind of a firewall. This is only an instructive example. Use only with extreme care.
The solution doesn't seem to work (at least for me) if .cgi_directories requires multiple layers of subdirectories (['/db/cgi-bin'], for instance). Subclassing the server and overriding the is_cgi method seemed to work. Here's what I added/substituted in your script:
from CGIHTTPServer import _url_collapse_path

class MyCGIHTTPServer(CGIHTTPServer.CGIHTTPRequestHandler):
    def is_cgi(self):
        collapsed_path = _url_collapse_path(self.path)
        for path in self.cgi_directories:
            if path in collapsed_path:
                dir_sep_index = collapsed_path.rfind(path) + len(path)
                head, tail = collapsed_path[:dir_sep_index], collapsed_path[dir_sep_index + 1:]
                self.cgi_info = head, tail
                return True
        return False

server = BaseHTTPServer.HTTPServer
handler = MyCGIHTTPServer
Here is how you would make every .py file on the server a CGI file (you probably don't want that for a production/public server ;):
import BaseHTTPServer
import CGIHTTPServer
import cgitb; cgitb.enable()

server = BaseHTTPServer.HTTPServer

# Treat everything as a cgi file, i.e.
# `handler.cgi_directories = ["*"]` but that is not defined, so we need:
class Handler(CGIHTTPServer.CGIHTTPRequestHandler):
    def is_cgi(self):
        self.cgi_info = '', self.path[1:]
        return True

httpd = server(("", 9006), Handler)
httpd.serve_forever()
I have a Python script that I'd like to run from the browser. It seems mod_wsgi is the way to go, but that method feels too heavyweight and would require modifications to the script for the output. Ideally I'd like a PHP-style approach. The script doesn't take any input and will only be accessible on an internal network.
I'm running Apache on Linux with mod_wsgi already set up. What are my options here?
I would go the micro-framework route in case your requirements change, and you never know, it may end up being an app rather than just a basic dump... Perhaps the simplest (and most old-fashioned!?) way is using CGI:
Duplicate your script and include print 'Content-Type: text/plain\n' before any other output to sys.stdout
Put that script somewhere apache2 can access it (your cgi-bin, for instance)
Make sure the script is executable
Make sure .py is added to the Apache CGI handler
But I don't see any way this is going to be a fantastic advantage (in the long run, at least).
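A minimal sketch of the first step, written so it runs on both Python 2 and 3 (the filename and run_script body are placeholders for your real script):

```python
#!/usr/bin/env python
# Hypothetical cgi-bin/myscript.py: the only CGI-specific requirement
# is writing a header block, ended by a blank line, before any output.
import sys

def run_script():
    # stand-in for the real script's work
    return "hello from the script"

def main(out=sys.stdout):
    out.write('Content-Type: text/plain\r\n\r\n')
    out.write(run_script() + '\n')

if __name__ == '__main__':
    main()
```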
You could use any of python's micro frameworks to quickly run your script from a server. Most include their own lightweight servers.
From the CherryPy home page documentation:
import cherrypy

class HelloWorld(object):
    def index(self):
        # run your script here
        return "Hello World!"
    index.exposed = True

cherrypy.quickstart(HelloWorld())
Additionally, Python provides the tools necessary to do what you want in its standard library, using BaseHTTPServer.
A basic server using BaseHTTPServer:
import time
import BaseHTTPServer

HOST_NAME = 'example.net'  # !!!REMEMBER TO CHANGE THIS!!!
PORT_NUMBER = 80           # Maybe set this to 9000.

class MyHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_HEAD(s):
        s.send_response(200)
        s.send_header("Content-type", "text/html")
        s.end_headers()

    def do_GET(s):
        """Respond to a GET request."""
        s.send_response(200)
        s.send_header("Content-type", "text/html")
        s.end_headers()
        s.wfile.write("<html><head><title>Title goes here.</title></head>")
        s.wfile.write("<body><p>This is a test.</p>")
        # If someone went to "http://something.somewhere.net/foo/bar/",
        # then s.path equals "/foo/bar/".
        s.wfile.write("<p>You accessed path: %s</p>" % s.path)
        s.wfile.write("</body></html>")

if __name__ == '__main__':
    server_class = BaseHTTPServer.HTTPServer
    httpd = server_class((HOST_NAME, PORT_NUMBER), MyHandler)
    print time.asctime(), "Server Starts - %s:%s" % (HOST_NAME, PORT_NUMBER)
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        pass
    httpd.server_close()
    print time.asctime(), "Server Stops - %s:%s" % (HOST_NAME, PORT_NUMBER)
What's nice about the microframeworks is that they abstract away writing headers and the like (but should still provide an interface to do so, if required).