Python basehttpserver not serving requests properly

I'm trying to write a simple local proxy for JavaScript: since I need to load some content from JavaScript within a web page, I wrote this simple daemon in Python:
import string,cgi,time
from os import curdir, sep
import urllib
import urllib2
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

class MyHandler(BaseHTTPRequestHandler):
    def fetchurl(self, url, post, useragent, cookies):
        headers={"User-Agent":useragent, "Cookie":cookies}
        url=urllib.quote_plus(url, ":/?.&-=")
        if post:
            req = urllib2.Request(url,post,headers)
        else:
            req=urllib2.Request(url, None, headers)
        try:
            response=urllib2.urlopen(req)
        # HTTPError is a subclass of URLError, so it has to be caught first
        except urllib2.HTTPError, e:
            print "HTTPERROR: "+str(e)
            return False
        except urllib2.URLError, e:
            print "URLERROR: "+str(e)
            return False
        else:
            return response.read()

    def do_GET(self):
        if self.path != "/":
            [callback, url, post, useragent, cookies]=self.path[1:].split("%7C")
            print "callback = "+callback
            print "url = "+url
            print "post = "+post
            print "useragent = "+useragent
            print "cookies = "+cookies
            if useragent=="":
                useragent="pyjproxy v. 1.0"
            load=self.fetchurl(url, post, useragent, cookies)
            if load:
                # escape the fetched page only after a successful fetch (load is False on failure)
                pack=load.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n").replace("\r", "\\r").replace("\t", "\\t").replace(" </script>", "</scr\"+\"ipt>")
                response=callback+"(\""+pack+"\");"
                self.send_response(200)
                self.send_header('Content-type', 'text/javascript')
                self.end_headers()
                self.wfile.write(response)
                self.wfile.close()
                return
            else:
                self.send_error(404,'File Not Found: %s' % self.path)
                return
        else:
            embedscript="function pyjload(datadict){ if(!datadict[\"url\"] || !datadict[\"callback\"]){return false;} if(!datadict[\"post\"]) datadict[\"post\"]=\"\"; if(!datadict[\"useragent\"]) datadict[\"useragent\"]=\"\"; if(!datadict[\"cookies\"]) datadict[\"cookies\"]=\"\"; var oHead = document.getElementsByTagName('head').item(0); var oScript= document.createElement(\"script\"); oScript.type = \"text/javascript\"; oScript.src=\"http://localhost:1180/\"+datadict[\"callback\"]+\"%7C\"+datadict[\"url\"]+\"%7C\"+datadict[\"post\"]+\"%7C\"+datadict[\"useragent\"]+\"%7C\"+datadict[\"cookies\"]; oHead.appendChild( oScript);}"
            self.send_response(200)
            self.send_header("Content-type", "text/html")
            self.end_headers()
            self.wfile.write(embedscript)
            self.wfile.close()
            return

def main():
    try:
        server = HTTPServer(('127.0.0.1', 1180), MyHandler)
        print 'started httpserver...'
        server.serve_forever()
    except KeyboardInterrupt:
        print '^C received, shutting down server'
        server.socket.close()

if __name__ == '__main__':
    main()
And I use it within a web page like this one:
<!DOCTYPE HTML>
<html><head>
<script>
function miocallback(htmlsource)
{
alert(htmlsource);
}
</script>
<script type="text/javascript" src="http://localhost:1180"></script>
</head><body>
<a onclick="pyjload({'url':'http://www.google.it','callback':'miocallback'});"> Take the Red Pill</a>
</body></html>
Now, on Firefox and Chrome it seems to always work. On Opera and Internet Explorer, however, I noticed that sometimes it doesn't work, or it hangs for a long time... what's going on, I wonder? Did I do something wrong?
Thanks for any help!
Matteo

You have to understand that (modern) browsers try to optimize their browsing speed using different techniques, which is why you get different results on different browsers.
In your case, the technique that caused you trouble is concurrent connection setup: in order to make better use of your bandwidth, your browser opens several HTTP/1.1 connections at the same time. This allows it to retrieve multiple resources (e.g. images) simultaneously.
However, BaseHTTPServer is not threaded: it handles one connection at a time. As soon as your browser opens another connection, that connection has to wait because the server is still busy with the first one, which remains open. The second request does not reach your handler and may run into a timeout. This also means that only one user can access your service at a given time. Inconvenient? Aye, but help is here:
Threads! .. and Python makes this one rather easy:
Derive a new class from HTTPServer using a MixIn from SocketServer.
Example:
from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler
from SocketServer import ThreadingMixIn
import threading

class Handler(BaseHTTPRequestHandler):
    def do_HEAD(self):
        pass
    def do_GET(self):
        pass

class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    """ This class allows to handle requests in separated threads.
        No further content needed, don't touch this. """

if __name__ == '__main__':
    server = ThreadedHTTPServer(('localhost', 80), Handler)
    print 'Starting server on port 80...'
    server.serve_forever()
From now on, your server is threaded and ready to serve multiple connections (and therefore requests) at the same time, which will solve your problem.
Instead of the ThreadingMixIn, you can also use the ForkingMixIn in order to spawn another process instead of another thread.
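For readers on Python 3, the effect of the mix-in can be sketched as follows. This is a minimal illustration under stated assumptions, not the original poster's proxy: the modules are http.server and socketserver (the Python 3 names), and SlowHandler, DELAY and timed_fetches are hypothetical names made up for the demonstration. The handler sleeps to imitate a slow upstream fetch, and the helper fires two requests in parallel and measures how long they take.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from socketserver import ThreadingMixIn


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that is slow on purpose, like a proxy waiting on a remote site."""
    DELAY = 0.5  # seconds; arbitrary value chosen for the demonstration

    def do_GET(self):
        time.sleep(self.DELAY)
        body = b"done"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet


class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    """Handle each connection in its own thread."""
    daemon_threads = True


def timed_fetches(server_class, count=2):
    """Start server_class on a free local port, send `count` parallel GETs,
    and return (list of status codes, elapsed seconds)."""
    server = server_class(("127.0.0.1", 0), SlowHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    statuses = []

    def fetch():
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=10)
        conn.request("GET", "/")
        statuses.append(conn.getresponse().status)
        conn.close()

    workers = [threading.Thread(target=fetch) for _ in range(count)]
    start = time.time()
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    elapsed = time.time() - start
    server.shutdown()
    server.server_close()
    return statuses, elapsed
```

Calling timed_fetches(HTTPServer) should take roughly twice DELAY, because the second request waits for the first, while timed_fetches(ThreadedHTTPServer) should take roughly DELAY. On Python 3.7 and later, http.server.ThreadingHTTPServer offers the same combination ready-made.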
all the best,
creo

Note that Python's BaseHTTPServer is a very basic HTTP server and far from perfect, but that's not your main issue.
What happens if you put the two scripts at the end of the document, just before the </body> tag? Does it help?

Related

Force reload on SimpleHTTP Server in Python

I have a very simple HTTPServer implemented in Python. The code is the following:
import SimpleHTTPServer
import SocketServer as socketserver
import os
import threading

class MyHandler(SimpleHTTPServer.SimpleHTTPRequestHandler):
    path_to_image = 'RGBWebcam1.png'
    img = open(path_to_image, 'rb')
    statinfo = os.stat(path_to_image)
    img_size = statinfo.st_size
    print(img_size)

    def do_HEAD(self):
        self.send_response(200)
        self.send_header("Content-type", "image/png")
        # class attributes must be reached through self inside a method
        self.send_header("Content-length", self.img_size)
        self.end_headers()

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-type", "image/png")
        self.send_header("Content-length", self.img_size)
        self.end_headers()
        f = open(self.path_to_image, 'rb')
        self.wfile.write(f.read())
        f.close()

class MyServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    def __init__(self, server_adress, RequestHandlerClass):
        self.allow_reuse_address = True
        socketserver.TCPServer.__init__(self, server_adress, RequestHandlerClass, False)

if __name__ == "__main__":
    HOST, PORT = "192.168.2.10", 9999
    server = MyServer((HOST, PORT), MyHandler)
    server.server_bind()
    server.server_activate()
    server_thread = threading.Thread(target=server.serve_forever)
    server_thread.start()
    while(1):
        print "test"
If I connect to the given IP address the page loads and everything is fine. Now it would be nice if the page refreshed automatically every n seconds.
I am very new to Python and especially new to web development. I have found LiveReload, but I cannot get my head around how to combine these two libraries.
Thank you for your help

You'll need an open connection to the client if you want the server to tell it to refresh. With plain HTTP, the server sends its response (HTML) and the client processes it; there is no communication beyond that. Anything more would require AJAX polling or WebSockets, both of which allow ongoing communication.
Since the server can't push to the client, you should automate the refresh in the content you initially send. In this example we'll say we want the page to refresh every 30 seconds. This is possible in either HTML or JavaScript:
<meta http-equiv="refresh" content="30" />
or
setTimeout(function(){
    window.location.reload(1);
}, 30000);
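Because the server in the question returns the PNG directly, there is no HTML to carry the meta tag. One way to apply the advice is to serve a small HTML page that embeds the image and includes the refresh tag. The sketch below is written for Python 3 (http.server) and rests on several assumptions that are not in the question: the /image.png URL, the build_page helper and the RefreshingHandler name are all hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

IMAGE_PATH = "RGBWebcam1.png"  # file name taken from the question
REFRESH_SECONDS = 30           # refresh interval used in the answer

PAGE_TEMPLATE = """<!DOCTYPE html>
<html><head>
<meta http-equiv="refresh" content="{seconds}">
</head><body>
<img src="/image.png" alt="latest image">
</body></html>
"""


def build_page(seconds=REFRESH_SECONDS):
    """Return the wrapper page (as bytes) that reloads itself every `seconds`."""
    return PAGE_TEMPLATE.format(seconds=seconds).encode("utf-8")


class RefreshingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/image.png":
            # Read the file on every request so a changed image is picked up.
            try:
                with open(IMAGE_PATH, "rb") as image_file:
                    body = image_file.read()
            except OSError:
                self.send_error(404, "File Not Found: %s" % self.path)
                return
            content_type = "image/png"
        else:
            body = build_page()
            content_type = "text/html; charset=utf-8"
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        # Ask the browser not to reuse a cached copy after the reload.
        self.send_header("Cache-Control", "no-store")
        self.end_headers()
        self.wfile.write(body)
```

To try it, something like HTTPServer(("", 9999), RefreshingHandler).serve_forever() would start it on port 9999, the port used in the question.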

Python http server - can't read cookie

I've made a Python server and I'd like to create, send and receive cookies. I have a problem with receiving them: when I visit the server in Chrome I can see that the cookie was created. I've read that it should appear in os.environ, but it never does. Here's my code:
import os
import time
import Cookie
import BaseHTTPServer
from multiprocessing import Process
from SocketServer import ThreadingMixIn
from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler

class MyHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(s):
        #creating cookie
        c = Cookie.SimpleCookie()
        c['api'] = 'token'
        c['api']['expires'] = 3*60*60
        s.send_response(200)
        #sending cookie
        s.wfile.write(c)
        s.wfile.write('\r\n')
        s.send_header("Access-Control-Allow-Origin", "*")
        s.send_header("Access-Control-Expose-Headers", "Access-Control-Allow-Origin")
        s.send_header("Access-Control-Allow-Headers", "Origin, X-Requested-With, Content-Type, Accept")
        s.end_headers()
        #reading cookies
        if 'HTTP_COOKIE' in os.environ:
            cookie_string = os.environ.get('HTTP_COOKIE')
            c = Cookie.SimpleCookie()
            c.load(cookie_string)
            try:
                data=c['api'].value
                print "cookie data: "+data
            except:
                print "The cookie was not set or has expired"
        else:
            print 'The cookie was not set'

class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    ''

if __name__ == '__main__':
    httpd = ThreadedHTTPServer(('', 8666), MyHandler)
    print time.asctime(), "Server Starts - %s:%s" % (HOST_NAME, PORT_NUMBER)
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        pass
    httpd.server_close()
    print time.asctime(), "Server Stops - %s:%s" % (HOST_NAME, PORT_NUMBER)
After I visit my site the cookie is created, but HTTP_COOKIE never appears in os.environ.

For future readers:
here's how you parse cookies in python3:
from http.server import BaseHTTPRequestHandler
from http.cookies import SimpleCookie

class MyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        cookies = SimpleCookie(self.headers.get('Cookie'))
        # then use somewhat like a dict, e.g:
        username = cookies['username'].value
To answer OP's question:
The problem is that you are looking for the cookie in the wrong place.
With the following lines, you check whether your operating system's environment variables include one named HTTP_COOKIE:
if 'HTTP_COOKIE' in os.environ:
    cookie_string = os.environ.get('HTTP_COOKIE')
But there is no reason why running a Python server would create an operating-system environment variable. (HTTP_COOKIE is a CGI convention: a web server sets it for a CGI script that it launches, which is not what happens here.)
Instead, you must look inside the BaseHTTPRequestHandler that you are deriving from.
The correct way to access the cookies is the following:
cookie_string = s.headers.get('Cookie')
which reads the Cookie header sent by the client and gives you the corresponding cookie string.
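Combining sending and receiving for Python 3, a minimal sketch might look like this. It sets the cookie with send_header('Set-Cookie', ...) instead of writing to wfile directly, and reads it back from self.headers. The CookieHandler name, the use of Max-Age instead of expires, and the plain-text response body are assumptions for illustration; the cookie name 'api' and value 'token' come from the question.

```python
from http.cookies import SimpleCookie
from http.server import BaseHTTPRequestHandler, HTTPServer


class CookieHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Reading: the browser returns cookies in the "Cookie" request header.
        received = SimpleCookie(self.headers.get("Cookie", ""))
        token = received["api"].value if "api" in received else None

        # Creating: build the cookie that will go out with this response.
        outgoing = SimpleCookie()
        outgoing["api"] = "token"
        outgoing["api"]["max-age"] = 3 * 60 * 60  # three hours, in seconds

        body = ("cookie data: %s" % token).encode("utf-8")
        self.send_response(200)
        # Sending: one Set-Cookie header per cookie, before end_headers().
        for morsel in outgoing.values():
            self.send_header("Set-Cookie", morsel.OutputString())
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet
```

On the first visit the body should report that no cookie was received; once the browser has stored the cookie, later requests should echo its value.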

How to return HTTP 303 from python?

This question comes from this one.
What I want is to be able to return the HTTP 303 header from my python script, when the user clicks on a button. My script is very simple and as far as output is concerned, it only prints the following two lines:
print "HTTP/1.1 303 See Other\n\n"
print "Location: http://192.168.1.109\n\n"
I have also tried many different variants of the above (with a different number of \r and \n at the end of the lines), but without success; so far I always get Internal Server Error.
Are the above two lines enough for sending a HTTP 303 response? Should there be something else?

Assuming you are using cgi (Python 2.7 or 3.5):
The example below should redirect to the same page. It doesn't attempt to parse headers or check what was sent in the POST; it simply redirects to the page '/' when a POST is detected.
# python 3 import below:
# from http.server import HTTPServer, BaseHTTPRequestHandler
# python 2 import below:
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
import cgi
#stuff ...
class WebServerHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            if self.path.endswith("/"):
                self.send_response(200)
                self.send_header('Content-type', 'text/html')
                self.end_headers()
                page ='''<html>
                <body>
                <form action="/" method="POST">
                <input type="submit" value="Reload" >
                </form>
                </body>
                </html>'''
                # under python 3, write bytes instead: page.encode('utf-8')
                self.wfile.write(page)
        except IOError:
            self.send_error(404, "File Not Found {}".format(self.path))

    def do_POST(self):
        self.send_response(303)
        self.send_header('Content-type', 'text/html')
        self.send_header('Location', '/') #This will navigate to the original page
        self.end_headers()

def main():
    try:
        port = 8080
        server = HTTPServer(('', port), WebServerHandler)
        print("Web server is running on port {}".format(port))
        server.serve_forever()
    except KeyboardInterrupt:
        print("^C entered, stopping web server...")
        server.socket.close()

if __name__ == '__main__':
    main()

Typically browsers expect \r\n\r\n at the end of the HTTP response headers.

Be very careful about what Python does automatically.
For example, in Python 3, the print function appends a line ending to each call, which can upset the exact number of line endings HTTP expects between the headers and the body.
You also still need a Content-Type header, for some reason.
This worked for me in Python 3 on Apache 2:
print('Status: 303 See Other')
print('Location: /foo')
print('Content-type:text/plain')
print()
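The pattern in that snippet can be wrapped in a small helper, sketched here for Python 3 under the assumption that the script runs as a CGI script behind a web server such as Apache. A CGI script reports its status through a Status: header rather than a raw "HTTP/1.1 ..." line, and a single blank line ends the header block. That is likely why the two print statements in the question fail: the "\n\n" after the first line ends the headers before Location is ever sent. The function name cgi_redirect is hypothetical.

```python
import sys


def cgi_redirect(location, status="303 See Other"):
    """Build the header block of a CGI redirect response.

    Each header sits on its own line, and exactly one blank line
    marks the end of the headers.
    """
    lines = [
        "Status: %s" % status,
        "Location: %s" % location,
        "Content-Type: text/plain",
    ]
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    # sys.stdout.write adds no line ending of its own, unlike print().
    sys.stdout.write(cgi_redirect("http://192.168.1.109"))
```

Building the whole block as one string keeps the number of line endings explicit instead of depending on what print adds.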

How to serve an mp3 file with built-in python http server

I am currently trying to serve MP3 files using Python. The problem is that I can only play the MP3 once. Afterwards the media controls stop responding and I need to reload the entire page to be able to listen to the MP3 again (tested in Chrome).
Problem: running the script below and entering http://127.0.0.1/test.mp3 in my browser returns an MP3 file which can be replayed only if I refresh the page.
Notes:
Saving the page as HTML and loading it directly with Chrome (without the Python server) makes the problem disappear.
Serving the file with Apache would solve the problem, but this is overkill: I want to make the script very easy to use and not require installing Apache.
Here is the code I use:
import string
import os
import urllib
import socket
# Setup web server
import string,cgi,time
from os import curdir, sep
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
import hashlib

class MyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            # serve mp3 files
            if self.path.endswith(".mp3"):
                print curdir + sep + self.path
                f = open(curdir + sep + self.path, 'rb')
                st = os.fstat( f.fileno() )
                length = st.st_size
                data = f.read()
                # close right away so the 304 branch below does not leak the handle
                f.close()
                md5 = hashlib.md5()
                md5.update(data)
                md5_key = self.headers.getheader('If-None-Match')
                if md5_key:
                    if md5_key[1:-1] == md5.hexdigest():
                        self.send_response(304)
                        self.send_header('ETag', '"{0}"'.format(md5.hexdigest()))
                        self.send_header('Keep-Alive', 'timeout=5, max=100')
                        self.end_headers()
                        return
                self.send_response(200)
                self.send_header('Content-type', 'audio/mpeg')
                self.send_header('Content-Length', length )
                self.send_header('ETag', '"{0}"'.format(md5.hexdigest()))
                self.send_header('Accept-Ranges', 'bytes')
                # date_time_string() formats the requested file's mtime as an HTTP date in GMT
                self.send_header('Last-Modified', self.date_time_string(st.st_mtime))
                self.end_headers()
                self.wfile.write(data)
                return
        except IOError:
            self.send_error(404,'File Not Found: %s' % self.path)

from SocketServer import ThreadingMixIn
class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    pass

if __name__ == "__main__":
    try:
        server = ThreadedHTTPServer(('', 80), MyHandler)
        print 'started httpserver...'
        server.serve_forever()
    except KeyboardInterrupt:
        print '^C received, shutting down server'
        server.socket.close()

HTTPServer is single-threaded; you should use either ForkingMixIn or ThreadingMixIn to support multiple connections.
For example, replace the line:
server = HTTPServer(('', 80), MyHandler)
with
from SocketServer import ThreadingMixIn
class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    pass
server = ThreadedHTTPServer(('', 80), MyHandler)

EDIT: I wrote much of this before I realized Mapadd only planned to use this in a lab. WSGI probably is not required for his use case.
If you are willing to run this as a WSGI app (which I would recommend over vanilla CGI for any real scalability), you can use the script I have included below.
I took the liberty of modifying your source; this works with the assumptions above. By the way, you should spend some time checking that your HTML is reasonably compliant, as this will help ensure better cross-browser compatibility. The original didn't have <head> or <body> tags; mine (below) is strictly prototype HTML and could be improved.
To run this, you just run the python executable in your shell and surf to the IP address of the machine on port 8080. If you were doing this for a production website, you should use lighttpd or Apache for serving files, but since this is simply for lab use, the embedded WSGI reference server should be fine. Substitute the WSGIServer line at the bottom of the file if you want to run in Apache or lighttpd.
Save as mp3.py
from webob import Request
import re
import os
import sys
####
#### Run with:
#### twistd -n web --port 8080 --wsgi mp3.mp3_app
_MP3DIV = """<div id="musicHere"></div>"""
_MP3EMBED = """<embed src="mp3/" loop="true" autoplay="false" width="145" height="60"></embed>"""
_HTML = '''<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head></head><body> Hello %s %s</body></html> ''' % (_MP3DIV, _MP3EMBED)

def mp3_html(environ, start_response):
    """This function will be mounted on "/" and refer the browser to the mp3 serving URL."""
    start_response('200 OK', [('Content-Type', 'text/html')])
    return [_HTML]

def mp3_serve(environ, start_response):
    """Serve the MP3, one chunk at a time with a generator"""
    file_path = "/file/path/to/test.mp3"
    mimetype = "application/x-mplayer2"
    size = os.path.getsize(file_path)
    headers = [
        ("Content-type", mimetype),
        ("Content-length", str(size)),
    ]
    start_response("200 OK", headers)
    return send_file(file_path, size)

def send_file(file_path, size):
    BLOCK_SIZE = 4096
    # binary mode, so the MP3 bytes are passed through unchanged
    fh = open(file_path, 'rb')
    while True:
        block = fh.read(BLOCK_SIZE)
        if not block:
            fh.close()
            break
        yield block

def _not_found(environ,start_response):
    """Called if no URL matches."""
    start_response('404 NOT FOUND', [('Content-Type', 'text/plain')])
    return ['Not Found']

def mp3_app(environ,start_response):
    """
    The main WSGI application. Dispatch the current request to
    the functions and store the regular expression
    captures in the WSGI environment as `mp3app.url_args` so that
    the functions from above can access the url placeholders.
    If nothing matches call the `_not_found` function.
    """
    # map urls to functions
    urls = [
        (r'^$', mp3_html),
        (r'mp3/?$', mp3_serve),
    ]
    path = environ.get('PATH_INFO', '').lstrip('/')
    for regex, callback in urls:
        match = re.search(regex, path)
        if match is not None:
            # assign http environment variables...
            environ['mp3app.url_args'] = match.groups()
            return callback(environ, start_response)
    return _not_found(environ, start_response)
Run from the bash shell with: twistd -n web --port 8080 --wsgi mp3.mp3_app from the directory where you saved mp3.py (or just put mp3.py somewhere in $PYTHONPATH).
Now surf to the external IP (i.e. http://some.ip.local:8080/) and it will serve the MP3 directly.
I tried running your original app as it was posted and could not get it to serve the MP3; it barked at me with an error on Linux...
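For anyone who wants to try the chunked-generator approach without installing Twisted or WebOb, roughly the same idea can be sketched with the standard library's wsgiref reference server on Python 3. This is an illustration under stated assumptions, not the answer's script: read_in_blocks and make_file_app are hypothetical names, and the audio/mpeg MIME type is the one used in the question.

```python
import os
from wsgiref.simple_server import make_server

BLOCK_SIZE = 4096


def read_in_blocks(file_path, block_size=BLOCK_SIZE):
    """Yield the file's bytes one block at a time, so the whole file
    never has to be held in memory."""
    with open(file_path, "rb") as fh:
        while True:
            block = fh.read(block_size)
            if not block:
                break
            yield block


def make_file_app(file_path, mimetype="audio/mpeg"):
    """Return a WSGI application that streams `file_path` for every request."""
    def app(environ, start_response):
        if not os.path.isfile(file_path):
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"Not Found"]
        headers = [
            ("Content-Type", mimetype),
            ("Content-Length", str(os.path.getsize(file_path))),
        ]
        start_response("200 OK", headers)
        return read_in_blocks(file_path)
    return app


# To serve a file on port 8080 (this call blocks until interrupted):
# make_server("", 8080, make_file_app("test.mp3")).serve_forever()
```

Returning a generator lets the server send each block as it is read, which is the point of the send_file function in the answer.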

Determine site domain in BaseHTTPServer

I am trying to implement a simple server in Python based on HTTPServer.
How can I extract the domain of the site being served in the current request?
I mean that it can serve several domains, such as site1.com and site2.com; how can I get the domain in this code:
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

class MyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        print "get"
        self.send_response(200)
        self.send_header("Content-type", "text/html")
        self.end_headers()
        #how can i get here host name of serving site?
        #site1.com or site2.com ?
        domain = ???
        self.wfile.write('<html>Welcome on www.%s.com</html>' % (domain))

if __name__ == "__main__":
    try:
        server = HTTPServer(("", 8070), MyHandler)
        print "started httpserver..."
        server.serve_forever()
    except KeyboardInterrupt:
        print "^C received, shutting down server"
        server.socket.close()

I guess you should be able to read the Host header.
The headers can be accessed through BaseHTTPRequestHandler.headers, e.g. self.headers.get('Host') inside do_GET.
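As a rough Python 3 sketch of that suggestion: the Host header usually carries the name the client asked for, optionally followed by a port. The helper domain_from_host and the class name HostAwareHandler are hypothetical, and the sketch assumes plain host names or IPv4 addresses (IPv6 literals are not handled).

```python
import html
from http.server import BaseHTTPRequestHandler, HTTPServer


def domain_from_host(host_header):
    """Return the host name without an optional ':port' suffix.

    Returns '' when the client sent no Host header. IPv6 literals
    such as '[::1]:8070' are not handled in this sketch.
    """
    if not host_header:
        return ""
    return host_header.split(":", 1)[0].lower()


class HostAwareHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        domain = domain_from_host(self.headers.get("Host"))
        # The header is client-controlled, so escape it before echoing it in HTML.
        body = ("<html>Welcome on %s</html>" % html.escape(domain)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet
```

A request arriving with "Host: site1.com:8070" would then be greeted as site1.com, and one with "Host: site2.com" as site2.com.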
