Getting Host field from HTTP request in BaseHTTPRequestHandler - python

I'm writing a script using the BaseHTTPRequestHandler class. In the do_GET(self) method I need to get the content of the Host field from the HTTP request. I can do it by running a regex over str(self.headers), as proposed here: Determine site domain in BaseHTTPServer, but that's kind of ugly and I wonder if there's a cleaner way to do it.

The attribute self.headers is a dictionary-like structure, so you can do this:
def do_GET(self):
    host = self.headers.get('Host')
When the header does not exist, None is returned.
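As a rough end-to-end check of the lookup above, here is a minimal Python 3 sketch. The names HostEchoHandler and fetch_echoed_host are made up for illustration, and the throwaway server on 127.0.0.1 with an OS-chosen port is just an assumption to keep the example self-contained.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HostEchoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that replies with the Host header it received."""

    def do_GET(self):
        # The lookup is case-insensitive and yields None if the header is absent
        host = self.headers.get('Host') or ''
        body = host.encode('ascii', 'replace')
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging for the demo


def fetch_echoed_host():
    """Start a throwaway server, send one GET, return (port, echoed host)."""
    server = HTTPServer(('127.0.0.1', 0), HostEchoHandler)  # port 0: any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection('127.0.0.1', port, timeout=10)
    try:
        conn.request('GET', '/')
        return port, conn.getresponse().read().decode('ascii')
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
```

Calling fetch_echoed_host() should give back the port and a string of the form 127.0.0.1:<port>, since http.client adds the port to the Host header when it is not the default.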

Related

POSTing to non-CGI scripts?

According to the http.server documentation BaseHTTPRequestHandler can handle POST requests.
class http.server.BaseHTTPRequestHandler(request, client_address, server)
This class is used to handle the HTTP requests that arrive at the server. By itself, it cannot respond to any actual HTTP requests; it must be subclassed to handle each request method (e.g. GET or POST). BaseHTTPRequestHandler provides a number of class and instance variables, and methods for use by subclasses.
However, down below it says:
do_POST()
This method serves the 'POST' request type, only allowed for CGI scripts. Error 501, “Can only POST to CGI scripts”, is output when trying to POST to a non-CGI url.
What does this part of the documentation mean? Isn't that contradicting itself or am I misunderstanding something?
EDIT: To clarify, the following method I tried seems to work, I'd just like to know what the documentation of do_POST means.
from os import curdir
from os.path import join as pjoin
from http.server import BaseHTTPRequestHandler, HTTPServer
port = 18888
class StoreHandler(BaseHTTPRequestHandler):
    store_path = pjoin(curdir, 'store.json')
    def do_POST(self):
        if self.path == '/store.json':
            print("Got a connection from", self.client_address)
            # Content-Length tells us how many bytes of body to read
            length = self.headers['content-length']
            data = self.rfile.read(int(length))
            print(data)
            with open(self.store_path, 'w') as fh:
                fh.write(data.decode())
            self.send_response(200)
            self.end_headers()
server = HTTPServer(('localhost', port), StoreHandler)
server.serve_forever()
CGIHTTPRequestHandler IS a subclass of SimpleHTTPRequestHandler, which is a subclass of BaseHTTPRequestHandler (I found this out by looking at the source code for SimpleHTTPServer.py and CGIHTTPServer.py). This part below:
do_POST() This method serves the 'POST' request type, only allowed for CGI scripts. Error 501, “Can only POST to CGI scripts”, is output when trying to POST to a non-CGI url.
refers to CGIHTTPRequestHandler, not BaseHTTPRequestHandler! See: http.server.BaseHTTPRequestHandler and CGIHTTPRequestHandler.
do_POST() as documented is a method of CGIHTTPRequestHandler. Its default behavior does not affect BaseHTTPRequestHandler in any way.
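To make the distinction concrete, here is a small Python 3 sketch. The class names NoPostHandler and PostHandler and the helper post_status are hypothetical; the only behavior relied on is that BaseHTTPRequestHandler dispatches to a do_<METHOD> method and answers 501 when the subclass does not define one.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class NoPostHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that defines no do_POST method."""

    def log_message(self, format, *args):
        pass  # silence per-request logging for the demo


class PostHandler(NoPostHandler):
    """Hypothetical handler with its own do_POST; no CGI involved."""

    def do_POST(self):
        self.send_response(200)
        self.send_header('Content-Length', '0')
        self.end_headers()


def post_status(handler_class):
    """Send an empty POST to a throwaway local server; return the status."""
    server = HTTPServer(('127.0.0.1', 0), handler_class)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection('127.0.0.1', server.server_address[1],
                                      timeout=10)
    try:
        conn.request('POST', '/')
        return conn.getresponse().status
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


if __name__ == '__main__':
    print(post_status(NoPostHandler))  # 501
    print(post_status(PostHandler))    # 200
```

The 501 in the first case comes from the base class's generic "unsupported method" handling, not from the CGI-specific message quoted in the question.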

Python read multiline post data

I'm using BaseHTTPRequestHandler to implement my HTTP server. How do I read multiline POST data in my do_PUT/do_POST?
Edit: I'm trying to implement a standalone script that services some custom requests, something like a listener on a server, which consolidates/archives/extracts from various log files. I don't want to implement something that requires a web server. I don't have much experience in Python, and I would be grateful if someone could point me to a better solution.
Edit2: I can't use any external libraries/modules; I have to make do with plain vanilla Python 2.4/Java 1.5/Perl 5.8.8. Restrictive policies, my hands are tied.
Getting the request body is as simple as reading from self.rfile, but you'll have to know how much to read if the client is using Connection: keep-alive. Something like this will work if the client specifies the Content-Length header...
from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler
class RequestHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        content_length = int(self.headers['Content-Length'])
        post_data = self.rfile.read(content_length)
        print post_data
server = HTTPServer(('', 8000), RequestHandler)
server.serve_forever()
...although it gets more complicated if the client sends data using chunked transfer encoding.
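For the chunked case, a simplified reader might look like the sketch below. It is written for Python 3 rather than the Python 2.4 mentioned in the question, the function name read_chunked is made up, and it ignores chunk extensions and discards trailer headers, so treat it as an illustration of the wire format rather than a complete implementation.

```python
import io


def read_chunked(rfile):
    """Read a chunked body from a binary file-like object such as self.rfile.

    Simplified sketch: chunk extensions are ignored and any trailer
    header lines are read and thrown away.
    """
    body = bytearray()
    while True:
        size_line = rfile.readline()
        if not size_line:
            raise ValueError('stream ended before the final chunk')
        # The chunk size is hexadecimal; anything after ';' is an extension
        size = int(size_line.split(b';', 1)[0].strip(), 16)
        if size == 0:
            break
        chunk = rfile.read(size)
        if len(chunk) != size:
            raise ValueError('stream ended inside a chunk')
        body += chunk
        rfile.readline()  # consume the CRLF that terminates the chunk
    # Skip optional trailer lines up to the blank line that ends the body
    while rfile.readline() not in (b'\r\n', b'\n', b''):
        pass
    return bytes(body)


if __name__ == '__main__':
    sample = io.BytesIO(b'4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n')
    print(read_chunked(sample))  # b'Wikipedia'
```

Inside a handler, one would presumably call it only when the Transfer-Encoding request header is chunked, and fall back to the Content-Length approach otherwise.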

Can I set a header with python's SimpleHTTPServer?

I'm using SimpleHTTPServer to test some webpages I'm working on. It works great, but I need to do some cross-domain requests. That requires setting an Access-Control-Allow-Origin header with the domains the page is allowed to access.
Is there an easy way to set a header with SimpleHTTPServer and serve the original content? The header would be the same on each request.
This is a bit of a hack because it changes end_headers() behavior, but I think it's slightly better than copying and pasting the entire SimpleHTTPServer.py file.
My approach overrides end_headers() in a subclass and in it calls send_my_headers() followed by calling the superclass's end_headers().
It's not 1 - 2 lines either, less than 20 though; mostly boilerplate.
#!/usr/bin/env python
try:
    from http import server  # Python 3
except ImportError:
    import SimpleHTTPServer as server  # Python 2
class MyHTTPRequestHandler(server.SimpleHTTPRequestHandler):
    def end_headers(self):
        self.send_my_headers()
        server.SimpleHTTPRequestHandler.end_headers(self)
    def send_my_headers(self):
        self.send_header("Access-Control-Allow-Origin", "*")
if __name__ == '__main__':
    server.test(HandlerClass=MyHTTPRequestHandler)
I'd say there's no simple way of doing it, where simple means "just add 1-2 lines that will write the additional header and keep the existing functionality". So, the best solution would be to subclass the SimpleHTTPRequestHandler class and re-implement the functionality, with the addition of the new header.
The problem behind why there is no simple way of doing this can be observed by looking at the implementation of the SimpleHTTPRequestHandler class in the Python library: http://hg.python.org/cpython/file/19c74cadea95/Lib/http/server.py#l654
Notice the send_head() method, particularly the lines at the end of the method which send the response headers. Notice the invocation of the end_headers() method. This method writes the headers to the output, together with a blank line which signals the end of all headers and the start of the response body: http://docs.python.org/py3k/library/http.server.html#http.server.BaseHTTPRequestHandler.end_headers
Therefore, it would not be possible to subclass the SimpleHTTPRequestHandler handler, invoke the super-class do_GET() method, and then just add another header -- because the sending of the headers has already finished when the call to the super-class do_GET() method returns. And it has to work like this because the do_GET() method has to send the body (the file that is requested), and to send the body - it has to finalize sending the headers.
So, again, I think you're stuck with sub-classing the SimpleHTTPRequestHandler class, implementing it exactly as the code in the library (just copy-paste it?), and add another header before the call to the end_headers() method in send_head():
...
self.send_header("Last-Modified", self.date_time_string(fs.st_mtime))
# this below is the new header
self.send_header('Access-Control-Allow-Origin', '*')
self.end_headers()
return f
...
# coding: utf-8
import SimpleHTTPServer
import SocketServer
PORT = 9999
# Note: this replaces the stock do_GET, so the response carries only the
# headers below and no file content is served.
def do_GET(self):
    self.send_response(200)
    self.send_header('Access-Control-Allow-Origin', 'http://example.com')
    self.end_headers()
Handler = SimpleHTTPServer.SimpleHTTPRequestHandler
Handler.do_GET = do_GET
httpd = SocketServer.TCPServer(("", PORT), Handler)
httpd.serve_forever()
While this is an older answer, it's the first result on Google.
This is basically what #iMon0 suggested, and it seems correct. Here is an example do_POST:
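import json
def do_POST(self):
    self.send_response(200)  # send_response() requires a status code
    self.send_header('Content-type', 'application/json')
    self.send_header('Access-Control-Allow-Origin', '*')
    self.end_headers()
    sTest = {}
    sTest['dummyitem'] = "Just an example of JSON"
    # encode() so the write also works on Python 3, where wfile expects bytes
    self.wfile.write(json.dumps(sTest).encode('utf-8'))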
With this, the flow feels right:
1: You get a request.
2: You apply the headers and response type you want.
3: You send back whatever data you want, in whatever form you want.
The above example works fine for me and can be extended further; it's just a bare-bones JSON POST server. I'll leave it here on SOF in case someone needs it, or I come back for it myself in a few months.
This does produce a valid JSON file containing only the sTest object, the same as a PHP-generated page/file.

how to make a delete / put request in python

I can make get or post request using urllib, but how do I make DELETE- and PUT-requests?
The requests library can handle POST, PUT, DELETE, and all other HTTP methods, and is significantly less scary than urllib, httplib and their variants.
You can override get_method with something like this:
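import urllib2
def _make_request(url, data, method):
    request = urllib2.Request(url, data=data)
    # urlopen() asks the Request object which HTTP method to use
    request.get_method = lambda: method
    return urllib2.urlopen(request)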
Then you pass "DELETE" as method.
This answer covers the details.
A PUT request can be performed with httplib2:
http://code.google.com/p/httplib2
http://twistedmatrix.com/documents/current/web/howto/client.html
If you're looking to work with HTTP in twisted using the client side I'd suggest checking that out. It demonstrates how you can really easily make a request using the agent class.
As far as I know, urllib and urllib2 only support GET and POST requests. You should probably take a look at httplib or httplib2.
The method is set implicitly in the urlopen call
When you provide the data parameter a POST will be used.
urllib.request.urlopen(url, data=None[, timeout])
I don't think it's possible to use the DELETE HTTP method with urllib because of this line:
Request.get_method()
Return a string indicating the HTTP request method. This is only meaningful for HTTP requests, and currently always returns 'GET' or 'POST'.
Consider using httplib, httplib2, or Twisted instead for better support of HTTP methods.
The default HTTP methods in the urllib library are POST and GET:
def get_method(self):
    """Return a string indicating the HTTP request method."""
    default_method = "POST" if self.data is not None else "GET"
    return getattr(self, 'method', default_method)
But we can override this get_method() function to send a DELETE request:
req = urllib.request.Request(new_url)
req.get_method = lambda: "DELETE"
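On Python 3.3 and later, urllib.request.Request also accepts a method argument, which avoids patching get_method by hand. A minimal sketch follows; the helper name build_request and the example.com URLs are placeholders, and nothing is sent over the network here.

```python
import urllib.request


def build_request(url, method, data=None):
    """Return a Request that will use the given HTTP method (Python 3.3+)."""
    return urllib.request.Request(url, data=data, method=method)


if __name__ == '__main__':
    req = build_request('http://example.com/items/1', 'DELETE')
    print(req.get_method())  # DELETE
```

Passing the resulting object to urllib.request.urlopen() would then issue the request with that method.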

Best practise when using httplib2.Http() object

I'm writing a pythonic web API wrapper with a class like this
import httplib2
import urllib
class apiWrapper:
    def __init__(self):
        self.http = httplib2.Http()
    def _http(self, url, method, dict):
        '''
        I'm using this wrapper around the http object
        all the time inside the class
        '''
        params = urllib.urlencode(dict)
        # httplib2's signature is request(uri, method, body, ...)
        response, content = self.http.request(url, method, body=params)
As you can see, I'm using the _http() method to simplify the interaction with the httplib2.Http() object. This method is called quite often inside the class, and I'm wondering what's the best way to interact with this object:
create the object in __init__ and then reuse it whenever the _http() method is called (as shown in the code above),
or create the httplib2.Http() object inside the method on every call of _http() (as shown in the code sample below)?
import httplib2
import urllib
class apiWrapper:
    def __init__(self):
        pass
    def _http(self, url, method, dict):
        '''I'm using this wrapper around the http object
        all the time inside the class'''
        http = httplib2.Http()
        params = urllib.urlencode(dict)
        # httplib2's signature is request(uri, method, body, ...)
        response, content = http.request(url, method, body=params)
Supplying 'connection': 'close' in your headers should, according to the docs, close the connection after a response is received:
headers = {'connection': 'close'}
resp, content = h.request(url, headers=headers)
You should keep the Http object if you reuse connections. It seems httplib2 is capable of reusing connections the way you use it in your first code, so this looks like a good approach.
At the same time, from a shallow inspection of the httplib2 code, it seems that httplib2 has no support for cleaning up unused connections, or to even notice when a server has decided to close a connection it no longer wants. If that is indeed the case, it looks like a bug in httplib2 to me - so I would rather use the standard library (httplib) instead.
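If one did go the standard-library route, a wrapper that keeps a single connection and reopens it when the server has dropped it could look roughly like this. It is only a sketch: it uses http.client (the Python 3 name for httplib), the class name ApiClient is invented, and the retry-once policy is an assumption that may not be appropriate for non-idempotent requests, since a retried POST could be processed twice.

```python
import http.client
import urllib.parse


class ApiClient:
    """Hypothetical wrapper that reuses one connection across requests."""

    def __init__(self, host, port=80):
        # The socket is opened lazily on the first request
        self.conn = http.client.HTTPConnection(host, port, timeout=10)

    def request(self, method, path, params=None):
        body = urllib.parse.urlencode(params or {})
        headers = {'Content-Type': 'application/x-www-form-urlencoded'}
        for attempt in range(2):
            try:
                self.conn.request(method, path, body=body, headers=headers)
                response = self.conn.getresponse()
                # Read the whole body so the connection can be reused
                return response.status, response.read()
            except (http.client.HTTPException, ConnectionError):
                # The server may have closed an idle connection; close our
                # side (it reconnects automatically) and retry once
                self.conn.close()
                if attempt == 1:
                    raise

    def close(self):
        self.conn.close()
```

Keeping the connection on the instance mirrors the first variant in the question: the object is created once in __init__ and every call goes through it.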
