Google App Engine: Response "Content-Length" header is always 0 - python

I followed the Google App Engine tutorial and I'm having a bit of trouble with adding content to the response object in the guestbook class.
class Guestbook(webapp2.RequestHandler):
    def post(self):
        # We set the same parent key on the 'Greeting' to ensure each greeting
        # is in the same entity group. Queries across the single entity group
        # will be consistent. However, the write rate to a single entity group
        # should be limited to ~1/second.
        guestbook_name = self.request.get('guestbook_name',
                                          DEFAULT_GUESTBOOK_NAME)
        testvar = self.request.get('testvar',
                                   DEFAULT_GUESTBOOK_NAME)
        greeting = Greeting(parent=guestbook_key(guestbook_name))
        if users.get_current_user():
            greeting.author = users.get_current_user()
        greeting.content = self.request.get('content')
        greeting.info = 'DIDTHISWORK?'
        greeting.put()
        self.response.headers.add_header("Expires", 'Information here')
        #self.response.set_status(200, 'Is this working?!')
        self.response.headers['Content-Type'] = 'text/plain'
        #self.response.headers['Content-Length'] = '5'
        self.response.out.write('Hello')
        query_params = {'guestbook_name': guestbook_name}
        self.redirect('/?' + urllib.urlencode(query_params))
        print type(self.response)
Using Wireshark, here is the response packet:
HTTP/1.1 302 Found
Cache-Control: no-cache
Expires: Information here
Content-Type: text/plain
Location: http://_______.appspot.com/?guestbook_name=default_guestbook
Date: Fri, 17 May 2013 01:21:52 GMT
Server: Google Frontend
Content-Length: 0
As you can see, I'm trying to fill the response body with "Hello", but Content-Length stays at 0, and setting it manually doesn't help either, so I commented that line out. You can probably ignore the greeting code, but I included it in case it affects anything.

You are sending a redirect, so the content you wrote never reaches the client, hence the 0 length.
self.redirect('/?' + urllib.urlencode(query_params)) # turns the response into a 302
print type(self.response) # runs on the server, but nothing it prints goes into the response
If you look at the Wireshark capture, note the response code, which is a 302 redirect, and the Location: header, which tells the browser where to redirect to.
HTTP/1.1 302 Found
Location: http://_______.appspot.com/?guestbook_name=default_guestbook
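If the goal is to see a body on the wire, the handler has to pick one outcome: either write a body and return it, or redirect. A minimal sketch reusing the question's names (the plain query parameter here is purely illustrative):
import urllib
import webapp2

class Guestbook(webapp2.RequestHandler):
    def post(self):
        guestbook_name = self.request.get('guestbook_name', 'default_guestbook')

        if self.request.get('plain'):
            # Option 1: return a plain-text body directly; the client then
            # receives "Hello" with a non-zero Content-Length.
            self.response.headers['Content-Type'] = 'text/plain'
            self.response.out.write('Hello')
            return

        # Option 2: redirect. The response becomes a 302 with a Location
        # header, which is exactly what the capture above shows.
        self.redirect('/?' + urllib.urlencode({'guestbook_name': guestbook_name}))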

Related

Create an HTTP object from a string in Python

I have a device that sends the following HTTP message to my Raspberry Pi:
POST /sinvertwebmonitor/InverterService/InverterService.asmx/CollectInverterData HTTP/1.1
Host: www.automation.siemens.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 349

xmlData=<rd><m>xxxxx</m><s>yyyyyy</s><d t="1483019400" l="600"><p i="1">460380AE</p><p i="2">43655DE7</p><p i="3">4212C986</p><p i="4">424805BC</p><p i="5">4604E3D1</p><p i="6">441F616A</p><p i="7">4155E7F5</p><p i="8">E1</p><p i="9">112</p><p i="C">153</p><p i="D">4</p><p i="E">11ABAC</p><p i="F">22A48C</p><p i="10">0</p></d></rd>
I cannot change anything on the device.
On the Raspberry Pi I'm running a script that listens on a socket and receives the message.
This works so far, and the received message is the one above.
Now I would like to create an HTTP object from this message and then conveniently extract the headers, content, and so on.
Similar to the following example:
r = requests.get('https://www.google.com')
r.status_code
However, without "getting" a URL; I just want to read the string I already have.
Pseudo-example:
r = requests.read(hereComesTheString)
r.status_code
I hope the problem became understandable.
Would be glad to get some hints.
Thanks and best regards,
Christoph
You use the status_code property in your example, but what you are receiving is a request, not a response. However, you can still create a simple object for accessing the data in the request.
It is probably easiest to create your own custom class:
import mimetools
from StringIO import StringIO
request = """POST /sinvertwebmonitor/InverterService/InverterService.asmx/CollectInverterData HTTP/1.1
Host: www.automation.siemens.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 349
xmlData=<rd><m>xxxxx</m><s>yyyyyy</s><d t="1483019400" l="600"><p i="1">460380AE</p><p i="2">43655DE7</p><p i="3">4212C986</p><p i="4">424805BC</p><p i="5">4604E3D1</p><p i="6">441F616A</p><p i="7">4155E7F5</p><p i="8">E1</p><p i="9">112</p><p i="C">153</p><p i="D">4</p><p i="E">11ABAC</p><p i="F">22A48C</p><p i="10">0</p></d></rd>"""
class Request:
    def __init__(self, request):
        stream = StringIO(request)
        request = stream.readline()
        words = request.split()
        [self.command, self.path, self.version] = words
        self.headers = mimetools.Message(stream, 0)
        self.content = stream.read()

    def __getitem__(self, key):
        return self.headers.get(key, '')

r = Request(request)
print(r.command)
print(r.path)
print(r.version)
for header in r.headers:
    print(header, r[header])
print(r.content)
This outputs:
POST
/sinvertwebmonitor/InverterService/InverterService.asmx/CollectInverterData
HTTP/1.1
('host', 'www.automation.siemens.com')
('content-type', 'application/x-www-form-urlencoded')
('content-length', '349')
xmlData=<rd><m>xxxxx</m><s>yyyyyy</s><d t="1483019400" l="600"><p i="1">460380AE</p><p i="2">43655DE7</p><p i="3">4212C986</p><p i="4">424805BC</p><p i="5">4604E3D1</p><p i="6">441F616A</p><p i="7">4155E7F5</p><p i="8">E1</p><p i="9">112</p><p i="C">153</p><p i="D">4</p><p i="E">11ABAC</p><p i="F">22A48C</p><p i="10">0</p></d></rd>
If you're using a plain socket server, then you need to implement enough of an HTTP server yourself to split the request and respond according to the protocol.
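As a rough illustration of "respond according to the protocol": once the request has been parsed, a minimal HTTP response can be written back on the accepted socket. This is only a sketch; conn stands for the connection object returned by your listener's accept().
# Minimal acknowledgement so the device does not keep retrying.
# 'conn' is assumed to be the socket returned by accept() in your listener.
body = "OK"
response = ("HTTP/1.1 200 OK\r\n"
            "Content-Type: text/plain\r\n"
            "Content-Length: %d\r\n"
            "Connection: close\r\n"
            "\r\n"
            "%s") % (len(body), body)
conn.sendall(response)
conn.close()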
It's probably easier just to use an existing HTTP server and app server. Flask is ideal for this:
from flask import Flask
from flask import request

app = Flask(__name__)

@app.route("/sinvertwebmonitor/InverterService/InverterService.asmx/CollectInverterData", methods=['POST'])
def dataCollector():
    data = request.form['xmlData']
    print(data)
    # parseData. Take a look at ElementTree
    return 'OK'

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=80)
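A quick way to exercise the route without the real device is Flask's built-in test client (a sketch; the payload here is shortened):
# Post a form-encoded body to the route above, just like the device would.
with app.test_client() as client:
    resp = client.post(
        '/sinvertwebmonitor/InverterService/InverterService.asmx/CollectInverterData',
        data={'xmlData': '<rd><m>xxxxx</m></rd>'})  # shortened sample payload
    print(resp.status_code)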
Thanks Alden. Below is your code with a few changes so it works with Python 3.
import email
from io import StringIO

request = """POST /sinvertwebmonitor/InverterService/InverterService.asmx/CollectInverterData HTTP/1.1
Host: www.automation.siemens.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 349

xmlData=<rd><m>xxxxx</m><s>yyyyyy</s><d t="1483019400" l="600"><p i="1">460380AE</p><p i="2">43655DE7</p><p i="3">4212C986</p><p i="4">424805BC</p><p i="5">4604E3D1</p><p i="6">441F616A</p><p i="7">4155E7F5</p><p i="8">E1</p><p i="9">112</p><p i="C">153</p><p i="D">4</p><p i="E">11ABAC</p><p i="F">22A48C</p><p i="10">0</p></d></rd>"""

class Request:
    def __init__(self, request):
        stream = StringIO(request)
        request_line = stream.readline()
        words = request_line.split()
        [self.command, self.path, self.version] = words
        # Parse the remaining headers and body with the email module.
        self.headers = email.message_from_string(stream.read())
        self.content = self.headers.get_payload()

    def __getitem__(self, key):
        return self.headers.get(key, '')

r = Request(request)
print(r.command)
print(r.path)
print(r.version)
for header in r.headers.keys():
    print(header, r[header])
print(r.content)

Enabling CORS in Python with Tastypie update (PUT)

There are several questions on SO related to this topic, but none that I have seen solve my issue.
I have an endpoint in my Django/Tastypie API that accepts a PUT in order to update the database. This works great when testing on localhost:8000; however, my production database is located on a different domain, so I need to enable CORS for this PUT call to update the database.
I have found the tutorial here that gives an example of how to do this; however, when I execute the cURL command:
curl -X PUT --dump-header - -H "Content-Type: application/json" -H "Authorization: ApiKey api:MYAPIKEY" -d "{\"mykey\": \"my_value\", \"resource_uri\": \"/api/v1/mytable/362/\"}" my.domain.com/api/v1/mytable/362/
I am still receiving 401 UNAUTHORIZED for my calls (header dump below):
HTTP/1.1 401 UNAUTHORIZED
Date: Mon, 22 Sep 2014 16:08:34 GMT
Server: Apache/2.4.7 (Ubuntu)
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: Content-Type,Authorization
X-Frame-Options: SAMEORIGIN
Content-Length: 0
Content-Type: text/html; charset=utf-8
My CorsResource superclass code:
class CorsResource(ModelResource):
    """ adds CORS headers for cross-domain requests """

    def patch_response(self, response):
        allowed_headers = ['Content-Type', 'Authorization']
        response['Access-Control-Allow-Origin'] = '*'
        response['Access-Control-Allow-Headers'] = ','.join(allowed_headers)
        return response

    def dispatch(self, *args, **kwargs):
        """ calls super and patches response headers
        or
        catches ImmediateHttpResponse, patches headers and re-raises
        """
        try:
            response = super(CorsResource, self).dispatch(*args, **kwargs)
            return self.patch_response(response)
        except ImmediateHttpResponse as e:
            response = self.patch_response(e.response)
            # re-raise - we could return a response but then anything wrapping
            # this and expecting an exception would be confused
            raise ImmediateHttpResponse(response)

    def method_check(self, request, allowed=None):
        """ Handle OPTIONS requests """
        if request.method.upper() == 'OPTIONS':
            if allowed is None:
                allowed = []
            allows = ','.join([s.upper() for s in allowed])
            response = HttpResponse(allows)
            response['Allow'] = allows
            raise ImmediateHttpResponse(response=response)
        return super(CorsResource, self).method_check(request, allowed)
My endpoint code:
class DataModelResource(CorsResource):
    data = fields.OneToOneField(DataModelResource, "data", full=True)

    class Meta:
        allowed_methods = ['get', 'put']
        authentication = ApiKeyAuthentication()
        authorization = Authorization()
        queryset = Data.objects.all()
        resource_name = 'mytable'
Does anyone see any reason why a cross-domain PUT should be failing with this code?
I'm at a complete loss here.

Python POST binary data

I am writing some code to interface with Redmine, and I need to upload some files as part of the process, but I am not sure how to make a POST request from Python containing a binary file.
I am trying to mimic the commands here:
curl --data-binary "@image.png" -H "Content-Type: application/octet-stream" -X POST -u login:password http://redmine/uploads.xml
My attempt in Python is below, but it does not seem to work. I am not sure if the problem is somehow related to encoding the file or if something is wrong with the headers.
import urllib2, os

FilePath = "C:\somefolder\somefile.7z"
FileData = open(FilePath, "rb")
length = os.path.getsize(FilePath)

password_manager = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_manager.add_password(None, 'http://redmine/', 'admin', 'admin')
auth_handler = urllib2.HTTPBasicAuthHandler(password_manager)
opener = urllib2.build_opener(auth_handler)
urllib2.install_opener(opener)

request = urllib2.Request(r'http://redmine/uploads.xml', FileData)
request.add_header('Content-Length', '%d' % length)
request.add_header('Content-Type', 'application/octet-stream')

try:
    response = urllib2.urlopen(request)
    print response.read()
except urllib2.HTTPError as e:
    error_message = e.read()
    print error_message
I have access to the server, and it looks like an encoding error:
...
invalid byte sequence in UTF-8
Line: 1
Position: 624
Last 80 unconsumed characters:
7z¼¯'ÅÐз2^Ôøë4g¸R<süðí6kĤª¶!»=}jcdjSPúá-º#»ÄAtD»H7Ê!æ½]j):
(further down)
Started POST "/uploads.xml" for 192.168.0.117 at 2013-01-16 09:57:49 -0800
Processing by AttachmentsController#upload as XML
WARNING: Can't verify CSRF token authenticity
Current user: anonymous
Filter chain halted as :authorize_global rendered or redirected
Completed 401 Unauthorized in 13ms (ActiveRecord: 3.1ms)
Basically, what you are doing is correct. Looking at the Redmine docs you linked to, it seems that the suffix after the dot in the URL denotes the type of posted data (.json for JSON, .xml for XML), which agrees with the response you get - Processing by AttachmentsController#upload as XML. I guess there may be a bug in the docs, and to post binary data you should try the http://redmine/uploads URL instead of http://redmine/uploads.xml.
By the way, I highly recommend the very good and very popular Requests library for HTTP in Python. It's much better than what's in the standard library (urllib2). It supports authentication as well, but I skipped it for brevity here.
import requests

with open('./x.png', 'rb') as f:
    data = f.read()

res = requests.post(url='http://httpbin.org/post',
                    data=data,
                    headers={'Content-Type': 'application/octet-stream'})

# let's check if what we sent is what we intended to send...
import json
import base64

assert base64.b64decode(res.json()['data'][len('data:application/octet-stream;base64,'):]) == data
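For reference, the authentication that the urllib2 version sets up with HTTPBasicAuthHandler is a single keyword argument in Requests. A sketch using the URL and credentials from the question:
import requests

with open(r'C:\somefolder\somefile.7z', 'rb') as f:
    res = requests.post('http://redmine/uploads.xml',
                        data=f,  # the file object is streamed as the raw request body
                        auth=('admin', 'admin'),  # HTTP Basic auth, same credentials as the urllib2 example
                        headers={'Content-Type': 'application/octet-stream'})
print(res.status_code)
print(res.text)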
UPDATE
To find out why this works with Requests but not with urllib2, we have to examine the difference in what's being sent. To see this I'm sending the traffic to an HTTP proxy (Fiddler) running on port 8888:
Using Requests
import requests
data = 'test data'
res = requests.post(url='http://localhost:8888',
                    data=data,
                    headers={'Content-Type': 'application/octet-stream'})
we see
POST http://localhost:8888/ HTTP/1.1
Host: localhost:8888
Content-Length: 9
Content-Type: application/octet-stream
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/1.0.4 CPython/2.7.3 Windows/Vista
test data
and using urllib2
import urllib2
data = 'test data'
req = urllib2.Request('http://localhost:8888', data)
req.add_header('Content-Length', '%d' % len(data))
req.add_header('Content-Type', 'application/octet-stream')
res = urllib2.urlopen(req)
we get
POST http://localhost:8888/ HTTP/1.1
Accept-Encoding: identity
Content-Length: 9
Host: localhost:8888
Content-Type: application/octet-stream
Connection: close
User-Agent: Python-urllib/2.7
test data
I don't see any differences that would warrant the different behavior you observe. Having said that, it's not uncommon for HTTP servers to inspect the User-Agent header and vary behavior based on its value. Try changing the headers sent by Requests one by one, making them the same as those sent by urllib2, and see when it stops working.
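For example, to bisect which header matters, the Requests call can be made to send (almost) the same header set as urllib2; setting a header to None should drop Requests' default for it. A sketch:
import requests

# Mirror the header set urllib2 sent, then remove these overrides one by one
# to find which header (if any) triggers the different server behavior.
headers = {
    'Content-Type': 'application/octet-stream',
    'User-Agent': 'Python-urllib/2.7',
    'Accept-Encoding': 'identity',
    'Accept': None,  # None drops Requests' default "Accept: */*" header
}
res = requests.post('http://localhost:8888', data='test data', headers=headers)
print(res.status_code)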
This has nothing to do with a malformed upload. The HTTP error clearly specifies 401 unauthorized, and tells you the CSRF token is invalid. Try sending a valid CSRF token with the upload.
More about csrf tokens here:
What is a CSRF token ? What is its importance and how does it work?
You need to add a Content-Disposition header, something like this (I used mod_python here, but the principle should be the same):
request.headers_out['Content-Disposition'] = 'attachment; filename=%s' % myfname
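In the urllib2 code from the question, the equivalent would be one more add_header call (the filename here is just an example):
# Suggest a filename for the uploaded attachment (example filename).
request.add_header('Content-Disposition', 'attachment; filename=somefile.7z')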
You can use unirest; it provides an easy way to make a POST request.
import unirest

def callback(response):
    print "code:" + str(response.code)
    print "******************"
    print "headers:" + str(response.headers)
    print "******************"
    print "body:" + str(response.body)
    print "******************"
    print "raw_body:" + str(response.raw_body)

# consume async post request
def consumePOSTRequestASync():
    params = {'test1': 'param1', 'test2': 'param2'}
    # we need to pass a dummy file handle because unirest does not
    # provide a switch between application/x-www-form-urlencoded and
    # multipart/form-data
    params['dummy'] = open('dummy.txt', 'r')
    url = 'http://httpbin.org/post'
    headers = {"Accept": "application/json"}
    # call the post service with headers and params
    unirest.post(url, headers=headers, params=params, callback=callback)

# post async request multipart/form-data
consumePOSTRequestASync()

Python Requests - Is it possible to receive a partial response after an HTTP POST?

I am using the Python Requests module to data-mine a website. As part of the data mining, I have to POST a form and check whether it succeeded by checking the resulting URL. My question is: after the POST, is it possible to ask the server not to send the entire page? I only need to check the URL, yet my program downloads the entire page and consumes unnecessary bandwidth. The code is very simple:
import requests

r = requests.post(URL, payload)
if 'keyword' in r.url:
    success
else:
    fail
An easy solution, if it's implementable for you, is to go low-level and use the socket library.
For example, say you need to send a POST with some data in its body. I used this in my crawler for one site.
import socket
from urllib import quote  # POST body is escaped, use quote

req_header = "POST /{0} HTTP/1.1\r\nHost: www.yourtarget.com\r\nUser-Agent: For the lulz..\r\nContent-Type: application/x-www-form-urlencoded; charset=UTF-8\r\nContent-Length: {1}"
req_body = quote("data1=yourtestdata&data2=foo&data3=bar=")
req_url = "test.php"
# plug req_url in as {0} and the length of req_body in as Content-Length
header = req_header.format(req_url, str(len(req_body)))

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # create a socket
s.connect(("www.yourtarget.com", 80))  # connect it
# send the header, two CRLFs, the body and two more CRLFs to complete the request
s.send(header + "\r\n\r\n" + req_body + "\r\n\r\n")
page = ""
while True:
    buf = s.recv(1024)  # receive the next 1024 bytes; this should be enough to get the header in one try
    if not buf:
        break
    if "\r\n\r\n" in page:  # if we received the whole header (ending with 2x CRLF), break
        break
    page += buf
s.close()  # close the socket here, which should close the TCP connection even if data is still flowing in
# this should leave you with a header in which you should find a 302 redirect and your target URL in the "Location:" header
There's a chance the site uses the Post/Redirect/Get (PRG) pattern. If so, it's enough to not follow the redirect and read the Location header from the response.
Example
>>> import requests
>>> response = requests.get('http://httpbin.org/redirect/1', allow_redirects=False)
>>> response.status_code
302
>>> response.headers['location']
'http://httpbin.org/get'
If you need more information on what you would get if you had followed the redirect, you can use HEAD on the URL given in the Location header.
Example
>>> import requests
>>> response = requests.get('http://httpbin.org/redirect/1', allow_redirects=False)
>>> response.status_code
302
>>> response.headers['location']
'http://httpbin.org/get'
>>> response2 = requests.head(response.headers['location'])
>>> response2.status_code
200
>>> response2.headers
{'date': 'Wed, 07 Nov 2012 20:04:16 GMT', 'content-length': '352', 'content-type':
'application/json', 'connection': 'keep-alive', 'server': 'gunicorn/0.13.4'}
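Applied to the POST from the question, that would look something like this (a sketch, reusing the URL and payload placeholders from the question and assuming the form really does answer with a redirect):
import requests

# Stop at the redirect itself; only its headers are needed.
r = requests.post(URL, data=payload, allow_redirects=False)
location = r.headers.get('location', '')
if r.status_code in (301, 302, 303) and 'keyword' in location:
    print('success')
else:
    print('fail')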
It would help if you gave some more data - for example, a sample URL that you're trying to request. That being said, it seems to me that you're generally checking whether you had the correct URL after your POST request using the following algorithm, relying on redirection or HTTP 404 errors:
if original_url == returned request url:
    correct url to a correctly made request
else:
    wrong url and a wrongly made request
If this is the case, what you can do here is use an HTTP HEAD request (another type of HTTP request, like GET, POST, etc.) via Python's requests library to get only the headers and not the page body. Then you'd check the response code and the redirect URL (if present) to see whether you made a request to a valid URL.
For example:
def attempt_url(url):
    '''Checks the url to see if it is valid, or returns a redirect or error.
    Returns True if valid, False otherwise.'''
    r = requests.head(url)
    if r.status_code == 200:
        return True
    elif r.status_code in (301, 302):
        if r.headers['location'] == url:
            return True
        else:
            return False
    elif r.status_code == 404:
        return False
    else:
        raise Exception("A status code we haven't prepared for has arisen!")
If this isn't quite what you're looking for, additional detail on your requirements would help. At the very least, this gets you the status code and headers without pulling all of the page data.

webapp2 How to remove Cache-Control: no-cache from response header?

I'm having trouble serving App Engine Blobstore content over HTTPS, specifically with the IE 8 and IE 7 browsers, as the browser refuses to download the content over HTTPS.
According to a Microsoft article, it's because of the Cache-Control: no-cache header.
http://blogs.msdn.com/b/ieinternals/archive/2009/10/02/internet-explorer-cannot-download-over-https-when-no-cache.aspx
The solution proposed in the article is to remove the Cache-Control: no-cache header from the response.
However, it seems that webapp2 automatically adds this header even after I tried setting it to something else.
According to the source code, it seems to be added to every response
http://code.google.com/p/webapp-improved/source/browse/webapp2.py#362
def __init__(self, *args, **kwargs):
    """Constructs a response with the default settings."""
    super(Response, self).__init__(*args, **kwargs)
    self.headers['Cache-Control'] = 'no-cache'
Is there a way to override this behavior?
This is what I tried when building my response, but the Cache-Control: no-cache header is still there once the response is rendered:
self.response.headers['Pragma'] = 'cache-control'
self.response.headers['Cache-Control'] = 'private'
self.response.cache_control.no_cache = None
self.response.cache_control.public = False
self.response.cache_control.max_age = 1
I do not use webapp2 in the download handler itself. It looks like this:
class DynServe(blobstore_handlers.BlobstoreDownloadHandler):
    def get(self, resource):
        (key, _, _) = resource.rpartition('.')
        blob_info = blobstore.BlobInfo.get(key)
        self.send_blob(blob_info, save_as=True)

app = webapp2.WSGIApplication(
    [
        ('/dynserve/([^/]+)?', DynServe),
    ], debug=True)

def main():
    app.run()
It serves a PDF at a URL like: /dynserve/{{ blob-key }}.pdf
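One thing that might be worth trying (a sketch only, not verified against the Blobstore serving path) is to overwrite the header in the download handler itself, after send_blob:
class DynServe(blobstore_handlers.BlobstoreDownloadHandler):
    def get(self, resource):
        (key, _, _) = resource.rpartition('.')
        blob_info = blobstore.BlobInfo.get(key)
        self.send_blob(blob_info, save_as=True)
        # Replace the default 'no-cache' set in Response.__init__ so that
        # IE 7/8 will accept the download over HTTPS (per the MSDN article).
        self.response.headers['Cache-Control'] = 'private, max-age=300'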
