tornado web application fails to decode compressed http body - python

I am writing some simple prototype Tornado web applications and found that Tornado fails to decode the HTTP request body, with the error:
Error -3 while decompressing: incorrect header check
In one of the Tornado web applications, I send an HTTP request whose body is compressed using zlib:
http_body = zlib.compress(data)
I also add the HTTP header:
'Content-Encoding': 'gzip'
However, when another Tornado web application receives this request, decompression fails with the error above.
Sample code that handles the HTTP request:
class MyRequestHandler(tornado.web.RequestHandler):
    def post(self):
        global num
        message = self.request.body
        self.set_status(200)
I have also made sure that decompress_request=True is set when the application starts listening.
I have checked the Tornado documentation and earlier posts and found nothing about compressed HTTP request bodies, nor any example of one. The only thing mentioned is the decompress_response parameter, which only concerns compressed HTTP responses from a server.
Am I missing any settings here?

gzip and zlib are both based on the same underlying compression algorithm, but they are not the same thing. You must use gzip and not just zlib here:
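import gzip
from io import BytesIO
from tornado.escape import utf8
def post_gzip(self, body):
    # Write the body through GzipFile so it gets a real gzip header and trailer
    bytesio = BytesIO()
    gzip_file = gzip.GzipFile(mode='w', fileobj=bytesio)
    gzip_file.write(utf8(body))
    gzip_file.close()
    compressed_body = bytesio.getvalue()
    return self.fetch('/', method='POST', body=compressed_body,
                      headers={'Content-Encoding': 'gzip'})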
Some zlib functions also take cryptic options that cause them to produce gzip-format output. These can be used with zlib.decompressobj and zlib.compressobj to do streaming compression and decompression.
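To make the difference concrete, here is a minimal sketch using only the standard library. It relies on the zlib convention that adding 16 to the wbits argument selects the gzip container instead of the zlib one. The helper names gzip_compress_with_zlib and gunzip_with_zlib are made up for this illustration and are not part of Tornado or zlib.

```python
import gzip
import zlib


def gzip_compress_with_zlib(data: bytes) -> bytes:
    """Produce gzip-format output using only the zlib module."""
    # wbits = MAX_WBITS | 16 asks zlib to emit a gzip header and trailer
    compressor = zlib.compressobj(6, zlib.DEFLATED, zlib.MAX_WBITS | 16)
    return compressor.compress(data) + compressor.flush()


def gunzip_with_zlib(data: bytes) -> bytes:
    """Decode gzip-format input using only the zlib module."""
    decompressor = zlib.decompressobj(zlib.MAX_WBITS | 16)
    return decompressor.decompress(data) + decompressor.flush()


payload = b"hello tornado " * 10
zlib_body = zlib.compress(payload)             # zlib container
gzip_body = gzip_compress_with_zlib(payload)   # gzip container

# A gzip decoder rejects the zlib container, much like the error in the question
try:
    gunzip_with_zlib(zlib_body)
    zlib_rejected = False
except zlib.error:
    zlib_rejected = True
```

The two outputs wrap the same deflate data in different framing, which is why a body produced by zlib.compress cannot be labelled with Content-Encoding: gzip.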

Related

Stream PDF file from http response directly to client with Python requests library using client certificate

I'm making a request to an endpoint that returns a PDF as streamable binary. This endpoint uses mutual TLS authentication so when I hit the endpoint I must send a client certificate. To achieve this I am using https://pypi.org/project/requests-pkcs12/ which supports the Python requests library.
I would like to download this PDF from the client.
Ideally when the end user clicks 'download' it hits the endpoint and directly streams the data and downloads it.
I am struggling to do this in one single step.
Currently what I'm doing is downloading the PDF to a file, then sending this file back to the client. Writing to the file is slow and I'd like to avoid the download-to-file step and simply send a streaming response back somehow.
Is there a way to stream this directly using Python's requests library?
# Hit the mutual-TLS-authenticated endpoint
response = post(f'{url}', stream=True, pkcs12_filename=client_certificate_path,
                pkcs12_password=client_certificate_passphrase)
# Write the returned data to a file
with open('/tmp/newfile.pdf', 'wb') as f:
    f.write(response.content)
# Send the file back to the client with Django's FileResponse
return FileResponse(open('/tmp/newfile.pdf', 'rb'))
Django's StreamingHttpResponse seems to handle this kind of problem nicely, but I was unable to get it working because it doesn't let me send a client certificate and a password-protected client certificate key.

SOAP client in python, how to replicate with XML

I am using suds to send XML and I got my request working, but I'm really confused by how to replicate my results using XML. I have the XML request that my suds client is sending by using:
from suds.client import Client
url = "xxxxxxx"
client = Client(url)
...
client.last_received.str()
but I'm not sure where I would send that request to if I was using the requests library. How would I replicate the request from the suds client in a python request?
Most SOAP APIs run over plain HTTP and use POST, so they are easily mimicked with any standard HTTP client such as Requests.
First look here to see how to view the headers and body that suds is sending. It is then a matter of replicating these headers and the XML body and passing them to the Requests library.
One defining characteristic of the vast majority of HTTP SOAP APIs is that every request goes to the same endpoint (for example 'http://yyy.com:8080/Posting/LoadPosting.svc'), and the actual action is specified in the SOAPAction header. Contrast this with a RESTful API, where the action is implied by the verb and the endpoint you call (POST /user, GET /menu, etc.).
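As a rough sketch of that idea, the following builds (but does not send) such a POST request with the standard library. The action URI, the envelope contents and the helper name build_soap_request are hypothetical, and the Content-Type assumes a SOAP 1.1 style service; the real values have to be copied from what suds actually sends.

```python
import urllib.request

# Endpoint taken from the example above; the action URI is a made-up placeholder
ENDPOINT = "http://yyy.com:8080/Posting/LoadPosting.svc"
SOAP_ACTION = "http://example.com/LoadPosting"


def build_soap_request(envelope_xml, endpoint=ENDPOINT, action=SOAP_ACTION):
    """Build a POST request carrying a SOAP envelope and a SOAPAction header."""
    return urllib.request.Request(
        endpoint,
        data=envelope_xml.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            # The action travels in a header, not in the URL
            "SOAPAction": '"%s"' % action,
        },
        method="POST",
    )


# Placeholder envelope; in practice paste the XML captured from suds
envelope = "<Envelope><Body><LoadPosting/></Body></Envelope>"
request = build_soap_request(envelope)
```

The same URL, headers and body can be handed to requests.post(url, data=..., headers=...) if you prefer the Requests library.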

Is it possible to use gzip compression with Server-Sent Events (SSE)?

I would like to know if it is possible to enable gzip compression
for Server-Sent Events (SSE ; Content-Type: text/event-stream).
It seems it is possible, according to this book:
http://chimera.labs.oreilly.com/books/1230000000545/ch16.html
But I can't find any example of SSE with gzip compression. I tried to
send gzipped messages with the response header field
Content-Encoding set to "gzip" without success.
For experimenting around SSE, I am testing a small web application
made in Python with the bottle framework + gevent ; I am just running
the bottle WSGI server:
@bottle.get('/data_stream')
def stream_data():
    bottle.response.content_type = "text/event-stream"
    bottle.response.add_header("Connection", "keep-alive")
    bottle.response.add_header("Cache-Control", "no-cache")
    bottle.response.add_header("Content-Encoding", "gzip")
    while True:
        # new_data is a gevent AsyncResult object,
        # .get() just returns a data string when new
        # data is available
        data = new_data.get()
        yield zlib.compress("data: %s\n\n" % data)
        #yield "data: %s\n\n" % data
The code without compression (last line, commented) and without gzip
content-encoding header field works like a charm.
EDIT: thanks to the reply and to this other question: Python: Creating a streaming gzip'd file-like?, I managed to solve the problem:
@bottle.route("/stream")
def stream_data():
    compressed_stream = zlib.compressobj()
    bottle.response.content_type = "text/event-stream"
    bottle.response.add_header("Connection", "keep-alive")
    bottle.response.add_header("Cache-Control", "no-cache, must-revalidate")
    bottle.response.add_header("Content-Encoding", "deflate")
    bottle.response.add_header("Transfer-Encoding", "chunked")
    while True:
        data = new_data.get()
        yield compressed_stream.compress("data: %s\n\n" % data)
        yield compressed_stream.flush(zlib.Z_SYNC_FLUSH)
TL;DR: If the requests are not cached, you likely want to use zlib and declare Content-Encoding to be 'deflate'. That change alone should make your code work.
If you declare Content-Encoding to be gzip, you need to actually use gzip. They are based on the same compression algorithm, but gzip has some extra framing. This works, for example:
import gzip
import io
from bottle import response, route
@route('/')
def get_data():
    response.add_header("Content-Encoding", "gzip")
    # GzipFile writes bytes, so buffer them in a BytesIO
    s = io.BytesIO()
    with gzip.GzipFile(fileobj=s, mode='w') as f:
        f.write(b'Hello World')
    return s.getvalue()
That only really makes sense if you use an actual file as a cache, though.
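For a long-lived event stream, a gzip-framed variant can also be sketched with a single compressor object that is flushed after every event. This is only an illustration under two assumptions: that wbits with 16 added selects gzip framing in zlib, and that the client decodes the stream incrementally. The names gzip_event_stream and decode_chunks are invented for this example.

```python
import zlib


def gzip_event_stream(events):
    """Yield one gzip-framed chunk per server-sent event."""
    # One compressor for the whole stream; MAX_WBITS | 16 selects gzip framing
    compressor = zlib.compressobj(6, zlib.DEFLATED, zlib.MAX_WBITS | 16)
    for data in events:
        message = ("data: %s\n\n" % data).encode("utf-8")
        # Z_SYNC_FLUSH forces out everything compressed so far,
        # so the receiver can decode this event immediately
        yield compressor.compress(message) + compressor.flush(zlib.Z_SYNC_FLUSH)


def decode_chunks(chunks):
    """Decode the chunks incrementally, the way a streaming client would."""
    decompressor = zlib.decompressobj(zlib.MAX_WBITS | 16)
    return [decompressor.decompress(chunk).decode("utf-8") for chunk in chunks]


chunks = list(gzip_event_stream(["first", "second"]))
decoded = decode_chunks(chunks)
```

Because the stream never ends, the gzip trailer is never written; each chunk is still decodable on its own arrival thanks to the sync flush.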
There's also middleware you can use so you don't need to worry about gzipping responses for each of your methods. Here's one I used recently.
https://code.google.com/p/ibkon-wsgi-gzip-middleware/
This is how I used it (I'm using bottle.py with the gevent server)
from gzip_middleware import Gzipper
import bottle
app = Gzipper(bottle.app())
bottle.run(app=app, host='0.0.0.0', port=8080, server='gevent')
For this particular library, you can choose which types of responses to compress by modifying the DEFAULT_COMPRESSABLES variable, for example:
DEFAULT_COMPRESSABLES = set(['text/plain', 'text/html', 'text/css',
                             'application/json', 'application/x-javascript', 'text/xml',
                             'application/xml', 'application/xml+rss', 'text/javascript',
                             'image/gif'])
All responses go through the middleware and get gzipped without modifying your existing code. By default, it compresses responses whose content-type belongs to DEFAULT_COMPRESSABLES and whose content-length is greater than 200 characters.
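To show roughly what such a middleware does, here is a simplified sketch of the general technique. It is not the code of the library linked above: the class name SimpleGzipMiddleware, the type set and the threshold are assumptions for illustration. It buffers the whole response, so it does not suit streaming responses or apps that use the WSGI write() callable.

```python
import gzip

# Illustrative values; a real library's defaults may differ
COMPRESSIBLE_TYPES = {"text/plain", "text/html", "application/json"}
MIN_LENGTH = 200


class SimpleGzipMiddleware:
    """Buffer the wrapped app's response and gzip it when it qualifies."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Delay the real start_response until the body size is known
            captured["status"] = status
            captured["headers"] = list(headers)

        result = self.app(environ, capture)
        try:
            body = b"".join(result)
        finally:
            if hasattr(result, "close"):
                result.close()

        headers = captured["headers"]
        content_type = ""
        for name, value in headers:
            if name.lower() == "content-type":
                content_type = value.split(";")[0].strip()

        client_accepts = "gzip" in environ.get("HTTP_ACCEPT_ENCODING", "")
        if (client_accepts and content_type in COMPRESSIBLE_TYPES
                and len(body) > MIN_LENGTH):
            body = gzip.compress(body)
            # Replace the length header so it matches the compressed body
            headers = [(n, v) for n, v in headers
                       if n.lower() != "content-length"]
            headers.append(("Content-Encoding", "gzip"))
            headers.append(("Content-Length", str(len(body))))

        start_response(captured["status"], headers)
        return [body]
```

Wrapping works the same way as in the snippet above, for example app = SimpleGzipMiddleware(bottle.app()).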

Multipart POST request Google Glass

I am trying to add an attachment to my timeline with the multipart encoding. I've been doing something like the following:
req = urllib2.Request(url, data={body}, headers={header})
resp = urllib2.urlopen(req).read()
And it has been working fine for application/json. However, I'm not sure how to format the body for multipart. I've also tried some libraries, requests and poster, and they both return 401 for some reason.
How can I make a multipart request, either with a library (preferably a plug-in to urllib2) or with urllib2 itself (like the block of code above)?
EDIT:
I also would like this to be able to support the mirror-api "video/vnd.google-glass.stream-url" from https://developers.google.com/glass/timeline
For the request using poster library here is the code:
register_openers()
datagen, headers = multipart_encode({'image1':open('555.jpg', 'rb')})
Here it is using requests:
headers = {'Authorization' : 'Bearer %s' % access_token}
files = {'file': open('555.jpg', 'rb')}
r = requests.post(timeline_url,files=files, headers=headers)
Returns 401 -> header
Thank you
There is a working Curl example of a multipart request that uses the streaming video url feature here:
Previous Streaming Video Answer with Curl example
It does exactly what you are trying to do, but with Curl. You just need to adapt that to your technology stack.
The 401 you are receiving is going to prevent you even if you use the right syntax. A 401 response indicates you do not have authorization to modify the timeline. Make sure you can insert a simple hello world text only card first. Once you get past the 401 error and get into parsing errors and format issues the link above should be everything you need.
One last note: you don't need urllib2. The Mirror API team dropped a gem of a feature in our laps, and we don't need to bother with getting the binary of the video. Check the example linked above: I only provided a URL in the multipart payload, so there is no need to stream the binary data! Google does all the magic for us in XE6 and above.
Thanks Team Glass!
I think you will find this is simpler than you think. Try out the curl example and, when you get that far, watch out for incompatible video types. If you don't use a compatible type, it will appear not to work in Glass, so make sure your video is encoded in a Glass-friendly format.
Good luck!
How to add an attachment to a timeline with multipart encoding:
The easiest way to add attachments with multipart encoding to a timeline is to use the Google APIs Client Library for Python. With this library, you can simply use the following example code provided in the Mirror API timeline insert documentation (click the Python tab under Examples).
import io
from apiclient import errors
from apiclient.discovery import build
from apiclient.http import MediaIoBaseUpload
service = build('mirror', 'v1')
def insert_timeline_item(service, text, content_type=None, attachment=None,
                         notification_level=None):
    timeline_item = {'text': text}
    media_body = None
    if notification_level:
        timeline_item['notification'] = {'level': notification_level}
    if content_type and attachment:
        # Wrap the raw attachment bytes so the client library can upload them
        media_body = MediaIoBaseUpload(
            io.BytesIO(attachment), mimetype=content_type, resumable=True)
    try:
        return service.timeline().insert(
            body=timeline_item, media_body=media_body).execute()
    except errors.HttpError as error:
        print('An error occurred: %s' % error)
You cannot actually use requests or poster to automatically encode your data, because these libraries encode things in multipart/form-data whereas Mirror API wants things in multipart/related.
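For illustration, a multipart/related body can also be assembled by hand. The sketch below follows the generic MIME multipart layout (a JSON metadata part followed by a binary part, separated by a boundary). The helper name build_multipart_related is made up, and whether the Mirror API accepts exactly this layout has not been verified here.

```python
import json
import uuid


def build_multipart_related(metadata, attachment, attachment_type):
    """Return (content_type_header, body_bytes) for a two-part related message."""
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")

    parts = [
        # Part 1: the JSON metadata
        (b"Content-Type: application/json; charset=UTF-8",
         json.dumps(metadata).encode("utf-8")),
        # Part 2: the raw attachment bytes
        (b"Content-Type: " + attachment_type.encode("ascii")
         + b"\r\nContent-Transfer-Encoding: binary",
         attachment),
    ]

    body = b""
    for part_headers, payload in parts:
        body += delimiter + b"\r\n" + part_headers + b"\r\n\r\n" + payload + b"\r\n"
    # The closing delimiter carries two extra dashes
    body += delimiter + b"--\r\n"

    content_type = 'multipart/related; boundary="%s"' % boundary
    return content_type, body


content_type, body = build_multipart_related(
    {"text": "hello"}, b"\xff\xd8fake-jpeg-bytes", "image/jpeg")
```

The returned content_type goes into the Content-Type request header and body is sent as the raw request data.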
How to debug your current error code:
Your code gives a 401, which is an authorization error. This means you are probably failing to include your access token with your requests. To include an access token, set the Authorization header to Bearer YOUR_ACCESS_TOKEN in your request (documentation here).
If you do not know how to get an access token, the Glass developer docs have a page here explaining how to obtain one. Make sure that your authorization process requested the following scope for multipart upload, otherwise you will get a 403 error: https://www.googleapis.com/auth/glass.timeline
This is how I did it and how the python client library does it.
from email.mime.multipart import MIMEMultipart
from email.mime.nonmultipart import MIMENonMultipart
from email.mime.image import MIMEImage
mime_root = MIMEMultipart('related', '===============xxxxxxxxxxxxx==')
headers = {'Content-Type': 'multipart/related; '
                           'boundary="%s"' % mime_root.get_boundary(),
           'Authorization': 'Bearer %s' % access_token}
setattr(mime_root, '_write_headers', lambda self: None)
# Create the metadata part of the MIME message
mime_text = MIMENonMultipart(*['application', 'json'])
mime_text.set_payload('{"text": "waddup doe!"}')
print("Attaching the json")
mime_root.attach(mime_text)
if method == 'Image':
    # Attach an image
    file_upload = open('555.jpg', 'rb')
    mime_image = MIMENonMultipart(*['image', 'jpeg'])
    # Add the required header
    mime_image['Content-Transfer-Encoding'] = 'binary'
    # Read the file as binary
    mime_image.set_payload(file_upload.read())
    print("attaching the jpeg")
    mime_root.attach(mime_image)
elif method == 'Video':
    mime_video = MIMENonMultipart(*['video', 'vnd.google-glass.stream-url'])
    # Add the payload
    mime_video.set_payload('https://dl.dropboxusercontent.com/u/6562706/sweetie-wobbly-cat-720p.mp4')
    mime_root.attach(mime_video)
Mark Scheel I used your video for testing purposes :) Thank you.

Upload a large XML file with Python Requests library

I'm trying to replace curl with Python & the requests library. With curl, I can upload a single XML file to a REST server with the curl -T option. I have been unable to do the same with the requests library.
A basic scenario works:
payload = '<person test="10"><first>Carl</first><last>Sagan</last></person>'
headers = {'content-type': 'application/xml'}
r = requests.put(url, data=payload, headers=headers, auth=HTTPDigestAuth("*", "*"))
When I change payload to a bigger string by opening an XML file, the .put method hangs (I use the codecs library to get a proper unicode string). For example, with a 66KB file:
xmlfile = codecs.open('trb-1996-219.xml', 'r', 'utf-8')
headers = {'content-type': 'application/xml'}
content = xmlfile.read()
r = requests.put(url, data=content, headers=headers, auth=HTTPDigestAuth("*", "*"))
I've been looking into using the multipart option (files), but the server doesn't seem to like that.
So I was wondering if there is a way to simulate curl -T behaviour in Python requests library.
UPDATE 1:
The program hangs in TextMate, but throws a UnicodeEncodeError on the command line. That seems to be the problem. So the question becomes: is there a way to send unicode strings to a server with the requests library?
UPDATE 2:
Thanks to the comment of Martijn Pieters the UnicodeEncodeError went away, but a new issue turned up.
With a literal (ASCII) XML string, logging shows the following lines:
2012-11-11 15:55:05,154 INFO Starting new HTTP connection (1): my.ip.address
2012-11-11 15:55:05,294 DEBUG "PUT /v1/documents?uri=/example/test.xml HTTP/1.1" 401 211
2012-11-11 15:55:05,430 DEBUG "PUT /v1/documents?uri=/example/test.xml HTTP/1.1" 201 0
Seems the server always bounces the first authentication attempt (?) but then accepts the second one.
With a file object (open('trb-1996-219.xml', 'rb')) passed to data, the logfile shows:
2012-11-11 15:50:54,309 INFO Starting new HTTP connection (1): my.ip.address
2012-11-11 15:50:55,105 DEBUG "PUT /v1/documents?uri=/example/test.xml HTTP/1.1" 401 211
2012-11-11 15:51:25,603 WARNING Retrying (0 attempts remain) after connection broken by 'BadStatusLine("''",)': /v1/documents?uri=/example/test.xml
So, first attempt is blocked as before, but no second attempt is made.
According to Martijn Pieters (below), the second issue can be explained by a faulty server (empty line).
I will look into this, but if someone has a workaround (apart from using curl) I wouldn't mind hearing it.
And I am still surprised that the requests library behaves so differently for small string and file object. Isn't the file object serialized before it gets to the server anyway?
To PUT large files, don't read them into memory. Simply pass the file as the data keyword:
xmlfile = open('trb-1996-219.xml', 'rb')
headers = {'content-type': 'application/xml'}
r = requests.put(url, data=xmlfile, headers=headers, auth=HTTPDigestAuth("*", "*"))
Moreover, you were opening the file as unicode (decoding it from UTF-8). As you'll be sending it to a remote server, you need raw bytes, not unicode values, so you should open the file in binary mode instead.
Digest authentication always requires you to make at least two requests to the server. The first request doesn't contain any authentication data. It fails with a 401 "Authorization required" response code and a digest challenge (called a nonce) to be used for hashing your password etc. (the exact details don't matter here). The challenge is then used to make a second request to the server containing your credentials hashed with it.
The problem lies in this two-step authentication: your large file was already sent with the first, unauthorized request (sent in vain), but by the second request the file object is already at the EOF position. Since the file size is also sent in the Content-Length header of the second request, the server waits for a file that will never be sent.
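The effect of a consumed file object can be reproduced without any server. In this small sketch, send is a stand-in invented for the illustration: it simply reads the stream to the end, as an HTTP client does when it transmits a request body. It is not how requests is implemented.

```python
import io


def send(body):
    """Stand-in for an HTTP client: consume the body stream to the end."""
    return body.read()


payload = b"<person><first>Carl</first><last>Sagan</last></person>"
xmlfile = io.BytesIO(payload)

first_attempt = send(xmlfile)    # goes out with the rejected (401) request
second_attempt = send(xmlfile)   # stream is at EOF, so nothing is left to send

xmlfile.seek(0)                  # rewinding makes the data available again
third_attempt = send(xmlfile)
```

The second attempt transmits nothing even though the declared length is still that of the full file, which matches the hang described above.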
You could solve it by using a requests Session and first making a simple request for authentication purposes (say a GET request). Then make a second PUT request containing the actual payload, using the same digest challenge from the first request.
sess = requests.Session()
sess.auth = HTTPDigestAuth("*", "*")
sess.get(url)
headers = {'content-type': 'application/xml'}
# Open in binary mode so that raw bytes are sent
with open('trb-1996-219.xml', 'rb') as xmlfile:
    sess.put(url, data=xmlfile, headers=headers)
I used requests in Python to upload an XML file with the following commands.
First, open the file with open() and read it:
file = open("PIR.xsd")
fragment = file.read()
file.close()
Copy the contents of the XML file into the payload of the request and post it:
payload = {'key':'PFAkrzjmuZR957','xmlFragment':fragment}
r = requests.post(URL,data=payload)
To check the HTML validation response:
print (r.text)
