How do I batch send a multipart HTTP POST with multiple URLs? - Python

I am speaking to the Gmail API and would like to batch the requests. They have a friendly guide for this here, https://developers.google.com/gmail/api/guides/batch, which suggests that I should be able to use multipart/mixed and include different URLs.
I am using Python and the Requests library, but am unsure how to issue requests to different URLs in one call. Answers like this one, How to send a "multipart/form-data" with requests in python?, don't mention an option for changing that part.
How do I do this?

Unfortunately, requests does not support multipart/mixed in their API. This has been suggested in several GitHub issues (#935 and #1081), but there are no updates on this for now. This also becomes quite clear if you search for "mixed" in the requests sources and get zero results :(
Now you have several options, depending on how much you want to make use of Python and 3rd-party libraries.
Google API Client
Now, the most obvious answer to this problem is to use the official Python API client that Google is providing here. It comes with a BatchHttpRequest class that can handle the batch requests that you need. This is documented in detail in this guide.
Essentially, you create a BatchHttpRequest object and add all your requests to it. The library will then put everything together (taken from the guide above):
batch = BatchHttpRequest()
batch.add(service.animals().list(), callback=list_animals)
batch.add(service.farmers().list(), callback=list_farmers)
batch.execute(http=http)
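Since the question is specifically about the Gmail API, here is a rough sketch of the same idea against Gmail with google-api-python-client (the credentials and message ids are placeholders you would supply yourself; method names are taken from the client library docs):
from googleapiclient.discovery import build

def handle_message(request_id, response, exception):
    # callback signature used by the client library
    if exception is None:
        print(response.get('snippet'))

# creds, msg_id_1 and msg_id_2 are placeholders
service = build('gmail', 'v1', credentials=creds)
batch = service.new_batch_http_request()
batch.add(service.users().messages().get(userId='me', id=msg_id_1),
          callback=handle_message)
batch.add(service.users().messages().get(userId='me', id=msg_id_2),
          callback=handle_message)
batch.execute()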
Now, if for whatever reason you cannot or will not use the official Google libraries you will have to build parts of the request body yourself.
requests + email.mime
As I already mentioned, requests does not officially support multipart/mixed. But that does not mean that we cannot "force" it. When creating a Request object, we can use the files parameter to provide multipart data.
files is a dictionary that accepts 4-tuple values of this format: (filename, file_object, content_type, headers). The filename can be empty. Now we need to convert a Request object into a file(-like) object. I wrote a small method that covers the basic examples from the Google example. It is partly inspired by the internal methods that Google uses in their Python library:
import requests
from email.mime.multipart import MIMEMultipart
from email.mime.nonmultipart import MIMENonMultipart

BASE_URL = 'http://www.googleapis.com/batch'

def serialize_request(request):
    '''Returns the string representation of the request'''
    prepared = request.prepare()
    # write first line (method + uri), stripping the batch endpoint prefix
    if request.url.startswith(BASE_URL):
        mime_body = '%s %s\r\n' % (request.method, request.url[len(BASE_URL):])
    else:
        mime_body = '%s %s\r\n' % (request.method, request.url)
    # write headers (if any)
    for key, value in prepared.headers.items():
        mime_body += '%s: %s\r\n' % (key, value)
    # write body (if any); prepared.body may be bytes, e.g. for json= payloads
    if getattr(prepared, 'body', None) is not None:
        body = prepared.body
        if isinstance(body, bytes):
            body = body.decode('utf-8')
        mime_body += '\r\n' + body + '\r\n'
    return mime_body.encode('utf-8').lstrip()
This method will transform a requests.Request object into a UTF-8 encoded string that can later be used as a payload for a MIMENonMultipart object, i.e. for the individual multipart parts.
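For example, serializing one of the farm requests from the guide gives roughly this (a quick local sanity check, expected output shown as comments):
req = requests.Request('GET', BASE_URL + '/farm/v1/animals',
                       headers={'If-None-Match': '"etag/animals"'})
print(serialize_request(req).decode('utf-8'))
# roughly:
# GET /farm/v1/animals
# If-None-Match: "etag/animals"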
Now, in order to generate the actual batch request, we first need to squeeze a list of (Google API) requests into a files dictionary for the requests lib. The following method will take a list of requests.Request objects, transform each into a MIMENonMultipart, and then return a dictionary that complies with the structure of the files dictionary:
import uuid

def prepare_requests(request_list):
    message = MIMEMultipart('mixed')
    output = {}
    # thanks, Google. (Prevents the writing of MIME headers we don't need)
    setattr(message, '_write_headers', lambda self: None)
    for request in request_list:
        message_id = new_id()
        sub_message = MIMENonMultipart('application', 'http')
        sub_message['Content-ID'] = message_id
        del sub_message['MIME-Version']
        sub_message.set_payload(serialize_request(request))
        # remove first line (from ...)
        sub_message = str(sub_message)
        sub_message = sub_message[sub_message.find('\n'):]
        output[message_id] = ('', str(sub_message), 'application/http', {})
    return output

def new_id():
    # I am not sure how these work exactly, so you will have to adapt this code
    return '<item%s:12930812#barnyard.example.com>' % str(uuid.uuid4())[-4:]
Finally, we need to change the Content-Type from multipart/form-data to multipart/mixed and also remove the Content-Disposition and Content-Type headers from each request part. These were generated by requests and cannot be overridden through the files dictionary.
import re

def finalize_request(prepared):
    # change to multipart/mixed
    old = prepared.headers['Content-Type']
    prepared.headers['Content-Type'] = old.replace('multipart/form-data', 'multipart/mixed')
    # remove the Content-Disposition/Content-Type headers that requests generated
    # at the start of each part (byte-string pattern, since prepared.body is
    # bytes when files= is used)
    prepared.body = re.sub(
        b'\r\nContent-Disposition: form-data; name=.+\r\nContent-Type: application/http\r\n',
        b'', prepared.body)
I have tried my best to test this with the Google Example from the Batching guide:
sheep = {
    "animalName": "sheep",
    "animalAge": "5",
    "peltColor": "green"
}
commands = []
commands.append(requests.Request('GET', 'http://www.googleapis.com/batch/farm/v1/animals/pony'))
commands.append(requests.Request('PUT', 'http://www.googleapis.com/batch/farm/v1/animals/sheep', json=sheep, headers={'If-Match': '"etag/sheep"'}))
commands.append(requests.Request('GET', 'http://www.googleapis.com/batch/farm/v1/animals', headers={'If-None-Match': '"etag/animals"'}))
files = prepare_requests(commands)
r = requests.Request('POST', 'http://www.googleapis.com/batch', files=files)
prepared = r.prepare()
finalize_request(prepared)
s = requests.Session()
s.send(prepared)
And the resulting request should be close enough to what Google is providing in their Batching guide:
POST http://www.googleapis.com/batch
Content-Length: 1006
Content-Type: multipart/mixed; boundary=a21beebd15b74be89539b137bbbc7293

--a21beebd15b74be89539b137bbbc7293
Content-Type: application/http
Content-ID: <item8065:12930812#barnyard.example.com>

GET /farm/v1/animals
If-None-Match: "etag/animals"

--a21beebd15b74be89539b137bbbc7293
Content-Type: application/http
Content-ID: <item5158:12930812#barnyard.example.com>

GET /farm/v1/animals/pony

--a21beebd15b74be89539b137bbbc7293
Content-Type: application/http
Content-ID: <item0ec9:12930812#barnyard.example.com>

PUT /farm/v1/animals/sheep
Content-Length: 63
Content-Type: application/json
If-Match: "etag/sheep"

{"animalAge": "5", "animalName": "sheep", "peltColor": "green"}

--a21beebd15b74be89539b137bbbc7293--
In the end, I highly recommend the official Google library but if you cannot use it, you will have to improvise a bit :)
Disclaimer: I haven't actually tried to send this request to the Google API endpoints because the authentication procedure is too much of a hassle. I was just trying to get as close as possible to the HTTP request that is described in the Batching guide. There might be some problems with \r and \n line endings, depending on how strict the Google endpoints are.
Sources:
requests github (especially issues #935 and #1081)
requests API documentation
Google APIs for Python

Related

Send data through POST request

I have been using sockets with Python for a while, and I'm trying to understand why this POST, which should send some data in fields data1 and data2, does not work.
POST /method.php HTTP/1.1\r\nHost: localhost\r\nContent-Type: multipart/form-data\r\n\r\ndata1=something&data2= otherthing\r\n\r\n
What is the problem with this request?
There are several things wrong with your request:
POST /method.php HTTP/1.1
Host: localhost
Content-Type: multipart/form-data
data1=something&data2= otherthing
First, whenever a body is used within an HTTP request, the length of the body must be known. This is typically done by giving the length up front with a Content-Length header, although chunked transfer encoding may be used if the full length is not known in advance. Your request does neither, which makes it an invalid HTTP request.
Additionally, you claim a Content-Type of multipart/form-data although your body is not of this type. With multipart/form-data, your body would consist of several MIME parts separated by a text boundary, and this boundary would need to be declared in your Content-Type header. The correct type for the body you show would instead be application/x-www-form-urlencoded.
Even with application/x-www-form-urlencoded the body is partly wrong. This type of body should consist only of key=value pairs concatenated with &; there should be neither a space after a key (as you have after data2=) nor new lines after the end of the data.
After fixing all these problems, you should probably send the following request:
body = "data1=something&data2=otherthing"
request = ("POST /method.php HTTP/1.1\r\n" + \
"Host: localhost\r\n" + \
"Content-Type: application/x-www-form-urlencoded\r\n" + \
"Content-Length: %d\r\n" + \
"\r\n%s") % (len(body),body)
But once you have sent this request, the trouble continues, since reading the response correctly is complex too. Generally I recommend not writing your own HTTP handling unless you really know what you are doing, but using existing libraries instead. While HTTP might look simple when just looking at a few example requests, it is far more complex than it initially appears. And while your code might seem to work against specific servers, it might fail with other servers.
It might be easier to use the requests library so your code would look something like this:
import requests

# Data
data = {
    'data1': 'something',
    'data2': 'otherthing'
}

# Get response from server; there is no need to set the Content-Type header
# yourself, requests sends application/x-www-form-urlencoded for form data
response = requests.post('http://localhost/', data=data)

# If you care about the response
print(response.json())
You can also send files and a whole lot of other stuff
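For instance, a minimal file upload with the files parameter could look like this (the URL and field name are placeholders):
import requests

with open('report.xml', 'rb') as f:
    files = {'file': ('report.xml', f, 'application/xml')}
    response = requests.post('http://localhost/upload', files=files)

print(response.status_code)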
Have you tried using the Requests library instead? An example of a POST request is below:
import requests

data1 = "something"
data2 = "otherthing"

session_requests = requests.session()
# requests encodes the form fields and sets the Content-Type header itself
result = session_requests.post("http://localhost/",
                               data={"data1": data1, "data2": data2})

Compressing request body with python-requests?

(This question is not about transparent decompression of gzip-encoded responses from a web server; I know that requests handles that automatically.)
Problem
I'm trying to POST a file to a RESTful web service. Obviously, requests makes this pretty easy to do:
files = dict(data=(fn, file))
response = session.post(endpoint_url, files=files)
In this case, my file is in a really highly-compressible format (yep, XML) so I'd like to make sure that the request body is compressed.
The server claims to accept gzip encoding (Accept-Encoding: gzip in response headers), so I should be able to gzip the whole request body, right?
Attempted solution
Here's my attempt to make this work: I first construct the request and prepare it, then I go into the PreparedRequest object, yank out the body, run it through gzip, and put it back. (Oh, and don't forget to update the Content-Length and Content-Encoding headers.)
files = dict(data=(fn, file))
request = requests.Request('POST', endpoint_url, files=files)
prepped = session.prepare_request(request)
with NamedTemporaryFile(delete=True) as gzfile:
    gzip.GzipFile(fileobj=gzfile, mode="wb").write(prepped.body)
    prepped.headers['Content-Length'] = gzfile.tell()
    prepped.headers['Content-Encoding'] = 'gzip'
    gzfile.seek(0, 0)
    prepped.body = gzfile.read()
    response = session.send(prepped)
Unfortunately, the server is not cooperating and returns 500 Internal Server Error. Perhaps it doesn't really accept gzip-encoded requests?
Or perhaps there is a mistake in my approach? It seems rather convoluted. Is there an easier way to do request body compression with python-requests?
EDIT: Fixed (3) and (5) from @sigmavirus24's answer (these were basically just artifacts I'd overlooked in simplifying the code to post it here).
Or perhaps there is a mistake in my approach?
I'm unsure how you arrived at your approach, frankly, but there's certainly a simpler way of doing this.
First, a few things:
1. The files parameter constructs a multipart/form-data body. So you're compressing something that the server potentially has no clue about.
2. Content-Encoding and Transfer-Encoding are two very different things. You want Transfer-Encoding here.
3. You don't need to set a suffix on your NamedTemporaryFile.
4. Since you didn't explicitly mention that you're trying to compress a multipart/form-data request, I'm going to assume that you don't actually want to do that.
5. Your call to session.Request (which I assume should be requests.Request) is missing a method, i.e., it should be: requests.Request('POST', endpoint_url, ...)
With those out of the way, here's how I would do this:
# Assuming `file` is a file-like obj
with NamedTemporaryFile(delete=True) as gzfile:
    gzip.GzipFile(fileobj=gzfile, mode="wb").write(file.read())
    headers = {'Content-Length': str(gzfile.tell()),
               'Transfer-Encoding': 'gzip'}
    gzfile.seek(0, 0)
    response = session.post(endpoint_url, data=gzfile,
                            headers=headers)
Assuming that file has the xml content in it and all you meant was to compress it, this should work for you. You probably want to set a Content-Type header though, for example, you'd just do
headers = {'Content-Length': gzfile.tell(),
           'Content-Type': 'application/xml',  # or 'text/xml'
           'Transfer-Encoding': 'gzip'}
The Transfer-Encoding tells the server that the request is being compressed only in transit and it should uncompress it. The Content-Type tells the server how to handle the content once the Transfer-Encoding has been handled. 
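As a side note (not part of the answer above): many servers only understand Content-Encoding for compressed request bodies, which is what the question originally tried. If Transfer-Encoding does not work against your endpoint, a minimal sketch of the Content-Encoding variant might look like this (the endpoint URL is a placeholder; gzip.compress requires Python 3):
import gzip
import requests

with open('data.xml', 'rb') as f:
    compressed = gzip.compress(f.read())

headers = {'Content-Type': 'application/xml',
           'Content-Encoding': 'gzip'}
# requests sets Content-Length itself for a bytes body
response = requests.post('http://example.com/endpoint', data=compressed,
                         headers=headers)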
I had a question that was marked as an exact duplicate; I was concerned with both ends of the transaction.
The code from sigmavirus24 wasn't a direct cut-and-paste fix, but it was the inspiration for this version.
Here's how my solution ended up looking:
Sending from the Python end:
import json
import requests
import StringIO
import gzip

url = "http://localhost:3000"
headers = {"Content-Type": "application/octet-stream"}
data = [{"key": 1, "otherKey": "2"},
        {"key": 3, "otherKey": "4"}]
payload = json.dumps(data)

out = StringIO.StringIO()
with gzip.GzipFile(fileobj=out, mode="w") as f:
    f.write(payload)

r = requests.post(url + "/zipped", data=out.getvalue(), headers=headers)
Receiving at the Express end:
var zlib = require("zlib");
var rawParser = bodyParser.raw({type: '*/*'});
app.post('/zipped', rawParser, function(req, res) {
zlib.gunzip(req.body, function(err, buf) {
if(err){
console.log("err:", err );
} else{
console.log("in the inflate callback:",
buf,
"to string:", buf.toString("utf8") );
}
});
res.status(200).send("I'm in ur zipped route");
});
There's a gist here with more verbose logging included. This version doesn't have any safety or checking built in either.

Is it possible to use gzip compression with Server-Sent Events (SSE)?

I would like to know if it is possible to enable gzip compression for Server-Sent Events (SSE; Content-Type: text/event-stream). It seems it is possible, according to this book: http://chimera.labs.oreilly.com/books/1230000000545/ch16.html
But I can't find any example of SSE with gzip compression. I tried to send gzipped messages with the response header field Content-Encoding set to "gzip", without success.
For experimenting with SSE, I am testing a small web application made in Python with the bottle framework + gevent; I am just running the bottle WSGI server:
@bottle.get('/data_stream')
def stream_data():
    bottle.response.content_type = "text/event-stream"
    bottle.response.add_header("Connection", "keep-alive")
    bottle.response.add_header("Cache-Control", "no-cache")
    bottle.response.add_header("Content-Encoding", "gzip")
    while True:
        # new_data is a gevent AsyncResult object,
        # .get() just returns a data string when new
        # data is available
        data = new_data.get()
        yield zlib.compress("data: %s\n\n" % data)
        #yield "data: %s\n\n" % data
The code without compression (last line, commented out) and without the gzip Content-Encoding header field works like a charm.
EDIT: thanks to the reply and to this other question: Python: Creating a streaming gzip'd file-like?, I managed to solve the problem:
#bottle.route("/stream")
def stream_data():
compressed_stream = zlib.compressobj()
bottle.response.content_type = "text/event-stream"
bottle.response.add_header("Connection", "keep-alive")
bottle.response.add_header("Cache-Control", "no-cache, must-revalidate")
bottle.response.add_header("Content-Encoding", "deflate")
bottle.response.add_header("Transfer-Encoding", "chunked")
while True:
data = new_data.get()
yield compressed_stream.compress("data: %s\n\n" % data)
yield compressed_stream.flush(zlib.Z_SYNC_FLUSH)
TL;DR: If the requests are not cached, you likely want to use zlib and declare Content-Encoding to be 'deflate'. That change alone should make your code work.
If you declare Content-Encoding to be gzip, you need to actually use gzip. They are based on the same compression algorithm, but gzip has some extra framing. This works, for example:
import gzip
import StringIO
from bottle import response, route

@route('/')
def get_data():
    response.add_header("Content-Encoding", "gzip")
    s = StringIO.StringIO()
    with gzip.GzipFile(fileobj=s, mode='w') as f:
        f.write('Hello World')
    return s.getvalue()
That only really makes sense if you use an actual file as a cache, though.
There's also middleware you can use so you don't need to worry about gzipping responses for each of your methods. Here's one I used recently.
https://code.google.com/p/ibkon-wsgi-gzip-middleware/
This is how I used it (I'm using bottle.py with the gevent server)
from gzip_middleware import Gzipper
import bottle

app = Gzipper(bottle.app())
bottle.run(app=app, host='0.0.0.0', port=8080, server='gevent')
For this particular library, you can set which types of responses you want to compress by modifying the DEFAULT_COMPRESSABLES variable, for example:
DEFAULT_COMPRESSABLES = set(['text/plain', 'text/html', 'text/css',
                             'application/json', 'application/x-javascript',
                             'text/xml', 'application/xml',
                             'application/xml+rss', 'text/javascript',
                             'image/gif'])
All responses go through the middleware and get gzipped without modifying your existing code. By default, it compresses responses whose content-type belongs to DEFAULT_COMPRESSABLES and whose content-length is greater than 200 characters.
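Tying this back to the SSE question: text/event-stream is not in that default set, so, assuming DEFAULT_COMPRESSABLES really is a plain module-level set as the example above suggests, you would presumably have to add it before wrapping the app:
import gzip_middleware

# hypothetical tweak, not verified against the library
gzip_middleware.DEFAULT_COMPRESSABLES.add('text/event-stream')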

Flask-RESTful: Using GET to download a file with REST

I am trying to write a file sharing application that exposes a REST interface.
The library I am using, Flask-RESTful, only supports returning JSON by default. Obviously, attempting to serve binary data over JSON is not a good idea at all.
What is the most "RESTful" way of serving up binary data through a GET method? It appears possible to extend Flask-RESTful to support returning different data representations besides JSON but the documentation is scarce and I'm not sure if it's even the best approach.
The approach suggested in the Flask-RESTful documentation is to declare our supported representations on the Api object so that it can support other mediatypes. The mediatype we are looking for is application/octet-stream.
First, we need to write a representation function:
from flask import Flask, send_file, safe_join
from flask_restful import Api

app = Flask(__name__)
api = Api(app)

@api.representation('application/octet-stream')
def output_file(data, code, headers):
    filepath = safe_join(data["directory"], data["filename"])
    response = send_file(
        filename_or_fp=filepath,
        mimetype="application/octet-stream",
        as_attachment=True,
        attachment_filename=data["filename"]
    )
    return response
What this representation function does is convert the data, code, and headers our method returns into a Response object with mimetype application/octet-stream. Here we use the send_file function to construct this Response object.
Our GET method can be something like:
from flask_restful import Resource
class GetFile(Resource):
def get(self, filename):
return {
"directory": <Our file directory>,
"filename": filename
}
And that's all the coding we need. When sending the GET request, we need to set the Accept header to application/octet-stream so that our API will call the representation function; otherwise it will return JSON, as it does by default.
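For example, a client call against this resource might look like this (host, port and route are hypothetical):
import requests

r = requests.get('http://localhost:5000/files/report.pdf',
                 headers={'Accept': 'application/octet-stream'})
with open('report.pdf', 'wb') as f:
    f.write(r.content)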
There's an XML example on GitHub.
I know this question was asked 7 years ago, so it probably doesn't matter any more to @Ayrx. Hope it helps whoever drops by.
As long as you're setting the Content-Type header accordingly and respecting the Accept header sent by the client, you're free to return any format you want. You can just have a view that returns your binary data with the application/octet-stream content type.
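A minimal sketch of that idea with plain Flask, not tied to Flask-RESTful (the route and directory are placeholders):
from flask import Flask, send_from_directory

app = Flask(__name__)

@app.route('/download/<path:filename>')
def download(filename):
    # send_from_directory guards against path traversal and
    # sets Content-Type / Content-Disposition for us
    return send_from_directory('/srv/files', filename,
                               mimetype='application/octet-stream',
                               as_attachment=True)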
After a lot of trial and error, including hours of browsing, here is how to make the Response class act as a single-responsibility downloader:
import json

from flask import make_response
from flask_restful import Resource

class DownloadResource(Resource):
    def get(self):
        item_list = dbmodel.query.all()
        item_list = [item.image for item in item_list]
        data = json.dumps({'items': item_list})
        response = make_response(data)
        response.headers['Content-Type'] = 'text/json'
        response.headers['Content-Disposition'] = 'attachment; filename=selected_items.json'
        return response
Change your filename and content type to support the format you want.
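For instance, a hypothetical CSV variant of the same resource would only change the body and those two headers (reusing dbmodel and the imports from the snippet above):
import csv
import StringIO  # io.StringIO on Python 3

class DownloadCsvResource(Resource):
    def get(self):
        item_list = dbmodel.query.all()
        out = StringIO.StringIO()
        writer = csv.writer(out)
        writer.writerow(['image'])
        for item in item_list:
            writer.writerow([item.image])
        response = make_response(out.getvalue())
        response.headers['Content-Type'] = 'text/csv'
        response.headers['Content-Disposition'] = 'attachment; filename=selected_items.csv'
        return response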

Multipart POST request Google Glass

I am trying to add an attachment to my timeline with the multipart encoding. I've been doing something like the following:
req = urllib2.Request(url,data={body}, header={header})
resp = urllib2.urlopen(req).read()
And it has been working fine for application/json. However, I'm not sure how to format the body for multipart. I've also tried some libraries, requests and poster, and they both return 401 for some reason.
How can I make a multipart request either with a library (preferably a plug-in to urllib2) or with urllib2 itself (like the block of code above)?
EDIT:
I also would like this to be able to support the mirror-api "video/vnd.google-glass.stream-url" from https://developers.google.com/glass/timeline
For the request using the poster library, here is the code:
register_openers()
datagen, headers = multipart_encode({'image1':open('555.jpg', 'rb')})
Here it is using requests:
headers = {'Authorization' : 'Bearer %s' % access_token}
files = {'file': open('555.jpg', 'rb')}
r = requests.post(timeline_url,files=files, headers=headers)
This also returns a 401, even with the Authorization header set.
Thank you
There is a working Curl example of a multipart request that uses the streaming video url feature here:
Previous Streaming Video Answer with Curl example
It does exactly what you are trying to do, but with Curl. You just need to adapt that to your technology stack.
The 401 you are receiving is going to prevent you even if you use the right syntax. A 401 response indicates you do not have authorization to modify the timeline. Make sure you can insert a simple hello world text only card first. Once you get past the 401 error and get into parsing errors and format issues the link above should be everything you need.
One last note: you don't need urllib2. The Mirror API team dropped a gem of a feature in our lap, so we don't need to be bothered with fetching the binary of the video. Check the example linked above: I only provided a URL in the multipart payload, no need to stream the binary data! Google does all the magic in XE6 and above for us.
Thanks Team Glass!
I think you will find this is simpler than you think. Try out the curl example, and watch out for incompatible video types when you get that far; if you don't use a compatible type it will appear not to work in Glass, so make sure your video is encoded in a Glass-friendly format.
Good luck!
How to add an attachment to a timeline with multipart encoding:
The easiest way to add attachments with multipart encoding to a timeline is to use the Google APIs Client Library for Python. With this library, you can simply use the following example code provided in the Mirror API timeline insert documentation (click the Python tab under Examples).
import io

from apiclient.discovery import build
from apiclient.http import MediaIoBaseUpload
from apiclient import errors

service = build('mirror', 'v1')

def insert_timeline_item(service, text, content_type=None, attachment=None,
                         notification_level=None):
    timeline_item = {'text': text}
    media_body = None
    if notification_level:
        timeline_item['notification'] = {'level': notification_level}
    if content_type and attachment:
        media_body = MediaIoBaseUpload(
            io.BytesIO(attachment), mimetype=content_type, resumable=True)
    try:
        return service.timeline().insert(
            body=timeline_item, media_body=media_body).execute()
    except errors.HttpError, error:
        print 'An error occurred: %s' % error
You cannot actually use requests or poster to automatically encode your data, because these libraries encode things in multipart/form-data whereas Mirror API wants things in multipart/related.
How to debug your current error code:
Your code gives a 401, which is an authorization error. This means you are probably failing to include your access token with your requests. To include an access token, set the Authorization header to Bearer YOUR_ACCESS_TOKEN in your request (documentation here).
If you do not know how to get an access token, the Glass developer docs have a page here explaining how to obtain one. Make sure that your authorization process requested the following scope for multipart upload, otherwise you will get a 403 error: https://www.googleapis.com/auth/glass.timeline
This is how I did it, and how the Python client library does it.
from email.mime.multipart import MIMEMultipart
from email.mime.nonmultipart import MIMENonMultipart
from email.mime.image import MIMEImage

mime_root = MIMEMultipart('related', '===============xxxxxxxxxxxxx==')
headers = {'Content-Type': 'multipart/related; '
                           'boundary="%s"' % mime_root.get_boundary(),
           'Authorization': 'Bearer %s' % access_token}
setattr(mime_root, '_write_headers', lambda self: None)

# Create the metadata part of the MIME
mime_text = MIMENonMultipart(*['application', 'json'])
mime_text.set_payload('{"text": "waddup doe!"}')
print "Attaching the json"
mime_root.attach(mime_text)

if method == 'Image':
    # Do image
    file_upload = open('555.jpg', 'rb')
    mime_image = MIMENonMultipart(*['image', 'jpeg'])
    # add the required header
    mime_image['Content-Transfer-Encoding'] = 'binary'
    # read the file as binary
    mime_image.set_payload(file_upload.read())
    print "attaching the jpeg"
    mime_root.attach(mime_image)
elif method == 'Video':
    mime_video = MIMENonMultipart(*['video', 'vnd.google-glass.stream-url'])
    # add the payload
    mime_video.set_payload('https://dl.dropboxusercontent.com/u/6562706/sweetie-wobbly-cat-720p.mp4')
    mime_root.attach(mime_video)
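One thing the snippet above does not show is actually sending the assembled body; a hedged sketch of that last step, reusing timeline_url and headers from the earlier snippets, could be:
import requests

# Sketch (not part of the original answer): the assembled MIME tree still has
# to be flattened and POSTed. Depending on your Python/email version you may
# have to strip the top-level MIME headers from the flattened string yourself.
body = mime_root.as_string()
resp = requests.post(timeline_url, data=body, headers=headers)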
Mark Scheel I used your video for testing purposes :) Thank you.
