My code goes like this:
self.testbed.init_blobstore_stub()
upload_url = blobstore.create_upload_url('/image')
upload_url = re.sub('^http://testbed\.example\.com', '', upload_url)
response = self.testapp.post(upload_url, params={
    'shopid': id,
    'description': 'JLo',
}, upload_files=[('file', imgPath)])
self.assertEqual(response.status_int, 200)
Why does it return a 404 error? For some reason, the upload path does not seem to exist at all.
You can't do this. I think the problem is that webtest (which I assume is where self.testapp comes from) doesn't work well with the testbed's blobstore functionality. You can find some info at this question.
My solution was to subclass unittest.TestCase and add the following methods:
def create_blob(self, contents, mime_type):
    """Since uploading blobs doesn't work in testing, create them this way."""
    fn = files.blobstore.create(mime_type=mime_type,
                                _blobinfo_uploaded_filename="foo.blt")
    with files.open(fn, 'a') as f:
        f.write(contents)
    files.finalize(fn)
    return files.blobstore.get_blob_key(fn)
def get_blob(self, key):
    return self.blobstore_stub.storage.OpenBlob(key).read()
You will also need the solution here.
For my tests where I would normally do a get or post to a blobstore handler, I instead call one of the two methods above. It is a bit hacky but it works.
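For illustration, a test might use these helpers like this (the handler path and payload are hypothetical):

def test_serve_image(self):
    # Create a blob directly instead of POSTing to an upload URL,
    # then exercise a download handler with its key.
    key = self.create_blob('fake image bytes', 'image/png')
    response = self.testapp.get('/image/%s' % key)
    self.assertEqual(response.status_int, 200)
    self.assertEqual(self.get_blob(key), 'fake image bytes')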
Another solution I am considering is to use Selenium's HtmlUnit driver. This would require the dev server to be running, but it should allow full testing of the blobstore and also of JavaScript (as a side benefit).
I think Kekito is right; you cannot POST to the upload_url directly.
But if you want to test the BlobstoreUploadHandler, you can fake the POST request it would normally receive from the blobstore in the following way. Assuming your handler is at /handler:
import email
...

def test_upload(self):
    blob_key = 'abcd'
    # The blobstore upload handler receives a multipart form request
    # containing the uploaded files. But instead of containing the actual
    # content, each file contains an 'email' message with some meta
    # information about the file, plus a blob-key that is the key to
    # get the blob from the blobstore.
    # See blobstore._get_upload_content.
    m = email.message.Message()
    m.add_header('Content-Type', 'image/png')
    m.add_header('Content-Length', '100')
    m.add_header('X-AppEngine-Upload-Creation', '2014-03-02 23:04:05.123456')
    # This needs to be valid base64-encoded content
    m.add_header('content-md5', 'd74682ee47c3fffd5dcd749f840fcdd4')
    payload = m.as_string()
    # The blob-key in the Content-Type is important
    params = [('file', webtest.forms.Upload('test.png', payload,
                                            'image/png; blob-key=' + blob_key))]
    self.testapp.post('/handler', params, content_type='blob-key')
I figured that out by digging into the blobstore code. The important bit is that the POST request the blobstore sends to the UploadHandler doesn't contain the file content. Instead, it contains an "email message" (well, information encoded the way it would be in an email) with metadata about the file (content-type, content-length, upload time and md5). It also contains a blob-key that can be used to retrieve the file from the blobstore.
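For reference, a minimal handler that the faked request above would exercise might look like this (the class name is hypothetical):

class Handler(blobstore_handlers.BlobstoreUploadHandler):
    def post(self):
        # get_uploads() parses the multipart body and builds BlobInfo
        # objects from the embedded metadata and blob-key.
        upload = self.get_uploads('file')[0]
        self.response.write(str(upload.key()))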
I recently deployed my Python GAE app from the development server, and my image upload function stopped working properly...
After a bit of testing, it seems that the get_uploads function from blobstore is returning an empty list, so I get an index-out-of-range error in the upload handler (I also tried the get_file_infos function with the same result).
However, when I check the GCS browser, the file is properly uploaded, so my problem seems to be that I can't find a way to extract the image link from the post to the upload handler.
Does anybody have a clue as to why this is happening, and whether there's a way around it?
(The form uses a post method with multipart/form-data so hopefully that isn't an issue)
Here's the function I'm calling to post to the upload handler:
upload_url = blobstore.create_upload_url('/upload', gs_bucket_name='BUCKET')
result = urlfetch.fetch(url=upload_url,
                        payload=self.request.body,
                        method=urlfetch.POST,
                        headers=self.request.headers)
And here's the code for the upload handler:
class UploadHandler(blobstore_handlers.BlobstoreUploadHandler):
    def post(self):
        upload_files = self.get_uploads('file')
        blob_info = upload_files[0]
        self.response.write(str(blob_info.key()))
What are you trying to do?
It looks like you are trying to post a received body to GCS. Why not write it directly using the Google Cloud Storage Client Library?
import cloudstorage as gcs  # the GCS client library for App Engine

with gcs.open(gcs_filename, 'w', content_type,
              options={b'x-goog-acl': b'public-read'}) as f:
    f.write(blob)
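For example, the original handler could skip the upload-URL round-trip entirely. A minimal sketch, assuming the cloudstorage package and placeholder bucket and form-field names:

import cloudstorage as gcs
import webapp2

class DirectUploadHandler(webapp2.RequestHandler):
    def post(self):
        # Pull the uploaded file out of the multipart form (a
        # cgi.FieldStorage, via WebOb) and write it straight to GCS.
        field = self.request.POST['file']
        gcs_filename = '/BUCKET/%s' % field.filename  # bucket name is a placeholder
        with gcs.open(gcs_filename, 'w', field.type,
                      options={b'x-goog-acl': b'public-read'}) as f:
            f.write(field.file.read())
        self.response.write(gcs_filename)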
I am trying to get Google App Engine to gunzip my .gz blob file (a single compressed file) automatically by setting the response headers as follows:
class download(blobstore_handlers.BlobstoreDownloadHandler):
    def get(self, resource):
        resource = str(urllib.unquote(resource))
        blob_info = blobstore.BlobInfo.get(resource)
        self.response.headers['Content-Encoding'] = str('gzip')
        # self.response.headers['Content-Type'] = str('application/x-gzip')
        self.response.headers['Content-Type'] = str(blob_info.content_type)
        self.response.headers['Content-Length'] = str(blob_info.size)
        cd = 'attachment; filename=%s' % (blob_info.filename)
        self.response.headers['Content-Disposition'] = str(cd)
        self.response.headers['Cache-Control'] = str('must-revalidate, post-check=0, pre-check=0')
        self.response.headers['Pragma'] = str('public')
        self.send_blob(blob_info)
When this runs, the file is downloaded without the .gz extension. However, the downloaded file is still gzipped: its size matches the .gz file size on the server, and I can confirm this by manually gunzipping the downloaded file. I am trying to avoid that manual gunzip step.
I am trying to get the blob file to automatically gunzip during the download. What am I doing wrong?
By the way, the gzip archive contains only a single file. On my self-hosted (non-Google) server, I could accomplish the automatic gunzip by setting the same response headers, albeit with code written in PHP.
UPDATE:
I rewrote the handler to serve data from the bucket. However, this generates an HTTP 500 error; the file is partially downloaded before the failure. The rewrite is as follows:
class download(blobstore_handlers.BlobstoreDownloadHandler):
    def get(self, resource):
        resource = str(urllib.unquote(resource))
        blob_info = blobstore.BlobInfo.get(resource)
        file = '/gs/mydatabucket/%s' % blob_info.filename
        print file
        self.response.headers['Content-Encoding'] = str('gzip')
        self.response.headers['Content-Type'] = str('application/x-gzip')
        # self.response.headers['Content-Length'] = str(blob_info.size)
        cd = 'filename=%s' % (file)
        self.response.headers['Content-Disposition'] = str(cd)
        self.response.headers['Cache-Control'] = str('must-revalidate, post-check=0, pre-check=0')
        self.response.headers['Pragma'] = str('public')
        self.send_blob(file)
This downloads 540,672 bytes of the 6,094,848-byte file to the client before the server terminates and issues a 500 error. When I run 'file' on the partially downloaded data from the command line, Mac OS correctly identifies the format as an 'SQLite 3.x database' file. Any idea why the server returns the 500 error? How can I fix the problem?
You should first check whether the requesting client supports gzipped content. If it does support gzip content encoding, you may pass the gzipped blob as-is with the proper Content-Encoding and Content-Type headers; otherwise you need to decompress the blob for the client. You should also verify that your blob's content_type isn't gzip (this depends on how you created the blob in the first place!).
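A minimal sketch of that branch, assuming the Python 2 runtime and blobs small enough to decompress in memory:

import gzip
import StringIO
import urllib

class download(blobstore_handlers.BlobstoreDownloadHandler):
    def get(self, resource):
        blob_info = blobstore.BlobInfo.get(str(urllib.unquote(resource)))
        if 'gzip' in self.request.headers.get('Accept-Encoding', ''):
            # The client can decompress: serve the stored gzip bytes as-is.
            self.response.headers['Content-Encoding'] = 'gzip'
            self.send_blob(blob_info)
        else:
            # The client cannot: decompress server-side before sending.
            raw = blobstore.BlobReader(blob_info.key()).read()
            data = gzip.GzipFile(fileobj=StringIO.StringIO(raw)).read()
            self.response.out.write(data)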
You may also want to look at Google Cloud Storage as this automatically handles gzip transportation so long as you properly compress the data before storing it with the proper content-encoding and content-type metadata.
See this SO question: Google cloud storage console Content-Encoding to gzip
Or the GCS Docs: https://cloud.google.com/storage/docs/gsutil/addlhelp/WorkingWithObjectMetadata#content-encoding
You may use GCS as easily (if not more easily) as you use the blobstore in App Engine, and it seems to be the preferred storage layer going forward. I say this because the Files API, which made blobstore interaction easier, has been deprecated, while great effort has gone into the GCS libraries, making their API similar to Python's built-in file API.
UPDATE:
Since the objects are stored in GCS, you can use 302 redirects to point users to files rather than relying on the Blobstore API. This eliminates any unknown behavior of the Blobstore API and ensures GAE delivers your stored objects with the content-type and content-encoding you intended. For objects with a public-read ACL, you may simply direct users to either storage.googleapis.com/<bucket>/<object> or <bucket>.storage.googleapis.com/<object>. Alternatively, if you'd like application logic to dictate access, keep the objects' ACL private and use GCS Signed URLs to create short-lived URLs for the 302 redirect.
It's worth noting that if you want users to be able to upload objects via GAE, you'd still use the Blobstore API to handle storing the file in GCS, but you'd have to modify the object after it was uploaded to ensure proper gzip compression and Content-Encoding metadata.
class legacy_download(blobstore_handlers.BlobstoreDownloadHandler):
    def get(self, resource):
        filename = str(urllib.unquote(resource))
        url = 'https://storage.googleapis.com/mybucket/' + filename
        self.redirect(url)
GAE already serves everything using gzip if the client supports it.
So I think what's happening after your update is that the browser expects there to be more of the file, but GAE thinks it's already at the end of the file since it's already gzipped. That's why you get the 500 (if that makes sense).
Anyway, since GAE already handles compression for you, the easiest way is probably to put uncompressed files in GCS and let the Google infrastructure handle compression automatically when you serve them.
I have a REST frontend written using Python/Bottle which handles file uploads, usually large ones. The API is written in such a way that:
The client sends a PUT with the file as the payload. Among other things, it sends Date and Authorization headers. This is a security measure against replay attacks: the request is signed with a temporary key, using the target URL, the date and several other things.
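For context, the kind of signing scheme described might look like this sketch (the signed fields and key handling are hypothetical):

import hashlib
import hmac

def sign_request(temp_key, method, url, date):
    # Hypothetical illustration: HMAC the target URL and the date with
    # the temporary key; the server recomputes this and compares.
    msg = '\n'.join([method, url, date])
    return hmac.new(temp_key, msg, hashlib.sha256).hexdigest()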
Now the problem. The server accepts the request if the supplied date is within a given window of 15 minutes. If the upload takes long enough, it will exceed the allowed time delta. The request authorization handling is done using a decorator on the Bottle view method, but Bottle won't start the dispatch process until the upload is finished, so the validation fails on longer uploads.
My question is: is there a way to tell Bottle or WSGI to handle the request immediately and stream the upload as it arrives? This would be useful to me for other reasons as well. Or any other solutions? As I write this, WSGI middleware comes to mind, but I'd still like external insight.
I would be willing to switch to Flask, or even other Python frameworks, as the REST frontend is quite lightweight.
Thank you
I recommend splitting the incoming file into smaller-sized chunks on the frontend. I'm doing this to implement a pause/resume function for large file uploads in a Flask application.
Using Sebastian Tschan's jQuery File Upload plugin, you can implement chunking by specifying a maxChunkSize when initializing the plugin, as in:
$('#file-select').fileupload({
    url: '/uploads/',
    sequentialUploads: true,
    done: function (e, data) {
        console.log("uploaded: " + data.files[0].name);
    },
    maxChunkSize: 1000000 // 1 MB
});
Now the client will send multiple requests when uploading large files. And your server-side code can use the Content-Range header to patch the original large file back together. For a Flask application, the view might look something like:
# Upload files
@app.route('/uploads/', methods=['POST'])
def results():
    files = request.files
    # assuming only one file is passed in the request
    key = files.keys()[0]
    value = files[key]  # this is a Werkzeug FileStorage object
    filename = value.filename
    if 'Content-Range' in request.headers:
        # extract starting byte from Content-Range header string
        range_str = request.headers['Content-Range']
        start_bytes = int(range_str.split(' ')[1].split('-')[0])
        # append chunk to the file on disk, or create new
        # (note: in append mode the seek is effectively a no-op; writes
        # always go to the end, which works because chunks arrive in order)
        with open(filename, 'a') as f:
            f.seek(start_bytes)
            f.write(value.stream.read())
    else:
        # this is not a chunked request, so just save the whole file
        value.save(filename)
    # send response with appropriate mime type header
    return jsonify({"name": value.filename,
                    "size": os.path.getsize(filename),
                    "url": 'uploads/' + value.filename,
                    "thumbnail_url": None,
                    "delete_url": None,
                    "delete_type": None})
For your particular application, you will just have to make sure that the correct auth headers are still sent with each request.
Hope this helps! I was struggling with this problem for a while ;)
When using plupload, the solution might look like this:
$("#uploader").plupload({
// General settings
runtimes : 'html5,flash,silverlight,html4',
url : "/uploads/",
// Maximum file size
max_file_size : '20mb',
chunk_size: '128kb',
// Specify what files to browse for
filters : [
{title : "Image files", extensions : "jpg,gif,png"},
],
// Enable ability to drag'n'drop files onto the widget (currently only HTML5 supports that)
dragdrop: true,
// Views to activate
views: {
list: true,
thumbs: true, // Show thumbs
active: 'thumbs'
},
// Flash settings
flash_swf_url : '/static/js/plupload-2.1.2/js/plupload/js/Moxie.swf',
// Silverlight settings
silverlight_xap_url : '/static/js/plupload-2.1.2/js/plupload/js/Moxie.xap'
});
And your Flask code in that case would be similar to this:
from werkzeug import secure_filename

# Upload files
@app.route('/uploads/', methods=['POST'])
def results():
    content = request.files['file'].read()
    filename = secure_filename(request.values['name'])
    with open(filename, 'ab+') as fp:
        fp.write(content)
    # send response with appropriate mime type header
    return jsonify({"name": filename,
                    "size": os.path.getsize(filename),
                    "url": 'uploads/' + filename})
Plupload always sends chunks in exactly the same order, from first to last, so you do not have to bother with seek or anything like that.
I'm trying to block uploaded files with unwanted content types. I'm using the code from the documentation:
class UploadHandler(blobstore_handlers.BlobstoreUploadHandler):
    def post(self):
        upload_files = self.get_uploads('file')  # 'file' is the file upload field in the form
        blob_info = upload_files[0]
        self.redirect('/serve/%s' % blob_info.key())
What I've found out is that before the last line with the redirect, the blob is already in the blobstore, so the only thing left to do is to check its content type and delete it if it is unwanted, as sketched below.
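A sketch of that check-and-delete workaround (the allowed types are hypothetical):

ALLOWED_TYPES = ('image/png', 'image/jpeg')  # hypothetical whitelist

class UploadHandler(blobstore_handlers.BlobstoreUploadHandler):
    def post(self):
        blob_info = self.get_uploads('file')[0]
        if blob_info.content_type not in ALLOWED_TYPES:
            # The blob is already stored at this point, so the best we
            # can do is delete it again and reject the request.
            blob_info.delete()
            self.error(400)
            return
        self.redirect('/serve/%s' % blob_info.key())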
Is there any other way to discard the file before it hits blobstore?
The only way to do it before it hits the blobstore is on the client side, for example by checking the extension of the file with JavaScript.
Once it hits your UploadHandler, it's already in the blobstore.
You could also try the accept attribute of the <input> element, but it's not supported in all browsers, and if you really want to, you can try Flash or Java applet solutions as mentioned in another answer. I personally would go for the server-side check and delete.
I have server-side code to process uploaded binary files:
class UploadHandler(webapp.RequestHandler):
    def post(self):
        file_name = files.blobstore.create(mime_type='application/octet-stream')
        with files.open(file_name, 'a') as f:
            f.write('data')
        files.finalize(file_name)
        blob_key = files.blobstore.get_blob_key(file_name)
It's the code from the examples, so it doesn't actually process any uploaded files; it just creates a new Blobstore entity and writes some data to it. On the client side, I have this code that actually sends the file to the server:
var xhr = new XMLHttpRequest();
xhr.open("post", "/upload", true);
xhr.setRequestHeader("Content-Type", "multipart/form-data");
xhr.setRequestHeader("X-File-Name", file.fileName);
xhr.setRequestHeader("X-File-Size", file.fileSize);
xhr.setRequestHeader("X-File-Type", file.type);
xhr.send(file);
In Firebug I see it upload the file to the server, and the server code creates a file as it is supposed to. The thing I can't figure out is how to connect these two parts so that the server-side code receives the uploaded file as a stream. I don't use forms, so I can't get the file with something like upload_files = self.get_uploads('file'). How do I retrieve the file on the server side?
UPDATE: I have found an answer in the GAE documentation about webapp request handlers. I need to use something like uploaded_file = self.request.body to get the file stream, and then just use f.write(uploaded_file) to save it. It seems to work for me. Please share your thoughts on whether this is a good approach.
Should be something like this:
class UploadHandler(webapp.RequestHandler):
    def post(self):
        mime_type = self.request.headers['X-File-Type']
        name = self.request.headers['X-File-Name']
        file_name = files.blobstore.create(mime_type=mime_type,
                                           _blobinfo_uploaded_filename=name)
        with files.open(file_name, 'a') as f:
            f.write(self.request.body)
        files.finalize(file_name)
        blob_key = files.blobstore.get_blob_key(file_name)
Your custom headers and body can be pulled from the WebOb Request object. Note that you don't need to inherit from BlobStoreUploadHandler since you're not using an HTML upload form.