Downloaded filename with Google App Engine Blobstore - python

I'm using the Google App Engine Blobstore to store a range of file types (PDF, XLS, etc) and am trying to find a mechanism by which the original filename of the uploaded file - as stored in blob_info - can be used to name the downloaded file i.e. so that the user sees 'some_file.pdf' in the save dialogue rather than 'very_long_db_key.pdf'.
I can't see anything in the docs that would allow this:
http://code.google.com/appengine/docs/python/blobstore/overview.html
I've seen hints in other posts that you could use the information in blob_info to set the content-disposition header. Is this the best approach to achieving the desired end?

There is an optional 'save_as' parameter in the send_blob function. By default this is set to False. Setting it to True causes the file to be treated as an attachment (i.e. it triggers a 'Save/Open' download dialog), and the user sees the original filename stored in blob_info.
Example:
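import urllib
from google.appengine.ext import blobstore
from google.appengine.ext.webapp import blobstore_handlers
class ServeHandler(blobstore_handlers.BlobstoreDownloadHandler):
    def get(self, resource):
        # The blob key arrives URL-encoded in the request path.
        resource = str(urllib.unquote(resource))
        blob_info = blobstore.BlobInfo.get(resource)
        self.send_blob(blob_info, save_as=True)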
It is also possible to overwrite the filename by passing in a string:
self.send_blob(blob_info,save_as='my_file.txt')
If you want some content (such as pdfs) to open rather than save you could use the content_type to determine the behavior:
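blob_info = blobstore.BlobInfo.get(resource)
content_type = blob_info.content_type
if content_type == 'application/pdf':
    # Serve PDFs inline so the browser can display them.
    self.response.headers['Content-Type'] = content_type
    self.send_blob(blob_info, save_as=False)
else:
    # Offer everything else as a download under its original filename.
    self.send_blob(blob_info, save_as=True)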

For future reference, save_as and the BlobstoreDownloadHandler are documented here:
http://code.google.com/appengine/docs/python/tools/webapp/blobstorehandlers.html
It does seem like it should be a bit easier to find. Let's see if it can be improved.

Another option is to append the file name to the end of the download URL. For example:
/files/AMIfv95HJJY3F75v3lz2EeyvWIvGKxEcDagKtyDSgQSPWiMnE0C2iYTUxLZlFHs2XxnV_j1jdWmmKbSVwBj6lYT0-G_w5wENIdPKDULHqa8Q3E_uyeY1gFu02Iiw9xm523Rxk3LJnqHf9n8209t4sPEHhwVOKdDF2A/prezents-list.doc
If you use Jinja2 for templating, you can construct such a URL like this:
{{file.filename}}
Then adapt your URL mapping accordingly, to something like this:
('/files/([^/]+)/?.*', DownloadHandler)
If you have the blob key in the URL, you can ignore the file name in your server-side code.
The benefit of this approach is that content types like images or PDF open directly in the browser, which is convenient for quick viewing. Other content types will just be saved to disk.
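As a rough illustration of why the trailing file name can be ignored on the server, the sketch below applies the same URL pattern with Python's re module. The helper extract_blob_key and the sample key are hypothetical, and the use of fullmatch assumes the framework anchors route patterns at both ends.

```python
import re

# Same pattern as in the URL mapping; fullmatch imitates an anchored route
# (an assumption of this sketch, not something stated in the answer).
FILES_ROUTE = re.compile(r'/files/([^/]+)/?.*')


def extract_blob_key(path):
    """Return the blob key captured from a download path, or None."""
    match = FILES_ROUTE.fullmatch(path)
    return match.group(1) if match else None


# Only the key is captured; the trailing file name is there for the browser.
key = extract_blob_key('/files/SAMPLE_BLOB_KEY/prezents-list.doc')
```

The handler only ever receives the captured key, so the file name segment can be anything the template chooses to append.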

Yes it is the best approach; just query the BlobInfo object using the given Blobstore key and use its content-type property.

Related

Pyramid fileresponse with header

I'm returning an image to my webpage with a Pyramid FileResponse like this:
response = FileResponse(newPath)
response.content_disposition = f'attachment; filename="{newImage}"'
return response
I get the file back fine, but I can't figure out how to add more parameters to this. I want to return the file and the name of the file. I've looked at questions like How to set file name in response, but I can't seem to make this work.
This answer seems to suggest that I wouldn't even want to use Content-Disposition, because I am displaying the image, but I can't find any other way to add a parameter.
That said, you can use the Content-Disposition header to specify that
you want the browser to download the file rather than display it, and
you can also suggest a filename for the file to use for that file. It
looks like this:
How can I add another parameter to my FileResponse?
The pattern in your example is valid; you should see the Content-Disposition header set on the response that way. As for displaying the file instead of downloading it as an attachment, I believe the final path segment in the URL itself, plus possibly the content type, dictates how the file will be named by default when it is saved. The disposition header just overrides that in certain situations.
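To make the inline versus attachment distinction concrete, here is a minimal sketch of a helper that builds the header value either way. The name content_disposition is hypothetical; with Pyramid, the returned string could be assigned to response.content_disposition as in the question. It only handles ASCII file names; non-ASCII names need the filename* form, which this sketch does not attempt.

```python
def content_disposition(filename, inline=False):
    """Build a Content-Disposition value that suggests a file name."""
    disposition = 'inline' if inline else 'attachment'
    # Escape backslashes and double quotes so the quoted string stays valid.
    safe_name = filename.replace('\\', '\\\\').replace('"', '\\"')
    return '{}; filename="{}"'.format(disposition, safe_name)


# "inline" asks the browser to display the file while still suggesting a name.
header_value = content_disposition('photo.png', inline=True)
```

Whether a browser honours the suggested name for inline content varies, so it is worth checking the behaviour in the browsers you care about.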

How to convert FileStorage object to b2sdk.v2.AbstractUploadSource in Python

I am using Backblaze B2 and b2sdk.v2 in Flask to upload files.
This is code I tried, using the upload method:
# I am not showing authorization code...
def upload_file(file):
    bucket = b2_api.get_bucket_by_name(bucket_name)
    file = request.files['file']
    bucket.upload(
        upload_source=file,
        file_name=file.filename,
    )
This shows an error like this
AttributeError: 'SpooledTemporaryFile' object has no attribute 'get_content_length'
I think it's because I am using a FileStorage instance for the upload_source parameter.
I want to know whether I am using the API correctly or, if not, how should I use this?
Thanks
You're correct - you can't use a Flask FileStorage instance as a B2 SDK UploadSource. What you need to do is to use the upload_bytes method with the file's content:
def upload_file(file):
    bucket = b2_api.get_bucket_by_name(bucket_name)
    file = request.files['file']
    bucket.upload_bytes(
        data_bytes=file.read(),
        file_name=file.filename,
        # ...other parameters...
    )
Note that this reads the entire file into memory. The upload_bytes method may need to restart the upload if something goes wrong (with the network, usually), so the file can't really be streamed straight through into B2.
If you anticipate that your files will not fit into memory, you should look at using create_file_stream to upload the file in chunks.
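One way to avoid holding the whole file in memory, sketched here as an alternative to the approaches named above, is to spool the upload to a temporary file and upload it by path. This assumes the bucket object exposes upload_local_file(local_file=..., file_name=...) as in b2sdk, and that the upload is a Werkzeug FileStorage with save() and filename; the helper name upload_via_temp_file is hypothetical.

```python
import os
import tempfile


def upload_via_temp_file(bucket, file_storage):
    """Write the upload to a temporary file, then upload that file by path."""
    fd, tmp_path = tempfile.mkstemp()
    os.close(fd)  # save() reopens the path itself
    try:
        # FileStorage.save copies the upload to disk in chunks.
        file_storage.save(tmp_path)
        return bucket.upload_local_file(
            local_file=tmp_path,
            file_name=file_storage.filename,
        )
    finally:
        # Remove the temporary copy whether or not the upload succeeded.
        os.remove(tmp_path)
```

In the Flask view this would be called as upload_via_temp_file(bucket, request.files['file']). The trade-off is extra disk I/O in exchange for bounded memory use.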

Which would be the best way to convert a csv file to excel?

In order to become more familiar with django, I decided to build a website which is gonna let a user upload a csv file, which is then gonna be converted to excel and the user will be able to download it.
In order to achieve that I created a modelform with one model FileField called csv_file as shown below:
#models.py
class CSVUpload(models.Model):
    csv_file = models.FileField(upload_to="csvupload/")
    def __str__(self):
        return self.csv_file
#forms.py
class CsvForm(forms.ModelForm):
    class Meta:
        model = CSVUpload
        fields = ('csv_file', )
and the corresponding view is:
from django.shortcuts import render, redirect
import pandas as pd
import os
#converts file from csv to excel
def convertcsv2excel(filename):
    fpath = os.path.join(settings.MEDIA_ROOT + "\csvupload", filename)
    df = pd.read_csv(fpath)
    newfilename = str(filename) +".xlsx"
    newpathfile = os.path.join(settings.MEDIA_ROOT, newfilename)
    df.to_excel(newpathfile, encoding='utf-8', index=False)
    return newfilename
def csvtoexcel(request):
    if request.method == 'POST':
        form = CsvForm(request.POST, request.FILES)
        if form.is_valid():
            form.save()
            print(form.cleaned_data['csv_file'].name)
            convertcsv2excel(form.cleaned_data['csv_file'].name)
            return redirect('exceltocsv')
    else:
        form = CsvForm()
    return render(request, 'xmlconverter/csvtoexcel.html',{'form': form})
Right now, as you can see, I am using Pandas to convert the csv file to excel inside the views.py file. My question is: is there a better way to do it (for instance in the form or model module) so that the excel file can be offered for download more effectively?
I appreciate any help you can provide!
First, I want to point out that your example demonstrates an arbitrary file upload vulnerability. Pandas does not validate the format of the file for you, so as an attacker, I can simply upload something like malware.php.csv to your conversion script, and any malicious code I include will remain intact. Since you aren't validating that this file's contents are, in fact, in CSV format, then you are giving users a means to directly upload a file with an arbitrary extension and possibly execute code on your website. Since you are rendering the xlsx format on the webpage the way you are, there's a good chance someone could abuse this. If this is just your own personal experiment to help yourself get familiar, that's one thing, but I strongly recommend against deploying this in production. What you are doing here is very dangerous.
As for your more immediate problem, I'm not personally familiar with Django, but this looks very similar to this question: Having Django serve downloadable files
In your case, you do not want to actually save the file's contents to your server; rather, you want to process the file contents and return the result in the body of the response. SmartFile's django-transfer module looks to be exactly what you want: https://github.com/smartfile/django-transfer
This provides components for Apache, Nginx, and lighttpd and should let you return files in the response immediately following a request to upload/convert the file. I should emphasize that you need to be very careful about where you save these files, validate their contents, ensure end-users cannot browse to or execute these files under the web server context, and delete them immediately after the response and file are successfully sent.
Someone more familiar with Django can feel free to correct me or provide a usable code example, but this kind of functionality, in my experience, is how you introduce code execution into your site. It's usually a bad idea.
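To give a sense of what validating the contents could look like, here is a minimal sketch using only the standard library. The function name looks_like_csv, the row limit, and the two-column minimum are assumptions of this example. It is a sanity check rather than a security boundary, since a hostile file can still be shaped like a table.

```python
import csv
import io


def looks_like_csv(raw_bytes, max_rows=1000):
    """Return True if the upload decodes as text and parses as a rectangular CSV."""
    try:
        text = raw_bytes.decode('utf-8')
    except UnicodeDecodeError:
        return False  # binary payloads are rejected outright
    if '\x00' in text:
        return False
    width = None
    try:
        for index, row in enumerate(csv.reader(io.StringIO(text, newline=''))):
            if index >= max_rows:
                break
            if not row:
                continue  # tolerate blank lines
            if width is None:
                width = len(row)
            elif len(row) != width:
                return False  # ragged rows: probably not tabular data
    except csv.Error:
        return False
    # Assumption of this sketch: real uploads have at least two columns.
    return width is not None and width >= 2
```

A check like this could run on the uploaded bytes before anything is written to disk or handed to Pandas, alongside the storage and cleanup precautions described above.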

Uploading a file via paperclip through an external script

I'm trying to create a rails app that is a CMS for a client. The app currently has a documents class that uploads the document with paperclip.
Separate to this, we're running a python script that accesses the database and gets a bunch of information for a given event, creates a proposal word document, and uploads it to the database under the correct event.
This all works, but the app does not recognize the document. How do I make a python script that will correctly upload the document such that paperclip knows what's going on?
Here is my paperclip controller:
def new
  @event = Event.find(params[:event_id])
  @document = Document.new
end
def create
  @event = Event.find(params[:event_id])
  @document = @event.documents.new(document_params)
  if @document.save
    redirect_to event_path(@event)
  end
end
private
def document_params
  params.require(:document).permit(:event_id, :data, :title)
end
Model
validates :title, presence: true
has_attached_file :data
validates_attachment_content_type :data, :content_type => ["application/pdf", "application/msword"]
Here is the python code.
f = open(propStr, 'r')
binary = psycopg2.Binary(f.read())
self.cur.execute("INSERT INTO documents (event_id, title, data_file_name, data_content_type) VALUES (%d,'Proposal.doc',%s,'application/msword');" % (self.eventData[0], binary))
self.con.commit()
You should probably use Ruby to script this since it can load in any model information or other classes you need.
But assuming your requirements dictate the use of Python, be aware that Paperclip does not store the documents in your database tables, only the files' metadata. The actual file is stored in your file system, under the /public dir by default (it could also be S3, etc., depending on your configuration). I would make sure you are actually saving the file to the directory Paperclip expects. The default path according to the docs is:
:rails_root/public/system/:class/:attachment/:id_partition/:style/:filename
so you will have to make another SQL query to retrieve the id of your new record. I don't believe PDFs have a :style attribute since you don't use ImageMagick to resize them, so build a path that looks something like this:
/public/system/documents/data/000/000/123/my_file.pdf
and save it from your python script.
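A minimal sketch of building that path in Python follows. The helper name paperclip_path and its parameters are hypothetical, and it assumes record ids below one billion so that the zero-padded id splits into three groups of three digits. The style segment is optional here because the example path above omits it; my understanding is that Paperclip commonly stores the unprocessed upload under a style directory named original, so compare the result with a file uploaded through the Rails app before relying on it.

```python
def paperclip_path(record_id, filename, style=None,
                   klass='documents', attachment='data',
                   root='public/system'):
    """Build a path in the shape of Paperclip's default path template."""
    # :id_partition is the id zero-padded to nine digits, split into threes.
    padded = '%09d' % record_id
    id_partition = '/'.join(padded[i:i + 3] for i in range(0, 9, 3))
    parts = [root, klass, attachment, id_partition]
    if style:
        parts.append(style)
    parts.append(filename)
    return '/'.join(parts)


# Matches the example path above for record id 123.
target = paperclip_path(123, 'my_file.pdf')
```

The script would then copy the generated document to that location, relative to the Rails root, after the INSERT has returned the new record's id.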

send data from blobstore as email attachment in GAE

Why isn't the code below working? The email is received, and the file comes through with the correct filename (it's a .png file). But when I try to open the file, it doesn't open correctly (Windows Gallery reports that it can't open this photo or video and that the file may be unsupported, damaged or corrupted).
When I download the file using a subclass of blobstore_handlers.BlobstoreDownloadHandler (basically the exact handler from the GAE docs), and the same blob key, everything works fine and Windows reads the image.
One more bit of info - the binary files from the download and the email appear very similar, but have a slightly different length.
Anyone got any ideas on how I can get email attachments sending from GAE blobstore? There are similar questions on S/O, suggesting other people have had this issue, but there don't appear to be any conclusions.
from google.appengine.api import mail
from google.appengine.ext import blobstore
def send_forum_post_notification():
    blob_reader = blobstore.BlobReader('my_blobstore_key')
    blob_info = blobstore.BlobInfo.get('my_blobstore_key')
    value = blob_reader.read()
    mail.send_mail(
        sender='my.email@address.com',
        to='my.email@address.com',
        subject='this is the subject',
        body='hi',
        reply_to='my.email@address.com',
        attachments=[(blob_info.filename, value)]
    )
send_forum_post_notification()
I do not understand why you use a tuple for the attachment. I use :
message = mail.EmailMessage(sender = ......
message.attachments = [blob_info.filename,blob_reader.read()]
I found that this code doesn't work on dev_appserver but does work when pushed to production.
I ran into a similar problem using the blobstore on a Python Google App Engine application. My application handles PDF files instead of images, but I was also seeing a "the file may be unsupported, damaged or corrupted" error using code similar to your code shown above.
Try approaching the problem this way: Call open() on the BlobInfo object before reading the binary stream. Replace this line:
value = blob_reader.read()
... with these two lines:
bstream = blob_info.open()
value = bstream.read()
Then you can remove this line, too:
blob_reader = blobstore.BlobReader('my_blobstore_key')
... since bstream above will be of type BlobReader.
Relevant documentation from Google is located here:
https://cloud.google.com/appengine/docs/python/blobstore/blobinfoclass#BlobInfo_filename
