App Engine - download files from Cloud Storage - python

I am using Python 2.7 and ReportLab to create .pdf files for display/print in my App Engine system. I am using ndb.Model to store the data, if that matters.
I am able to produce the equivalent of a bank statement for a single client online. That is, the user clicks the on-screen 'pdf' button and the .pdf statement appears on screen in a new tab, exactly as it should.
I am using the following code to save .pdf files to Google Cloud Storage successfully:
buffer = StringIO.StringIO()
self.p = canvas.Canvas(buffer, pagesize=portrait(A4))
self.p.setLineWidth(0.5)
try:
    pass  # create .pdf of .csv data here
finally:
    self.p.save()
pdfout = buffer.getvalue()
buffer.close()
filename = getgcsbucket() + '/InvestorStatement.pdf'
write_retry_params = gcs.RetryParams(backoff_factor=1.1)
try:
    gcs_file = gcs.open(filename,
                        'w',
                        content_type='application/pdf',
                        retry_params=write_retry_params)
    gcs_file.write(pdfout)
except:
    logging.error(traceback.format_exc())
finally:
    gcs_file.close()
I am using the following code to create a list of all files for display on-screen; it shows all the files stored above.
allfiles = []
bucket_name = getgcsbucket()
rfiles = gcs.listbucket(bucket_name)
for rfile in rfiles:
    allfiles.append(rfile.filename)
return allfiles
My screen (html) shows rows of ([Delete] and Filename). When the user clicks the [Delete] button, the following delete code snippet works (filename is the complete /bucket/filename):
filename = self.request.get('filename')
try:
    gcs.delete(filename)
except gcs.NotFoundError:
    pass
My question: given I have a list of files on-screen, I want the user to click on the filename and for that file to be downloaded to the user's computer. In Google's Chrome browser, this would result in the file being downloaded, with its name displayed on the bottom left of the screen.
One other point: the above example is for .pdf files. I will also have to show .csv files in the list and would like them to be downloaded as well. I only want the files to be downloaded; no display is required.
So, I would like a snippet like ...
filename = self.request.get('filename')
try:
    gcs.downloadtousercomputer(filename)  # ???
except gcs.NotFoundError:
    pass
I think I have tried everything I can find both here and elsewhere. Sorry I have been so long-winded. Any hints for me?

To download a file instead of showing it in the browser, you need to add a header to your response:
self.response.headers["Content-Disposition"] = 'attachment; filename="%s"' % filename
You can specify the filename as shown above and it works for any file type.

One solution you can try is to read the file from the bucket and print the content as the response with the correct header:
import cloudstorage
...
def read_file(self, filename):
    bucket_name = "/your_bucket_name"
    file = bucket_name + '/' + filename
    with cloudstorage.open(file) as cloudstorage_file:
        self.response.headers["Content-Disposition"] = str('attachment; filename=' + filename)
        contents = cloudstorage_file.read()
    self.response.write(contents)
Here, filename could be something you send as a GET parameter; it needs to be a file that exists in your bucket, or an exception will be raised.
Here you will find a sample: https://cloud.google.com/appengine/docs/standard/python/googlecloudstorageclient/read-write-to-cloud-storage
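Since the question also covers .csv files, the two answers can be combined into one download handler that guesses the Content-Type from the file extension. A minimal sketch, assuming webapp2 and the GoogleAppEngineCloudStorageClient; the handler name and the use of the mimetypes module are my own additions, and filename is the full /bucket/object path from the question:
import mimetypes

import cloudstorage as gcs
import webapp2

class DownloadHandler(webapp2.RequestHandler):
    def get(self):
        # filename is the complete object path, e.g. /bucket/InvestorStatement.pdf
        filename = self.request.get('filename')
        content_type = mimetypes.guess_type(filename)[0] or 'application/octet-stream'
        try:
            with gcs.open(filename) as gcs_file:
                contents = gcs_file.read()
        except gcs.NotFoundError:
            self.abort(404)
        self.response.headers['Content-Type'] = content_type
        # 'attachment' forces a download instead of an in-browser display
        self.response.headers['Content-Disposition'] = (
            str('attachment; filename="%s"' % filename.split('/')[-1]))
        self.response.write(contents)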

Related

Download a single file from a shared Dropbox's folder without having the share link of this file

As the title says, I have access to a shared folder where some files are uploaded. I just want to download a specific file, called "db.dta". So, I have this script:
import sys

import requests

def download_file(url, filename):
    with open(filename, "wb") as f:
        print("Downloading %s" % filename)
        response = requests.get(url, stream=True)
        total_length = response.headers.get('content-length')
        if total_length is None:  # no content length header
            f.write(response.content)
        else:
            dl = 0
            total_length = int(total_length)
            for data in response.iter_content(chunk_size=4096):
                dl += len(data)
                f.write(data)
                done = int(50 * dl / total_length)
                sys.stdout.write("\r[%s%s]" % ('=' * done, ' ' * (50 - done)))
                sys.stdout.flush()
            print(" ")
    print('Download successful.')
It actually downloads files from share links if I modify dl=0 to 1, like this:
https://www.dropbox.com/s/ajklhfalsdfl/db_test.dta?dl=1
The thing is, I don't have the share link of this particular file in this shared folder, so if I use the URL of the file preview, I get an access denied error (even if I change dl=0 to 1):
https://www.dropbox.com/sh/a630ksuyrtw33yo/LKExc-MKDKIIWJMLKFJ?dl=1&preview=db.dta
Error given:
dropbox.exceptions.ApiError: ApiError('22eaf5ee05614d2d9726b948f59a9ec7', GetSharedLinkFileError('shared_link_access_denied', None))
Is there a way to download this file?
If you have the shared link to the parent folder and not the specific file you want, you can use the /2/sharing/get_shared_link_file endpoint to download just the specific file.
In the Dropbox API v2 Python SDK, that's the sharing_get_shared_link_file method (or sharing_get_shared_link_file_to_file). Based on the error output you shared, it looks like you are already using that (though not in the particular code snippet you posted).
Using that would look like this:
import dropbox
dbx = dropbox.Dropbox(ACCESS_TOKEN)
folder_shared_link = "https://www.dropbox.com/sh/a630ksuyrtw33yo/LKExc-MKDKIIWJMLKFJ"
file_relative_path = "/db.dat"
res = dbx.sharing_get_shared_link_file(url=folder_shared_link, path=file_relative_path)
print("Metadata: %s" % res[0])
print("File data: %s bytes" % len(res[1].content))
(You mentioned both "db.dat" and "db.dta" in your question. Make sure you use whichever is actually correct.)
Additionally, note that if you are using a Dropbox API app registered with the "app folder" access type, there's currently a bug that can cause this shared_link_access_denied error when using this method with an access token for an app folder app.
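If you'd rather have the SDK write the file straight to disk instead of holding it in memory, the sharing_get_shared_link_file_to_file variant takes a local download path as its first argument. A minimal sketch; the local path "db.dta" is an assumption:
import dropbox

dbx = dropbox.Dropbox(ACCESS_TOKEN)

folder_shared_link = "https://www.dropbox.com/sh/a630ksuyrtw33yo/LKExc-MKDKIIWJMLKFJ"
file_relative_path = "/db.dta"  # use whichever of db.dta / db.dat is actually correct

# Downloads the file and writes it to the given local path
metadata = dbx.sharing_get_shared_link_file_to_file(
    "db.dta", url=folder_shared_link, path=file_relative_path)
print("Downloaded: %s" % metadata.name)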

Elaborate and store inside an azure blob multiple files from a form-data request by using azure functions

I am developing an Azure Function that receives as input several files of different formats (e.g. xlsx, csv, txt, pdf, png) through form-data. The idea is to develop a function that can take the files and store them one by one inside a blob. At the moment, my code is as follows:
import logging

import azure.functions as func

def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    filename, contents = False, False
    try:
        files = req.files.values()
        for file in files:
            filename = str(file.filename)
            logging.info(type(file.stream.read()))
            contents = file.stream.read().decode('utf-8')
    except Exception as ex:
        logging.error(str(type(ex)) + ': ' + str(ex))
        return func.HttpResponse(body=str(ex), status_code=400)
Then I write the contents variable to the blob, but the files inside the blob have a size of 0, and if I try to download them, they are empty. How can I manage this operation to store files of different formats inside a blob? Thanks a lot for your support!
Below is the code to upload multiple files of different formats:
import os

from azure.storage.blob import BlobServiceClient

def UploadFiles():
    CONNECTION_STRING = "ENTER_STORAGE_CONNECTION_STRING"
    Container_name = "uploadcontainer"
    service_client = BlobServiceClient.from_connection_string(CONNECTION_STRING)
    container_client = service_client.get_container_client(Container_name)
    ReplacePath = "C:\\"
    local_path = "C:\\Testupload"  # the local folder
    for r, d, f in os.walk(local_path):
        if f:
            for file in f:
                AzurePath = os.path.join(r, file).replace(ReplacePath, "")
                LocalPath = os.path.join(r, file)
                blob_client = container_client.get_blob_client(AzurePath)
                with open(LocalPath, 'rb') as data:
                    blob_client.upload_blob(data)

if __name__ == '__main__':
    UploadFiles()
    print("Files Copied")
From your question I am not able to tell how your function is getting triggered or where you are uploading your files, so as per your logic you can use the above piece of code to upload all types of files. Currently the above code uploads all the files in a local folder.
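If the upload is meant to happen inside the HTTP-triggered function itself, one likely fix is to read each upload stream only once (a second file.stream.read() returns empty bytes, which would explain the 0-byte blobs) and hand the bytes to the blob client directly. A minimal sketch, assuming the azure-storage-blob SDK; the connection string and container name are placeholders:
import logging

import azure.functions as func
from azure.storage.blob import BlobServiceClient

CONNECTION_STRING = "ENTER_STORAGE_CONNECTION_STRING"  # placeholder
CONTAINER_NAME = "uploadcontainer"  # placeholder

def main(req: func.HttpRequest) -> func.HttpResponse:
    service_client = BlobServiceClient.from_connection_string(CONNECTION_STRING)
    container_client = service_client.get_container_client(CONTAINER_NAME)
    try:
        for file in req.files.values():
            # Read the stream exactly once; a second read() would return b""
            data = file.stream.read()
            blob_client = container_client.get_blob_client(str(file.filename))
            blob_client.upload_blob(data, overwrite=True)
            logging.info("Uploaded %s (%d bytes)", file.filename, len(data))
    except Exception as ex:
        return func.HttpResponse(body=str(ex), status_code=400)
    return func.HttpResponse("Files stored", status_code=200)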

read google bucket files using python

I have to read Google bucket files which are in xlsx format.
The file structure in the bucket looks like:
bucket_name
    folder_name_1
        file_name_1
    folder_name_2
    folder_name_3
        file_name_3
The Python snippet looks like:
import pandas as pd
from google.cloud import storage

def main():
    storage_client = storage.Client.from_service_account_json(
        Constants.GCP_CRENDENTIALS)
    bucket = storage_client.bucket(Constants.GCP_BUCKET_NAME)
    blob = bucket.blob(folder_name_2 + '/' + Constants.GCP_FILE_NAME)
    data_bytes = blob.download_as_bytes()
    df = pd.read_excel(data_bytes, engine='openpyxl')
    print(df)

def function1():
    print("no file in the folder")  # sample error
In the above snippet, I'm trying to open folder_name_2; it returns an error because there's no file to read.
Instead of throwing an error, I need to use function1 to print the error whenever there's no file in a folder.
Any ideas on how to do this?
I'm not familiar with the GCP API, but you're going to want to do something along the lines of this:
try:
    blob = bucket.blob(folder_name_2 + '/' + Constants.GCP_FILE_NAME)
    data_bytes = blob.download_as_bytes()
except Exception as e:
    print(e)
https://docs.python.org/3/tutorial/errors.html#handling-exceptions
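A more targeted variant catches only the missing-object error and falls back to the question's function1. A sketch, assuming the google-cloud-storage client, whose missing-object error is google.api_core.exceptions.NotFound:
from google.api_core.exceptions import NotFound

try:
    blob = bucket.blob(folder_name_2 + '/' + Constants.GCP_FILE_NAME)
    data_bytes = blob.download_as_bytes()
except NotFound:
    function1()  # prints "no file in the folder"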
I'm not sure I understand your final goal, but another approach is to list the available resources in the bucket and process them.
First, let's define a function that will list the available resources in a bucket. You can add a prefix if you want to limit the search to a subfolder inside the bucket.
def list_resource(client, bucket_name, prefix=''):
    path_files = []
    for blob in client.list_blobs(bucket_name, prefix=prefix):
        path_files.append(blob.name)
    return path_files
Now you can process your xlsx files:
for resource in list_resource(storage_client, Constants.GCP_BUCKET_NAME):
    if '.xlsx' in resource:
        print(resource)
        # Load blob and process your xlsx file
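Putting the two pieces together, the processing step could download each listed .xlsx blob and load it into pandas. A sketch; wrapping the bytes in io.BytesIO is my own precaution so that pd.read_excel accepts them across pandas versions:
import io

import pandas as pd

for resource in list_resource(storage_client, Constants.GCP_BUCKET_NAME):
    if resource.endswith('.xlsx'):
        data_bytes = bucket.blob(resource).download_as_bytes()
        df = pd.read_excel(io.BytesIO(data_bytes), engine='openpyxl')
        print(resource, df.shape)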

Download zip file with Django

I'm quite new to Django and I'm looking for a way to download a zip file from my Django site, but I have some issues when running this piece of code:
import os
import sys
from os.path import basename
from zipfile import ZipFile

from django.conf import settings
from django.http import HttpResponse
from django.utils.encoding import smart_str

def download(self):
    dirName = settings.DEBUG_FOLDER
    name = 'test.zip'
    with ZipFile(name, 'w') as zipObj:
        # Iterate over all the files in the directory
        for folderName, subfolders, filenames in os.walk(dirName):
            for filename in filenames:
                # create the complete filepath of the file in the directory
                filePath = os.path.join(folderName, filename)
                # Add file to zip
                zipObj.write(filePath, basename(filePath))
    path_to_file = 'http://' + sys.argv[-1] + '/' + name
    # Grab ZIP file from in-memory, make response with correct MIME-type
    resp = HttpResponse(content_type='application/zip')
    # ..and correct content-disposition
    resp['Content-Disposition'] = 'attachment; filename=%s' % smart_str(name)
    resp['X-Sendfile'] = smart_str(path_to_file)
    return resp
I get:
Exception Value:
<HttpResponse status_code=200, "application/zip"> is not JSON serializable
I tried to change the content_type to octet-stream, but it doesn't work. I also tried to use a wrapper as follows:
wrapper = FileWrapper(open('test.zip', 'rb'))
content_type = 'application/zip'
content_disposition = 'attachment; filename=name'
# Grab ZIP file from in-memory, make response with correct MIME-type
resp = HttpResponse(wrapper, content_type=content_type)
# ..and correct content-disposition
resp['Content-Disposition'] = content_disposition
I didn't find a useful answer so far, but maybe I didn't search well, so if my problem has already been addressed somewhere, feel free to notify me.
Thank you very much for any help.
You have to send the zip file content as bytes. Note that ZipFile.read() needs a member name, so reopen the finished archive instead:
with open(name, 'rb') as f:
    response = HttpResponse(f.read(), content_type="application/zip")
response['Content-Disposition'] = 'attachment; filename=%s' % smart_str(name)
return response
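Alternatively, Django's FileResponse (available since Django 2.0) streams the file and sets the Content-Disposition header for you; a minimal sketch:
from django.http import FileResponse

def download(request):
    # FileResponse takes care of closing the file after streaming it
    return FileResponse(open('test.zip', 'rb'),
                        as_attachment=True, filename='test.zip')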
I would do it like this:
(Caveat: I use WSL, so the Python function will make use of command lines.)
In view:
import os

from django.conf import settings
from django.shortcuts import render

def zipdownfun(request):
    """Please establish in settings.py where media files should be served
    from. In my case it is media, with a series of other folders inside;
    the media folder is at the same level as the project root folder,
    where settings.py is."""
    # let us put the case that you have the zip folder in the media folder
    file_name = os.path.join(settings.MEDIA_URL, 'folder_where_your_file_is', 'file_name.zip')
    file_folder_path = os.path.join(settings.MEDIA_URL, 'saving_folder')
    # The command line takes as first variable the name of the future
    # zip file and as second variable the destination folder
    cmd = f'zip {file_name} {file_folder_path}'
    # With os I open a process in the background so that some magic happens
    os.system(cmd)
    # I placed the URL of the file in a button for the download, so you
    # will need the string of the URL to place in the href of an <a> element
    return render(request, 'your_html_file.html', {'url': file_name})
The db I have created will be updated very often. I used a slightly different version of this function with the -r flag, since I had to zip a whole folder each time. Why did I do this? The database I created has to allow the download of this zipped folder, which will be updated daily, so this function basically overwrites the file each time it is downloaded; it is fresh with new data each time.
Please refer to this page to understand how to create a button for the download of the generated file. Take approach 2 as a reference; the URL variable that you are passing to the Django template should be used in place of the file.
I hope it can help!
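If you'd rather not shell out to zip (which also won't exist on a plain Windows host), the standard library can build the archive. A sketch using shutil.make_archive; the folder names mirror the answer above and are assumptions:
import os
import shutil

from django.conf import settings

def build_zip():
    # Zips everything under saving_folder into file_name.zip
    src = os.path.join(settings.MEDIA_ROOT, 'saving_folder')
    base = os.path.join(settings.MEDIA_ROOT, 'file_name')  # '.zip' is appended
    return shutil.make_archive(base, 'zip', src)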

updating file with Google Drive API with mimeType html causes margin increase

I am trying to download a file using the Google Drive API (in Python), edit some values, save the file locally, and then upload that local file as an update. I am getting the HTML version of the file and uploading with mimeType = "text/html". The edits and uploads work, except that the margins or line spacing on headings (h2 & h3) increase slightly each time the script is run.
I have tried putting the content directly into the local file after the download, without editing it, and the same thing happens (see code below). Has anyone any ideas as to what might be causing this?
KB
from googleapiclient.http import MediaFileUpload

# get the google drive api object
service = get_service()

# search for the file
params = {}
params["q"] = "title = 'Test File'"
values = service.files().list(**params).execute()

# get the file
found_file = values['items'][0]

# download the file content
content = download_file(service, found_file, "text/html")
f = open('temp.html', 'wb')
f.write(content)
f.close()

file = service.files().get(fileId=found_file['id']).execute()

# set the new values
media_body = MediaFileUpload("temp.html", mimetype="text/html",
                             resumable=True)
try:
    # update the file
    updated_file = service.files().update(fileId=found_file['id'], body=file,
                                          media_body=media_body).execute()
except:
    print("error uploading file")
