Generate and upload a file to s3 with flask - python

Is it possible to generate a file and upload it to S3?
I have tried to implement this Flask application, but it dies with this error:
OSError: [Errno 30] Read-only file system: '/report.csv'
This is my app.py:
@app.route("/upload/<string:START_REPORT>/<string:STOP_REPORT>/<string:IDS>/", methods=['POST'])
def upload(START_REPORT, STOP_REPORT, IDS):
    """
    Function to upload a file to an S3 bucket
    """
    os.environ["START_REPORT"] = START_REPORT
    os.environ["STOP_REPORT"] = STOP_REPORT
    os.environ["IDS"] = IDS
    dft, filename = report_visit.report()
    dft.to_csv(filename, index=False)  # <-- Error here
    object_name = filename
    s3_client = boto3.client(
        "s3",
        region_name='eu-west-1',
        endpoint_url="url",
        aws_access_key_id="key",
        aws_secret_access_key="key"
    )
    with open(filename, "rb") as f:
        response = s3_client.upload_fileobj(f, app.config['UPLOAD_FOLDER'], object_name)
    return response

if __name__ == '__main__':
    app.run(debug=True, port='8080', host='127.0.0.1')
where filename is 'report.csv'
This is the POST request I make with Postman:
POST http://localhost:8080/upload/2020_2_10_0_0_0_0/2020_2_10_1_0_0_0/138/
I think it would be hard for me to give a proper working example, but any suggestions are welcome.
The function report_visit.report() returns a pandas DataFrame and the filename.
Any suggestion would be appreciated, thanks
Matteo

You are trying to write into the root directory '/'. So your app would need root permissions (running it with sudo is strongly discouraged for security reasons), or you should write those files to some temporary path instead; it looks like you want "./report.csv", which would create the file in the working directory.
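A minimal sketch of that fix, assuming the same report_visit helper and route as in the question, with a placeholder bucket name: the CSV is written to the system temp directory and uploaded from there.

import os
import tempfile
import boto3

@app.route("/upload/<string:START_REPORT>/<string:STOP_REPORT>/<string:IDS>/", methods=['POST'])
def upload(START_REPORT, STOP_REPORT, IDS):
    dft, filename = report_visit.report()
    # Write into the temp directory (e.g. /tmp/report.csv) instead of '/'.
    local_path = os.path.join(tempfile.gettempdir(), filename)
    dft.to_csv(local_path, index=False)
    s3_client = boto3.client("s3", region_name='eu-west-1')
    with open(local_path, "rb") as f:
        # upload_fileobj takes (fileobj, bucket_name, object_key);
        # "your-bucket-name" is a placeholder.
        s3_client.upload_fileobj(f, "your-bucket-name", filename)
    return "uploaded", 200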

Related

unable to find the path for uploading a file using streamlit python code

I'm writing a simple Python application where the user selects a file from their local file manager and tries to upload it using Streamlit.
I'm able to successfully take the file the user selected using st.file_uploader and store it in a temp directory inside the Streamlit app folder, but the issue is I can't give the path of the file stored in the newly created directory in order to send it to my GCP Cloud Storage bucket.
Adding my snippet below, any help is appreciated :)
import streamlit as st
from google.oauth2 import service_account
from google.cloud import storage
import os
from os import listdir
from os.path import isfile, join
from pathlib import Path
from PIL import Image, ImageOps

bucketName = 'survey-appl-dev-public'
# Create API client.
credentials = service_account.Credentials.from_service_account_info(
    st.secrets["gcp_service_account"]
)
client = storage.Client(credentials=credentials)
# Create a bucket object to get bucket details
bucket = client.get_bucket(bucketName)
file = st.file_uploader("Upload An file")

def main():
    if file is not None:
        file_details = {"FileName": file.name, "FileType": file.type}
        st.write(file_details)
        #img = load_image(image_file)
        #st.image(img, caption='Sunrise by the mountains')
        with open(os.path.join("tempDir", file.name), "wb") as f:
            f.write(file.getbuffer())
            st.success("Saved File")
        object_name_in_gcs_bucket = bucket.blob(".", file.name)
        object_name_in_gcs_bucket.upload_from_filename("tempDir", file.name)

if __name__ == "__main__":
    main()
I've tried building the path of the file with os.getcwd() and also with the os library's path functions, but nothing worked.
Edited:
All I wanted to implement is a file upload where the file is selected by the customer using the file_uploader dropbox. I'm able to save the file into a temporary directory after it is selected, using file.getbuffer() as shown in the code, but I couldn't get it uploaded into the GCS bucket, since it complains that a str cannot be compared to an int when I press the upload button.
Maybe it's a path issue (the code is unable to find the path of the file stored in the temp directory), but I'm unable to figure out how to give the path to the upload function.
The error I'm facing:
TypeError: '>' not supported between instances of 'str' and 'int'
Traceback:
File "/home/raviteja/.local/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 564, in _run_script
    exec(code, module.__dict__)
File "/home/raviteja/test/streamlit/test.py", line 43, in <module>
    main()
File "/home/raviteja/test/streamlit/test.py", line 29, in main
    object_name_in_gcs_bucket = bucket.blob(".",file.name)
File "/home/raviteja/.local/lib/python3.10/site-packages/google/cloud/storage/bucket.py", line 795, in blob
    return Blob(
File "/home/raviteja/.local/lib/python3.10/site-packages/google/cloud/storage/blob.py", line 219, in __init__
    self.chunk_size = chunk_size  # Check that setter accepts value.
File "/home/raviteja/.local/lib/python3.10/site-packages/google/cloud/storage/blob.py", line 262, in chunk_size
    if value is not None and value > 0 and value % self._CHUNK_SIZE_MULTIPLE != 0:
Thanks all for the responses. After days of struggle, at last I've figured out the mistake I was making.
I don't know if I'm right or wrong, so correct me if I'm wrong, but this worked for me:
object_name_in_gcs_bucket = bucket.blob("path-to-upload" + file.name)
Changing the , to + between the file path and the filename solved my issue: with the comma, file.name was being passed as blob()'s second positional argument, chunk_size, which is why the str/int comparison in the traceback failed.
Sorry for the small issue. Happy that I could solve it.
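As a side note (not part of the original answer): GCS blob names are plain strings with '/' separators, so an explicit separator may be safer than bare concatenation. A hedged variant, with a hypothetical destination prefix:

# "path-to-upload" is a hypothetical prefix; blob names use '/' separators,
# not os.path separators.
object_name_in_gcs_bucket = bucket.blob(f"path-to-upload/{file.name}")
object_name_in_gcs_bucket.upload_from_filename(os.path.join("tempDir", file.name))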
You have some variables in your code, and I guess you know what they represent. Try this out; otherwise, make sure you add all relevant information to the question and the code snippet.
def main():
    file = st.file_uploader("Upload file")
    if file is not None:
        file_details = {"FileName": file.name, "FileType": file.type}
        st.write(file_details)
        file_path = os.path.join("tempDir/", file.name)
        with open(file_path, "wb") as f:
            f.write(file.getbuffer())
            st.success("Saved File")
        print(file_path)

        def upload():
            file_name = file_path
            read_file(file_name)
            st.write(file_name)
            st.session_state["upload_state"] = "Saved successfully!"
            object_name_in_gcs_bucket = bucket.blob("gcp-bucket-destination-path" + file.name)
            object_name_in_gcs_bucket.upload_from_filename(file_path)
            st.write("Youre uploading to bucket", bucketName)

        st.button("Upload file to GoogleCloud", on_click=upload)

if __name__ == "__main__":
    main()
This one works for me.
Solution 1
import streamlit as st
from google.oauth2 import service_account
from google.cloud import storage
import os

STREAMLIT_SCRIPT_FILE_PATH = os.path.dirname(os.path.abspath(__file__))

credentials = service_account.Credentials.from_service_account_info(
    st.secrets["gcp_service_account"]
)
client = storage.Client(credentials=credentials)

def main():
    bucketName = 'survey-appl-dev-public'
    file = st.file_uploader("Upload file")
    if file is not None:
        file_details = {"FileName": file.name, "FileType": file.type}
        st.write(file_details)
        with open(os.path.join("tempDir", file.name), "wb") as f:
            f.write(file.getbuffer())
            st.success("Saved File")
        bucket = client.bucket(bucketName)
        object_name_in_gcs_bucket = bucket.blob(file.name)
        # src_relative = f'./tempDir/{file.name}'  # also works
        src_absolute = f'{STREAMLIT_SCRIPT_FILE_PATH}/tempDir/{file.name}'
        object_name_in_gcs_bucket.upload_from_filename(src_absolute)

if __name__ == '__main__':
    main()
Solution 2
Instead of saving the file to disk, use the file bytes directly using upload_from_string().
References:
Google Cloud upload_from_string
Streamlit file uploader
credentials = service_account.Credentials.from_service_account_info(
    st.secrets["gcp_service_account"]
)
client = storage.Client(credentials=credentials)

def gcs_upload_data():
    bucket_name = 'your_gcs_bucket_name'
    file = st.file_uploader("Upload file")
    if file is not None:
        fname = file.name
        ftype = file.type
        file_details = {"FileName": fname, "FileType": ftype}
        st.write(file_details)
        # Define gcs bucket.
        bucket = client.bucket(bucket_name)
        bblob = bucket.blob(fname)
        # Upload the bytes directly instead of a disk file.
        bblob.upload_from_string(file.getvalue(), ftype)

if __name__ == '__main__':
    gcs_upload_data()

Elaborate and store inside an azure blob multiple files from a form-data request by using azure functions

I am developing an Azure Function that receives as input several files of different formats (e.g. xlsx, csv, txt, pdf, png) through the form-data format. The idea is to develop a function that can take the files and store them one by one inside a blob. At the moment, my code is as follows:
def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    filename, contents = False, False
    try:
        files = req.files.values()
        for file in files:
            filename = str(file.filename)
            logging.info(type(file.stream.read()))
            contents = file.stream.read().decode('utf-8')
    except Exception as ex:
        logging.error(str(type(ex)) + ': ' + str(ex))
        return func.HttpResponse(body=str(ex), status_code=400)
Then I write the contents variable into the blob, but the files inside the blob have size 0, and if I try to download them, the files are empty. How can I manage this operation to store files of different formats inside a blob? Thanks a lot for your support!
Below is the code to upload multiple files of different formats:
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient, PublicAccess
import os

def UploadFiles():
    CONNECTION_STRING = "ENTER_STORAGE_CONNECTION_STRING"
    Container_name = "uploadcontainer"
    service_client = BlobServiceClient.from_connection_string(CONNECTION_STRING)
    container_client = service_client.get_container_client(Container_name)
    ReplacePath = "C:\\"
    local_path = "C:\\Testupload"  # the local folder
    for r, d, f in os.walk(local_path):
        if f:
            for file in f:
                AzurePath = os.path.join(r, file).replace(ReplacePath, "")
                LocalPath = os.path.join(r, file)
                blob_client = container_client.get_blob_client(AzurePath)
                with open(LocalPath, 'rb') as data:
                    blob_client.upload_blob(data)

if __name__ == '__main__':
    UploadFiles()
    print("Files Copied")
From your question I am not able to tell how your function is triggered or where you are uploading your files from.
So, following your logic, you can use the above piece of code to upload all types of files.
Currently the above code uploads all the files in a local folder. (The original answer included a screenshot of a repro run.)
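For completeness, the empty blobs in the original snippet most likely come from calling file.stream.read() twice: the first call (inside logging.info) exhausts the stream, so the second returns empty bytes. A minimal sketch of a function-side fix, assuming the same placeholder connection string and container name as above:

import logging
import azure.functions as func
from azure.storage.blob import BlobServiceClient

def main(req: func.HttpRequest) -> func.HttpResponse:
    service_client = BlobServiceClient.from_connection_string("ENTER_STORAGE_CONNECTION_STRING")
    container_client = service_client.get_container_client("uploadcontainer")
    try:
        for file in req.files.values():
            # Read the stream once and reuse the bytes; a second read()
            # on an exhausted stream returns b'', which is why the
            # uploaded blobs had size 0.
            contents = file.stream.read()
            blob_client = container_client.get_blob_client(str(file.filename))
            blob_client.upload_blob(contents, overwrite=True)
    except Exception as ex:
        logging.error(str(type(ex)) + ': ' + str(ex))
        return func.HttpResponse(body=str(ex), status_code=400)
    return func.HttpResponse("Files stored", status_code=200)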

read google bucket files using python

I have to read Google bucket files which are in xlsx format.
The file structure in the bucket looks like:
bucket_name
    folder_name_1
        file_name_1
    folder_name_2
    folder_name_3
        file_name_3
The Python snippet looks like:
def main():
    storage_client = storage.Client.from_service_account_json(
        Constants.GCP_CRENDENTIALS)
    bucket = storage_client.bucket(Constants.GCP_BUCKET_NAME)
    blob = bucket.blob(folder_name_2 + '/' + Constants.GCP_FILE_NAME)
    data_bytes = blob.download_as_bytes()
    df = pd.read_excel(data_bytes, engine='openpyxl')
    print(df)

def function1():
    print("no file in the folder")  # sample error
In the above snippet, I'm trying to open folder_name_2; it returns an error because there's no file to read.
Instead of letting it throw an error, I need to use function1 to print the error whenever there's no file in a folder.
Any ideas on how to do this?
I'm not familiar with the GCP API, but you're going to want to do something along the lines of this:
try:
    blob = bucket.blob(folder_name_2 + '/' + Constants.GCP_FILE_NAME)
    data_bytes = blob.download_as_bytes()
except Exception as e:
    print(e)
https://docs.python.org/3/tutorial/errors.html#handling-exceptions
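To route the failure into function1 as the question asks, the except block can simply call it. A short sketch under the same names as the question (download_as_bytes raises google.cloud.exceptions.NotFound when the blob does not exist):

from google.cloud.exceptions import NotFound

try:
    blob = bucket.blob(folder_name_2 + '/' + Constants.GCP_FILE_NAME)
    data_bytes = blob.download_as_bytes()
    df = pd.read_excel(data_bytes, engine='openpyxl')
    print(df)
except NotFound:
    function1()  # prints "no file in the folder"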
I'm not sure I understand what your final goal is, but another approach is to list the available resources in the bucket and process them.
First, let's define a function that lists the available resources in a bucket. You can add a prefix if you want to limit the search to a subfolder inside the bucket.
def list_resource(client, bucket_name, prefix=''):
    path_files = []
    for blob in client.list_blobs(bucket_name, prefix=prefix):
        path_files.append(blob.name)
    return path_files
Now you can process your xlsx files:
for resource in list_resource(storage_client, Constants.GCP_BUCKET_NAME):
    if '.xlsx' in resource:
        print(resource)
        # Load blob and process your xlsx file
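A possible completion of that loop, assuming the same storage_client and Constants as in the question: download each matching blob and load it with pandas, as in the original snippet.

import pandas as pd

bucket = storage_client.bucket(Constants.GCP_BUCKET_NAME)
for resource in list_resource(storage_client, Constants.GCP_BUCKET_NAME):
    if resource.endswith('.xlsx'):
        # Download the blob's bytes and parse them as a spreadsheet.
        data_bytes = bucket.blob(resource).download_as_bytes()
        df = pd.read_excel(data_bytes, engine='openpyxl')
        print(resource, df.shape)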

Download zip file with Django

I'm quite new to Django and I'm looking for a way to download a zip file from my Django site, but I have an issue when running this piece of code:
def download(self):
    dirName = settings.DEBUG_FOLDER
    name = 'test.zip'
    with ZipFile(name, 'w') as zipObj:
        # Iterate over all the files in directory
        for folderName, subfolders, filenames in os.walk(dirName):
            for filename in filenames:
                # create complete filepath of file in directory
                filePath = os.path.join(folderName, filename)
                # Add file to zip
                zipObj.write(filePath, basename(filePath))
    path_to_file = 'http://' + sys.argv[-1] + '/' + name
    resp = {}
    # Grab ZIP file from in-memory, make response with correct MIME-type
    resp = HttpResponse(content_type='application/zip')
    # ..and correct content-disposition
    resp['Content-Disposition'] = 'attachment; filename=%s' % smart_str(name)
    resp['X-Sendfile'] = smart_str(path_to_file)
    return resp
I get:
Exception Value:
<HttpResponse status_code=200, "application/zip"> is not JSON serializable
I tried to change the content_type to octet-stream, but it doesn't work.
I also tried to use a wrapper as follows:
wrapper = FileWrapper(open('test.zip', 'rb'))
content_type = 'application/zip'
content_disposition = 'attachment; filename=name'
# Grab ZIP file from in-memory, make response with correct MIME-type
resp = HttpResponse(wrapper, content_type=content_type)
# ..and correct content-disposition
resp['Content-Disposition'] = content_disposition
I didn't find a useful answer so far, but maybe I didn't search well; if my problem has already been treated elsewhere, feel free to point me to it.
Thank you very much for any help.
You have to send the zip file as bytes. Note that ZipFile has no zero-argument read() method, so reopen the archive file in binary mode and send its contents:
with open(name, 'rb') as archive:
    response = HttpResponse(archive.read(), content_type="application/zip")
response['Content-Disposition'] = 'attachment; filename=%s' % smart_str(name)
return response
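Alternatively, a minimal sketch (an assumption, not from the original answers) that builds the archive in a BytesIO buffer, so nothing is written to disk and the bytes can be returned directly:

import io
import os
from os.path import basename
from zipfile import ZipFile

from django.conf import settings
from django.http import HttpResponse

def download(self):
    buffer = io.BytesIO()
    with ZipFile(buffer, 'w') as zipObj:
        # Walk the folder and add every file to the in-memory archive.
        for folderName, subfolders, filenames in os.walk(settings.DEBUG_FOLDER):
            for filename in filenames:
                filePath = os.path.join(folderName, filename)
                zipObj.write(filePath, basename(filePath))
    buffer.seek(0)
    resp = HttpResponse(buffer.getvalue(), content_type='application/zip')
    resp['Content-Disposition'] = 'attachment; filename=test.zip'
    return resp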
I would do it like this:
(Caveat: I use WSL, so the Python function makes use of command lines.)
In the view:
import os

def zipdownfun(request):
    """Please establish in settings.py where media files should be
    downloaded from. In my case it is media, with a series of other
    folders inside. The media folder is at the same level as the
    project root folder, where settings.py is."""
    file_name = os.path.join(MEDIA_URL, 'folder_where_your_file_is', 'file_name.zip')
    """Let us suppose that you have the zip folder in the media folder."""
    file_folder_path = os.path.join(MEDIA_URL, 'saving_folder')
    """The command line takes as first variable the name of the
    future zip file and as second variable the destination folder."""
    cmd = f'zip {file_name} {file_folder_path}'
    """With os I open a process in the background so that some magic
    happens."""
    os.system(cmd)
    """I don't know what you want to do with this, but I placed the
    URL of the file in a button for the download, so you will need
    the string of the URL to place in the href of an <a> element."""
    return render(request, 'your_html_file.html', {'url': file_name})
The DB I have created will be updated very often, so I used a slightly different version of this function, with the -r flag, since each time I had to zip a folder. Why did I do this? The database has to allow the download of this zipped folder, and the folder is updated daily, so this function basically overwrites the file each time it is downloaded; it is fresh with new data each time.
Please refer to this page to understand how to create a button for the download of the generated file.
Take approach 2 as reference: the URL variable that you are passing to the Django template should be used in place of the file. (The original answer included a screenshot.)
I hope it can help!

How to get file name from uploads via put request

I am developing an API using Flask-RESTPlus. One of the endpoints handles audio file uploads, which can be in either mp3 or wav format. According to "PUT request to upload a file not working in Flask", a file uploaded via PUT is in either request.data or request.stream. So this is what I did:
@ns.route('/upload')
class AudioUpload(Resource):
    def put(self):
        now = datetime.now()
        filename = now.strftime("%Y%m%d_%H%M%S") + ".mp3"
        cwd = os.getcwd()
        filepath = os.path.join(cwd, filename)
        with open(filepath, 'wb') as f:
            f.write(request.stream.read())
        return filepath
I am saving the file as mp3. However, sometimes the file comes in as wav. Is there a way to get the original file name from a PUT request, similar to how it's done with a POST request:
file = request.files['file']
filename = file.filename
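This question has no answer in the thread as captured here. One common pattern (an assumption, not from the original thread) is to have the client send the filename explicitly, since a raw PUT body carries no filename of its own. A minimal sketch with a hypothetical X-Filename header:

from werkzeug.utils import secure_filename

@ns.route('/upload')
class AudioUpload(Resource):
    def put(self):
        # 'X-Filename' is a hypothetical client-supplied header; fall back
        # to a timestamped mp3 name when it is absent.
        raw_name = request.headers.get('X-Filename')
        if raw_name:
            filename = secure_filename(raw_name)
        else:
            filename = datetime.now().strftime("%Y%m%d_%H%M%S") + ".mp3"
        filepath = os.path.join(os.getcwd(), filename)
        with open(filepath, 'wb') as f:
            f.write(request.stream.read())
        return filepath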
