how to save file as zip without saving it to local folder - python

I'm trying to create a download function for my Streamlit app. What I currently have lets me download a zip file via a button in the app, but it also saves the zip to my local folder, which I don't want. The problem is where I initialize the file_zip object. I want the zip file to have a specific name, ideally the name of the file the user uploaded with a '.zip' extension (i.e. datafile, which is passed to the function and carries the file name as a string). But every time I do that, the zip file gets saved in my local folder. Is there an alternative? BTW, I'm trying to save a list of pandas DataFrames into one zip file.
def downloader(list_df, datafile, file_type):
    file = datafile.name.split(".")[0]
    #create zip file
    with zipfile.ZipFile("{}.zip".format(file), 'w', zipfile.ZIP_DEFLATED) as file_zip:
        for i in range(len(list_df)):
            file_zip.writestr(file+"_group_{}".format(i)+".csv", pd.DataFrame(list_df[i]).to_csv())
        file_zip.close()
    #pass it to front end for download
    zip_name = "{}.zip".format(file)
    with open(zip_name, "rb") as f:
        bytes=f.read()
        b64 = base64.b64encode(bytes).decode()
        href = f'<a href="data:application/zip;base64,{b64}" download="{zip_name}">Click Here To Download</a>'
        st.markdown(href, unsafe_allow_html=True)

It sounds like you want to create the zip file in memory and use it later to build a base64 encoding. You can use an io.BytesIO() object with ZipFile, rewind it, and read the data back for base64 encoding.
import base64
import io
import zipfile

import pandas as pd
import streamlit as st

def downloader(list_df, datafile, file_type):
    file = datafile.name.split(".")[0]
    # create the zip file in an in-memory buffer instead of on disk
    zip_buf = io.BytesIO()
    with zipfile.ZipFile(zip_buf, 'w', zipfile.ZIP_DEFLATED) as file_zip:
        for i in range(len(list_df)):
            file_zip.writestr(file+"_group_{}".format(i)+".csv", pd.DataFrame(list_df[i]).to_csv())
    # rewind so the buffer can be read from the start
    zip_buf.seek(0)
    # pass it to the front end for download
    zip_name = "{}.zip".format(file)
    b64 = base64.b64encode(zip_buf.read()).decode()
    del zip_buf
    href = f'<a href="data:application/zip;base64,{b64}" download="{zip_name}">Click Here To download</a>'
    st.markdown(href, unsafe_allow_html=True)
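The in-memory approach can be tried on its own, without Streamlit or pandas. The following is a minimal sketch using only the standard library; the helper name build_zip_bytes and the sample member names are assumptions made for illustration, not part of the original code.

```python
import io
import zipfile


def build_zip_bytes(named_texts):
    """Return the bytes of a zip archive built entirely in memory.

    named_texts maps archive member names to their text content.
    (Hypothetical helper, named here for illustration only.)
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, text in named_texts.items():
            zf.writestr(name, text)
    # getvalue() returns the whole buffer regardless of the current
    # position, so no seek(0) is needed here.
    return buf.getvalue()


zip_bytes = build_zip_bytes({
    "data_group_0.csv": "a,b\n1,2\n",
    "data_group_1.csv": "a,b\n3,4\n",
})
```

Nothing is written to disk at any point. In a Streamlit app, bytes like these could probably also be handed to st.download_button(label=..., data=zip_bytes, file_name=zip_name, mime="application/zip") instead of building a base64 link by hand.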

Related

How to read contents of zip file in memory on a file upload in python?

I have a zip file that I receive when the user uploads a file. The zip contains a single JSON file, which I want to read and process without first writing the zip to disk, unzipping it, and then reading the inner file.
Currently I only know the longer process, which looks something like this:
import json
import zipfile

@csrf_exempt
def get_zip(request):
    if request.method == "POST":
        try:
            client_file = request.FILES['file']
            file_path = "/some/path/"
            # first dump the zip file to a directory
            with open(file_path + '%s' % client_file.name, 'wb+') as dest:
                for chunk in client_file.chunks():
                    dest.write(chunk)
            # unzip the zip file to the same directory
            with zipfile.ZipFile(file_path + client_file.name, 'r') as zip_ref:
                zip_ref.extractall(file_path)
            # at this point we get a json file from the zip say `test.json`
            # read the json file content
            with open(file_path + "test.json", "r") as fo:
                json_content = json.load(fo)
            doSomething(json_content)
            return HttpResponse(0)
        except Exception as e:
            return HttpResponse(1)
As you can see, this involves three steps to finally get the content of the zip file into memory. What I want is to read the content of the zip file and load it directly into memory.
I did find some similar questions on Stack Overflow, like this one: https://stackoverflow.com/a/2463819 . But I am not sure at what point I should invoke the operation mentioned in that post.
How can I achieve this?
Note: I am using Django in the backend.
There will always be exactly one JSON file in the zip.
From what I understand, what @jason is trying to say is to first open a ZipFile, just as you have done here: with zipfile.ZipFile(file_path + client_file.name, 'r') as zip_ref:.
class zipfile.ZipFile(file[, mode[, compression[, allowZip64]]])
Open a ZIP file, where file can be either a path to a file (a string) or a file-like object.
Then use io.BytesIO to wrap the file's bytes in a file-like object. Note that the 'r' you pass to ZipFile above is the archive mode; the underlying file itself must be opened in binary mode ('rb'), as follows.
import io
import zipfile

with open(filename, 'rb') as file_data:
    bytes_content = file_data.read()
file_like_object = io.BytesIO(bytes_content)
zipfile_ob = zipfile.ZipFile(file_like_object)
Now zipfile_ob can be accessed from memory.
The first argument to zipfile.ZipFile() can be a file object rather than a pathname. I think the Django UploadedFile object supports this use, so you can read directly from that rather than having to copy into a file.
You can also open the file directly from the zip archive rather than extracting that into a file.
import json
import zipfile

from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt

@csrf_exempt
def get_zip(request):
    if request.method == "POST":
        try:
            client_file = request.FILES['file']
            # open the uploaded zip directly, without writing it to disk
            with zipfile.ZipFile(client_file, 'r') as zip_ref:
                # the archive always holds a single JSON file, so take the first entry
                first = zip_ref.infolist()[0]
                with zip_ref.open(first, "r") as fo:
                    json_content = json.load(fo)
            doSomething(json_content)
            return HttpResponse(0)
        except Exception as e:
            return HttpResponse(1)
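The same idea can be checked outside Django. This is a minimal sketch, assuming the upload behaves like a seekable binary file object; an io.BytesIO stands in for the uploaded file, and load_first_json is a hypothetical helper name.

```python
import io
import json
import zipfile


def load_first_json(zip_file):
    """Parse the first member of a zip archive as JSON without extracting it.

    zip_file is any seekable binary file-like object (a stand-in here for
    an uploaded file; the helper name is hypothetical).
    """
    with zipfile.ZipFile(zip_file, "r") as archive:
        first = archive.infolist()[0]
        with archive.open(first) as member:
            return json.load(member)


# Build a small archive in memory to play the role of the upload.
upload = io.BytesIO()
with zipfile.ZipFile(upload, "w") as archive:
    archive.writestr("test.json", json.dumps({"status": "ok", "items": [1, 2, 3]}))
upload.seek(0)

content = load_first_json(upload)
```

Neither the archive nor the JSON file touches the file system; the member is read straight out of the in-memory zip.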

how to save the json file as csv file by the name of the json file with .csv extension

I have a set of 200 JSON files in a folder. I have written code that takes each file from the folder, converts it to a DataFrame, does the necessary processing, and finally saves the DataFrame as a CSV file. The problem is the saving step: I want each CSV file to have the same name as the JSON file it came from.
Since I am taking the folder and processing the files one by one, how can I do that?
I tried this:
df.to_csv(filename)
but then I have to supply the filename myself.
Assuming you are not opening each file by typing out its name by hand, as in:
with open('whatever.json', 'rb') as file:
and are instead using something like glob, I would do something like this:
import os
# File is whatever variable name you have assigned to the opened JSON file
filename = os.path.basename(File.name)
# splitext drops only the final extension, so names containing extra dots survive
filename = os.path.splitext(filename)[0]
filename += '.csv'
As requested:
with open(filename, 'w') as file:
    file.write(csv_data)  # csv_data: your CSV text
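For a whole folder, the name derivation can be done with pathlib. The sketch below is one possible way to pair every JSON file with its CSV target; the helper names csv_path_for and plan_conversions and the sample file names are assumptions for illustration.

```python
from pathlib import Path


def csv_path_for(json_path):
    """Return the path of the CSV that corresponds to a JSON file.

    Only the final extension is swapped, so the folder and the rest of
    the name are preserved. (Hypothetical helper name.)
    """
    return Path(json_path).with_suffix(".csv")


def plan_conversions(folder):
    """List (json_path, csv_path) pairs for every .json file in folder."""
    return [(p, csv_path_for(p)) for p in sorted(Path(folder).glob("*.json"))]


example = csv_path_for("data/report.json")
dotted = csv_path_for("sales.2019.json")
```

Inside the processing loop, each csv_path from plan_conversions could then be passed straight to df.to_csv(csv_path).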

Extract particular file from zip blob stored in azure container with python using Jupyter notebook

I have uploaded a zip file to my Azure account as a blob in an Azure container.
The zip file contains .csv and .ascii files, among many other formats.
I need to read a specific file from the zip, say the data in an .ascii file. I am using Python for this.
How can I read a particular file's data from this zip without downloading it locally? I would like to handle this in memory only.
I am also trying this in the Jupyter notebook that Azure provides for ML functionality.
I am using Python's zipfile module (ZipFile) for this.
Please help me read the file.
Here is my code snippet so far:
blob_service = BlockBlobService(account_name=ACCOUNT_NAME, account_key=ACCOUNT_KEY)
blob_list = blob_service.list_blobs(CONTAINER_NAME)
allBlobs = []
for blob in blob_list:
    allBlobs.append(blob.name)
sampleZipFile = allBlobs[0]
print(sampleZipFile)
The below code should work. This example accesses an Azure Container using an Account URL and Key combination.
from azure.storage.blob import BlobServiceClient
from io import BytesIO
from zipfile import ZipFile

key = r'my_key'
service = BlobServiceClient(account_url="my_account_url",
                            credential=key)
container_client = service.get_container_client('container_name')
zipfilename = 'myzipfile.zip'
# download the whole blob into memory
blob_data = container_client.download_blob(zipfilename)
# (recent azure-storage-blob releases name this method readall())
blob_bytes = blob_data.content_as_bytes()
# wrap the bytes so ZipFile can treat them like a file
inmem = BytesIO(blob_bytes)
myzip = ZipFile(inmem)
otherfilename = 'mycontainedfile.csv'
# read one member of the archive into its own in-memory file
filetoread = BytesIO(myzip.read(otherfilename))
Now all you have to do is pass filetoread into whatever method you would normally use to read a local file (e.g. pandas.read_csv()).
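The zip-handling part can be exercised without an Azure account. In this minimal sketch, a zip built in memory stands in for the downloaded blob bytes, and the member is picked by its extension rather than by a known name; read_member_with_suffix and the sample file names are hypothetical.

```python
import io
import zipfile


def read_member_with_suffix(zip_bytes, suffix):
    """Return (name, data) for the first archive member ending with suffix.

    zip_bytes is the raw content of a zip file already held in memory,
    for example bytes downloaded from blob storage. (Hypothetical helper.)
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(suffix):
                return name, archive.read(name)
    raise KeyError(f"no member ending with {suffix!r}")


# Stand-in for the downloaded blob: a small zip built in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr("readings.csv", "t,v\n0,1\n")
    archive.writestr("notes.ascii", "hello from the archive\n")
blob_bytes = buf.getvalue()

name, data = read_member_with_suffix(blob_bytes, ".ascii")
```

The returned data is a bytes object; wrap it in io.BytesIO if the consumer expects a file-like object.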
You could use the code below to read a file inside a .zip archive without extracting it in Python:
import zipfile
archive = zipfile.ZipFile('images.zip', 'r')
imgdata = archive.read('img_01.png')
For details, you can refer to the ZipFile docs.
Alternatively, you can do something like this:
# -*- coding: utf-8 -*-
"""
Created on Mon Apr 1 11:14:56 2019
@author: moverm
"""
import zipfile
# raw string so the backslashes in the Windows path are taken literally
zfile = zipfile.ZipFile(r'C:\LAB\Pyt\sample.zip')
for finfo in zfile.infolist():
    ifile = zfile.open(finfo)
    line_list = ifile.readlines()
    print(line_list)
Hope it helps.

File upload at web2py

I am using the web2py framework.
I have uploaded a txt file via SQLFORM, and the file is stored in the "upload folder". Now I need to read this txt file from the controller. What file path should I use in the function defined in default.py?
def readthefile(uploaded_file):
    file = open(uploaded_file, "rb")
    file.read()
    ....
You can join the application directory and the uploads folder to build the path to the file.
Do something like this:
import os
filepath = os.path.join(request.folder, 'uploads', uploaded_file_name)
file = open(filepath, "rb")
request.folder: the application directory. For example if the
application is "welcome", request.folder is set to the absolute path
"/path/to/welcome". In your programs, you should always use this
variable and the os.path.join function to build paths to the files you
need to access.
See the request.folder entry in the web2py documentation.
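The path-building step can be tried outside web2py. In this rough sketch, a temporary directory stands in for request.folder, and read_uploaded_file and notes.txt are hypothetical names used only for the demonstration.

```python
import os
import tempfile


def read_uploaded_file(app_folder, stored_filename):
    """Read a file from the 'uploads' folder under an application directory.

    app_folder plays the role of web2py's request.folder here.
    (Hypothetical helper name.)
    """
    filepath = os.path.join(app_folder, "uploads", stored_filename)
    # The with-block closes the file even if reading fails.
    with open(filepath, "rb") as f:
        return f.read()


# Demo: a temporary directory stands in for the application folder.
app_folder = tempfile.mkdtemp()
os.makedirs(os.path.join(app_folder, "uploads"))
with open(os.path.join(app_folder, "uploads", "notes.txt"), "wb") as f:
    f.write(b"hello\n")

content = read_uploaded_file(app_folder, "notes.txt")
```

In a real controller, app_folder would be request.folder and stored_filename would be the stored name kept in the upload field.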
The transformed name of the uploaded file is stored in the upload field of your database table, so you need a way to query the specific record that was inserted via the SQLFORM submission in order to get the name of the stored file. Here is how it would look assuming you know the record ID:
stored_filename = db.mytable(record_id).my_upload_field
original_filename, stream = db.mytable.my_upload_field.retrieve(stored_filename)
stream.read()
When you pass a filename to the .retrieve method of an upload field, it will return a tuple containing the original filename as well as the open file object (called stream in the code above).

Fetch a remote zip file and list the files within

I'm working on a small Google App Engine project in which I need to fetch a remote zip file from a URL and then list the files contained in the zip archive.
I'm using the zipfile module.
Here's what I've come up with so far:
# fetch the zip file from its remote URL
result = urlfetch.fetch(zip_url)
# store the contents in a stream
file_stream = StringIO.StringIO(result.content)
# create the ZipFile object
zip_file = zipfile.ZipFile(file_stream, 'w')
# read the files by name
archive_files = zip_file.namelist()
Unfortunately the archive_files list is always of length 0.
Any ideas what I'm doing wrong?
You are opening the archive in 'w' (write) mode, which starts a new, empty archive. Change it to 'r' mode for reading:
zip_file = zipfile.ZipFile(file_stream, 'r')
Reference: http://docs.python.org/library/zipfile.html#zipfile-objects
You're opening the ZipFile for writing. Try reading instead.
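The difference between the two modes is easy to reproduce locally. The sketch below uses Python 3 and io.BytesIO in place of the App Engine fetch and StringIO from the question; the member names are made up for the demonstration.

```python
import io
import zipfile

# Build a small archive in memory to stand in for the fetched content.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr("a.txt", "alpha")
    archive.writestr("b.txt", "beta")
zip_bytes = buf.getvalue()

# Read mode parses the existing archive and sees its members.
with zipfile.ZipFile(io.BytesIO(zip_bytes), "r") as archive:
    names_read_mode = archive.namelist()

# Write mode starts a fresh archive, so no existing members are listed.
with zipfile.ZipFile(io.BytesIO(zip_bytes), "w") as archive:
    names_write_mode = archive.namelist()
```

Here names_read_mode holds the two member names, while names_write_mode is an empty list, which matches the length-0 result described in the question.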
