How can I open a file that is inside a zip file? - Python

I want to open an HTML file that is inside a zip file (both have the same base name). This is what I am trying:
old_file = input("DRAG:") #dir C:\Users\GG\PycharmProjects\pythonProject\f1dbef77-342b-4026-85d8-7f30fe691a63_f.zip
file_parts = old_file.split(".") #[C:\Users\GG\PycharmProjects\pythonProject\f1dbef77-342b-4026-85d8-7f30fe691a63_f] [zip]
first= file_parts[0]
direcs = first.split("\\")
file_itself = direcs[-1] # the file name that i need to use
last = file_parts[1]
file = open(f'{first}.zip\\{file_itself}.html', encoding="UTF-8").read()

open() cannot follow a path into a zip archive. First extract the archive to a temporary folder, open the file from there, and delete the folder once you are done.
You can use Python's zipfile module: ZipFile.extract() unpacks a single member, such as your HTML file.
See the ZipFile docs.
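The steps described above could look roughly like the sketch below. It assumes the HTML file sits at the top level of the archive and shares the archive's base name, as in the question; the function name read_html_from_zip is made up for illustration.

```python
import os
import tempfile
import zipfile


def read_html_from_zip(zip_path, encoding="utf-8"):
    """Return the text of '<name>.html' stored inside '<name>.zip'."""
    # os.path.splitext is safer than split(".") because folder names
    # in the path may contain dots as well.
    base = os.path.splitext(os.path.basename(zip_path))[0]
    member = base + ".html"  # assumption: same base name as the archive
    with zipfile.ZipFile(zip_path) as archive:
        # The temporary folder is deleted automatically on exit.
        with tempfile.TemporaryDirectory() as tmp_dir:
            extracted = archive.extract(member, path=tmp_dir)
            with open(extracted, encoding=encoding) as handle:
                return handle.read()
```

If the file does not need to exist on disk at all, archive.read(member).decode(encoding) returns the same text without a temporary folder.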

Related

How to unzip a file on flask/Python

Hi, is it possible for a user to upload a file using Flask?
The user would select it from their computer and click submit. The file would be saved to a zip folder on the web server (localhost) and unzipped, and then I would search for a certain file within the unzipped directory.
I have the upload form working, but I can't figure out how to unzip the file and save its contents in a folder.
This may work for your problem.
You used this first:
response = make_response(log_file.text)
and then this:
handle.write(response.content)
.content is "Content of the response, in bytes." and .text is "Content of the response, in unicode."
A zip archive is binary data, so if you want a byte stream, use .content.
For more detail, see:
Can't unzip file retrieved with Requests in Flask app
or:
Unzipping files in Python
One of them should work.
This is a pure Python way to unzip a file:
import zipfile
with zipfile.ZipFile(path_to_zip_file, 'r') as zip_ref:
    zip_ref.extractall(directory_to_extract_to)
Then you can use it in your Flask app:
import zipfile
from flask import Flask, request
app = Flask(__name__)
@app.route("/", methods=["POST"])
def page_name_post():
    file = request.files['data_zip_file']
    # Note: _file is a private attribute of the upload's temporary file object
    file_like_object = file.stream._file
    zipfile_ob = zipfile.ZipFile(file_like_object)
    file_names = zipfile_ob.namelist()
    # Filter names to only include the filetype that you want:
    file_names = [file_name for file_name in file_names if file_name.endswith(".txt")]
    # Read each matching member into memory as (bytes, name)
    files = [(zipfile_ob.open(name).read(), name) for name in file_names]
    return str(files)
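For the "save its contents in a folder and search for a certain file" part of the question, a minimal sketch could look like this. It takes the uploaded archive as bytes so that it does not depend on Flask itself; the helper name extract_and_find and its parameters are assumptions made for illustration.

```python
import io
import os
import zipfile


def extract_and_find(zip_bytes, dest_dir, wanted_name):
    """Extract a zip (given as bytes) into dest_dir and return the paths
    of extracted files whose file name equals wanted_name."""
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(dest_dir)
        members = archive.namelist()
    # Member names inside a zip always use "/" as the separator.
    return [
        os.path.join(dest_dir, *name.split("/"))
        for name in members
        if not name.endswith("/") and name.rsplit("/", 1)[-1] == wanted_name
    ]
```

Inside a Flask view this might be called as extract_and_find(request.files['data_zip_file'].read(), "uploads/unzipped", "index.html"), where the folder and file name are placeholders.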

How to temporarily re-name a file or Create a re-named temp-file in Python before zipping it

In the code below I am trying to zip a list of files, renaming each file before zipping it so that the file name is in a more readable format for the user.
It works the first time, but when I run it again it fails with an error saying the file name already exists.
I return the response through Django REST Framework using FileResponse.
Is there a simpler way to achieve this?
filenames_list=['10_TEST_Comments_12/03/2021','10_TEST_Posts_04/10/2020','10_TEST_Likes_04/09/2020']
with zipfile.ZipFile(fr"reports/downloads/reports.zip", 'w') as zipF:
    for file in filenames_list:
        friendly_name = get_friendly_name(file)
        if friendly_name is not None:
            os.rename(file,fr"/reports/downloads/{friendly_name}")
            file = friendly_name
        zipF.write(fr"reports/downloads/{file}", file, compress_type=zipfile.ZIP_DEFLATED)
zip_file = open(fr"reports/downloads/reports.zip", 'rb')
response = FileResponse(zip_file)
return response
ZipFile.write has a second parameter, arcname, which sets the name the file gets inside the archive, so you can rename files without copying them. You don't need to move the file to a separate folder or actually rename it.
from os.path import basename
for file in filenames_list:
    if (name := get_friendly_name(file)) is None:
        name = basename(file)
    zipF.write(file, name, compress_type=zipfile.ZIP_DEFLATED)
Because the archive name falls back to the file's basename, you avoid the need to move files to a common folder at all.
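As a self-contained sketch of the arcname approach: the dictionary friendly_names below is a hypothetical stand-in for get_friendly_name from the question, and the function name is likewise made up.

```python
import os
import zipfile


def zip_with_friendly_names(paths, zip_path, friendly_names):
    """Write each file in paths to zip_path under a friendlier archive name.

    friendly_names maps a source path to the name it should get in the
    archive; paths without an entry keep their plain file name.
    """
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            arcname = friendly_names.get(path, os.path.basename(path))
            # The file on disk is left untouched; only the archive entry is renamed.
            archive.write(path, arcname=arcname)
    return zip_path
```

Since nothing is renamed on disk, running this repeatedly should not hit the "file name already exists" error.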

rename the folder of an extracted file

I have some .tar.gz files and I can extract them with:
if (fname.endswith("tar.gz")):
    tar = tarfile.open(fname, "r:gz")
    tar.extractall()
    tar.close()
I want to write information about the extracted files to a .txt file, but I don't know the names of the folders inside the .tar.gz files. Is it possible to find out (or rename) the folders when extracting if you don't know their names? Thank you.
Each entry in the tarfile has a TarInfo header. You can get that information in several ways; the easiest is to iterate over the archive. Each TarInfo includes the path name, which you can manipulate with the posixpath functions. For example, given a .tgz file I happen to have on hand:
>>> tf = tarfile.open("Downloads/dbutil-0.5.0.tar.gz", "r:gz")
>>> for info in tf:
...     print(info.name, "DIR" if info.isdir() else "FILE")
...
dbutil-0.5.0 DIR
dbutil-0.5.0/setup.py FILE
dbutil-0.5.0/dbutil DIR
dbutil-0.5.0/dbutil/connection.py FILE
dbutil-0.5.0/dbutil/__init__.py FILE
dbutil-0.5.0/dbutil/row.py FILE
dbutil-0.5.0/PKG-INFO FILE
dbutil-0.5.0/dbutil.egg-info DIR
dbutil-0.5.0/dbutil.egg-info/dependency_links.txt FILE
dbutil-0.5.0/dbutil.egg-info/PKG-INFO FILE
dbutil-0.5.0/dbutil.egg-info/SOURCES.txt FILE
dbutil-0.5.0/dbutil.egg-info/top_level.txt FILE
dbutil-0.5.0/setup.cfg FILE
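Building on the iteration shown above, a small sketch that writes the listing to a text file and collects the top-level folder names might look like this. The function name and the report format are assumptions; member names that start with "./" would need extra handling.

```python
import tarfile


def list_tar_members(tar_path, report_path):
    """Write one line per archive member to report_path and return the
    distinct top-level names found in the archive."""
    top_level = []
    with tarfile.open(tar_path, "r:gz") as archive, \
            open(report_path, "w", encoding="utf-8") as report:
        for info in archive:
            kind = "DIR" if info.isdir() else "FILE"
            report.write(f"{info.name} {kind}\n")
            # Member names use "/" regardless of the operating system.
            first = info.name.split("/", 1)[0]
            if first not in top_level:
                top_level.append(first)
    return top_level
```

The returned names tell you which folders extractall() will create, so they can be renamed afterwards with os.rename if needed.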
I would suggest comparing the list of files in the directory before and after extracting the archive. Any additional files and folders are the ones that came from the tar file.

Find files that had been stored from a single text file

Is there any way to read the files whose names I saved inside a text file using Python?
For example, I have a file called filenames.txt. It contains the names of other files, such as:
/home/ikhwan/acespc.c
/home/ikhwan/trloc.cpp
/home/ikhwan/Makefile.sh
/home/ikhwan/Readme.txt
What I want is a Python script that changes the header of certain files. filenames.txt acts as the list of files to change whenever I run the script. The reason is that I have many files in the directory and its subdirectories, and I want Python to read and change only the files listed in filenames.txt. In the future, if I want to run the script on other files, I can just add or replace file names in filenames.txt.
So the flow of the script is as follows:
Run script --> script looks up the file names inside filenames.txt --> script adds or changes the header of each file.
Currently I use os.walk, but it searches all directories and subdirectories. Here is my current function:
def read_file(file):
    skip = 0
    headStart = None
    headEnd = None
    yearsLine = None
    haveLicense = False
    extension = os.path.splitext(file)[1]
    logging.debug("File extension is %s",extension)
    type = ext2type.get(extension)
    logging.debug("Type for this file is %s",type)
    if not type:
        return None
    settings = typeSettings.get(type)
    with open(file,'r') as f:
        lines = f.readlines()
You don't need to walk the file system if your file paths are already listed in filenames.txt. Just open it, read it line by line, and process each file path from it, e.g.
# this is your method that will be called with each file path from the filenames.txt
def process_file(path):
    # do whatever you want with `path` in terms of processing
    # let's just print it to STDOUT as an example
    with open(path, "r") as f:
        print(f.read())
with open("filenames.txt", "r") as f:  # open filenames.txt for reading
    for line in f:  # read filenames.txt line by line
        process_file(line.rstrip())  # send the path stored on the line to process_file()
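To connect this with the header-changing goal from the question, here is a rough sketch that skips blank lines and missing files and prepends a header to each listed file. The helper names and the simple "prepend if not already present" rule are assumptions; the question's real header logic (ext2type, typeSettings) is not reproduced.

```python
import os


def read_path_list(list_file):
    """Return the non-empty, stripped lines of list_file."""
    with open(list_file, "r", encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def prepend_header(path, header):
    """Put header at the top of path unless it is already there."""
    with open(path, "r", encoding="utf-8") as handle:
        body = handle.read()
    if body.startswith(header):
        return False
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(header + body)
    return True


def add_headers(list_file, header):
    """Apply prepend_header to every existing file named in list_file
    and return the paths that were actually changed."""
    changed = []
    for path in read_path_list(list_file):
        if os.path.isfile(path) and prepend_header(path, header):
            changed.append(path)
    return changed
```

A call such as add_headers("filenames.txt", "// Copyright ...\n") would touch only the listed files, and running it twice leaves them unchanged the second time.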

File writing is not working with pyPdf?

I am new to Python. I am trying to open PDF files and write their content into new text files, with the text file names generated from the PDF names. This is what I have tried so far, but it does not give what I expect. How can I achieve it?
import glob, os
import pyPdf
os.chdir("pdf/")
for file in glob.glob("*.pdf"):
    filena = file
    filename = "c:/documents/"+filena+".txt"
    target = open(filename,'w')
    pdf = pyPdf.PdfFileReader(open(filena,"rb"))
    for page in pdf.pages:
        target.write (page.extractText())
    target.close()
This results in the following error:
File "c:/documents/atpkinase.pdf.txt",line 7, in <module>
target = open(filename,'w')
IOError: [Errno 2] No such file or directory: "c:/documents/atpkinase.pdf.txt"
It looks like the directory "c:/documents/" does not exist. To write a file into it, you must create the directory first. To check whether the directory exists (and create it if needed), you can use:
dir = "c:/documents"
if not os.path.exists(dir):
    os.makedirs(dir)
Also, filena contains the file name with its extension; when you build filename, you need only the old file's name without the extension.
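Both points can be combined in one small helper. This is only a sketch: the name text_path_for is invented, and it covers just the path handling, not the PDF text extraction.

```python
import os


def text_path_for(pdf_name, out_dir):
    """Return out_dir/<pdf base name>.txt, creating out_dir if needed."""
    # exist_ok=True makes the separate os.path.exists check unnecessary.
    os.makedirs(out_dir, exist_ok=True)
    # "atpkinase.pdf" -> "atpkinase"
    base = os.path.splitext(os.path.basename(pdf_name))[0]
    return os.path.join(out_dir, base + ".txt")
```

In the loop from the question, filename = text_path_for(filena, "c:/documents") would then give a path ending in atpkinase.txt rather than atpkinase.pdf.txt.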
