I need to put the contents of a folder into a ZIP archive (in my case a JAR, but they're the same format).
I can't just add every file individually: the files change each time depending on what the user put in, and there are something like 1,000 of them anyway!
I'm using ZipFile; it's the only really viable option.
Basically my code makes a temporary folder, and puts all the files in there, but I can't find a way to add the contents of the folder and not just the folder itself.
Python already supports ZIP archives through the zipfile module (RAR is not covered by the standard library). Personally, I learned it from https://docs.python.org/3/library/zipfile.html
To get all the files from that folder, try something like:
for r, d, f in os.walk(path):
    for file in f:
        # your code here, e.g. add os.path.join(r, file) to the archive
        pass
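To make the os.walk idea concrete, here is a minimal sketch that writes the contents of a folder into an archive without the folder itself. The function name zip_folder_contents is made up for illustration, and it assumes the archive is written somewhere outside the folder being zipped (otherwise the archive would try to include itself).

```python
import os
import zipfile


def zip_folder_contents(folder, archive_path):
    """Write every file under `folder` into `archive_path`, without the top-level folder."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, dirs, files in os.walk(folder):
            for name in files:
                full_path = os.path.join(root, name)
                # arcname is the path stored inside the archive; making it
                # relative to `folder` drops the folder itself from the archive.
                arcname = os.path.relpath(full_path, folder)
                archive.write(full_path, arcname)
```

The key part is the second argument to write(): without it, the full on-disk path is stored in the archive. Empty directories are not recorded by this sketch.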
Related
How to open files in a particular folder with randomly generated names? I have a folder named 2018 and the files within that folder are named randomly. I want to iterate through all of the files and open them up.
I will post three names of the files as an example but note that there are over a thousand files in this folder so it has to work on a large scale without any hard coding.
0a2ec2da-628d-417d-9520-b0889886e2ac_1.xml
00a6b260-951d-46b5-ab27-b2e8729e664d_1.xml
00a6b260-951d-46b5-ab27-b2e8729e664d_2.xml
You're looking for os.walk().
In general, if you want to do something with files, it's worth glancing at the os, os.path, pathlib and other built-in modules. They're all documented.
You could also use glob expansion to expand "folder/*" into a list of all the filenames, but os.walk is probably better.
With os.listdir() or os.walk(), depending on whether you want to do it recursively or not.
You can read the Python docs:
https://docs.python.org/3/library/os.html#os.walk
https://docs.python.org/3/library/os.html#os.listdir
Once you have the list of files, you can read them simply:
for file in files:
    # os.listdir() returns bare names; join each with its folder path before opening
    with open(file, "r") as f:
        # perform file operations
        pass
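For the non-recursive case, a pathlib version might look like the sketch below. It assumes the files are UTF-8 text and that only the .xml files are wanted; read_all_xml is a hypothetical helper name, not something from the standard library.

```python
from pathlib import Path


def read_all_xml(folder):
    """Return a dict mapping each .xml file name in `folder` to its text."""
    contents = {}
    # glob("*.xml") matches only the top level; use rglob("*.xml") to recurse.
    for path in sorted(Path(folder).glob("*.xml")):
        with path.open("r", encoding="utf-8") as f:
            contents[path.name] = f.read()
    return contents
```

Because the loop works from whatever is on disk, the randomly generated names never have to appear in the code.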
I'm really new to Python scripting, but I'm pretty sure there is a way to copy files from one folder (via a path given in a .txt file) to another.
The .txt file would hold the direct path to the folder that contains the photo files.
I'm working with huge numbers of photos that contain GPS metadata (so I need to not lose it).
I'd really appreciate any help, thanks.
Here is a short and simple solution:
import os
import shutil

# replace file_list.txt with your list of files (one path per line)
list_path = "file_list.txt"
# replace my_dir with your copy dir (it must already exist)
copy_dir = "my_dir"

# the with block closes the list file even if a copy fails
with open(list_path, "r") as file_list:
    for f in file_list.read().splitlines():
        print(f"copying file: {f}")
        shutil.copyfile(f, os.path.join(copy_dir, os.path.basename(f)))
print("done")
It loops over all the files in the file list, and copies them. It should be fast enough.
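On the GPS metadata: EXIF data is stored inside the image file, so a byte-for-byte copy keeps it. If the file timestamps matter too, shutil.copy2 also tries to preserve those. A hedged variant as a function (copy_listed_files is a made-up name; it assumes one path per line and that no two files share a base name, since a later one would overwrite an earlier one):

```python
import os
import shutil


def copy_listed_files(list_path, dest_dir):
    """Copy every path listed (one per line) in `list_path` into `dest_dir`."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    with open(list_path, "r", encoding="utf-8") as listing:
        for line in listing:
            src = line.strip()
            if not src:
                continue  # skip blank lines
            dest = os.path.join(dest_dir, os.path.basename(src))
            # copy2 copies the bytes and also tries to preserve timestamps
            shutil.copy2(src, dest)
            copied.append(dest)
    return copied
```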
I have been trying to figure out for a while how to check if there are .pkl files in a given directory. I checked the website and I could find ways to find if there are files in the directory and list them, but I just want to check if they are there.
My directory contains a total of 7 .pkl files. As soon as I create one, the others are created too, so to check whether all seven exist it is enough to check whether one exists. Therefore, I would like to check if there is any .pkl file.
This works if I do:
os.path.exists('folder1/folder2/filename.pkl')
But I had to write one of my file names. I would like to do so without searching for a specific file. I also tried
os.path.exists('folder1/folder2/*.pkl')
but that does not work either, as I don't have any file literally named *.pkl.
You can use the Python glob module (https://docs.python.org/3/library/glob.html).
Specifically, glob.glob('folder1/folder2/*.pkl') will return a list of all .pkl files in folder2.
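Since the question only needs a yes/no answer, a small sketch along these lines may be enough. The helper name has_pkl_files is an assumption for illustration; glob.iglob and glob.escape are standard library functions.

```python
import glob
import os


def has_pkl_files(folder):
    """Return True if `folder` directly contains at least one .pkl file."""
    # escape() keeps characters like [ or * in the folder name from being
    # treated as wildcards; only the trailing *.pkl is a pattern.
    pattern = os.path.join(glob.escape(folder), "*.pkl")
    # iglob yields matches lazily, so any() stops at the first hit.
    return any(glob.iglob(pattern))
```

For example, has_pkl_files('folder1/folder2') would replace the os.path.exists call with the wildcard.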
You can use:
for dir_path, dir_names, file_names in os.walk(search_dir):
    # Go over all files and folders
    for file_name in file_names:
        if file_name.endswith(".pkl"):
            # do something, e.g. stop after the first one you find
            pass
Note: this also searches subdirectories, so use it when you want to search the entire directory tree.
If you only want to search one directory, run the "for" loop over os.listdir(path) instead.
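Wrapping the loop in a function makes the "stop at the first match" part easy, because return leaves both loops at once. This is only a sketch; contains_pkl and its recursive flag are names invented here.

```python
import os


def contains_pkl(search_dir, recursive=True):
    """Return True as soon as one name ending in .pkl is found."""
    if not recursive:
        # listdir returns files and directories alike; this checks names only.
        return any(name.endswith(".pkl") for name in os.listdir(search_dir))
    for dir_path, dir_names, file_names in os.walk(search_dir):
        if any(name.endswith(".pkl") for name in file_names):
            return True
    return False
```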
I'm using Python 2.7 and trying to archive a directory:
Goal: archive the whole folder's contents
subject: C:\User\blah\blah\parentfolder
structure:C...parentfolder\ 1.docx
2.xlxs
4.pdf
subfolder\5.pdf
6.txt
Now I know that shutil.make_archive() can zip the folder for me, but the result looks like this:
C:\\User\\blah\\blah\\zipfile_name\\parentfolder\ 1.docx
2.xlxs
4.pdf
subfolder\5.pdf
6.txt
(This is where my problem differs from the earlier question. I did learn about shutil.make_archive from that question, but when I opened the shutil module I found that make_archive takes four inputs. I didn't know what makes them different and used the four-input form, which is where my question came from. The previous question did not point out why it used just three inputs, and the other answers used very complicated self-defined functions. The answer I chose here compared and explained the difference, which I really appreciate. As someone who did not major in computer science, I think this kind of detail helps a lot of people.)
But I want to skip the parentfolder layer in the zipfile_name folder, that is:
C:\\User\\blah\\blah\\zipfile_name\\ 1.docx
2.xlxs
4.pdf
subfolder\5.pdf
6.txt
If I use os.path.walk and so on to write every file to zipfile_name, it seems super complicated: I would first need to list everything in the folder; if an item is a file, write it into the zip folder; if it is a directory, make the directory, list all of its items, and loop through this process again (until no directory is left in the last directory).
Plus I don't know how to make a directory in a .zip folder. So all this just seems complicated.
So is there a way to change a folder into an archive folder while keeping the folder structure?
It's just a matter of how you pass the information about the parentfolder folder in the parameters of make_archive.
Consider the following two examples:
shutil.make_archive('myarchive', 'zip', '/User/blah/blah', 'parentfolder')
This will create the zip archive myarchive.zip. If you open myarchive.zip you will first see the folder parentfolder and if you open that you will see the contents of the original parentfolder from which the archive was made.
shutil.make_archive('myarchive', 'zip', '/User/blah/blah/parentfolder')
This will create the zip archive myarchive.zip. If you open myarchive.zip you will immediately see the contents of the original parentfolder from which the archive was made. The folder name parentfolder will not appear in the archive.
The parameters are documented at https://docs.python.org/2/library/shutil.html, though to be honest it is not easy for me to see where the behavior I illustrated above is documented. I found it by trial and error (Python 2.7.14 on Windows).
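In make_archive's signature the third argument is root_dir and the fourth is base_dir, so the two calls above can be written with keywords to make the difference visible. The sketch below uses Python 3 syntax and two made-up wrapper names; it assumes the archive is created outside the folder being zipped.

```python
import shutil


def archive_contents_only(folder, archive_base):
    """Zip the contents of `folder`; entries are stored relative to `folder`."""
    return shutil.make_archive(archive_base, "zip", root_dir=folder)


def archive_with_top_folder(parent, folder_name, archive_base):
    """Zip `parent/folder_name`; every entry starts with `folder_name/`."""
    return shutil.make_archive(
        archive_base, "zip", root_dir=parent, base_dir=folder_name
    )
```

Both keep the subfolder structure; the only difference is whether the parentfolder layer appears at the top of the archive.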
I'm trying to convert all the PDFs stored in one folder (say 60 PDFs) into text documents and store them in different folders. The folders should have unique names.
I tried this code. The folders were created, but the pdftotext conversion command doesn't work in the loop:
import os
def listfiles(path):
    for root, dirs, files in os.walk(path):
        for f in files:
            print(f)
            newpath = r'/home/user/files/'
            p=f.replace("pdf","")
            newpath=newpath+p
            if not os.path.exists(newpath): os.makedirs(newpath)
            os.system("pdftotext f f.txt")
f=listfiles("/home/user/reports")
One problem here is the os.system("pdftotext f f.txt") call. I assume you want the f's here replaced with the current file in the loop. If that is the case, you need to change this to os.system("pdftotext {0} {0}.txt".format(f)).
Another issue may be that the working directory is not being set up so the call to system is looking for the file in the wrong place. Try using os.chdir every time you change folders.
To place the text file in a different folder, try:
os.system("pdftotext {0} {1}/{0}.txt".format(f, newpath))
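Putting both points together, one way to avoid the working-directory problem is to pass full paths and call the tool through subprocess instead of os.system. This is a sketch, not a tested fix for your setup: it assumes pdftotext is installed and on the PATH, build_jobs and convert_all are hypothetical names, and two PDFs with the same base name in different subfolders would end up sharing one output folder.

```python
import os
import subprocess


def build_jobs(src_dir, out_root):
    """Return (output_dir, command) pairs, one per PDF found under `src_dir`."""
    jobs = []
    for root, dirs, files in os.walk(src_dir):
        for name in files:
            if not name.lower().endswith(".pdf"):
                continue
            stem = os.path.splitext(name)[0]
            out_dir = os.path.join(out_root, stem)
            # Full paths mean the current working directory no longer matters.
            command = [
                "pdftotext",
                os.path.join(root, name),
                os.path.join(out_dir, stem + ".txt"),
            ]
            jobs.append((out_dir, command))
    return jobs


def convert_all(src_dir, out_root):
    """Create one folder per PDF and run pdftotext for each."""
    for out_dir, command in build_jobs(src_dir, out_root):
        os.makedirs(out_dir, exist_ok=True)
        # A list argument avoids shell quoting problems with spaces in names.
        subprocess.run(command, check=True)
```

With the paths from the question, the call would be convert_all("/home/user/reports", "/home/user/files").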
I don't know Python, but I think I can clearly see a mistake there. It looks like you are just replacing the ".pdf" with a ".txt". Since a PDF isn't just plain text, this won't work.
For the conversion, look at the top answer to this post:
Python module for converting PDF to text