When creating a zip file, all folders from the root directory are added - python

I am trying to zip all the files and folders inside folder3 using Python.
I have used the zipfile module for this. However, the zip contains every folder from the root directory down to the directory I want to zip.
def CreateZip(dir_name):
    os.chdir(dir_name)
    zf = zipfile.ZipFile("temp.zip", "w")
    for dirname, subdirs, files in os.walk(dir_name):
        zf.write(dirname)
        for filename in files:
            file=os.path.join(dirname, filename)
            zf.write(file)
    zf.printdir()
    zf.close()
Expected output:
toBeZippedcontent1\toBeZippedFile1.txt
toBeZippedcontent1\toBeZippedFile2.txt
toBeZippedcontent2\toBeZippedFile1.txt
toBeZippedcontent2\toBeZippedFile2.txt
Current output (folder structure inside zip file):
folder1\folder2\folder3\toBeZippedcontent1\toBeZippedFile1.txt
folder1\folder2\folder3\toBeZippedcontent1\toBeZippedFile2.txt
folder1\folder2\folder3\toBeZippedcontent2\toBeZippedFile1.txt
folder1\folder2\folder3\toBeZippedcontent2\toBeZippedFile2.txt

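walk() yields dirname as a path that starts with the directory you passed in. Because dir_name is an absolute path, join() creates an absolute path for every file, and zf.write() stores that whole folder chain in the archive.
You have to remove folder1\folder2\folder3 from the path to get a relative path. Since you already call os.chdir(dir_name), relpath() without a second argument does this:
file = os.path.relpath(file)
zf.write(file)
You could also try to slice the prefix off (the backslashes must be doubled, because "\f" in a normal string literal is a form feed):
file = file[len("folder1\\folder2\\folder3\\"):]
zf.write(file)
but relpath() should be better.
You can also use the second argument of write() to change the path/name stored inside the zip file:
zf.write(file, 'path/filename.ext')
This is useful if you run the code from a different folder and don't use os.chdir(), so you can't rely on a path relative to the current directory.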
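A minimal sketch of the relpath() approach with the second argument of write(), assuming you want names relative to the zipped directory and prefer not to call os.chdir(). The function name zip_dir_contents and its parameters are hypothetical, not part of the original code.

```python
import os
import zipfile


def zip_dir_contents(dir_name, zip_path):
    """Zip everything under dir_name, storing paths relative to dir_name."""
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for dirname, subdirs, files in os.walk(dir_name):
            for filename in files:
                full_path = os.path.join(dirname, filename)
                # Skip the archive itself if it lives inside dir_name.
                if os.path.abspath(full_path) == zip_abs:
                    continue
                # The second argument is the name stored inside the archive.
                zf.write(full_path, os.path.relpath(full_path, dir_name))
```

Note that this sketch only adds files, so empty folders do not appear in the archive.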

Related

Python script to export the all subfolders in a folder into separate .ZIP folders, but ignoring individual files?

I have a directory of subfolders that gets populated by another script. Each of those subfolders needs to be compressed into its own .ZIP file.
However, that directory also contains a number of files (PDFs, .TXTs, etc.) that are not in subfolders. I'm trying to write a script that creates a zip file from each individual subfolder but completely ignores the loose files.
import os
import zipfile
path = r"E:\Test\XYZ L48"
path = os.path.abspath(os.path.normpath(os.path.expanduser(path)))
for folder in os.listdir(path):
    zipf = zipfile.ZipFile('{0}.zip'.format(os.path.join(path, folder)), 'w', zipfile.ZIP_DEFLATED)
    for root, dirs, files in os.walk(os.path.join(path, folder)):
        for filename in files:
            zipf.write(os.path.abspath(os.path.join(root, filename)), arcname=filename)
    zipf.close()
I tried this. It creates ZIPs from the subfolders, but it also archives all the individual files.
Is there a way to modify it so that it ignores files in the directory and only zips the subfolders?
Thanks!
Use os.scandir instead of os.listdir. Each entry it yields lets you check whether it is a file, a directory, or a symbolic link.
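A minimal sketch of that suggestion, assuming each immediate subfolder should become <subfolder>.zip next to it. The function name zip_subfolders is hypothetical. Unlike the question's code, which flattens every file with arcname=filename, this sketch keeps the layout inside each subfolder; that is an assumption about what is wanted.

```python
import os
import zipfile


def zip_subfolders(path):
    """Create <subfolder>.zip next to each immediate subfolder of path."""
    created = []
    with os.scandir(path) as entries:
        # Collect the folders first so the new .zip files are never scanned.
        subfolders = [entry.path for entry in entries if entry.is_dir()]
    for folder in subfolders:
        zip_name = folder + ".zip"
        with zipfile.ZipFile(zip_name, "w", zipfile.ZIP_DEFLATED) as zipf:
            for root, dirs, files in os.walk(folder):
                for filename in files:
                    full_path = os.path.join(root, filename)
                    # Store the path relative to the subfolder being zipped.
                    zipf.write(full_path, arcname=os.path.relpath(full_path, folder))
        created.append(zip_name)
    return created
```

entry.is_dir() follows symbolic links by default; pass follow_symlinks=False if linked folders should be ignored.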

Python: Finding files in directory but ignoring folders and their contents

So my program search_file.py is trying to look for .log files in the directory it is currently placed in. I used the following code to do so:
import os
# This is to get the directory that the program is currently running in
dir_path = os.path.dirname(os.path.realpath(__file__))
# for loop is meant to scan through the current directory the program is in
for root, dirs, files in os.walk(dir_path):
    for file in files:
        # Check if file ends with .log, if so print file name
        if file.endswith('.log'):
            print(file)
My current directory is as follows:
search_file.py
sample_1.log
sample_2.log
extra_file (this is a folder)
And within the extra_file folder we have:
extra_sample_1.log
extra_sample_2.log
Now, when the program runs and prints the files out it also takes into account the .log files in the extra_file folder. But I do not want this. I only want it to print out sample_1.log and sample_2.log. How would I approach this?
Try this:
import os
files = os.listdir()
for file in files:
    if file.endswith('.log'):
        print(file)
The problem in your code is that os.walk traverses the whole directory tree, not just your current directory. os.listdir returns a list of all the names in a single directory, defaulting to the current working directory, which is what you are looking for. Pass dir_path to it if the script may be started from a different folder.
os.walk documentation
os.listdir documentation
By default, os.walk does a top-down traversal of the tree, so the first tuple it yields describes the top directory itself. So, just ask for the first one. And since you don't really care about root or dirs, use _ as the "don't care" variable name:
# get root files list.
_, _, files = next(os.walk(dir_path))
for file in files:
    # Check if file ends with .log, if so print file name
    if file.endswith('.log'):
        print(file)
It's also common to use glob:
import os
from glob import glob
dir_path = os.path.dirname(os.path.realpath(__file__))
for file in glob(os.path.join(dir_path, "*.log")):
    print(file)
This runs the risk of matching a directory whose name ends in ".log", so you could also add a test using os.path.isfile(file).
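A minimal sketch of the glob variant with the os.path.isfile() check added. The function name top_level_logs is hypothetical, and escaping dir_path with glob.escape is an extra precaution that the answers above do not mention.

```python
import os
from glob import escape, glob


def top_level_logs(dir_path):
    """Return the names of regular .log files directly inside dir_path."""
    # escape() keeps characters such as "[" in dir_path from acting as wildcards.
    matches = glob(os.path.join(escape(dir_path), "*.log"))
    # glob also matches directories named like "x.log", so keep files only.
    return sorted(os.path.basename(p) for p in matches if os.path.isfile(p))
```

Because the pattern has no recursive component, .log files inside subfolders such as extra_file are never visited.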

Run the for loop for each file in directory using Python

I want to run a for loop in Python over each file in a directory. The directory names will be passed in through a separate file (folderlist.txt).
New folders get added daily inside my main folder (/user/). I want to run the loop over each file in a given folder, but not over folders whose files have already been processed. I'm thinking of maintaining folderlist.txt so that it holds only the names of the folders added that day, and passing those to the for loop.
For example, under my main path (/user/) we see the folders below:
(the files inside each folder are listed below the folder name, just to give the idea)
(day 1)
folder1
file1, file2, file3
folder2
file4, file5
folder3
file6
(day 2)
folder4
file7, file8, file9, file10
folder5
file11, file12
import os
with open('/user/folderlist.txt') as f:
    for line in f:
        line=line.strip("\n")
        dir='/user/'+line
        for files in os.walk (dir):
            for file in files:
                print(file)
        # for filename in glob.glob(os.path.join (dir, '*.json')):
        #     print(filename)
I tried using os.walk and glob in the code above, but it looks like the loop runs more times than there are files in the folder. Please advise.
Try replacing os.walk(dir) with os.listdir(dir). This will give you a list of all the entries in the directory.
import os
with open('/user/folderlist.txt') as f:
    for line in f:
        line = line.strip("\n")
        dir = '/user/' + line
        for file in os.listdir(dir):
            if file.endswith("fileExtension"):
                print(file)
Hope it helps
Help on function walk in module os:
walk(top, topdown=True, onerror=None, followlinks=False)
    Directory tree generator.
    For each directory in the directory tree rooted at top (including top
    itself, but excluding '.' and '..'), yields a 3-tuple
        dirpath, dirnames, filenames
    dirpath is a string, the path to the directory. dirnames is a list of
    the names of the subdirectories in dirpath (excluding '.' and '..').
    filenames is a list of the names of the non-directory files in dirpath.
    Note that the names in the lists are just names, with no path components.
    To get a full path (which begins with top) to a file or directory in
    dirpath, do os.path.join(dirpath, name).
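Therefore, in your code files is that whole 3-tuple, and the inner loop iterates over its three elements: dirpath (a string), dirnames (a list), and filenames (a list). It prints three items per directory instead of one per file.
os.listdir(dir) returns the names of all the files and folders in dir as a list.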
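A minimal sketch of the os.listdir() approach for the folder-list workflow, assuming the list file holds one folder name per line and that only files directly inside each folder matter. The function name files_in_listed_folders and its parameters are hypothetical.

```python
import os


def files_in_listed_folders(base_dir, folder_list_path):
    """Return full paths of the files directly inside each listed folder."""
    found = []
    with open(folder_list_path) as f:
        for line in f:
            name = line.strip()
            if not name:  # ignore blank lines in the list file
                continue
            folder = os.path.join(base_dir, name)
            for entry in sorted(os.listdir(folder)):
                full_path = os.path.join(folder, entry)
                if os.path.isfile(full_path):  # skip nested folders
                    found.append(full_path)
    return found
```

Folders that are not named in the list file are never opened, which matches the idea of processing only the newly added ones.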

Why does this code not back-up/ZIP directories?

I'm making a program to ZIP files. In this scenario, I am trying to ZIP a directory that has a subdirectory inside it. I'm using the following function when the program has to ZIP a directory, yet it doesn't preserve subdirectories: it just takes the files from the subdirectory and puts them with all the others.
zipper = zipfile.ZipFile(systemDate + ".zip", "w")
def zipdir(path, ziph):
    logging.info("ZIP function has been called.")
    for root, dirs, files in os.walk(path):
        for file in files:
            fileNom = os.path.join(root, file)
            print("file nom: " + fileNom)
            zipper.write(fileNom, basename(fileNom))
Thanks.
The second argument to ZipFile.write is the archive name, i.e. the name of the file inside the archive. A file's folder inside the ZIP is determined by that name, so that is where the folder information has to go. In order to put a file inside a subdirectory, you have to adjust the arcname to include a directory name; basename() strips the directory part, which is why everything ends up at the top level.
You can use os.path.relpath to calculate a path relative to your path which appears to be the root of the ZIP file:
zipper.write(fileNom, os.path.relpath(fileNom, path))
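To make the difference between the two archive names concrete, here is a small sketch that zips a directory into memory and returns the stored names. The function name zip_names and the keep_structure flag are hypothetical and exist only for this comparison.

```python
import io
import os
import zipfile


def zip_names(path, keep_structure):
    """Zip path into memory and return the names stored in the archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zipper:
        for root, dirs, files in os.walk(path):
            for file in files:
                full_path = os.path.join(root, file)
                if keep_structure:
                    # Path relative to the zipped directory keeps subfolders.
                    arcname = os.path.relpath(full_path, path)
                else:
                    # basename() drops the folder part and flattens the archive.
                    arcname = os.path.basename(full_path)
                zipper.write(full_path, arcname)
        return sorted(zipper.namelist())
```

With basename(), two files that share a name in different subfolders would also end up as duplicate entries in the archive.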

in python how do I figure out the current path of the os.walk()?

I let a user set a path to a directory that may contain subdirectories (several levels deep) and files.
I use os.walk() in my code to scan the whole directory:
for root, subdirs, files in os.walk(thispath):
    for myfile in files:
        shutil.move(os.path.realpath(myfile), os.path.join(thispath,filename))
But os.path.realpath(myfile), instead of giving me the absolute path of myfile, gives me a path under the directory the script is running from (I also tried os.path.abspath(myfile), which does basically the same thing). In other words, os.path.realpath(myfile) == os.path.join(os.getcwd(), myfile), even though myfile actually lives in some other directory.
When I try to move that file, it says the file doesn't exist, and that's true: it is not at the path being looked up.
How do I get the absolute path of the file I am currently visiting (myfile)?
for root, subdirs, files in os.walk(this_path):
    for my_file in files:
        shutil.move(os.path.join(root, my_file), os.path.join(this_path, filename))
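The key point in the answer is that the entries in files are bare names, while root is the directory currently being visited, so joining the two gives a usable path. A minimal sketch under the assumption that every nested file should be moved to the top folder under its own name; the function name move_files_to_top is hypothetical, and name collisions are not handled.

```python
import os
import shutil


def move_files_to_top(this_path):
    """Move every file below this_path directly into this_path."""
    moved = []
    top = os.path.abspath(this_path)
    for root, subdirs, files in os.walk(this_path):
        if os.path.abspath(root) == top:
            continue  # files here are already at the top level
        for my_file in files:
            source = os.path.join(root, my_file)  # root is the folder being visited
            target = os.path.join(this_path, my_file)
            shutil.move(source, target)
            moved.append(target)
    return moved
```

If this_path is absolute, os.path.join(root, my_file) is absolute as well, so no call to realpath() or abspath() is needed.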
