I'm making a program to ZIP files. In this scenario, I am trying to ZIP a directory that has a subdirectory inside it. I'm using the following function when the program has to ZIP a directory, but it doesn't preserve subdirectories: it just takes the files from the subdirectory and puts them alongside all the others.
zipper = zipfile.ZipFile(systemDate + ".zip", "w")
def zipdir(path, ziph):
    logging.info("ZIP function has been called.")
    for root, dirs, files in os.walk(path):
        for file in files:
            fileNom = os.path.join(root, file)
            print("file nom: " + fileNom)
            zipper.write(fileNom, basename(fileNom))
Thanks.
The second argument to ZipFile.write is the archive name (arcname), i.e. the name of the file inside the archive. A ZIP file does not store a folder hierarchy on its own; each entry just carries its path as part of its name, so that is where the folder information has to go. To put a file inside a subdirectory, the arcname has to include the directory name.
You can use os.path.relpath to calculate a path relative to your path which appears to be the root of the ZIP file:
zipper.write(fileNom, os.path.relpath(fileNom, path))
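Put together, the whole function could look roughly like the sketch below. It assumes that ziph is an already opened zipfile.ZipFile (the question's code ignored that parameter and used the global zipper instead) and that path is the folder whose contents should sit at the root of the archive. The names full_path and arcname are only illustrative.

```python
import os
import zipfile


def zipdir(path, ziph):
    """Add every file under `path` to the open ZipFile `ziph`,
    keeping the subdirectory layout relative to `path`."""
    for root, dirs, files in os.walk(path):
        for file in files:
            full_path = os.path.join(root, file)
            # arcname is the name stored inside the archive;
            # making it relative to `path` keeps "sub/b.txt" as "sub/b.txt"
            arcname = os.path.relpath(full_path, path)
            ziph.write(full_path, arcname)


# Hypothetical usage (folder and archive names are placeholders):
# with zipfile.ZipFile("backup.zip", "w") as zipper:
#     zipdir("my_folder", zipper)
```

Note that this only adds files, so an empty subdirectory would not appear in the archive.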
So my program search_file.py is trying to look for .log files in the directory it is currently placed in. I used the following code to do so:
import os
# This is to get the directory that the program is currently running in
dir_path = os.path.dirname(os.path.realpath(__file__))
# for loop is meant to scan through the current directory the program is in
for root, dirs, files in os.walk(dir_path):
    for file in files:
        # Check if file ends with .log, if so print file name
        if file.endswith('.log'):
            print(file)
My current directory is as follows:
search_file.py
sample_1.log
sample_2.log
extra_file (this is a folder)
And within the extra_file folder we have:
extra_sample_1.log
extra_sample_2.log
Now, when the program runs and prints the files out it also takes into account the .log files in the extra_file folder. But I do not want this. I only want it to print out sample_1.log and sample_2.log. How would I approach this?
Try this:
import os
files = os.listdir()
for file in files:
    if file.endswith('.log'):
        print(file)
The problem in your code is that os.walk traverses the whole directory tree, not just the top directory. os.listdir returns the names of all entries in a single directory; with no argument it uses the current working directory. If the script may be started from somewhere else, pass the script's directory explicitly, e.g. os.listdir(dir_path).
os.walk documentation
os.listdir documentation
By default, os.walk traverses the tree top-down, so the first tuple it yields describes the root directory itself. So, just ask for the first one. And since you don't really care about root or dirs, use _ as the "don't care" variable name:
# get root files list.
_, _, files = next(os.walk(dir_path))
for file in files:
    # Check if file ends with .log, if so print file name
    if file.endswith('.log'):
        print(file)
It's also common to use glob:
import os
from glob import glob
dir_path = os.path.dirname(os.path.realpath(__file__))
for file in glob(os.path.join(dir_path, "*.log")):
    print(file)
This runs the risk of matching a directory whose name ends in ".log", so you could also add a test using os.path.isfile(file).
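A minimal sketch of that combination, glob plus an os.path.isfile check, might look like this. The function name log_files_in is made up for illustration, and the sketch assumes dir_path contains no glob wildcard characters of its own.

```python
import os
from glob import glob


def log_files_in(dir_path):
    """Return paths of regular files ending in .log directly inside dir_path."""
    matches = glob(os.path.join(dir_path, "*.log"))
    # Skip directories that merely happen to be named like "something.log"
    return sorted(p for p in matches if os.path.isfile(p))
```

Because the pattern has no recursive wildcard, files in subfolders such as extra_file are not returned.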
I am trying to zip all the files and folders present in folder3 using Python.
I have used zipfile for this. The resulting zip contains every folder on the path from the root directory down to the directory I want to zip.
def CreateZip(dir_name):
    os.chdir(dir_name)
    zf = zipfile.ZipFile("temp.zip", "w")
    for dirname, subdirs, files in os.walk(dir_name):
        zf.write(dirname)
        for filename in files:
            file = os.path.join(dirname, filename)
            zf.write(file)
    zf.printdir()
    zf.close()
Expected output:
toBeZippedcontent1\toBeZippedFile1.txt
toBeZippedcontent1\toBeZippedFile2.txt
toBeZippedcontent2\toBeZippedFile1.txt
toBeZippedcontent2\toBeZippedFile2.txt
Current output (folder structure inside zip file):
folder1\folder2\folder3\toBeZippedcontent1\toBeZippedFile1.txt
folder1\folder2\folder3\toBeZippedcontent1\toBeZippedFile2.txt
folder1\folder2\folder3\toBeZippedcontent2\toBeZippedFile1.txt
folder1\folder2\folder3\toBeZippedcontent2\toBeZippedFile2.txt
walk() builds dirname from the path you pass in (here an absolute one), so join() creates an absolute path for each file.
You have to remove folder1\folder2\folder3 from the path to make it relative.
file = os.path.relpath(file)
zf.write(file)
You could try to slice it
file = file[len("folder1\\folder2\\folder3\\"):]
zf.write(file)
but relpath() should be better.
You can also use the second argument to change the path/name inside the zip file:
zf.write(file, 'path/filename.ext')
This is useful if you run the code from a different folder and don't call os.chdir(), so relpath() without a start argument would not give the path you want.
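As a rough sketch of that second-argument approach without os.chdir(): the version below computes each archive name relative to dir_name and also writes an entry for every subfolder, as the original code did with zf.write(dirname), so empty folders survive. The name create_zip and the zip_path parameter are assumptions added for illustration, and zip_path is assumed to lie outside dir_name so the archive does not try to include itself.

```python
import os
import zipfile


def create_zip(dir_name, zip_path):
    """Zip the contents of dir_name, storing paths relative to dir_name."""
    with zipfile.ZipFile(zip_path, "w") as zf:
        for dirname, subdirs, files in os.walk(dir_name):
            rel_dir = os.path.relpath(dirname, dir_name)
            if rel_dir != os.curdir:
                # Store the folder itself so that empty folders are kept
                zf.write(dirname, rel_dir)
            for filename in files:
                full_path = os.path.join(dirname, filename)
                # Second argument: the path/name used inside the zip file
                zf.write(full_path, os.path.relpath(full_path, dir_name))
        return zf.namelist()
```

The returned name list can be printed to check the layout, much like zf.printdir() in the question.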
I want to run a for loop in Python over each file in a directory. The directory names will be passed in through a separate file (folderlist.txt).
Inside my main folder (/user/), new folders get added daily. I want to run the loop over each file in a given folder, but not over folders whose files have already been processed. I'm thinking of maintaining folderlist.txt so that it holds only the names of the folders newly added each day, and then passing those to the for loop.
For example, under my main path (/user/) we see the folders below:
(the files present inside each folder are listed below the folder name just to give the idea)
(day 1)
folder1
file1, file2, file3
folder2
file4, file5
folder3
file6
(day 2)
folder4
file7, file8, file9, file10
folder5
file11, file12
import os
with open('/user/folderlist.txt') as f:
    for line in f:
        line = line.strip("\n")
        dir = '/user/' + line
        for files in os.walk(dir):
            for file in files:
                print(file)
        # for filename in glob.glob(os.path.join(dir, '*.json')):
        #     print(filename)
I tried os.walk and glob in the code above, but it looks like the loop runs more times than there are files in the folder. Please advise.
Try replacing os.walk(dir) with os.listdir(dir). This will give you a list of all the entries in the directory.
import os
with open('/user/folderlist.txt') as f:
    for line in f:
        line = line.strip("\n")
        dir = '/user/' + line
        for file in os.listdir(dir):
            if file.endswith("fileExtension"):
                print(file)
Hope it helps
Help on function walk in module os:
walk(top, topdown=True, onerror=None, followlinks=False)
    Directory tree generator.
    For each directory in the directory tree rooted at top (including top
    itself, but excluding '.' and '..'), yields a 3-tuple
        dirpath, dirnames, filenames
    dirpath is a string, the path to the directory. dirnames is a list of
    the names of the subdirectories in dirpath (excluding '.' and '..').
    filenames is a list of the names of the non-directory files in dirpath.
    Note that the names in the lists are just names, with no path components.
    To get a full path (which begins with top) to a file or directory in
    dirpath, do os.path.join(dirpath, name).
Therefore, in your code each value of files is that whole 3-tuple, so the inner loop iterates over dirpath (a string), dirnames (a list) and filenames (a list) rather than over individual files.
os.listdir(dir) returns a single list with the names of all the files and folders in dir.
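To make the difference concrete, here is a small sketch of both approaches side by side. The function names are invented for illustration; the first unpacks each 3-tuple from os.walk and collects files recursively, the second uses os.listdir and keeps only the files directly inside the folder.

```python
import os


def files_via_walk(folder):
    """All files under folder (recursively), unpacking each 3-tuple."""
    found = []
    for dirpath, dirnames, filenames in os.walk(folder):
        for filename in filenames:
            found.append(os.path.join(dirpath, filename))
    return sorted(found)


def files_via_listdir(folder):
    """Only the files directly inside folder (subfolders are skipped)."""
    return sorted(
        name for name in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, name))
    )
```

Either function could be called once per line read from folderlist.txt, after joining the folder name onto /user/.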
I have a directory path, and in this path there are several folders. I am trying to build a script that finds all the XML files whose names start with "report". So far I have been able to iterate over all the directories, but I do not know how to proceed from there. Here is my code:
def search_xml_report(rootdir):
    for subdir, dirs, files in os.walk(rootdir):
        for file in files:
            print(os.path.join(subdir, file))  # print statement just for testing
You can use str.startswith:
def search_xml_report(rootdir):
    for subdir, dirs, files in os.walk(rootdir):
        for file in files:
            if file.startswith('report'):
                yield subdir, file
Use str.startswith together with os.path.splitext.
os.path.splitext: Split the extension from a pathname. Extension is everything from the last dot to the end, ignoring leading dots. Returns "(root, ext)"; ext may be empty.
if file.startswith('report') and os.path.splitext(file)[-1] == '.xml':
    return file
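Combining both answers, a generator along these lines would yield every matching file instead of stopping at the first one. This is only a sketch; the case-insensitive extension check is an assumption that goes beyond what the question asked for.

```python
import os


def search_xml_report(rootdir):
    """Yield the full path of every file under rootdir named report*.xml."""
    for subdir, dirs, files in os.walk(rootdir):
        for file in files:
            ext = os.path.splitext(file)[1]
            # Assumption: treat ".XML" like ".xml"; drop .lower() for an exact match
            if file.startswith('report') and ext.lower() == '.xml':
                yield os.path.join(subdir, file)
```

Since it is a generator, wrap the call in list() or loop over it to see the results.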
I have a script that I am using to compare and sort the files in two directories. I am currently trying to compare all of the files in one directory to a list of files in the other, and then copy those files into a "match" or "unique" directory.
I've managed to match the file name against the list and then copy the file, but I can't quite get it to copy that file into a target directory while keeping the name.
Here is what I have:
import os
import shutil
input2_only = ["file1.mp3", "file2.mp3"]  # etc.
for root, dirs, files in os.walk("input2", topdown=False):
    for filename in files:
        print(filename)
        if filename in input2_only:
            print('yay')
            shutil.copy(os.path.join(root, filename), "outputs")
I think there is something that I can change in the shutil line to make this work, but every tweak I've tried so far has led to heartache. Just to be clear, in this snippet I want it to copy the file being compared against the list to a directory called "outputs". Once I can do that I'm reasonably confident I can fill in the rest of the logic.
thanks!
OK, I figured this out and am posting it in case it is helpful to someone else. The key is building the full output path, including the file name, before calling shutil.copy. The correct snippet is:
for root, dirs, files in os.walk("input2", topdown=False):
    for filename in files:
        print(filename)
        if filename in input2_only:
            print('yay')
            out = "outputs/" + filename
            shutil.copy(os.path.join(root, filename), out)
The "out" line makes the destination passed to shutil.copy the file's own name inside the "outputs" folder.
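For anyone reusing this, a slightly more defensive sketch is below. One likely cause of the original trouble is that shutil.copy only keeps the source file name when the destination is an existing directory; if "outputs" does not exist yet, the file is copied to a plain file called "outputs". The helper name copy_matches and its parameters are invented for illustration.

```python
import os
import shutil


def copy_matches(src_dir, wanted_names, dest_dir):
    """Copy files under src_dir whose names are in wanted_names into dest_dir."""
    wanted = set(wanted_names)
    # Create the destination folder first if it is missing
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for root, dirs, files in os.walk(src_dir, topdown=False):
        for filename in files:
            if filename in wanted:
                out = os.path.join(dest_dir, filename)
                shutil.copy(os.path.join(root, filename), out)
                copied.append(out)
    return copied
```

If two subfolders contain a file with the same name, the later copy overwrites the earlier one, so a real script may want to check for that.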