python directory recursive traversal program - python

My program does not recognize folders as directories and treats them as files. As a result, the recursion prints the folders as if they were files, and since no folders are queued for traversal, the program finishes.
import os
import sys
class DRT:
    def dirTrav(self, dir, buff):
        newdir = []
        for file in os.listdir(dir):
            print(file)
            if(os.path.isdir(file)):
                newdir.append(os.path.join(dir, file))
        for f in newdir:
            print("dir: " + f)
            self.dirTrav(f, "")
dr = DRT()
dr.dirTrav(".", "")

See os.walk. From the documentation:
This example displays the number of bytes taken by non-directory files in each directory under the starting directory, except that it doesn’t look under any CVS subdirectory:
import os
from os.path import join, getsize
for root, dirs, files in os.walk('python/Lib/email'):
    print(root, "consumes", end=" ")
    print(sum(getsize(join(root, name)) for name in files), end=" ")
    print("bytes in", len(files), "non-directory files")
    if 'CVS' in dirs:
        dirs.remove('CVS')  # don't visit CVS directories

The problem is that you're not checking the right thing. file is just the filename, not the full path. That's why you need os.path.join(dir, file) on the next line, right? You need it in the isdir call too, but you're passing only file.
So, instead of asking "is ./foo/bar/baz a directory?" you're asking "is baz a directory?" A bare baz is resolved relative to the current working directory, i.e. as ./baz. Since there (probably) is no ./baz, you get back False.
So, change this:
if(os.path.isdir(file)):
    newdir.append(os.path.join(dir, file))
to:
path = os.path.join(dir, file)
if os.path.isdir(path):
    newdir.append(path)
All that being said, using os.walk as sotapme suggested is simpler than trying to build it yourself.
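For reference, the corrected recursion can be sketched as a small standalone function. This is a minimal sketch, assuming the goal is only to collect subdirectory paths; list_dirs is a hypothetical name that does not appear in the original post.

```python
import os

def list_dirs(top):
    """Return every directory below top, found by manual recursion."""
    found = []
    for name in os.listdir(top):
        # Join with the parent first: a bare name would be resolved
        # against the current working directory, not against top.
        path = os.path.join(top, name)
        if os.path.isdir(path):
            found.append(path)
            found.extend(list_dirs(path))
    return found
```

Note that os.path.isdir follows symbolic links, so a link that points back up the tree would make this sketch recurse without end.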

Related

Python: How to check folders within folders? [duplicate]

First off, let me apologize if the title is unclear.
To simplify a task I do at work, I've started writing this script to automate the removal of files from a certain path.
My issue is that in its current state, this script does not check the contents of the folders within the folder provided by the path.
I'm not sure how to fix this, because from what I can tell, it should be checking those files?
import os
def depdelete(path):
    for f in os.listdir(path):
        if f.endswith('.exe'):
            os.remove(os.path.join(path, f))
            print('Dep Files have been deleted.')
        else:
            print('No Dep Files Present.')
def DepInput():
    print('Hello, Welcome to DepDelete!')
    print('What is the path?')
    path = input()
    depdelete(path)
DepInput()
Try using os.walk to traverse the directory tree, like this:
def depdelete(path):
    for root, _, file_list in os.walk(path):
        print("In directory {}".format(root))
        for file_name in file_list:
            if file_name.endswith(".exe"):
                os.remove(os.path.join(root, file_name))
                print("Deleted {}".format(os.path.join(root, file_name)))
Here are the docs (there are some usage examples towards the bottom): https://docs.python.org/3/library/os.html#os.walk
Currently, your code just loops over all files and folders in the provided folder and checks each one for its name. In order to also check the contents of folders within path, you have to make your code recursive.
You can use os.walk to go through the directory tree in path and then check its contents.
You'll find a more detailed answer with code examples at Recursive sub folder search and return files in a list python.
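To illustrate what "making the code recursive" means without os.walk, here is a minimal sketch that only collects matching paths and deletes nothing, so it can serve as a dry run. The name find_by_suffix is hypothetical, and the sketch assumes Python 3.6 or later for the os.scandir context manager.

```python
import os

def find_by_suffix(path, suffix):
    """Recursively collect files under path whose names end with suffix."""
    matches = []
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                # Descend into the subfolder instead of skipping it.
                matches.extend(find_by_suffix(entry.path, suffix))
            elif entry.is_file() and entry.name.endswith(suffix):
                matches.append(entry.path)
    return matches
```

Printing the result of find_by_suffix(path, '.exe') before calling os.remove on each entry is a cheap way to check what would be deleted.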
Take a look at os.walk()
This function will iterate through sub-directories for you. The loop will look like this.
for subdir, dirs, files in os.walk(path):
    for f in files:
        if f.endswith('.exe'):
            fullFile = os.path.join(subdir, f)
            os.remove(fullFile)
            print(fullFile + " was deleted")
You're looking for os.walk(). In your case, it could work like so:
import os
def dep_delete(path):
    # Use a separate name for the loop variable so it does not shadow path.
    for root, dirs, files in os.walk(path):
        for f in files:
            if f.endswith('.exe'):
                os.remove(os.path.join(root, f))
                print('Dep files have been deleted.')
def dep_input():
    print('Hello, Welcome to dep_delete!')
    print('What is the path?')
    path = input()
    dep_delete(path)
dep_input()
Also see: List directory tree structure in python?

program to traverse directory structure in python

I need to create a program that is given a directory path. That directory can contain any number of nested directory trees, and any directory can contain any number of .py files. Some of these files have already been executed and some have not. I need a Python script that runs only the files that have not been executed yet. Can someone please suggest a way to do this?
Please take a look at os.walk().
import os
directory = '/tmp'
for (dirpath, dirnames, filenames) in os.walk(directory):
    # Do something with dirpath, dirnames, and filenames.
    pass
The usual approach is to use os.walk and to compose complete paths using os.path.join:
import os
import os.path
def find_all_files(directory):
    for root, _, filenames in os.walk(directory):
        for filename in filenames:
            fullpath = os.path.join(root, filename)
            yield fullpath
if __name__ == '__main__':
    for fullpath in find_all_files('/tmp'):
        print(fullpath)
In my experience, the dirnames value returned by os.walk is rarely needed, so I discarded it by binding it to _.
As for your question about files being executed or not -- I don't get it. Please explain.

Python - Open All Text Files in All Subdirectories Unless Text File Is In Specified Directory

I have a directory (named "Top") that contains ten subdirectories (named "1", "2", ... "10"), and each of those subdirectories contains a large number of text files. I would like to be able to open all of the files in subdirectories 2-10 without opening the files in the subdirectory 1. (Then I will open files in subdirectories 1 and 3-10 without opening the files in the subdirectory 2, and so forth). Right now, I am attempting to read the files in subdirectories 2-10 without reading the files in subdirectory 1 by using the following code:
import os, fnmatch
def findfiles (path, filter):
    for root, dirs, files in os.walk(path):
        for file in fnmatch.filter(files, filter):
            yield os.path.join(root, file)
for textfile in findfiles(r'C:\\Top', '*.txt'):
    if textfile in findfiles(r'C:\\Top\\1', '*.txt'):
        pass
    else:
        filename = os.path.basename(textfile)
        print(filename)
The trouble is, the if statement here ("if textfile in findfiles [...]") does not allow me to exclude the files in subdirectory 1 from the textfile list. Do any of you happen to know how I might modify my code so as to only print the filenames of those files in subdirectories 2-10? I would be most grateful for any advice you can lend on this question.
EDIT:
In case others might find it helpful, I wanted to post the code I ultimately ended up using to solve this problem:
import glob
for file in glob.glob('C:\\Text\\Digital Humanities\\Packages and Tools\\Stanford Packages\\training-the-ner-tagger\\fixed\\*\\*'):
    if not file.startswith('C:\\Text\\Digital Humanities\\Packages and Tools\\Stanford Packages\\training-the-ner-tagger\\fixed\\1\\'):
        print(file)
Change your loop to this:
for textfile in findfiles(r'C:\Top', '*.txt'):
    # The trailing separator keeps C:\Top\10 from matching as well.
    if not textfile.startswith('C:\\Top\\1\\'):
        filename = os.path.basename(textfile)
        print(filename)
The problem is simply that your raw-string constants contain extra backslashes. Write instead:
for textfile in findfiles(r'C:\Top', '*.txt'):
    if textfile in findfiles(r'C:\Top\1', '*.txt'):
        pass
    else:
        filename = os.path.basename(textfile)
        print(filename)
The \\ would be correct if you hadn't used raw (r'') strings.
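The difference between raw and regular string literals can be checked directly. A small illustrative sketch follows; the variable names are made up for the example.

```python
# In a raw string every backslash is kept literally.
single = r'C:\Top\1'
doubled = r'C:\\Top\\1'

# Without the r prefix, '\\' is an escape sequence for one backslash.
escaped = 'C:\\Top\\1'

print(single == escaped)          # True
print(single == doubled)          # False
print(len(single), len(doubled))  # 8 10
```

So r'C:\\Top' names a path containing two consecutive backslashes, which is not the string that os.path.join builds when it adds a single separator.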
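If this is too slow (every membership test walks the excluded directory again), build the set of excluded files once. It has to be a set or list, because a bare generator would be used up by the first test:
exclude = set(findfiles(r'C:\Top\1', '*.txt'))
for textfile in findfiles(r'C:\Top', '*.txt'):
    if textfile in exclude:
        pass
    else:
        filename = os.path.basename(textfile)
        print(filename)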
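Another option, not taken from the answers above, is to stop os.walk from entering the excluded subdirectory at all by pruning the dirs list in place, the same trick the os.walk documentation uses to skip CVS directories. A minimal sketch, where find_files_excluding and its parameters are hypothetical names:

```python
import fnmatch
import os

def find_files_excluding(top, pattern, skip_dir):
    """Yield files under top matching pattern, never descending into skip_dir."""
    skip = os.path.normcase(os.path.abspath(skip_dir))
    for root, dirs, files in os.walk(top):
        # Assign through dirs[:] so os.walk sees the pruned list.
        dirs[:] = [
            d for d in dirs
            if os.path.normcase(os.path.abspath(os.path.join(root, d))) != skip
        ]
        for name in fnmatch.filter(files, pattern):
            yield os.path.join(root, name)
```

Calling find_files_excluding(r'C:\Top', '*.txt', r'C:\Top\1') would then yield only the files in the other subdirectories. Because whole paths are compared, a sibling such as C:\Top\10 is not excluded by accident.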

Python - empty dirs & subdirs after a shutil.copytree function

This is part of a program I'm writing. The goal is to extract all the GPX files, say at G:\ (specified with -e G:\ at the command line). It would create an 'Exports' folder and dump all files with matching extensions there, recursively that is. Works great, a friend helped me write it!! Problem: empty directories and subdirectories for dirs that did not contain GPX files.
import argparse, shutil, os
def ignore_list(path, files): # This ignore list is specified in the function below.
    ret = []
    for fname in files:
        fullFileName = os.path.normpath(path) + os.sep + fname
        if not os.path.isdir(fullFileName) \
           and not fname.endswith('gpx'):
            ret.append(fname)
        # This isn't doing what it's supposed to.
        elif os.path.isdir(fullFileName) \
           and len(os.listdir(fullFileName)) == 0:
            ret.append(fname)
    return ret
def gpxextract(src,dest):
    shutil.copytree(src,dest,ignore=ignore_list)
Later in the program we have the call to gpxextract():
if args.extractpath:
    path = args.extractpath
    gpxextract(path, 'Exports')
So the above extraction does work. But the len function call above is designed to prevent the creation of empty dirs and does not. I know the best way is to os.rmdir somehow after the export, and while there's no error, the folders remain.
So how can I successfully prune this Exports folder so that only dirs with GPXs will be in there? :)
If I understand you correctly, you want to delete empty folders? If so, you can do a bottom-up delete, where os.rmdir fails (and is skipped) for any folder that is not empty. Something like:
for root, dirs, files in os.walk('Exports', topdown=False):
    for dn in dirs:
        pth = os.path.join(root, dn)
        try:
            os.rmdir(pth)  # only succeeds on empty directories
        except OSError:
            pass
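A different approach, swapped in here instead of shutil.copytree, is to never create the empty folders in the first place: walk the source and create a destination folder only when a matching file is about to be copied into it. This is a rough sketch under stated assumptions: copy_matching is a hypothetical helper, the destination lies outside the source tree, and the suffix match is case-insensitive (the original check was case-sensitive).

```python
import os
import shutil

def copy_matching(src, dest, suffix=".gpx"):
    """Copy files ending with suffix from src to dest, keeping relative paths."""
    copied = []
    for root, _, files in os.walk(src):
        for name in files:
            # Assumption: match the suffix case-insensitively.
            if not name.lower().endswith(suffix):
                continue
            rel = os.path.relpath(root, src)
            target_dir = os.path.normpath(os.path.join(dest, rel))
            # Create the folder only now that a matching file is known to exist.
            os.makedirs(target_dir, exist_ok=True)
            target = os.path.join(target_dir, name)
            shutil.copy2(os.path.join(root, name), target)
            copied.append(target)
    return copied
```

With this shape no pruning pass is needed afterwards, at the cost of giving up the ignore callback that copytree provides.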

How to recursively loop through a file structure and rename directories in python

I would like to recursively rename directories by changing the last character to lowercase (if it is a letter).
I have done this with the help of my previous posts (sorry for the double posting and not acknowledging the answers)
This code works for Files, but how can I adapt it for directories as well?
import fnmatch
import os
def listFiles(dir):
    rootdir = dir
    for root, subFolders, files in os.walk(rootdir):
        for file in files:
            yield os.path.join(root,file)
    return
for f in listFiles(r"N:\Sonstiges\geoserver\IM_Topo\GIS\MAPTILEIMAGES_0\tiles_2"):
    if f[-5].isalpha():
        os.rename(f,f[:-5]+f[-5].lower() + ".JPG")
        print("Renamed " + "---to---" + f[:-5]+f[-5].lower() + ".JPG")
The problem is that the default of os.walk is topdown. If you try to rename directories while traversing topdown, the results are unpredictable.
Try setting os.walk to go bottom up:
for root, subFolders, files in os.walk(rootdir,topdown=False):
Edit
Another problem is that listFiles() returns files, not directories.
This (untested) function returns directories from the bottom up:
def listDirs(dir):
    for root, subFolders, files in os.walk(dir, topdown=False):
        for folder in subFolders:
            yield os.path.join(root,folder)
    return
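Putting the two points together, the directory rename might look like the following sketch. The function name is hypothetical, and the code assumes that no sibling directory already carries the lowercased name; collisions are not handled.

```python
import os

def lowercase_last_char_of_dirs(top):
    """Rename each directory under top so its last character is lowercase."""
    renamed = []
    # Bottom-up: children are renamed before their parent, so the paths
    # yielded for the deeper levels are still valid when they are used.
    for root, dirs, _ in os.walk(top, topdown=False):
        for name in dirs:
            new_name = name[:-1] + name[-1].lower()
            if new_name != name:
                old = os.path.join(root, name)
                new = os.path.join(root, new_name)
                os.rename(old, new)
                renamed.append((old, new))
    return renamed
```

Characters that are not letters are left unchanged by str.lower(), so no separate isalpha() check is needed for the directory case.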
