First off, let me apologize if the title is unclear.
To simplify a task I do at work, I've started writing this script to automate the removal of files from a certain path.
My issue is that in its current state, the script does not check the contents of the folders inside the folder given by the path.
I'm not sure how to fix this because, from what I can tell, it should already be checking those files.
import os

def depdelete(path):
    for f in os.listdir(path):
        if f.endswith('.exe'):
            os.remove(os.path.join(path, f))
            print('Dep Files have been deleted.')
        else:
            print('No Dep Files Present.')

def DepInput():
    print('Hello, Welcome to DepDelete!')
    print('What is the path?')
    path = input()
    depdelete(path)

DepInput()
Try using os.walk to traverse the directory tree, like this:
import os

def depdelete(path):
    # os.walk visits every directory under path, including nested subfolders.
    for root, _, file_list in os.walk(path):
        print("In directory {}".format(root))
        for file_name in file_list:
            if file_name.endswith(".exe"):
                os.remove(os.path.join(root, file_name))
                print("Deleted {}".format(os.path.join(root, file_name)))
Here are the docs (there are some usage examples towards the bottom): https://docs.python.org/3/library/os.html#os.walk
Currently, your code only loops over the files and folders directly inside the provided folder and checks each name. In order to also check the contents of folders within path, you have to make your code recursive.
You can use os.walk to go through the directory tree in path and then check its contents.
You'll find a more detailed answer with code examples at Recursive sub folder search and return files in a list python.
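For illustration, here is a minimal sketch of that explicit-recursion approach (the function name depdelete_recursive is made up for this example, not taken from the linked answer):

import os

def depdelete_recursive(path):
    # Check every entry in this folder; descend into subfolders.
    for name in os.listdir(path):
        full = os.path.join(path, name)
        if os.path.isdir(full):
            depdelete_recursive(full)
        elif name.endswith('.exe'):
            os.remove(full)
            print('Deleted {}'.format(full))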
Take a look at os.walk()
This function will iterate through sub-directories for you. The loop will look like this:
for subdir, dirs, files in os.walk(path):
    for f in files:
        if f.endswith('.exe'):
            fullFile = os.path.join(subdir, f)
            os.remove(fullFile)
            print(fullFile + " was deleted")
You're looking for os.walk(). In your case, it could work like so:
import os

def dep_delete(path):
    for root, dirs, files in os.walk(path):
        for f in files:
            if f.endswith('.exe'):
                os.remove(os.path.join(root, f))
                print('Dep files have been deleted.')

def dep_input():
    print('Hello, Welcome to dep_delete!')
    print('What is the path?')
    path = input()
    dep_delete(path)

dep_input()
Also see: List directory tree structure in python?
I am new to Python and I am trying to write a function that goes into a folder and prints the names of all the files it contains; if it finds a folder inside, it should go into it and print its files too, and keep doing that until there is nothing left. So far I haven't found a way to go that deep. Is there a way to do that recursively? How should I proceed? For some reason my code doesn't enter all subdirectories. Thanks in advance.
def list_files(startpath, d):
    for root, dirs, files in os.walk(startpath):
        for f in files:
            print(f)
        for di in dirs:
            print(di)
            list_files(di, d + 1)

list_files(path, 0)
Maybe you can check this answer:
Using os.walk() to recursively traverse directories in Python
which employs the os.walk() method like this:
import os

# traverse root directory, and list directories as dirs and files as files
for root, dirs, files in os.walk("."):
    path = root.split(os.sep)
    print((len(path) - 1) * '---', os.path.basename(root))
    for file in files:
        print(len(path) * '---', file)
This is part of a program I'm writing. The goal is to extract all the GPX files from, say, G:\ (specified with -e G:\ at the command line). It creates an 'Exports' folder and dumps all files with matching extensions there, recursively. It works great; a friend helped me write it! The problem: it also creates empty directories and subdirectories for dirs that did not contain GPX files.
import argparse, shutil, os

def ignore_list(path, files):  # This ignore list is specified in the function below.
    ret = []
    for fname in files:
        fullFileName = os.path.normpath(path) + os.sep + fname
        if not os.path.isdir(fullFileName) \
           and not fname.endswith('gpx'):
            ret.append(fname)
        # This isn't doing what it's supposed to.
        elif os.path.isdir(fullFileName) \
             and len(os.listdir(fullFileName)) == 0:
            ret.append(fname)
    return ret

def gpxextract(src, dest):
    shutil.copytree(src, dest, ignore=ignore_list)
Later in the program we have the call for extractpath():
if args.extractpath:
    path = args.extractpath
    gpxextract(extractpath, 'Exports')
So the above extraction does work. But the len function call above is designed to prevent the creation of empty dirs and does not. I know the best way is to os.rmdir somehow after the export, and while there's no error, the folders remain.
So how can I successfully prune this Exports folder so that only dirs with GPXs will be in there? :)
If I understand you correctly, you want to delete empty folders? If that is the case, you can do a bottom-up delete-folder operation, which will fail for any folders that are not empty. Something like:
for root, dirs, files in os.walk('G:/', topdown=False):  # bottom-up, so children are removed before parents
    for dn in dirs:
        pth = os.path.join(root, dn)
        try:
            os.rmdir(pth)
        except OSError:
            pass  # directory was not empty
My program does not believe that folders are directories; it assumes they're files. Because of this, the recursion prints the folders as files, and then, since there are no folders waiting to be traversed, the program finishes.
import os
import sys

class DRT:
    def dirTrav(self, dir, buff):
        newdir = []
        for file in os.listdir(dir):
            print(file)
            if(os.path.isdir(file)):
                newdir.append(os.path.join(dir, file))
        for f in newdir:
            print("dir: " + f)
            self.dirTrav(f, "")

dr = DRT()
dr.dirTrav(".", "")
See os.walk. From its documentation:
This example displays the number of bytes taken by non-directory files in each directory under the starting directory, except that it doesn’t look under any CVS subdirectory:
import os
from os.path import join, getsize

for root, dirs, files in os.walk('python/Lib/email'):
    print(root, "consumes", end=" ")
    print(sum(getsize(join(root, name)) for name in files), end=" ")
    print("bytes in", len(files), "non-directory files")
    if 'CVS' in dirs:
        dirs.remove('CVS')  # don't visit CVS directories
The problem is that you're not checking the right thing. file is just the filename, not the pathname. That's why you need os.path.join(dir, file) on the next line, right? So you need it in the isdir call, too. But you're just passing file.
So, instead of asking "is ./foo/bar/baz a directory?" you're just asking "is baz a directory?" It interprets a bare baz as ./baz, as you'd expect. And, since there (probably) is no ./baz, you get back False.
So, change this:
if(os.path.isdir(file)):
    newdir.append(os.path.join(dir, file))
to:
path = os.path.join(dir, file)
if os.path.isdir(path):
    newdir.append(path)
All that being said, using os.walk as sotapme suggested is simpler than trying to build it yourself.
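For reference, a minimal os.walk version of the same traversal might look like this (a sketch, not the poster's code; the function name dir_trav is made up here):

import os

def dir_trav(start):
    # os.walk yields each directory with its subdirectory and file names,
    # so no manual recursion or isdir checks are needed.
    for root, dirs, files in os.walk(start):
        for name in files:
            print(name)
        for d in dirs:
            print("dir: " + os.path.join(root, d))

dir_trav(".")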
What is the simplest way to get the full recursive list of files inside a folder with python? I know about os.walk(), but it seems overkill for just getting the unfiltered list of all files. Is it really the only option?
There's nothing preventing you from creating your own function:
import os

def listfiles(folder):
    for root, folders, files in os.walk(folder):
        for filename in folders + files:
            yield os.path.join(root, filename)
You can use it like so:
for filename in listfiles('/etc/'):
    print(filename)
os.walk() is not overkill by any means. It can generate your list of files and directories in a jiffy:
files = [os.path.join(dirpath, filename)
         for (dirpath, dirs, files) in os.walk('.')
         for filename in (dirs + files)]
You can turn this into a generator, to only process one path at a time and save on memory.
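A generator-expression version could look something like this (a minimal sketch of the idea):

import os

# Yields one path at a time instead of building the whole list in memory.
paths = (os.path.join(dirpath, filename)
         for (dirpath, dirs, files) in os.walk('.')
         for filename in (dirs + files))

for p in paths:
    print(p)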
You could also use the find program itself from Python via the sh module:
import sh
text_files = sh.find(".", "-iname", "*.txt")
Either that, or manually recurse with isdir()/isfile() and listdir(), or use subprocess.check_output() and call find. Basically, os.walk() is the highest level; slightly lower level is the semi-manual solution based on listdir(); and if for some reason you want the same output find . would give you, you can make a system call with subprocess.
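As a rough sketch of those two alternatives (the function name list_all is made up here):

import os
import subprocess

def list_all(folder):
    # Semi-manual recursion with listdir()/isdir(), one level at a time.
    for name in os.listdir(folder):
        full = os.path.join(folder, name)
        if os.path.isdir(full):
            list_all(full)
        else:
            print(full)

# Or shell out to find; check_output returns its stdout as bytes.
print(subprocess.check_output(['find', '.', '-type', 'f']).decode())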
pathlib.Path.rglob is pretty simple. It lists the entire directory tree.
(The argument is a filepath search pattern; "*" means list everything.)
import pathlib
for path in pathlib.Path("directory_to_list/").rglob("*"):
    print(path)
os.walk() is hard to use, just kick it and use pathlib instead.
Here is a Python function mimicking the list.files function from the R language.
import pathlib

def list_files(path, pattern, full_names=False, recursive=True):
    if recursive:
        files = pathlib.Path(path).rglob(pattern)
    else:
        files = pathlib.Path(path).glob(pattern)
    if full_names:
        files = [str(f) for f in files]
    else:
        files = [f.name for f in files]
    return files
import os

path = "path/to/your/dir"
for (path, dirs, files) in os.walk(path):
    print(files)
Is this overkill, or am I missing something?
One can use os.listdir('somedir') to get all the files under somedir. However, what I want is just the regular files (excluding directories), like the result of find . -type f under a shell.
I know one can use [path for path in os.listdir('somedir') if not os.path.isdir('somedir/'+path)] to achieve a similar result, as in this related question: How to list only top level directories in Python?. I'm just wondering if there are more succinct ways to do so.
You could use os.walk, which yields a (path, folders, files) tuple for each directory:
files = next(os.walk('somedir'))[2]
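If pathlib is available (Python 3.4+), an equivalent one-liner could look like this (a sketch, assuming somedir exists):

from pathlib import Path

files = [p.name for p in Path('somedir').iterdir() if p.is_file()]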
I have a couple of ways that I do such tasks. I cannot comment on the succinct nature of the solutions. FWIW, here they are:
1. The code below will take all files that end with .txt; you may want to remove the .endswith part.
import os

for root, dirs, files in os.walk('./'):  # current directory in terminal
    for file in files:
        if file.endswith('.txt'):
            pass  # here you can do whatever you want to with the file
2. This code assumes that the path is provided to the function and appends all .txt files to a list; if there are subdirectories in the path, it appends the files in those subdirectories to subfiles.
def readFilesNameList(self, path):
    basePath = path
    allfiles = []
    subfiles = []
    for root, dirs, files in os.walk(basePath):
        for f in files:
            if f.endswith('.txt'):
                allfiles.append(os.path.join(root, f))
                if root != basePath:
                    subfiles.append(os.path.join(root, f))
I know the code is just skeletal in nature, but I think you can get the general picture.
Post if you find the succinct way! :)
The earlier os.walk answer is perfect if you only want the files in the top-level directory. If you want subdirectories' files too, though (a la find), you need to process each directory, e.g.:
import os

def find_files(path):
    for prefix, _, files in os.walk(path):
        for name in files:
            yield os.path.join(prefix, name)
Now list(find_files('.')) is a list of the same thing find . -type f -print would have given you (the list call is there because find_files is a generator, in case that's not obvious).