Verifying the size of a sub directory - python

I want to write a shell script that aims to traverse a root directory C:\xyz and find whether a sub directory "Demo" is available in any directories or not. If "Demo" is available & not empty, print 'OK', if "Demo" directory is empty, print 'NOT_OK'.

(Assuming you want a Python script to be executed in the terminal and not an actual shell script)
You can use os.walk to recursively list all the files and subdirectories in a given directory. It returns a generator of tuples of the form ("path/from/root", [directories], [files]), so you can check whether any visited directory (a) is the desired Demo directory and (b) is not empty.
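import os
root_dir = r"C:\xyz"
# Compare the last path component so the check works with both "/" and "\" separators.
exists_and_not_empty = any(os.path.basename(path) == "Demo" and (dirs or files)
                           for (path, dirs, files) in os.walk(root_dir))
print("OK" if exists_and_not_empty else "NOT_OK")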
This will also return True if the target directory is contained directly in the root directory, or if it is nested deeper in subdirectories of subdirectories. Not entirely clear from your question if this is intended, or if the target directory has to be in a direct subdirectory of the root directory.
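If only direct subdirectories of the root should count, the check can be narrowed. The following is a minimal sketch under that assumption; the function name demo_status and the choice to report 'NOT_OK' when no Demo directory exists at all are illustrative and not part of the question.

```python
import os

def demo_status(root_dir, name="Demo"):
    """Return 'OK' if a direct subdirectory of root_dir holds a non-empty `name` directory."""
    for entry in os.scandir(root_dir):
        if not entry.is_dir():
            continue
        candidate = os.path.join(entry.path, name)
        # os.listdir returns an empty list for an empty directory
        if os.path.isdir(candidate) and os.listdir(candidate):
            return "OK"
    # Assumption: a missing Demo directory is reported the same way as an empty one.
    return "NOT_OK"
```

It would be called as print(demo_status(r"C:\xyz")).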

To see whether any directory under xyz contains another directory named 'Demo' and, if so, whether 'Demo' is empty, you can use the glob module:
from glob import glob
if glob("C:\\xyz\\**\\Demo\\*", recursive=True):
    print("OK")
else:
    print("NOT_OK")
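One caveat with the glob pattern: by default, * does not match names that start with a dot, so a Demo directory holding only hidden files would be reported as empty. A possible alternative, sketched here with pathlib (the function name has_non_empty_dir is hypothetical), counts every entry:

```python
from pathlib import Path

def has_non_empty_dir(root_dir, name="Demo"):
    """Return True if any directory called `name` below root_dir has at least one entry."""
    for candidate in Path(root_dir).rglob(name):
        # iterdir() also yields hidden entries, unlike a "*" glob pattern
        if candidate.is_dir() and any(candidate.iterdir()):
            return True
    return False
```

Usage would be print("OK" if has_non_empty_dir(r"C:\xyz") else "NOT_OK").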

Related

How to use data files of sub-directories and perform iterative operation in python

I have my jupyter notebook (python script) in current directory. In current directory, I have two subfolders, namely a and b. In both directories a and b I have equal number of .dat files with same names. For example, directory a contains files, namely x1-x1-val_1, x1-x1-val_5, x1-x1-val_11...x1-x1-val_86 and x1-x2-val_1, x1-x2-val_5, x1-x2-val_11...x1-x2-val_86, i.e. values are in range(1,90,5). Likewise I have files in directory b.
I want to use my python script to access files in a and b to perform iterative operations on .dat files. My present code works only if I keep files of directory a or b in current directory. For example, my script uses following function.
import numpy as np

def get_info(test):
    my_dict = {'test': test}
    c = []
    for i in range(1, 90, 5):
        x_val = 'x_val_' + test + '-val_' + str(i)
        y_val = 'y_val_' + test + '-val_' + str(i)
        # read columns 1 and 2 of the .dat file into two arrays
        my_dict[x_val], my_dict[y_val] = np.loadtxt(test + '-val_' + str(i) + '.dat',
                                                    usecols=(1, 2), unpack=True)
        dw = compute_yy(my_dict[x_val], my_dict[y_val], test)  # compute_yy: user-defined, not shown
        c.append(dw)
    my_dict.update({test + '_c': np.array(c)})
    return my_dict
I call get_info() by using following:
tests = ['x1-x1', 'x1-x2']
new_dict = {}
for i in tests:
    new_dict.update({i: get_info(i)})
How can I use my code to access files in either directory a and/or b? I know it's about providing the correct path, but I am unsure how to do so. One way I thought of is the following:
ext = '.dat'
for files in os.listdir(path_to_dir):
    if files.endswith(ext):
        print(files) # do operations
Alternative could be to make use of os.path.join(). However, I am unable to solve it such that I can use same python script (with minimum changes perhaps) that can use files and iterate on them which are in subfolders a and b. Thanks for your feedback in advance!
If you want to run get_info() on each folder separately, there are two methods:
First: described by @medium-dimensional in a comment
You can use os.chdir(folder) to change the current working directory. The code will then operate on the files in that folder.
You can see the current working directory with print(os.getcwd())
os.chdir("a")
get_info(i)
os.chdir("..") # move back to parent folder
os.chdir("b")
get_info(i)
os.chdir("..") # move back to parent folder
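chdir() (similar to the cd command in a console) accepts a relative path (r"a"), a full path (r"C:\full\path\to\a"), and .. to move to the parent folder (r"a\..\b").
If the files are in nested folders, a single .. will not take you back to the starting folder. In that case, save the starting folder with getcwd() and return to it afterwards: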
cwd = os.getcwd()
os.chdir("folder1/folder2/a")
get_info(i)
os.chdir(cwd) # move back to previous folder
os.chdir("folder1/folder2/b")
get_info(i)
os.chdir(cwd) # move back to previous folder
(BTW: in console on Linux you can use cd - to move back to previous folder)
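The save-and-restore pattern above can be wrapped in a context manager so the original folder is restored even if get_info() raises. This is a minimal sketch; the name working_directory is an illustrative choice. Python 3.11 and later ship contextlib.chdir, which serves the same purpose.

```python
import os
from contextlib import contextmanager

@contextmanager
def working_directory(path):
    """Temporarily switch the current working directory to `path`."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # always return to the starting folder, even after an exception
        os.chdir(previous)
```

It could then be used as: with working_directory("a"): get_info(i)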
Second: include the folder in the path when you open the file
Every function that takes a filename also accepts a path of the form folder\filename (a relative path, a full path, or a path containing ..), such as
r"a\filename.dat"
r"C:\full\path\to\b\filename.dat"
r"a\..\b\filename.dat"
So you could define the function with an extra folder parameter
def get_info(test, folder):
and use this folder when you read the file
np.loadtxt(folder + '\\' + test + '-val_' + str(i) + '.dat', ...)
or, more readable, with an f-string
np.loadtxt(rf'{folder}\{test}-val_{i}.dat', ...)
And later you run it as
get_info(i, "a")
get_info(i, "b")
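Building the path with os.path.join avoids hard-coding the backslash and keeps the code portable. The helpers below are a sketch only; data_file and existing_data_files are hypothetical names, and the file naming pattern is taken from the question.

```python
import os

def data_file(folder, test, i, ext=".dat"):
    """Build the path of one data file, e.g. a/x1-x1-val_1.dat."""
    return os.path.join(folder, f"{test}-val_{i}{ext}")

def existing_data_files(folder, test, values=range(1, 90, 5)):
    """Return the paths in `folder` that actually exist for the given test name."""
    paths = (data_file(folder, test, i) for i in values)
    return [p for p in paths if os.path.isfile(p)]
```

Inside get_info(test, folder), np.loadtxt(data_file(folder, test, i), ...) would then read from whichever folder is passed in.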

Deleting folder only if it does not contain any files

Is there a way to delete a folder in Python only if it does not contain any files? I know of the following:
os.rmdir() will remove an empty directory.
shutil.rmtree() will delete a directory and all its contents.
If the folder has empty sub-folders, it should be deleted too
os.removedirs(path)
Remove directories recursively. Works like rmdir() except that, if the
leaf directory is successfully removed, removedirs() tries to
successively remove every parent directory mentioned in path until an
error is raised (which is ignored, because it generally means that a
parent directory is not empty).
e.g.
import os
if not os.listdir(path):  # path is the folder to check
    os.removedirs(path)
See more details from os.removedirs.
Hope this helps.
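Note that the os.listdir check treats empty sub-folders as content, so a folder that holds only empty sub-folders is left in place. One way to cover that case is to walk the tree first and delete it bottom-up only when no files are found. This is a sketch, not a definitive implementation: remove_if_no_files is a hypothetical name, and symbolic links to directories are not handled.

```python
import os

def remove_if_no_files(path):
    """Delete `path` and its sub-folders only if the whole tree contains no files."""
    if any(files for _root, _dirs, files in os.walk(path)):
        return False  # at least one file somewhere below path: leave everything alone
    # topdown=False visits the deepest folders first, so each rmdir sees an empty folder
    for current, _dirs, _files in os.walk(path, topdown=False):
        os.rmdir(current)
    return True
```

Unlike os.removedirs, this does not touch the parent directories of path.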

Google app engine os.path functions not working

I am seeing a strange and frustrating behavior in Google App Engine. I am launching google app engine from a particular folder like this: (dev_appserver .) In this folder I have a folder called data with a few subfolders under it (one of them is json_folder which contains a file temp.json). From inside my main python src that contains the MainHandler class I do this:
print os.getcwd(), os.path.isfile("data/json_folder/temp.json")
It prints the expected cwd and False even though the json_folder/temp.json exists and if I launch a regular python shell from the same directory it correctly prints True. Why does it work in regular Python but not in GAE python?
I also tried the following. I walk through my current dir and list the subfolders under data but the isdir() returns false even for directories! Why is python thinking they are not directories? ls -al shows them to be directories it makes no sense:
(this prints json_folder, False as one of the outputs, all subfolders are returning False):
for root, dirs, files in os.walk("data"):
    for file in files:
        print file, os.path.isdir(file)
os.walk yields bare names, not paths, so you need to os.path.join each name with root before testing it. A bare name is resolved against the cwd, so the check only works for entries that sit directly in the cwd. For anything deeper, you are asking whether an entry with that name exists in the cwd. To test the entries in files, you also want isfile rather than isdir:
print file, os.path.isfile(os.path.join(root, file))
If you want to check for directories, you need to iterate over dirs, not files:
for root, dirs, files in os.walk("data"):
    for _dir in dirs:
        print _dir, os.path.isdir(os.path.join(root, _dir))
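The same idea in Python 3 syntax, as a small self-contained sketch (is_dir_map is a hypothetical helper name): every name from os.walk is joined with root before it is tested.

```python
import os

def is_dir_map(top):
    """Map the full path of every entry below `top` to whether it is a directory."""
    results = {}
    for root, dirs, files in os.walk(top):
        for name in dirs + files:
            full_path = os.path.join(root, name)  # bare names are relative to root, not the cwd
            results[full_path] = os.path.isdir(full_path)
    return results
```

For the layout in the question, is_dir_map("data") would map the joined path of json_folder to True and that of temp.json to False.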

recursive script to rename folders ending with a space or period

We just switched over our storage server to a new file system. The old file system allowed users to name folders with a period or space at the end. The new system considers this an illegal character. How can I write a Python script to recursively loop through all directories and rename any folder that has a period or space at the end?
Use os.walk. Give it a root directory path and it will recursively iterate over it. Do something like
for root, dirs, files in os.walk('root path'):
    for dir in dirs:
        if dir.endswith(' ') or dir.endswith('.'):
            os.rename(...)
EDIT:
We should actually rename the leaf directories first - here is the workaround:
alldirs = []
for root, dirs, files in os.walk('root path'):
    for dir in dirs:
        alldirs.append(os.path.join(root, dir))
# the following two lines make sure that leaf directories are renamed first
alldirs.sort()
alldirs.reverse()
for dir in alldirs:
    if ...:
        os.rename(...)
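A variant of the leaf-first idea is to let os.walk do the ordering with topdown=False, which yields each directory only after its subdirectories. The sketch below fills in the elided parts under stated assumptions: the new name is the old one with trailing spaces and periods stripped, and a rename is skipped when the target already exists. The function name is hypothetical.

```python
import os

def strip_trailing_dots_and_spaces(top):
    """Rename directories below `top` whose names end with a space or period."""
    renamed = []
    # topdown=False: a directory's subtree is handled before the directory itself
    for root, dirs, _files in os.walk(top, topdown=False):
        for name in dirs:
            fixed = name.rstrip(" .")
            if fixed == name or not fixed:
                continue  # nothing to fix, or the name would become empty
            source = os.path.join(root, name)
            target = os.path.join(root, fixed)
            if os.path.exists(target):
                continue  # assumption: skip rather than overwrite an existing folder
            os.rename(source, target)
            renamed.append((source, target))
    return renamed
```

The returned list of (old, new) pairs can be logged or reviewed before running the script on the real storage tree.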
You can use os.listdir to list the folders and files on some path. This returns a list that you can iterate through. For each list entry, use os.path.join to combine the file/folder name with the parent path and then use os.path.isdir to check if it is a folder. If it is a folder then check the last character's validity and, if it is invalid, change the folder name using os.rename. Once the folder name has been corrected, you can repeat the whole process with that folder's full path as the base path. I would put the whole process into a recursive function.
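The recursive approach described above might look like the following sketch. The name fix_names is hypothetical, and as an assumption the function leaves a folder untouched when the corrected name is already taken.

```python
import os

def fix_names(base):
    """Recursively rename folders under `base` that end with a space or period."""
    for name in os.listdir(base):
        path = os.path.join(base, name)
        if not os.path.isdir(path) or os.path.islink(path):
            continue  # only real folders are renamed and descended into
        fixed = name.rstrip(" .")
        if fixed and fixed != name:
            target = os.path.join(base, fixed)
            if not os.path.exists(target):  # assumption: never overwrite an existing entry
                os.rename(path, target)
                path = target
        # repeat the process with the (possibly renamed) folder as the new base
        fix_names(path)
```

Because the parent is renamed before the recursion continues, every path passed down is already valid on the new file system.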

Visiting multiple folders with extensions

I'm working on something here, and I'm completely confused. Basically, I have the script in my directory, and that script has to run on multiple folders with a particular extension. Right now, I have it up and running on a single folder. Here's the structure, I have a main folder say, Python, inside that I have multiple folders all with the same .ext, and inside each sub-folder I again have few folders, inside which I have the working file.
Now, I want the script to visit the whole path say, we are inside the main folder 'python', inside which we have folder1.ext->sub-folder1->working-file, come out of this again go back to the main folder 'Python' and start visiting the second directory.
Now there are so many things in my head, the glob module, os.walk, or the for loop. I'm getting the logic wrong. I desperately need some help.
Say, Path=r'\path1'
How do I start about? Would greatly appreciate any help.
I'm not sure if this is what you want, but this main function with a recursive helper function gets a dictionary of all of the files in a main directory:
import os

def getFiles(path):
    '''Gets all of the files in a directory'''
    paths = {}
    for p in os.listdir(path):
        print(p)
        pDir = os.path.join(path, p)
        if os.path.isdir(pDir):
            getAllFiles(pDir, paths)  # adds the nested files to paths
        else:
            paths[p] = pDir
    return paths

def getAllFiles(mainPath, paths=None):
    '''Helper function for getFiles(path)'''
    if paths is None:  # avoid a shared mutable default argument
        paths = {}
    for name in os.listdir(mainPath):
        pathDir = os.path.join(mainPath, name)  # join with the parent folder
        if os.path.isdir(pathDir):
            getAllFiles(pathDir, paths)
        else:
            paths[name] = pathDir
    return paths
This returns a dictionary of the form {'my_file.txt': 'C:\User\Example\my_file.txt', ...}.
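The same dictionary can also be built without a hand-written recursion by using os.walk. A minimal sketch follows; collect_files is a hypothetical name, and, as with the functions above, files with the same name in different folders overwrite each other.

```python
import os

def collect_files(path):
    """Map each file name below `path` to its full path."""
    found = {}
    for root, _dirs, files in os.walk(path):
        for name in files:
            # later duplicates of the same file name replace earlier ones
            found[name] = os.path.join(root, name)
    return found
```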
Since you distinguish first-level directories from their sub-directories, you could do something like this:
# this is a generator to get all first level directories
dirs = (os.path.join(my_path, d) for d in os.listdir(my_path)
        if os.path.isdir(os.path.join(my_path, d))
        and os.path.splitext(d)[-1] == my_ext)
for d in dirs:
    for root, sub_dirs, files in os.walk(d):
        for f in files:
            pass  # call your script on each file, i.e. os.path.join(root, f)
You could use Formic (disclosure: I am the author). Formic allows you to specify one multi-directory glob to match your files so eliminating directory walking:
import formic
fileset = formic.FileSet(include="*.ext/*/working-file", directory=r"path1")
for file_name in fileset:
    pass  # Do something with file_name
A couple of points to note:
/*/ matches every subdirectory, while /**/ recursively descends into every subdirectory, their subdirectories and so on. Some options:
If the working file is precisely one directory below your *.ext, then use /*/
If the working file is at any depth under *.ext, then use /**/ instead.
If the working file is at least one directory below your *.ext, then you might use /*/**/
Formic starts searching in the current working directory. If this is the correct directory, you can omit the directory=r"path1"
I am assuming the working file is literally called working-file. If not, substitute a glob that matches it, like *.sh or script-*.
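For comparison, the fixed-depth case (the working file exactly one directory below each *.ext folder) can also be expressed with the standard library's glob module. This is a sketch under the same assumptions as above: the file is literally called working-file and the folders end in .ext. The function name find_working_files is hypothetical.

```python
import glob
import os

def find_working_files(main_dir, ext=".ext", filename="working-file"):
    """Return main_dir/<anything><ext>/<any sub-folder>/<filename> matches, sorted."""
    # glob.escape protects special characters that may occur in main_dir itself
    pattern = os.path.join(glob.escape(main_dir), "*" + ext, "*", filename)
    return sorted(glob.glob(pattern))
```

It would be called as: for file_name in find_working_files(r"path1"): ...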
