How to iterate over folders in Python [duplicate] - python

This question already has answers here:
How can I iterate over files in a given directory?
(11 answers)
Closed 5 months ago.
I have a repository folder that contains 100 folders of images. I want to iterate over each folder and then over the images inside it.
For example: repository --> folder1 --> folder1_images, folder2 --> folder2_images, folder3 --> folder3_images
Does anyone know an elegant way of doing this?
P.S. My OS is macOS (so there are .DS_Store metadata files inside).

You can use os.walk to visit every subdirectory recursively. Here's a general starting point:
import os
parent_dir = '/home/example/folder/'
for subdir, dirs, files in os.walk(parent_dir):
    for file in files:
        print(os.path.join(subdir, file))
Instead of printing, you can do whatever you want with each path, such as checking whether the file is an image, as required here.
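
For the layout in the question (one repository folder holding image folders), a minimal sketch might look like the following. The function name iter_images and the IMAGE_EXTENSIONS set are assumptions made for illustration, not part of the answer above. Filtering by extension also skips macOS .DS_Store files.

```python
import os

# Assumed set of image extensions; adjust to the formats you actually use.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}

def iter_images(repository):
    """Yield (folder_name, image_path) for images one level below repository."""
    for folder in sorted(os.listdir(repository)):
        folder_path = os.path.join(repository, folder)
        if not os.path.isdir(folder_path):
            continue  # skips stray files such as a top-level .DS_Store
        for name in sorted(os.listdir(folder_path)):
            ext = os.path.splitext(name)[1].lower()
            if ext in IMAGE_EXTENSIONS:  # .DS_Store has no extension, so it is skipped
                yield folder, os.path.join(folder_path, name)
```

Calling list(iter_images(parent_dir)) would then give one (folder, path) pair per image, ready for whatever per-image processing is needed.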

Have a look at os.walk which is meant exactly to loop through sub-directories and the files in them.
More info at : https://www.tutorialspoint.com/python/os_walk.htm

Everyone has covered how to iterate through directories recursively, but if you don't need to go through all directories recursively, and you just want to iterate over the subdirectories in the current folder, you could do something like this:
dirs = list(filter(lambda d: os.path.isdir(d), os.listdir(".")))
Annoyingly, os.listdir lists files as well as directories, so you have to filter its result with os.path.isdir() to keep only the entries that are directories.
(Tested with Python 3.10.6)

Related

How can I loop over multiple selected folders to get a list of all their files?

I have multiple folders and want to read some of them. Each folder contains various files (images) whose paths I want to access. I used the code below:
[os.path.join(folder,fname) for fname in os.listdir(folder) for folder in selected_train]
Previously I had a list of folders in folders.
But I get the error below:
name 'folder' is not defined
How can I correct it?
You can use the os.walk function to navigate through the folders.
import os
subfiles=[x[2] for x in os.walk(".")]
print(subfiles)
If you want just one flat list, you can do this:
import os
subfiles = []
for x in os.walk("."):
    subfiles.extend(x[2])  # x[2] is the list of file names in one directory
print(subfiles)
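
As for the error in the question itself: the for clauses of a list comprehension are evaluated left to right, like nested for loops, so the clause that defines folder has to come before the clause that uses it. A minimal sketch, assuming selected_train is a list of folder paths as in the question (the function name list_files is hypothetical):

```python
import os

def list_files(selected_train):
    """Return the paths of all entries directly inside each selected folder."""
    return [
        os.path.join(folder, fname)
        for folder in selected_train             # outer loop comes first
        for fname in sorted(os.listdir(folder))  # inner loop can now use folder
    ]
```

Unlike the os.walk versions above, this does not descend into subfolders and it returns full paths rather than bare file names.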

Get all subdirectories quickly with Python

I know the question of how to list all subdirectories in a given directory is answered in this question from 2011. It includes this accepted solution:
subdirs = [x[0] for x in os.walk(dirToSearch)]
That works fine when there are only a few files in the directory. However I am trying to use this on folders that contain thousands of files, and os.walk is apparently iterating over all of them, meaning it takes a really long time to run. Is there a way to do this (identify all subdirectories) without getting bogged down by the files? An alternative to os.walk that ignores files?
I'm trying to do this on a Windows network directory.
Thanks,
Alex
You can use pathlib for this.
This will get all immediate subdirectories:
from pathlib import Path
p = Path('.')
subdirs = [x for x in p.iterdir() if x.is_dir()]
This will get all nested subdirectories:
for subdir in p.glob('**/'):
    print(subdir.name)
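
If speed is the main concern, os.scandir may also be worth trying. Its DirEntry.is_dir() can usually answer from information returned by the directory listing itself, without an extra system call per entry. Whether that helps on a network share is something to measure rather than assume. A rough sketch (the function name iter_subdirs is hypothetical):

```python
import os

def iter_subdirs(top):
    """Yield the path of every subdirectory below top, recursively."""
    with os.scandir(top) as entries:
        # Collect the directories first so the scandir handle is closed
        # before recursing; symlinked directories are not followed.
        subdirs = [e.path for e in entries if e.is_dir(follow_symlinks=False)]
    for path in subdirs:
        yield path
        yield from iter_subdirs(path)
```

Because this is a generator, list(iter_subdirs(dirToSearch)) gives the full list, while a plain for loop lets you start processing before the whole tree has been scanned.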

Rename part of the name in any files or directories in Python [duplicate]

This question already has answers here:
How to rename a file using Python
(17 answers)
Closed 3 years ago.
I have a root folder with several folders and files, and I need to use Python to rename every file and folder whose name matches. For example, I want to rename files and folders that contain the word "test", replacing it with "earth".
I'm using Ubuntu Server 18.04. I have already tried several pieces of code, but I'll only show the last one. I think this should be really easy, but I have almost no knowledge of Python and this is the only solution I currently have.
import os
def replace(fpath, test, earth):
    for path, subdirs, files in os.walk(fpath):
        for name in files:
            if(test.lower() in name.lower()):
                os.rename(os.path.join(path,name), os.path.join(path,
                          name.lower().replace(test,earth)))
It is expected to go through all files and folders and change "test" to "earth" in their names.
Here's some working code for you:
import os

def replace(fpath):
    original = os.getcwd()  # remember where we started
    os.chdir(fpath)
    for file in os.listdir():
        if os.path.isdir(file):
            replace(file)  # rename the folder's contents first
        if 'test' in file:
            os.rename(file, file.replace('test', 'earth'))
    os.chdir(original)  # go back to the starting directory
Here's an explanation of the code:
First we remember the current directory and switch to the desired folder
Then we get a list of the names in that folder and iterate through them
If the entry it is currently iterating over is a folder, it runs the function again on that folder, and once that is done it reverts back to the original directory
Finally, any entry with 'test' in its name is renamed to the version with 'test' replaced by 'earth'
Edited to add recursive iteration through subfolders.
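
An alternative that avoids changing the working directory is to walk the tree bottom-up with os.walk(topdown=False), so that the contents of a folder are renamed before the folder itself. This is only a sketch: the function name rename_all and its parameters are made up for illustration, the match is case-sensitive, and name collisions are not handled.

```python
import os

def rename_all(root, old, new):
    """Rename every file and folder below root whose name contains old."""
    # topdown=False visits the deepest directories first, so renaming a
    # folder never invalidates a path that is still waiting to be visited.
    for path, subdirs, files in os.walk(root, topdown=False):
        for name in files + subdirs:
            if old in name:
                os.rename(os.path.join(path, name),
                          os.path.join(path, name.replace(old, new)))
```

For the question's case the call would be rename_all(fpath, 'test', 'earth'). The root folder itself is left untouched.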

Getting the absolute paths of all files in a folder, without traversing the subfolders

Let
my_dir = "/raid/user/my_dir"
be a folder on my filesystem, which is not the current folder (i.e., it's not the result of os.getcwd()). I want to retrieve the absolute paths of all files at the first level of hierarchy in my_dir (i.e., the absolute paths of all files which are in my_dir, but not in a subfolder of my_dir) as a list of strings absolute_paths. I need it, in order to later delete those files with os.remove().
This is nearly the same use case as
Get absolute paths of all files in a directory
but the difference is that I don't want to traverse the folder hierarchy: I only need the files at the first level of hierarchy (at depth 0? not sure about terminology here).
It's easy to adapt that solution: Call os.walk() just once, and don't let it continue:
root, dirs, files = next(os.walk(my_dir, topdown=True))
files = [ os.path.join(root, f) for f in files ]
print(files)
You can use the os.path module and a list comprehension.
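import os
# Join each name with my_dir first; bare names would be resolved against the current working directory
absolute_paths = [os.path.abspath(os.path.join(my_dir, f)) for f in os.listdir(my_dir) if os.path.isfile(os.path.join(my_dir, f))]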
You can use os.scandir, which returns an iterator of os.DirEntry objects that offer a variety of options, including the ability to distinguish files from directories.
with os.scandir(somePath) as it:
    paths = [entry.path for entry in it if entry.is_file()]
print(paths)
If you want to list directories as well, you can, of course, remove the condition from the list comprehension.
The documentation also has this note under os.listdir:
See also: The scandir() function returns directory entries along with file attribute information, giving better performance for many common use cases.
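
The same first-level listing can also be written with pathlib. A small sketch (the function name first_level_files is hypothetical); resolve() makes each path absolute even when my_dir is given as a relative path, and it also resolves symlinks along the way:

```python
from pathlib import Path

def first_level_files(my_dir):
    """Return absolute paths (as strings) of the files directly in my_dir."""
    # iterdir() lists only the immediate children, so subfolders are not traversed
    return sorted(str(p.resolve()) for p in Path(my_dir).iterdir() if p.is_file())
```

The returned strings can be passed directly to os.remove().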

Check if there are .format files in a directory

I have been trying to figure out for a while how to check if there are .pkl files in a given directory. I checked the website and I could find ways to find if there are files in the directory and list them, but I just want to check if they are there.
My directory contains a total of 7 .pkl files. As soon as I create one, the others are created too, so checking that one exists is enough to know that all seven exist. Therefore, I would like to check whether there is any .pkl file.
This is working if I do:
os.path.exists('folder1/folder2/filename.pkl')
But then I have to write out one of my file names. I would like to do this without naming a specific file. I also tried
os.path.exists('folder1/folder2/*.pkl')
but that doesn't work either, as I don't have any file literally named *.pkl.
You can use the Python module glob (https://docs.python.org/3/library/glob.html)
Specifically, glob.glob('folder1/folder2/*.pkl') will return a list of all .pkl files in folder2.
You can use:
import os
for dir_path, dir_names, file_names in os.walk(search_dir):
    # Go over all files in this directory
    for file_name in file_names:
        if file_name.endswith(".pkl"):
            # Do something here, e.g. report the match and stop looking in this directory
            print(os.path.join(dir_path, file_name))
            break
Note: This can be used if you want to search an entire directory, including its subdirectories.
If you want to search only one directory, you can run the for loop over os.listdir(path) instead.
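
Since the question only asks whether any such file exists, the check can stop at the first match instead of building a full list. A minimal sketch using pathlib (the function name has_files_with_suffix is an assumption for illustration):

```python
from pathlib import Path

def has_files_with_suffix(directory, suffix=".pkl"):
    """Return True if directory directly contains at least one file ending in suffix."""
    # Path.glob() returns an iterator, so next() stops at the first match
    return next(Path(directory).glob("*" + suffix), None) is not None
```

For the question's layout this would be called as has_files_with_suffix('folder1/folder2'). It looks only at that one directory, not at its subdirectories.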
