I'm using os.walk to traverse my directories. The problem is I want to recognize if a file is a symbolic link, not following through with the link. This code:
for root, dirs, files in os.walk(PROJECT_PATH):
    for f in files:
        # I want os.path.islink(f) to return true for symlink here
        # instead of ignoring them by default
        pass
will not give me symlinks, while this code
for root, dirs, files in os.walk(PROJECT_PATH, followlinks=True):
    for f in files:
        pass
will walk the directories that the symlinks point to but doesn't give me the symlinks themselves. Thanks.
os.walk() does give you symlinks. There are three things to take into account:
os.path.islink(f) is incorrect: f is only a file name, so it is resolved against the current working directory. You have to call os.path.islink on os.path.join(root, f).
Symlinks that point to directories will be included in dirs (but not followed, unless you also specify followlinks=True, which you don't need to do, since you don't need to actually follow them).
Symlinks that point to non-directories will be included in files.
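The three points above can be put together in a minimal sketch. The helper name find_symlinks is made up for illustration; it assumes you only want to report the links, not follow them.

```python
import os

def find_symlinks(top):
    """Return the paths of all symlinks found below top, without following them."""
    links = []
    for root, dirs, files in os.walk(top):  # followlinks defaults to False
        # Symlinks to directories show up in dirs, all other symlinks in files.
        for name in dirs + files:
            path = os.path.join(root, name)  # islink needs the joined path
            if os.path.islink(path):
                links.append(path)
    return links
```

Calling find_symlinks(PROJECT_PATH) should then list both kinds of links, while the directories they point to are never walked.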
Related
Let
my_dir = "/raid/user/my_dir"
be a folder on my filesystem, which is not the current folder (i.e., it's not the result of os.getcwd()). I want to retrieve the absolute paths of all files at the first level of hierarchy in my_dir (i.e., the absolute paths of all files which are in my_dir, but not in a subfolder of my_dir) as a list of strings absolute_paths. I need it, in order to later delete those files with os.remove().
This is nearly the same use case as
Get absolute paths of all files in a directory
but the difference is that I don't want to traverse the folder hierarchy: I only need the files at the first level of hierarchy (at depth 0? not sure about terminology here).
It's easy to adapt that solution: Call os.walk() just once, and don't let it continue:
root, dirs, files = next(os.walk(my_dir, topdown=True))
files = [ os.path.join(root, f) for f in files ]
print(files)
You can use the os.path module and a list comprehension.
import os
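# os.listdir() returns bare names, so join them with my_dir before testing them;
# otherwise isfile() and abspath() would look in the current working directory.
absolute_paths = [os.path.join(my_dir, f) for f in os.listdir(my_dir) if os.path.isfile(os.path.join(my_dir, f))]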
You can use os.scandir, which returns an iterator of os.DirEntry objects; each entry has methods such as is_file() and is_dir() that distinguish files from directories.
with os.scandir(somePath) as it:
    paths = [entry.path for entry in it if entry.is_file()]
print(paths)
If you want to list directories as well, remove the condition from the list comprehension.
The documentation also has this note under os.listdir():
See also: The scandir() function returns directory entries along with file attribute information, giving better performance for many common use cases.
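For the original goal (absolute paths of the first-level files only), the scandir approach could be wrapped up as below. This is a sketch: the function name first_level_files is invented here, and it relies on entry.path being built from the path passed to os.scandir, so the directory is made absolute first.

```python
import os

def first_level_files(directory):
    """Return absolute paths of the files directly inside directory."""
    directory = os.path.abspath(directory)  # makes every entry.path absolute
    with os.scandir(directory) as entries:
        # is_file() follows symlinks by default, so links to files are included.
        return sorted(entry.path for entry in entries if entry.is_file())
```

The resulting list can be passed to os.remove() one path at a time; files in subfolders are never touched because scandir does not recurse.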
I have a file system that I want to be able to check and update using Python. My solution was os.walk, but it becomes problematic with my needs and my file system. This is how the directories are laid out:
Root
    dir1
        subdir
            1
            2
            3...
        file1
        file2
    dir2
        subdir
            1
            2
            3...
        file1
        file2
    ...
The main directories have different names hence "dir1" and "dir2" but the directories inside those have the same name as each other and contain a lot of different files and directories. The sub directories are the ones I want to exclude from os.walk as they add unnecessary computing.
Is there a way to exclude directories from os.walk based on the directory's name instead of path or will I need to do something else?
os.walk allows you to modify the list of directories it gives you. If you take some out, it won't descend into those directories.
for dirpath, dirnames, filenames in os.walk("/root/path"):
    if "subdir" in dirnames:
        dirnames.remove("subdir")
    # process the files here
(Note that this doesn't work if you use the bottom-up style of scanning. The top-down style is the default.)
See the documentation
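If several directory names need to be skipped, the same in-place pruning can be done with a slice assignment. The sketch below is only an illustration: EXCLUDED_NAMES and walk_without are hypothetical names, and the walk must stay top-down for the pruning to have any effect.

```python
import os

EXCLUDED_NAMES = {"subdir"}  # hypothetical set of directory names to skip

def walk_without(top, excluded=EXCLUDED_NAMES):
    """Yield file paths below top, never descending into excluded directory names."""
    for dirpath, dirnames, filenames in os.walk(top):  # top-down is the default
        # Slice assignment mutates the list os.walk holds; rebinding would not.
        dirnames[:] = [d for d in dirnames if d not in excluded]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Because the match is on the directory's name rather than its path, every "subdir" is skipped no matter which parent it sits under.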
I'm relatively new to Python and I'm trying my hand at a weekend project. I want to navigate through my music directories, get the artist name of each music file, and export that to a CSV so that I can upgrade my music collection (a lot of it is from when I was younger and didn't care about quality).
Anyway, I'm trying to get the path of each music file in its respective directory, so I can pass it to an ID3 tag reading module to get the artist name.
Here is what I'm trying:
import os

def main():
    for subdir, dirs, files in os.walk(dir):
        for file in files:
            if file.endswith(".mp3") or file.endswith(".m4a"):
                print(os.path.abspath(file))
However, .abspath() doesn't do what I think it should. If I have a directory like this:
music
--1.mp3
--2.mp3
--folder
----a.mp3
----b.mp3
----c.mp3
----d.m4a
----e.m4a
and I run my code, I get this output:
C:\Users\User\Documents\python_music\1.mp3
C:\Users\User\Documents\python_music\2.mp3
C:\Users\User\Documents\python_music\a.mp3
C:\Users\User\Documents\python_music\b.mp3
C:\Users\User\Documents\python_music\c.mp3
C:\Users\User\Documents\python_music\d.m4a
C:\Users\User\Documents\python_music\e.m4a
I'm confused why it doesn't show the 5 files being inside of a folder.
Aside from that, am I even going about this in the easiest or best way? Again, I'm new to Python, so any help is appreciated.
You are passing just the filename to os.path.abspath(), which has no context but your current working directory.
Join the path with the subdir parameter:
print(os.path.join(subdir, file))
From the os.path.abspath() documentation:
On most platforms, this is equivalent to calling the function normpath() as follows: normpath(join(os.getcwd(), path)).
so if your current working directory is C:\Users\User\Documents\python_music all your files are joined relative to that.
But os.walk gives you the correct location to base filenames off instead; from the documentation:
For each directory in the tree rooted at directory top (including top itself), it yields a 3-tuple (dirpath, dirnames, filenames).
dirpath is a string, the path to the directory. [...] filenames is a list of the names of the non-directory files in dirpath. Note that the names in the lists contain no path components. To get a full path (which begins with top) to a file or directory in dirpath, do os.path.join(dirpath, name).
Emphasis mine.
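Putting the fix into the original loop might look like the sketch below. The names music_files and MUSIC_EXTENSIONS are made up for illustration; the only real change is that each name is joined with the directory os.walk reports before it is made absolute.

```python
import os

MUSIC_EXTENSIONS = (".mp3", ".m4a")  # str.endswith accepts a tuple of suffixes

def music_files(top):
    """Yield the absolute path of every music file below top."""
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            if name.endswith(MUSIC_EXTENSIONS):
                # Join with dirpath first; abspath alone would use the cwd.
                yield os.path.abspath(os.path.join(dirpath, name))
```

Each yielded path can then be handed to whichever tag-reading module you choose.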
What would be the best method of getting the subdirectories of a drive, including the files located within them? Would it be best to use os.listdir() and tell directories from files by checking whether they have a '.' in them?
Any ideas would be helpful, and I would much prefer to use only the standard library for this task.
Take a look at os.walk(). It lets you visit each directory and get a list of files and a list of subdirectories for each directory that you visit.
Here is how you could only go down a single level:
for root, dirs, files in os.walk(path):
    # do whatever you want to with dirs and files
    if root != path:
        # one level down, modify dirs in place so we don't go any deeper
        del dirs[:]
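The same idea can be generalized to an arbitrary depth. The following is a rough sketch, not part of os itself: walk_limited and max_depth are hypothetical names, and depth is counted from the relative path between each visited directory and the starting point.

```python
import os

def walk_limited(top, max_depth):
    """Like os.walk, but descend at most max_depth levels below top."""
    top = os.path.normpath(top)
    for root, dirs, files in os.walk(top):
        rel = os.path.relpath(root, top)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        # Hand out a copy of dirs so the caller still sees the names ...
        yield root, list(dirs), files
        if depth >= max_depth:
            # ... then empty the real list so os.walk goes no deeper.
            del dirs[:]
```

With max_depth=1 this visits the starting directory and its immediate subdirectories, which matches the loop above.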
One can use os.listdir('somedir') to get all the entries under somedir. However, what I want is just the regular files (excluding directories), like the result of find . -type f in a shell.
I know one can use [path for path in os.listdir('somedir') if not os.path.isdir('somedir/'+path)] to achieve a similar result, as in this related question: How to list only top level directories in Python?. I'm just wondering if there are more succinct ways to do so.
You could use os.walk, which yields a tuple of path, folders and files for each directory it visits; the first tuple describes 'somedir' itself:
files = next(os.walk('somedir'))[2]
I have a couple of ways of doing such tasks. I cannot comment on how succinct they are. FWIW, here they are:
1. The code below picks up all files that end with .txt. You may want to remove the ".endswith" check.
import os
for root, dirs, files in os.walk('./'):  # current directory in terminal
    for file in files:
        if file.endswith('.txt'):
            # here you can do whatever you want to with the file
            print(os.path.join(root, file))
2. This code assumes that the path is passed to the function. It appends all .txt files to allfiles, and the .txt files found in subdirectories of the path are also appended to subfiles.
def readFilesNameList(path):
    basePath = path
    allfiles = []
    subfiles = []
    for root, dirs, files in os.walk(basePath):
        for f in files:
            if f.endswith('.txt'):
                allfiles.append(os.path.join(root, f))
                if root != basePath:
                    # this .txt file lives in a subdirectory of basePath
                    subfiles.append(os.path.join(root, f))
    return allfiles, subfiles
I know the code is just skeletal in nature, but I think you can get the general picture.
Post if you find a more succinct way! :)
The earlier os.walk answer is perfect if you only want the files in the top-level directory. If you want subdirectories' files too, though (a la find), you need to process each directory, e.g.:
def find_files(path):
    for prefix, _, files in os.walk(path):
        for name in files:
            yield os.path.join(prefix, name)
Now list(find_files('.')) is a list of the same thing find . -type f -print would have given you (the list is because find_files is a generator, in case that's not obvious).
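One caveat: os.walk also puts symlinks that point to non-directories (and other non-directory entries) into files, whereas find -type f reports regular files only. If that difference matters, a stricter variant could filter each path, as in this sketch; find_regular_files is a name invented for this example.

```python
import os

def find_regular_files(path):
    """Yield paths of regular files below path, skipping symlinks."""
    for prefix, _, files in os.walk(path):
        for name in files:
            full = os.path.join(prefix, name)
            # isfile() follows links, so rule out symlinks explicitly.
            if os.path.isfile(full) and not os.path.islink(full):
                yield full
```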