So I'm trying to rename a list of files with set renames like so:
import os
import time
for fileName in os.listdir("."):
    os.rename(fileName, fileName.replace("0001", "00016.5"))
    os.rename(fileName, fileName.replace("0002", "00041"))
    os.rename(fileName, fileName.replace("0003", "00042"))
...
but that gives me this error:
os.rename(fileName, fileName.replace("0002", "00041"))
OSError: [Errno 2] No such file or directory
(the file is in the directory)
So next I tried:
import os
import time
for fileName in os.listdir("."):
    os.rename(fileName, fileName.replace("0001", "00016.5"))
for fileName in os.listdir("."):
    os.rename(fileName, fileName.replace("0002", "00041"))
for fileName in os.listdir("."):
    os.rename(fileName, fileName.replace("0003", "00042"))
...
But that renames the files very strangely, with a lot of extra characters.
What am I doing wrong here?
The fact that the multi-pass version runs without an error while the single-pass version doesn't means that some of your file names contain the 0001 pattern as well as the 0002 pattern.
So with a single loop, you rename files but keep working from the old names (listdir returns a list, so it is outdated as soon as you rename a file), and some source files can no longer be found.
With multiple passes, you're applying several renames to some files.
That could work (and is more compact):
for fileName in os.listdir("."):
    for before,after in (("0001", "00016.5"),("0002", "00041"),("0003", "00042")):
        if os.path.exists(fileName):
            newName = fileName.replace(before,after)
            # file hasn't been renamed: rename it (only if different)
            if newName != fileName:
                os.rename(fileName,newName)
Basically, I don't rename a file if it no longer exists (which means it has been renamed in a previous iteration), so each file is renamed at most once. You just have to decide which replacement takes priority.
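The same "rename at most once" idea can also be sketched without the existence check: take one snapshot of the directory, pick a single new name per file, and rename once. This is only a sketch; the names RENAMES, plan_new_name and rename_all are made up for illustration, and the order of the pairs is assumed to be the priority.

```python
import os

# Hypothetical mapping; the order decides which pattern wins when several match.
RENAMES = (("0001", "00016.5"), ("0002", "00041"), ("0003", "00042"))

def plan_new_name(name, renames=RENAMES):
    """Return the name rewritten by the first matching pattern, or unchanged."""
    for before, after in renames:
        if before in name:
            return name.replace(before, after)
    return name

def rename_all(directory, renames=RENAMES):
    """Rename each entry at most once, working from one snapshot of the listing."""
    renamed = []
    for name in sorted(os.listdir(directory)):
        new_name = plan_new_name(name, renames)
        if new_name != name:
            # Build full paths so this works regardless of the current directory.
            os.rename(os.path.join(directory, name),
                      os.path.join(directory, new_name))
            renamed.append((name, new_name))
    return renamed
```

Calling rename_all(".") would process the current directory. Because the new name is computed once per file, a name that contains two patterns is only rewritten by the first one in RENAMES.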
os.listdir returns the names of all entries (files, directories, ...), not full paths. You can construct a full path using os.path.join().
Your for loop tries to rename every entry it finds, first to 00016.5, then to 00041, and so on.
One way to rename the files could be the following:
import os
import time
currentDir = os.path.dirname(os.path.abspath(__file__))
for fileName in os.listdir(currentDir):
    if '0001' in fileName:
        oldPath = os.path.join(currentDir, fileName)
        newPath = os.path.join(currentDir, fileName.replace("0001", "00016.5"))
    elif '0002' in fileName:
        oldPath = os.path.join(currentDir, fileName)
        newPath = os.path.join(currentDir, fileName.replace("0002", "00041"))
    else:
        continue
    os.rename(oldPath, newPath)
I am trying the script below to rename all files in a folder. It works fine, but when I try to run it from outside the folder, it shows an error.
import os
path=os.getcwd()
path=os.path.join(path,'it')
filenames = os.listdir(path)
i=0
for filename in filenames:
    os.rename(filename, "%d.jpg"%i)
    i=i+1
'it' is the name of the folder in which the files lie.
Error: FileNotFoundError: [Errno 2] No such file or directory: '0.jpg' -> '0.jpg'
Printing filenames shows the names of the files.
When you do os.listdir(path) you get the filenames of files in the folder, but not the complete paths to those files. When you call os.rename you need the path to the file rather than just the filename.
You can join the filename to its parent folder's path using os.path.join.
E.g. os.path.join(path, file).
Something like this might work:
for filename in filenames:
    old = os.path.join(path, filename)
    new = os.path.join(path, "%d.jpg"%i)
    os.rename(old, new)
    i=i+1
You need to give the complete or relative path to the file.
In this case, it should be
path + '/' + filename
or more generally,
newpath = os.path.join(path, filename)
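Putting the pieces from these answers together, a self-contained version might look like the sketch below. The function name number_files and its extension parameter are hypothetical; the sketch also assumes you want to skip subdirectories and refuse to overwrite an existing file, which the original script does not do.

```python
import os

def number_files(folder, extension=".jpg"):
    """Rename every regular file in `folder` to 0<ext>, 1<ext>, ... and return the new names."""
    # Snapshot and sort the names first so the numbering is deterministic.
    names = sorted(
        name for name in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, name))
    )
    new_names = []
    for index, name in enumerate(names):
        # listdir gives bare names, so join them with the folder path.
        old = os.path.join(folder, name)
        new = os.path.join(folder, "%d%s" % (index, extension))
        if old != new:
            if os.path.exists(new):
                # Refuse to overwrite a file that is already there.
                raise FileExistsError(new)
            os.rename(old, new)
        new_names.append(os.path.basename(new))
    return new_names
```

For the folder in the question this would be called as number_files(os.path.join(os.getcwd(), 'it')), and it works from any working directory because every path passed to os.rename includes the folder.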
I have the following directory structure:
DICOMs:
    Dir_1
        /sub-dir 1
        /sub-dir 2
            /file1.dcm
    Dir_2
        /sub-dir 1
        /sub-dir 2
            /file1.dcm
I have written the following code to read the first file of every sub_dir.
dire_str = '/DICOMs:/'
for dirname,dirnames,filenames in os.walk(dire_str,topdown=True):
    for subdirname in dirnames:
        print(os.path.join(dirname,subdirname))
        a = 1
        for filename in filenames:
            firstfilename = os.path.join(dirname, filename)
            dcm_info = dicom.read_file(firstfilename, force=True)
If I run this in the Python console it gives me:
dirname = dir_1
dirnames=[subdir_1, subdir_2]
filenames = .DS_Store
Because of this wrong entry in the filenames array, I am not able to get the name of the first file. Can someone tell me whether there is an error in the code or the syntax is wrong?
I have file1.dcm under subdir_2 and subdir_1, but the file shown is still .DS_Store.
What I am trying to implement is:
1) go into a Dir (say dir_1)
go inside subdir_1
look for the first .dcm file to read its header tag
if the tag is present in the first file, call a function which will execute code on this subdir (note: I want to check just the first file)
if not, go out of this subdir
in this way check every subdir
once done with one dir
repeat these steps for dir2
Many thanks!
This is probably closer to what you meant to do:
for dirpath, dirnames, filenames in os.walk(dire_str, topdown=True):
    # Do some stuff that is per directory
    for filename in filenames:
        # Do some stuff that is per file
        pass
And if you wish to only operate on files that end in .dcm, something along these lines might be good:
for dirpath, dirnames, filenames in os.walk(dire_str, topdown=True):
    # Do some stuff that is per directory
    for filename in filenames:
        if filename.endswith('.dcm'):
            # Do some stuff that is per file
            pass
To specifically address the new bit of pseudo-code you added to the question, I believe you will want something like the following:
for dirpath, dirnames, filenames in os.walk(dire_str, topdown=True):
    for filename in filenames:
        if filename.endswith('.dcm') and tag_present_in_dcm(
            os.path.join(dirpath, filename)
        ):
            execute_code_on_subdir(dirpath)
            break
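To see the "first .dcm file per directory" part in isolation, without any DICOM library, here is a minimal sketch. The function name first_file_per_dir is made up, and sorting the file names is an assumption added so that "first" is well defined, since os.walk does not promise any particular order.

```python
import os

def first_file_per_dir(top, suffix=".dcm"):
    """Map each directory under `top` to the first file in it ending with `suffix`."""
    found = {}
    for dirpath, dirnames, filenames in os.walk(top):
        # sorted() makes "first" deterministic; entries like .DS_Store are skipped
        # simply because they do not end with the suffix.
        for filename in sorted(filenames):
            if filename.endswith(suffix):
                found[dirpath] = os.path.join(dirpath, filename)
                break  # only the first match per directory is needed
    return found
```

Each value in the returned dictionary is the path you would hand to your tag check, and the corresponding key is the directory to process if the tag is present.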
I'm wanting to use os.walk to search the cwd and subdirectories to locate a specific file and when found immediately break and change to that dir. I've seen many examples where it breaks after locating the file, but I can't figure out how to retrieve the path location so I can change dir.
Something like this?
f = 'filename'
for path, dirs, files in os.walk('.'):
    if f in files:
        os.chdir(path)
        break
import os
required_file = "somefile.txt"
cwd = '.'
def get_dir_name(cwd, required_file):
    for dirName, subdirList, fileList in os.walk(cwd):
        for fname in fileList:
            if fname == required_file:
                change_to_dir = os.path.abspath(dirName)
                return change_to_dir
    return None  # the file was not found anywhere under cwd
change_to_dir = get_dir_name(cwd, required_file)
if change_to_dir is not None:
    os.chdir(change_to_dir)
I have a C++/Obj-C background and I am just discovering Python (been writing it for about an hour).
I am writing a script to recursively read the contents of text files in a folder structure.
The problem I have is the code I have written will only work for one folder deep. I can see why in the code (see #hardcoded path), I just don't know how I can move forward with Python since my experience with it is only brand new.
Python Code:
import os
import sys
rootdir = sys.argv[1]
for root, subFolders, files in os.walk(rootdir):
    for folder in subFolders:
        outfileName = rootdir + "/" + folder + "/py-outfile.txt" # hardcoded path
        folderOut = open( outfileName, 'w' )
        print "outfileName is " + outfileName
        for file in files:
            filePath = rootdir + '/' + file
            f = open( filePath, 'r' )
            toWrite = f.read()
            print "Writing '" + toWrite + "' to" + filePath
            folderOut.write( toWrite )
            f.close()
        folderOut.close()
Make sure you understand the three return values of os.walk:
for root, subdirs, files in os.walk(rootdir):
has the following meaning:
root: Current path which is "walked through"
subdirs: Files in root of type directory
files: Files in root (not in subdirs) of type other than directory
And please use os.path.join instead of concatenating with a slash! Your problem is filePath = rootdir + '/' + file - you must concatenate the currently "walked" folder instead of the topmost folder. So that must be filePath = os.path.join(root, file). BTW "file" is a builtin, so you don't normally use it as variable name.
Another problem is your loops, which should look like this, for example:
import os
import sys
walk_dir = sys.argv[1]
print('walk_dir = ' + walk_dir)
# If your current working directory may change during script execution, it's recommended to
# immediately convert program arguments to an absolute path. Then the variable root below will
# be an absolute path as well. Example:
# walk_dir = os.path.abspath(walk_dir)
print('walk_dir (absolute) = ' + os.path.abspath(walk_dir))
for root, subdirs, files in os.walk(walk_dir):
    print('--\nroot = ' + root)
    list_file_path = os.path.join(root, 'my-directory-list.txt')
    print('list_file_path = ' + list_file_path)
    with open(list_file_path, 'wb') as list_file:
        for subdir in subdirs:
            print('\t- subdirectory ' + subdir)
        for filename in files:
            file_path = os.path.join(root, filename)
            print('\t- file %s (full path: %s)' % (filename, file_path))
            with open(file_path, 'rb') as f:
                f_content = f.read()
                list_file.write(('The file %s contains:\n' % filename).encode('utf-8'))
                list_file.write(f_content)
                list_file.write(b'\n')
If you didn't know, the with statement for files is a shorthand:
with open('filename', 'rb') as f:
    dosomething()
# is effectively the same as
f = open('filename', 'rb')
try:
    dosomething()
finally:
    f.close()
If you are using Python 3.5 or above, you can get this done in 1 line.
import glob
# root_dir needs a trailing slash (i.e. /root/dir/)
for filename in glob.iglob(root_dir + '**/*.txt', recursive=True):
    print(filename)
As mentioned in the documentation
If recursive is true, the pattern '**' will match any files and zero or more directories and subdirectories.
If you want every file, you can use
import glob
for filename in glob.iglob(root_dir + '**/**', recursive=True):
    print(filename)
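If you would rather not depend on root_dir ending in a slash, the pattern can be built with os.path.join instead. The sketch below assumes Python 3.5 or later, and the helper name find_by_pattern is made up for illustration.

```python
import glob
import os

def find_by_pattern(root_dir, pattern="*.txt"):
    """Return sorted paths under `root_dir` (at any depth) whose name matches `pattern`."""
    # os.path.join inserts the separator, so root_dir needs no trailing slash.
    search = os.path.join(root_dir, "**", pattern)
    return sorted(glob.glob(search, recursive=True))
```

Because '**' also matches zero directories, files directly inside root_dir are included along with those in subdirectories.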
Agree with Dave Webb, os.walk will yield an item for each directory in the tree. Fact is, you just don't have to care about subFolders.
Code like this should work:
import os
import sys
rootdir = sys.argv[1]
for folder, subs, files in os.walk(rootdir):
    with open(os.path.join(folder, 'python-outfile.txt'), 'w') as dest:
        for filename in files:
            with open(os.path.join(folder, filename), 'r') as src:
                dest.write(src.read())
TL;DR: This is the equivalent to find -type f to go over all files in all folders below and including the current one:
for currentpath, folders, files in os.walk('.'):
    for file in files:
        print(os.path.join(currentpath, file))
As already mentioned in other answers, os.walk() is the answer, but it could be explained better. It's quite simple! Let's walk through this tree:
docs/
└── doc1.odt
pics/
todo.txt
With this code:
for currentpath, folders, files in os.walk('.'):
    print(currentpath)
The currentpath is the current folder it is looking at. This will output:
.
./docs
./pics
So it loops three times, because there are three folders: the current one, docs, and pics. In every loop, it fills the variables folders and files with all folders and files. Let's show them:
for currentpath, folders, files in os.walk('.'):
    print(currentpath, folders, files)
This shows us:
# currentpath  folders           files
.              ['pics', 'docs']  ['todo.txt']
./pics         []                []
./docs         []                ['doc1.odt']
So in the first line, we see that we are in folder ., that it contains two folders namely pics and docs, and that there is one file, namely todo.txt. You don't have to do anything to recurse into those folders, because as you see, it recurses automatically and just gives you the files in any subfolders. And any subfolders of that (though we don't have those in the example).
If you just want to loop through all files, the equivalent of find -type f, you can do this:
for currentpath, folders, files in os.walk('.'):
    for file in files:
        print(os.path.join(currentpath, file))
This outputs:
./todo.txt
./docs/doc1.odt
The pathlib library is really great for working with files. You can do a recursive glob on a Path object like so.
from pathlib import Path
for elem in Path('/path/to/my/files').rglob('*.*'):
    print(elem)
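Note that the '*.*' pattern only matches names that contain a dot, and it matches directories as well as files. If you want every regular file, a variation along these lines may be closer; the function name list_files is hypothetical.

```python
from pathlib import Path

def list_files(top):
    """Return every regular file below `top` as a sorted list of Path objects."""
    # '*' also matches names without a dot (e.g. README), which '*.*' would miss.
    return sorted(p for p in Path(top).rglob("*") if p.is_file())
```

The is_file() check drops the directories that rglob("*") yields alongside the files.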
import glob
import os
root_dir = <root_dir_here>
for filename in glob.iglob(root_dir + '**/**', recursive=True):
    if os.path.isfile(filename):
        with open(filename,'r') as file:
            print(file.read())
**/** is used to get all paths recursively, including directories.
if os.path.isfile(filename) checks whether filename is a file or a directory; if it is a file, we can read it.
Here I am printing the file's contents.
If you want a flat list of all paths under a given dir (like find . in the shell):
files = [
    os.path.join(parent, name)
    for (parent, subdirs, files) in os.walk(YOUR_DIRECTORY)
    for name in files + subdirs
]
To only include full paths to files under the base dir, leave out + subdirs.
I've found the following to be the easiest
from glob import glob
import os
files = [f for f in glob('rootdir/**', recursive=True) if os.path.isfile(f)]
Using glob('some/path/**', recursive=True) gets all files, but also includes directory names. Adding the if os.path.isfile(f) condition filters this list to existing files only
For my taste os.walk() is a little too complicated and verbose. You can do the accepted answer cleaner by:
import pathlib
all_files = [str(f) for f in pathlib.Path(dir_path).glob("**/*") if f.is_file()]
with open(outfile, 'wb') as fout:
    for f in all_files:
        with open(f, 'rb') as fin:
            fout.write(fin.read())
            fout.write(b'\n')
Use os.path.join() to construct your paths. It's neater:
import os
import sys
rootdir = sys.argv[1]
for root, subFolders, files in os.walk(rootdir):
    for folder in subFolders:
        outfileName = os.path.join(root,folder,"py-outfile.txt")
        folderOut = open( outfileName, 'w' )
        print("outfileName is " + outfileName)
        for file in files:
            filePath = os.path.join(root,file)
            toWrite = open( filePath).read()
            print("Writing '" + toWrite + "' to" + filePath)
            folderOut.write( toWrite )
        folderOut.close()
os.walk does recursive walk by default. For each dir, starting from root it yields a 3-tuple (dirpath, dirnames, filenames)
from os import walk
from os.path import splitext, join
def select_files(root, files):
    """
    simple logic here to filter out interesting files
    .py files in this example
    """
    selected_files = []
    for file in files:
        #do concatenation here to get full path
        full_path = join(root, file)
        ext = splitext(file)[1]
        if ext == ".py":
            selected_files.append(full_path)
    return selected_files
def build_recursive_dir_tree(path):
    """
    path - where to begin folder scan
    """
    selected_files = []
    for root, dirs, files in walk(path):
        selected_files += select_files(root, files)
    return selected_files
I think the problem is that you're not processing the output of os.walk correctly.
Firstly, change:
filePath = rootdir + '/' + file
to:
filePath = root + '/' + file
rootdir is your fixed starting directory; root is a directory returned by os.walk.
Secondly, you don't need to indent your file processing loop, as it makes no sense to run this for each subdirectory. You'll get root set to each subdirectory. You don't need to process the subdirectories by hand unless you want to do something with the directories themselves.
Try this:
import os
import sys
path = sys.argv[1]  # starting directory, taken from the command line
for root, subdirs, files in os.walk(path):
    for file in os.listdir(root):
        filePath = os.path.join(root, file)
        if os.path.isdir(filePath):
            pass
        else:
            with open(filePath, 'r') as f:
                pass  # Do Stuff
If you prefer an (almost) one-liner:
from pathlib import Path
lookuppath = '.' #use your path
filelist = [str(item) for item in Path(lookuppath).glob("**/*") if Path(item).is_file()]
In this case you will get a list with just the paths of all files located recursively under lookuppath.
Without str() you will get PosixPath objects (WindowsPath on Windows) instead of plain strings.
This worked for me:
import glob
root_dir = "C:\\Users\\Scott\\" # Don't forget trailing (last) slashes
for filename in glob.iglob(root_dir + '**/*.jpg', recursive=True):
    print(filename)
    # do stuff
If just the file names are not enough, it's easy to implement a Depth-first search on top of os.scandir():
import os
stack = ['.']
files = []
total_size = 0
while stack:
    dirname = stack.pop()
    with os.scandir(dirname) as it:
        for e in it:
            if e.is_dir():
                stack.append(e.path)
            else:
                size = e.stat().st_size
                files.append((e.path, size))
                total_size += size
The docs have this to say:
The scandir() function returns directory entries along with file attribute information, giving better performance for many common use cases.
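The same stack-based traversal can be wrapped in a generator if you want to reuse it. This is only a sketch: the name iter_files_with_size is made up, and passing follow_symlinks=False is an added assumption so that symlinked directories are reported as entries rather than descended into.

```python
import os

def iter_files_with_size(top):
    """Yield (path, size_in_bytes) for every non-directory entry below `top`."""
    stack = [top]
    while stack:
        dirname = stack.pop()
        with os.scandir(dirname) as it:
            for entry in it:
                # Real directories go on the stack; everything else is reported.
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                else:
                    yield entry.path, entry.stat(follow_symlinks=False).st_size
```

For example, sum(size for _, size in iter_files_with_size('.')) gives the total size without building the intermediate list.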