Searching for a File in all Subdirectories in Python

I seem to be stuck on some logic in my code and would appreciate any insight. My objective is to find an Excel file that may be in either of two subfolders. The user enters an ID number in the terminal (which is the name of the root folder), and the script builds a file path from that ID. I'm not sure why my if statement isn't detecting the file in either folder.
If anyone can look at my code, it would be greatly appreciated.
#ask user to input the ID
ID = input("Please enter folder ID: ")
#path of excel directory and use glob
path = "/Users/one/Downloads/" + str(ID) + "/"
for (dir,subdirs,files) in os.walk(path):
    if "Filey1_*.xlsx" in files:
        print("File Found:", os.path.join(dir, "Filey1_*.xlsx"))

To answer your question directly: your if statement fails because the in operator does not do glob or regex matching. Applied to a list, it only tests whether some element is exactly equal to the string, so the asterisk (*) is treated as a literal character rather than a wildcard.
The result is that you're searching for a file named exactly "Filey1_*.xlsx" rather than a file that matches the glob pattern (* being a wildcard), which is presumably what you want.
What you could do is add this import at the top:
from pathlib import Path
and then replace your for loop with:
matches = list(Path(path).rglob("Filey1_*.xlsx"))
if matches:
    print("File Found:", matches[0])
The first line does a recursive glob search through all subfolders of path. rglob already returns the full path of each match, so os.walk and os.path.join are no longer needed; if a file was found, the list is non-empty.

So the issue is with your if statement, as it searches for an exact "Filey1_*.xlsx" match in the file names.
You can try something like this:
for (root, subdirs, files) in os.walk(path):
    for f in files:
        # test the start and end of the name rather than "in", which would match anywhere
        if f.startswith("Filey1_") and f.endswith(".xlsx"):
            print("File Found:", os.path.join(root, f))

I found a really simple solution to my own problem, so I'll share it with everyone! (It needs import glob at the top.)
files = glob.glob(path + "/**/Filey1_*.xlsx", recursive=True)
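For anyone who wants the recursive search wrapped up for reuse, here is a minimal sketch using pathlib. The function name find_matching_files and the default pattern are only illustrative (the pattern is the one from the question), and the choice to return an empty list for a missing folder is an assumption, not something the answers above prescribe.

```python
from pathlib import Path


def find_matching_files(root, pattern="Filey1_*.xlsx"):
    """Return every file under root (searched recursively) whose name matches pattern."""
    root = Path(root)
    # Assumption: a missing or invalid root folder simply yields no results.
    if not root.is_dir():
        return []
    # rglob() walks all subfolders; keep regular files only, in a stable order.
    return sorted(p for p in root.rglob(pattern) if p.is_file())
```

It could be called as find_matching_files("/Users/one/Downloads/" + ID), and an empty result means no matching file exists in any subfolder.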

Related

Segregate files based on filename

I've got a directory containing multiple images, and I need to separate them into two folders based on a portion of the file name. Here's a sample of the file names:
22DEC167603520981127600_03.jpg
13NOV162302999230157801_07.jpg
08JAN147603811108236510_02.jpg
21OCT152302197661710099_07.jpg
07MAR172302551529900521_01.jpg
19FEB173211074174309177_09.jpg
19FEB173211881209232440_02.jpg
19FEB172302491000265198_04.jpg
I need to move the files into two folders according to the four digits that follow the date (characters 8 to 11 of the name, e.g. 7603 in the first file): files containing 2302 and 3211 would go into an existing folder named "panchromatic", and files with 7603 would go into another folder named "sepia".
I've tried multiple examples from other questions, and none seem to fit this problem. I'm very new to Python, so I'm not sure what example to post. Any help would be greatly appreciated.
You can do this the easy way or the hard way.
Easy way
Test if your filename contains the substring you're looking for.
import os
import shutil
files = os.listdir('.')
for f in files:
    # skip non-jpeg files
    if not f.endswith('.jpg'):
        continue
    # move if panchromatic
    if '2302' in f or '3211' in f:
        shutil.move(f, os.path.join('panchromatic', f))
    # move if sepia
    elif '7603' in f:
        shutil.move(f, os.path.join('sepia', f))
    # notify if something else
    else:
        print('Could not categorize file with name %s' % f)
This solution in its current form is susceptible to mis-classification, as the number we're looking for can appear by chance later in the string. I'll leave you to find ways to mitigate this.
Hard way
Regular expressions. Match the four digits after the date with a regular expression. Left for you to explore!
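One possible take on the regular-expression route is sketched below. It assumes every name starts with a DDMMMYY date such as 22DEC16, which is what the sample file names suggest; the names CODE_PATTERN, DESTINATIONS and destination_for are made up for this example. Because the pattern is anchored at the start of the name, a code that happens to appear later in the string is ignored.

```python
import re

# Assumed layout: DDMMMYY date (e.g. 22DEC16) followed by the four-digit code.
CODE_PATTERN = re.compile(r"^\d{2}[A-Z]{3}\d{2}(\d{4})")

# Mapping taken from the question; the folders are assumed to exist already.
DESTINATIONS = {"2302": "panchromatic", "3211": "panchromatic", "7603": "sepia"}


def destination_for(filename):
    """Return the target folder name for filename, or None if it cannot be classified."""
    match = CODE_PATTERN.match(filename)
    if match is None:
        return None
    return DESTINATIONS.get(match.group(1))
```

The result can then be fed to shutil.move in the same loop as the easy way, skipping any file for which None comes back.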
Self-explanatory, with Python 3, or Python 2 plus the pathlib backport:
import pathlib
import shutil
# Directory paths. Tailor this to your files layout
# see https://docs.python.org/3/library/pathlib.html#module-pathlib
source_dir = pathlib.Path('.')
sepia_dir = source_dir / 'sepia'
panchro_dir = source_dir / 'panchromatic'
assert sepia_dir.is_dir()
assert panchro_dir.is_dir()
destinations = {
    ('2302', '3211'): panchro_dir,
    ('7603',): sepia_dir
}
for filepath in source_dir.glob('*.jpg'):
    # slice the file name (not the full path) to get the four-digit code
    marker = filepath.name[7:11]
    for key, value in destinations.items():
        if marker in key:
            # glob() already yields paths that include source_dir
            shutil.move(str(filepath), str(value))

Python: get a complete file name based on a partial file name

In a directory, there are two files that share most of their names:
my_file_0_1.txt
my_file_word_0_1.txt
I would like to open my_file_0_1.txt
I need to avoid specifying the exact filename, and instead need to search the directory for a filename that matches the partial string my_file_0.
Based on two other answers, I tried the following:
import numpy as np
import os, fnmatch, glob
def find(pattern, path):
    result = []
    for root, dirs, files in os.walk(path):
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                result.append(os.path.join(root, name))
    return result
if __name__=='__main__':
    #filename=find('my_file_0*.txt', '/path/to/file')
    #print(filename)
    print(glob.glob('my_file_0' + '*' + '.txt'))
Neither of these would print the actual filename, for me to read in later using np.loadtxt.
How can I find and store the name of a file, based on the result of a string match?
glob.glob() needs a path to be effective: a relative pattern is resolved against the current working directory, so if you run the script from another directory, it will not find what you expect.
(You can check the current directory with os.getcwd().)
It should work with the line below:
print(glob.glob('path/to/search/my_file_0' + '*.txt'))
or
print(glob.glob(r'C:\path\to\search\my_file_0' + '*.txt'))  # for Windows
A solution using os.listdir()
Could you not also use the os module to search through os.listdir()? For instance:
import os
partialFileName = "my_file_0"
for f in os.listdir():
    # compare with == (or startswith); a single = is an assignment and a syntax error here
    if f.startswith(partialFileName):
        print(f)
I just developed the approach below, went looking for a better way, and came across your question. I needed pretty much the same thing and came up with this one-liner, which uses a list comprehension and assumes that exactly one file name matches the criteria (the [0] raises IndexError if nothing matches). I modified my code to match your question.
import os
file_name = [n for n in os.listdir("C:/Your/Path") if 'my_file_0' in n][0]
print(file_name)
Now, if this is in a looping / repeated-call situation, you can modify it as below:
for i in range(1, 4):
    file = [n for n in os.listdir("C:/Your/Path") if f'my_file_{i}' in n][0]
    print(file)
or, probably more practically ...
def get_file_name_with_number(num):
    file = [n for n in os.listdir("C:/Your/Path") if f'my_file_{num}' in n][0]
    return file
print(get_file_name_with_number(0))
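If silently taking the first match feels risky, a stricter variant might look like the sketch below. The helper name find_single_match and its error behaviour are assumptions made for illustration; it uses fnmatch.filter so that a pattern such as my_file_0*.txt matches my_file_0_1.txt but not my_file_word_0_1.txt.

```python
import fnmatch
import os


def find_single_match(directory, pattern):
    """Return the full path of the only entry in directory matching pattern.

    Raises FileNotFoundError if nothing matches and ValueError if several do.
    """
    matches = sorted(fnmatch.filter(os.listdir(directory), pattern))
    if not matches:
        raise FileNotFoundError(f"no file matching {pattern!r} in {directory!r}")
    if len(matches) > 1:
        raise ValueError(f"pattern {pattern!r} is ambiguous: {matches}")
    return os.path.join(directory, matches[0])
```

The returned path can be passed straight to np.loadtxt, since it already includes the directory.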

Renaming multiple files at once with Python

I am new to programming. I usually learn for a while, then take a long break and forget most of what I learned. Never mind that background info.
I tried to create a function that would rename the files inside a folder, adding an incrementing number at the end of the new name (e.g. blueberry1, blueberry2, ...).
import os
def rename_files(loc,new_name):
    file_list= os.listdir(loc)
    for file_name in file_list:
        count=1
        if count <= len(file_list):
            composite_name = new_name+str(count)
            os.rename(file_name, composite_name)
            count+= 1
Well apparently this code doesn't work. Any idea how to fix it?
You need to join the file name to the path, for both the source and the destination:
os.rename(os.path.join(loc, file_name), os.path.join(loc, composite_name))
You can also use enumerate for the count, which avoids resetting count to 1 on every iteration as your loop does:
import os
def rename_files(loc,new_name):
    file_list= os.listdir(loc)
    for ind, file_name in enumerate(file_list,1):
        composite_name = new_name+str(ind)
        os.rename(os.path.join(loc, file_name), os.path.join(loc, composite_name))
listdir returns only the file names, not their paths, so Python has no way of knowing where the original file is unless your current working directory happens to be that same directory.
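As a variation on the same idea, here is a hedged sketch using pathlib. Keeping each file's extension, skipping subdirectories, sorting for a predictable numbering, and refusing to overwrite an existing file are all choices made for this example rather than requirements from the question.

```python
from pathlib import Path


def rename_files(loc, new_name):
    """Rename every regular file in loc to new_name1, new_name2, ... keeping extensions."""
    folder = Path(loc)
    # Sort so the numbering is deterministic; skip subdirectories.
    files = sorted(p for p in folder.iterdir() if p.is_file())
    renamed = []
    for index, path in enumerate(files, start=1):
        target = path.with_name(f"{new_name}{index}{path.suffix}")
        # Assumption: stop rather than overwrite a file that is already there.
        if target.exists():
            raise FileExistsError(f"{target} already exists")
        path.rename(target)
        renamed.append(target)
    return renamed
```

Note that if the folder already contains names like blueberry2.txt, this sketch stops partway through, so it is best tried on a copy of the folder first.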

Using Python to add incrementing integers to files with the same name in a directory

I have thousands of files in a directory all named "Elev_Contour" and I need to figure out how to write a Python script that will add incrementing integers to these files (ex: Elev_Contour1, Elev_Contour2, Elev_Contour3, etc) for all of the files. These files are within subdirectories within the main directory (path: C:\DEM Files\State_Folder\State_location.gdb\featuredataset\Elev_Contour). I need unique names for each file so that I can add them into the table of contents in an ArcMap document (using an Arcpy script that I have already written).
Any help with this will be greatly appreciated! Thanks!
You should be able to do most of this using the os library. I would take a look at the os module, and in particular os.path and its related functions.
I think you're going to need to use the os.walk() method to walk the directory tree.
E.g.
import os
num = 0
for root, dirs, files in os.walk("C:\\DEM Files\\State_Folder\\State_location.gdb\\featuredataset\\Elev_Contour"):
    for filename in files:
        num = num + 1
        old_name = os.path.realpath(os.path.join(root, filename))
        print(old_name)
        new_name = os.path.realpath(os.path.join(root, "Elev_Contour" + str(num)))
        print(new_name)
        os.rename(old_name, new_name)
(Note the double backslashes to escape the backslashes - yay windows)
You could simply traverse the directory tree and rename matching files one by one:
import os
counter = 0
for root, dirs, files in os.walk('.'):
    if "Elev_Contour" in files:
        # increment first so the numbering starts at 1, as in the question
        counter += 1
        fname = os.path.join(root, "Elev_Contour")
        os.rename(fname, "{}{}".format(fname, counter))
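The same traversal can be wrapped in a function that reports what it renamed, which makes it easier to check the result before loading anything into ArcMap. This is only a sketch: the name number_duplicates is invented here, and it assumes the items are ordinary files on disk whose name is exactly "Elev_Contour".

```python
import os


def number_duplicates(top, name="Elev_Contour"):
    """Append 1, 2, 3, ... to every file called name found under top."""
    renamed = []
    counter = 0
    for root, dirs, files in os.walk(top):
        # Sorting dirs in place makes the walk order, and so the numbering, deterministic.
        dirs.sort()
        if name in files:
            counter += 1
            old_path = os.path.join(root, name)
            new_path = os.path.join(root, f"{name}{counter}")
            os.rename(old_path, new_path)
            renamed.append(new_path)
    return renamed
```

The returned list holds the new paths in the order they were numbered.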

Generalize a program using iglob

I have a program which scans a folder or path for a given type of file and then analyzes them.
import glob
path = raw_input("ENTER PATH TO SEARCH: ")
list_of_file = glob.iglob('*.txt')
for filename in list_of_file:
    print(filename)
But this script only scans the directory it is run from; the value entered for path is never used.
Now if I write:
list_of_file = glob.iglob(path + '*.txt')
this does not work either.
So please suggest a way to make the program follow whatever path I enter and search it for particular file types, no matter where I keep my script.
I assume English is not your first language, so please forgive me if I have misunderstood your question. It seems to me that you want to be able to enter a directory and then find all the ".txt" files in that directory?
If so, then the following code should suffice as an example (it looks for ".py" files; change the pattern to "*.txt" for your case):
"""Find all .py files in a given directory."""
import os
import glob
path = raw_input("Enter path: ")
fullglob = os.path.join(path, "*.py")
for fn in glob.iglob(fullglob):
    print(os.path.split(fn)[1])
os.path.join() builds the pattern from the directory you entered, inserting a path separator if needed, so iglob() searches the right directory. The last line strips that directory off again so that only the file name is printed. You can use the help() function to get documentation on os.path.join() and os.path.split().
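For Python 3, where raw_input() became input(), the same idea could be packaged as in the sketch below. The function name list_files and the default "*.txt" pattern are assumptions for illustration; glob.escape() is used so that a directory name containing characters like [ or * is taken literally.

```python
import glob
import os


def list_files(directory, pattern="*.txt"):
    """Return the names of the files in directory that match pattern."""
    # Escape the directory part so only the pattern is treated as a wildcard.
    full_pattern = os.path.join(glob.escape(directory), pattern)
    return sorted(os.path.basename(p) for p in glob.iglob(full_pattern))
```

A script would then call something like list_files(input("Enter path: ")) and loop over the result, wherever the script itself is stored.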
