Filtering files using os.walk() - python

I'm looking to modify the program to print the contents of any file called log.txt in a given year's subdirectory, ignoring any other file.
import os
year = input('Enter year: ')
path = os.path.join('logs', year)
print()
for dirname, subdirs, files in os.walk(path):
    print(dirname, 'contains subdirectories:', subdirs, end=' ')
    print('and the files:', files)

Here's what you need to do. I won't provide the complete code:
Iterate over files and check for 'log.txt'.
Get the path to the file. Hint: os.path.join.
Open the file and read it.
Print out the content.
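The four steps above can be sketched as follows. This is one possible shape, not the answerer's code: the function name print_logs and its target parameter are made up for illustration, and the sketch assumes the log files are plain text in the default encoding.

```python
import os


def print_logs(path, target="log.txt"):
    """Print the contents of every file named `target` under `path`.

    Returns the list of matching file paths, in the order os.walk finds them.
    """
    found = []
    for dirname, subdirs, files in os.walk(path):
        # Step 1: check whether this directory contains the wanted file.
        if target in files:
            # Step 2: build the full path to the file.
            full_path = os.path.join(dirname, target)
            # Steps 3 and 4: open the file, read it, and print the content.
            with open(full_path) as f:
                print(f.read())
            found.append(full_path)
    return found
```

With the question's variables, the call would be print_logs(os.path.join('logs', year)).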

Related

how to avoid searching a folder

How do I avoid searching a folder? This script goes through every folder and searches it for your file. How do I skip Applications, or search only the folders I tell it to? I've been trying for at least 3 hours.
import getpass
import os
from PIL import Image
user_path = "/Users/" + getpass.getuser()
FileName = input("file name, please, including the extension: ")
print("working?")
for folder, sub_folder, files in os.walk(user_path):
    print(f"folder is {folder}")
    for sub_fold in sub_folder:
        print(f"sub folder is {sub_fold}")
    for f in files:
        print(f"file: {f}")
        if FileName == f:
            print("file found")
            # join the current folder with the matching file name
            print(os.path.abspath(os.path.join(folder, f)))
Create a list of the folders you want to exclude.
When the loop reaches a folder, check whether each subfolder's name is in that list. If it is, skip it.
Here is some sample code. Please try it and let me know whether it works for you.
import os
ext = ["a", "b", "c"]  # names of the folders to skip (placeholders)
# user_path and FileName are defined as in the question
for folder, sub_folder, files in os.walk(user_path):
    print(f"folder is {folder}")
    # remove excluded names in place so os.walk does not descend into them
    sub_folder[:] = [name for name in sub_folder if name not in ext]
    for sub_fold in sub_folder:
        print(f"sub folder is {sub_fold}")
    for f in files:
        print(f"file: {f}")
        if FileName == f:
            print("file found")
            print(os.path.abspath(os.path.join(folder, f)))
os.walk walks the entire directory tree, presenting the current directory, its immediate subfolders and its immediate files on each iteration. As long as you are walking top-down (the default) you can stop a subfolder from being iterated by removing it from the folders list. In this example, I made the blacklist a canned list in the source, but you could prompt for it if you'd like. On each folder iteration all you need to do is see if the wanted filename is in the list of file names in that iteration.
from PIL import Image
import getpass
import os
# blacklist folders paths relative to user_path
blacklist = ["Applications", "Downloads"]
# get user root and fix blacklist
# user_path = ("/Users/" + getpass.getuser())
user_path = os.path.expanduser("~")
blacklist = [os.path.join(user_path, name) for name in blacklist]
FileName = input("file name, please, including the extension: ")
print("working?")
for folder, sub_folders, files in os.walk(user_path):
    # eliminate top level folders and their subfolders with inplace
    # remove of subfolders
    if folder in blacklist:
        del sub_folders[:]
        continue
    # in-place remove of blacklisted folders below top level
    for sub_folder in sub_folders[:]:
        if os.path.join(folder, sub_folder) in blacklist:
            sub_folders.remove(sub_folder)
    if FileName in files:
        print("file found")
        print(os.path.abspath(os.path.join(folder, FileName)))
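The in-place pruning described in this answer can also be written as a single slice assignment. The sketch below is a simplified variant, not the answerer's code: find_file and skip_names are hypothetical names, and it skips folders by bare name at any depth rather than by full path under the home directory.

```python
import os


def find_file(top, filename, skip_names=()):
    """Return the paths of files named `filename` under `top`.

    Directories whose name is in `skip_names` are never entered.
    """
    matches = []
    for folder, sub_folders, files in os.walk(top):  # top-down is the default
        # Slice assignment edits the very list os.walk will recurse into.
        # Rebinding the name (sub_folders = [...]) would not affect the walk.
        sub_folders[:] = [d for d in sub_folders if d not in skip_names]
        if filename in files:
            matches.append(os.path.join(folder, filename))
    return matches
```

Pruning only works in top-down mode; with topdown=False the subfolders have already been visited by the time their parent is reported.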

Python - filename.replace does not change the filename

I have a directory with a series of txt files and I want to replace part of each filename. Here is an example of the filenames in the directory:
2017 Q2 txt WdCt.txt
2017 Q3 txt WdCt.txt
I want to replace txt WdCt with WdFreq in each file name. Here is the code I wrote to do this:
import os.path
sourcedir = 'C:/Users/Public/EnvDef/Proj/1ErnCls/IOUErnCls/Wd Ct by Qrtr/All Word Count'
os.chdir(sourcedir)
cwd = os.getcwd()
print(' 2 Working Directory is %s' % cwd)
print(' ')
for dirPath, subdirNames, fileList in os.walk(cwd):
    for filename in fileList:
        print("Old File Name")
        print(filename)
        filename = filename.replace('txt WdCt', 'WdFreq')
        print("New File Name")
        print(filename)
The following is an example of the output. The script does walk through the directory, and the output shows the file names changed as desired. However, the file names in the directory are NOT changed. I have searched online and found many examples that resemble what I am trying to do, but I cannot determine why my code does not change the names of the files. Any help or suggestions will be appreciated.
Working Directory: C:\Users\Public\EnvDef\Proj\1ErnCls\IOUErnCls\Wd Ct by Qrtr\All Word Count
Old File Name:
2017 Q2 txt WdCt.txt
New File Name:
2017 Q2 WdFreq.txt
The problem is that calling filename.replace() only changes the Python string that holds the file name; nothing on disk is touched. You then need to rename the file itself using os.rename or shutil.move.
import os
sourcedir = 'C:/Users/Public/EnvDef/Proj/1ErnCls/IOUErnCls/Wd Ct by Qrtr/All Word Count'
os.chdir(sourcedir)
cwd = os.getcwd()
print(' 2 Working Directory is %s' % cwd)
print(' ')
for dirPath, subdirNames, fileList in os.walk(cwd):
    for filename in fileList:
        # str.replace is case-sensitive: the pattern must match 'txt WdCt' exactly
        os.rename(os.path.join(dirPath, filename), os.path.join(dirPath, filename.replace('txt WdCt', 'WdFreq')))
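To make the distinction between the string and the file concrete, here is a small sketch that wraps the same idea in a function. The name rename_in_tree and its parameters are invented for this example. It skips files whose names do not contain the pattern, and because str.replace is case-sensitive, the pattern has to match the capitalization in the file name exactly.

```python
import os


def rename_in_tree(top, old, new):
    """Rename every file under `top` whose name contains `old`.

    Returns a list of (old_path, new_path) pairs for the files renamed.
    """
    renamed = []
    for dir_path, subdir_names, file_list in os.walk(top):
        for filename in file_list:
            # str.replace returns a new string; the file on disk is untouched.
            new_name = filename.replace(old, new)
            if new_name == filename:
                continue  # pattern not present in this name
            src = os.path.join(dir_path, filename)
            dst = os.path.join(dir_path, new_name)
            os.rename(src, dst)  # this call is what changes the file system
            renamed.append((src, dst))
    return renamed
```

Returning the renamed pairs makes it easy to print or log what changed, for example rename_in_tree(sourcedir, 'txt WdCt', 'WdFreq').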
I second @mechanical_meat's suggestion: you have to use os.rename(source file, destination file). Also, if you are only interested in the filenames in one specific directory, I would suggest taking a look at os.listdir(directory path), since os.walk() iterates through all files and subdirectories under the specified directory.
A modified version of your code using os.listdir():
import os
sourcedir = '<your directory path>'
os.chdir(sourcedir)
path = os.getcwd()
filenames = os.listdir(path)
for filename in filenames:
    os.rename(filename, filename.replace("txt WdCt.txt", "WdFreq.txt"))
List of changes:
Replaced the old %-style string formatting, the Python 2.6-era approach, with f-strings (the f"{name=}" form used below needs Python 3.8+).
Replaced os.path with the great pathlib.
Changed variable names to follow proper style conventions.
Removed the superfluous (at best) changing of the working directory.
Code:
import pathlib
source_dir = pathlib.Path("/Users/alexandercecile/Documents/Projects/AdHoc/resources/temp")
for curr_path in source_dir.iterdir():
    print(f"{curr_path=}")
    new_path = curr_path.rename(source_dir.joinpath(curr_path.name.replace("txt WdCt", "WdFreq")))
    print(f"{new_path=}\n")
Output:
curr_path=PosixPath('/Users/alexandercecile/Documents/Projects/AdHoc/resources/temp/2017 Q2 txt WdCt.txt')
new_path=PosixPath('/Users/alexandercecile/Documents/Projects/AdHoc/resources/temp/2017 Q2 WdFreq.txt')
curr_path=PosixPath('/Users/alexandercecile/Documents/Projects/AdHoc/resources/temp/2017 Q3 txt WdCt.txt')
new_path=PosixPath('/Users/alexandercecile/Documents/Projects/AdHoc/resources/temp/2017 Q3 WdFreq.txt')

Search through multiple files/dirs for string, then print content of text file

I'm trying to make a small script that will allow me to search through text files located in a specific directory and folders nested inside that one. I've managed to get it to list all files in that path, but can't seem to get it to search for a specific string in those files and then print the full text file.
Code:
import os
from os import listdir
from os.path import isfile, join
path = "<PATH>"
for root, dirs, files in os.walk(path):
    for file in files:
        if file.endswith('.txt'):
            dfiles = str(file)
sTerm = input("Search: ")
for files in os.walk(path):
    for file in files:
        with open(dfiles) as f:
            if sTerm in f.read():
                print(f.read())
First part was from a test I did to list all the files, once that worked I tried using the second part to search through all of them for a matching string and then print the full file if it finds one. There's probably an easier way for me to do this.
Here is a solution that needs Python 3.5+ (pathlib arrived in 3.4, but Path.read_text was added in 3.5):
from pathlib import Path
path = Path('/some/dir')
search_string = 'string'
for o in path.rglob('*.txt'):
    if o.is_file():
        text = o.read_text()
        if search_string in text:
            print(o)
            print(text)
The code above will look for all *.txt in path and its sub-directories, read the content of each file in text, search for search_string in text and, if it matches, print the file name and its contents.
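For comparison, the same search can be done with os.walk, which is what the question started from. This is only a sketch: search_text_files and its parameters are hypothetical names, and errors="replace" is an assumption made here so that a file with unexpected bytes does not abort the whole search. It opens each file by its full path and reads it once, keeping the text in a variable, because a second read() on the same file object returns an empty string.

```python
import os


def search_text_files(top, term, suffix=".txt"):
    """Return {path: text} for every `suffix` file under `top` containing `term`."""
    hits = {}
    for root, dirs, files in os.walk(top):
        for name in files:
            if not name.endswith(suffix):
                continue
            # Open the full path; the bare name only works in the current directory.
            full_path = os.path.join(root, name)
            with open(full_path, errors="replace") as f:
                text = f.read()  # read once and keep the result
            if term in text:
                hits[full_path] = text
    return hits
```

Printing the matches is then a loop over hits.items().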

How to access more than one file in a directory without knowing the name of each file in python

I want to access around 5000 files and work on them one by one. Is there any way to access each in succession without hard-coding the name of each file?
The following example is from this tutorial.
import os
# Directory to list
path = "/var/www/html/"
dirs = os.listdir(path)
# This prints all the files and directories
for file in dirs:
    print(file)
And if your directory contains subdirectories as well as files, you could use os.walk(), like this:
# Example taken from os.walk documentation
import os
from os.path import join, getsize
for root, dirs, files in os.walk('python/Lib/email'):
    print(root, "consumes", end=" ")
    print(sum(getsize(join(root, name)) for name in files), end=" ")
    print("bytes in", len(files), "non-directory files")
    if 'CVS' in dirs:
        dirs.remove('CVS')  # don't visit CVS directories
Or you could use scandir -
for entry in os.scandir(path):
...
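A slightly fuller scandir sketch, assuming only the regular files directly inside one directory are wanted (list_files is a made-up helper name, and the with form needs Python 3.6+):

```python
import os


def list_files(path):
    """Return the sorted names of the regular files directly inside `path`."""
    with os.scandir(path) as entries:
        # Each entry is an os.DirEntry. is_file() can often answer from data
        # gathered during the directory scan, which helps with thousands of files.
        return sorted(entry.name for entry in entries if entry.is_file())
```

Each name can then be joined with path and processed in turn, with no file name hard-coded.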

I don't think I understand Os.walk fully

I've been trying to build a program for work that deletes the unneeded files our software generates when we export stills.
I was quite happy with how it's working. You just drop a folder that you want and it will delete all the files in that folder. But my boss saw me using it and asked if he could just drop the top directory folder in and it would go into each folder and delete the DRX to save him time of doing it manually for each folder.
This is my current program -
#!/usr/bin/env python3
import os
import sys
import site
import threading
import time
from os import path
from os import listdir
from os.path import isfile, join
while True:
    backboot = 'n'
    while backboot == 'n':
        print("")
        file = (input("Please drag and drop the folder containing DRX files you wish to delete : "))
        path = file[:-1]
        os.chdir(path)
        drx = [x for x in os.listdir() if x.endswith(".drx")]
        amount = (str(len(drx)))
        print("")
        print("")
        print("")
        print('I have found ' + amount + ' files with the .drx extension and these will now be deleted')
        print("")
        print("")
        print(*drx, sep='\n')
        print("")
        print("")
        print("")
        exts = ('.drx')
        for item in drx:
            if item.endswith(".drx"):
                os.remove(item)
        print('Deleted ' + amount + ' files.')
        print('')
As I understand it, os.walk generates the folders in a given directory tree by going down or up the tree. So far, I have the user's input for a path location -
file = (input("Please drag and drop the folder containing DRX files you wish to delete : "))
path = file[:-1]
os.chdir(path)
I then scan that directory for DRX files
drx = [x for x in os.listdir() if x.endswith(".drx")]
and turn that into a string as well in order to tell the user how many files I found.
amount = (str(len(drx)))
So, I'm guessing I would need to use os.walk before or during the DRX scan? Would this be better done with a function? I'm just trying to wrap my head around os.walk, so any help would be amazing. :)
I guess I'm quite stuck on how to get os.walk to read my path variable.
for root, dirs, items in os.walk(path):
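root is the path of the directory currently being visited; it starts with the path you passed in, so it is absolute only if that path was. dirs and items are lists of the names (not full paths) of the subdirectories and files directly inside root, so join each name with root to get a usable path.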
for root, dirs, items in os.walk(path):
    for file in filter(lambda x: x.endswith(".drx"), items):
        file_path = os.path.join(root, file)
        # do what you like to do
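Putting the loop above together with the deletion step might look like the sketch below. The function name delete_drx and the dry_run flag are additions made for this example, not part of the original program. The flag defaults to reporting only, which seems a reasonable safety net when a whole top-level folder is dropped in. Because every path is built from root, no os.chdir call is needed.

```python
import os


def delete_drx(top, dry_run=True):
    """Find every .drx file under `top`; delete them unless dry_run is True.

    Returns the list of matching paths.
    """
    targets = []
    for root, dirs, items in os.walk(top):
        for name in items:
            if name.endswith(".drx"):
                # name is relative to root, so join them to get a usable path
                targets.append(os.path.join(root, name))
    if not dry_run:
        for file_path in targets:
            os.remove(file_path)
    return targets
```

The count for the user message is then len(delete_drx(path)), and passing dry_run=False performs the actual deletion.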
