I don't think I understand os.walk fully - python

I've been trying to build a program for work that deletes unneeded files generated by a piece of software when we export stills.
I was quite happy with how it's working. You just drop in a folder and it deletes all the DRX files in that folder. But my boss saw me using it and asked if he could drop in the top-level directory instead, so that it would go into each subfolder and delete the DRX files, saving him from doing it manually for each folder.
This is my current program -
#!/usr/bin/env python3
import os
import sys
import site
import threading
import time
from os import path
from os import listdir
from os.path import isfile, join
while True:
    backboot = 'n'
    while backboot == 'n':
        print ("")
        file = (input("Please drag and drop the folder containing DRX files you wish to delete : "))
        path = file[:-1]
        os.chdir(path)
        drx = [x for x in os.listdir() if x.endswith(".drx")]
        amount = (str(len(drx)))
        print("")
        print("")
        print("")
        print ('I have found ' + amount + ' files with the .drx extension and these will now be deleted')
        print("")
        print("")
        print(*drx,sep='\n')
        print("")
        print("")
        print("")
        exts = ('.drx')
        for item in drx:
            if item.endswith(".drx"):
                os.remove(item)
        print ('Deleted ' + amount + ' files.')
        print('')
What I understand about os.walk is that it generates the folder tree under a given directory by walking up or down the tree. So far, I have the user's input for a path location -
file = (input("Please drag and drop the folder containing DRX files you wish to delete : "))
path = file[:-1]
os.chdir(path)
I then scan that directory for DRX files
drx = [x for x in os.listdir() if x.endswith(".drx")]
and turn that into a string as well in order to tell the user how many files I found.
amount = (str(len(drx)))
So, I'm guessing I would need to implement os.walk before or during the DRX scan? Would this be better done with a function? I'm just trying to wrap my head around os.walk, so any help would be amazing. :)
I guess I'm quite stuck on how to get os.walk to read my path variable.
for root, dirs, items in os.walk(path):

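root is the path of the directory currently being visited (it starts at your path input, and is absolute only if that input was absolute). dirs and items are lists containing the names, not full paths, of the subdirectories and files directly inside root.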
for root, dirs, items in os.walk(path):
    for file in filter(lambda x: x.endswith(".drx"), items):
        file_path = os.path.join(root, file)
        # do what you like to do
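To tie this back to the question, here is a minimal sketch of a recursive delete built on the loop above. The function name delete_by_extension, the dry_run flag and the case-insensitive match are assumptions added for illustration; they are not part of the original program.

```python
import os


def delete_by_extension(top, extension=".drx", dry_run=True):
    """Walk `top` recursively and collect files ending in `extension`.

    Returns the list of matching paths. With dry_run=True (the default)
    nothing is deleted, so the matches can be reviewed first.
    """
    matches = []
    for root, dirs, files in os.walk(top):
        for name in files:
            # Assumption: match case-insensitively, so ".DRX" also counts.
            if name.lower().endswith(extension):
                file_path = os.path.join(root, name)
                matches.append(file_path)
                if not dry_run:
                    os.remove(file_path)
    return matches
```

Because every path is built with os.path.join(root, name), there is no need for os.chdir. One way to use it would be to call it once with dry_run=True to print the list and count, then again with dry_run=False once the user confirms.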

Related

how to avoid searching a folder

How do I avoid searching a folder? This script goes through every folder and searches it for your file. How do I avoid searching Applications, or search only the folders I tell it to? I've been trying for at least 3 hours.
import getpass
import os
from PIL import Image
user_path = ("/Users/" + getpass.getuser())
FileName = input("file name, please, including the extension: ")
print("working?")
for folder, sub_folder, files in os.walk(user_path):
    print(f"folder is {folder}")
    for sub_fold in sub_folder:
        print(f"sub folder is {sub_fold}")
    for f in files:
        print(f"file: {f}")
        if FileName == f:
            print("file found")
            print(os.path.abspath(os.path.join(folder, f)))
Create a list of the folder names to exclude.
When the loop reaches a folder, check whether each of its subfolder names is in that list. If it is, skip it.
I made some sample code. Please check it and let me know.
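import os
user_path = os.path.expanduser("~")
FileName = input("file name, please, including the extension: ")
ext = ["a", "b", "c"] # I assume these are unnecessary folders.
for folder, sub_folder, files in os.walk(user_path):
    print(f"folder is {folder}")
    # Remove excluded names in place so os.walk does not descend into them.
    sub_folder[:] = [d for d in sub_folder if d not in ext]
    for sub_fold in sub_folder:
        print(f"sub folder is {sub_fold}")
    for f in files:
        print(f"file: {f}")
        if FileName == f:
            print("file found")
            print(os.path.abspath(os.path.join(folder, f)))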
os.walk walks the entire directory tree, presenting the current directory, its immediate subfolders and its immediate files on each iteration. As long as you are walking top-down (the default) you can stop a subfolder from being iterated by removing it from the folders list. In this example, I made the blacklist a canned list in the source, but you could prompt for it if you'd like. On each folder iteration all you need to do is see if the wanted filename is in the list of file names in that iteration.
from PIL import Image
import getpass
import os
# blacklist folder paths relative to user_path
blacklist = ["Applications", "Downloads"]
# get user root and fix blacklist
# user_path = ("/Users/" + getpass.getuser())
user_path = os.path.expanduser("~")
blacklist = [os.path.join(user_path, name) for name in blacklist]
FileName = input("file name, please, including the extension: ")
print("working?")
for folder, sub_folders, files in os.walk(user_path):
    # eliminate top level folders and their subfolders with in-place
    # removal of subfolders
    if folder in blacklist:
        del sub_folders[:]
        continue
    # in-place removal of blacklisted folders below top level
    for sub_folder in sub_folders[:]:
        if os.path.join(folder, sub_folder) in blacklist:
            sub_folders.remove(sub_folder)
    if FileName in files:
        print("file found")
        print(os.path.abspath(os.path.join(folder, FileName)))

Filtering files using os.walk()

I'm looking to modify the program to print the contents of any file called log.txt in a given year's subdirectory, ignoring any other file.
import os
year = input('Enter year: ')
path = os.path.join('logs', year)
print()
for dirname, subdirs, files in os.walk(path):
    print(dirname, 'contains subdirectories:', subdirs, end=' ')
    print('and the files:', files)
Here's what you need to do. I won't provide the complete code:
Iterate over files and check for 'log.txt'.
Get the path to the file. Hint: os.path.join.
Open the file and read it.
Print out the content.
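For reference, the steps above could be sketched roughly as follows. The helper name read_logs, the target parameter and the UTF-8 encoding are assumptions made for illustration; the original answer deliberately left the code to the reader.

```python
import os


def read_logs(path, target="log.txt"):
    """Yield (file_path, contents) for every file named `target` under `path`."""
    for dirname, subdirs, files in os.walk(path):
        # `files` holds bare file names, so a membership test is enough.
        if target in files:
            file_path = os.path.join(dirname, target)
            # Assumption: the logs are UTF-8 text.
            with open(file_path, encoding="utf-8") as f:
                yield file_path, f.read()
```

In the original program this would be called with os.path.join('logs', year), printing each contents string as it is yielded.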

removing substrings from subdirectory names using values held in list

I have a parent directory that contains a lot of subdirectories. I want to create a script that loops through all of the subdirectories and removes any key words that I have specified in the list variable.
I am not entirely sure how to achieve this.
Currently I have this:
import os
directory = next(os.walk('.'))[1]
stringstoremove = ['string1','string2','string3','string4','string5']
for folders in directory:
    os.rename
And maybe this type of logic to check to see if the string exists within the subdirectory name:
if any(words in inputstring for words in stringstoremove):
    print ("TRUE")
else:
    print ("FALSE")
I'm trying my best to deconstruct the task, but I'm going round in circles now.
Thanks guys
Starting from your existing code:
import os
directory = next(os.walk('.'))[1]
stringstoremove = ['string1','string2','string3','string4','string5']
for folder in directory:
    new_folder = folder
    for r in stringstoremove:
        new_folder = new_folder.replace(r, '')
    if folder != new_folder:  # don't rename if it's the same
        os.rename(folder, new_folder)
If you want to rename the subdirectories whose names contain any entry of your stringstoremove list, the following script will be helpful.
import os
path = "./" # parent directory path
stringstoremove = ['string1','string2','string3','string4','string5']
for directory_name in os.listdir(path):
    old_path = os.path.join(path, directory_name)
    if os.path.isdir(old_path):
        new_name = directory_name
        for string in stringstoremove:
            # str.replace treats the string literally, unlike re.sub
            new_name = new_name.replace(string, "")
        if new_name != directory_name:
            try:
                os.rename(old_path, os.path.join(path, new_name)) # rename this directory
            except OSError as e:
                print(e)
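Since renaming is hard to undo, it may help to preview the changes first. The sketch below only computes the old and new names and renames nothing; plan_renames is a hypothetical helper, not something from either answer, and skipping names that would become empty is an added assumption.

```python
import os


def plan_renames(parent, substrings):
    """Return (old_name, new_name) pairs for subdirectories of `parent`
    whose names contain any of `substrings`. Nothing is renamed."""
    plan = []
    for name in sorted(os.listdir(parent)):
        if not os.path.isdir(os.path.join(parent, name)):
            continue  # only directories are considered
        new_name = name
        for s in substrings:
            new_name = new_name.replace(s, "")
        # Assumption: skip names that would become empty after removal.
        if new_name and new_name != name:
            plan.append((name, new_name))
    return plan
```

Each pair in the returned list can then be passed to os.rename once the list looks right.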

How to access more than one file in a directory without knowing the name of each file in python

I want to access around 5000 files and work on them one by one. Is there any way to access each in succession without hard-coding the name of each file?
The following example is from this tutorial.
import os
# List the contents of a directory
path = "/var/www/html/"
dirs = os.listdir(path)
# This would print all the files and directories
for file in dirs:
    print(file)
And if your directory contains subdirectories as well as files, you could use os.walk() like -
# Example taken from os.walk documentation
import os
from os.path import join, getsize
for root, dirs, files in os.walk('python/Lib/email'):
    print(root, "consumes", end=" ")
    print(sum(getsize(join(root, name)) for name in files), end=" ")
    print("bytes in", len(files), "non-directory files")
    if 'CVS' in dirs:
        dirs.remove('CVS')  # don't visit CVS directories
Or you could use scandir -
for entry in os.scandir(path):
    ...
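As a slightly fuller sketch of the scandir approach, the generator below yields only regular files directly inside a directory, so each one can be processed in turn without knowing its name. The name iter_files is an assumption for illustration.

```python
import os


def iter_files(path):
    """Yield the full path of each regular file directly inside `path`."""
    # Using scandir as a context manager closes the iterator promptly.
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file():
                yield entry.path
```

Looping over iter_files(path) then visits the files one at a time; subdirectories are skipped rather than descended into, which is where os.walk would be the better fit.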

Delete multiple text files in one folder

from random import *
import os
def createFile(randomNumber):
    with open("FileName{}.txt".format(randomNumber), "w") as f:
        f.write("Hello mutha funsta")
def deleteFile():
    directory = os.getcwd()
    os.chdir(directory)
    fileList = [f for f in directory if f.endswith(".txt")]
    for f in fileList:
        os.remove(f)
    print ("All gone!")
fileName = input("What is the name of the file you want to create? ")
contents = input("What are the contents of the file? ")
start = input("Press enter to start the hax. Enter 1 to delete the products. ")
randomNumber = randint(0, 1)
while True:
    if start == (""):
        for i in range(0):
            createFile(randomNumber)
            randomNumber = randint(0,9999)
        break
    elif start == ("1"):
        deleteFile()
        break
    else:
        print ("That input was not valid")
Above is code I've made to create as many text files as I specify (currently set to 0). I am now adding a feature to remove all the text files it created, as my folder has over 200,000 of them. However, it doesn't work: it runs through without any errors but doesn't actually delete any of the files.
That is very wrong:
def deleteFile():
    directory = os.getcwd()
    os.chdir(directory)
    fileList = [f for f in directory if f.endswith(".txt")]
    for f in fileList:
        os.remove(f)
You change the directory, which is not recommended unless you need to run a system call. Worse, you change it to the current directory, so it has no effect.
Your list comprehension doesn't scan the directory; it iterates over a string, so f is a single character. Since a single character never ends with .txt, the resulting list is empty.
To achieve what you want you can just use glob (there is no need to change the directory, and pattern matching is handled automatically):
import glob, os
def deleteFile():
    for f in glob.glob("*.txt"):
        os.remove(f)
This method is portable (Windows, Linux) and does not shell out to an external command.
To delete all the files in the directory named FileName{some_thing}.txt, you may also use os.system(), although this relies on the rm command and so only works on Unix-like systems:
>>> import os
>>> os.system("rm -rf /path/to/directory/FileName*.txt")
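A portable way to get the same effect as the shell command above, without spawning a process, might look like the following sketch. The function name delete_matching and its parameters are assumptions for illustration; the default pattern mirrors the one in the command.

```python
import glob
import os


def delete_matching(directory, pattern="FileName*.txt"):
    """Delete files in `directory` whose names match `pattern`.

    Returns the number of files removed.
    """
    deleted = 0
    # glob.escape keeps special characters in the directory name literal.
    for file_path in glob.glob(os.path.join(glob.escape(directory), pattern)):
        if os.path.isfile(file_path):  # leave matching directories alone
            os.remove(file_path)
            deleted += 1
    return deleted
```

Passing the directory explicitly avoids os.chdir, and the return value gives a count to report back to the user.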
