I am running a script that walks a directory structure and generates new files in each folder in the directory. I want to delete some of the files right after creation. This is my idea, but it is quite wrong I imagine:
directory = os.path.dirname(obj)
m = MeshExporterApplication(directory)
os.remove(os.path.join(directory,"*.mesh.xml"))
How do you put wildcards in a path? I guess not like /home/me/*.txt, but that is what I am trying.
Thanks,
Gareth
You can use the glob module:
import glob
glob.glob("*.mesh.xml")
to get a list of matching files. Then you delete them, one by one.
directory = os.path.dirname(obj)
m = MeshExporterApplication(directory)
# use an absolute pattern in the glob (e.g. "/tmp/*.mesh.xml")
# to ensure that you're purging the files in the right directory
for f in glob.glob(os.path.join(directory, "*.mesh.xml")):
    os.remove(f)
Do a for loop with the list of files as the thing you are looping over:
import os
import re

directory = os.path.dirname(obj)
m = MeshExporterApplication(directory)
for filename in os.listdir(directory):
    if re.match(r".*\.mesh\.xml", filename):
        os.remove(os.path.join(directory, filename))
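Another standard-library option for the original wildcard question (a minimal sketch, not taken from the answers above) is fnmatch, which applies shell-style patterns like *.mesh.xml to plain file names:
import fnmatch
import os

directory = os.path.dirname(obj)  # same directory variable as in the snippets above
for filename in os.listdir(directory):
    if fnmatch.fnmatch(filename, "*.mesh.xml"):
        os.remove(os.path.join(directory, filename))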
What I have is an initial directory with a file inside D:\BBS\file.x and multiple .txt files in the work directory D:\
What I am trying to do is copy the folder BBS with its content, incrementing its name by a number, then copy/move each existing .txt file into its newly created directory, making \BBS1, \BBS2, ..., \BBSn (depending on the number of .txt files).
Visual example of the Before and After:
Initial view of the \WorkFolder
Desired view of the \WorkFolder
Right now I have only managed to create a new directory and move the .txt files into it all at once, not one directory per file as I would like. Here's my code:
from pathlib import Path
from shutil import copy
import shutil
import os
wkDir = Path.cwd()
src = wkDir.joinpath('BBS')
count = 0
for content in src.iterdir():
    addname = src.name.split('_')[0]
    out_folder = wkDir.joinpath(f'!{addname}')
    out_folder.mkdir(exist_ok=True)
    out_path = out_folder.joinpath(content.name)
    copy(content, out_path)
files = os.listdir(wkDir)
for f in files:
    if f.endswith(".txt"):
        shutil.move(f, out_folder)
I kindly request assistance with incrementing the folder name and copying the files one by one, each into its own newly created directory, as described above.
I don't have much skill with Python in general. Python 3, OS: Windows.
Thanks in advance
Now I understand what you want to accomplish. I think you can do it quite easily by iterating only over the text files and copying the BBS folder for each one; after that you move the file you are currently at. To get folder_num, you may be able to simply slice the file name at fixed indexes (e.g. f[4:6]) if the name always follows the pattern TextXX.txt. If the prefix "Text" may vary, it is more robust to use regular expressions, as in the following sample.
Also, the function shutil.copytree copies a directory with its children.
import os
import re
import shutil
from pathlib import Path

wkDir = Path.cwd()
src = wkDir.joinpath('BBS')
for f in os.listdir(wkDir):
    if f.endswith(".txt"):
        folder_num = re.findall(r"\d+", f)[0]
        target = wkDir.joinpath(f"{src.name}{folder_num}")
        # copy BBS
        shutil.copytree(src, target)
        # move .txt file
        shutil.move(f, target)
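If the names really do follow the fixed TextXX.txt pattern mentioned above, the slicing alternative would simply replace the regular-expression line (a minimal sketch; the 4:6 range is the one quoted above and assumes a two-digit number):
folder_num = f[4:6]  # e.g. "Text12.txt" -> "12"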
I am trying to use the Python library os to loop through all the subdirectories in my root directory, target files with a specific name, and rename them.
Just to make it clear, this is my tree structure:
My Python file is located at the root level.
What I am trying to do is target the directory 942ba, loop through all of its subdirectories, locate the file 000000, and rename it to 000000.csv.
The current code I have is as follows:
import os
root = '<path-to-dir>/942ba956-8967-4bec-9540-fbd97441d17f/'
for dirs, subdirs, files in os.walk(root):
    for f in files:
        print(dirs)
        if f == '000000':
            dirs = dirs.strip(root)
            f_new = f + '.csv'
            os.rename(os.path.join(r'{}'.format(dirs), f), os.path.join(r'{}'.format(dirs), f_new))
But this is not working, because when I run my code, for some reason it strips the date from the subdirectory names.
Can anyone help me understand how to solve this issue?
A more efficient way to iterate through the folders and only select the files you are looking for is below:
import os

source_folder = '<path-to-dir>/942ba956-8967-4bec-9540-fbd97441d17f/'
files = [os.path.normpath(os.path.join(root, f)) for root, dirs, files in os.walk(source_folder)
         for f in files if '000000' in f and not f.endswith('.gz')]
for f in files:
    os.rename(f, f"{f}.csv")
The list comprehension stores the full path to the files you are looking for. You can change the condition inside the comprehension to anything you need. I use this code snippet a lot to find just images of certain type, or remove unwanted files from the selected files.
In the for loop, files are renamed adding the .csv extension.
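For instance, to collect only PNG images instead (a hypothetical variation on the comprehension above; the extension is just an example):
images = [os.path.normpath(os.path.join(root, f)) for root, dirs, files in os.walk(source_folder)
          for f in files if f.lower().endswith('.png')]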
I would use glob to find the files.
import glob
import os

zdir = '942ba956-8967-4bec-9540-fbd97441d17f'
# recursive=True lets ** match sub-directories at any depth below zdir
files = glob.glob('*{}/**/000000'.format(zdir), recursive=True)
for fly in files:
    os.rename(fly, '{}.csv'.format(fly))
I’d like to write a function to iterate over Excel files that are in different folders. Parts of the path of each file are the same, for instance:
C:\Main\Division\Reports\Year\Data.xls
The only part of each path that changes is ‘Year’. The files all have the same name.
Is there a way to do this with a placeholder for Year? If not, what approach should I take?
You can use the os.listdir function:
import os

directory = r"C:\Main\Division\Reports"
for data in os.listdir(directory):
    file_name = os.path.join(directory, data, 'Data.xls')
    # do something with file_name
You could try os.walk:
import os

parent = r"C:\Main\Division\Reports"
for root, directories, files in os.walk(parent):
    print(root)
    print(directories)
    print(files)
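Since the only part of the path that changes is the Year folder, a wildcard there also answers the placeholder question directly; a minimal glob sketch, using the example path from the question:
import glob

# * stands in for the Year folder; every matching Data.xls path is returned
for file_name in glob.glob(r"C:\Main\Division\Reports\*\Data.xls"):
    print(file_name)  # or open it with your Excel reader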
I accidentally made copies of a lot of files on my computer.
But one thing I noticed was that they all end with the suffix ".copy", so in order to delete them I would like to write a Python script that selects these files and then deletes them.
How do I go about doing that?
import os

root_dir = 'C:\\Path\\To\\Directory'  # if using Windows
#root_dir = '/path/to/directory'      # if using Linux/OS X
files = [os.path.join(dirpath, f) for dirpath, subdirs, files in os.walk(root_dir)
         for f in files if f.endswith('.copy')]
for f in files:
    print(f)
    # os.remove(f)
This will iterate through all files and folders starting at the root dir.
Remove the hash tag # in front of os.remove after the first run, once you've verified that the correct files are being selected.
import os

folder = "path/to/the/folder_with_files"
for file_name in os.listdir(folder):
    if file_name.endswith('.copy'):
        os.remove(os.path.join(folder, file_name))
Hope that works.
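Note that os.listdir only looks at a single folder; if the copies are spread across sub-folders, a recursive pathlib variant (a sketch, not part of either answer above) would be:
from pathlib import Path

# rglob walks every sub-folder; unlink deletes each matching file
for p in Path("path/to/the/folder_with_files").rglob("*.copy"):
    p.unlink()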
I'm working on something here, and I'm completely confused. Basically, I have a script in my directory, and that script has to run on multiple folders with a particular extension. Right now, I have it up and running on a single folder. Here's the structure: I have a main folder, say Python; inside it I have multiple folders, all with the same .ext; and inside each sub-folder I again have a few folders, inside which I have the working file.
Now, I want the script to visit the whole path: say we are inside the main folder 'Python', inside which we have folder1.ext -> sub-folder1 -> working-file; then come out of this, go back to the main folder 'Python', and start visiting the second directory.
Now there are so many things in my head: the glob module, os.walk, the for loop. I'm getting the logic wrong. I desperately need some help.
Say, Path=r'\path1'
How do I start? I would greatly appreciate any help.
I'm not sure if this is what you want, but this main function with a recursive helper function gets a dictionary of all of the files in a main directory:
import os, os.path
def getFiles(path):
    '''Gets all of the files in a directory'''
    sub = os.listdir(path)
    paths = {}
    for p in sub:
        print(p)
        pDir = os.path.join(path, p)
        if os.path.isdir(pDir):
            paths.update(getAllFiles(pDir, paths))
        else:
            paths[p] = pDir
    return paths

def getAllFiles(mainPath, paths={}):
    '''Helper function for getFiles(path)'''
    subPaths = os.listdir(mainPath)
    for path in subPaths:
        pathDir = os.path.join(mainPath, path)
        if os.path.isdir(pathDir):
            paths.update(getAllFiles(pathDir, paths))
        else:
            paths[path] = pathDir
    return paths
This returns a dictionary of the form {'my_file.txt': 'C:\User\Example\my_file.txt', ...}.
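A quick usage example (a sketch; the starting path and the .txt filter are only illustrative):
all_files = getFiles(r'C:\User\Example')
for name, full_path in all_files.items():
    if name.endswith('.txt'):
        print(name, '->', full_path)
Note that because the result is keyed by file name, two files with the same name in different folders will overwrite each other in the dictionary.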
Since you distinguish first-level directories from their sub-directories, you could do something like this:
# a generator yielding the full paths of all first-level directories with the given extension
dirs = (os.path.join(my_path, d) for d in os.listdir(my_path)
        if os.path.isdir(os.path.join(my_path, d))
        and os.path.splitext(d)[-1] == my_ext)
for d in dirs:
    for root, sub_dirs, files in os.walk(d):
        for f in files:
            pass  # call your script on each file f
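Here my_path and my_ext are assumed to be defined elsewhere; with the example values from the question they might look like:
my_path = r'\path1'   # the Path mentioned in the question
my_ext = '.ext'       # the extension shared by the first-level folders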
You could use Formic (disclosure: I am the author). Formic allows you to specify one multi-directory glob to match your files, eliminating directory walking:
import formic
fileset = formic.FileSet(include="*.ext/*/working-file", directory=r"path1")
for file_name in fileset:
    pass  # Do something with file_name
A couple of points to note:
/*/ matches every subdirectory, while /**/ recursively descends into every subdirectory, their subdirectories and so on. Some options:
If the working file is precisely one directory below your *.ext, then use /*/
If the working file is at any depth under *.ext, then use /**/ instead.
If the working file is at least one directory below *.ext, then you might use /*/**/
Formic starts searching in the current working directory. If this is the correct directory, you can omit directory=r"path1".
I am assuming the working file is literally called working-file. If not, substitute a glob that matches it, like *.sh or script-*.
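For example, the recursive variant described in the options above changes only the include pattern (same assumed path1 and working-file names as in the snippet earlier):
import formic

# ** descends into sub-directories at any depth below each *.ext folder
fileset = formic.FileSet(include="*.ext/**/working-file", directory=r"path1")
for file_name in fileset:
    print(file_name)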