After I execute a Python script in a particular directory, I get many output files, but apart from 5-6 of them I want to delete the rest from that directory. What I have done is put those 5-6 useful file names into a list and deleted every file that is not in that list. Below is my code:
list1 = ['prog_1.py', 'prog_2.py', 'prog_3.py']  # Extend
import os
dir = '/home/dev/codes'  # Change accordingly

for f in os.listdir(dir):
    if f not in list1:
        os.remove(os.path.join(dir, f))
Now I just want to add one more thing: if the output files start with output_of_final, I don't want them to be deleted. How can I do that? Should I use regex?
You could use Regex, but that's overkill here. Just use the str.startswith method.
Also, it's bad practice to shadow reserved keywords, built-in types, and built-in functions with your own variable names, so I have renamed dir to directory. (https://docs.python.org/3/library/functions.html#dir)
list1 = ['prog_1.py', 'prog_2.py', 'prog_3.py']  # Extend
import os
directory = '/home/dev/codes'  # Change accordingly

for f in os.listdir(directory):
    if f not in list1 and not f.startswith('output_of_final'):
        os.remove(os.path.join(directory, f))
Yes, a regex would work here, but there are easier options, like the str.startswith method:
list1 = ['prog_1.py', 'prog_2.py', 'prog_3.py']  # Extend
import os
dir = '/home/dev/codes'  # Change accordingly

for f in os.listdir(dir):
    if (f not in list1) and (not f.startswith('output_of_final')):
        os.remove(os.path.join(dir, f))
Basically what the title says: what is the best approach to do this?
I was looking at a few tools like os.walk and scandir, but I am not sure how I would store the results and decide which file to open if there are multiple matches. I was thinking I would need to store them in a dictionary and then decide which numbered item I want.
you can use
list_of_files = os.listdir(some_directory)
which returns a list of the names of the files in that directory; you can easily add some of these names to a dictionary based on their index in this list.
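For example, a minimal sketch along those lines (some_directory is just a placeholder for the folder you are scanning):

import os

some_directory = '/path/to/files'  # placeholder
# map each index in the listing to the corresponding file name
files_by_index = dict(enumerate(os.listdir(some_directory)))
chosen = files_by_index[0]  # pick a file by its number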
Here is a function that implements the specifications you have outlined. It may require some tinkering as your specs evolve, but it's an OK start. See the docs for the built-in os module for more info :)
import os

def my_files_dict(directory, filename):
    myfilesdict = []
    with os.scandir(directory) as myfiles:
        for f in myfiles:
            # is_file() must be called; without the parentheses the bound
            # method is always truthy
            if f.name == filename and f.is_file():
                myfilesdict.append(f.name)
    return dict(enumerate(myfilesdict))
I've already posted here with the same question, but sadly I couldn't come up with a solution (even though some of you gave me awesome answers, most of them weren't what I was looking for), so I'll try again, this time giving more information about what I'm trying to do.
So, I'm using a program called GMAT to get some outputs (.txt files with numerical values). These outputs have different names, but because I'm using them for more than one thing I'm getting something like this:
GMATd_1.txt
GMATd_2.txt
GMATf_1.txt
GMATf_2.txt
Now, what I need to do is use these outputs as inputs in my code. I need to work with them in other functions of my script, and since I will have a lot of these .txt files, I want to rename them, as I don't want to refer to them like './path/etc'.
So what I wanted was to write a loop that could get these files and rename them inside the script so I can use these files with the new name in other functions (outside the loop).
So instead of having to do this individually:
GMATds1= './path/GMATd_1.txt'
GMATds2= './path/GMATd_2.txt'
I wanted to write a loop that would do that for me.
I've already tried using a dictionary:
import os
import fnmatch

examples = {}  # renamed from "dict" so the built-in isn't shadowed
for filename in os.listdir('.'):
    if fnmatch.fnmatch(filename, 'thing*.txt'):
        examples[filename[:6]] = filename
This does work but I can't use the dictionary key outside the loop.
If I understand correctly, you are trying to fetch files with similar names (at least a recurring pattern) and rename them. This can be accomplished with the following code:
import glob
import os

all_files = glob.glob('path/to/directory/with/files/GMAT*.txt')

for file in all_files:
    new_path = create_new_path(file)  # possibly split the file name, change directory and/or filename
    os.rename(file, new_path)
The glob library allows for searching files with * wildcards and hence makes it possible to search for files with a specific pattern. It lists all the files in a certain directory (or multiple directories if you include a * wildcard as a directory). When you iterate over the files, you can either work directly with the contents of the files (as you apparently intend to do) or rename them as shown in this snippet. To rename them, you need to generate a new path, so you would have to write the create_new_path function that takes the old path and creates a new one; a rough sketch follows.
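A minimal sketch of what such a helper might look like (the renaming rule here, dropping the underscore so GMATd_1.txt becomes GMATd1.txt, is only an assumption about the naming you want):

import os

def create_new_path(old_path):
    # assumption: keep the file in the same directory and just drop the
    # underscore from the file name, e.g. GMATd_1.txt -> GMATd1.txt
    directory, filename = os.path.split(old_path)
    return os.path.join(directory, filename.replace('_', ''))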
Since Python 3.4 you should be using the built-in pathlib module instead of os or glob.
from pathlib import Path
import shutil

for file_src in Path("path/to/files").glob("GMAT*.txt"):
    file_dest = str(file_src.resolve()).replace("ds", "d_")
    shutil.move(file_src, file_dest)
you can use
import os

path = '.....'   # path where these files are located
path1 = '.....'  # path where you want these files to be stored

i = 1
for file in os.listdir(path):
    if file.endswith('.txt'):  # endswith() takes the suffix as a positional argument
        os.rename(os.path.join(path, file), os.path.join(path1, str(i) + ".txt"))
        i += 1
This will move and rename all the .txt files in the source folder to 1.txt, 2.txt, ..., n.txt in the destination folder.
I use os.listdir and it works fine, but I get sub-directories in the list also, which is not what I want: I need only files.
What function do I need to use for that?
I also looked at os.walk and it seems to be what I want, but I'm not sure how it works.
You need to filter out directories; os.listdir() lists all names in a given path. You can use os.path.isdir() for this:
import os

basepath = '/path/to/directory'
for fname in os.listdir(basepath):
    path = os.path.join(basepath, fname)
    if os.path.isdir(path):
        # skip directories
        continue
Note that this only filters out directories after following symlinks. fname is not necessarily a regular file; it could also be a symlink to a file. If you need to filter out symlinks as well, you'd need to check not os.path.islink() first.
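For example, a small variation on the loop above that skips symlinks as well (checking islink() first, as described):

basepath = '/path/to/directory'
for fname in os.listdir(basepath):
    path = os.path.join(basepath, fname)
    if os.path.islink(path) or os.path.isdir(path):
        # skip symlinks and directories
        continue
    # path now refers to a non-symlink, non-directory entry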
On a modern Python version (3.5 or newer), an even better option is to use the os.scandir() function; this produces DirEntry instances. In the common case, this is faster, as each DirEntry already has enough cached information to determine whether an entry is a directory:
basepath = '/path/to/directory'
for entry in os.scandir(basepath):
    if entry.is_dir():
        # skip directories
        continue
    # use entry.path to get the full path of this entry, or use
    # entry.name for the base filename
You can use entry.is_file(follow_symlinks=False) if only regular files (and not symlinks) are needed.
os.walk() does the same work under the hood; unless you need to recurse down subdirectories, you don't need to use os.walk() here.
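If you do want to lean on os.walk() for a single directory anyway, you can take just its first result:

# os.walk() yields (dirpath, dirnames, filenames) tuples; next() grabs only
# the tuple for basepath itself, so nothing is recursed into
filenames = next(os.walk(basepath))[2]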
Here is a nice little one-liner in the form of a list comprehension:
[f for f in os.listdir(your_directory) if os.path.isfile(os.path.join(your_directory, f))]
This will return a list of filenames within the specified your_directory.
import os

directoryOfChoice = "C:\\"  # Replace with a directory of choice!!!
full_paths = (os.path.join(directoryOfChoice, f) for f in os.listdir(directoryOfChoice))
files = list(filter(os.path.isfile, full_paths))  # listdir() names are bare, so join first
P.S: os.getcwd() returns the current directory.
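To illustrate that P.S., the bare names returned by os.listdir() resolve correctly when the directory you list is also the current working directory:

import os

# relative names from os.listdir() are resolved against the current directory
current_files = list(filter(os.path.isfile, os.listdir(os.getcwd())))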
for fname in os.listdir('.'):
    if os.path.isdir(fname):
        pass  # do your stuff here for directory
    else:
        pass  # do your stuff here for regular file
The solution with os.walk() would be:
for r, d, f in os.walk('path/to/dir'):
    for name in f:
        # this will list all files found in a particular directory
        print(os.path.join(r, name))
Even though this is an older post, for the sake of completeness let me add the pathlib library, introduced in Python 3.4, which provides an OOP style of handling directories and files. To get all files in a directory, you can use:
from pathlib import Path

def get_list_of_files_in_dir(directory: str, file_types: str = '*') -> list:
    return [f for f in Path(directory).glob(file_types) if f.is_file()]
Following your example, you could use it like this:
mypath = '/path/to/directory'
files = get_list_of_files_in_dir(mypath)
If you only want a subset of files depending on the file extension (e.g. "only csv files"), you can use:
files = get_list_of_files_in_dir(mypath, '*.csv')
Note that, per PEP 471, the DirEntry method signature is is_dir(*, follow_symlinks=True),
so...
from os import scandir

folder = '/home/myfolder/'

for entry in scandir(folder):
    if entry.is_dir():
        # do code or skip
        continue
    myfile = folder + entry.name
    # do something with myfile
I have a Python script that checks a certain folder for new files and then copies the new files to another directory. The files follow a format like 1234.txt and 1234_status.txt. It should only move 1234.txt and leave 1234_status.txt untouched.
Here's a little piece of my code in Python:
while 1:
    # retrieves listdir
    after = dict([(f, None) for f in os.listdir(path_to_watch)])
    # if after has more files than before, the new files are added to the list "added"
    added = [f for f in after if f not in before]
My idea is that after it fills added, it checks the list for values that have status in them and pops those from the list. I couldn't find a way to do this though :/
If I understand your problem correctly:
while 1:
    for f in os.listdir(path_to_watch):
        if 'status' not in f:  # or a more appropriate condition
            move_file_to_another_directory(f)
    # wait
Or check out pyinotify if you're using Linux, to avoid useless polling.
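A rough sketch of the pyinotify approach (move_file_to_another_directory is the same placeholder as above, and path_to_watch comes from your script):

import pyinotify

class NewFileHandler(pyinotify.ProcessEvent):
    def process_IN_CREATE(self, event):
        # event.pathname is the full path of the file that was just created
        if 'status' not in event.pathname:
            move_file_to_another_directory(event.pathname)

wm = pyinotify.WatchManager()
wm.add_watch(path_to_watch, pyinotify.IN_CREATE)
notifier = pyinotify.Notifier(wm, NewFileHandler())
notifier.loop()  # blocks and dispatches events as new files appear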
added = [f for f in after if f not in before and '_status' not in f]
I do, however, recommend refraining from long one-line statements, as they make the code almost impossible to read:
files_in_directory = os.listdir(directory_name)  # os.listdir() already returns a list
files_to_move = list(filter(lambda filename: '_status' not in filename, files_in_directory))
You can use set logic since order doesn't matter here:
from itertools import filterfalse

def is_status_file(filename):
    return filename.endswith('_status.txt')

# ...
added = set(after) - set(before)
without_status = filterfalse(is_status_file, added)
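Note that filterfalse() returns a lazy iterator; wrap it in list() if you need to reuse the result:

without_status_files = list(without_status)  # materialize the iterator once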
I am running a script that walks a directory structure and generates new files in each folder in the directory. I want to delete some of the files right after creation. This is my idea, but it is quite wrong I imagine:
directory = os.path.dirname(obj)
m = MeshExporterApplication(directory)
os.remove(os.path.join(directory,"*.mesh.xml"))
How do you put wildcards in a path? I guess not like /home/me/*.txt, but that is what I am trying.
Thanks,
Gareth
You can use the glob module:
import glob
glob.glob("*.mesh.xml")
to get a list of matching files. Then you delete them, one by one.
directory = os.path.dirname(obj)
m = MeshExporterApplication(directory)

# you can use absolute paths in the glob
# to ensure that you're purging the files in
# the right directory, e.g. "/tmp/*.mesh.xml"
for f in glob.glob("*.mesh.xml"):
    os.remove(f)
Do a for loop with the list of files as the thing you are looping over:
import os
import re

directory = os.path.dirname(obj)
m = MeshExporterApplication(directory)

for filename in os.listdir(directory):
    if re.match(r".*\.mesh\.xml", filename):
        os.remove(os.path.join(directory, filename))