Remove all files in a directory matching regular expression in Python

I have two files in the directory /home/documents/ named 2018-06-rs.csv000 and 2018-06-rs.csv001. I want to remove both files from the directory.
Following is my code:
import datetime
import os
now = datetime.datetime.now()
file_date = now.strftime("%Y-%m")
os.remove("/home/documents/"+file_date+"-rs.csv*")
The error I'm getting is :
OSError: [Errno 2] No such file or directory: '/home/documents/201806-rs.csv*'
Listing the above path directs to the actual file though.
ls /home/documents/201806-rs.csv*
Appreciate any feedback.

Try this:
import os, re
def purge(dir, pattern):
    for f in os.listdir(dir):
        if re.search(pattern, f):
            os.remove(os.path.join(dir, f))
Make sure dir is the correct path to the directory that contains your files, and pattern is a valid regex.
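The call in the question fails because os.remove does not expand shell wildcards, so the * is treated as a literal character in the file name. A minimal sketch that expands the pattern with glob first; the helper name remove_matching is hypothetical, and the example path assumes the files are named as described in the question.

```python
import glob
import os


def remove_matching(pattern):
    """Delete every file matching a shell-style wildcard pattern.

    Returns the sorted list of paths that were removed.
    """
    removed = []
    for path in glob.glob(pattern):
        if os.path.isfile(path):  # skip directories that happen to match
            os.remove(path)
            removed.append(path)
    return sorted(removed)


# Example call (path taken from the question; adjust as needed):
# remove_matching("/home/documents/2018-06-rs.csv*")
```

Returning the removed paths makes it easy to log or double-check what was deleted.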

Related

Get files from specific folders in python

I have the following directory structure with the following files:
Folder_One
├─file1.txt
├─file1.doc
└─file2.txt
Folder_Two
├─file2.txt
├─file2.doc
└─file3.txt
I would like to get only the .txt files from each folder listed. Example:
Folder_One-> file1.txt and file2.txt
Folder_Two-> file2.txt and file3.txt
Note: This entire directory is inside a folder called dataset. My code looks like this, but I believe something is missing. Can someone help me.
path_dataset = "./dataset/"
filedataset = os.listdir(path_dataset)
for i in filedataset:
    pasta = ''
    pasta = pasta.join(i)
    for file in glob.glob(path_dataset+"*.txt"):
        print(file)
Using pathlib:
from pathlib import Path
for path in Path('dataset').rglob('*.txt'):
    print(path.name)
Using glob
import glob
for x in glob.glob('dataset/**/*.txt', recursive=True):
    print(x)
You can use the re module to check that a filename ends with .txt.
import re
import os
path_dataset = "./dataset/"
l = os.listdir(path_dataset)
for e in l:
    if os.path.isdir(path_dataset + e):
        ll = os.listdir(path_dataset + e)
        for file in ll:
            if re.match(r".*\.txt$", file):
                print(e + '->' + file)
As an additional option, you can find all the files with os.walk from the os module (an advantage if you already use this module):
import os
# get the current directory; you may also provide an absolute path
path = os.getcwd()
# walk recursively through all folders and gather the matching files
for root, dirs, files in os.walk(path):
    # keep only the files whose names end with .txt
    check = [f for f in files if f.endswith(".txt")]
    if check:
        print(root, check)
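To get the per-folder grouping the question asks for (Folder_One-> file1.txt and file2.txt), one option is to build a dictionary keyed by folder name. This is only a sketch; the function name txt_files_by_folder is hypothetical, and it assumes the .txt files sit directly inside each subfolder of dataset.

```python
from pathlib import Path


def txt_files_by_folder(dataset_dir):
    """Map each immediate subfolder name to its sorted .txt file names."""
    result = {}
    for folder in sorted(Path(dataset_dir).iterdir()):
        if folder.is_dir():
            result[folder.name] = sorted(p.name for p in folder.glob("*.txt"))
    return result


# Example: print in the "Folder-> files" form used in the question
# for name, files in txt_files_by_folder("./dataset").items():
#     print(name + "->", " and ".join(files))
```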

Trouble renaming files to customized names

Can't rename old files located in a folder in desktop. There are three files there item.pdf,item1.pdf and item2.pdf. What I wish to do now is rename those files to new_item.pdf,new_item1.pdf and new_item2.pdf.
I tried with the below script:
import os
filepath = "/Users/WCS/Desktop/all_files/"
for item in os.listdir(filepath):
    os.rename(item,"new_name"+".pdf")
Executing the above script throws the following error. Whereas the folder address is accurate:
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'item.pdf' -> 'new_name.pdf'
How can I rename these three files item.pdf,item1.pdf and item2.pdf to new_item.pdf,new_item1.pdf and new_item2.pdf from a folder?
Try this:
import os
import re
filepath = "/Users/WCS/Desktop/all_files/"
for item in os.listdir(filepath):
    # capture the optional digits just before the .pdf extension
    match = re.search(r'(\d*)\.pdf$', item)
    if not match:
        continue  # not a .pdf file
    endnum = match.group(1)
    os.rename(os.path.join(filepath, item), os.path.join(filepath, "new_item{}.pdf".format(endnum)))
or, if you don't want to use re:
import os
filepath = "/Users/WCS/Desktop/all_files/"
for item in os.listdir(filepath):
    # item.pdf -> new_item.pdf, item1.pdf -> new_item1.pdf, ...
    new_name = item.replace('item', 'new_item')
    os.rename(os.path.join(filepath, item), os.path.join(filepath, new_name))
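You need to either specify the full path to each file in os.rename, because os.listdir returns bare file names that are resolved against the current working directory.
Something like:
for item in os.listdir(filepath):
    os.rename(os.path.join(filepath, item), os.path.join(filepath, "new_" + item))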
Or change your current working directory to the directory where the files exist:
os.chdir("/your/file/path")
and then run your code.
See also https://docs.python.org/2/library/os.html#os.rename
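Putting the full-path advice together, the rename can be wrapped in a small helper. This is a sketch under the assumption that every file with the given extension should get the same prefix; the name add_prefix and its parameters are hypothetical.

```python
import os


def add_prefix(directory, prefix="new_", extension=".pdf"):
    """Rename every file with the given extension by prepending a prefix.

    Files that already carry the prefix are left alone, so running the
    function twice does not produce names like new_new_item.pdf.
    """
    renamed = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(extension) and not name.startswith(prefix):
            src = os.path.join(directory, name)
            dst = os.path.join(directory, prefix + name)
            os.rename(src, dst)
            renamed.append(prefix + name)
    return renamed


# Example call (path taken from the question):
# add_prefix("/Users/WCS/Desktop/all_files/")
```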

Recovering filenames from a folder in linux using python

I am trying to use the listdir function from the os module in python to recover a list of filenames from a particular folder.
Here's the code:
import os
def rename_file():
    # extract filenames from a folder
    # for each filename, rename filename
    list_of_files = os.listdir("/home/admin-pc/Downloads/prank/prank")
    print (list_of_files)
I am getting the following error:
OSError: [Errno 2] No such file or directory:
It seems to give no trouble on Windows, where the directory structure starts from the C drive.
How do I modify the code to work on Linux?
The code is correct, so the problem is most likely the path you provided.
Open a terminal, cd into the folder, and run pwd to print the correct absolute path.
Hope that works.
You could modify your function to avoid that error by checking that the directory exists first:
import os
def rename_file():
    # extract filenames from a folder
    # for each filename, rename filename
    path_to_file = "/home/admin-pc/Downloads/prank/prank"
    # only list the directory if the path exists
    if os.path.exists(path_to_file):
        list_of_files = os.listdir(path_to_file)
        print (list_of_files)
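An alternative to testing for existence beforehand is to attempt the listing and handle the failure, which also reports the exact path that could not be found. A minimal sketch, assuming Python 3 (where FileNotFoundError is available); the function name list_files is hypothetical.

```python
import os


def list_files(path):
    """Return the sorted entries of `path`, or an empty list if it is missing."""
    try:
        return sorted(os.listdir(path))
    except FileNotFoundError:
        # Show the path exactly as Python received it, which helps spot typos
        print("Directory not found: {!r}".format(path))
        return []


# Example call (path taken from the question):
# list_files("/home/admin-pc/Downloads/prank/prank")
```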

How can get a list of files in a specific directory ignoring the symbolic links using python?

I need to process the filenames from a directory by creating the list of the filenames.
But my resulting list contains entries for symbolic links too. How can I get pure filenames in a particular directory using python.
I have tried:os.walk,os.listdir,os.path.isfile
But all are including symbolic links of type 'filename~' to the list :(
The glob.glob adds the path to the list which I don't need.
I need to use it in a code like this:
files=os.listdir(folder)
for f in files:
    dosomething(like find similar file f in other folder)
Any help? Or please redirect me to the right answer. Thanks
Edit: the tilde sign is at end
To get regular files in a directory:
import os
from stat import S_ISREG
for filename in os.listdir(folder):
    path = os.path.join(folder, filename)
    try:
        st = os.lstat(path) # get info about the file (don't follow symlinks)
    except EnvironmentError:
        continue # file vanished or permission error
    else:
        if S_ISREG(st.st_mode): # is regular file?
            do_something(filename)
If you still see 'filename~' filenames then it means that they are not actually symlinks. Just filter them using their names:
filenames = [f for f in os.listdir(folder) if not f.endswith('~')]
Or using fnmatch:
import fnmatch
filenames = fnmatch.filter(os.listdir(folder), '*[!~]')
You can use os.path.islink(yourfile) to check if yourfile is symlinked, and exclude it.
Something like this works for me:
folder = 'absolute_path_of_yourfolder' # without ending /
res = []
for f in os.listdir(folder):
    absolute_f = os.path.join(folder, f)
    if not os.path.islink(absolute_f) and not os.path.isdir(absolute_f):
        res.append(f)
res # will get you the files not symlinked nor directory
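On Python 3.6 or later, os.scandir can do the symlink check and the name filter in one pass. This is a sketch rather than a drop-in answer; the function name regular_files and the skip_suffix parameter are hypothetical.

```python
import os


def regular_files(folder, skip_suffix="~"):
    """Names of regular files in `folder`, excluding symlinks and backup files."""
    names = []
    with os.scandir(folder) as entries:
        for entry in entries:
            # follow_symlinks=False makes is_file() return False for symlinks
            if entry.is_file(follow_symlinks=False) and not entry.name.endswith(skip_suffix):
                names.append(entry.name)
    return sorted(names)
```

The function returns bare file names, not full paths, which matches what the question wants from os.listdir.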
...

How to find all files with a particular extension? [duplicate]

I am trying to find all the .c files in a directory using Python.
I wrote this, but it is just returning me all files - not just .c files:
import os
import re
results = []
for folder in gamefolders:
    for f in os.listdir(folder):
        if re.search('.c', f):
            results += [f]
print results
How can I just get the .c files?
Try changing the inner loop to something like this:
results += [each for each in os.listdir(folder) if each.endswith('.c')]
Try "glob":
>>> import glob
>>> glob.glob('./[0-9].*')
['./1.gif', './2.txt']
>>> glob.glob('*.gif')
['1.gif', 'card.gif']
>>> glob.glob('?.gif')
['1.gif']
Keep it simple (KISS):
import os
results = []
for folder in gamefolders:
    for f in os.listdir(folder):
        if f.endswith('.c'):
            results.append(f)
print(results)
There is a better solution than using regular expressions directly: the standard library's fnmatch module, which is designed for file name patterns. (See also the glob module.)
Write a helper function:
import fnmatch
import os
def listdir(dirname, pattern="*"):
    return fnmatch.filter(os.listdir(dirname), pattern)
and use it as follows:
result = listdir("./sources", "*.c")
Using os.walk and os.path.splitext:
import os
results = []
for _, _, filenames in os.walk(folder):
    for file in filenames:
        fileExt = os.path.splitext(file)[-1]
        if fileExt == '.c':
            results.append(file)
As another alternative, you could use fnmatch:
import fnmatch
import os
results = []
for root, dirs, files in os.walk(path):
    for _file in files:
        if fnmatch.fnmatch(_file, '*.c'):
            results.append(os.path.join(root, _file))
print(results)
or with a comprehension:
for root, dirs, files in os.walk(path):
    results.extend(os.path.join(root, _file)
                   for _file in files
                   if fnmatch.fnmatch(_file, '*.c'))
or using filter:
for root, dirs, files in os.walk(path):
    results.extend(os.path.join(root, _file)
                   for _file in fnmatch.filter(files, '*.c'))
Change the directory to the given path, so that you can search files within directory. If you don't change the directory then this code will search files in your present directory location:
import os #importing os library
import glob #importing glob library
path=raw_input() #input from the user
os.chdir(path)
filedata=glob.glob('*.c') #all files with the .c extension are stored in filedata
print filedata
Using a compiled regular expression:
import os, re
cfile = re.compile(r"^.*?\.c$")
results = []
for name in os.listdir(directory):
    if cfile.match(name):
        results.append(name)
The implementation of shutil.copytree is in the docs. I modified it to take a list of extensions to INCLUDE.
import os
from shutil import copy2, copystat, Error
def my_copytree(src, dst, symlinks=False, *extentions):
    """ I modified the 2.7 implementation of shutil.copytree
    to take a list of extentions to INCLUDE, instead of an ignore list.
    """
    names = os.listdir(src)
    os.makedirs(dst)
    errors = []
    for name in names:
        srcname = os.path.join(src, name)
        dstname = os.path.join(dst, name)
        try:
            if symlinks and os.path.islink(srcname):
                linkto = os.readlink(srcname)
                os.symlink(linkto, dstname)
            elif os.path.isdir(srcname):
                my_copytree(srcname, dstname, symlinks, *extentions)
            else:
                ext = os.path.splitext(srcname)[1]
                if ext not in extentions:
                    # skip the file
                    continue
                copy2(srcname, dstname)
            # XXX What about devices, sockets etc.?
        # catch the Error from the recursive copytree so that we can
        # continue with other files
        except Error as err:
            errors.extend(err.args[0])
        except (IOError, os.error) as why:
            errors.append((srcname, dstname, str(why)))
    try:
        copystat(src, dst)
    # except WindowsError: # can't copy file access times on Windows
    #     pass
    except OSError as why:
        errors.append((src, dst, str(why)))
    if errors:
        raise Error(errors)
Usage: because symlinks is the third positional parameter, pass it explicitly before the extensions. For example, to copy only .config and .bat files:
my_copytree(source, targ, False, '.config', '.bat')
This is pretty clean; the functions come from the os module.
This code searches the current working directory and lists only files of the specified type. You can change the directory by replacing os.getcwd() with your target directory, and choose the file type by replacing '(ext)'. os.fsdecode converts bytes file names to str so that .endswith() does not raise a type error; it only matters if os.listdir is given a bytes path. The result is sorted alphabetically; remove sorted() for the raw list.
import os
filenames = sorted([os.fsdecode(file) for file in os.listdir(os.getcwd()) if os.fsdecode(file).endswith(".(ext)")])
Here's yet another solution, using pathlib (and Python 3):
from pathlib import Path
gamefolder = "path/to/dir"
result = sorted(Path(gamefolder).glob("**/*.c"))
Notice the double asterisk (**) in the glob() argument. It makes the search cover gamefolder as well as its subdirectories. If you only want to search gamefolder itself, use a single * in the pattern: "*.c". For more details, see the documentation.
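The recursive and non-recursive searches can be wrapped in one helper. A small sketch, assuming Python 3; the function name find_c_files and its recursive flag are hypothetical. Path.rglob("*.c") is the pathlib shorthand for a recursive search.

```python
from pathlib import Path


def find_c_files(folder, recursive=True):
    """Return sorted paths of .c files, optionally including subdirectories."""
    root = Path(folder)
    # rglob("*.c") searches the folder and everything below it;
    # glob("*.c") looks only at the folder itself
    matches = root.rglob("*.c") if recursive else root.glob("*.c")
    return sorted(p for p in matches if p.is_file())
```

The result is a list of Path objects; use p.name if only the file names are needed.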
If you replace '.c' with '[.]c$', you're searching for files whose names end with the two characters .c, rather than all files that contain a c with at least one character before it.
Edit: Alternatively, compare f[-2:] with '.c'; this MAY be computationally cheaper than running a regexp match.
Just to be clear, if you wanted a literal dot character in your search term, you could also have escaped it with a backslash:
'.*\.c$' would give you what you needed. You could also use results.append(f) instead of results += [f]; both add the file name to the list.
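The difference between the unescaped and the escaped pattern is easy to see on a handful of names. The file names below are made up for illustration.

```python
import re

names = ["main.c", "notes.doc", "abc.txt", "lib.cpp", "util.c", "readme.md"]

# Unescaped dot: matches "any character followed by c" anywhere in the name
loose = [n for n in names if re.search(".c", n)]

# Escaped dot plus end anchor: the name must end with the literal ".c"
strict = [n for n in names if re.search(r"\.c$", n)]

print(loose)   # every name except readme.md
print(strict)  # only main.c and util.c
```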
This function returns a list of all file names with the specified extension that live in the specified directory:
import os
def listFiles(path, extension):
    return [f for f in os.listdir(path) if f.endswith(extension)]
print(listFiles('/Path/to/directory/with/files', '.txt'))
If you want to list all files with the specified extension in a certain directory and its subdirectories you could do:
import os
def filterFiles(path, extension):
    return [file for root, dirs, files in os.walk(path) for file in files if file.endswith(extension)]
print(filterFiles('/Path/to/directory/with/files', '.txt'))
You can actually do this with just os.listdir and a list comprehension:
import os
results = [f for folder in gamefolders for f in os.listdir(folder) if f.endswith('.c')]
