I need to change file names inside a folder
My file names are
BX-002-001.pdf
DX-001-002.pdf
GH-004-004.pdf
HJ-003-007.pdf
I need to add an additional zero after '-' at the end, like this
BX-002-0001.pdf
DX-001-0002.pdf
GH-004-0004.pdf
HJ-003-0007.pdf
I tried this
all_files = glob.glob("*.pdf")
for i in all_files:
    fname = os.path.splitext(os.path.basename(i))[0]
    fname = fname.replace("-00","-000")
My code is not working, can anyone help?
fname = fname.replace("-00","-000") only changes the variable fname in your program. It does not change the filename on your disk.
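You can use os.rename() to actually apply the change to the files on disk. Keep the extension when you build the new name, and pad only the last group, because replace("-00","-000") would also change the middle group (BX-002 would become BX-0002):
import glob
import os
all_files = glob.glob("*.pdf")
for i in all_files:
    fname, ext = os.path.splitext(os.path.basename(i))
    # add the zero only to the group after the last '-'
    head, last = fname.rsplit("-", 1)
    fname = head + "-0" + last
    os.rename(i, os.path.join(os.path.dirname(i), fname + ext))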
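A regular expression is another way to touch only the last number group. This is a minimal sketch, not the answerer's code: pad_last_group, rename_all and the dry_run flag are hypothetical names introduced for illustration, and it assumes the names end in "-digits.extension" as in the question.

```python
import os
import re

# Matches the final "-digits" group that sits directly before the extension.
LAST_GROUP = re.compile(r"-(\d+)(\.[^.]+)$")


def pad_last_group(filename):
    """Return filename with one extra zero in front of the last number group."""
    return LAST_GROUP.sub(r"-0\1\2", filename)


def rename_all(folder, dry_run=True):
    """Rename every .pdf in folder; with dry_run=True only print the plan."""
    for old in sorted(os.listdir(folder)):
        new = pad_last_group(old)
        if not old.endswith(".pdf") or new == old:
            continue
        print(old, "->", new)
        if not dry_run:
            os.rename(os.path.join(folder, old), os.path.join(folder, new))
```

Calling rename_all(".") first with the default dry_run=True prints the planned renames without touching the disk, which makes it easier to spot a wrong pattern before anything is changed.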
I have a folder named "animals"
Inside the folder I have the following files:
"cat.PNG", "dog.PNG", "horse.PNG", "sheep.PNG"
I know the following code will change the files to lowercase
files = os.listdir('.')
for f in files:
    new = f.lower()
    os.rename(f, new)
But how would I change this if I wanted the file type to be lower and the name of the animal to be upper of every file?
The cleanest way (which works for any directory and any extension too):
for f in os.listdir(source_dir):
    name, ext = os.path.splitext(f)
    os.rename(os.path.join(source_dir, f), os.path.join(source_dir, name.upper() + ext.lower()))
split name into radix+extension
convert name to uppercase and extension to lowercase
perform rename with full path
A really simple solution would be the following:
for f in files:
    # str.replace returns a new string, so chain it instead of discarding the result
    new = f.upper().replace(".PNG", ".png")
    os.rename(f, new)
You can split the file name, do each operation individually, then rejoin them.
files = os.listdir('.')
for f in files:
    # Split the filename by '.'
    split_filename = f.split('.')
    filename = ".".join(split_filename[:-1])
    extension = split_filename[-1]
    # Do each operation
    filename = filename.upper()
    extension = extension.lower()
    # Rejoin the filename
    new_filename = filename + '.' + extension
    # Rename the file
    os.rename(f, new_filename)
for f in os.listdir('.'):
    (base, ext) = f.rsplit('.', 1)
    new_name = f'{base.upper()}.{ext.lower()}'
    os.rename(f, new_name)
You can use split and join, see this example:
file_names = ["cat.PNG", "dog.PNG", "horse.PNG", "sheep.PNG"]
for file_name in file_names:
    name, extension = file_name.split('.')
    print('.'.join([name.upper(), extension.lower()]))
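Unpacking the result of split('.') into two names only works when a file name contains exactly one dot. A small sketch of the same idea built on os.path.splitext, which also copes with names such as "my.cat.PNG" or names without any extension (upper_name_lower_ext is a hypothetical helper name, not taken from the answers):

```python
import os


def upper_name_lower_ext(filename):
    """Upper-case the base name and lower-case the extension of one file name."""
    name, ext = os.path.splitext(filename)  # ext keeps its leading dot
    return name.upper() + ext.lower()


for file_name in ["cat.PNG", "my.cat.PNG", "README"]:
    print(file_name, "->", upper_name_lower_ext(file_name))
```

The function only builds the new name; passing its result to os.rename would apply it on disk.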
I am trying to create a dataset using pd.DataFrame to store file name and file extension of all the files in my directory. I eventually want to have two variables named Name and Extension. The name variable will have a list of file names and the extension variable should have a file type such as xlsx, and png.
I am new to python and was only able to get to this. This gives me a list of file names but I don't know how to incorporate the file extension part. Could anyone please help?
List = pd.DataFrame()
path = 'C:/Users/documnets/'
filelist = []
filepath = []
# r=root, d=directories, f = files
for subdir, dirs, files in os.walk(path):
    for file in files:
        filelist.append(file)
        filename, file_extension = os.path.splitext('/path/to/somefile.xlsx')
        filepath.append(file_extension)
List = pd.DataFrame(flielist, filepath)
Also, for this part: os.path.splitext('/path/to/somefile.xlsx'), can I leave what's in the parentheses as it is, or should I replace it with my directory path?
Thank you
You can do this:
import os
import pandas as pd
path = 'C:/Users/documnets/'
filename = []
fileext = []
for file in os.listdir(path):
    # pass each file name to splitext; it copes with several dots or no extension
    name, ext = os.path.splitext(file)
    filename.append(name)
    fileext.append(ext.lstrip('.'))
df = pd.DataFrame({"Name": filename, "Extension": fileext})
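The question walks subdirectories with os.walk, while os.listdir only looks at the top level. The following sketch collects (name, extension) pairs recursively and passes each real file name to os.path.splitext instead of the placeholder '/path/to/somefile.xlsx'. collect_names is a hypothetical helper name, and the pandas step is left out of the block so that it runs with the standard library alone.

```python
import os


def collect_names(path):
    """Return a list of (name, extension) tuples for every file under path."""
    rows = []
    for subdir, dirs, files in os.walk(path):
        for file in files:
            name, ext = os.path.splitext(file)  # split the actual file name
            rows.append((name, ext.lstrip(".")))  # "xlsx" rather than ".xlsx"
    return rows
```

The returned list can be handed to pandas as pd.DataFrame(rows, columns=["Name", "Extension"]).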
My question: Is there a way to load data from all files in a directory using Python
Input: Get all files in a given directory of mine (wow.txt, testting.txt,etc.)
Process: I want to run all the files through a def function
Output: I want the output to be all the file names, each followed by its content. For example:
/home/file/wow.txt
"all of its content"
/home/file/www.txt
"all of its content"
Here is my code:
# Import Functions
import os
import sys
# Define the file path
path="/home/my_files"
file_name="wow.txt"
#Load Data Function
def load_data(path,file_name):
    """
    Input : path and file_name
    Purpose: loading text file
    Output : list of paragraphs/documents and
    title(initial 100 words considered as title of document)
    """
    documents_list = []
    titles=[]
    with open( os.path.join(path, file_name) ,"rt", encoding='latin-1') as fin:
        for line in fin.readlines():
            text = line.strip()
            documents_list.append(text)
    print("Total Number of Documents:",len(documents_list))
    titles.append( text[0:min(len(text),100)] )
    return documents_list,titles
#Output
load_data(path,file_name)
Here is my output:
My problem is that my output only takes one file and shows its content. Obviously, I set the path and file name in my code to a single file, but I am confused about how to write the path so that all the files are loaded and each file's content is output separately. Any suggestions?
Using glob:
import glob
files = glob.glob("*.txt") # get all the .txt files
for file in files: # iterate over the list of files
    with open(file, "r") as fin: # open the file
        content = fin.read() # read the whole file
    # rest of the code
Using os.listdir():
import os
arr = os.listdir()
files = [x for x in arr if x.endswith('.txt')]
for file in files: # iterate over the list of files
    with open(file, "r") as fin: # open the file
        content = fin.read() # read the whole file
    # rest of the code
Try this:
import glob
for file in glob.glob("test/*.xyz"):
    print(file)
If my directory were named "test" and contained many .xyz files, this would print the path of each one.
You can use glob and pandas
import pandas as pd
import glob
path = r'some_directory' # use your path
all_files = glob.glob(path + "/*.txt")
li = []
for filename in all_files:
    # read file here
    # if you decide to use pandas you might need to use the 'sep' parameter as well
    df = pd.read_csv(filename, index_col=None, header=0)
    li.append(df)
# get it all together
frame = pd.concat(li, axis=0, ignore_index=True)
I will take advantage of the function you have already written, so use the following:
data = []
path="/home/my_files"
dirs = os.listdir( path )
for file in dirs:
    data.append(load_data(path, file))
In this case you will have all data in the list data.
You can use a for loop over the result of os.listdir:
os.listdir(<path of your directory>)
This gives you the names of the files in your directory, but it also includes the names of any subfolders in that directory.
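Because os.listdir also returns folder names, a loader probably wants to skip anything that is not a regular file. A minimal sketch under that assumption, producing the "file name, then content" output the question asks for. load_all and show are hypothetical helper names, and the latin-1 encoding is simply carried over from the question's load_data.

```python
import os


def load_all(path, suffix=".txt", encoding="latin-1"):
    """Map the full path of each matching regular file in path to its content."""
    contents = {}
    for entry in sorted(os.listdir(path)):
        full = os.path.join(path, entry)
        # os.path.isfile filters out subfolders, even ones named like "x.txt"
        if os.path.isfile(full) and entry.endswith(suffix):
            with open(full, "rt", encoding=encoding) as fin:
                contents[full] = fin.read()
    return contents


def show(contents):
    """Print each file path followed by that file's content."""
    for full, text in contents.items():
        print(full)
        print(text)
```

For the question's setup, show(load_all("/home/my_files")) would print every .txt file in that folder in alphabetical order.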
Try generating a file list first, then passing that to a modified version of your function.
def dir_recursive(dirName):
    import os
    import re
    fileList = list()
    for (dir, _, files) in os.walk(dirName):
        for f in files:
            path = os.path.join(dir, f)
            if os.path.exists(path):
                fileList.append(path)
    fList = list()
    prog = re.compile(r'\.txt$')  # escape the dot so only a literal '.txt' suffix matches
    for k in range(len(fileList)):
        binMatch = prog.search(fileList[k])
        if binMatch:
            fList.append(binMatch.string)
    return fList
def load_data2(file_list):
    documents_list = []
    titles=[]
    for file_path in file_list:
        with open( file_path ,"rt", encoding='latin-1') as fin:
            for line in fin.readlines():
                text = line.strip()
                documents_list.append(text)
    print("Total Number of Documents:",len(documents_list))
    titles.append( text[0:min(len(text),100)] )
    return documents_list,titles
# Generate a file list & load the data from it
file_list = dir_recursive(path)
documents_list, titles = load_data2(file_list)
I have some files in a folder with names like test_1999.0000_seconds.vtk. I would like to rename each such file to the form test_1999.0000.vtk.
You can use os.rename
os.rename("test_1999.0000_seconds.vtk", "test_1999.0000.vtk")
import os
currentPath = os.getcwd() # get the current working directory
unWantedString = "_seconds"
matchingFiles = []
for path, subdirs, files in os.walk(currentPath):
    for f in files:
        if f.endswith(".vtk"): # To group the vtk files
            matchingFiles.append(os.path.join(path, f)) # portable instead of a hard-coded "\\"
print(matchingFiles)
for mf in matchingFiles:
    if unWantedString in mf:
        oldName = mf
        newName = mf.replace(unWantedString, '') # remove the substring from the string
        os.rename(oldName, newName) # rename the old files with new name without the string
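One thing to watch: calling replace on a full path would also alter a directory whose name happens to contain the substring, and the rename would then point at a folder that does not exist. A small sketch that only touches the base name (strip_from_basename is a hypothetical helper name):

```python
import os


def strip_from_basename(path, unwanted="_seconds"):
    """Remove unwanted from the file name only, leaving the directory part alone."""
    folder, base = os.path.split(path)
    return os.path.join(folder, base.replace(unwanted, ""))


old = os.path.join("run_seconds", "test_1999.0000_seconds.vtk")
print(old, "->", strip_from_basename(old))
```

The result can be passed to os.rename as the new name in place of the plain replace call.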
I'm trying to iterate through a list of subfolders (each one is eachjpgfile in the code below) and rename the file called doc inside each subfolder to the subfolder's name, keeping the file where it is. Instead, the renamed file ends up in the directory above eachjpgfile rather than inside it. Looking at the code below, can you see why it is doing this and how I can keep the file in the eachjpgfile directory?
Here is the code:
for eachjpgfile in filelist:
    os.chdir(eachjpgfile)
    newdirectorypath = os.curdir
    list_of_files = os.listdir(newdirectorypath)
    for eachfile in list_of_files:
        onlyfilename = os.path.splitext(eachfile)[0]
        if onlyfilename == 'doc':
            newjpgfilename = eachfile.replace(onlyfilename,eachjpgfile)
            os.rename(eachfile, newjpgfilename)
There is a lot of weird stuff going on in here, but I think the one that's causing your particular issue is using 'eachjpgfile' in 'eachfile.replace'.
From what I can tell, the 'eachjpgfile' you're passing in is a full-path, so you're replacing 'doc' in the filename with '/full/path/to/eachjpgfile', which puts it parallel to the 'eachjpgfile' directory regardless of your current working directory.
You could add a line to split the path/file names prior to the replace:
for eachjpgfile in filelist:
    os.chdir(eachjpgfile)
    newdirectorypath = os.curdir
    list_of_files = os.listdir(newdirectorypath)
    for eachfile in list_of_files:
        onlyfilename = os.path.splitext(eachfile)[0]
        if onlyfilename == 'doc':
            root, pathName= os.path.split(eachjpgfile) #split out dir name
            newjpgfilename = eachfile.replace(onlyfilename,pathName)
            os.rename(eachfile, newjpgfilename)
which is a very dirty fix for a very dirty script. :)
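A variant that avoids os.chdir altogether may be easier to reason about, since every path is then built explicitly from the folder being processed. This is a sketch under the assumption that each entry in filelist is the path of a folder; rename_doc_files is a hypothetical helper name.

```python
import os


def rename_doc_files(folder, target="doc"):
    """Rename target.<ext> inside folder to <folder name>.<ext>, in place."""
    folder = os.path.normpath(folder)  # drops a trailing slash, if any
    new_base = os.path.basename(folder)
    renamed = []
    for entry in os.listdir(folder):
        base, ext = os.path.splitext(entry)
        if base == target:
            new_path = os.path.join(folder, new_base + ext)
            os.rename(os.path.join(folder, entry), new_path)
            renamed.append(new_path)
    return renamed
```

It would be called once per folder, for example inside the existing loop as rename_doc_files(eachjpgfile).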
try this:
import os
path = '.'
recursive = False # do not descend into subdirs
for root, dirs, files in os.walk( path ) :
    for name in files :
        new_name = name.replace( 'aaa', 'bbb' )
        if name != new_name :
            print(name, "->", new_name)
            os.rename( os.path.join( root, name),
                       os.path.join( root, new_name ) )
    if not recursive :
        break