Remove characters from every file name in a directory with Python

So I am writing a piece of code that needs to iterate through hundreds of files in a directory. For every file, it needs to filter out certain pieces of information and then put the result in a new file with a modified name.
For example, for files called 1100006_0.vcf and 5100164_12.vcf, it must create files called 1100006.vcf and 5100164.vcf respectively. Can you point me in the right direction for this?

EDIT: To make the code generic, so that it renames files in any directory regardless of where the script itself lives, try the following. I kept this program in /tmp and renamed the files inside /tmp/test, and it worked fine (on a Linux system).
#!/usr/bin/python3
import os

DIRNAME = "/tmp/test"
for f in os.listdir(DIRNAME):
    # only touch .vcf files that actually carry an underscore suffix
    if f.endswith('.vcf') and '_' in f:
        newname = f.split('_')[0] + '.vcf'
        # join with DIRNAME so the rename does not depend on the current working directory
        os.rename(os.path.join(DIRNAME, f), os.path.join(DIRNAME, newname))
Since you want to rename the files, we can use os here. Written and tested with the shown samples in Python 3. I have set DIRNAME to /tmp; replace it with the directory where you want to look for files.
#!/usr/bin/python3
import os

DIRNAME = "/tmp"
files = os.listdir(DIRNAME)
for f in files:
    if '.vcf' in f:
        newname = f.split('_')[0]
        newname = newname + '.vcf'
        # bare names resolve against the current working directory,
        # so this version only works when run from inside DIRNAME
        os.rename(f, newname)

As posted by RavinderSingh13, the code was fine. The only issue was that after renaming I would have ended up with two files of the same name (the only difference between them was the underscore and number that I needed removed).
#!/usr/bin/python3
import os

DIRNAME = "/tmp"
files = os.listdir(DIRNAME)
for f in files:
    if '.vcf' in f:
        newname = f.split('_')[0]
        newname = newname + '.vcf'
        os.rename(f, newname)
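The name collision described above can be detected before anything is renamed. The following is only a sketch under assumptions: the helper names plan_vcf_renames and rename_without_collisions are made up for illustration, and "leave colliding files untouched" is just one possible policy, not something the original posters specified.

```python
import os
from collections import defaultdict


def plan_vcf_renames(names):
    """Map each target name ('1100006.vcf') to the source names that would produce it."""
    groups = defaultdict(list)
    for name in sorted(names):
        stem, ext = os.path.splitext(name)
        if ext == ".vcf" and "_" in stem:
            groups[stem.split("_")[0] + ".vcf"].append(name)
    return dict(groups)


def rename_without_collisions(dirname):
    """Rename only when exactly one source maps to a target that does not exist yet.

    Returns the skipped groups so they can be reviewed by hand.
    """
    skipped = {}
    for target, sources in plan_vcf_renames(os.listdir(dirname)).items():
        target_path = os.path.join(dirname, target)
        if len(sources) > 1 or os.path.exists(target_path):
            skipped[target] = sources  # assumption: colliding files are left as they are
            continue
        os.rename(os.path.join(dirname, sources[0]), target_path)
    return skipped
```

Calling rename_without_collisions("/tmp/test") would rename the unambiguous files and return a dictionary of the groups that still need a decision.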

Related

Change multiple file names in Python

I am having trouble renaming files manually.
I have a folder with lots of files with names like
202012_34324_3643.txt
202012_89543_0292.txt
202012_01920_1922.txt
202012_23442_0928.txt
202012_21346_0202.txt
I want them renamed as below, removing the numbers before the first underscore and after the second underscore, and keeping only the number in between.
34324.txt
89543.txt
01920.txt
23442.txt
21346.txt
I want a script that reads all files in the folder and renames them as shown above.
Thanks
You could try using the os library in Python.
import os

# retrieve the matching files in the current directory
# (skip anything that is not of the form a_b_c.txt, such as this script itself)
fnames = [f for f in os.listdir() if f.endswith('.txt') and f.count('_') == 2]
# split each name by '_' and take the middle field
new_names = [fname.split('_')[1] + '.txt' for fname in fnames]
for oldname, newname in zip(fnames, new_names):
    os.rename(oldname, newname)
This will do the work for the current directory.
import os

fnames = os.listdir()
for oldName in fnames:
    # only rename .txt files that contain exactly two underscores
    if oldName.endswith('.txt') and oldName.count('_') == 2:
        s = oldName.split('_')
        # s[1] is the middle field; s[2] already ends in '.txt' and is dropped
        os.rename(oldName, s[1] + '.txt')
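As an alternative sketch, the middle field can be pulled out with a regular expression, which also rejects names that do not have the expected shape. The pattern below is an assumption inferred from the five sample names (three groups of digits separated by underscores), and middle_name and rename_all are hypothetical helper names.

```python
import os
import re

# Assumed shape: digits, underscore, digits, underscore, digits, ".txt"
PATTERN = re.compile(r"^\d+_(\d+)_\d+\.txt$")


def middle_name(filename):
    """Return '<middle>.txt' for names like '202012_34324_3643.txt', else None."""
    match = PATTERN.match(filename)
    return match.group(1) + ".txt" if match else None


def rename_all(dirname="."):
    """Rename every matching file in dirname, leaving everything else alone."""
    for name in os.listdir(dirname):
        new_name = middle_name(name)
        if new_name is not None:
            os.rename(os.path.join(dirname, name), os.path.join(dirname, new_name))
```

Because middle_name returns None for anything it does not recognise, running rename_all a second time leaves the already renamed files alone.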

Renaming multiple files

So I'm trying to rename a list of files with a fixed set of replacements, like so:
import os
import time
for fileName in os.listdir("."):
    os.rename(fileName, fileName.replace("0001", "00016.5"))
    os.rename(fileName, fileName.replace("0002", "00041"))
    os.rename(fileName, fileName.replace("0003", "00042"))
    ...
but that gives me this error:
os.rename(fileName, fileName.replace("0002", "00041"))
OSError: [Errno 2] No such file or directory
(the file is in the directory)
So next I tried
import os
import time
for fileName in os.listdir("."):
    os.rename(fileName, fileName.replace("0001", "00016.5"))
for fileName in os.listdir("."):
    os.rename(fileName, fileName.replace("0002", "00041"))
for fileName in os.listdir("."):
    os.rename(fileName, fileName.replace("0003", "00042"))
...
But that renames the files very strangely, with a lot of extra characters.
What am I doing wrong here?
The fact that multi-pass renaming works while single pass renaming doesn't means that some of your files contain the 0001 pattern as well as 0002 pattern.
So when doing only one loop, you're renaming files but you're given the old list of files (listdir returns a list, so it's outdated as soon as you rename a file) => some source files cannot be found.
When doing in multi-pass, you're applying multiple renames on some files.
That could work (and is more compact):
import os

for fileName in os.listdir("."):
    for before, after in (("0001", "00016.5"), ("0002", "00041"), ("0003", "00042")):
        if os.path.exists(fileName):
            newName = fileName.replace(before, after)
            # file hasn't been renamed: rename it (only if different)
            if newName != fileName:
                os.rename(fileName, newName)
Basically, I won't rename a file if it doesn't exist (which means it has been renamed in a previous iteration). So there's only one renaming possible per file. You just have to prioritize which one.
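The "only one renaming per file" idea can also be written as a pure function that computes the new name once, which makes it easy to check before touching the disk. This is a sketch, not the answerer's code: first_matching_rename and rename_once are hypothetical names, and the rule order is assumed to be the priority order.

```python
import os

RULES = (("0001", "00016.5"), ("0002", "00041"), ("0003", "00042"))


def first_matching_rename(name, rules=RULES):
    """Apply only the first rule whose 'before' text occurs in name."""
    for before, after in rules:
        if before in name:
            return name.replace(before, after)
    return name


def rename_once(dirname="."):
    """Rename each file at most once, based on a single directory snapshot."""
    for name in os.listdir(dirname):
        new_name = first_matching_rename(name)
        if new_name != name:
            os.rename(os.path.join(dirname, name), os.path.join(dirname, new_name))
```

Note that "00016.5" itself contains "0001", so running rename_once a second time would rename those files again. That is one way the extra characters from the multi-pass attempt can arise.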
listdir returns the names of all objects (files, directories, ...), not full paths. You can construct a full path using os.path.join().
Your for loop tries to rename every object it finds, first with the 00016.5 replacement, then with 00041, and so on.
One way to rename the files could be the following:
import os

# directory that contains this script
currentDir = os.path.dirname(os.path.abspath(__file__))
for fileName in os.listdir(currentDir):
    if '0001' in fileName:
        oldPath = os.path.join(currentDir, fileName)
        newPath = os.path.join(currentDir, fileName.replace("0001", "00016.5"))
    elif '0002' in fileName:
        oldPath = os.path.join(currentDir, fileName)
        newPath = os.path.join(currentDir, fileName.replace("0002", "00041"))
    else:
        continue
    os.rename(oldPath, newPath)

Keeping renamed text files in original folder

This is my current (from a Jupyter notebook) code for renaming some text files.
The issue is when I run the code, the renamed files are placed in my current working Jupyter folder. I would like the files to stay in the original folder
import glob
import os
path = r'C:\data_research\text_test\*.txt'
files = glob.glob(r'C:\data_research\text_test\*.txt')
for file in files:
    os.rename(file, file[-27:])
You should only change the name and keep the path the same. Your file names will not always be longer than 27 characters, so hard-coding this into your code is not ideal. What you want is something that separates the name from the path, no matter the name, no matter the path. Something like:
import os
import glob

# raw string, so the backslashes (including \t) are not read as escape sequences
path = r'C:\data_research\text_test\*.txt'
files = glob.glob(path)
for file in files:
    old_name = os.path.basename(file)  # now this is just the name of your file
    # now you can do something with the name... here I'll just add new_ to it.
    new_name = 'new_' + old_name  # or do something else with it
    new_file = os.path.join(os.path.dirname(file), new_name)  # now we put the path and the name together again
    os.rename(file, new_file)  # and now we rename.
On Windows, os.path already uses ntpath, so nothing needs to change there; importing ntpath directly is only useful when you handle Windows-style paths on another operating system.
file[-27:] takes the last 27 characters of the full path, so unless every file name is exactly 27 characters long, it produces the wrong name. Even when it does succeed, you've stripped off the target directory, so the file is moved to your current directory. os.path has utilities to manage file names and you should use them:
import glob
import os

path = r'C:\data_research\text_test\*.txt'
files = glob.glob(path)
for file in files:
    dirname, basename = os.path.split(file)
    # I don't know how you want to rename so I made something up
    newname = basename + '.bak'
    os.rename(file, os.path.join(dirname, newname))
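The same "keep the directory, change only the name" step can be sketched with pathlib, where Path.with_name swaps the final path component and leaves the parent untouched. rename_in_place is a hypothetical helper name, and how the new name is derived is left to the caller.

```python
from pathlib import Path


def rename_in_place(file_path, new_name):
    """Rename file_path to new_name inside the same directory and return the new path."""
    source = Path(file_path)
    target = source.with_name(new_name)  # same parent directory, different file name
    source.rename(target)
    return target
```

For example, rename_in_place(file, 'new_' + Path(file).name) would reproduce the first answer's renaming for one file from the glob result.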

How to rename the file extension, by removing archive dates

I expected this code to take all the files in the folder and rename .pdf_(date) to .pdf. However, it does not.
import os,sys
folder = 'C:\/MattCole\/test'
for filename in os.listdir(folder):
    infilename = os.path.join(folder,filename)
    if not os.path.isfile(infilename): continue
    oldbase = os.path.splitext(filename)
    newname = infilename.replace('.pdf*', '.pdf')
    output = os.rename(infilename, newname)
Example: file1.pdf_20160614-050421 should be renamed to file1.pdf
There would be multiple files in the directory. Can someone tell me what I am doing wrong? I have also tried counting the characters in the extension and using '.pdf????????????', '.pdf'
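This is a bit silly: you've got some perfectly good code here (os.path.splitext) that you're not using. You should use it. Note also that str.replace matches literal text, not wildcards, so '.pdf*' never matches anything.
import os,sys
folder = r'C:\/MattCole\/test'
for filename in os.listdir(folder):
    infilename = os.path.join(folder,filename)
    if os.path.isfile(infilename):
        # for 'file1.pdf_20160614-050421' this gives ('...file1', '.pdf_20160614-050421')
        oldbase, oldext = os.path.splitext(infilename)
        if oldext.startswith('.pdf'):
            output = os.rename(infilename, oldbase+'.pdf')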
You want to split the old file name on _, then take the first part as new name:
>>> old_name = 'file1.pdf_20160614-050421'
>>> new_name = old_name.split('_')[0]
>>> new_name
'file1.pdf'
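Splitting on the first underscore cuts too early when the base name itself contains an underscore. A possible variation is to strip only the trailing date stamp with a regular expression. The pattern assumes the stamp always looks like the single example given (eight digits, a hyphen, six digits), and strip_archive_suffix is a hypothetical name.

```python
import re

# Assumed stamp format, inferred from 'file1.pdf_20160614-050421'
ARCHIVE_SUFFIX = re.compile(r"(\.pdf)_\d{8}-\d{6}$")


def strip_archive_suffix(name):
    """Turn 'x.pdf_YYYYMMDD-HHMMSS' into 'x.pdf'; return other names unchanged."""
    return ARCHIVE_SUFFIX.sub(r"\1", name)
```

The result can then be passed to os.rename together with the folder, as in the answer above.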

Renaming multiple files in a directory using Python

I'm trying to rename multiple files in a directory using this Python script:
import os
path = '/Users/myName/Desktop/directory'
files = os.listdir(path)
i = 1
for file in files:
    os.rename(file, str(i)+'.jpg')
    i = i+1
When I run this script, I get the following error:
Traceback (most recent call last):
  File "rename.py", line 7, in <module>
    os.rename(file, str(i)+'.jpg')
OSError: [Errno 2] No such file or directory
Why is that? How can I solve this issue?
Thanks.
You are not giving the whole path while renaming; do it like this:
import os
path = '/Users/myName/Desktop/directory'
files = os.listdir(path)
for index, file in enumerate(files):
    os.rename(os.path.join(path, file), os.path.join(path, ''.join([str(index), '.jpg'])))
Edit: Thanks to tavo. The first solution would have moved the file to the current directory; that is fixed now.
You have to make this path the current working directory first. The rest of the code has no errors.
To make it the current working directory:
os.chdir(path)
import os
from os import path
import shutil

# raw strings keep the backslashes in the Windows paths literal
Source_Path = r'E:\Binayak\deep_learning\Datasets\Class_2'
Destination = r'E:\Binayak\deep_learning\Datasets\Class_2_Dest'
#dst_folder = os.mkdir(Destination)

def main():
    for count, filename in enumerate(os.listdir(Source_Path)):
        dst = "Class_2_" + str(count) + ".jpg"
        # rename all the files, moving them into the destination folder
        os.rename(os.path.join(Source_Path, filename), os.path.join(Destination, dst))

# Driver Code
if __name__ == '__main__':
    main()
As per @daniel's comment, os.listdir() returns just the file names and not the full paths of the files. Use os.path.join(path, file) to get the full path and rename that.
import os
path = 'C:\\Users\\Admin\\Desktop\\Jayesh'
files = os.listdir(path)
for file in files:
    os.rename(os.path.join(path, file), os.path.join(path, 'xyz_' + file + '.csv'))
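Since os.listdir() yields bare names, it can help to build the full (old, new) path pairs first and inspect them before renaming anything. The sketch below is a dry run under assumptions: plan_numbered_renames is a hypothetical helper, files are numbered from 1 in sorted name order, and subdirectories are skipped.

```python
import os


def plan_numbered_renames(dirname, extension=".jpg"):
    """Return (old_path, new_path) pairs for the regular files in dirname.

    Nothing on disk is changed; the caller decides whether to apply the plan.
    """
    with os.scandir(dirname) as entries:
        names = sorted(entry.name for entry in entries if entry.is_file())
    return [
        (os.path.join(dirname, name), os.path.join(dirname, f"{index}{extension}"))
        for index, name in enumerate(names, start=1)
    ]
```

Printing the pairs shows exactly which full paths would be handed to os.rename, which makes a missing directory prefix easy to spot.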
Just playing with the accepted answer: define the path variable and the file list:
import os
path = "/Your/path/to/folder/"
files = os.listdir(path)
and then loop over that list:
for index, file in enumerate(files):
    #print (file)
    os.rename(path+file, path +'file_' + str(index)+ '.jpg')
or loop the same way in one line, as a Python list comprehension:
[os.rename(path+file, path +'jog_' + str(index)+ '.jpg') for index, file in enumerate(files)]
I think the first is more readable; the second just moves the body of the loop to the front of the list comprehension.
If your files are being renamed in a seemingly random order, sort the directory listing first. The code below sorts the files and then renames them.
import os
import re

path = 'target_folder_directory'
files = os.listdir(path)
# natural sort: digit runs compare as integers, so '2.jpg' comes before '10.jpg'
files.sort(key=lambda var:[int(x) if x.isdigit() else x for x in re.findall(r'[^0-9]|[0-9]+', var)])
for i, file in enumerate(files):
    # os.path.join supplies the separator that plain concatenation would miss
    os.rename(os.path.join(path, file), os.path.join(path, "{}.jpg".format(i)))
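The sort key can be pulled out into a small named function, which makes it testable on its own. This is a sketch rather than the answer's exact key: natural_key is a hypothetical name, and it uses re.split with a capturing group so that text and number chunks always alternate in the same positions. That avoids comparing an int with a str when one name starts with a digit and another with a letter, which raises TypeError in Python 3.

```python
import re


def natural_key(name):
    """Split name into text and integer chunks so digit runs sort numerically."""
    # re.split with a capturing group keeps the digit runs at the odd positions
    return [
        int(chunk) if chunk.isdigit() else chunk.lower()
        for chunk in re.split(r"(\d+)", name)
    ]
```

It is used as files.sort(key=natural_key) in place of the lambda.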
I wrote a quick and flexible script for renaming files, if you want a working solution without reinventing the wheel.
It renames files in the current directory based on replacement functions that you pass to it.
Each function specifies a change you want applied to all the matching file names. The code determines the changes that would be made, displays the resulting differences using colors, and asks for confirmation before performing them.
You can find the source code here, and place it in the folder whose files you want to rename: https://gist.github.com/aljgom/81e8e4ca9584b481523271b8725448b8
It works in PyCharm; I haven't tested it in other consoles.
After you define a few replacement functions, it runs the first one, shows all the differences for the matching files in the directory, and lets you confirm or decline the replacements.
This works for me, and by starting the index at 1 we can number the dataset.
import os
path = '/Users/myName/Desktop/directory'
files = os.listdir(path)
# start=1 makes the numbering begin at 1 instead of 0
for index, file in enumerate(files, start=1):
    os.rename(os.path.join(path, file), os.path.join(path, ''.join([str(index), '.jpg'])))
But if your current image names already start with a number, this will not work.
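The problem with names that are already numbers is that a new name such as 1.jpg may belong to a file that has not been processed yet; os.rename then either overwrites it (typical on POSIX systems) or raises an error (on Windows). One possible workaround, sketched here under assumptions, is to rename in two phases through temporary names. renumber_safely is a hypothetical helper, and the temporary names use a random uuid prefix on the assumption that no existing file starts with it.

```python
import os
import uuid


def renumber_safely(dirname, extension=".jpg"):
    """Renumber the files in dirname as 1.jpg, 2.jpg, ... without clobbering any of them."""
    names = sorted(
        name for name in os.listdir(dirname)
        if os.path.isfile(os.path.join(dirname, name))
    )
    tag = uuid.uuid4().hex  # random prefix, assumed not to clash with existing names
    staged = []
    # Phase 1: move every file to a unique temporary name
    for index, name in enumerate(names, start=1):
        temp_path = os.path.join(dirname, f"{tag}_{index}.tmp")
        os.rename(os.path.join(dirname, name), temp_path)
        staged.append((temp_path, os.path.join(dirname, f"{index}{extension}")))
    # Phase 2: every final name is free now, so assign them
    for temp_path, final_path in staged:
        os.rename(temp_path, final_path)
    return [os.path.basename(final_path) for _, final_path in staged]
```

The numbering follows plain sorted order here; a natural sort key could be passed to sorted() if numeric order matters.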
