Issues getting file extensions using os.path.splitext() method - python

I am working on a Flask blog application, and I am trying to separate a file extension from a filename. For example, for the filename "IMG_0503.jpg", I would like to get "IMG_0503" and ".jpg" separately. I tried the pathlib module and the os.path.splitext() method, but both return an empty file extension instead of ".jpg".
Here is my code with os.path.splitext():
uploaded_file = request.files['file']
print("UPLOADED_FILE:", uploaded_file)
filename = secure_filename(uploaded_file.filename)
print("FILENAME:", filename)
filename = os.path.splitext(filename)[0]
file_ext = os.path.splitext(filename)[1]
print("OS.PATH.SPLITEXT:", os.path.splitext(filename))
print("FILE_EXT:", file_ext)
This is the output of print statements:
UPLOADED_FILE: <FileStorage: 'IMG_0503.jpg' ('image/jpeg')>
FILENAME: IMG_0503.jpg
OS.PATH.SPLITEXT: ('IMG_0503', '')
FILE_EXT:
With pathlib, the output is exactly the same; there I use file_ext = pathlib.Path(filename).suffix to get the extension.
Can someone please help me to resolve this issue?

Alternate solution:
import pathlib
# get the file extension, including the leading dot
file_extension = pathlib.Path('my_file.txt').suffix
print("File Extension:", file_extension)
Output:
File Extension: .txt
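Neither snippet above explains the empty extension in the question. It appears to come from the order of the lines rather than from os.path.splitext() itself: filename is overwritten with the stem ('IMG_0503') before the extension is read, so the second call has no dot left to split on. A minimal sketch of a fix, using a hypothetical split_upload_name helper and a plain string in place of the Flask upload:

```python
import os


def split_upload_name(filename):
    """Return (stem, extension) from a single splitext() call."""
    # Unpack both parts at once so the original name is never overwritten
    # before the extension has been read.
    stem, file_ext = os.path.splitext(filename)
    return stem, file_ext


print(split_upload_name("IMG_0503.jpg"))  # ('IMG_0503', '.jpg')
```

In the original code, swapping the two splitext() lines, or unpacking both values in one assignment as here, should give the expected ".jpg".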

Related

How to only get files without extensions?

My problem is to get ONLY files without extensions.
I mean: I have a directory that contains some files without extensions and some files with extensions (.xml, .csv, etc.).
I want my code to read only the files without extensions.
Right now, it reads every file in the directory "Dir".
path = 'C:/Users/STJ2TW/Desktop/Dir/'
for filename in os.listdir(path):
    fullname = os.path.join(path, filename)
Thanks in advance!
You can split the filename using the os.path.splitext() function and keep only the entries that are not directories and have an empty extension (ext).
import os
path = os.getcwd()
for filename in os.listdir(path):
    if not os.path.isdir(os.path.join(path, filename)):
        (name, ext) = os.path.splitext(filename)
        if not ext:
            pass  # Your code here
If there are no dots in your file names, you can do:
path = 'C:/Users/STJ2TW/Desktop/Dir/'
for filename in os.listdir(path):
    if '.' not in filename:
        fullname = os.path.join(path, filename)
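For comparison, the same filter can be sketched with pathlib. The files_without_extension helper and the file names in the demo are made up for illustration, and the demo runs in a temporary directory rather than the asker's "Dir" folder:

```python
import tempfile
from pathlib import Path


def files_without_extension(directory):
    """Return sorted names of regular files in `directory` that have no suffix."""
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if entry.is_file() and entry.suffix == ""
    )


# Demo in a throwaway directory so nothing else on disk is touched.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("README", "data.csv", "notes.xml", ".hidden"):
        (Path(tmp) / name).touch()
    (Path(tmp) / "subdir").mkdir()
    print(files_without_extension(tmp))  # ['.hidden', 'README']
```

Note that a dotfile such as '.hidden' counts as having no extension here, which matches os.path.splitext() but differs from the "'.' not in filename" test.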

How to rename the file extension, by removing archive dates

I am thinking this code should take all my files within the folder, and rename .pdf_(date) to .pdf. However, it is not.
import os,sys
folder = 'C:/MattCole/test'
for filename in os.listdir(folder):
    infilename = os.path.join(folder,filename)
    if not os.path.isfile(infilename): continue
    oldbase = os.path.splitext(filename)
    newname = infilename.replace('.pdf*', '.pdf')
    output = os.rename(infilename, newname)
Example: file1.pdf_20160614-050421 renamed to file1.pdf
There would be multiple files in the directory. Can someone tell me what I am doing wrong? I have also tried counting the extension and used '.pdf????????????', '.pdf'
This is a bit silly: you've got some perfectly good code here that you're not using. You should use it. (str.replace() does not understand wildcards; it looks for the literal text '.pdf*', which never occurs in your file names.)
import os
folder = 'C:/MattCole/test'
for filename in os.listdir(folder):
    infilename = os.path.join(folder, filename)
    if os.path.isfile(infilename):
        # oldext is everything from the last dot, e.g. '.pdf_20160614-050421'
        oldbase, oldext = os.path.splitext(infilename)
        if oldext.startswith('.pdf'):
            os.rename(infilename, oldbase + '.pdf')
You want to split the old file name on _, then take the first part as new name:
>>> old_name = 'file1.pdf_20160614-050421'
>>> new_name = old_name.split('_')[0]
>>> new_name
'file1.pdf'
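One caveat with splitting on '_': a name that already contains an underscore before the extension, such as 'my_report.pdf_20160614-050421', would be cut down to 'my'. A small sketch that works from the extension instead (strip_archive_suffix is a hypothetical helper, and the second and third file names are invented for the example):

```python
import os


def strip_archive_suffix(filename):
    """Return `filename` with a '.pdf_<stamp>' extension reduced to '.pdf'.

    Names whose extension does not start with '.pdf' are returned unchanged.
    """
    base, ext = os.path.splitext(filename)
    if ext.startswith(".pdf"):
        return base + ".pdf"
    return filename


print(strip_archive_suffix("file1.pdf_20160614-050421"))      # file1.pdf
print(strip_archive_suffix("my_report.pdf_20160614-050421"))  # my_report.pdf
print(strip_archive_suffix("notes.txt"))                      # notes.txt
```

The result can then be passed to os.rename() together with the full source path.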

How to change multiple filenames in a directory using Python

I am learning Python and I have been tasked with:
adding "file_" to the beginning of each name in a directory
changing the extension (directory contains 4 different types currently: .py, .TEXT, .rtf, .text)
I have many files, all with different names, each 7 characters long. I was able to change the extensions, but it feels very clunky. I am positive there is a cleaner way to write the following (but it's functioning, so no complaints on that note):
import os, sys
path = 'C:/Users/dana/Desktop/text_files_2/'
for filename in os.listdir(path):
    if filename.endswith('.rtf'):
        newname = filename.replace('.rtf', '.txt')
        os.rename(filename, newname)
    elif filename.endswith('.py'):
        newname = filename.replace('.py', '.txt')
        os.rename(filename, newname)
    elif filename.endswith('.TEXT'):
        newname = filename.replace('.TEXT', '.txt')
        os.rename(filename, newname)
    elif filename.endswith('.text'):
        newname = filename.replace('.text', '.txt')
        os.rename(filename, newname)
I do still have a bit of a problem:
the script currently must be inside my directory for it to run.
I can not figure out how to add "file_" to the start of each of the filenames [you would think that would be the easy part]. I have tried declaring newname as
newname = 'file_' + str(filename)
it then states filename is undefined.
Any assistance on my two existing issues would be greatly appreciated.
The basic idea is to first split the name into the file extension and the real file name, then build the new name from those parts.
os.path.splitext(p) returns the two parts as a tuple. For example, os.path.splitext('hello.world.aaa.txt') returns ('hello.world.aaa', '.txt'): only the last dot counts, and leading dots are ignored (so '.bashrc' has no extension).
So in this case, it can be done like this:
import os
import sys
path = 'C:/Users/dana/Desktop/text_files_2/'
for filename in os.listdir(path):
    filename_splitext = os.path.splitext(filename)
    if filename_splitext[1] in ['.rtf', '.py', '.TEXT', '.text']:
        os.rename(os.path.join(path, filename),
                  os.path.join(path, 'file_' + filename_splitext[0] + '.txt'))
Supply the full path name with os.path.join():
os.rename(os.path.join(path, filename), os.path.join(path, newname))
and you can run your program from any directory.
You can further simplify your program:
extensions = ['.rtf', '.py', '.TEXT', '.text']
for filename in os.listdir(path):
    for extension in extensions:
        if filename.endswith(extension):
            newname = filename.replace(extension, '.txt')
            os.rename(os.path.join(path, filename), os.path.join(path, newname))
            break
All the other elif statements are not needed anymore.
import glob, os
path = 'test/'  # your path
extensions = ['.rtf', '.py', '.TEXT', '.text']
for file in glob.glob(os.path.join(path, '*.*')):
    file_path, extension = os.path.splitext(file)
    if extension in extensions:
        new_file_name = '{0}.txt'.format(
            os.path.basename(file_path)
        )
        if not new_file_name.startswith('file_'):  # check if the file already has 'file_' at the beginning
            new_file_name = 'file_{0}'.format(  # if not, add it
                new_file_name
            )
        new_file = os.path.join(
            os.path.dirname(file_path),
            new_file_name
        )
        os.rename(file, new_file)
file_path, extension = os.path.splitext(file) gets the file path without the extension, plus the extension, e.g. ('dir_name/file_name_without_extension', '.extension').
os.path.dirname(file_path) gets the directory, e.g. if file_path is dir1/dir2/file.ext, the result is 'dir1/dir2'.
os.path.basename(file_path) gets the file name without the extension.
import os
path = 'data'  # Initial path
def changeFileName(path, oldExtensions, newExtension):
    for name in os.listdir(path):
        for oldExtension in oldExtensions:
            if name.endswith(oldExtension):
                name = os.path.join(path, name)
                newName = name.replace(oldExtension, newExtension)
                os.rename(name, newName)
                break
if __name__ == "__main__":
    changeFileName(path, ['.py', '.rtf', '.text', '.TEXT'], '.txt')
Use a list to store all the old extensions and iterate through them.
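Several of the answers above build the new name with str.replace(), which replaces every occurrence of the extension text, not just the trailing one (for example, 'my.python.py'.replace('.py', '.txt') gives 'my.txtthon.txt'). A sketch that computes the new name from os.path.splitext() instead, covering both the "file_" prefix and the extension change; build_new_name, OLD_EXTENSIONS, and the sample names are assumptions made for this example:

```python
import os

OLD_EXTENSIONS = {".rtf", ".py", ".TEXT", ".text"}


def build_new_name(filename, prefix="file_", new_ext=".txt"):
    """Return the renamed form of `filename`, or None if it should be skipped."""
    stem, ext = os.path.splitext(filename)
    if ext not in OLD_EXTENSIONS:
        return None
    # Only the trailing extension is swapped, so '.py' in the middle of
    # a name such as 'my.python.py' is left alone.
    if not stem.startswith(prefix):
        stem = prefix + stem
    return stem + new_ext


for name in ("abcdefg.rtf", "my.python.py", "file_done.text", "image.png"):
    print(name, "->", build_new_name(name))
```

Inside a loop over os.listdir(path), a non-None result would be passed to os.rename() with both names joined to path, as the answers above suggest.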

Delete keywords from filenames in python

I am fairly new to programming and want to delete a certain keyword, such as 'website.com', from all the filenames in a folder by looping through them and searching for the keyword. Please help. Thanks in advance!
This is some code I have written so far to loop through the files.
import os
rootdir = r'C:\Users\Hemant\Desktop\testfiles'
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        print(os.path.join(subdir, file))
Update:
Thanks to kponz, my updated code is:
import os
rootdir = r'C:\Users\Hemant\Desktop\myfiles'
str = " text"
for filename in os.listdir(rootdir):
    if str in filename:
        os.rename(filename, filename.replace(str, ""))
    else:
        continue
But now I am getting the following error
os.rename(filename, filename.replace(str, ""))
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'even more text.txt' -> 'even more.txt'
I am trying to delete the word 'text' from files named:
even more text.txt
some text.txt
much more text.txt
To replace a specific keyword, you can just use the string replace function:
import os
rootdir = r'C:\Users\Hemant\Desktop\testfiles'
keyword = "example.com"  # not named 'str', which would shadow the built-in type
for filename in os.listdir(rootdir):
    if keyword in filename:
        filepath = os.path.join(rootdir, filename)
        newfilepath = os.path.join(rootdir, filename.replace(keyword, ""))
        os.rename(filepath, newfilepath)
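The FileNotFoundError in the update most likely comes from passing bare names to os.rename(): os.listdir() returns names without the directory, so they are resolved against the current working directory rather than rootdir. A self-contained sketch that reproduces the fix in a temporary directory, using the file names from the question (remove_keyword is a hypothetical helper):

```python
import os
import tempfile


def remove_keyword(rootdir, keyword):
    """Rename every file in `rootdir` whose name contains `keyword`."""
    for filename in os.listdir(rootdir):
        if keyword in filename:
            # os.listdir() yields bare names, so build full paths for both
            # the source and the destination before renaming.
            old_path = os.path.join(rootdir, filename)
            new_path = os.path.join(rootdir, filename.replace(keyword, ""))
            os.rename(old_path, new_path)


# Demo in a temporary directory using the file names from the question.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("even more text.txt", "some text.txt", "much more text.txt"):
        open(os.path.join(tmp, name), "w").close()
    remove_keyword(tmp, " text")
    print(sorted(os.listdir(tmp)))  # ['even more.txt', 'much more.txt', 'some.txt']
```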

How can I check the extension of a file?

I'm working on a certain program where I need to do different things depending on the extension of the file. Could I just use this?
if m == *.mp3
...
elif m == *.flac
...
Assuming m is a string, you can use endswith:
if m.endswith('.mp3'):
    ...
elif m.endswith('.flac'):
    ...
To be case-insensitive, and to eliminate a potentially large else-if chain:
m.lower().endswith(('.png', '.jpg', '.jpeg'))
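As a quick illustration of the one-liner above: str.endswith() accepts a tuple of suffixes and returns True if any of them matches. The is_image name and the sample file names are made up for this sketch:

```python
def is_image(filename):
    """Case-insensitive check against several extensions at once."""
    # str.endswith() accepts a tuple and returns True if any entry matches.
    return filename.lower().endswith((".png", ".jpg", ".jpeg"))


print(is_image("IMG_0503.JPG"))  # True
print(is_image("photo.jpeg"))    # True
print(is_image("song.mp3"))      # False
```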
os.path provides many functions for manipulating paths/filenames.
os.path.splitext takes a path and splits the file extension from the end of it.
import os
filepaths = ["/folder/soundfile.mp3", "folder1/folder/soundfile.flac"]
for fp in filepaths:
    # Split the extension from the path and normalise it to lowercase.
    ext = os.path.splitext(fp)[-1].lower()
    # Now we can simply use == to check for equality, no need for wildcards.
    if ext == ".mp3":
        print(fp, "is an mp3!")
    elif ext == ".flac":
        print(fp, "is a flac file!")
    else:
        print(fp, "is an unknown file format.")
Gives:
/folder/soundfile.mp3 is an mp3!
folder1/folder/soundfile.flac is a flac file!
Use pathlib from Python 3.4 onwards.
from pathlib import Path
Path('my_file.mp3').suffix == '.mp3'
If you are working with folders that contain periods, you can perform an extra check using
Path('your_folder.mp3').is_file() and Path('your_folder.mp3').suffix == '.mp3'
to ensure that a folder with a .mp3 suffix is not interpreted to be an mp3 file.
Look at the fnmatch module. That will do what you're trying to do.
import fnmatch
import os
for file in os.listdir('.'):
    if fnmatch.fnmatch(file, '*.txt'):
        print(file)
or perhaps:
from glob import glob
...
for files in glob('path/*.mp3'):
    ...  # do something
for files in glob('path/*.flac'):
    ...  # do something else
One easy way could be:
import os
if os.path.splitext(file)[1] == ".mp3":
    pass  # do something
os.path.splitext(file) will return a tuple with two values (the filename without extension + just the extension). The second index ([1]) will therefore give you just the extension. The cool thing is that this way you can also access the filename pretty easily, if needed!
An old thread, but this may help future readers...
I would avoid using .lower() on filenames, if for no other reason than to make your code more platform independent. (Linux is case sensitive; calling .lower() on a filename will surely corrupt your logic eventually ...or worse, an important file!)
Why not use re? (Although to be even more robust, you should check the magic file header of each file; see "How to check type of files without extensions in python?")
import re
def checkext(fname):
    # Raw strings keep the backslash in '\.' from being read as a string escape.
    if re.search(r'\.mp3$', fname, flags=re.IGNORECASE):
        return 'mp3'
    if re.search(r'\.flac$', fname, flags=re.IGNORECASE):
        return 'flac'
    return 'skip'
flist = ['myfile.mp3', 'myfile.MP3', 'myfile.mP3', 'myfile.mp4', 'myfile.flack', 'myfile.FLAC',
         'myfile.Mov', 'myfile.fLaC']
for f in flist:
    print("{} ==> {}".format(f, checkext(f)))
Output:
myfile.mp3 ==> mp3
myfile.MP3 ==> mp3
myfile.mP3 ==> mp3
myfile.mp4 ==> skip
myfile.flack ==> skip
myfile.FLAC ==> flac
myfile.Mov ==> skip
myfile.fLaC ==> flac
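The "magic file header" idea mentioned above can be sketched as follows. This is only an illustration under stated assumptions: native FLAC files begin with the bytes b"fLaC", and MP3 files that carry an ID3v2 tag begin with b"ID3", but an MP3 without that tag would not be recognised here. MAGIC_PREFIXES and sniff_type are hypothetical names, and the sample file is generated in a temporary directory:

```python
import os
import tempfile

# Illustrative, incomplete table of leading bytes (see the caveats above).
MAGIC_PREFIXES = {
    b"fLaC": "flac",
    b"ID3": "mp3",
}


def sniff_type(path):
    """Guess a file type from its first bytes, ignoring the file name."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    for prefix, kind in MAGIC_PREFIXES.items():
        if head.startswith(prefix):
            return kind
    return "skip"


# Demo: a file with no extension whose content starts with the FLAC marker.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "track_without_extension")
    with open(sample, "wb") as handle:
        handle.write(b"fLaC" + b"\x00" * 16)
    print(sniff_type(sample))  # flac
```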
You should make sure the "file" isn't actually a folder before checking the extension. Some of the answers above don't account for folder names with periods. (folder.mp3 is a valid folder name).
Checking the extension of a file:
import os
file_path = "C:/folder/file.mp3"
if os.path.isfile(file_path):
    file_extension = os.path.splitext(file_path)[1]
    if file_extension.lower() == ".mp3":
        print("It's an mp3")
    if file_extension.lower() == ".flac":
        print("It's a flac")
Output:
It's an mp3
Checking the extension of all files in a folder:
import os
directory = "C:/folder"
for file in os.listdir(directory):
    file_path = os.path.join(directory, file)
    if os.path.isfile(file_path):
        file_extension = os.path.splitext(file_path)[1]
        print(file, "ends in", file_extension)
Output:
abc.txt ends in .txt
file.mp3 ends in .mp3
song.flac ends in .flac
Comparing file extension against multiple types:
import os
file_path = "C:/folder/file.mp3"
if os.path.isfile(file_path):
    file_extension = os.path.splitext(file_path)[1]
    if file_extension.lower() in {'.mp3', '.flac', '.ogg'}:
        print("It's a music file")
    elif file_extension.lower() in {'.jpg', '.jpeg', '.png'}:
        print("It's an image file")
Output:
It's a music file
import os
source = ['test_sound.flac', 'ts.mp3']
for files in source:
    fileName, fileExtension = os.path.splitext(files)
    print(fileExtension)  # Print the file extension
    print(fileName)  # Print the file name
#!/usr/bin/python
import os
source = ['test_sound.flac', 'ts.mp3']
for files in source:
    fileName, fileExtension = os.path.splitext(files)
    if fileExtension == ".flac":
        print('This file is flac file %s' % files)
    elif fileExtension == ".mp3":
        print('This file is mp3 file %s' % files)
    else:
        print('Format is not valid')
# Take the text after the last dot, so names like 'my.song.mp3' still work.
ext = file.rsplit(".", 1)[-1]
if ext == "mp3":
    print("its mp3")
elif ext == "flac":
    print("its flac")
else:
    print("not compat")
If your file is uploaded, then:
import os
file = request.FILES['your_file_name']  # your_file_name is the name of your file input
ext = os.path.splitext(file.name)[-1].lower()
if ext == '.mp3':
    pass  # do something
elif ext in ('.xls', '.xlsx', '.csv'):
    pass  # do something
else:
    pass  # The uploaded file is not the required format
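A common pitfall in extension checks like this one is chaining string literals with or, as in ext == '.xls' or '.xlsx' or '.csv'. Python parses that as (ext == '.xls') or '.xlsx' or '.csv', and a non-empty string is always truthy, so the branch is taken for every extension. A short sketch of the difference, with hypothetical function names:

```python
def is_spreadsheet_buggy(ext):
    # Parsed as (ext == '.xls') or '.xlsx' or '.csv'; the non-empty string
    # '.xlsx' is truthy, so the condition holds for every input.
    return bool(ext == ".xls" or ".xlsx" or ".csv")


def is_spreadsheet(ext):
    # Membership test against the allowed extensions.
    return ext in (".xls", ".xlsx", ".csv")


print(is_spreadsheet_buggy(".png"))  # True
print(is_spreadsheet(".png"))        # False
print(is_spreadsheet(".csv"))        # True
```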
file='test.xlsx'
if file.endswith('.csv'):
    print('file is CSV')
elif file.endswith('.xlsx'):
    print('file is excel')
else:
    print('none of them')
I'm surprised none of the answers proposed the use of the pathlib library.
Of course, its use is situational but when it comes to file handling or stats pathlib is gold.
Here's a snippet:
import pathlib
from typing import Union
def get_parts(p: Union[str, pathlib.Path]) -> None:
    # Expand '~' and make the path absolute before inspecting it.
    p_ = pathlib.Path(p).expanduser().resolve()
    print(p_)
    print(f"file name: {p_.name}")
    print(f"file extension: {p_.suffix}")
    print(f"file extensions: {p_.suffixes}\n")
if __name__ == '__main__':
    file_path = 'conf/conf.yml'
    arch_file_path = 'export/lib.tar.gz'
    get_parts(p=file_path)
    get_parts(p=arch_file_path)
and the output:
/Users/hamster/temp/src/pro1/conf/conf.yml
file name: conf.yml
file extension: .yml
file extensions: ['.yml']
/Users/hamster/temp/src/pro1/export/lib.tar.gz
file name: lib.tar.gz
file extension: .gz
file extensions: ['.tar', '.gz']
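Since .suffix only returns the last part ('.gz'), a compound extension such as '.tar.gz' can be rebuilt by joining .suffixes. A minimal sketch, assuming a hypothetical full_extension helper; the third file name is invented to show the main caveat:

```python
from pathlib import PurePath


def full_extension(path):
    """Join every suffix, so 'lib.tar.gz' yields '.tar.gz' instead of '.gz'."""
    return "".join(PurePath(path).suffixes)


print(full_extension("export/lib.tar.gz"))  # .tar.gz
print(full_extension("conf/conf.yml"))      # .yml
print(full_extension("notes.v1.2.txt"))     # .v1.2.txt
```

As the last line shows, every dot-separated part after the first counts as a suffix, so this only suits names where the extra dots really belong to the extension.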
