This question already has answers here:
How can I check the extension of a file?
(14 answers)
Closed 5 years ago.
Is there a way in Python to check a file name to see if its extension is included in the name? My current workaround is to simply check if the name contains a . in it, and add an extension if it doesn't. This obviously won't catch files with a . but no extension in the name (e.g. 12.10.13_file). Anyone have any ideas?
'12.10.13_file', as a filename, does have '13_file' as its file extension, at least as far as the file system is concerned.
But, instead of finding the last . yourself, use os.path.splitext:
import os
fileName, fileExtension = os.path.splitext('/path/yourfile.ext')
# Results in:
# fileName = '/path/yourfile'
# fileExtension = '.ext'
If you want to exclude certain extensions, you could blacklist those after you've used the above.
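As a rough sketch of that idea, the check can be inverted into an allow-list: anything os.path.splitext reports that is not a recognized extension is treated as part of the name. The names ensure_extension and KNOWN_EXTENSIONS and the default of ".txt" are assumptions made up for this illustration, not part of any library.

```python
import os

# Hypothetical set of suffixes this program treats as real extensions.
KNOWN_EXTENSIONS = {".txt", ".csv", ".pdf"}

def ensure_extension(name, default=".txt", known=KNOWN_EXTENSIONS):
    """Return name unchanged if it ends in a known extension, else append default."""
    _, ext = os.path.splitext(name)
    if ext.lower() in known:
        return name
    return name + default

print(ensure_extension("report.csv"))     # report.csv
print(ensure_extension("12.10.13_file"))  # 12.10.13_file.txt
```

With this approach '.13_file' is never mistaken for an extension, at the cost of having to maintain the list yourself.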
You can use libmagic via https://pypi.python.org/pypi/python-magic to determine "file types." It's not 100% perfect, but a whole lot of files can be accurately classified this way, and then you can decide your own rules, such as .txt for text files, .pdf for PDFs, etc.
Don't think in terms of finding files with or without extensions--think of it in terms of classifying your files based on their content, ignoring their current names.
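To illustrate what classifying by content means, here is a hand-rolled sketch that looks at a file's leading bytes. It is not the python-magic API and covers only three well-known signatures; libmagic recognizes far more. The names SIGNATURES, guess_extension and guess_extension_for_file are hypothetical.

```python
# Leading bytes ("magic numbers") for a few common formats; far from exhaustive.
SIGNATURES = [
    (b"%PDF-", ".pdf"),
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\x1f\x8b", ".gz"),
]

def guess_extension(header):
    """Return an extension for the given leading bytes, or None if unrecognized."""
    for magic, ext in SIGNATURES:
        if header.startswith(magic):
            return ext
    return None

def guess_extension_for_file(path):
    # Read only the first few bytes; that is enough for these signatures.
    with open(path, "rb") as f:
        return guess_extension(f.read(16))

print(guess_extension(b"%PDF-1.7\n"))   # .pdf
print(guess_extension(b"hello world"))  # None
```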
This question already has answers here:
python - Finding the user's "Downloads" folder
(6 answers)
Closed 2 months ago.
I want to set up a system of CSV formatters in conjunction with a PySimpleGUI program. I want the output file to go to the user's Downloads folder, but currently I only know how to use the Path method for my own Downloads folder. If I packaged this up, it would not be dynamic.
path = Path(r"C:\Users\xxx.xxxxx\downloads\Finished_File.csv")
I am unsure of other ways to fill in the user info automatically, without entering it manually.
My only other idea is to have this change dynamically with PySimpleGUI, using a list of potential names and having the user pick who they are.
Build the path at run time with pathlib before saving:
from pathlib import Path
# Path.home() resolves the current user's home directory at run time.
path = Path.home() / "Downloads" / "Finished_File.csv"
This question already has answers here:
Non-alphanumeric list order from os.listdir()
(14 answers)
Closed 1 year ago.
directory = r'/home/bugramatik/Desktop/Folder'
for filename in os.listdir(directory):
    file = open('/home/bugramatik/Desktop/Folder'+filename,'r')
    print(BinaryToString(file.read().replace(" ","")))
I want to read all the files inside a folder in the same order as they appear in the folder.
For example my folder is like
a
b
c
d
But when I run the program above, it shows them like
c
a
d
b
How can I read it like a,b,c,d?
os.listdir() returns the entries in arbitrary order, as the file system reports them; it does no sorting of its own. If you want to open the files in alphabetical order, like ls displays them, sort the list yourself.
for filename in sorted(os.listdir(directory)):
    with open(os.path.join(directory, filename), 'r') as file:
        print(BinaryToString(file.read().replace(" ","")))
Notice the addition of sorted() and also the use of os.path.join() to produce an actually correct OS-independent file name for open(), and the use of a with context manager to fix the bug where you forgot to close the files you opened. (You can leave a few open just fine, but the program will crash with an exception when you have more files because the OS limits how many open files you can have.)
This question already has answers here:
Extracting extension from filename in Python
(33 answers)
Getting file extension using pattern matching in python
(6 answers)
Closed 5 years ago.
I have this pattern:
dir1/dir2/.log.gz
dir1/dir2/a.log.gz
dir1/dir2/a.py
dir1/dir2/*.gzip.tar
I want to get filename or path and extension. e.g:
(name,extension)=(dir1/dir2/,.log.gz)
(name,extension)=(dir1/dir2/a,.log.gz)
(name,extension)=(dir1/dir2/a,.py)
(name,extension)=(dir1/dir2/,.gzip.tar)
I try:
re.findall(r'(.*).*\.?(.*)',path)
but it doesn't work perfectly.
If you just want the file's name and extension:
import os
path = 'C:/Users/Me/some_file.tar.gz'
# splitext() only splits off the last extension.
temp = os.path.splitext(path)
var = (os.path.basename(temp[0]), temp[1])
print(var)
# ('some_file.tar', '.gz')
It's worth noting that for files with "dual" extensions you will need to call splitext() repeatedly if you want every part. For example, a .tar.gz is a gzip file that happens to contain a tar archive, but in its current state it is a .gz file.
There is more on this topic here on SO.
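One way to do that repeated splitting, sketched under the assumption that every dot-suffix should count as an extension (split_all_extensions is a hypothetical helper name):

```python
import os

def split_all_extensions(path):
    """Peel extensions off the right-hand side until none remain."""
    root, ext = os.path.splitext(path)
    extensions = []
    while ext:
        extensions.insert(0, ext)
        root, ext = os.path.splitext(root)
    return root, "".join(extensions)

print(split_all_extensions("dir1/dir2/a.log.gz"))  # ('dir1/dir2/a', '.log.gz')
print(split_all_extensions("dir1/dir2/a.py"))      # ('dir1/dir2/a', '.py')
print(split_all_extensions("dir1/dir2/.log.gz"))   # ('dir1/dir2/.log', '.gz')
```

Note the last case: os.path.splitext treats a leading dot in the final path component as part of the name, so '.log' is kept in the root rather than returned as an extension.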
General strategy: find the first '.'; everything before it is the path, and everything after it is the extension.
def get_path_and_extension(filename):
    # Split at the first '.'; if there is none, the extension is empty.
    index = filename.find('.')
    if index == -1:
        return filename, ''
    return filename[:index], filename[index + 1:]
This question already has answers here:
Is there a built in function for string natural sort?
(23 answers)
Closed 9 years ago.
I have a number of files in a folder with names following the convention:
0.1.txt, 0.15.txt, 0.2.txt, 0.25.txt, 0.3.txt, ...
I need to read them one by one and manipulate the data inside them. Currently I open each file with the command:
import os
# This is the path where all the files are stored.
folder_path = '/home/user/some_folder/'
# Open one of the files,
for data_file in os.listdir(folder_path):
    ...
Unfortunately this reads the files in no particular order (not sure how it picks them), and I need to read them starting with the one whose filename is the smallest number, then the one with the next larger number, and so on until the last one.
A simple example using sorted() that returns a new sorted list.
import os
# This is the path where all the files are stored.
folder_path = 'c:\\'
# Open one of the files,
for data_file in sorted(os.listdir(folder_path)):
    print(data_file)
You can read more about sorted() in the Python docs.
Edit for natural sorting:
If you are looking for natural sorting, see this great post by @unutbu.
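For names like the ones in the question, which are decimal numbers followed by an extension, another option is to sort by numeric value. This is a minimal sketch that assumes every name is a number plus one extension; numeric_key is a hypothetical helper, and float() raises ValueError for any name that does not fit the pattern.

```python
import os

def numeric_key(name):
    """Interpret the file name minus its final extension as a float."""
    stem, _ = os.path.splitext(name)
    return float(stem)

names = ["0.25.txt", "10.5.txt", "0.3.txt", "0.1.txt", "0.15.txt", "0.2.txt"]
print(sorted(names, key=numeric_key))
# ['0.1.txt', '0.15.txt', '0.2.txt', '0.25.txt', '0.3.txt', '10.5.txt']
```

In a real loop this would be sorted(os.listdir(folder_path), key=numeric_key). A natural sort that compares digit runs as integers would place 0.3.txt before 0.25.txt (3 is less than 25), so for decimal names a float key is likely the closer match.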
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
In python, how can I check if a filename ends in '.html' or '_files'?
import os
path = '/Users/Marjan/Documents/Nothing/Costco'
print(path)
names = os.listdir(path)
print(len(names))
for name in names:
    print(name)
Here is the code I've been using; it lists all the names in this directory in the terminal. A few filenames in this folder (Costco) don't end with .html or _files. I need to pick them out, but the folder has over 2,500 filenames. I need help with code that will search through this path and pick out all the filenames that don't end with .html or _files. Thanks, guys.
for name in names:
    if name.endswith('.html') or name.endswith('_files'):
        continue
    # do stuff
Usually os.path.splitext() would be more appropriate if you needed the extension of a file, but in this case endswith() is perfectly fine.
A little shorter than ThiefMaster's suggestion:
for name in [x for x in names if not x.endswith(('.html', '_files'))]:
    # do stuff