This question already has answers here:
Is there a built in function for string natural sort?
(23 answers)
Closed 9 years ago.
I have a number of files in a folder with names following the convention:
0.1.txt, 0.15.txt, 0.2.txt, 0.25.txt, 0.3.txt, ...
I need to read them one by one and manipulate the data inside them. Currently I open each file with the command:
import os
# This is the path where all the files are stored.
folder_path = '/home/user/some_folder/'
# Open one of the files,
for data_file in os.listdir(folder_path):
...
Unfortunately this reads the files in no particular order (not sure how it picks them), and I need to read them starting with the one whose filename is the smallest number, then the one with the next larger number, and so on until the last one.
A simple example using sorted(), which returns a new sorted list.
import os
# This is the path where all the files are stored.
folder_path = 'c:\\'
# Open one of the files,
for data_file in sorted(os.listdir(folder_path)):
    print(data_file)
You can read more in the docs for sorted().
Edit for natural sorting:
If you are looking for natural sorting, see this great post by @unutbu.
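For filenames that are literally decimal numbers, like those in the question, an alternative to a general natural sort is to sort on the numeric value of the stem. A minimal sketch, assuming every entry is named <number>.txt:
import os

folder_path = '/home/user/some_folder/'

def numeric_key(name):
    # '0.15.txt' -> ('0.15', '.txt'); sort on the float value of the stem.
    stem, _ = os.path.splitext(name)
    return float(stem)

for data_file in sorted(os.listdir(folder_path), key=numeric_key):
    print(data_file)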
This question already has answers here:
Non-alphanumeric list order from os.listdir()
(14 answers)
Closed 1 year ago.
import os

directory = r'/home/bugramatik/Desktop/Folder'
for filename in os.listdir(directory):
    file = open('/home/bugramatik/Desktop/Folder'+filename,'r')
    print(BinaryToString(file.read().replace(" ","")))
I want to read all the files inside the folder in the same order in which they appear in the folder structure.
For example my folder is like
a
b
c
d
But when I run the program at above it shows like
c
a
d
b
How can I read it like a,b,c,d?
The order from os.listdir() is arbitrary (whatever the filesystem happens to return). If you want to open the files in alphabetical order, like ls displays them, just reimplement the sorting it does.
for filename in sorted(os.listdir(directory)):
    with open(os.path.join(directory, filename), 'r') as file:
        print(BinaryToString(file.read().replace(" ","")))
Notice the addition of sorted(), the use of os.path.join() to build a correct, OS-independent path for open(), and the with context manager, which fixes the bug of never closing the files you open. (Leaving a few files open is usually harmless, but with enough files the program will crash with an exception, because the OS limits how many files a process can have open.)
This question already has answers here:
How to use glob to read limited set of files with numeric names?
(2 answers)
Closed 3 years ago.
I have a directory of files with names 10.txt, 11.txt .. 29.txt, 30.txt
How can I select files 12 to 21?
I've tried:
glob.glob('path/[12-21].txt')
glob is useful when you don't know the exact file names and only have a rough pattern; in your case you do know the names, so you might not need glob at all.
It would be better to simply generate your expected list of files and check whether they exist, e.g.:
from os.path import isfile
# Generate a list of expected file names
expected_files = ["path/{}.txt".format(i) for i in range(12, 22)]
# Filter the list to just the files that actually exist.
actual_files = [f for f in expected_files if isfile(f)]
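If you do want to stick with glob: character classes like [12-21] match a single character, not a numeric range, so one workaround (just a sketch, assuming the names are exactly <number>.txt) is to glob everything and filter on the numeric part:
import glob
import os

selected = []
for path in glob.glob('path/*.txt'):
    # 'path/15.txt' -> '15'; keep it only if it is a number in the range 12..21.
    stem = os.path.splitext(os.path.basename(path))[0]
    if stem.isdigit() and 12 <= int(stem) <= 21:
        selected.append(path)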
This question already has answers here:
python copy files by wildcards
(3 answers)
Closed 4 years ago.
I have a large number of .txt files named as "cb" + number (like cb10, cb13), and I need to filter them out of a source folder that contains all of the "cb" + number files, including the target files.
The numbers in the target file names are all random, so I have to list all the file names.
import fnmatch
import os
import shutil
os.chdir('/Users/college_board_selection')
os.getcwd()
source = '/Users/college_board_selection'
dest = '/Users/seperated_files'
files = os.listdir(source)
for f in os.listdir('.'):
    names = ['cb10.txt','cb11.txt']
    if names in f:
        shutil.move(f,dest)
if names in f: isn't going to work as f is a filename, not a list. Maybe you want if f in names:
But you don't need to scan the whole directory for this; just loop over the files you're targeting and move them if they exist:
for f in ['cb10.txt','cb11.txt']:
    if os.path.exists(f):
        shutil.move(f,dest)
If you have a lot of cbxxx.txt files, an alternative is to compute the intersection of this list with the result of os.listdir(), using a set (lookups are faster than in a list, which matters when there are many elements):
for f in {'cb10.txt','cb11.txt'}.intersection(os.listdir(".")):
    shutil.move(f,dest)
On Linux, with a lot of "cb" files, this would be faster because os.listdir() doesn't perform a stat call for each file, whereas os.path.exists() does.
EDIT: if the files have the same prefix/suffix, you can build the lookup set with a set comprehension to avoid tedious copy/paste:
s = {'cb{}.txt'.format(i) for i in ('10','11')}
for f in s.intersection(os.listdir(".")):
    shutil.move(f,dest)
or for the first alternative:
for p in ['10','11']:
    f = "cb{}.txt".format(p)
    if os.path.exists(f):
        shutil.move(f,dest)
EDIT: if all cb*.txt files must be moved, then you can use glob.glob("cb*.txt"). I won't elaborate; the linked "duplicate target" answer explains it better.
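For completeness, a minimal sketch of that glob variant, assuming (as in the question) that the script runs from the source directory:
import glob
import shutil

dest = '/Users/seperated_files'
# Move every file matching cb*.txt from the current directory to dest.
for f in glob.glob('cb*.txt'):
    shutil.move(f, dest)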
This question already has answers here:
How to list only top level directories in Python?
(21 answers)
Closed 2 years ago.
How can I get Python to output only directories via os.listdir, while specifying which directory to list via raw_input?
What I have:
import os

file_to_search = raw_input("which file to search?\n>")
dirlist = []
for filename in os.listdir(file_to_search):
    if os.path.isdir(filename) == True:
        dirlist.append(filename)
print dirlist
Now this actually works if I input (via raw_input) the current working directory. However, if I put in anything else, the list comes back empty. I tried to divide and conquer this problem, but individually every piece of code works as intended.
That's expected: os.listdir() only returns the bare names of the files/directories, so os.path.isdir() can't find them unless you happen to be running from inside that directory.
You have to join the scanned directory to each name to compute the full path for it to work:
for filename in os.listdir(file_to_search):
    if os.path.isdir(os.path.join(file_to_search, filename)):
        dirlist.append(filename)
Note the list comprehension version:
dirlist = [filename for filename in os.listdir(file_to_search)
           if os.path.isdir(os.path.join(file_to_search, filename))]
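As an aside, on Python 3 the same listing can be written with os.scandir(), whose entries already know whether they are directories; a rough sketch (input() replaces raw_input here):
import os

file_to_search = input("which file to search?\n>")
# DirEntry objects carry type information, so no separate os.path.isdir() call is needed.
dirlist = [entry.name for entry in os.scandir(file_to_search) if entry.is_dir()]
print(dirlist)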
This question already has answers here:
How can I check the extension of a file?
(14 answers)
Closed 5 years ago.
Is there a way in Python to check a file name to see if its extension is included in the name? My current workaround is to simply check whether the name contains a . and add an extension if it doesn't... this obviously won't catch files with a . but no extension in the name (i.e. 12.10.13_file). Anyone have any ideas?
'12.10.13_file', as a filename, does have '.13_file' as its extension (everything after the last dot), at least as far as path conventions are concerned.
But, instead of finding the last . yourself, use os.path.splitext:
import os
fileName, fileExtension = os.path.splitext('/path/yourfile.ext')
# Results in:
# fileName = '/path/yourfile'
# fileExtension = '.ext'
If you want to exclude certain extensions, you could blacklist those after you've used the above.
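A rough sketch of that blacklist idea; the extensions listed here are purely illustrative:
import os

# Extensions that should not count as "real" extensions (illustrative only).
blacklist = {'.bak', '.tmp', '.13_file'}

def has_real_extension(path):
    _, ext = os.path.splitext(path)
    return bool(ext) and ext.lower() not in blacklist

print(has_real_extension('/path/yourfile.ext'))  # True
print(has_real_extension('12.10.13_file'))       # False: '.13_file' is blacklisted above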
You can use libmagic via https://pypi.python.org/pypi/python-magic to determine "file types." It's not 100% perfect, but a whole lot of files can be accurately classified this way, and then you can decide your own rules, such as .txt for text files, .pdf for PDFs, etc.
Don't think in terms of finding files with or without extensions--think of it in terms of classifying your files based on their content, ignoring their current names.
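A minimal sketch of that content-based approach, assuming the python-magic package (bindings for libmagic) is installed:
import magic

# Ask libmagic for the MIME type based on the file's content, not its name.
mime = magic.from_file('12.10.13_file', mime=True)

# Map a few MIME types to extensions; this mapping is illustrative, not exhaustive.
extension_for = {'text/plain': '.txt', 'application/pdf': '.pdf'}
print(mime, extension_for.get(mime, ''))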