How to get only files in directory? [duplicate] - python

This question already has answers here:
How do I list all files of a directory?
(21 answers)
Closed 9 years ago.
I have this code:
allFiles = os.listdir(myPath)
for module in allFiles:
    if 'Module' in module:  # if the word "Module" is in the filename
        dirToScreens = os.path.join(myPath, module)
        allSreens = os.listdir(dirToScreens)
Now, all works well, I just need to change the line
allSreens = os.listdir(dirToScreens)
to get a list of just files, not folders.
Therefore, when I use
allScreens = [ f for f in os.listdir(dirToScreens) if os.isfile(join(dirToScreens, f)) ]
it says
module object has no attribute isfile
NOTE: I am using Python 2.7

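You can use the os.path.isfile method. isfile lives in os.path, not in os, which is why os.isfile raises the error. Since os.listdir returns bare names, join each one with the directory before testing it:
import os
from os import path
files = [f for f in os.listdir(dirToScreens) if path.isfile(path.join(dirToScreens, f))]
Or, if you feel functional :D
files = filter(lambda f: path.isfile(path.join(dirToScreens, f)), os.listdir(dirToScreens))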

"If you need a list of filenames that all have a certain extension, prefix, or any common string in the middle, use glob instead of writing code to scan the directory contents yourself"
import os
import glob
# glob.glob already returns each match joined with the directory, so test it directly
[name for name in glob.glob(os.path.join(path, '*.*')) if os.path.isfile(name)]
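Both answers come down to the same check: join each name with its directory, then keep the ones that os.path.isfile accepts. Here is a minimal sketch of that idea. The helper name list_files is hypothetical (it is not from the question), and the code is written so it should run on both Python 2.7 and Python 3:

```python
import os


def list_files(directory):
    """Return the names of regular files directly inside `directory`.

    `list_files` is a hypothetical helper name used for illustration.
    """
    names = []
    for name in os.listdir(directory):
        # os.listdir returns bare names, so build the full path before testing.
        if os.path.isfile(os.path.join(directory, name)):
            names.append(name)
    return sorted(names)
```

In the question's loop this would replace the second os.listdir call, for example allScreens = list_files(dirToScreens).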

Related

How to get a list with all filenames with full path along with extension [duplicate]

This question already has answers here:
Why do backslashes appear twice?
(2 answers)
Closed 2 years ago.
I have a folder, say ABC, that contains many files with different extensions, e.g. 001.py, 001.xls, 001.pdf and many more. I want to write a program that produces a list of these filenames with their full paths, like this:
["C:\Users\Desktop\ABC\001.py", "C:\Users\Desktop\ABC\001.xls", "C:\Users\Desktop\ABC\001.pdf"]
MyCode:
import os
from os import listdir
from os.path import isfile, join
dir_path = os.path.dirname(os.path.realpath(__file__))
print(dir_path) #current path
cwd = os.getcwd()
list3 = []
onlyfiles = [f for f in listdir(cwd) if isfile(join(cwd, f))]
for i in onlyfiles:
    list3.append(dir_path+"\\"+i)
print(list3)
I am getting output as :
["C:\\Users\\Desktop\\ABC\\001.py", "C:\\Users\\Desktop\\ABC\\001.xls", "C:\\Users\\Desktop\\ABC\\001.pdf"]
I am looking for output as :
["C:\Users\Desktop\ABC\001.py", "C:\Users\Desktop\ABC\001.xls", "C:\Users\Desktop\ABC\001.pdf"]
If you can use Python 3, pathlib can help you assemble your paths in a clearer way! If you're forced to use Python 2, you could bring in the library it's based off!
The double backslash is only how Python displays the strings. Printing a list shows the repr() of each element, and repr() escapes every backslash, because \ is the escape character in Python string literals (as in many C-derived languages) while Windows uses it as the path separator. The strings themselves contain single backslashes, which you can confirm by printing one element on its own. You can also use / in paths on Windows and it will still work fine.
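The difference between the displayed form and the actual contents can be checked directly. The following is a small sketch for Python 3, using the folder and file names from the question. PureWindowsPath is used only so that the example behaves the same on any operating system; it never touches the disk.

```python
from pathlib import PureWindowsPath

# Folder and file names are taken from the question.
folder = PureWindowsPath(r"C:\Users\Desktop\ABC")
paths = [str(folder / name) for name in ("001.py", "001.xls", "001.pdf")]

# Printing the list shows each string's repr(), with backslashes escaped.
print(paths)

# Printing the strings themselves shows the single backslashes they contain.
for p in paths:
    print(p)

# The doubled backslash is display only: each separator is one character.
assert paths[0].count("\\") == 4
```

So the list in the question already holds the desired paths; only its printed representation differs.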

How to get file names in a directory one-by-one? [duplicate]

This question already has answers here:
How do I list all files of a directory?
(21 answers)
Closed 3 years ago.
I have a directory with more than 100k files in it. I need to loop through them and perform operations. I don't want to load the whole list of files in memory, instead of that, I want to traverse synchronously. What is the best way to achieve that in Python?
Edit:
This question isn't similar to my question as I don't want to load all the filenames into the memory at once.
pathlib.Path.iterdir() offers a generator to iterate through directories, which reduces memory consumption:
import sys
import pathlib
import os
path = '/cache/srtm'
pl = pathlib.Path(path).iterdir()
oslb = os.listdir(path)
print(type(pl))
print(type(oslb))
print('pathlib.iter: %s' % sys.getsizeof(pl))
print('os.listdir: %s' % sys.getsizeof(oslb))
Prints:
<class 'generator'>
<class 'list'>
pathlib.iter: 88
os.listdir: 124920
This is how you would loop through the files in a directory, assuming the directory path is a str in a variable called myDirectory. Note that os.listdir builds the whole list of names before the loop starts:
import os
directory = os.fsencode(myDirectory)
for file in os.listdir(directory):
    filename = os.fsdecode(file)
    # do operations with filename
Alternatively, you could use pathlib. A Path object is not iterable itself, so call iterdir() on it:
from pathlib import Path
pathlist = Path(myDirectory).iterdir()
for path in pathlist:
    filename = str(path)
    # do operations with filename
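Another option for walking a large directory one entry at a time is os.scandir, which returns an iterator of DirEntry objects rather than a list. This is a minimal sketch, assuming Python 3.6 or later (for the with statement). The helper name iter_file_names is hypothetical.

```python
import os


def iter_file_names(directory):
    """Yield the names of regular files in `directory`, one at a time.

    `iter_file_names` is a hypothetical helper name used for illustration.
    """
    # os.scandir returns an iterator of DirEntry objects instead of a list.
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                yield entry.name


# Usage: handle each name as it arrives.
# for name in iter_file_names(myDirectory):
#     ...  # do operations with name
```

Because the function is a generator, nothing is read from the directory until the loop asks for the first name.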

get filenames in a directory without extension - Python [duplicate]

This question already has answers here:
How do I get the filename without the extension from a path in Python?
(31 answers)
Closed 3 years ago.
I am trying to get the filenames in a directory without getting the extension. With my current code -
import os
path = '/Users/vivek/Desktop/buffer/xmlTest/'
files = os.listdir(path)
print (files)
I end up with an output like so:
['img_8111.jpg', 'img_8120.jpg', 'img_8127.jpg', 'img_8128.jpg', 'img_8129.jpg', 'img_8130.jpg']
However I want the output to look more like so:
['img_8111', 'img_8120', 'img_8127', 'img_8128', 'img_8129', 'img_8130']
How can I make this happen?
You can use os.path.splitext, which splits a name into a (root, extension) pair:
import os
path = '/Users/vivek/Desktop/buffer/xmlTest/'
files = [os.path.splitext(filename)[0] for filename in os.listdir(path)]
print (files)
Just a heads up: basename won't work for this. basename doesn't remove the extension.
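Here are two options:
import os
print(os.path.splitext("path_to_file")[0])
Or, to drop the directory part as well, combine basename with splitext:
from os.path import basename, splitext
print(splitext(basename("/a/b/c.txt"))[0])  # prints: c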
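On Python 3, pathlib offers the same result through the stem attribute of a path. Here is a minimal sketch, where the helper name stems_in is hypothetical:

```python
from pathlib import Path


def stems_in(directory):
    """Return the file names in `directory` with the final extension removed.

    `stems_in` is a hypothetical helper name used for illustration.
    """
    # Path.stem drops only the last suffix: 'archive.tar.gz' -> 'archive.tar'.
    return sorted(p.stem for p in Path(directory).iterdir() if p.is_file())
```

Calling stems_in on the folder from the question should give the names without .jpg, provided the folder contains only those image files.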

error with splitting file path string by / in python [duplicate]

This question already has answers here:
How to split a dos path into its components in Python
(23 answers)
Closed 6 years ago.
This is what I'm doing. I'm taking a text from a folder, modifying that text, and writing it out to another folder with a modified file name. I'm trying to establish the file name as a variable. Unfortunately this happens:
import os
import glob
path = r'C://Users/Alexander/Desktop/test/*.txt'
for file in glob.glob(path):
    name = file.split(r'/')[5]
    name2 = name.split(".")[0]
    print(name2)
Output: test\indillama_Luisa_testfile
The file name is 'indillama_Luisa_testfile.txt' it is saved in a folder on my desktop called 'test'.
Python is including the 'test\' in the file name. If I try to split name at [6] it says that index is out of range. I'm using regex and I'm assuming that it's reading '/*' as a single unit and not as a slash in the file directory.
How do I get the file name?
You can split by the OS path separator:
import os
import glob
path = r'C://Users/Alexander/Desktop/test/*.txt'
for file in glob.glob(path):
    name = file.split(os.path.sep)[-1]
    name2 = name.split(".")[0]
    print(name2)
import os
import glob
path = r'C://Users/Alexander/Desktop/test/*.txt'
for file in glob.glob(path):
    name = os.path.basename(file)
    (root, ext) = os.path.splitext(name)
    print(root)
os.path.basename() will extract the filename part of the path. os.path.splitext() hands back a tuple containing the path and the split-off extension. Since that's what your example seemed to be printing, that's what I did in my suggested answer.
For portability, it's usually safer to use the built-in path manipulation routines rather than trying to do it yourself.
You can use os.listdir(path) to list all the files in a directory.
Then iterate over the list to get the filename of each file.
for file in os.listdir(path):
    name2 = file.split(".")[0]
    print(name2)
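The output in the question suggests that glob returned the path with a backslash before the file name, which is why splitting on / left test\ attached. The sketch below reproduces that string by hand (an assumption based on the output shown) and uses ntpath, the Windows flavour of os.path, so it behaves the same on any operating system:

```python
import ntpath

# Assumed form of the string glob returned: forward slashes from the pattern,
# plus a backslash before the matched file name.
found = r"C://Users/Alexander/Desktop/test\indillama_Luisa_testfile.txt"

# Splitting on "/" alone leaves the last two components stuck together.
print(found.split("/")[5])

# ntpath treats both / and \ as separators, so basename isolates the file name.
name = ntpath.basename(found)
stem, ext = ntpath.splitext(name)

print(name)
print(stem)
print(ext)
```

On Windows itself, os.path is ntpath, so os.path.basename and os.path.splitext behave the same way there.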

Filename scanning for certain names Python [duplicate]

This question already has answers here:
Find all files in a directory with extension .txt in Python
(25 answers)
Closed 7 years ago.
I have lots of files in a directory, let's say around 100, and most of their file names begin with "Mod". I need to add all filenames that begin with "Mod" to a list so I can reference them later in my code. Any help? Thanks!
Use the glob module.
import glob
filepaths = glob.glob('/path/to/file/Mod*')
More generally, you can use os.listdir. Unlike glob, it only returns the last part of the filename (without the full path).
import os
directory = '/path/to/directory'
filenames = os.listdir(directory)
full_filepaths = [os.path.join(directory, f) for f in filenames]
only_files = [f for f in full_filepaths if os.path.isfile(f)]
You can use the glob library to find files matching a given pattern:
import glob, os
mylist = []
os.chdir("/mydir")
for file in glob.glob("Mod*"):
    mylist.append(file)
print(mylist)
Or you can use os.walk, which also searches subdirectories:
for root, dirs, files in os.walk('/mydir'):
    for names in files:
        if names.startswith("Mod"):
            mylist.append(os.path.join(root, names))
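The glob approach can be wrapped in a small function that avoids os.chdir and skips directories whose names happen to match. This is a minimal sketch. The helper name files_with_prefix is hypothetical, and it assumes the directory path itself contains no glob wildcard characters:

```python
import glob
import os


def files_with_prefix(directory, prefix="Mod"):
    """Return full paths of regular files in `directory` whose names start with `prefix`.

    `files_with_prefix` is a hypothetical helper; it does not recurse into subfolders.
    """
    # Build a pattern such as /mydir/Mod* without changing the working directory.
    pattern = os.path.join(directory, prefix + "*")
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))
```

For example, mylist = files_with_prefix("/mydir") would collect the matching files from the directory used in the answer above.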
