I have a Python program named myscript.py that prints the files and folders under the path it is given.
import os
import sys
def get_files_in_directory(path):
    for root, dirs, files in os.walk(path):
        print(root)
        print(dirs)
        print(files)
path=sys.argv[1]
get_files_in_directory(path)
The path I provided is D:\Python\Test, which contains some folders and subfolders, as you can see in the output below:
C:\Python34>python myscript.py "D:\Python\Test"
D:\Python\Test
['D1', 'D2']
[]
D:\Python\Test\D1
['SD1', 'SD2', 'SD3']
[]
D:\Python\Test\D1\SD1
[]
['f1.bat', 'f2.bat', 'f3.bat']
D:\Python\Test\D1\SD2
[]
['f1.bat']
D:\Python\Test\D1\SD3
[]
['f1.bat', 'f2.bat']
D:\Python\Test\D2
['SD1', 'SD2']
[]
D:\Python\Test\D2\SD1
[]
['f1.bat', 'f2.bat']
D:\Python\Test\D2\SD2
[]
['f1.bat']
I need to get the output this way :
D1-SD1-f1.bat
D1-SD1-f2.bat
D1-SD1-f3.bat
D1-SD2-f1.bat
D1-SD3-f1.bat
D1-SD3-f2.bat
D2-SD1-f1.bat
D2-SD1-f2.bat
D2-SD2-f1.bat
How do I get the output this way? (Keep in mind that the directory structure here is just an example; the program should work for any path.)
Is there an os function for this? Can you please help me solve this? (Additional information: I am using Python 3.4.)
You could try using the glob module instead:
import glob
glob.glob(r'D:\Python\Test\*\*\*.bat')
Or, to get just the filenames:
import os
import glob
[os.path.basename(x) for x in glob.glob(r'D:\Python\Test\*\*\*.bat')]
To get what you want, you could do the following:
import os
import sys
def get_files_in_directory(path):
    # Get the root dir (in your case: Test)
    rootDir = path.split('\\')[-1]
    # Walk through all subfolders/files
    for root, subfolder, fileList in os.walk(path):
        for file in fileList:
            # Skip empty names
            if file != '':
                # Get the full path of the file
                fullPath = os.path.join(root, file)
                # Split the path and the file (this and the step above could be done in one go)
                path, file = os.path.split(fullPath)
                # For each subfolder in the path (in REVERSE order)
                subfolders = []
                for subfolder in path.split('\\')[::-1]:
                    # As long as it isn't the root dir, append it to the subfolders list
                    if subfolder == rootDir:
                        break
                    subfolders.append(subfolder)
                # Print the list of subfolders (joined by '-')
                # + '-' + file
                print('{}-{}'.format('-'.join(subfolders), file))
path = sys.argv[1]
get_files_in_directory(path)
Output for my test folder:
SD1-D1-f1.bat
SD1-D1-f2.bat
SD2-D1-f1.bat
SD3-D1-f1.bat
SD3-D1-f2.bat
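It may not be the best way to do it, but it gets you close to what you want. Note that the folder names come out in reverse order (SD1-D1 rather than D1-SD1) because the path is split from the end; call subfolders.reverse() before joining to get the order you asked for.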
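As a possible alternative, here is a minimal sketch that builds the names in the requested order by taking each folder's path relative to the starting directory with os.path.relpath. The function name dashed_names is hypothetical, and sorting the folder and file names is an assumption made only to keep the output order stable.

```python
import os

def dashed_names(top):
    """Return names such as 'D1-SD1-f1.bat' for every file under top."""
    names = []
    for root, dirs, files in os.walk(top):
        dirs.sort()  # in-place sort makes os.walk visit subfolders alphabetically
        rel = os.path.relpath(root, top)
        # relpath gives '.' for the starting folder itself, which adds no prefix
        parts = [] if rel == os.curdir else rel.split(os.sep)
        for name in sorted(files):
            names.append('-'.join(parts + [name]))
    return names
```

Calling print('\n'.join(dashed_names(sys.argv[1]))) after import sys would print one name per line. Because the path is split on os.sep rather than a hard-coded backslash, the same code should work on non-Windows systems too.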
Related
I would like to know if there's a quick one-liner to list all the directories in a directory. It's similar to the question "How do I list all files of a directory?", but with folders instead of files.
Here is a recipe for just the first level:
dname = '/tmp'
[os.path.join(dname, d) for d in next(os.walk(dname))[1]]
and a recursive one:
dname = '/tmp'
[os.path.join(root, d) for root, dirs, _ in os.walk(dname) for d in dirs]
(after import os, obviously)
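Note that on filesystems that support symbolic links, os.walk lists links to directories in dirs but does not descend into them by default, so the recursive version will not show anything beneath a linked directory unless you pass followlinks=True.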
Using os.listdir to list all the files and folders and os.path.isdir as the condition:
import os
cpath = r'C:\Program Files (x86)'
onlyfolders = [f for f in os.listdir(cpath) if os.path.isdir(os.path.join(cpath, f))]
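The same first-level listing can also be sketched with os.scandir, which reports whether each entry is a directory without a separate os.path.isdir call. This assumes Python 3.6 or later (for using os.scandir as a context manager); the function name list_subdirs is hypothetical.

```python
import os

def list_subdirs(path):
    """Return the sorted names of the directories directly inside path."""
    with os.scandir(path) as entries:
        # DirEntry.is_dir() follows symlinks by default, like os.path.isdir
        return sorted(entry.name for entry in entries if entry.is_dir())
```

For example, list_subdirs(r'C:\Program Files (x86)') would give the same names as onlyfolders above, in sorted order.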
import os
# Use os.walk()
Example:
for folder, sub, files in os.walk(path):
    print(folder)  # the folder currently being visited
    print(sub)     # names of its subfolders
    print(files)   # names of its files
[ i[0] for i in os.walk('/tmp')]
Here '/tmp' is the directory whose subdirectories are listed.
You need to import os first.
This works because i[0] is the path of each directory that os.walk visits, so there is no need to join anything. Note that the list also includes '/tmp' itself.
I'd like to browse through the current folder and all its subfolders and get all the files with .htm|.html extensions. I have found out that it is possible to find out whether an object is a dir or file like this:
import os
dirList = os.listdir("./") # current directory
for dir in dirList:
    if os.path.isdir(dir) == True:
        # I don't know how to get into this dir and do the same thing here
    else:
        # I got file and i can regexp if it is .htm|html
and in the end, I would like to have all the files and their paths in an array. Is something like that possible?
You can use os.walk() to recursively iterate through a directory and all its subdirectories:
for root, dirs, files in os.walk(path):
    for name in files:
        if name.endswith((".html", ".htm")):
            # whatever
To build a list of these names, you can use a list comprehension:
htmlfiles = [os.path.join(root, name)
             for root, dirs, files in os.walk(path)
             for name in files
             if name.endswith((".html", ".htm"))]
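The os.walk approach can be wrapped in a small reusable function. The sketch below is one way to do it; the name find_by_extension is hypothetical, and lowercasing the filename before comparing is an added assumption so that names such as INDEX.HTML are matched too.

```python
import os

def find_by_extension(top, extensions=(".html", ".htm")):
    """Return sorted paths of files under top whose names end with one of extensions."""
    suffixes = tuple(ext.lower() for ext in extensions)  # str.endswith needs a tuple
    matches = []
    for root, dirs, files in os.walk(top):
        for name in files:
            if name.lower().endswith(suffixes):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

Calling find_by_extension(".") would collect the matching files under the current folder and all of its subfolders.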
I had a similar thing to work on, and this is how I did it.
import os
rootdir = os.getcwd()
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        #print os.path.join(subdir, file)
        filepath = subdir + os.sep + file
        if filepath.endswith(".html"):
            print (filepath)
Hope this helps.
In Python 3.5 and later you can use os.scandir():
import os
def dir_scan(path):
    for i in os.scandir(path):
        if i.is_file():
            print('File: ' + i.path)
        elif i.is_dir():
            print('Folder: ' + i.path)
            dir_scan(i.path)
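If you want the paths collected rather than printed, the same recursion can be written as a generator. This is only a sketch: the name iter_files is hypothetical, and not following symbolic links to directories is an assumption made here to avoid endless loops on circular links.

```python
import os

def iter_files(path):
    """Yield the path of every file under path, recursing with os.scandir."""
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            # Recurse into real subdirectories only
            yield from iter_files(entry.path)
        elif entry.is_file():
            yield entry.path
```

For example, html_files = [p for p in iter_files(".") if p.endswith((".html", ".htm"))] builds the list the question asks for.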
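Use newDirName = os.path.abspath(dir) to create a full path for the subdirectory and then list its contents as you did with the parent (i.e. newDirList = os.listdir(newDirName)). For deeper levels, build the path with os.path.join(parentDir, dir) instead, because abspath resolves names against the current working directory.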
You can create a separate method of your code snippet and call it recursively through the subdirectory structure. The first parameter is the directory pathname. This will change for each subdirectory.
This answer is based on the 3.1.1 version documentation of the Python Library. There is a good model example of this in action on page 228 of the Python 3.1.1 Library Reference (Chapter 10 - File and Directory Access).
Good Luck!
Slightly altered version of Sven Marnach's solution:
import os
def create_file_list(path):
    # Collect the names of all .txt files under path
    return_list = []
    for root, dirs, files in os.walk(path):
        for file_name in files:
            if file_name.endswith(".txt"):
                return_list.append(file_name)
    return return_list
folder_location = r'C:\SomeFolderName'
file_list = create_file_list(folder_location)
There are two ways that work for me.
1. Use the os package and __file__ to build the path relative to the directory where the script is located:
import os
script_dir = os.path.dirname(__file__)
path = 'subdirectory/test.txt'
file = os.path.join(script_dir, path)
fileread = open(file,'r')
2. Use '\\' as the separator to read or write a file in a subfolder (Windows only):
fileread = open('subdirectory\\test.txt','r')
from tkinter import Tk, filedialog
import os
root = Tk()
# Ask the user to pick a folder, then list its contents
file = filedialog.askdirectory()
changed_dir = os.listdir(file)
print(changed_dir)
root.mainloop()
I have a question:
I need to get the paths of a file in a directory. I have a folder that contains other folders, which contain further folders, and so on, and each of them contains a file "tv.sas7bdat". I need to get every path to that file.
Thank you!
You can try the following code, where PATH stands for the parent directory:
import os
def getAlldirInDiGui(path,resultList):
    filesList=os.listdir(path)
    for fileName in filesList:
        fileAbpath=os.path.join(path,fileName)
        if os.path.isdir(fileAbpath):
            getAlldirInDiGui(fileAbpath,resultList)
        else:
            if fileName=='tv.sas7bdat':
                resultList.append(fileAbpath)
resultList = []
PATH = ""
getAlldirInDiGui(PATH,resultList)
You can use os.walk()
import os
for root, dirs, files in os.walk(os.getcwd()):
    for f in files:
        if f.find("tv.sas7bdat")>=0:
            print(root,f)
If I understand your problem correctly, you can achieve your goal with Python's os.walk function, like so:
import os
for root, dirs, files in os.walk("<starting folder here>", topdown=False):
    for name in files:
        if name == "tv.sas7bdat":
            print(os.path.join(root, name))
P.S. Regarding the comments on your question: next time please provide as many details as possible and include the code of your attempt; see the asking guidelines.
Hopefully the code below works for you:
import glob
import os
initial_path = r"c:\<initial folder location>"
# "**" with recursive=True (Python 3.5+) matches any depth of subfolders
files = glob.glob(os.path.join(initial_path, "**", "tv.sas7bdat"), recursive=True)
for f in files:
    print(f)
You could use the os package combined with a recursive function to search through a given directory:
import os
from os.path import isfile, join, isdir
def get_files_path(directory, paths):
    for item in os.listdir(directory):
        # Join with the parent so the path is valid at any depth
        full_path = join(directory, item)
        if isfile(full_path) and item == "tv.sas7bdat":
            paths.append(full_path)
        elif isdir(full_path):
            get_files_path(full_path, paths)
    return paths
directory_to_search = "./"
paths = get_files_path(directory_to_search, [])
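Another option, sketched here on the assumption that the pathlib module (Python 3.4 and later) is available, is Path.rglob, which searches a directory tree recursively for a name pattern. The function name find_named is hypothetical.

```python
from pathlib import Path

def find_named(top, filename="tv.sas7bdat"):
    """Return sorted paths (as strings) of every file called filename under top."""
    # rglob searches top and all of its subdirectories for the given name
    return sorted(str(p) for p in Path(top).rglob(filename) if p.is_file())
```

Calling find_named(".") would return every path to tv.sas7bdat below the current directory.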
I am practicing with the os module, specifically os.walk(). Is there an easier or more efficient way to find the actual path to a file? The code below produces paths that suggest every file is in the folder where os.walk() was first run:
import os
threshold_size = 500
for folder, subfolders, files in os.walk(os.getcwd()):
    for file in files:
        filePath = os.path.abspath(file)
        if os.path.getsize(filePath) >= threshold_size:
            print filePath, str(os.path.getsize(filePath))+"kB"
This is my current workaround:
import os
threshold_size = 500
for folder, subfolders, files in os.walk(os.getcwd()):
    path = os.path.abspath(folder)
    for file in files:
        filePath = path + "\\" + file
        if os.path.getsize(filePath) >= threshold_size:
            print filePath, str(os.path.getsize(filePath))+"kB"
For shaktimaan, this:
for folder, subfolders, files in os.walk(os.getcwd()):
    for file in files:
        filePath = os.path.abspath(file)
        print filePath
produces this (most of these files are in a subfolder of projects, not in projects itself):
C:\Python27\projects\ps4.py
C:\Python27\projects\ps4_encryption_sol.py
C:\Python27\projects\ps4_recursion_sol.py
C:\Python27\projects\words.txt
C:\Python27\projects\feedparser.py
C:\Python27\projects\feedparser.pyc
C:\Python27\projects\news_gui.py
C:\Python27\projects\news_gui.pyc
C:\Python27\projects\project_util.py
C:\Python27\projects\project_util.pyc
C:\Python27\projects\ps5.py
C:\Python27\projects\ps5.pyc
C:\Python27\projects\ps5_test.py
C:\Python27\projects\test.py
C:\Python27\projects\triggers.txt
C:\Python27\projects\ps6.py
C:\Python27\projects\ps6_pkgtest.py
C:\Python27\projects\ps6_solution.py
C:\Python27\projects\ps6_visualize.py
C:\Python27\projects\ps6_visualize.pyc
C:\Python27\projects\capitalsquiz1.txt
C:\Python27\projects\capitalsquiz2.txt
C:\Python27\projects\capitalsquiz3.txt
C:\Python27\projects\capitalsquiz4.txt
C:\Python27\projects\capitalsquiz5.txt
C:\Python27\projects\capitalsquiz_answers1.txt
C:\Python27\projects\capitalsquiz_answers2.txt
C:\Python27\projects\capitalsquiz_answers3.txt
C:\Python27\projects\capitalsquiz_answers4.txt
C:\Python27\projects\capitalsquiz_answers5.txt
C:\Python27\projects\quiz.py
C:\Python27\projects\file2.txt
C:\Python27\projects\regexes.txt
C:\Python27\projects\regexsearch.py
C:\Python27\projects\testfile.txt
C:\Python27\projects\renamedates.py
I think you have misunderstood what abspath does. abspath just converts a relative path to a complete absolute filename.
For example:
os.path.abspath(os.path.join(r"c:\users\anonymous", ".."))
# produces this output: c:\users
Without any other information, abspath can only form an absolute path from the one directory it knows about, which in your case is the current working directory. So what it currently does is join os.getcwd() and your file name.
What you need to do instead is:
for folder, subfolders, files in os.walk(os.getcwd()):
    for file in files:
        filePath = os.path.join(os.path.abspath(folder), file)
Your workaround should work fine, but a simpler way to do this would be:
import os
threshold_size = 500
root = os.getcwd()
root = os.path.abspath(root) # redundant with os.getcwd(), but may be needed for other starting paths
for folder, subfolders, files in os.walk(root):
    for file in files:
        filePath = os.path.join(folder, file)
        if os.path.getsize(filePath) >= threshold_size:
            print filePath, str(os.path.getsize(filePath))+"kB"
The basic idea here is that folder will be an absolute normalized path if the argument to os.walk is one and os.path.join will produce an absolute normalized path if any of the arguments is an absolute path and all the following arguments are normalized.
The reason why os.path.abspath(file) doesn't work in your first example is that file is a bare filename like quiz.py. So when you use abspath it does essentially the same thing os.path.join(os.getcwd(), file) would do.
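To see the difference side by side, here is a small sketch (written for Python 3, unlike the Python 2 snippets above) that returns both results for each file. The function name compare_paths is hypothetical.

```python
import os

def compare_paths(top):
    """Return (abspath_guess, joined_path) pairs for each file under top."""
    pairs = []
    for folder, subfolders, files in os.walk(top):
        for name in files:
            # abspath only sees the bare name, so it resolves against os.getcwd()
            guess = os.path.abspath(name)
            # joining with folder gives the location os.walk actually found
            joined = os.path.join(folder, name)
            pairs.append((guess, joined))
    return pairs
```

For a file that lives in a subfolder, the first element of the pair points into the current working directory, while the second points to where the file really is.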
This simple example should do the trick.
I have stored the result in a list, because for me it's quite handy to pass the list to a different function and execute different operations on a list.
import os
directory = os.getcwd()
list1 = []
for root, subfolders, files in os.walk(directory):
    list1.append( [ os.path.join(os.path.abspath(root), elem) for elem in files if elem ])
# clean the list from empty elements
final_list = [ x for x in list1 if x != [] ]