List top-level directories with file count - python

I need to show the total file count for all top-level directories, including the ones that have a file count of zero. Each top-level directory can contain subdirectories. I need the total count listed next to top-level directory only.
cnt = 0
for dirpath, dirnames, files in os.walk(FILES):
    filecount = len(files)
    cnt += filecount
    print(dirnames,": ",filecount)
How can I get the above to print something like:
top-level-dir1: 234
top-level-dir2: 0
top-level-dir3: 5
....etc.
So, total files, including what's in the nested subfolders, but print the total next to the top-level folders only.
for directory in os.listdir(DOCUMENTS):
    if os.path.isdir(directory):
        filecount = 0
        for dirpath, dirnames, files in os.walk(directory):
            filecount += len(files)
        print(directory,": ",filecount)
I'm close, but this just shows file count as 1 for each.

You are resetting the filecount variable for each directory. Instead, you want the count to persist over each directory in CONTRACTS.
Also, os.listdir(CONTRACTS) only shows the immediate directory names in CONTRACTS; for the script to work in directories other than the current directory, you need to use os.path.join() to specify the full path when you call os.walk().
Finally, as @TimRoberts says, you should use os.path.isdir() to check each entry returned by os.listdir(), as it can also return files.
Something like this should do the trick:
import os
target = "target_directory"
for dir_name in os.listdir(target):
    dir_path = os.path.join(target, dir_name)
    if os.path.isdir(dir_path):
        file_count = 0
        for _, _, files in os.walk(dir_path):
            file_count += len(files)
        print(f"{dir_name}: {file_count}")

Related

How do I copy and rename files in the same order from one folder into another? [Python]

How can I make the code copy the files and rename them in the same order as they appear in the folder?
import os
folder = r'C:\Users\Desktop\001_Something\\'
count = 83
folder2 = r'C:\Users\Desktop\Rename\\'
# count increase by 1 in each iteration
# iterate all files from a directory
for file_name in sorted(os.listdir(folder)):
    # Construct old file name
    source = folder + file_name
    # Adding the count to the new file name and extension
    destination = folder2 + "Slice_" + str(count) +"_0" + ".asp"
    # Renaming the file
    os.rename(source, destination)
    count += 1
# verify the result
res = os.listdir(folder2)
print(res)
Files in the directory
What I need is for the code to rename Slice_42_0 to Slice_83_0, Slice_43_0 to Slice_84_0, Slice_44_0 to Slice_85_0 and so on.
My code only takes the files from one folder, renames them in the wrong order, and puts them into another folder.
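One likely cause of the wrong order, assuming the names follow the Slice_<n>_0.asp pattern shown above: sorted() compares file names as strings, so "Slice_100_0.asp" sorts before "Slice_42_0.asp". The sketch below sorts on the number embedded in the name instead and copies rather than moves the files. The helper names slice_index and copy_renumbered are hypothetical, and the code is an illustration under that naming assumption, not a tested fix for the asker's folders.

```python
import os
import re
import shutil

def slice_index(file_name):
    """Return the first integer in the name, e.g. 42 for 'Slice_42_0.asp'."""
    match = re.search(r"\d+", file_name)
    # Names without any digits sort first.
    return int(match.group()) if match else -1

def copy_renumbered(src_dir, dst_dir, start):
    """Copy files from src_dir to dst_dir in numeric order, renumbering from start."""
    os.makedirs(dst_dir, exist_ok=True)
    # Sort numerically on the embedded index rather than alphabetically.
    names = sorted(os.listdir(src_dir), key=slice_index)
    for offset, name in enumerate(names):
        ext = os.path.splitext(name)[1]
        new_name = f"Slice_{start + offset}_0{ext}"
        # copy2 leaves the source file in place, unlike os.rename.
        shutil.copy2(os.path.join(src_dir, name), os.path.join(dst_dir, new_name))
    return names
```

With the question's values this would be called as copy_renumbered(folder, folder2, 83); the returned list shows the order in which the source files were processed.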

Print statement not responding in my filemanagement system

I have 2 folders: Source and Destination. Each of those folders has 3 subfolders inside it, named A, B and C. The 3 subfolders in Source all contain multiple files. The 3 subfolders in Destination are still empty.
I need the full paths of all the files, because my goal is to copy the files from Source A, B and C over those in Destination A, B and C.
Why are my two print statements not printing anything? I get no errors.
import os
src = r'c:\data\AM\Desktop\Source'
dst = r'c:\data\AM\Desktop\Destination'
os.chdir(src)
for root, subdirs, files in os.walk(src):
    for f in subdirs:
        subdir_paths = os.path.join(src, f)
        subdir_paths1 = os.path.join(dst, f)
        for a in files:
            file_paths = os.path.join(subdir_paths, a)
            file_paths1 = os.path.join(subdir_paths1, a)
            print(file_paths)
            print(file_paths1)
Problem
As jasonharper said in a comment,
You are misunderstanding how os.walk() works. The files returned in files are in the root directory; you are acting as if though they existed in each of the subdirs directories, which are actually in root themselves.
The reason nothing is printed is that, on the first iteration, files is empty, so for a in files is not entered. Then on the following iterations (where root is A, B and C respectively), subdirs is empty, so for f in subdirs is not entered.
Solution
In fact you can ignore subdirs entirely. Instead walk the current dir, and join src/dst + root + a:
import os
src = r'c:\data\AM\Desktop\Source'
dst = r'c:\data\AM\Desktop\Destination'
os.chdir(src)
for root, subdirs, files in os.walk('.'):
    src_dir = os.path.join(src, root)
    dst_dir = os.path.join(dst, root)
    for a in files:
        src_file = os.path.join(src_dir, a)
        dst_file = os.path.join(dst_dir, a)
        print(src_file)
        print(dst_file)
The output will have an extra dot directory (.) between src/dst and root. If anyone could tell me how to get rid of it, I'm all ears.
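One way to avoid the dot component, offered as a sketch rather than the answerer's method: walk src directly, compute each directory's path relative to src with os.path.relpath, and let os.path.normpath collapse the "." that relpath returns for src itself. The function name paired_paths is hypothetical.

```python
import os

def paired_paths(src, dst):
    """Yield (source_file, destination_file) pairs for every file under src."""
    for root, _subdirs, files in os.walk(src):
        # Path of the directory being visited relative to src:
        # "A" for src/A, and "." for src itself.
        rel = os.path.relpath(root, src)
        # normpath collapses the "." component for the top-level directory.
        dst_dir = os.path.normpath(os.path.join(dst, rel))
        for name in files:
            yield os.path.join(root, name), os.path.join(dst_dir, name)
```

Looping over paired_paths(src, dst) and printing each pair gives the same listing as above without the extra dot, and without needing os.chdir.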

Scanning for file paths with glob

I am searching for all .csv's located in a subfolder with glob like so:
def scan_for_files(path):
    file_list = []
    for path, dirs, files in os.walk(path):
        for d in dirs:
            for f in glob.iglob(os.path.join(path, d, '*.csv')):
                file_list.append(f)
    return file_list
If I call:
path = r'/data/realtimedata/trades/bitfinex/'
scan_for_files(path)
I get the correct recursive list of files:
['/data/realtimedata/trades/bitfinex/btcusd/bitfinex_btcusd_trades_2018_05_12.csv',
'/data/realtimedata/trades/bitfinex/btcusd/bitfinex_btcusd_trades_2018_05_13.csv',
'/data/realtimedata/trades/bitfinex/btcusd/bitfinex_btcusd_trades_2018_05_15.csv',
'/data/realtimedata/trades/bitfinex/btcusd/bitfinex_btcusd_trades_2018_05_11.csv',
'/data/realtimedata/trades/bitfinex/btcusd/bitfinex_btcusd_trades_2018_05_09.csv',
'/data/realtimedata/trades/bitfinex/btcusd/bitfinex_btcusd_trades_2018_05_10.csv',
'/data/realtimedata/trades/bitfinex/btcusd/bitfinex_btcusd_trades_2018_05_08.csv',
'/data/realtimedata/trades/bitfinex/btcusd/bitfinex_btcusd_trades_2018_05_14.csv',
'/data/realtimedata/trades/bitfinex/ethusd/bitfinex_ethusd_trades_2018_05_14.csv',
'/data/realtimedata/trades/bitfinex/ethusd/bitfinex_ethusd_trades_2018_05_12.csv',
'/data/realtimedata/trades/bitfinex/ethusd/bitfinex_ethusd_trades_2018_05_10.csv',
'/data/realtimedata/trades/bitfinex/ethusd/bitfinex_ethusd_trades_2018_05_08.csv',
'/data/realtimedata/trades/bitfinex/ethusd/bitfinex_ethusd_trades_2018_05_09.csv',
'/data/realtimedata/trades/bitfinex/ethusd/bitfinex_ethusd_trades_2018_05_15.csv',
'/data/realtimedata/trades/bitfinex/ethusd/bitfinex_ethusd_trades_2018_05_11.csv',
'/data/realtimedata/trades/bitfinex/ethusd/bitfinex_ethusd_trades_2018_05_13.csv']
However, when I pass the actual sub-directory containing the files I want, it returns an empty list. Any idea why this is happening? Thanks.
path = r'/data/realtimedata/trades/bitfinex/btcusd/'
scan_for_files(path)
returns: []
Looks like btcusd is a bottom-level directory. That means that when you call os.walk with the r'/data/realtimedata/trades/bitfinex/btcusd/' path, the dirs variable will be an empty list [], so the inner loop for d in dirs: does not execute at all.
My advice would be to rewrite your function to iterate over the files directly rather than the directories. Don't worry about missing anything: os.walk visits every subdirectory in turn, so you'll get to all of them eventually; that's the nature of a directory tree.
def scan_for_files(path):
    file_list = []
    for dirpath, _, files in os.walk(path):
        for f in files:
            # keep only the .csv files found in the directory being visited
            if f.endswith('.csv'):
                file_list.append(os.path.join(dirpath, f))
    return file_list
However, on more recent versions of Python (3.5+), you can use a recursive glob:
def scan_for_files(path):
    return glob.glob(os.path.join(path, '**', '*.csv'), recursive=True)
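As a further alternative, and only a sketch rather than part of the original answer, pathlib's Path.rglob searches the starting directory and everything below it, so the same call works for both the parent directory and a bottom-level one such as btcusd. The function name scan_for_files_pathlib is hypothetical.

```python
from pathlib import Path

def scan_for_files_pathlib(path, pattern="*.csv"):
    """Return matching files in `path` itself and in every directory below it."""
    # rglob searches the starting directory and all of its subdirectories.
    return sorted(str(p) for p in Path(path).rglob(pattern))
```

The result is sorted so that the order does not depend on how the file system happens to list the entries.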

count number of folders with given name

I am looking to get a count of the folders and subfolders with a given name. Here I am searching for the number of subfolders named "L-4". It returns zero, and I am sure that's not right. What did I miss?
import os
path = "R:\\"
i = 0
for (path, dirs, files) in os.walk(path):
    if os.path.dirname == "L-4":
        i += 1
print(i)
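os.path.dirname is a reference to the standard library function, not a string, so comparing it with "L-4" is always false. Perhaps you wanted to call os.path.dirname(path) instead; note, though, that dirname returns the parent path, and it is os.path.basename(path) that returns the folder's own name.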
You could instead count how many times L-4 appears in the dirs list:
i = 0
for root, dirs, files in os.walk(path):
    i += dirs.count('L-4')
print(i)
or, as a one-liner:
print(sum(dirs.count('L-4') for _, dirs, _ in os.walk(path)))
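For comparison, here is a sketch of the approach the question seems to have been aiming for: testing the name of each directory as os.walk visits it. It uses os.path.basename, which returns the last path component. The function name count_dirs_named is hypothetical, and for subfolders the result should match the dirs.count() version above.

```python
import os

def count_dirs_named(top, name):
    """Count directories called `name` anywhere below `top` (excluding `top` itself)."""
    count = 0
    for dirpath, _dirs, _files in os.walk(top):
        # basename gives the folder's own name; dirname would give its parent path.
        if dirpath != top and os.path.basename(dirpath) == name:
            count += 1
    return count
```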

How to get the number of subfolders and folders using Python os.walk?

I have a directory of folders and subfolders. What I'm trying to do is get the number of subfolders within each folder and plot them on a scatter plot using matplotlib. I have the code to get the number of files, but how would I get the number of subfolders within a folder? This probably has a simple answer, but I'm a newb to Python. Any help is appreciated.
This is the code I have so far to get the number of files:
import os
import matplotlib.pyplot as plt
def fcount(path):
    count1 = 0
    for f in os.listdir(path):
        if os.path.isfile(os.path.join(path, f)):
            count1 += 1
    return count1
path = "/Desktop/lay"
print(fcount(path))
import os
def fcount(path, counts=None):
    # counts maps each visited folder to its number of nested subfolders
    if counts is None:
        counts = {}
    count = 0
    for f in os.listdir(path):
        child = os.path.join(path, f)
        if os.path.isdir(child):
            child_count = fcount(child, counts)
            count += child_count + 1 # unless include self
    counts[path] = count
    return count
path = "/Desktop/lay"
counts = {}
print(fcount(path, counts))
Here is a full, tested implementation. It returns the number of subfolders, not counting the current folder itself. If you want to include the current folder, move the + 1 from the commented line to the final return statement.
I think os.walk could be what you are looking for:
import os
def fcount(path):
    count1 = 0
    for root, dirs, files in os.walk(path):
        count1 += len(dirs)
    return count1
path = "/home/"
print(fcount(path))
This will give you the total number of directories under the given path, at every depth.
Try the following recipe:
import glob
# The trailing slash restricts the match to directories
folders = glob.glob("path/*/")
print(len(folders))
In answer to:
how would I get the number of subfolders within a folder
You can use the os.path.isdir function similarly to os.path.isfile to count directories.
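A minimal sketch of that suggestion, mirroring the fcount function from the question. The name dcount is hypothetical, and the function counts only the immediate subfolders of path, not nested ones.

```python
import os

def dcount(path):
    """Count the immediate subdirectories of path (not recursive)."""
    count = 0
    for entry in os.listdir(path):
        # isdir needs the full path, not just the bare entry name.
        if os.path.isdir(os.path.join(path, entry)):
            count += 1
    return count
```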
I guess you are looking for os.walk. The Python reference says:
os.walk(top, topdown=True, onerror=None, followlinks=False)
Generate the file names in a directory tree by walking the tree either top-down
or bottom-up. For each directory in the tree rooted at directory top
(including top itself), it yields a 3-tuple (dirpath, dirnames,
filenames).
So, you can try to do this to get only the directories:
for root, dirs, files in os.walk('/usr/bin'):
for name in dirs:
print os.path.join(root, name)
count += 1
