How to read folder structure and assign it to datastructure? - python

I'm just starting out with Python, and I'm trying to accomplish the following.
I have a folder structure (simplified):
.
├── folder1
│ ├── file1
│ └── file2
├── folder2
│ └── file3
└── folder3
    ├── file4
    ├── file5
    └── file6
I'd like to read the filenames into some kind of data structure that can distinguish which files are from the same folder. I've used glob for the single-folder case, but is it possible to get, for example, the following data structure using glob?
files = [{file1, folder1}, {file2, folder1}, {file3, folder2}...]

I assume you'd rather get this kind of structure:
files = {folder1: [file1, file2], folder2: [file3], ...}
The following code will do the trick:
import os

rootDir = '.'
files = {}
# os.walk yields (directory path, subdirectory names, file names) for every directory
for dirName, subdirList, fileList in os.walk(rootDir):
    files[dirName] = fileList
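Note that the keys produced by os.walk are paths such as './folder1', and the root directory itself also gets an entry. If you would rather have keys relative to the root, something like the following might work. It is only a sketch: the function names files_by_folder and as_pairs are made up for this example, and folders without files are skipped by choice.

```python
import os


def files_by_folder(root_dir):
    """Map each folder (path relative to root_dir) to a sorted list of its file names."""
    result = {}
    for dir_name, _subdirs, file_list in os.walk(root_dir):
        if file_list:  # assumption: folders without files are not interesting
            result[os.path.relpath(dir_name, root_dir)] = sorted(file_list)
    return result


def as_pairs(grouped):
    """Flatten the mapping into (file, folder) pairs, close to the list asked for in the question."""
    return [(name, folder) for folder, names in sorted(grouped.items()) for name in names]
```

Calling files_by_folder('.') on the example tree should give a dict keyed by 'folder1', 'folder2' and 'folder3'; files that sit directly in the root would appear under the key '.'.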

Related

How do I match the file name from different directories and replace the partial filename with the actual filename?

So I have a slightly complicated issue that I need some help with :(
In Directory 1, I have the filenames as follows:
00HFP.mp4
0AMBV.mp4
2D5GN.mp4
3HVKR.mp4
3IJGQ.mp4
In Directory 2, I did some processing to the mp4s and got some output files:
_0HFP.usd
_AMBV.usd
_D5GN.usd
_HVKR.usd
_IJGQ.usd
For some reason, the programme I'm using replaces the first number/character with an underscore for some files. Other files are generally left alone. But I need the filenames to match :( How do I do a mass renaming (over 500 files) based on this partial naming using a Python script? For example, _0HFP.usd should become 00HFP.usd since there's a 00HFP.mp4 file in Directory 1.
Please help :( Thank you!
I tried this (as suggested by Corralien), but it still doesn't work for me :(
import pathlib

dir1 = pathlib.Path('./mnt/d/Downloads/Charades_v1_480/charades_18Jan/done/')
dir2 = pathlib.Path('./mnt/d/Downloads/Charades_v1_480/charades_18Jan_anim/pt-charades-output/')
print('i am here')
for f1 in dir1.glob('*.mp4'):
    print(f1)
    f2 = dir2 / f'_{f1.stem[1:]}.usd'
    if f2.exists():
        f2.rename(dir2 / f'{f1.stem}.usd')
Suppose the following directories:
Dir1
├── 00HFP.mp4
├── 0AMBV.mp4
├── 2D5GN.mp4
├── 3HVKR.mp4
└── 3IJGQ.mp4
Dir2
├── _0HFP.usd
├── _AMBV.usd
├── _D5GN.usd
├── _HVKR.usd
└── _IJGQ.usd
Try:
import pathlib

dir1 = pathlib.Path('./Dir1')
dir2 = pathlib.Path('./Dir2')
for f1 in dir1.glob('*.mp4'):
    f2 = dir2 / f'_{f1.stem[1:]}.usd'
    if f2.exists():
        f2.rename(dir2 / f'{f1.stem}.usd')
After processing:
Dir1
├── 00HFP.mp4
├── 0AMBV.mp4
├── 2D5GN.mp4
├── 3HVKR.mp4
└── 3IJGQ.mp4
Dir2
├── 00HFP.usd
├── 0AMBV.usd
├── 2D5GN.usd
├── 3HVKR.usd
└── 3IJGQ.usd
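With over 500 files it may be worth checking what would be renamed before touching anything. The sketch below splits the same idea into a planning step and an apply step. The names plan_renames and apply_renames are hypothetical, and skipping ambiguous matches (two .mp4 names that differ only in their first character) is an assumption about what you would want, not something stated in the question.

```python
import pathlib


def plan_renames(dir1, dir2):
    """Return (old_path, new_path) pairs for the '_xxxx.usd' files in dir2."""
    dir1, dir2 = pathlib.Path(dir1), pathlib.Path(dir2)
    # Group the .mp4 stems by everything after their first character
    candidates = {}
    for f1 in dir1.glob('*.mp4'):
        candidates.setdefault(f1.stem[1:], []).append(f1.stem)
    plan = []
    for f2 in sorted(dir2.glob('_*.usd')):
        stems = candidates.get(f2.stem[1:], [])
        if len(stems) == 1:  # skip files with no match or an ambiguous match
            plan.append((f2, f2.with_name(stems[0] + '.usd')))
    return plan


def apply_renames(plan):
    """Perform the planned renames, never overwriting an existing file."""
    for old, new in plan:
        if not new.exists():
            old.rename(new)
```

You could print the result of plan_renames('./Dir1', './Dir2') first and only call apply_renames on it once the pairs look right.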

Export specific CSV filename (eg: abc*.csv) from list of folders within a folder using python through looping method

How can I iterate through multiple folders within a folder, pick out the CSV files whose names match a pattern such as "abc*.csv", and export them into a new folder?
I searched Stack Overflow for similar examples, but most of them read multiple CSV files within a folder and combine them into one data frame. Thanks.
import os
import shutil

source_dir = './files/'
dest_dir = './export/'

# Make sure the destination folder exists
os.makedirs(dest_dir, exist_ok=True)

# os.walk already descends into every subdirectory
for dirpath, dirnames, filenames in os.walk(source_dir):
    for name in filenames:
        # Test the bare file name, not the full path
        if name.startswith('abc') and name.endswith('.csv'):
            shutil.copy(os.path.join(dirpath, name), dest_dir)
            # shutil.move(os.path.join(dirpath, name), dest_dir)  # Use .move to move the file instead
If you have this file structure:
files
├── dir1
│   ├── abc789.csv
│   └── abc789.txt
├── dir2
│   ├── abc098.csv
│   └── abc098.txt
└── dir3
    ├── abc456.csv
    └── subdir3
        ├── abc123.csv
        └── abc123.txt
Only the matching files will be exported:
$ ls export/
abc098.csv abc123.csv abc456.csv abc789.csv
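The same export can probably be written more compactly with pathlib, since Path.rglob searches subdirectories recursively. This is just a sketch: the function name export_matching and its default pattern are invented for the example, and it assumes the destination folder is not inside the source folder and that file names are unique across subfolders (otherwise later copies overwrite earlier ones).

```python
import shutil
from pathlib import Path


def export_matching(source_dir, dest_dir, pattern='abc*.csv'):
    """Copy every file under source_dir whose name matches pattern into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    # sorted() collects all matches before the first copy happens
    for src in sorted(Path(source_dir).rglob(pattern)):
        if src.is_file():
            shutil.copy(src, dest / src.name)
            copied.append(src.name)
    return copied
```

For the tree above, export_matching('./files', './export') should copy the four abc*.csv files and leave the .txt files alone.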

Directory compression in python

I will try to explain it with an example.
abc
└── test
    ├── dir1
    ├── dir2
    └── not_for_zipping.txt
I want to compress all directories in the test dir (in this example, dir1 and dir2).
Right now I do it like this:
directory = dlg.lineEdit_zipfile_path2.text()  # this should be the path to the test dir (.../abc/test/)
arr = os.listdir(directory)
for item in arr:
    allfiles2zip = directory + item
    try:
        shutil.make_archive(item, 'zip', allfiles2zip)
    except OSError:
        pass
It looks like it is working, but the archives for dir1 and dir2 end up in .../abc/:
abc
├── dir1.zip
├── dir2.zip
└── test
    ├── dir1
    ├── dir2
    └── not_for_zipping.txt
but I would like to get those files in the selected path (directory), i.e. .../abc/test/:
abc
└── test
    ├── dir1
    ├── dir2
    ├── not_for_zipping.txt
    ├── dir1.zip
    └── dir2.zip
Do you have any idea how I can change it?
By the way, is there a better way to handle this case?
You can include the target path in the archive name:
shutil.make_archive('test/' + item, 'zip', ...)
Alternatively, you can change the working directory before compressing:
old_folder = os.getcwd()
os.chdir('test')
shutil.make_archive(item, 'zip', ...)
os.chdir(old_folder)
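Putting the first suggestion together, a loop over the subdirectories might look roughly like this. It is a sketch only: zip_subdirectories is a made-up name, and it relies on shutil.make_archive taking the archive path without the '.zip' suffix as its first argument and the folder to compress as root_dir.

```python
import os
import shutil


def zip_subdirectories(directory):
    """Create <name>.zip next to every subdirectory <name> of directory."""
    archives = []
    for item in sorted(os.listdir(directory)):
        source = os.path.join(directory, item)
        if os.path.isdir(source):  # plain files such as not_for_zipping.txt are skipped
            # base_name is the full archive path without ".zip", so the
            # archive lands inside directory rather than the current working dir
            archives.append(shutil.make_archive(source, 'zip', root_dir=source))
    return archives
```

Here each archive contains the contents of its folder at the top level. If you want the folder name to appear inside the archive as well, passing root_dir=directory and base_dir=item to make_archive should do that instead.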

Rename part of filenames in sub-folders python

I have to rename images in a main directory that contains sub-folders. The script I'm using right now does part of the work, but not exactly what I need, and I can't find a way to do it properly. This is what I have now:
maindir #my example origin
├── Sub1
│ ├── example01.jpg
│ ├── example02.jpg
│ └── example03.jpg
└── Sub2
    ├── example01.jpg
    ├── example02.jpg
    └── example03.jpg
My script does this:
maindir
├── Sub1
│ ├── Sub1_example01.jpg
│ ├── Sub1_example02.jpg
│ └── Sub1_example03.jpg
└── Sub2
    ├── Sub2_example01.jpg
    ├── Sub2_example02.jpg
    └── Sub2_example03.jpg
And I would like to get this: replace the letters in my filenames with my sub-folder name and keep the original numbers of my jpg files:
maindir
├── Sub1
│ ├── Sub1_01.jpg
│ ├── Sub1_02.jpg
│ └── Sub1_03.jpg
└── Sub2
    ├── Sub2_01.jpg
    ├── Sub2_02.jpg
    └── Sub2_03.jpg
Here is the code I'm using:
from os import walk, path, rename

parent = ("F:\\PS\\maindir")
for dirpath, _, files in walk(parent):
    for f in files:
        rename(path.join(dirpath, f), path.join(dirpath, path.split(dirpath)[-1] + '_' + f))
What do I have to change here to get my result?
Instead of this line:
rename(path.join(dirpath, f), path.join(dirpath, path.split(dirpath)[-1] + '_' + f))
generate a new name using str.replace:
newf = f.replace("example", path.basename(dirpath) + "_")
then rename with it:
rename(path.join(dirpath, f), path.join(dirpath, newf))
Of course, if you don't know the extension or the "prefix" of the input file and only want to keep the number and the extension, there's a way:
import re
number = (re.findall(r"\d+", f) or ['000'])[0]
This extracts the first number from the name and, if none is found, falls back to 000.
Then rebuild newf from the folder name, the extracted number and the original extension (path.splitext already returns the extension with its leading dot):
newf = "{}_{}{}".format(path.basename(dirpath), number, path.splitext(f)[1])
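Put together, the number-based variant could look something like the sketch below. The helper names build_name and rename_images are invented here, and searching for the number only in the part before the extension is my own choice (so that the 4 in ".mp4", for instance, is never picked up).

```python
import re
from os import path, rename, walk


def build_name(dirpath, filename):
    """Return '<folder>_<number><ext>' for a file located in dirpath."""
    stem, extension = path.splitext(filename)  # extension keeps its leading dot
    match = re.search(r"\d+", stem)            # first run of digits in the stem
    number = match.group() if match else "000"
    return "{}_{}{}".format(path.basename(dirpath), number, extension)


def rename_images(parent):
    """Rename every file below parent according to build_name."""
    for dirpath, _, files in walk(parent):
        for f in files:
            # caution: two files without a number in the same folder would
            # both map to '<folder>_000<ext>'
            rename(path.join(dirpath, f), path.join(dirpath, build_name(dirpath, f)))
```

For example, build_name("maindir/Sub1", "example01.jpg") should give "Sub1_01.jpg".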

Python. Rename files in subdirectories

Could you please help me modify the script below so that it also renames files in subdirectories?
def change():
    path = e.get()
    for filename in os.walk(path):
        for ele in filename:
            if type(ele) == type([]) and len(ele)!=0:
                for every_file in ele:
                    if every_file[0:6].isdigit():
                        number = every_file[0:6]
                        name = every_file[6:]
                        x = int(number)+y
                        newname = (str(x) + name)
                        os.rename(os.path.join(path, every_file), os.path.join(path, newname))
I don't know what constraints you have on file names, so I wrote a general script just to show you how to change the names of files in a given folder and all of its subfolders.
The test folder has the following tree structure:
~/test$ tree
.
├── bye.txt
├── hello.txt
├── subtest
│   ├── hey.txt
│   ├── lol.txt
│   └── subsubtest
│       └── good.txt
└── subtest2
    └── bad.txt
3 directories, 6 files
As you can see, all files have the .txt extension.
The script that renames all of them is the following:
import os

def main():
    path = "/path/to/your/folder"
    count = 1
    # root is the directory currently being visited, not the top-level path
    for root, dirs, files in os.walk(path):
        for i in files:
            os.rename(os.path.join(root, i), os.path.join(root, "changed" + str(count) + ".txt"))
            count += 1

if __name__ == '__main__':
    main()
The count variable is useful only to have different names for every file; probably you can get rid of it.
After executing the script, the folder looks like this:
~/test$ tree
.
├── changed1.txt
├── changed2.txt
├── subtest
│   ├── changed4.txt
│   ├── changed5.txt
│   └── subsubtest
│       └── changed6.txt
└── subtest2
    └── changed3.txt
3 directories, 6 files
I think the problem in your code is that you join every file name with the top-level path instead of the root that os.walk yields for each directory, so files in subdirectories are never found.
Hope this helps.
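Applied to the original question (files whose names start with six digits, shifted by some offset), the same os.walk pattern might look like this. It is a sketch under a few assumptions of mine: the function name shift_prefixes is invented, the offset is passed in instead of coming from the undefined y, the new number is zero-padded back to six digits, and files are processed in an order that avoids overwriting a neighbour that has not been renamed yet.

```python
import os


def shift_prefixes(top, offset):
    """Add offset to the 6-digit prefix of every matching file below top."""
    renamed = []
    for root, _dirs, files in os.walk(top):
        # For a positive offset, handle the highest numbers first so that
        # e.g. 000001x is not moved onto a still-unrenamed 000002x
        for name in sorted(files, reverse=offset > 0):
            prefix, rest = name[:6], name[6:]
            if len(prefix) != 6 or not prefix.isdecimal():
                continue
            new_number = int(prefix) + offset
            if new_number < 0:
                continue  # assumption: negative numbers are not wanted
            new_name = str(new_number).zfill(6) + rest
            # join with root (the directory being visited), not with top
            os.rename(os.path.join(root, name), os.path.join(root, new_name))
            renamed.append((name, new_name))
    return renamed
```

Dropping the zfill call would reproduce the unpadded str(x) naming from the question.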
