Walk subdirectories in Python starting from the subdirectories

I've the following dir structure:
root
└── env
    ├── team_1
    │   ├── policies
    │   │   └── file.yaml
    │   └── roles
    └── team_2
        ├── policies
        └── roles
and I need to read all the files under a team directory and merge them to create one unique file.
This is my attempt:
env_path = os.path.join('root', env)
if os.path.exists(env_path):
    for team_dir in os.listdir(env_path):
        for root, dirs, files in os.walk(team_dir):
            print(root, dirs, files)
The problem is that os.walk doesn't return anything when I pass team_dir. I should use os.path.join(env_path, team_dir), but at that point it returns the entire tree, which I don't want. How can I get os.walk to return the subdirectories relative to a directory that is itself a subdirectory?

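You have to use os.path.join(env_path, team_dir), or os.walk won't find anything: os.listdir returns bare names, which are resolved against the current working directory rather than env_path.
But if you don't want the whole hierarchy in the output, just remove the start of the string: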
for team_dir in os.listdir(env_path):
    for root, dirs, files in os.walk(os.path.join(env_path, team_dir)):
        for f in files + dirs:
            print(os.path.join(root, f)[len(env_path) + 1:])  # strip start of path + separator
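A variation on the same idea, as a minimal sketch: os.path.relpath can strip the prefix instead of slicing by length, and grouping the results per team gives the list of files that would then be merged. The function name files_per_team is made up for illustration, and the merge step itself is left out because the question does not say how the files should be combined.

```python
import os

def files_per_team(env_path):
    """Map each team directory name to its files, as paths relative to that team."""
    result = {}
    for team_dir in sorted(os.listdir(env_path)):
        team_path = os.path.join(env_path, team_dir)
        if not os.path.isdir(team_path):
            continue  # skip stray files sitting directly under env_path
        found = []
        for root, dirs, files in os.walk(team_path):
            for name in files:
                # relpath drops the env_path/team_dir prefix
                found.append(os.path.relpath(os.path.join(root, name), team_path))
        result[team_dir] = sorted(found)
    return result
```

With the tree from the question, files_per_team(env_path) would list policies/file.yaml under team_1 and nothing under team_2.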

Related

Get directories from the current to the n-th depth

Suppose a directory structure as:
├── parent_1
│   ├── child_1
│   │   ├── sub_child_1
│   │   │   └── file_1.py
│   │   └── file_2.py
│   └── file_3.py
├── parent_2
│   └── child_2
│       └── file_4.py
└── file_5.py
I want to get two arrays:
parents = ["parent_1", "parent_2"]
children = ["child_1", "child_2"]
Note that files and sub_child_1 are not included.
Using suggestions such as this, I can write:
parents = []
children = []
for root, dir, files in os.walk(path, topdown=True):
    depth = root[len(path) + len(os.path.sep):].count(os.path.sep)
    if depth == 0:
        parents.append(dir)
    elif depth == 1:
        children.append(dir)
However, this is a bit wordy and I was wondering if there is a cleaner way of doing this.
Update 1
I also tried a listdir-based approach:
parents = [f for f in listdir(root) if isdir(join(root, f))]
children = []
for p in parents:
    children.append([f for f in listdir(p) if isdir(join(root, p, f))])
You can clear the directories returned by os.walk to prevent it from traversing deeper when it has reached your desired depth:
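parents = []
children = []
for root, dirs, _ in os.walk(path, topdown=True):
    if root == path:
        continue  # the top level itself is neither a parent nor a child
    parents.append(os.path.basename(root))  # keep the name, not the full path
    children.extend(dirs)
    dirs.clear()  # with topdown=True, emptying dirs stops the descent here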
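The same pruning trick can be generalised to any depth. The following is only a sketch: the name dirs_by_depth and the max_depth parameter are assumptions introduced here, not part of the question or the answer.

```python
import os

def dirs_by_depth(path, max_depth=2):
    """Return one list of directory names per level, down to max_depth levels."""
    levels = [[] for _ in range(max_depth)]
    path = os.path.normpath(path)
    for root, dirs, _ in os.walk(path, topdown=True):
        dirs.sort()  # in-place sort also makes the traversal order deterministic
        rel = os.path.relpath(root, path)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        levels[depth].extend(dirs)
        if depth + 1 >= max_depth:
            dirs.clear()  # prune: do not descend below the last wanted level
    return levels
```

For the tree in the question, `parents, children = dirs_by_depth(path)` would yield the two lists asked for, with files and sub_child_1 left out.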

Python Sphinx imported method organization within a class

I have a class that imports many methods from submodules. Currently Sphinx is set up to order members 'bysource', so they are at least sorted in the order in which the submodules are imported.
What I would like, though, is some kind of header or searchable text giving the name of the file they come from.
Current Directory Structure:
my_project
├── setup.py
└── my_package
    ├── __init__.py # Classes for subpackage_1 and subpackage_2 and imports occur here
    ├── subpackage_1
    │   ├── __init__.py
    │   ├── _submodule_a.py
    │   └── _submodule_b.py
    └── subpackage_2
        ├── __init__.py
        ├── _submodule_c.py
        └── _submodule_d.py
Sphinx module rst file:
Module contents
---------------

.. automodule:: my_package
   :members:
   :undoc-members:
   :show-inheritance:
   :member-order: bysource
In the my_package __init__.py, the parent classes are defined, and all the submodule methods are imported into their related class.
class MyClass_1():
    ...
    from .subpackage_1._submodule_a import method_a
    from .subpackage_1._submodule_b import method_b, method_c

class MyClass_2():
    ...
    from .subpackage_2._submodule_c import method_d, method_e
    from .subpackage_2._submodule_d import method_f
In the resulting Sphinx documentation I see the methods under each class, but I would like to be able to see which submodule file each method came from. It doesn't have to be a subsection; merely including a header or note would be nice, without having to resort to manually listing all the modules and methods in the Sphinx file.
There are hundreds of functions in the real package, so it would be helpful for users to be able to quickly discern which submodule file a method came from when viewing the documentation.
Current Sphinx Output:
Class MyClass_1...
* method_a
* method_b
* method_c
Class MyClass_2...
* method_d
* method_e
* method_f
Desired Sphinx Output:
Class MyClass_1...
submodule_a
* method_a
submodule_b
* method_b
* method_c
Class MyClass_2...
submodule_c
* method_d
* method_e
submodule_d
* method_f
Worst case, I could put the submodule filename in the docstring of each method within the file, but I would love it if someone has figured this out in a more automated fashion.

In a cookiecutter template, add folder only if choice variable has a given value

I am creating a cookiecutter template and would like to add a folder (and the files it contains) only if a variable has a given value. For example cookiecutter.json:
{
    "project_slug": "project_folder",
    "i_want_this_folder": ["y", "n"]
}
and my template structure looks like:
template
└── {{ cookiecutter.project_slug }}
   ├── config.ini
   ├── data
   │   └── data.csv
   ├── {% if cookiecutter.i_want_this_folder == 'y' %}my_folder{% endif %}
   └── some_files
However, when I run cookiecutter template and choose 'n', I get an error:
Error: "~/project_folder" directory already exists
Is my syntax for the folder name correct?
I faced the same issue: I wanted the option to add or omit several folders with different contents (all of the folders can exist at the same time). The structure of the project is the following:
├── {{cookiecutter.project_slug}}
│ │
│ ├── folder_1_to_add_or_no
│ │ ├── file1.py
│ │ ├── file2.py
│ │ └── file3.txt
│ │
│ ├── folder_2_to_add_or_no
│ │ ├── image.png
│ │ ├── data.csv
│ │ └── file.txt
│ │
│ └── folder_3_to_add_or_no
│   ├── file1.py
│   └── some_dir
│
├── hooks
│ └── post_gen_project.py
│
└── cookiecutter.json
where the cookiecutter.json contains the following
{
    "project_owner": "some-name",
    "project_slug": "some-project",
    "add_folder_one": ["yes", "no"],
    "add_folder_two": ["yes", "no"],
    "add_folder_three": ["yes", "no"]
}
Because each folder_X_to_add_or_no directory contains different files, the trick is to generate all of them and then remove the folders for which the answer is "no". You can do this with a hook, inside the post_gen_project.py file:
# post_gen_project.py
import os
import shutil
from pathlib import Path

# Current path
path = Path(os.getcwd())
# Source path
parent_path = path.parent.absolute()

def remove(filepath):
    if os.path.isfile(filepath):
        os.remove(filepath)
    elif os.path.isdir(filepath):
        shutil.rmtree(filepath)

# Folder name -> the user's answer. cookiecutter renders the Jinja
# expressions before it runs the hook, so each value is 'yes' or 'no'.
folders_to_add = {
    'folder_1_to_add_or_no': '{{cookiecutter.add_folder_one}}',
    'folder_2_to_add_or_no': '{{cookiecutter.add_folder_two}}',
    'folder_3_to_add_or_no': '{{cookiecutter.add_folder_three}}',
}

for folder, answer in folders_to_add.items():
    # Check if user wants the folder
    add_folder = answer == 'yes'
    # User does not want folder so remove it
    if not add_folder:
        folder_path = os.path.join(
            parent_path,
            '{{cookiecutter.project_slug}}',
            folder
        )
        remove(folder_path)
Now the folders the user chose not to add will be removed.
Select add_folder_one:
1 - yes
2 - no
Choose from 1, 2 [1]:
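Because the hook is rendered by Jinja before it runs, its removal logic is awkward to test in place. One option, sketched here under the assumption that the logic is moved into a plain function (prune_folders is a hypothetical name, not part of cookiecutter), is to pass the already-rendered answers in as a dictionary:

```python
import os
import shutil

def prune_folders(project_path, choices):
    """Delete each folder in `choices` whose answer is not 'yes'.

    `choices` maps a folder name to the user's answer; returns the removed names.
    """
    removed = []
    for folder, answer in choices.items():
        target = os.path.join(project_path, folder)
        if answer != "yes" and os.path.isdir(target):
            shutil.rmtree(target)
            removed.append(folder)
    return removed
```

The hook would then only need to build the dictionary from the rendered cookiecutter values and call the function once.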
References
This answer is based on briancapello's answer on this GitHub issue.

Get full file path from GtkTreeView

So, I found a tutorial on creating a file browser using Gtk.TreeView, but I'm facing a problem: when I select a file inside a folder, I can't get the file's full path. I can get the model path, but I don't know what to do with it.
This is my project tree:
.
├── browser
│ ├── index.html
│ └── markdown.min.js
├── compiler.py
├── ide-preview.png
├── __init__.py
├── main.py
├── __pycache__
│ ├── compiler.cpython-35.pyc
│ └── welcomeWindow.cpython-35.pyc
├── pyide-settings.json
├── README.md
├── resources
│ └── icons
│   ├── git-branch.svg
│   ├── git-branch-uptodate.svg
│   └── git-branch-waitforcommit.svg
├── test.py
├── WelcomeWindow.glade
└── welcomeWindow.py
When I click on main.py the path is 4, but if I click on browser/markdown.min.js I get 0:1.
In my code I check whether the path's length (I split the path by ':') is bigger than 1. If not, I open the file normally; if it is... this is where I'm stuck. Can anyone help?
Here is my TreeSelection on changed function:
def onRowActivated(self, selection):
    # print(path.to_string()) # Might do the job...
    model, row = selection.get_selected()
    if row is not None:
        # print(model[row][0])
        path = model.get_path(row).to_string()
        pathArr = path.split(':')
        fileFullPath = ''
        if not os.path.isdir(os.path.realpath(model[row][0])):
            # self.openFile(os.path.realpath(model[row][0]))
            if len(pathArr) <= 1:
                self.openFile(os.path.realpath(model[row][0]))
            else:
                # Don't know what to do!
                pass
            self.languageLbl.set_text('Language: {}'.format(self.sbuff.get_language().get_name()))
    else:
        print('None')
Full code is available at https://github.com/raggesilver/PyIDE/blob/master/main.py
Edit 1: Just to be more specific, my problem is that when I get the name of the file from the TreeView, I can't get the path before it, so I get index.html instead of browser/index.html.
I found a solution to my problem. The logic is to walk the path (e.g. 4:3:5:0) backwards, from the selected row up through its parents, and prepend each parent's name to the path variable. So we have:
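def onRowActivated(self, selection):
    model, row = selection.get_selected()
    if row is not None:
        # Start from the selected row's own name; joining onto an empty
        # string would leave a trailing separator
        fullPath = model[row][0]
        cur = model.iter_parent(row)
        while cur is not None:
            fullPath = os.path.join(model[cur][0], fullPath)
            cur = model.iter_parent(cur)
        # do whatever with fullPath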
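The walk-up logic can also be isolated in a helper that relies only on model[iter][column] and model.iter_parent(iter), the two calls used above. This is a sketch: full_path_from_iter is a name invented here, and FakeModel is a hypothetical stand-in that exists only so the helper can be exercised without a running GTK application.

```python
import os

def full_path_from_iter(model, tree_iter, column=0):
    """Join the names of a row and all of its ancestors into one relative path."""
    if tree_iter is None:
        return None
    parts = []
    cur = tree_iter
    while cur is not None:
        parts.append(model[cur][column])  # name stored in the given column
        cur = model.iter_parent(cur)      # None once the top level is reached
    return os.path.join(*reversed(parts))


class FakeModel:
    """Hypothetical stand-in for a tree model; keys play the role of iters."""

    def __init__(self, names, parents):
        self._names = names      # key -> displayed name
        self._parents = parents  # key -> parent key (top-level keys are absent)

    def __getitem__(self, key):
        return (self._names[key],)

    def iter_parent(self, key):
        return self._parents.get(key)
```

In the handler, the call would be full_path_from_iter(model, row) after checking that row is not None.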

Python function similar to the bash find command

I have a dir structure like the following:
[me#mypc]$ tree .
.
├── set01
│   ├── 01
│   │   ├── p1-001a.png
│   │   ├── p1-001b.png
│   │   ├── p1-001c.png
│   │   ├── p1-001d.png
│   │   └── p1-001e.png
│   ├── 02
│   │   ├── p2-001a.png
│   │   ├── p2-001b.png
│   │   ├── p2-001c.png
│   │   ├── p2-001d.png
│   │   └── p2-001e.png
I would like to write a Python script to rename all *a.png to 01.png, *b.png to 02.png, and so on. First, I guess I have to use something similar to find . -name '*.png', and the most similar thing I found in Python was os.walk. However, with os.walk I have to check every file to see whether it's a png and then concatenate it with its root, which is not that elegant. I was wondering if there is a better way to do this? Thanks in advance.
For a search pattern like that, you can probably get away with glob.
from glob import glob
paths = glob('set01/*/*.png')
You can use os.walk to traverse the directory tree.
Maybe this works?
import os
for dpath, dnames, fnames in os.walk("."):
    for i, fname in enumerate([os.path.join(dpath, fname) for fname in fnames]):
        if fname.endswith(".png"):
            # Dry run: print the rename instead of performing it
            # os.rename(fname, os.path.join(dpath, "%04d.png" % i))
            print("mv %s %s" % (fname, os.path.join(dpath, "%04d.png" % i)))
For Python 3.4+ you may want to use pathlib.Path.glob() with a recursive pattern (e.g., **/*.png) instead:
Recursively iterate through all subdirectories using pathlib
https://docs.python.org/3/library/pathlib.html#pathlib.Path.glob
https://docs.python.org/3/library/pathlib.html#pathlib.Path.rglob
Check out genfind.py from David M. Beazley.
# genfind.py
#
# A function that generates files that match a given filename pattern
import os
import fnmatch

def gen_find(filepat, top):
    for path, dirlist, filelist in os.walk(top):
        for name in fnmatch.filter(filelist, filepat):
            yield os.path.join(path, name)

# Example use
if __name__ == '__main__':
    lognames = gen_find("access-log*", "www")
    for name in lognames:
        print(name)
These days, pathlib is a convenient option.
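As a minimal sketch of the pathlib route applied to the rename in the question: Path.rglob finds the files recursively, and the trailing letter of each file name is mapped to a two-digit number. The function name plan_renames is an assumption of this example, and it only yields the planned (old, new) pairs so the result can be inspected before anything is renamed.

```python
import string
from pathlib import Path

def plan_renames(top):
    """Yield (old, new) paths for *a.png -> 01.png, *b.png -> 02.png, and so on."""
    for path in sorted(Path(top).rglob("*.png")):
        letter = path.stem[-1:]  # last character before the .png suffix
        if len(letter) == 1 and letter in string.ascii_lowercase:
            number = string.ascii_lowercase.index(letter) + 1
            yield path, path.with_name("%02d.png" % number)
```

To apply the plan, loop over the pairs and call old.rename(new). Files whose name does not end in a lowercase letter are skipped.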
