Python parse dirs, remove subdirectory tree - python

I have a long list of directories that look something like this:
C:\Users\vanstrie\Desktop\ntnu\SCHEMA\2012\07_paper\results\026\onsets
I want to parse through folders 001-040 (026 shown above) and remove the onsets subdirectory with all files and subfolders that are in it. I am unsure how to achieve this with python 3. If you have a solution, please advise. Many thanks in advance.
Niels

I would think that something like this should work...
import glob
import os.path
import shutil
files_dirs = glob.glob(r'C:\Users\vanstrie\Desktop\ntnu\SCHEMA\2012\07_paper\results\*')
for d in files_dirs:
    head, tail = os.path.split(d)
    try:
        if (0 < int(tail) < 41) and (len(tail) == 3):  # don't want to delete `\results\3\onsets` I guess...
            print("about to delete:", d)
            shutil.rmtree(os.path.join(d, 'onsets'), ignore_errors=True)
    except ValueError:  # apparently we got a non-integer. Leave that directory.
        pass
As with anything when deleting files, I would definitely print the things that would be deleted on a first pass -- Just to make sure the script is actually working as expected (and to make sure you don't delete something you want to keep).
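A minimal sketch of that "print first, delete second" idea, assuming the numbered folders 001 to 040 sit directly under the results directory. The function name `remove_onsets` and its `dry_run` flag are made up for illustration; they are not part of the answer above.

```python
import shutil
from pathlib import Path


def remove_onsets(results_root, first=1, last=40, dry_run=True):
    """Find the 'onsets' subdirectory in folders 001..040 under results_root.

    With dry_run=True nothing is deleted; the matching paths are only returned.
    With dry_run=False each matching 'onsets' tree is removed.
    """
    targets = []
    for i in range(first, last + 1):
        # Zero-padded three-digit folder name, e.g. 026
        candidate = Path(results_root) / f"{i:03d}" / "onsets"
        if candidate.is_dir():
            targets.append(candidate)
            if not dry_run:
                shutil.rmtree(candidate)
    return targets
```

Call it once with the default `dry_run=True` and print the returned paths. Only when the list looks right, call it again with `dry_run=False`.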

import shutil, os.path
root_folder = "C:\\Users\\vanstrie\\Desktop\\ntnu\\SCHEMA\\2012\\07_paper\\results"
suffix = "onsets"
for i in range(1, 41):
    folder = os.path.join(root_folder, "%03d" % i, suffix)
    shutil.rmtree(folder, ignore_errors=True, onerror=None)

Related

Run a script for each txt file in all subfolders

I need to run the following script for each txt file located in all subfolders.
The main folder is "simulations" in which there are different subfolders (called as "year-month-day"). In each subfolder there is a txt file "diagno.inp". I have to run this script for each "diagno.inp" file in order to have a list with the following data (a row for each day):
"year-month-day", "W_int", "W_dir"
Here's the code that is working for only a subfolder. Can you help me to create a loop?
fid = open('/Users/silviamassaro/weather/simulations/20180105/diagno.inp', "r")
subfolder = "20180105"
n = fid.read().splitlines()[51:]
for element in n:
    "do something"  # here code to calculate W_dir and W_int for each day
print(subfolder, W_int, W_dir)
Here's what I usually do when I need to loop over a directory and its children recursively:
import os
main_folder = '/path/to/the/main/folder'
files_to_process = [os.path.join(main_folder, child) for child in os.listdir(main_folder)]
while files_to_process:
    child = files_to_process.pop()
    if os.path.isdir(child):
        files_to_process.extend(os.path.join(child, sub_child) for sub_child in os.listdir(child))
    else:
        # We have a file here, we can do what we want with it
        pass
It's short, but it makes some pretty strong assumptions:
You don't care about the order in which the files are processed.
Everything under your entry point is either a directory or a regular file.
Edit: added another possible solution using glob, thanks to @jacques-gaudin's comment.
This solution has the advantage that you are sure to get only .inp files, but you still can't be sure of their order.
import glob
main_folder = '/path/to/the/main/folder'
files_to_process = glob.glob('%s/**/*.inp' % main_folder, recursive=True)
for found_file in files_to_process:
    # We have a file here, we can do what we want with it
    pass
Hope this helps!
With pathlib you can do something like this:
from pathlib import Path
sim_folder = Path("path/to/simulations/folder")
for inp_file in sim_folder.rglob('*.inp'):
    subfolder = inp_file.parent.name
    with open(inp_file, 'r') as fid:
        n = fid.read().splitlines()[51:]
    for element in n:
        "do something"  # here code to calculate W_dir and W_int for each day
    print(subfolder, W_int, W_dir)
Note this is recursively traversing all subfolders to look for .inp files.
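If the order of the rows matters, one option is to sort the paths before reading them. The sketch below assumes the subfolders are named like 20180105, so sorting by path also sorts by date. The helper name `iter_diagno_files` is hypothetical, and the W_int / W_dir calculation is left to the caller because the question does not show it.

```python
from pathlib import Path


def iter_diagno_files(sim_folder, filename="diagno.inp"):
    """Yield (subfolder_name, lines) for every matching file, in sorted path order."""
    for inp_file in sorted(Path(sim_folder).rglob(filename)):
        with open(inp_file, "r") as fid:
            # Skip the first 51 lines, as in the question's snippet.
            lines = fid.read().splitlines()[51:]
        yield inp_file.parent.name, lines
```

Each yielded pair gives you the day (the folder name) and the lines to feed into your own calculation, so you can build one row per day in a plain for loop.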

Recursively copy a folder and change folder/file names of copied files

I need to copy a large folder and rename all the files and folders inside if they contain a specific string. Basically I want to copy everything and change any instance of 10 to 11.
For example if I have a folder structured like this:
mainfolder10
-group10
-group10.js
-group10.html
I want it to copy it like this:
mainfolder11
-group11
-group11.js
-group11.html
I could also copy it first with cp -r mainfolder10/ mainfolder11/ and then use a different command or script to rename the files. I'm just looking for anything to not have to do this manually.
I am looking to accomplish this in either bash, node, or python...whatever you all recommend. Does anyone know a simple way to do this?
The usual technique for recursion over directories and files is to use os.walk():
import os
for root, dirs, files in os.walk('somepath'):
    ...
From there, you can use os.rename() or any of the shutil functions as needed on a file-by-file or directory-by-directory basis.
To avoid confusion, I would rename all the files on the first pass and then make a second pass to rename the directories.
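A rough sketch of that approach, with one change: instead of two passes it walks bottom-up (`topdown=False`), so each directory is renamed only after its contents have been handled. The function name `copy_and_rename` and its parameters are made up for illustration.

```python
import os
import shutil


def copy_and_rename(src, dst, needle, replacement):
    """Copy src to dst, then rename every entry in dst whose name contains needle."""
    shutil.copytree(src, dst)
    # topdown=False visits the deepest directories first, so a directory is
    # renamed only after everything inside it has already been processed.
    for root, dirs, files in os.walk(dst, topdown=False):
        for name in files + dirs:
            if needle in name:
                os.rename(
                    os.path.join(root, name),
                    os.path.join(root, name.replace(needle, replacement)),
                )
```

For the example in the question this would be called as `copy_and_rename("mainfolder10", "mainfolder11", "10", "11")`. Note that a plain string replace also touches names like `file100`, so a stricter match may be needed for real data.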
So I finally ended up accomplishing this with the following script.
#!/usr/bin/env python
# coding: utf-8
from pathlib import Path
import shutil
import tempfile
# Originally joined onto Path.home(); joining an absolute path discards the left side, so the result is the same.
cohorts_master = Path("/Users/leo/Desktop/repos/drive-scripts")
NEEDLE, REPLACEMENT = (f"C{i}" for i in range(39, 41))
src = cohorts_master / f"MIA {NEEDLE}"
assert src.exists(), f"'{src}' does not exist"
dest = Path(tempfile.mkdtemp()) / f"MIA {REPLACEMENT}"
print(dest)
try:  # I get errors here. Likely permissions problems. I was getting a partial copy
    shutil.copytree(src, dest)
except Exception as err:
    print(f"copytree reported: {err}")
# Rename the deepest entries first so that parent paths stay valid while renaming.
for item in sorted(dest.glob("**/*"), key=lambda p: len(p.parts), reverse=True):
    if NEEDLE in item.name:
        item.rename(item.with_name(item.name.replace(NEEDLE, REPLACEMENT)))
new_src = dest
new_dest = cohorts_master / dest.name.replace(
    REPLACEMENT, f"{REPLACEMENT}-copied-from-{NEEDLE}"
)
print(new_src, new_dest)
shutil.copytree(new_src, new_dest)

Using input to change directory path

I'm kinda new to python and I feel like the answer to this is so simple but I have no idea what the answer is. I'm trying to move files from one place to another but I don't want to have to change my code every time I wanna move that file so I just want to get user input from the terminal.
import shutil
loop = True
while loop:
    a = input()
    shutil.move("/home/Path/a", "/home/Path/Pictures")
What do I have to put around the a so that it doesn't read it as part of the string?
This should do what you want. os.path.join() combines the string in a, which you get from input(), with the first part of the path you provided. Prefer os.path.join() because it builds paths in a system-independent way.
import shutil
import os
loop = True
while loop:
    a = input()
    shutil.move(os.path.join("/home/Path/", a), "/home/Path/Pictures")
Output:
>>> a = input()
test.txt
>>> path = os.path.join("/home/Path/", a)
>>> path
'/home/Path/test.txt'
You can also use "/home/Path/{0}".format(a), which replaces {0} with the value of a, or you can do "/home/Path/" + str(a), which gives the same result.
Edited to account for the question in the comments:
This will work if your directory doesn't have any sub-directories. It may still work if there are both directories and files in there, but I didn't test that.
import shutil
import os
files = os.listdir("/home/Path/")
for file in files:
    shutil.move(os.path.join("/home/Path/", file), "/home/Path/Pictures")
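Since the destination in this example (Pictures) is itself a sub-directory of the source, it may be safer to move regular files only. Here is a small sketch of that, assuming you want to leave every sub-directory alone; the function name `move_files` is made up for illustration.

```python
import os
import shutil


def move_files(src_dir, dest_dir):
    """Move every regular file in src_dir into dest_dir and return the moved names."""
    os.makedirs(dest_dir, exist_ok=True)
    moved = []
    for name in os.listdir(src_dir):
        source = os.path.join(src_dir, name)
        # isfile() is False for directories, so dest_dir itself is skipped
        if os.path.isfile(source):
            shutil.move(source, os.path.join(dest_dir, name))
            moved.append(name)
    return moved
```

With the paths from the question this would be `move_files("/home/Path/", "/home/Path/Pictures")`.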
Another solution:
>>> a = 'test.csv'
>>> path = '/home/Path/{}'.format(a)
>>> path
'/home/Path/test.csv'

How do I figure out the number of folders in a directory (not subdirectories)

I have the following code, which works, but it searches all the sub-directories. I only want to search the immediate directory and to count folders only; I don't need the file count. Can anyone suggest how to do that?
import os
files = folders = 0
path = "\\\\snowcone\\builds708\\INTEGRATION\\CI_LA.UM.5.7-45903-8x98.1-4\\LINUX\\android\\out\\target\\product"
for _, dirnames, filenames in os.walk(path):
  # ^ this idiom means "we won't be using this value"
    files += len(filenames)
    folders += len(dirnames)
print("{:,} files, {:,} folders".format(files, folders))
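You can do this instead (os.listdir() returns bare names, so join each one with path before testing it):
import os
len([i for i in os.listdir(path) if os.path.isdir(os.path.join(path, i))])
Or, as recommended (this avoids creating a list):
import os
sum(os.path.isdir(os.path.join(path, i)) for i in os.listdir(path))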
Well even though a very good answer has already been given to you, here's another one just for the sake of it. I reckon the solution is less elegant and less efficient though.
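import os
path = 'your_path_goes_here'
# os.listdir() returns bare names, so build full paths before calling isdir
number_of_dirs = len(list(filter(os.path.isdir, (os.path.join(path, name) for name in os.listdir(path)))))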
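For Python 3, another option is os.scandir(), which lists only the immediate entries of a directory and lets you check each one with is_dir(). A minimal sketch, with `count_dirs` as a made-up helper name:

```python
import os


def count_dirs(path):
    """Count the immediate sub-directories of path (no recursion)."""
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_dir())
```

Note that `entry.is_dir()` follows symbolic links by default, so a symlink pointing at a directory is counted as well.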

Python, Opening files in loop (dicom)

I am currently reading in 200 dicom images manually using the code:
ds1 = dicom.read_file('1.dcm')
So far this has worked, but I am trying to make my code shorter and easier to use by creating a loop that reads in the files:
for filename in os.listdir(dirName):
    dicom_file = os.path.join("/", dirName, filename)
    exists = os.path.isfile(dicom_file)
    print(filename)
    ds = dicom.read_file(dicom_file)
This code is not currently working and I am receiving the error:
raise InvalidDicomError("File is missing 'DICM' marker. "
dicom.errors.InvalidDicomError: File is missing 'DICM' marker. Use force=True to force reading
Could anyone advise me on where I am going wrong, please?
I think the line:
dicom_file = os.path.join("/",dirName,filename)
might be an issue? It will join all three to form a path rooted at '/'. For example:
os.path.join("/","directory","file")
will give you "/directory/file" (an absolute path), while:
os.path.join("directory","file")
will give you "directory/file" (a relative path)
If you know that all the files you want are "*.dcm", you can try the glob module:
import glob
files_with_dcm = glob.glob("*.dcm")
This will also work with full paths:
import glob
files_with_dcm = glob.glob("/full/path/to/files/*.dcm")
But also, os.listdir(dirName) will include everything in the directory, including other directories, dot files, and whatnot.
Your exists = os.path.isfile(dicom_file) line will filter out all the non-files if you put an "if exists:" before reading.
I would recommend the glob approach, if you know the pattern, otherwise:
from dicom.errors import InvalidDicomError  # goes at the top of the script
if exists:
    try:
        ds = dicom.read_file(dicom_file)
    except InvalidDicomError as exc:
        print("something wrong with " + dicom_file)
If you do a try/except, the if exists: is a bit redundant, but doesn't hurt...
Try adding:
dicom_file = os.path.join("/", dirName, filename)
if not dicom_file.endswith('.dcm'):
    continue
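Putting the two filters from the answers together (regular files only, and only names ending in .dcm), a small sketch could look like this. The helper name `list_dicom_paths` is made up, and the case-insensitive extension check is an assumption, not something the question asked for.

```python
import os


def list_dicom_paths(dir_name, extension=".dcm"):
    """Return the sorted full paths of regular files in dir_name ending in extension."""
    paths = []
    for filename in os.listdir(dir_name):
        full_path = os.path.join(dir_name, filename)
        # Skip directories and anything that does not look like a DICOM file
        if os.path.isfile(full_path) and filename.lower().endswith(extension):
            paths.append(full_path)
    return sorted(paths)
```

Each returned path can then be passed to dicom.read_file() in a loop, as in the question.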
