I'm a rookie at programming and I'm trying to make a program in Python that opens a text file with a bunch of columns and writes the data to 3 different text files based on a string in each row. As my program stands right now, it changes the directory to a specific output folder using os.chdir so it can open my text file, but what I want is for it to do something like this:
Imagine a folder set up like this:
Source Folder contains N folders. Each of those folders contains N output folders. Each output folder contains 1 Results.txt.
The idea is to have the program start at the source folder, look into Folder 1, look for Output 1, open the .txt file, then do its thing. Once it's done, it should go back to Folder 1, open Output 2 and do its thing again. Then it should go back to Folder 1 and, if it can't find any more output folders, go back to the source folder, enter Folder 2 and repeat the process until there are no more folders. Honestly, I'm not sure where to start with this. The best I could do is make a small program that prints all my .txt files, but I'm not sure how to open them at all. Hope my question makes sense, and thanks for the help.
If all you need is to process each file in a directory recursively:
import os

def process_dir(root_dir):
    for subdir, dirs, files in os.walk(root_dir):
        for file in files:
            file_path = os.path.join(subdir, file)
            print(file_path)
            # process file here
This will process each file in the root dir recursively. If you're looking for conditional iteration you might need to make the loop a little smarter.
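As a rough idea of what a "smarter" loop could look like, here is a minimal sketch that only picks up files with one particular name. The function name find_results and the default "Results.txt" are assumptions taken from the question, not anything os.walk requires:

```python
import os

def find_results(root_dir, target_name="Results.txt"):
    """Yield the full path of every file named target_name under root_dir."""
    for subdir, dirs, files in os.walk(root_dir):
        dirs.sort()  # visit subfolders in alphabetical order
        if target_name in files:
            yield os.path.join(subdir, target_name)
```

Sorting dirs in place works because os.walk (in its default top-down mode) consults that list to decide where to descend next. Something like `for path in find_results("Source Folder"): ...` would then open and process each file in turn.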
Read the base folder path and store it in a variable. Move into a subfolder and process its text file, then use chdir with the base path to go back and read the next subfolder.
import os

dirlist = os.listdir(os.getcwd())
dirlist = filter(lambda x: os.path.isdir(x), dirlist)
for dirname in dirlist:
    print(os.path.join(os.getcwd(), dirname, 'Results.txt'))
First, I think you could format your question for better reading.
Concerning your question, here's a naïve implementation example:
import os

where = "J:/tmp/"
what = "Results.txt"

def processpath(where, name):
    for elem in os.listdir(where):
        elempath = os.path.join(where, elem)
        if elem == name:
            # Do something with your file
            with open(elempath, "w") as f:  # example
                f.write("modified2")  # example
        elif os.path.isdir(elempath):
            processpath(elempath, name)

processpath(where, what)
I would do this without chdir. The most straightforward solution to me is to use os.listdir and filter the results, then use os.path.join to construct complete relative paths instead of calling chdir. I suspect this would be less prone to bugs, such as winding up in an unexpected current working directory where all your relative paths are then wrong.
import os
import re

nfolders = [d for d in os.listdir(".") if re.match("^Folder [0-9]+$", d)]
for f1 in nfolders:
    noutputs = [d for d in os.listdir(f1) if re.match("^Output [0-9]+$", d)]
    for f2 in noutputs:
        resultsFilename = os.path.join(f1, f2, "Results.txt")
        # do whatever with resultsFilename
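Since the question also asks how to actually open the files once they are found, here is a minimal sketch of the same no-chdir idea using pathlib. The helper names are hypothetical, and it assumes the layout from the question (Results.txt exactly two levels below the source folder) and whitespace-separated columns; adjust the split if your columns are tab- or comma-separated:

```python
from pathlib import Path

def results_files(source_folder):
    """Return every <folder>/<output>/Results.txt two levels below source_folder."""
    return sorted(Path(source_folder).glob("*/*/Results.txt"))

def read_rows(results_path):
    """Open one Results.txt and return its non-empty rows as lists of columns."""
    with open(results_path, "r") as f:
        # Assumption: columns are separated by whitespace.
        return [line.split() for line in f if line.strip()]
```

Each path returned by results_files is complete, so it can be passed straight to open() without ever changing the working directory.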
I need to run the following script for each txt file located in all subfolders.
The main folder is "simulations" in which there are different subfolders (called as "year-month-day"). In each subfolder there is a txt file "diagno.inp". I have to run this script for each "diagno.inp" file in order to have a list with the following data (a row for each day):
"year-month-day", "W_int", "W_dir"
Here's the code that is working for only a subfolder. Can you help me to create a loop?
fid = open('/Users/silviamassaro/weather/simulations/20180105/diagno.inp', "r")
subfolder = "20180105"
n = fid.read().splitlines()[51:]
for element in n:
    "do something"  # here code to calculate W_dir and W_int for each day
print(subfolder, W_int, W_dir)
Here's what I usually do when I need to loop over a directory and its children recursively:
import os

main_folder = '/path/to/the/main/folder'
files_to_process = [os.path.join(main_folder, child) for child in os.listdir(main_folder)]
while files_to_process:
    child = files_to_process.pop()
    if os.path.isdir(child):
        files_to_process.extend(os.path.join(child, sub_child) for sub_child in os.listdir(child))
    else:
        # We have a file here, we can do what we want with it
        pass
It's short, but makes some pretty strong assumptions:
You don't care about the order in which the files are processed.
The children of your entry point are only directories or regular files.
Edit: added another possible solution using glob, thanks to #jacques-gaudin's comment
This solution has the advantage that you are sure to get only .inp files, but you still can't be sure of their order.
import glob

main_folder = '/path/to/the/main/folder'
files_to_process = glob.glob('%s/**/*.inp' % main_folder, recursive=True)
for found_file in files_to_process:
    # We have a file here, we can do what we want with it
    pass
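If the order does matter (one row per day, in date order), one option is simply to sort the paths. This is only a sketch, and it relies on an assumption about your data: the subfolders are named year-month-day with zero padding (like 20180105), so alphabetical order is also chronological order. The function name is made up for the example:

```python
import glob
import os

def sorted_inp_files(main_folder):
    """Return all .inp files under main_folder, sorted by path."""
    pattern = os.path.join(main_folder, "**", "*.inp")
    return sorted(glob.glob(pattern, recursive=True))
```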
Hope this helps!
With pathlib you can do something like this:
from pathlib import Path

sim_folder = Path("path/to/simulations/folder")
for inp_file in sim_folder.rglob('*.inp'):
    subfolder = inp_file.parent.name
    with open(inp_file, 'r') as fid:
        n = fid.read().splitlines()[51:]
    for element in n:
        "do something"  # here code to calculate W_dir and W_int for each day
    print(subfolder, W_int, W_dir)
Note this is recursively traversing all subfolders to look for .inp files.
I have a Python program where I am calculating the number of files within different directories, but I wanted to know if it was possible to use a text file containing a list of different directory locations to change the cwd within my program?
Input: Would be a text file that has different folder locations that contains various files.
I have my program set up to return the total amount of files in a given folder location and return the amount to a count text file that will be located in each folder the program is called on.
You can use the os module in Python.
import os

# dirs will store the list of directories, populated from your text file
dirs = []
with open(your_text_file, "r") as text_file:
    for line in text_file:
        line = line.strip()  # drop the trailing newline, otherwise chdir fails
        if line:
            dirs.append(line)

# Now simply loop over dirs list
for directory in dirs:
    # Change directory
    os.chdir(directory)
    # Print cwd
    print(os.getcwd())
    # Print number of files in cwd
    print(len([name for name in os.listdir()
               if os.path.isfile(name)]))
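If you do go the chdir route, it may be worth making sure the program ends up back where it started, even when something fails partway through. A minimal sketch of one way to do that (working_directory is a hypothetical helper name, not something the os module provides):

```python
import os
from contextlib import contextmanager

@contextmanager
def working_directory(path):
    """Temporarily chdir into path, restoring the previous cwd afterwards."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)  # runs even if the body raised an exception
```

The loop body would then become `with working_directory(directory): ...`, and the original working directory is restored after each folder.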
Yes.
import os

start_dir = os.getcwd()
indexfile = open(dir_index_file, "r")
for targetdir in indexfile.readlines():
    os.chdir(targetdir.strip())  # strip the trailing newline first
    # Do your stuff here
    os.chdir(start_dir)
indexfile.close()
Do bear in mind that if your program dies half way through it'll leave you in a different working directory to the one you started in, which is confusing for users and can occasionally be dangerous (especially if they don't notice it's happened and start trying to delete files that they expect to be there - they might get the wrong file). You might want to consider if there's a way to achieve what you want without changing the working directory.
EDIT:
And to suggest the latter, rather than changing directory use os.listdir() to get the files in the directory of interest:
import os

start_dir = os.getcwd()
indexfile = open(dir_index_file, "r")
for targetdir in indexfile.readlines():
    targetdir = targetdir.strip()  # drop the trailing newline
    contents = os.listdir(targetdir)
    numfiles = len(contents)
    countfile = open(os.path.join(targetdir, "count.txt"), "w")
    countfile.write(str(numfiles))
    countfile.close()
indexfile.close()
Note that this will count files and directories, not just files. If you only want files then you'll have to go through the list returned by os.listdir checking whether each item is a file using os.path.isfile()
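A minimal sketch of that files-only count, in case it helps (count_files is just an illustrative name):

```python
import os

def count_files(directory):
    """Count regular files directly inside directory (subfolders are ignored)."""
    return sum(
        1 for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    )
```

Note the os.path.join: os.listdir returns bare names, so os.path.isfile needs the directory prepended unless the directory happens to be the current one.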
Problem: I have around 100 folders in 1 main folder each with a csv file with the exact same name and structure, for example input.csv. I have written a Python script for one of the files which takes the csv-file as an input and gives two images as output in the same folder. I want to produce these images and put them in every folder given the input per folder.
Is there a fast way to do this? Until now I have copied my script every time in each folder and been executing it. For 5 folders it was alright, but for 100 this will get tedious and take a lot of time.
Can someone please help me out? I'm very new to coding w.r.t. directories, paths, files etc. I have already tried to look for a solution, but no success so far.
You could try something like this:
import os
import pandas as pd

path = r'path\to\folders'
filename = 'input'
# get all directories within the top level directory
dirs = [os.path.join(path, val) for val in os.listdir(path) if os.path.isdir(os.path.join(path, val))]
# loop through each directory
for dd in dirs:
    file_ = [f for f in os.listdir(dd) if f.endswith('.csv') and filename in f]
    if file_:
        # this assumes only a single file of 'filename' and extension .csv exists in each directory
        file_ = file_[0]
    else:
        continue
    # os.listdir returns bare names, so join with the folder before reading
    data = pd.read_csv(os.path.join(dd, file_))
    # apply your script here and save images back to folder dd
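Another way to avoid copying the script around is to wrap your existing code in a function and call it once per folder. This is only a sketch: make_images stands for your own plotting code (assumed here to take the csv path and an output folder), and the other names are made up for the example:

```python
import os

def find_input_files(main_folder, filename="input.csv"):
    """Yield (folder, csv_path) for each direct subfolder that contains filename."""
    for entry in sorted(os.listdir(main_folder)):
        folder = os.path.join(main_folder, entry)
        csv_path = os.path.join(folder, filename)
        if os.path.isdir(folder) and os.path.isfile(csv_path):
            yield folder, csv_path

def run_on_all(main_folder, make_images):
    """Call make_images(csv_path, output_folder) once per folder; return the count."""
    count = 0
    for folder, csv_path in find_input_files(main_folder):
        make_images(csv_path, folder)  # your own code, hypothetical signature
        count += 1
    return count
```

Passing the folder in explicitly means the images can be saved with os.path.join(output_folder, ...) and the script only needs to live in one place.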
Hey, this is my first post (!)
Just looking for a recursive solution to a headache in my little project :)
I'm trying to collect, recursively, the paths of all folders
that contain some specific file
into a list of paths.
Example:
My (root) path is:
c:/test
The folder test contains the file 'test.txt'
and some folders: '1', '2', '3'.
Each of them contains 'test.txt' too!
(If 'test.txt' is not found,
just break out of the loop and don't search in the subfolders!)
Right now my function looks for 'test.txt'
and then collects all the folders into my folderslist:
if os.path.exists(os.path.join(path, 'test.txt')):
    full_list = os.listdir(path)
    folderslist = []
    for folder in full_list:
        if not os.path.isfile(os.path.join(path, folder)):
            folderslist.append(os.path.join(path, folder))
It's working OK, it's just not recursive...
I really don't know how to call the function again
with the same list and force it to change the 'current path'...
I'm not sure whether a list is the best data structure to call it with again.
My goal is to do some operations in every folder on this list:
c:/test c:/test/1 c:/test/2 c:/test/3
But if there are more folders that do not contain 'test.txt', just don't add them to my folder list and don't look inside them.
Hope my first post was clear enough :X
You can use os.walk to traverse the subfolders, and if test.txt is not found, clear the directory list so os.walk won't traverse its subfolders any further:
import os

folderslist = []
for root, dirs, files in os.walk('c:/test'):
    if 'test.txt' in files:
        folderslist.append(root)
    else:
        dirs.clear()
I'm working on something here, and I'm completely confused. Basically, I have the script in my directory, and that script has to run on multiple folders with a particular extension. Right now, I have it up and running on a single folder. Here's the structure, I have a main folder say, Python, inside that I have multiple folders all with the same .ext, and inside each sub-folder I again have few folders, inside which I have the working file.
Now, I want the script to visit the whole path say, we are inside the main folder 'python', inside which we have folder1.ext->sub-folder1->working-file, come out of this again go back to the main folder 'Python' and start visiting the second directory.
Now there are so many things in my head, the glob module, os.walk, or the for loop. I'm getting the logic wrong. I desperately need some help.
Say, Path=r'\path1'
How do I go about this? Would greatly appreciate any help.
I'm not sure if this is what you want, but this main function with a recursive helper function gets a dictionary of all of the files in a main directory:
import os, os.path

def getFiles(path):
    '''Gets all of the files in a directory'''
    sub = os.listdir(path)
    paths = {}
    for p in sub:
        print(p)
        pDir = os.path.join(path, p)
        if os.path.isdir(pDir):
            paths.update(getAllFiles(pDir, paths))
        else:
            paths[p] = pDir
    return paths

def getAllFiles(mainPath, paths=None):
    '''Helper function for getFiles(path)'''
    if paths is None:
        paths = {}  # avoid sharing one default dict between calls
    subPaths = os.listdir(mainPath)
    for path in subPaths:
        pathDir = os.path.join(mainPath, path)
        if os.path.isdir(pathDir):
            paths.update(getAllFiles(pathDir, paths))
        else:
            paths[path] = pathDir
    return paths
This returns a dictionary of the form {'my_file.txt': 'C:\User\Example\my_file.txt', ...}.
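One thing to watch for with a dictionary keyed by file name: if the same name occurs in several sub-folders (as with a working file that exists in each one), later entries overwrite earlier ones. A minimal sketch of an alternative that keeps every path, using os.walk rather than the recursive helper (files_by_name is a made-up name):

```python
import os
from collections import defaultdict

def files_by_name(main_path):
    """Map each file name to a list of every full path where it occurs."""
    found = defaultdict(list)
    for root, dirs, files in os.walk(main_path):
        for name in files:
            found[name].append(os.path.join(root, name))
    return dict(found)
```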
Since you distinguish first-level directories from their sub-directories, you could do something like this:
import os

# this is a generator to get all first level directories
dirs = (os.path.join(my_path, d) for d in os.listdir(my_path)
        if os.path.isdir(os.path.join(my_path, d))
        and os.path.splitext(d)[-1] == my_ext)
for d in dirs:
    for root, sub_dirs, files in os.walk(d):
        for f in files:
            # call your script on each file f
            pass
You could use Formic (disclosure: I am the author). Formic allows you to specify one multi-directory glob to match your files so eliminating directory walking:
import formic

fileset = formic.FileSet(include="*.ext/*/working-file", directory=r"path1")
for file_name in fileset:
    # Do something with file_name
    pass
A couple of points to note:
/*/ matches every subdirectory, while /**/ recursively descends into every subdirectory, their subdirectories and so on. Some options:
If the working file is precisely one directory below your *.ext, then use /*/
If the working file is at any depth under *.ext, then use /**/ instead.
If the working file is at least one directory below your *.ext, then you might use /*/**/
Formic starts searching in the current working directory. If this is the correct directory, you can omit the directory=r"path1"
I am assuming the working file is literally called working-file. If not, substitute a glob that matches it, like *.sh or script-*.
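For comparison, if you would rather stay within the standard library, a similar single-pattern search can be sketched with pathlib. Note this uses pathlib's own glob rules, not Formic's, so the two are not guaranteed to behave identically; the function name and the default pattern (copied from the example above) are assumptions:

```python
from pathlib import Path

def find_working_files(main_folder, pattern="*.ext/*/working-file"):
    """Return paths under main_folder matching pattern, sorted for stable order."""
    return sorted(Path(main_folder).glob(pattern))
```

Passing pattern="*.ext/**/working-file" instead should pick up the working file at any depth below each *.ext folder.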