Iterating over folders and executing a function on every two folders - Python

I have a function called plot_ih_il that receives two data frames in order to generate a plot. I also have a set of folders that each contain a .h5 file with the data I need to give to the function plot_ih_il. I'm trying to feed the function two datasets at a time, without success.
I've been using pathlib to do so:
from pathlib import Path
import pandas as pd

path = Path("files")
for log in path.glob("log*"):
    for file in log.glob("log*.h5"):
        df = pd.read_hdf(file, key="log")  # read the "log" dataset from the HDF5 file
But with this loop I can only feed one data frame at a time, and I need two of them.
The structure of the folders is something like:
files -> log1 -> log1.h5
         log2 -> log2.h5
         log3 -> log3.h5
         log4 -> log4.h5
I would like to feed the function plot_il_ih the following sequence,
plot_il_ih(dataframeof_log1.h5, dataframeof_log2.h5) then
plot_il_ih(dataframeof_log2.h5, dataframeof_log3.h5) and so on.
I have tried to use zip:
def pairwise(iterable):
    a = iter(iterable)
    return zip(a, a)
for l1, l2 in pairwise(list(path.glob('log*'))):
    plot_il_ih(l1, l2)
but it doesn't move forward, it just opens the first two.
What is wrong with my logic?

Consider something like this. You might have to play around with the indexing:
filelist = list(path.glob('log*'))
for i in range(1, len(filelist)):
    print(filelist[i-1])
    print(filelist[i])
    print('\n')
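The zip(a, a) attempt skips folders because both arguments are the same iterator: each pair consumes two items, so the result is (log1, log2), (log3, log4). Below is a minimal sketch of overlapping pairs instead. `sliding_pairs` is a hypothetical helper name, and the strings stand in for the paths returned by `path.glob`.

```python
def sliding_pairs(items):
    """Return overlapping neighbours: (a, b), (b, c), (c, d), ..."""
    items = list(items)
    # Zipping the list with itself shifted by one pairs each item with its successor.
    return list(zip(items, items[1:]))


# Stand-ins for sorted(path.glob("log*")); glob does not guarantee any order.
folders = sorted(["log3", "log1", "log4", "log2"])

for left, right in sliding_pairs(folders):
    print(left, right)
```

In the real loop each folder would first be turned into a data frame (for example with `pd.read_hdf`) before calling plot_il_ih. Note that plain string sorting puts `log10` before `log2`, so folders numbered past 9 would need a numeric sort key.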

Related

Editing String Objects in a List in Python

I have read in data from a basic txt file. The data is a date and time in the form "DD/HHMM" (meteorological date and time data). I have read this data into a list, time[]. It prints out as you would expect: ['15/1056', '15/0956', '15/0856', ...]. Is there a way to alter the list so that it ends up holding just the time, removing the date and the forward slash, like so: ['1056', '0956', '0856', ...]? I have already tried list.split, but I don't think that's how it works. Thanks.
I'm still learning myself and I haven't touched Python in some time, but here is my solution if you need one:
myList = ['15/1056', '15/0956', '15/0856']
newList = []
for x in myList:
    # x.split("/") splits at '/' and returns e.g. ["15", "1056"];
    # whatever is at index 1 (the time) is then appended
    newList.append(x.split("/")[1])
print(newList) # for verification
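The same idea can also be written as a list comprehension. This is a short sketch, assuming every entry contains exactly one "/"; the variable names are illustrative.

```python
times = ['15/1056', '15/0956', '15/0856']

# Keep only the part after the slash for every entry.
hours = [entry.split("/")[1] for entry in times]
print(hours)
```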

My CSV files are not being assigned to the correct Key in a dictionary

def read_prices(tikrList):
    #read each file and get the price list dictionary
    def getPriceDict():
        priceDict = {}
        TLL = len(tikrList)
        for x in range(0,TLL):
            with open(tikrList[x] + '.csv','r') as csvFile:
                csvReader = csv.reader(csvFile)
                for column in csvReader:
                    priceDict[column[0]] = float(column[1])
        return priceDict
    #populate the final dictionary with the price dictionary from the previous function
    def popDict():
        combDict = {}
        TLL = len(tikrList)
        for x in range(0,TLL):
            for y in tikrList:
                combDict[y] = getPriceDict()
        return combDict
    return(popDict())
print(read_prices(['GOOG','XOM','FB']))
What is wrong with the code is that when I return the final dictionary, the keys GOOG, XOM and FB all hold the values from the FB dictionary only.
As you can see with this output:
{'GOOG': {'2015-12-31': 104.660004, '2015-12-30': 106.220001},
 'XOM': {'2015-12-31': 104.660004, '2015-12-30': 106.220001},
 'FB': {'2015-12-31': 104.660004, '2015-12-30': 106.220001}}
I have 3 different CSV files, but all of the keys just get the contents of the CSV file for FB.
I want to apologize ahead of time if my code is not easy to read or doesn't make sense. I think there is an issue with storing the values and returning the priceDict in the getPriceDict function, but I can't seem to figure it out.
Any help is appreciated, thank you!
Since this is classwork I won't provide a solution but I'll point a few things out.
You have defined three functions - two are defined inside the third. While structuring functions like that can make sense for some problems/solutions, I don't see any benefit in your solution. It seems to make it more complicated.
The two inner functions don't have any parameters; you might want to refactor them so that when they are called you pass them the information they need. One advantage of a function is that it encapsulates an idea/process in a self-contained code block that doesn't rely on resources external to itself. This makes it easy to test, so you know that the function works and you can concentrate on other parts of the code.
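To illustrate the point about parameters without solving the assignment, here is a small sketch of a function that receives everything it needs as an argument, so it can be tested without any files on disk. `parse_prices` and the sample rows are hypothetical and not part of the original code.

```python
import csv
import io


def parse_prices(lines):
    """Build a {date: price} dict from an iterable of CSV lines.

    The input is passed in explicitly, so the function does not depend on
    any variable defined outside of it.
    """
    prices = {}
    for row in csv.reader(lines):
        prices[row[0]] = float(row[1])
    return prices


# An in-memory stand-in for an open CSV file (made-up numbers).
sample = io.StringIO("2015-12-31,10.5\n2015-12-30,11.25\n")
print(parse_prices(sample))
```

Because the function only looks at its argument, it can be checked with a couple of hand-written lines before it is wired up to real files.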
This piece of your code doesn't make much sense - it never uses x from the outer loop:
...
for x in range(0,TLL):
    for y in tikrList:
        combDict[y] = getPriceDict()
When you iterate over a list, the iteration yields the items themselves and stops after the last item, so there is no need to iterate over numbers to access the items. Don't do for i in range(len(thelist)): print(thelist[i]); iterate directly instead:
>>> tikrList = ['GOOG','XOM','FB']
>>> for name in tikrList:
...     print(name)
...
GOOG
XOM
FB
>>>
When you read through a tutorial or the documentation, don't just look at the examples - read and understand the text.

How to iterate a command (loop) through files in a list in Python

I'm new to Python and I'm trying to write a short script. I want to run a loop that reads many files and runs a calculation on each one. In particular, I want to do a calculation over the two rows of every file and return an output whose name refers to the corresponding file.
I was able to load the files into a list ('work'). I wrote the inner loop for the calculation on a single file from the list and it runs correctly. The problem is that I'm not able to iterate it over all the files and obtain each 'integr' value from the corresponding file.
Let me show what I tried to do:
import numpy as np
#I'm loading the files that contain the values with which I want to do my calculation in a loop
work = {}
for i in range(0,100):
    work[i] = np.loadtxt('work{}.txt'.format(i), float).T
#Now I'm trying to write a double loop in which I want to iterate the second loop (the calculation) over the files (that don't have the same length) in the list
integr = 0
for k in work:
    for i in range(1, len(k[1,:])):
        integr = integr + k[1,i]*(k[0,i] - k[0,i-1])
    #I would like to print every 'integr' which comes from the calculation over each file
    print(integr)
When I try to run this, I get this error message:
Traceback (most recent call last):
  File "lavoro.py", line 11, in <module>
    for i in range(1, len(k[1,:])):
TypeError: 'int' object has no attribute '__getitem__'
Thank you in advance.
I'm guessing a bit, but if I understood correctly, you want work to be a list and not a dictionary. Even if that wasn't your intention, a list works just as well as a dictionary in this context.
This is how you can create your work list:
work = []
for i in range(0,100):
    work.append(np.loadtxt('work{}.txt'.format(i), float).T)
Or using the equivalent list comprehension of the above loop (usually the list comprehension is faster):
work = [np.loadtxt('work{}.txt'.format(i), float).T for i in range(100)]
Now you can loop over the work list to do your calculations (I assume they are correct, no way for me to check this):
for k in work:
    integr = 0
    for i in range(1, len(k[1,:])):
        integr = integr + k[1,i]*(k[0,i] - k[0,i-1])
Note that I moved integr = 0 inside the loop, so that it is reinitialized to 0 for each file; otherwise each inner loop would add to the result of the previous inner loops.
However, if that was the desired behaviour, move integr = 0 outside the loop as in your original code.
Guessing from the context, you wanted:
for k in work.values():
Iterating over a dictionary produces only its keys, not its values.
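A small sketch of both points, using made-up in-memory data in place of the work{}.txt files. Each entry is an [x_row, y_row] pair of plain lists standing in for the transposed arrays; the names `work` and `integr` follow the question, while `right_sum` and `results` are hypothetical.

```python
def right_sum(xs, ys):
    """Sum ys[i] * (xs[i] - xs[i-1]) for i >= 1, as in the question's loop."""
    total = 0.0
    for i in range(1, len(ys)):
        total += ys[i] * (xs[i] - xs[i - 1])
    return total


# Made-up stand-ins for the data loaded from each file.
work = {
    0: [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0]],
    1: [[0.0, 0.5], [4.0, 4.0]],
}

# Iterating over the dict itself yields the integer keys, which is what
# triggered the TypeError in the question.
print(list(work))  # [0, 1]

# .items() gives each key together with its data; integr restarts per file.
results = {}
for index, (xs, ys) in work.items():
    integr = right_sum(xs, ys)
    results[index] = integr
    print(index, integr)
```

Keeping the key next to each result makes it straightforward to name the output after the file it came from.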

Use files from different folders in a function in a loop?

I have a main folder like this:
mainf/01/streets/streets.shp
mainf/02/streets/streets.shp #normal files
mainf/03/streets/streets.shp
...
and another main folder like this:
mainfo/01/streets/streets.shp
mainfo/02/streets/streets.shp #empty files
mainfo/03/streets/streets.shp
...
I want to use a function that takes as its first parameter a file from the first main folder (normal files) and as its second parameter the corresponding file from the other main folder (empty files).
Files correspond when the folder number at the [-3] level of the path (e.g. 01, 02, 03) matches.
Example with a function:
appendfunc(first_file_from_normal_files,first_file_from_empty_files)
How to do this in a loop?
My code:
for i in mainf and j in mainfo:
    appendfunc(i,j)
Update
Correct version:
first = ["mainf/01/streets/streets.shp", "mainf/02/streets/streets.shp", "mainf/03/streets/streets.shp"]
second = ["mainfo/01/streets/streets.shp", "mainfo/02/streets/streets.shp", "mainfo/03/streets/streets.shp"]
final = [(f,s) for f,s in zip(first,second)]
for i, j in final:
    appendfunc(i,j)
Is there an alternative that automatically puts all the files in a main folder into a list, with full paths?
import os
first = []
for dirpath, dirnames, filenames in os.walk(mainf):
    for name in filenames:  # os.path.join needs one name at a time, not the whole list
        first.append(os.path.join(dirpath, name))
second = []
for dirpath, dirnames, filenames in os.walk(mainfo):
    for name in filenames:
        second.append(os.path.join(dirpath, name))
Use zip:
first = ["mainf/01/streets/streets.shp", "mainf/02/streets/streets.shp", "mainf/03/streets/streets.shp"]
second = ["mainfo/01/streets/streets.shp", "mainfo/02/streets/streets.shp", "mainfo/03/streets/streets.shp"]
final = [(f,s) for f,s in zip(first,second)]
print(final)
You can't use a for ... and loop. You can loop over one iterable in one statement, and over another iterable in another statement. This still won't give you what you want, because the nested loops call appendfunc for every combination of i and j:
for i in mainf:
    for j in mainfo:
        appendfunc(i,j)
What you probably want is something like this (I'm assuming mainf and mainfo are essentially the same, except one is empty):
for folder_num in range(len(mainf)):
    appendfunc(mainf[folder_num], mainfo[folder_num])
You haven't said what appendfunc is supposed to do, so I'll leave that to you. I'm also assuming that, depending on how you're accessing the files, you can figure out how to modify the mainf[folder_num] and mainfo[folder_num] arguments; for example, you may need to inject the number back into the directory structure somehow, as in "mainf/{}/streets/streets.shp".format(zero_padded(folder_num)).
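One way to sketch that last idea, assuming the folders are numbered consecutively from 01 with two digits. `paired_paths` is a hypothetical helper, and the `{:02d}` format field plays the role of zero_padded.

```python
def paired_paths(count, first_root="mainf", second_root="mainfo"):
    """Return (normal_file, empty_file) path pairs for folders 01..count."""
    template = "{}/{:02d}/streets/streets.shp"
    pairs = []
    for folder_num in range(1, count + 1):
        pairs.append((
            template.format(first_root, folder_num),
            template.format(second_root, folder_num),
        ))
    return pairs


for normal_file, empty_file in paired_paths(3):
    # appendfunc(normal_file, empty_file) would go here.
    print(normal_file, empty_file)
```

Building both paths from the same number guarantees that the two files belong to the same folder, which two independently collected lists do not.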

Creating a list of dictionaries

I have code that generates a list of 28 dictionaries. It cycles through 28 files and links data points from each file in the appropriate dictionary. In order to make my code more flexible I wanted to use:
tegDics = [dict() for x in range(len(files))]
But when I run the code the first 27 dictionaries are blank and only the last, tegDics[27], has data. Below is the code including the clumsy, yet functional, code I'm having to use that generates the dictionaries:
x=0
import os
files=os.listdir("DirPath")
os.chdir("DirPath")
tegDics = [{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{}] # THIS WORKS!!!
#tegDics = [dict() for x in range(len(files))] - THIS WON'T WORK!!!
allRads=[]
while x<len(tegDics): # now builds dictionaries
    for line in open(files[x]):
        z=line.split('\t')
        allRads.append(z[2])
        tegDics[x][z[2]]=z[4] # pairs catNo with locNo
    x+=1
Does anybody know why the more elegant code doesn't work?
Since you're using x within the list comprehension, it will no longer be zero by the time you reach the while loop - it will be len(files)-1 instead. (This happens in Python 2, where the loop variable of a list comprehension leaks into the enclosing scope.) I suggest changing the variable you use to something else. It's traditional to use a single underscore for a value you don't care about.
tegDics = [dict() for _ in range(len(files))]
It could be useful to eliminate your use of x entirely. It's customary in Python to iterate directly over the objects in a sequence, rather than using a counter variable. You might do something like:
for tegDic in tegDics:
    #do stuff with tegDic here
It's slightly trickier in your case, though, since you want to iterate through tegDics and files at the same time. You can use zip to do that.
import os
files=os.listdir("DirPath")
os.chdir("DirPath")
tegDics = [dict() for _ in range(len(files))]
allRads=[]
for file, tegDic in zip(files,tegDics):
    for line in open(file):
        z=line.split('\t')
        allRads.append(z[2])
        tegDic[z[2]]=z[4] # pairs catNo with locNo
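A shorter form exists, but it is not suitable here, because the list then holds len(files) references to one and the same dictionary rather than separate dictionaries:
tegDics = [{}]*len(files)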
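A quick way to check how the two constructions differ; the list names are illustrative. Multiplying a list repeats references to the same object, while the comprehension builds a new dictionary on each pass.

```python
shared = [{}] * 3                      # three references to ONE dict
separate = [dict() for _ in range(3)]  # three distinct dicts

shared[0]["a"] = 1
separate[0]["a"] = 1

print(shared)    # [{'a': 1}, {'a': 1}, {'a': 1}]
print(separate)  # [{'a': 1}, {}, {}]
print(shared[0] is shared[1])  # True
```

With the multiplied list, data from every file would end up in a single dictionary that is visible at every index.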
