How convert multidimensional array to two dimensional array - python

Here, my code feats value form text file; and create matrices as multidimensional array, but the problem is the code create more then two dimensional array, that I can't manipulate, I need two dimensional array, how I do that?
Explain algorithm of my code:
Moto of code:
My code fetch value from a specific folder, each folder contain 7 'txt' file, that generate from one user, in this way multiple folder contain multiple data of multiple user.
step1: Start a 1st for loop, and control it using how many folder have in specific folder,and in variable 'path' store the first path of first folder.
step2: Open the path and fetch data of 7 txt file using 2nd for loop.after feats, it close 2nd for loop and execute the rest code.
step3: Concat the data of 7 txt file in one 1d array.
step4(Here the problem arise): Store the 1d arry of each folder as 2d array.end first for loop.
Code:
import numpy as np
from array import *
import os
f_path='Result'
array_control_var=0
#for feacth directory path
for (path,dirs,file) in os.walk(f_path):
if(path==f_path):
continue
f_path_1= path +'\page_1.txt'
#Get data from page1 indivisualy beacuse there string type data exiest
pgno_1 = np.array(np.loadtxt(f_path_1, dtype='U', delimiter=','))
#only for page_2.txt
f_path_2= path +'\page_2.txt'
with open(f_path_2) as f:
str_arr = ','.join([l.strip() for l in f])
pgno_2 = np.asarray(str_arr.split(','), dtype=int)
#using loop feach data from those text file.datda type = int
for j in range(3,8):
#store file path using variable
txt_file_path=path+'\page_'+str(j)+'.txt'
if os.path.exists(txt_file_path)==True:
#genarate a variable name that auto incriment with for loop
foo='pgno_'+str(j)
else:
break
#pass the variable name as string and store value
exec(foo + " = np.array(np.loadtxt(txt_file_path, dtype='i', delimiter=','))")
#z=np.array([pgno_2,pgno_3,pgno_4,pgno_5,pgno_6,pgno_7])
#marge all array from page 2 to rest in single array in one dimensation
f_array=np.concatenate((pgno_2,pgno_3,pgno_4,pgno_5,pgno_6,pgno_7), axis=0)
#for first time of the loop assing this value
if array_control_var==0:
main_f_array=f_array
else:
#here the problem arise
main_f_array=np.array([main_f_array,f_array])
array_control_var+=1
print(main_f_array)
current my code generate array like this(for 3 folder)
[
array([[0,0,0],[0,0,0]]),
array([0,0,0])
]
Note: I don't know how many dimension it have
But I want
[
array(
[0,0,0]
[0,0,0]
[0,0,0])
]

I tried to write a recursive code that recursively flattens the list of lists into one list. It gives the desired output for your case, but I did not try it for many other inputs(And it is buggy for certain cases such as :list =[0,[[0,0,0],[0,0,0]],[0,0,0]])...
flat = []
def main():
list =[[[0,0,0],[0,0,0]],[0,0,0]]
recFlat(list)
print(flat)
def recFlat(Lists):
if len(Lists) == 0:
return Lists
head, tail = Lists[0], Lists[1:]
if isinstance(head, (list,)):
recFlat(head)
return recFlat(tail)
else:
return flat.append(Lists)
if __name__ == '__main__':
main()
My idea behind the code was to traverse the head of each list, and check whether it is an instance of a list or an element. If the head is an element, this means I have a flat list and I can return the list. Else, I should recursively traverse more.

Related

Python: Use the "i" counter in while loop as digit for expressions

This seems like it should be very simple but am not sure the proper syntax in Python. To streamline my code I want a while loop (or for loop if better) to cycle through 9 datasets and use the counter to call each file out using the counter as a way to call on correct file.
I would like to use the "i" variable within the while loop so that for each file with sequential names I can get the average of 2 arrays, the max-min of this delta, and the max-min of another array.
Example code of what I am trying to do but the avg(i) and calling out temp(i) in loop does not seem proper. Thank you very much for any help and I will continue to look for solutions but am unsure how to best phrase this to search for them.
temp1 = pd.read_excel("/content/113VW.xlsx")
temp2 = pd.read_excel("/content/113W6.xlsx")
..-> temp9
i=1
while i<=9
avg(i) =np.mean(np.array([temp(i)['CC_H='],temp(i)['CC_V=']]),axis=0)
Delta(i)=(np.max(avg(i)))-(np.min(avg(i)))
deltaT(i)=(np.max(temp(i)['temperature='])-np.min(temp(i)['temperature=']))
i+= 1
EG: The slow method would be repeating code this for each file
avg1 =np.mean(np.array([temp1['CC_H='],temp1['CC_V=']]),axis=0)
Delta1=(np.max(avg1))-(np.min(avg1))
deltaT1=(np.max(temp1['temperature='])-np.min(temp1['temperature=']))
avg2 =np.mean(np.array([temp2['CC_H='],temp2['CC_V=']]),axis=0)
Delta2=(np.max(avg2))-(np.min(avg2))
deltaT2=(np.max(temp2['temperature='])-np.min(temp2['temperature=']))
......
Think of things in terms of lists.
temps = []
for name in ('113VW','113W6',...):
temps.append( pd.read_excel(f"/content/{name}.xlsx") )
avg = []
Delta = []
deltaT = []
for data in temps:
avg.append(np.mean(np.array([data['CC_H='],data['CC_V=']]),axis=0)
Delta.append(np.max(avg[-1]))-(np.min(avg[-1]))
deltaT.append((np.max(data['temperature='])-np.min(data['temperature=']))
You could just do your computations inside the first loop, if you don't need the dataframes after that point.
The way that I would tackle this problem would be to create a list of filenames, and then iterate through them to do the necessary calculations as per the following:
import pandas as pd
# Place the files to read into this list
files_to_read = ["/content/113VW.xlsx", "/content/113W6.xlsx"]
results = []
for i, filename in enumerate(files_to_read):
temp = pd.read_excel(filename)
avg_val =np.mean(np.array([temp(i)['CC_H='],temp['CC_V=']]),axis=0)
Delta=(np.max(avg_val))-(np.min(avg_val))
deltaT=(np.max(temp['temperature='])-np.min(temp['temperature=']))
results.append({"avg":avg_val, "Delta":Delta, "deltaT":deltaT})
# Create a dataframe to show the results
df = pd.DataFrame(results)
print(df)
I have included the enumerate feature to grab the index (or i) should you want to access it for anything, or include it in the results. For example, you could change the the results.append line to something like this:
results.append({"index":i, "Filename":filename, "avg":avg_val, "Delta":Delta, "deltaT":deltaT})
Not sure if I understood the question correctly. But if you want to read the files inside a loop using indexes (i variable), you can create a list to hold the contents of the excel files instead of using 9 different variables.
something like
files = []
files.append(pd.read_excel("/content/113VW.xlsx"))
files.append(pd.read_excel("/content/113W6.xlsx"))
...
then use the index variable to iterate over the list
i=1
while i<=9
avg(i) = np.mean(np.array([files[i]['CC_H='],files[i]['CC_V=']]),axis=0)
...
i+=1
P.S.: I am not a Pandas/NumPy expert, so you may have to adapt the code to your needs

create a loop in python to loop over files, variables, and matrix

Does anyone can help me with creating a loop in python? I need to create a loop to loop in the same time over files, variable, and matrix. To explain better, these are the steps of my code:
read files using h5py library:
file_00 = h5py.File('file0','r')
file_01 = h5py.File('file1','r')
file_1000 = h5py.File('file1000','r')
Extract variables from file:
alpha_00 = numpy.array(file_00['alpha'])[:,:,:,:]
alpha_01 = numpy.array(file_01['alpha'])[:,:,:,:]
alpha_1000= numpy.array(file_1000['alpha'])[:,:,:,:]
Construct new matrix:
new_alpha_1 = np.zeros([100,200,100])
for index in range (100):
m =
n =
p =
new_alpha_1[m:m+10,n:n+10,p:p+10]=alpha_1[index,:,:,:]
My goal is to create a loop from 0 to 1000, to read all files, extract alpha variable for all files and construct new_alpha matrix for all files.
What I tried is first looping over files by creating a list:
for counter in range (0,1000):
File=h5py.File('file_{0:0=4d}'.format(counter),'r')
This lines works and are able to read all files.
How can I create a loop to extract alpha variables for all files and construct the matrix new_alpha for all files ?

Memory issues with a list of lists [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 1 year ago.
Improve this question
I am having some memory issues and I am wondering if there is any way I can free up some memory in the code below. I have tried using a generator expression rather than list comprehension but that does not produce unique combinations, as the memory is freed up.
The list of lists (combinations) causes me to run out of memory and the program does not finish.
The end result would be 729 lists in this list, with each list containing 6 WindowsPath elements that point to images. I have tried storing the lists as strings in a text file but I can not get that to work, I tried using a pandas dataframe but I can not get that to work.
I need to figure out a different solution. The output right now is exactly what I need but the memory is the only issue.
from pathlib import Path
from random import choice
from itertools import product
from PIL import Image
import sys
def combine(arr):
return list(product(*arr))
def generate(x):
#set new value for name
name = int(x)
#Turn name into string for file name
img_name = str(name)
#Pick 1 random from each directory, add to list.
a_paths = [choice(k) for k in layers]
#if the length of the list of unique combinations is equal to the number of total combinations, this function stops
if len(combinations) == len(combine(layers)):
print("Done")
sys.exit()
else:
#If combination exists, generate new list
if any(j == a_paths for j in combinations) == True:
print("Redo")
generate(name)
#Else, initialize new image, paste layers + save image, add combination to list, and generate new list
else:
#initialize image
img = Image.new("RGBA", (648, 648))
png_info = img.info
#For each path in the list, paste on top of previous, sets image to be saved
for path in a_paths:
layer = Image.open(str(path), "r")
img.paste(layer, (0, 0), layer)
print(str(name) + ' - Unique')
img.save(img_name + '.png', **png_info)
combinations.append(a_paths)
name = name - 1
generate(name)
'''
Main method
'''
global layers
layers = [list(Path(directory).glob("*.png")) for directory in ("dir1/", "dir2/", "dir3/", "dir4/", "dir5/", "dir6/")]
#name will dictate the name of the file output(.png image) it is equal to the number of combinations of the image layers
global name
name = len(combine(layers))
#combinations is the list of lists that will store all unique combinations of images
global combinations
combinations = []
#calling recursive function
generate(name)
Let's start with a MRE version of your code (i.e. something that I can run without needing a bunch of PNGs -- all we're concerned with here is how to go through the images without hitting recursion limits):
from random import choice
from itertools import product
def combine(arr):
return list(product(*arr))
def generate(x):
# set new value for name
name = int(x)
# Turn name into string for file name
img_name = str(name)
# Pick 1 random from each directory, add to list.
a_paths = [choice(k) for k in layers]
# if the length of the list of unique combinations is equal to the number of total combinations, this function stops
if len(combinations) == len(combine(layers)):
print("Done")
return
else:
# If combination exists, generate new list
if any(j == a_paths for j in combinations) == True:
print("Redo")
generate(name)
# Else, initialize new image, paste layers + save image, add combination to list, and generate new list
else:
# initialize image
img = []
# For each path in the list, paste on top of previous, sets image to be saved
for path in a_paths:
img.append(path)
print(str(name) + ' - Unique')
print(img_name + '.png', img)
combinations.append(a_paths)
name = name - 1
generate(name)
'''
Main method
'''
global layers
layers = [
[f"{d}{f}.png" for f in ("foo", "bar", "baz", "ola", "qux")]
for d in ("dir1/", "dir2/", "dir3/", "dir4/", "dir5/", "dir6/")
]
# name will dictate the name of the file output(.png image) it is equal to the number of combinations of the image layers
global name
name = len(combine(layers))
# combinations is the list of lists that will store all unique combinations of images
global combinations
combinations = []
# calling recursive function
generate(name)
When I run this I get some output that starts with:
15625 - Unique
15625.png ['dir1/qux.png', 'dir2/bar.png', 'dir3/bar.png', 'dir4/foo.png', 'dir5/baz.png', 'dir6/foo.png']
15624 - Unique
15624.png ['dir1/baz.png', 'dir2/qux.png', 'dir3/foo.png', 'dir4/foo.png', 'dir5/foo.png', 'dir6/foo.png']
15623 - Unique
15623.png ['dir1/ola.png', 'dir2/qux.png', 'dir3/bar.png', 'dir4/ola.png', 'dir5/ola.png', 'dir6/bar.png']
...
and ends with a RecursionError. I assume this is what you mean when you say you "ran out of memory" -- in reality it doesn't seem like I'm anywhere close to running out of memory (maybe this would behave differently if I had actual images?), but Python's stack depth is finite and this function seems to be recursing into itself arbitrarily deep for no particularly good reason.
Since you're trying to eventually generate all the possible combinations, you already have a perfectly good solution, which you're even already using -- itertools.product. All you have to do is iterate through the combinations that it gives you. You don't need recursion and you don't need global variables.
from itertools import product
from typing import List
def generate(layers: List[List[str]]) -> None:
for name, a_paths in enumerate(product(*layers), 1):
# initialize image
img = []
# For each path in the list, paste on top of previous,
# sets image to be saved
for path in a_paths:
img.append(path)
print(f"{name} - Unique")
print(f"{name}.png", img)
print("Done")
'''
Main method
'''
layers = [
[f"{d}{f}.png" for f in ("foo", "bar", "baz", "ola", "qux")]
for d in ("dir1/", "dir2/", "dir3/", "dir4/", "dir5/", "dir6/")
]
# calling iterative function
generate(layers)
Now we get all of the combinations -- the naming starts at 1 and goes all the way to 15625:
1 - Unique
1.png ['dir1/foo.png', 'dir2/foo.png', 'dir3/foo.png', 'dir4/foo.png', 'dir5/foo.png', 'dir6/foo.png']
2 - Unique
2.png ['dir1/foo.png', 'dir2/foo.png', 'dir3/foo.png', 'dir4/foo.png', 'dir5/foo.png', 'dir6/bar.png']
3 - Unique
3.png ['dir1/foo.png', 'dir2/foo.png', 'dir3/foo.png', 'dir4/foo.png', 'dir5/foo.png', 'dir6/baz.png']
...
15623 - Unique
15623.png ['dir1/qux.png', 'dir2/qux.png', 'dir3/qux.png', 'dir4/qux.png', 'dir5/qux.png', 'dir6/baz.png']
15624 - Unique
15624.png ['dir1/qux.png', 'dir2/qux.png', 'dir3/qux.png', 'dir4/qux.png', 'dir5/qux.png', 'dir6/ola.png']
15625 - Unique
15625.png ['dir1/qux.png', 'dir2/qux.png', 'dir3/qux.png', 'dir4/qux.png', 'dir5/qux.png', 'dir6/qux.png']
Done
Replacing the actual image-generating code back into my mocked-out version is left as an exercise for the reader.
If you wanted to randomize the order of the combinations, it'd be pretty reasonable to do:
from random import shuffle
...
combinations = list(product(*layers))
shuffle(combinations)
for name, a_paths in enumerate(combinations, 1):
...
This uses more memory (since now you're building a list of the product instead of iterating through a generator), but the number of images you're working with isn't actually that large, so this is fine as long as you aren't adding a level of recursion for each image.

Convert Matlab to Python

I'm converting matlab code to python, and I'm having a huge doubt on the following line of code:
BD_teste = [BD_teste; grupos.(['g',int2str(l)]).('elementos')(ind_element,:),l];
the whole code is this:
BD_teste = [];
por_treino = 0;
for l = 1:k
quant_elementos_t = int64((length(grupos.(['g',int2str(l)]).('elementos')) * por_treino)/100);
for element_c = 1 : quant_elementos_t
ind_element = randi([1 length(grupos.(['g',int2str(l)]).('elementos'))]);
BD_teste = [BD_teste; grupos.(['g',int2str(l)]).('elementos')(ind_element,:),l];
grupos.(['g',int2str(l)]).('elementos')(ind_element,:) = [];
end
end
This line of code below is a structure, as I am converting to python, I used a list and inside it, a dictionary with its list 'elementos':
'g',int2str(l)]).('elementos')
So my question is just in the line I quoted above, I was wondering what is happening and how it is occurring, and how I would write in python.
Thank you very much in advance.
BD_teste = [BD_teste; grupos.(['g',int2str(l)]).('elementos')(ind_element,:),l];
Is one very weird line. Let's break it down into pieces:
int2str(l) returns the number l as a char array (will span from '1' until k).
['g',int2str(l)] returns the char array g1, then g2 and so on along with the value of l.
grupos.(['g',int2str(l)]) will return the value of the field named g1, g2 and so on that belongs to the struct grupos.
grupos.(['g',int2str(l)]).('elementos') Now assumes that grupos.(['g',int2str(l)]) is itself a struct, and returns the value of its field named 'elementos'.
grupos.(['g',int2str(l)]).('elementos')(ind_element,:) Assuming that grupos.(['g',int2str(l)]) is a matrix, this line returns a line-vector containing the ind_element-th line of said matrix.
grupos.(['g',int2str(l)]).('elementos')(ind_element,:),l appends the number one to the vector obtained before.
[BD_teste; grupos.(['g',int2str(l)]).('elementos')(ind_element,:),l] appends the line vector [grupos.(['g',int2str(l)]).('elementos')(ind_element,:),l] to the matrix BD_teste, at its bottom. and creates a new matrix.
Finally:
BD_teste = [BD_teste; grupos.(['g',int2str(l)]).('elementos')(ind_element,:),l];``assignes the value of the obtained matrix to the variableBD_teste`, overwriting its previous value. Effectively, this just appends the new line, but because of the overwriting step, it is not very effective.
It would be recommendable to append with:
BD_teste(end+1,:) = [grupos.(['g',int2str(l)]).('elementos')(ind_element,:),l];
Now, how you will rewrite this in Python is a whole different story, and will depend on how you want to define the variable grupos mostly.

Save all values of a variable (in a loop) in another variable in Python

I have a code that I inform a folder, where it has n images that the code should return me the relative frequency histogram.
From there I have a function call:
for image in total_images:
histogram(image)
Where image is the current image that the code is working on and total_images is the total of images (n) it has in the previously informed folder.
And from there I call the histogram() function, sending as a parameter the current image that the code is working.
My histogram() function has the purpose of returning the histogram of the relative frequency of each image (rel_freq).
Although the returned values ​​are correct, rel_freq should be a array 1x256 positions ranging from 0 to 255.
How can I transform the rel_freq variable into a 1x256 array? And each value stored in its corresponding position?
When I do len *rel_freq) it returns me 256, that's when I realized that it is not in the format I need...
Again, although the returned data is correct...
After that, I need to create an array store_all = len(total_images)x256 to save all rel_freq...
I need to save all rel_freq in an array to later save it and to an external file, such as .txt.
I'm thinking of creating another function to do this...
Something like that, but I do not know how to do it correctly, but I believe you will understand the logic...
def store_all_histograms(total_images):
n = len(total_images)
store_all = [n][256]
for i in range(0,n):
store_all[i] = rel_freq
I know the function store_all_histograms() is wrong, I just wrote it here to show more or less the way I'm thinking of doing... but again, I do not know how to do it properly... At this point, the error I get is:
store_all = [n][256]
IndexError: list index out of range
After all, I need the store_all variable to save all relative frequency histograms for example like this:
position: 0 ... 256
store_all = [
[..., ..., ...],
[..., ..., ...],
.
.
.
n
]
Now follow this block of code:
def histogram(path):
global rel_freq
#Part of the code that is not relevant to the question...
rel_freq = [(float(item) / total_size) * 100 if item else 0 for item in abs_freq]
def store_all_histograms(total_images):
n = len(total_images)
store_all = [n][256]
for i in range(0,n):
store_all[i] = rel_freq
#Part of the code that is not relevant to the question...
# Call the functions
for fn in total_images:
histogram(fn)
store_all_histograms(total_images)
I hope I have managed to be clear with the question.
Thanks in advance, if you need any additional information, you can ask me...
Return the result, don't use a global variable:
def histogram(path):
return [(float(item) / total_size) * 100 if item else 0 for item in abs_freq]
Create an empty list:
store_all = []
and append your results:
for fn in total_images:
store_all.append(histogram(fn))
Alternatively, use a list comprehension:
store_all = [histogram(fn) for fn in total_images]
for i in range(0,n):
store_all[i+1] = rel_freq
Try this perhaps? I'm a bit confused on the question though if I'm honest. Are you trying to shift the way you call the array with all the items by 1 so that instead of calling position 1 by list[0] you call it via list[1]?
So you want it to act like this?
>>list = [0,1,2,3,4]
>>list[1]
0

Categories

Resources