So I have multiple files that can be accessed and be treated as 2D arrays.
What I would like to do is take all those 2D arrays and put them in a single 3D array.
For example, if I have 10 files, each of shape (100,100), combining them should give a 3D array of shape (10,100,100). My attempt so far is the following:
filenames = glob.glob('source')
preset = np.empty([100,100], dtype = 'int16')
for file in filenames:
    data = fits.open(file)[0].data
    np.vstack([preset,data]).reshape((10,100,100))
But what I'm getting is the following error:
ValueError: cannot reshape array of size 20000 into shape (10,100,100)
You are performing the operation pair by pair. Try to perform this on all the arrays together:
arrs = [fits.open(file)[0].data for file in filenames]
np.vstack(arrs).reshape((10,100,100))
Or even more direct:
np.stack(arrs)
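To see that the two approaches agree, here is a minimal sketch that uses synthetic arrays in place of the FITS data (the stand-in arrays and their fill values are assumptions made purely for illustration):

```python
import numpy as np

# Ten stand-in "files": synthetic (100, 100) int16 arrays instead of FITS data.
arrs = [np.full((100, 100), i, dtype=np.int16) for i in range(10)]

# np.stack adds a new leading axis, one slot per input array.
stacked = np.stack(arrs)

# np.vstack gives (1000, 100); reshaping splits it back into ten blocks.
via_vstack = np.vstack(arrs).reshape((len(arrs), 100, 100))

print(stacked.shape)
print(np.array_equal(stacked, via_vstack))
```

np.stack is usually the safer choice because it does not require hard-coding the number of files in a reshape call.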
Related
I have several .npy files, each containing a saved numpy array (an image represented as a matrix; one dimension is 64, the other I don't know).
I want to read them and store them in a single numpy.ndarray of 3 dimensions.
What I have done till now is something very different, and I'm having problems dealing with the structures I created.
database_list = list()
labels_list = list()
for filename in glob.glob('*.npy'):
    database_list.append(np.load(filename))
    label_temp = extract_label(filename)
    labels_list.append(label_temp)
database = np.array(database_list)
labels = np.array(labels_list)
In that way, I have a numpy.ndarray database of shape (n_elements,).
Assuming I reshape each image to (n, 64), I want database to have the shape (n_elements, n, 64).
How can I do it?
What I want to achieve is an array with the same layout as the MNIST database, for use with a neural network.
EDIT:
database type is numpy.ndarray. It can't be reshaped: database has size n, say 10 (because it is composed of n elements, for example 10 if 10 files are loaded). The files are matrices of two dimensions, but I want them to be "part of" database.
For database = np.array(database_list) to make a 3d array with shape (n_elements, dim1, dim2), the database_list has to contain arrays all with the shape (dim1, dim2). If they differ in shape the result will be a (n_elements,) shaped array with object dtype (or in some cases it will throw an error).
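A quick way to check this is to collect the distinct shapes before building the array. The sketch below uses made-up stand-in arrays, and cropping to the smallest row count is only one possible way (an assumption, not necessarily what you want) to make the shapes agree:

```python
import numpy as np

# Hypothetical stand-ins for the arrays loaded from the .npy files.
database_list = [np.zeros((28, 64)), np.ones((28, 64)), np.zeros((30, 64))]

# More than one distinct shape means a regular 3D array cannot be built.
shapes = {a.shape for a in database_list}
print(shapes)

# One possible fix (an assumption): crop every image to the smallest row count.
n = min(a.shape[0] for a in database_list)
database = np.stack([a[:n] for a in database_list])
print(database.shape)
```

Padding the shorter images instead of cropping the longer ones would work just as well; which one is appropriate depends on the data.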
I want to write a function that takes a list of resized images (each image is of size (200,200)) and converts each picture, one by one, into a numpy array, so that in the end I have one main numpy array that contains all 20 images. The function should then return a numpy array of shape (200,200,3) that contains all the pictures.
What I have done so far is this:
def converttonumpyarray(list_of_resized_images):
    for image in list_of_resized_images:
        array1 = np.array(image).reshape(200,200,3)
        for img in list_of_resized_images:
            array2 = np.array(image).reshape(200,200,3)
            array1 = np.concatenate((array1,array2))
        break
    return array1
but the reshape call raises the following error:
ValueError: cannot reshape array of size 90000 into shape (200,200,3)
Kindly let me know if there is another way to make this work. Thank you.
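A note on the error: a (200,200,3) array holds 200*200*3 = 120000 values, so an image with 90000 values cannot be a 200x200 three-channel image, and reshape cannot fix that. Also, stacking 20 such images gives shape (20,200,200,3), not (200,200,3). Below is a minimal sketch, assuming the images are already array-like; the function name images_to_array and the synthetic inputs are hypothetical:

```python
import numpy as np

def images_to_array(images, shape=(200, 200, 3)):
    """Stack array-like images into one array, rejecting unexpected shapes."""
    arrays = []
    for i, image in enumerate(images):
        arr = np.asarray(image)
        if arr.shape != shape:
            # Report the offending image instead of forcing a reshape.
            raise ValueError(f"image {i} has shape {arr.shape}, expected {shape}")
        arrays.append(arr)
    return np.stack(arrays)

# Synthetic stand-ins for 20 resized RGB images.
images = [np.zeros((200, 200, 3), dtype=np.uint8) for _ in range(20)]
batch = images_to_array(images)
print(batch.shape)
```

The shape check points at the image that was not resized or converted as expected, which is probably where the 90000 comes from.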
I'm trying to put multiple 2-D numpy arrays into one 3-D numpy array and then save the 3-D numpy array as a compressed file to a directory for later use.
I have a list that I'm looping through which will compute forecasts for different hazards. A forecast for each hazard (a 129x185 numpy array) will be computed one at a time. I want to then put each forecast array into an empty 129x185x7 numpy array.
hazlist = ['allsvr', 'torn', 'sigtorn', 'hail', 'sighail', 'wind', 'sigwind']
# Create 3-D empty numpy array
grid = np.zeros(shape=(129,185,7))
for i,haz in enumerate(hazlist):
    # ... do some computation to create forecast array for current hazard ...
    # Now have 2-D 129x185 forecast array
    print(fcst)
    # Place 2-D array into empty 3-D array.
    # (Not sure how to do this...)
# Save 3-D array to .npz file in directory when all 7 hazard forecasts are done.
np.savez_compressed('pathtodir/3dnumpyarray.npz')
But I want to give each forecast array its own grid name inside the 3-D array so that if I want a certain one (like tornadoes) I can just call it with:
filename = np.load('pathtodir/3dnumpyarray.npz')
arr = filename['torn']
It would be greatly appreciated if someone were able to assist me. Thanks.
It sounds like you actually want to use a dictionary. Each dictionary entry could be a 2D array with the reference name as the key:
hazlist = ['allsvr', 'torn', 'sigtorn', 'hail', 'sighail', 'wind', 'sigwind']
# Create empty dictionary
grid = {}
for i,haz in enumerate(hazlist):
    # ... do some computation to create forecast array for current hazard ...
    # Now have 2-D 129x185 forecast array
    print(fcst)
    # Place 2-D array into dictionary.
    grid[haz] = fcst # Assuming fcst is the 2D array?
# Save each 2-D array under its hazard name in the npz file
np.savez_compressed("output", **grid)
It might be best to save this as a JSON file. If the data needs to be compressed you can refer to this question and answer as to saving json in gzipped format, or this one may be clearer.
It's not clear from your example, but my assumption in the above code is that fcst is the 2D array that corresponds to the label haz in each iteration of the loop.
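For reference, here is a small round-trip sketch of that idea. The constant grids stand in for the real forecast computation, and the temporary file path is hypothetical. Passing the dictionary as keyword arguments is what lets each array be retrieved by name after loading:

```python
import os
import tempfile
import numpy as np

hazlist = ['allsvr', 'torn', 'sigtorn', 'hail', 'sighail', 'wind', 'sigwind']

# Placeholder forecasts: constant 129x185 grids stand in for the real computation.
grid = {haz: np.full((129, 185), float(i)) for i, haz in enumerate(hazlist)}

# Each keyword argument becomes a named array inside the .npz archive.
path = os.path.join(tempfile.mkdtemp(), "forecasts.npz")
np.savez_compressed(path, **grid)

with np.load(path) as loaded:
    torn = loaded['torn']
    names = sorted(loaded.files)

# If a true 3-D array is also needed, stack along a new last axis.
cube = np.stack([grid[haz] for haz in hazlist], axis=-1)

print(torn.shape, cube.shape)
```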
I've been running into a TypeError: list indices must be integers, not tuple. However, I can't figure out how to fix it, as I'm apparently misunderstanding where the tuple is (wasn't even aware there would be one from what I understand). Shouldn't my index and the values that I'm passing in all be integers?
def videoVolume(images):
    """ Create a video volume from the image list.

    Note: Simple function to convert a list to a 4D numpy array.

    Args:
        images (list): A list of frames. Each element of the list contains a
                       numpy array of a colored image. You may assume that each
                       frame has the same shape, (rows, cols, 3).

    Returns:
        output (numpy.ndarray): A 4D numpy array. This array should have
                                dimensions (num_frames, rows, cols, 3) and
                                dtype np.uint8.
    """
    output = np.zeros((len(images), images[0].shape[0], images[0].shape[1],
                       images[0].shape[2]), dtype=np.uint8)
    # WRITE YOUR CODE HERE.
    for x in range(len(images)):
        output[:,:,:,:] = [x, images[x,:,3], images[:,x,3], 3]
    # END OF FUNCTION.
    return output
The tuple referred to in the error message is the x,:,3 in the index here:
images[x,:,3]
The reason this is happening is that images is passed in as a list of frames (each a 3d numpy array), but you are trying to access it as though it is itself a numpy array. (Try doing lst = [1, 2, 3]; lst[:,:] and you'll see you get the same error message).
Instead, you meant to access it as something like images[x][:,:,:], for instance:
for x in range(len(images)):
    output[x,:,:,:] = images[x][:,:,:]
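If an explicit loop is not required, the same result can probably be had in one call. This is a sketch, not the assignment's intended solution; the name video_volume and the tiny synthetic frames are assumptions for illustration:

```python
import numpy as np

def video_volume(images):
    """Stack same-shaped (rows, cols, 3) frames into (num_frames, rows, cols, 3)."""
    return np.stack(images).astype(np.uint8)

# Three tiny synthetic frames; frame i is filled with the value i.
frames = [np.full((4, 5, 3), i, dtype=np.uint8) for i in range(3)]
volume = video_volume(frames)
print(volume.shape, volume.dtype)

# Indexing the list itself with a tuple reproduces the original TypeError.
try:
    frames[0, :, 2]
except TypeError as exc:
    message = str(exc)
    print(message)
```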
I have a list of several hundred 10x10 arrays that I want to stack together into a single Nx10x10 array. At first I tried a simple
newarray = np.array(mylist)
But that returned with "ValueError: setting an array element with a sequence."
Then I found the online documentation for dstack(), which looked perfect: "...This is a simple way to stack 2D arrays (images) into a single 3D array for processing." Which is exactly what I'm trying to do. However,
newarray = np.dstack(mylist)
tells me "ValueError: array dimensions must agree except for d_0", which is odd because all my arrays are 10x10. I thought maybe the problem was that dstack() expects a tuple instead of a list, but
newarray = np.dstack(tuple(mylist))
produced the same result.
At this point I've spent about two hours searching here and elsewhere to find out what I'm doing wrong and/or how to go about this correctly. I've even tried converting my list of arrays into a list of lists of lists and then back into a 3D array, but that didn't work either (I ended up with lists of lists of arrays, followed by the "setting array element as sequence" error again).
Any help would be appreciated.
newarray = np.dstack(mylist)
should work. For example:
import numpy as np
# Here is a list of five 10x10 arrays:
x = [np.random.random((10,10)) for _ in range(5)]
y = np.dstack(x)
print(y.shape)
# (10, 10, 5)
# To get the shape to be Nx10x10, you could use rollaxis:
y = np.rollaxis(y,-1)
print(y.shape)
# (5, 10, 10)
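As a variation on the example above, np.stack with axis=0 should give the Nx10x10 layout directly, and np.moveaxis is the newer spelling of the rollaxis step. If dstack still complains, collecting the distinct shapes in the list (the shapes set below is just a hypothetical diagnostic) is a cheap way to find an element that is not really 10x10:

```python
import numpy as np

x = [np.random.random((10, 10)) for _ in range(5)]

# dstack puts the new axis last; moveaxis brings it to the front.
via_dstack = np.moveaxis(np.dstack(x), -1, 0)

# stack with axis=0 builds the (N, 10, 10) layout in one step.
via_stack = np.stack(x, axis=0)

print(via_stack.shape)
print(np.array_equal(via_dstack, via_stack))

# Diagnostic: every element should report the same shape.
shapes = {np.shape(a) for a in x}
print(shapes)
```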
np.dstack returns a new array. Thus, using np.dstack requires as much additional memory as the input arrays. If you are tight on memory, an alternative to np.dstack which requires less memory is to allocate space for the final array first, and then pour the input arrays into it one at a time.
For example, if you had 58 arrays of shape (159459, 2380), then you could use
y = np.empty((159459, 2380, 58))
for i in range(58):
    # instantiate the input arrays one at a time
    x = np.random.random((159459, 2380))
    # copy x into y
    y[..., i] = x
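The same preallocation pattern also works with the new axis first, which gives the Nx10x10-style layout asked about. This is a small-scale sketch; the sizes and the make_array helper are hypothetical stand-ins for however the real arrays are produced:

```python
import numpy as np

n_arrays, rows, cols = 4, 6, 3  # small illustrative sizes

def make_array(i):
    """Hypothetical producer: returns the i-th input array."""
    return np.full((rows, cols), float(i))

# Allocate the result once, then fill one slice per input array.
y = np.empty((n_arrays, rows, cols))
for i in range(n_arrays):
    y[i] = make_array(i)  # only one input array is alive at a time

print(y.shape)
```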