I'm trying to put multiple 2-D numpy arrays into one 3-D numpy array and then save the 3-D numpy array as a compressed file to a directory for later use.
I have a list that I'm looping through which will compute forecasts for different hazards. A forecast for each hazard (a 129x185 numpy array) will be computed one at a time. I want to then put each forecast array into an empty 129x185x7 numpy array.
hazlist = ['allsvr', 'torn', 'sigtorn', 'hail', 'sighail', 'wind', 'sigwind']
# Create 3-D empty numpy array
grid = np.zeros(shape=(129,185,7))
for i,haz in enumerate(hazlist):
    *do some computation to create forecast array for current hazard*
    # Now have 2-D 129x185 forecast array
    print(fcst)
    # Place 2-D array into empty 3-D array.
    *Not sure how to do this...*
# Save 3-D array to .npz file in directory when all 7 hazard forecasts are done.
np.savez_compressed('pathtodir/3dnumpyarray.npz')
But I want to give each forecast array its own name inside the 3-D array so that if I want a certain one (like tornadoes) I can just load it with:
filename = np.load('pathtodir/3dnumpyarray.npz')
arr = filename['torn']
It would be greatly appreciated if someone were able to assist me. Thanks.
It sounds like you actually want to use a dictionary. Each dictionary entry could be a 2D array with the reference name as the key:
import numpy as np
hazlist = ['allsvr', 'torn', 'sigtorn', 'hail', 'sighail', 'wind', 'sigwind']
# Create empty dictionary
grid = {}
for i,haz in enumerate(hazlist):
    *do some computation to create forecast array for current hazard*
    # Now have 2-D 129x185 forecast array
    print(fcst)
    # Place 2-D array into dictionary.
    grid[haz] = fcst # Assuming fcst is the 2D array?
# Save every 2-D array under its hazard name in one compressed npz file.
# The ** unpacking is needed: passing grid itself would store the whole dict as a single unnamed object.
np.savez_compressed("output", **grid)
It might be best to save this as a JSON file. If the data needs to be compressed you can refer to this question and answer as to saving json in gzipped format, or this one may be clearer.
It's not clear from your example, but my assumption in the above code is that fcst is the 2D array that corresponds to the label haz in each iteration of the loop.
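To make the round trip concrete, here is a minimal sketch of saving named forecasts and loading one back by key. The random data, the temporary directory, and the file name forecasts.npz are placeholders I made up for illustration; the real forecast computation is not shown in the question.

```python
import os
import tempfile

import numpy as np

hazlist = ['allsvr', 'torn', 'sigtorn', 'hail', 'sighail', 'wind', 'sigwind']

# Placeholder forecasts: random data stands in for the real computation.
rng = np.random.default_rng(0)
grid = {haz: rng.random((129, 185)) for haz in hazlist}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'forecasts.npz')  # hypothetical file name
    # **grid turns each dictionary key into a named array in the archive.
    np.savez_compressed(path, **grid)

    # The loaded archive supports lookup by name and closes with the block.
    with np.load(path) as data:
        saved_names = sorted(data.files)
        torn = data['torn']

# If a single 3-D array is also wanted, stack along a new last axis.
cube = np.stack([grid[haz] for haz in hazlist], axis=-1)
print(torn.shape, cube.shape)
```

With this layout the .npz file itself plays the role of the labelled 3-D array, so a separate 129x185x7 array is only needed if the code downstream wants one.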
Related
So I have multiple files that can be accessed and be treated as 2D arrays.
What I would like to do is take all those 2D arrays and put them in a single 3D array.
For example, if I have 10 files with the shapes (100,100), when I combine them, I should be left with a 3D array of shape (10,100,100). My attempt so far is the following:
filenames = glob.glob('source')
preset = np.empty([100,100], dtype = 'int16')
for file in filenames:
    data = fits.open(file)[0].data
    np.vstack([preset,data]).reshape((10,100,100))
But what I'm getting is the following error:
ValueError: cannot reshape array of size 20000 into shape (10,100,100)
You are performing the operation pair by pair. Try to perform this on all the arrays together:
arrs = [fits.open(file)[0].data for file in filenames]
np.vstack(arrs).reshape((10,100,100))
Or even more direct:
np.stack(arrs)
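A quick check that both forms give the same result, as a sketch: the ten constant-valued arrays below are placeholders standing in for the FITS image data, which I don't have.

```python
import numpy as np

# Ten placeholder 100x100 int16 arrays stand in for the FITS image data.
arrs = [np.full((100, 100), i, dtype='int16') for i in range(10)]

# np.stack adds a new leading axis, one slice per input array.
stacked = np.stack(arrs)

# np.vstack first builds a (1000, 100) array, which reshape then splits up.
reshaped = np.vstack(arrs).reshape((10, 100, 100))

print(stacked.shape)
```

np.stack has the advantage that the number of files and their shape do not have to be hard-coded.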
I want to vectorize the process of adding a 2d array to every 2d array inside a 3d array.
I imported an image file using image from matplotlib
data = image.imread('test.jpg')
Then I tried to add the average of each RGB array to another array of the same shape as data
data2 = np.zeros_like(data)
data3 = np.average(data, axis=2)
for i in range(len(data2[0,0,:])):
    data2[:,:,i] = data3
I just want to vectorize the two-line loop above into one line.
Convert data3 to the result datatype and then broadcast/repeat after extending to 3D with np.newaxis/None -
b = data3.astype(data.dtype)
data2_out = np.broadcast_to(b[...,None], data.shape)
The output would simply be a read-only view into b, so no extra memory is allocated for the repeated channels.
If you need an output with its own memory space, we can force the copy with data2_out.copy() or use np.repeat, like so -
np.repeat(b[...,None],data.shape[2],axis=2)
If you already have the output array data2 initialized and just want to assign into it, we can do so by extending data3 to 3D, which might be more intuitive in some scenarios too, like so -
data2[:] = data3[...,None]
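The three variants can be compared against the original loop in a small sketch. The 4x5x3 uint8 array is a made-up stand-in for the image read from test.jpg.

```python
import numpy as np

# Small placeholder "image" (4 rows, 5 columns, 3 channels) instead of test.jpg.
data = np.arange(60, dtype=np.uint8).reshape(4, 5, 3)
data3 = np.average(data, axis=2)  # float64 mean over the channel axis

# Reference result from the original loop.
data2 = np.zeros_like(data)
for i in range(data.shape[2]):
    data2[:, :, i] = data3

b = data3.astype(data.dtype)

# Read-only view: every channel refers to the same memory as b.
view = np.broadcast_to(b[..., None], data.shape)

# Independent, writable copy.
repeated = np.repeat(b[..., None], data.shape[2], axis=2)

# Broadcast assignment into a preallocated array.
data2_assigned = np.zeros_like(data)
data2_assigned[:] = data3[..., None]
```

Since the view is read-only, call view.copy() before modifying it in place.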
Hey guys, I need help.
I want to use TensorFlow's data import, where data is loaded by pulling the features/labels vectors from a structured NumPy array.
https://www.tensorflow.org/programmers_guide/datasets#consuming_numpy_arrays
I want to create such a structured array by consecutively adding the two vectors (feature_vec and label_vec) to a NumPy structured array.
import numpy as np
# example vectors
feature_vec= np.arange(10)
label_vec = np.arange(10)
# structured array which should get the vectors
struc_array = np.array([feature_vec,label_vec],dtype=([('features',np.float32), ('labels',np.float32)]))
# How can I add now new vectors to struc_array?
struc_array.append(---)
Later, when this array is loaded from file, I want to get either the feature vectors (now a matrix) or the label vectors by field name:
with np.load("/var/data/training_data.npy") as data:
    features = data["features"] # matrix containing feature vectors as rows
    labels = data["labels"] # matrix containing label vectors as rows
Everything I tried to code was complete crap.. never got a correct output..
Thanks for your help!
Don't create a NumPy array and then append to it. That doesn't really make sense, as NumPy arrays have a fixed size and require a full copy to append a single row or column. Instead, create a list, append to it, then construct the array at the end:
vecs = [feature_vec,label_vec]
dtype = [('features',np.float32), ('labels',np.float32)]
# append as many times as you want:
vecs.append(other_vec)
dtype.append(('other', np.float32))
struc_array = np.array(vecs, dtype=dtype)
Of course, you probably need ot
Unfortunately, this doesn't solve the problem.
I want to get just the labels or the features from the structured array by using:
labels = struc_array['labels']
features = struc_array['features']
But when I build the structured array the way you did, labels and features both contain all the appended vectors:
import numpy as np
feature_vec= np.arange(10)
label_vec = np.arange(0,5,0.5)
vecs = [feature_vec,label_vec]
dtype = [('features',np.float32), ('labels',np.float32)]
other_vec = np.arange(6,11,0.5)
vecs.append(other_vec)
dtype.append(('other', np.float32))
struc_array = np.array(vecs, dtype=dtype)
# This contains all vectors.. not just the labels vector
labels = struc_array['labels']
# This also contains all vectors.. not just the feature vector
features = struc_array['features']
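One way to get the behaviour described here, sketched under the assumption that every feature vector and every label vector has length 10: give each field a sub-array shape, and append one (features, labels) tuple per sample to a list before building the array. The second sample's values and the in-memory buffer (used here instead of a file on disk) are my own placeholders.

```python
import io

import numpy as np

# Each record holds one 10-element feature vector and one 10-element label vector.
dtype = [('features', np.float32, (10,)), ('labels', np.float32, (10,))]

# Collect (features, labels) tuples in a list, one tuple per sample.
records = []
records.append((np.arange(10), np.arange(0, 5, 0.5)))
records.append((np.arange(10, 20), np.arange(6, 11, 0.5)))  # made-up second sample

# Build the structured array once, after all samples have been collected.
struc_array = np.array(records, dtype=dtype)
features = struc_array['features']  # one feature vector per row
labels = struc_array['labels']      # one label vector per row

# Round trip through np.save / np.load using an in-memory buffer.
buffer = io.BytesIO()
np.save(buffer, struc_array)
buffer.seek(0)
loaded = np.load(buffer)
```

Note that np.load on a .npy file returns a plain array, so no with block is needed; the with form applies to .npz archives.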
I have a directory with multiple .npy files (numpy arrays), each file has a 2 dimensional array (same width and height). I need to read all files and generate a 3 dimensional array containing all arrays in directory, the result shape should be something like (# of files, width, height).
My code so far:
import os
import numpy
for file in os.listdir(os.getcwd()):
    result = numpy.load(file) # Obviously this doesn't work
But I just simply don't know how to generate the result array. Should I first create a zeros array and then fill it? Can I make this on the fly?
Can you help me please?
If you know how many files there are and what size the arrays are, allocate the result with numpy.empty first. (This is faster than numpy.zeros, because the elements don't have to be zeroed.) Something like this:
# Allocate uninitialized array with one 2-D slice per file.
bigarray = numpy.empty([len(filenames), width, height])
# Load files.
for i in range(len(filenames)):
    bigarray[i] = numpy.load(filenames[i])
If you do not know the dimensions in advance, use numpy.append. This is fairly slow, because it has to allocate a new chunk of memory and copy the data in each iteration. Try this:
# Load first array.
bigarray = numpy.load(filenames[0])
# Add a new axis to make it 3D.
bigarray = bigarray[numpy.newaxis,...]
# Load rest of arrays.
for i in range(1,len(filenames)):
    bigarray = numpy.append(bigarray, numpy.load(filenames[i])[numpy.newaxis,...], axis=0)
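A third option, not part of the answer above, is to load everything into a list and stack once at the end, which avoids both knowing the dimensions up front and the repeated copying. A minimal sketch, assuming all files hold arrays of the same shape; the temporary directory and the three small files are placeholders so that the example runs on its own.

```python
import glob
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmpdir:
    # Write three placeholder 4x5 arrays so there is something to load.
    for i in range(3):
        np.save(os.path.join(tmpdir, f'arr_{i}.npy'), np.full((4, 5), i))

    # Only pick up .npy files; sort for a reproducible order.
    filenames = sorted(glob.glob(os.path.join(tmpdir, '*.npy')))

    # Load every file, then stack once along a new first axis.
    arrays = [np.load(name) for name in filenames]
    bigarray = np.stack(arrays, axis=0)

print(bigarray.shape)
```

Filtering on the .npy extension also avoids the problem with os.listdir, which returns every file in the directory, including the script itself.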
I have data in an array with shape
(10,4,4,3)
First, I want to create an array with shape (merging, or flattening)
(10,48)
such that each (4,4,3) block is converted to one row.
Second, I want to go back to the original shape of the data (splitting) such that each element is placed at its original location again.
Thanks
b = a.reshape(10,48)
a = b.reshape(10,4,4,3)
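A short sketch confirming that the round trip puts every element back in place. The arange data is a placeholder, and the names restored and the -1 shorthand are my own additions.

```python
import numpy as np

# Placeholder data with a distinct value in every position.
a = np.arange(10 * 4 * 4 * 3).reshape(10, 4, 4, 3)

# Flatten each (4, 4, 3) block into one row; -1 lets NumPy infer the 48.
b = a.reshape(10, -1)

# Split the rows back into the original blocks.
restored = b.reshape(a.shape)

print(b.shape, restored.shape)
```

This works because reshape keeps the elements in the same row-major order in both directions.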