How to create numpy ndarray from numpy ndarrays? - python

I used the MNIST dataset for training a neural network, where the training data is returned as a tuple with two entries. The first entry contains the actual training images. This is a numpy ndarray with 50,000 entries. Each entry is, in turn, a numpy ndarray with 784 values, representing the 28 * 28 = 784 pixels in a single MNIST image.
I would like to create a new training set, however I do not know how to create an ndarray from other ndarrays. For instance, if I have the following two ndarrays:
a = np.ndarray((3,1), buffer=np.array([0.9,1.0,1.0]), dtype=float)
b = np.ndarray((3,1), buffer=np.array([0.8,1.0,1.0]), dtype=float)
How can I make a third one containing these two?
I tried the following but it creates only one entry.
c = np.ndarray((1,6,1), buffer=np.array(([a],[b])), dtype=float)
I would need it to be two entries.

Thanks, in the meantime I figured out that it is simply:
c = np.array((a, b))
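As a quick check of the one-liner above, here is a minimal sketch. The arrays are built with reshape instead of the buffer= form from the question, and np.stack is shown as an equivalent alternative; the flattening step at the end is only an assumption about what an MNIST-style training set might need.

```python
import numpy as np

# Same values as a and b in the question, built with reshape
a = np.array([0.9, 1.0, 1.0]).reshape(3, 1)
b = np.array([0.8, 1.0, 1.0]).reshape(3, 1)

# np.array on a tuple of equally shaped arrays adds a new leading axis
c = np.array((a, b))
print(c.shape)  # (2, 3, 1): two entries, each of shape (3, 1)

# np.stack along axis 0 gives the same result and makes the intent explicit
c_stacked = np.stack((a, b), axis=0)
print(np.array_equal(c, c_stacked))  # True

# Assumption: if each entry should be a flat vector, collapse the trailing axes
flat = c.reshape(len(c), -1)
print(flat.shape)  # (2, 3)
```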

Related

Converting an array from one shape to another?

I have a video, i.e. a sequence of images, saved in an array. Printing its shape gives:
Output:
(13,9,9)
Where the 13 represents the 13 image sequences and the two 9's represent the pixels. I wish to convert the array into an output like:
Output:
(81,13)
Where the 81 represents the 81 pixel instances and the 13 is capturing the time domain i.e. the video frames in time. I will then be feeding this into my CNN.
Does anyone have any suggestions? Using array.reshape(81,13) of course doesn't work.
Assuming x is the original video 3D array, you need this to convert it to the desired 2D array:
import numpy as np
x2d = x.transpose(1, 2, 0).reshape(-1, x.shape[0])
This also works:
x2d = x.reshape(x.shape[0], -1).T
The idea is to arrange the axes so that the elements you want in one row sit next to each other before you flatten. A plain reshape never reorders elements, so x.reshape(81, 13) on its own would mix values from different pixels and frames in the same row.
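To see the difference concretely, here is a small sketch on a made-up array of the same shape (the values are just a running counter, chosen for illustration). It checks that both one-liners agree and that a plain reshape gives a different layout.

```python
import numpy as np

# Stand-in for the video: 13 frames of 9 x 9 pixels, filled with a counter
x = np.arange(13 * 9 * 9).reshape(13, 9, 9)

# The two approaches from the answer
x2d_a = x.transpose(1, 2, 0).reshape(-1, x.shape[0])
x2d_b = x.reshape(x.shape[0], -1).T

# A plain reshape, for comparison
naive = x.reshape(81, 13)

# Row i * 9 + j should hold pixel (i, j) across all 13 frames
row = 2 * 9 + 3
print(np.array_equal(x2d_a[row], x[:, 2, 3]))  # True
print(np.array_equal(x2d_a, x2d_b))            # True
print(np.array_equal(naive, x2d_a))            # False
```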

Pytorch: Index with tensor along multiple axes OR scatter to more than one index at once

I am trying to update very specific indices of a multidimensional tensor in Pytorch, and I am not sure how to access the correct indices. I can do this in a very straightforward way in Numpy:
import numpy as np
#set up the array containing the data
data = 100*np.ones((10,10,2))
data[5:,:,:] = 0
#select the data points that I want to update
idxs = np.nonzero(data.sum(2))
#generate the updates that I am going to do
updates = np.random.randint(5,size=(idxs[0].shape[0],2))
#update the data
data[idxs[0],idxs[1],:] = updates
I need to implement this in PyTorch, but I am not sure how. It seems like I need the scatter function, but that only works along a single dimension rather than the multiple dimensions I need. How can I do this?
These operations work exactly the same in their PyTorch counterparts, except for torch.nonzero, which by default returns a tensor of size [z, n] (where z is the number of non-zero elements and n the number of dimensions) instead of a tuple of n tensors with size [z] (as NumPy does), but that behaviour can be changed by setting as_tuple=True.
Other than that you can directly translate it to PyTorch, but you need to make sure that the types match, because you cannot assign a tensor of type torch.long (default of torch.randint) to a tensor of type torch.float (default of torch.ones). In this case, data is probably meant to have type torch.long:
import torch
#set up the array containing the data
data = 100*torch.ones((10,10,2), dtype=torch.long)
data[5:,:,:] = 0
#select the data points that I want to update
idxs = torch.nonzero(data.sum(2), as_tuple=True)
#generate the updates that I am going to do
updates = torch.randint(5,size=(idxs[0].shape[0],2))
#update the data
data[idxs[0],idxs[1],:] = updates
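The two index layouts mentioned above can be compared without PyTorch. In this sketch np.argwhere is used only as a stand-in for the default [z, n] layout that the answer describes for torch.nonzero, while np.nonzero gives the tuple layout that as_tuple=True selects.

```python
import numpy as np

# Same setup as in the question
data = 100 * np.ones((10, 10, 2))
data[5:, :, :] = 0
summed = data.sum(2)

# Tuple layout: one index array per dimension, each of shape (z,)
idx_tuple = np.nonzero(summed)
print(len(idx_tuple), idx_tuple[0].shape)  # 2 (50,)

# Row layout: a single array of shape (z, n), one row per non-zero element
idx_rows = np.argwhere(summed)
print(idx_rows.shape)  # (50, 2)

# Transposing the row layout recovers the per-dimension index arrays
rows, cols = idx_rows.T
print(np.array_equal(rows, idx_tuple[0]))  # True
print(np.array_equal(cols, idx_tuple[1]))  # True
```

Only the tuple layout can be used directly in data[idxs[0], idxs[1], :], which is why the answer passes as_tuple=True.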

How to add several vectors to numpy structured array and call matrix later from fieldname?

Hey guys, I need help.
I want to use TensorFlow's data import, where the data is loaded by reading the features/labels vectors from a structured NumPy array.
https://www.tensorflow.org/programmers_guide/datasets#consuming_numpy_arrays
I want to create such a structured array by successively adding the two vectors (feature_vec and label_vec) to a NumPy structured array.
import numpy as np
# example vectors
feature_vec= np.arange(10)
label_vec = np.arange(10)
# structured array which should get the vectors
struc_array = np.array([feature_vec,label_vec],dtype=([('features',np.float32), ('labels',np.float32)]))
# How can I add now new vectors to struc_array?
struc_array.append(---)
Later, when this array is loaded from a file, I want to retrieve the feature vectors (which now form a matrix) or the label vectors by field name:
with np.load("/var/data/training_data.npy") as data:
    features = data["features"] # matrix containing feature vectors as rows
    labels = data["labels"] # matrix containing label vectors as rows
Everything I tried to code was complete crap.. never got a correct output..
Thanks for your help!
Don't create a NumPy array and then append to it. That doesn't really make sense, as NumPy arrays have a fixed size and require a full copy to append a single row or column. Instead, create a list, append to it, then construct the array at the end:
vecs = [feature_vec,label_vec]
dtype = [('features',np.float32), ('labels',np.float32)]
# append as many times as you want:
vecs.append(other_vec)
dtype.append(('other', np.float32))
struc_array = np.array(vecs, dtype=dtype)
Of course, you probably need ot
Unfortunately, this doesn't solve the problem.
I want to get just the labels or the features from the structured array by using:
labels = struc_array['labels']
features = struc_array['features']
But when I build the structured array the way you did, both labels and features contain all of the appended vectors:
import numpy as np
feature_vec= np.arange(10)
label_vec = np.arange(0,5,0.5)
vecs = [feature_vec,label_vec]
dtype = [('features',np.float32), ('labels',np.float32)]
other_vec = np.arange(6,11,0.5)
vecs.append(other_vec)
dtype.append(('other', np.float32))
struc_array = np.array(vecs, dtype=dtype)
# This contains all vectors.. not just the labels vector
labels = struc_array['labels']
# This also contains all vectors.. not just the feature vector
features = struc_array['features']
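The behaviour in the follow-up comes from how NumPy converts plain numbers to a structured dtype: each number is copied into every field, so every field ends up holding all the vectors. One possible way around this, sketched below, is to give each field a sub-array shape and fill the fields from separate lists. The vector length of 10 and the names features_list and labels_list are assumptions made for illustration, not part of the original posts.

```python
import numpy as np

# Collect vectors in ordinary lists first (hypothetical example data)
features_list = []
labels_list = []
for i in range(3):
    features_list.append(np.arange(10).astype(np.float32) + i)
    labels_list.append(np.arange(0, 5, 0.5).astype(np.float32) + i)

# Each record holds one feature vector and one label vector of length 10
dtype = [('features', np.float32, (10,)), ('labels', np.float32, (10,))]
struc_array = np.zeros(len(features_list), dtype=dtype)

# Fill each field from its own list, stacked into a (3, 10) matrix
struc_array['features'] = np.stack(features_list)
struc_array['labels'] = np.stack(labels_list)

features = struc_array['features']  # only the feature vectors, one per row
labels = struc_array['labels']      # only the label vectors, one per row
print(features.shape, labels.shape)  # (3, 10) (3, 10)
```

This assumes all vectors have the same length; the list-then-array pattern from the answer above is kept, only the dtype and the filling step differ.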

Feeding Numpy Arrays to CNTK LSTM Model

I'm looking to see if there is a way to feed sequence data as Numpy arrays to a text LSTM model defined in CNTK. Each instance in my dataset is a sequence of integers mapping back to words, and the length of each sequence is different. It seems like one can convert the raw text data to the CTF format and feed this data to a model by creating a reader function which generates mini-batches as in this example. However, I'm wondering if there is a way to feed Numpy arrays to this same model.
Further down in this example, there is a discussion of feeding sequences with Numpy, which I was hoping would solve my problem. However, the example deals with sequences of images instead of variable-length sequences of words. In the case of the example, we'll end up with a tensor of n elements that are each 3 x 32 x 32, and we can set up an input variable expecting these dimensions. However, in the case of sequences of words where each sequence has a different length, this example breaks down.
Any help on interop between CNTK and Numpy for text-based LSTMs / RNNs would be greatly appreciated.
You are probably looking for:
x = cntk.sequence.input_variable(shape=())
Here is a small sample program that demonstrates how it works with variable sequence lengths:
import numpy as np
import cntk
# define the model
x = cntk.sequence.input_variable(shape=())
z = cntk.sequence.last(x)
# define the data
a = [[1,2,3], [4,5], [6,7,8,9], [0]]
b = [np.array(i, dtype=np.float32) for i in a]
# evaluate
res = z.eval({x: b})
print(res)

How to concatenate N different 1D arrays in python

I am new to Python. I have to implement k-fold cross validation in Python. I am able to split the given data into k equal-sized arrays, but I am not able to concatenate the k-1 arrays that will form the training data set. I know about concatenate() in numpy, but as k is determined on the fly I am not sure how to use it in this scenario. I would appreciate any info in this regard. Thanks in advance.
Check out numpy.vstack. It stacks a sequence of arrays on top of each other (assuming the column dimensions match). hstack does the same side by side, along the columns.
import numpy as np
k = 10
all_data = [np.random.random((10,5)) for i in range(k)]
train = all_data[:k-1] #list of 9 (10,5) arrays
test = all_data[k-1] #one (10,5) array
train = np.vstack(train) #stacks them on top of each other
print(train.shape) # one (90, 5) array
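Since the question is about 1D arrays and a k that is only known at run time, here is a minimal sketch using np.concatenate on a list of folds. The toy data, the value of k and the use of np.array_split for the initial split are assumptions for illustration.

```python
import numpy as np

k = 4                            # number of folds, chosen at run time
data = np.arange(12)             # toy 1D data set
folds = np.array_split(data, k)  # list of k 1D arrays

splits = []
for i in range(k):
    test = folds[i]
    # np.concatenate accepts a list of any length, so k can vary freely
    train = np.concatenate(folds[:i] + folds[i + 1:])
    splits.append((train, test))

print(len(splits))                               # 4
print(splits[0][0].shape, splits[0][1].shape)    # (9,) (3,)
```

Each pass leaves out a different fold as the test set, which is what k-fold cross validation requires.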
