How to expand multidimensional numpy arrays? - python

I have a multidimensional numpy array of following shape
(80, 755, 6)
that can be interpreted as 80 time series, each containing 755 elements (time points/rows) consisting of 6 values (features). Now I want to extend the number of features to 8.
The two new features are fixed values, one of type string and one of type int. For each observed feature vector I want to add the new features. For example, this is one observed list of features
[0.1,0.2,0.3,0.4,0.5, 0.6]
that I want to extend with fixed values like this
[28,0.1,0.2,0.3,0.4,0.5, 0.6, "Door"]
for all observations in each time series, so that I end up with a numpy array of shape (80, 755, 8). What is the fastest way to do that?
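A minimal sketch of one approach (using random stand-in data): mixing an int and a string with floats forces a numpy object dtype, so broadcast the two constants to shape (80, 755, 1) and concatenate along the feature axis:

import numpy as np

# stand-in for the real (80, 755, 6) float data
data = np.random.rand(80, 755, 6)

# mixing an int and a string with floats forces dtype=object,
# so the result is no longer a purely numeric array
lead = np.full((80, 755, 1), 28, dtype=object)
trail = np.full((80, 755, 1), "Door", dtype=object)

result = np.concatenate([lead, data.astype(object), trail], axis=2)
print(result.shape)  # (80, 755, 8)

Note that an object array loses the speed and memory benefits of a numeric dtype; if the string can be encoded as a number, keeping everything float is usually preferable.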

Related

Unpack 2D np.array with cells containing 1D np.arrays with different lengths

I have a two-dimensional np.array, where cells are filled with floats or 1d arrays.
In the two-dimensional array, the first dimension is samples and the second dimension is sample descriptions from different sources. Each cell is either a float or a string represented as an ASCII-encoded array.
Example:
array([[3.2, array([1,2,5,1]), array([1,6,9]), array([1,2])],
       [2.1, array([1,2,9]), array([8,3,5,8]), array([1,3])],
       [1.2, array([1,1]), array([4,2,6,4,5]), array([2,2,4])]])
The first three columns are my inputs, the fourth is my output.
I want to feed a seq2seq LSTM in TensorFlow with this data.
As first approach, I've tried to convert each 1d array in cells to a Tensor but I get an error:
ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object
type tensorflow.python.framework.ops.EagerTensor).
I'm wondering if it is necessary to unpack the 1d arrays in the cells into a new dimension. How can that be done, considering the 1d arrays in the cells have different lengths?
Somewhere I've read that with batch_size=1 it is possible to feed an LSTM with arrays of different lengths. Does anyone have experience with that?
Thanks for your help.
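One common workaround, not mentioned in the question, is to pad the ragged arrays to a common length and mask the padding so the LSTM skips it; a sketch assuming TensorFlow/Keras and toy data:

import numpy as np
import tensorflow as tf

# hypothetical ragged input: three samples of different lengths
sequences = [np.array([1, 2, 5, 1]), np.array([1, 2, 9]), np.array([1, 1])]

# pad to a common length so everything fits in one dense tensor
padded = tf.keras.preprocessing.sequence.pad_sequences(
    sequences, padding="post", dtype="float32")  # shape (3, 4)

# add a feature axis; a Masking layer tells the LSTM to skip padded steps
x = padded[..., np.newaxis]  # shape (3, 4, 1)
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None, 1)),
    tf.keras.layers.Masking(mask_value=0.0),
    tf.keras.layers.LSTM(16),
])
print(model(x).shape)  # (3, 16)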

From an array of indices, create an array of the values at those indices

This is for a Machine Learning problem (in Python of course).
I have a 2-dimensional array whose rows are sets of points and whose columns are indices into another 1-dimensional array of values for those points.
data = [[1,3,2], [3,3,1], [5,1,2]]
# yes there are duplicates in the labels
labels = [2,8,9,8,8,9]
What I need is to create a 2D array with the shape of the original data array, but where each value is replaced by the value from labels at that index.
new_data = [[8,8,9], [8,8,8], [9,8,9]]
I can do this with for loops obviously. I'm asking here in case numpy or something has a call that does this.
Use the indices as indices:
np.array(labels)[np.array(data)]
The output of an advanced (integer) index has the shape of the index array (data).
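For example, with the data and labels from the question:

import numpy as np

data = [[1, 3, 2], [3, 3, 1], [5, 1, 2]]
labels = [2, 8, 9, 8, 8, 9]

new_data = np.array(labels)[np.array(data)]
print(new_data)
# [[8 8 9]
#  [8 8 8]
#  [9 8 9]]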

Resample a categorical numpy array

I have a 1 dimensional numpy array labels (say its length is 700k) sampled at 700 Hz. So, it corresponds to 1000 seconds of time series data. The array consists of integers 0 to 3 which stand for some categorical information. Also, the categories rarely change, like 200 seconds of 0, then 150 seconds of 2 and so on...
Now, I would like to convert it to an array of 64 Hz, that is, the new length of the array will be 700k x (64/700) = 64k.
import scipy.signal
resampledLabels = scipy.signal.resample(labels, 64000)
The problem with the above code is that it makes assumptions about the array and interpolates. I tried rounding the results to the nearest integer, but they also contained a -1, which is out of range for the actual array.
My problem is, how can I resample the array without making interpolations?
You can use simple numpy slicing, which has the format start:stop:step. It is constant time (it returns a view) and reflects later changes to the original array.
In your case the downsampling factor is 700/64 ≈ 10.94, which is not an integer, so plain slicing like labels[::11] only approximates the target length (about 63,636 samples instead of 64,000). For the exact length, pick nearest indices instead, as in the sketch below.
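Since 700/64 is not an integer, an exact-length alternative is to pick the nearest original sample for each target position with np.linspace; a sketch with hypothetical data:

import numpy as np

labels = np.random.randint(0, 4, 700_000)  # hypothetical 700 Hz category stream

# choose, for each of the 64000 target positions, the nearest original
# sample; no interpolation, so only values already in labels can appear
idx = np.round(np.linspace(0, len(labels) - 1, 64_000)).astype(int)
resampled = labels[idx]
print(resampled.shape)  # (64000,)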

How to index several 2d numpy arrays with different number of rows in a 3d array?

I have the following problem in python: I have several numpy 2d-arrays that all have the same number of columns but different numbers of rows. I want to index all these 2d-arrays in a single numpy 3d-array where the first index identifies each 2d-array.
For example :
let's suppose I got two 2d-arrays like this :
[[1, 2, 3], [4, 5, 6], [7, 8, 9]] (3x3 array)
[[11, 12, 13], [14, 15, 16]] (2x3 array)
I want to get a numpy 3d-array named, for example, c, where c[0] has shape (3, 3), c[1] has shape (2, 3), and so on. So I expect a 3d-array whose shape is (number of 2d matrices, number of rows of one matrix, number of columns of one matrix).
How can I reshape, given that the 2d-arrays have different numbers of rows?
Thanks
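A true 3d ndarray requires uniform shapes, so two common options (neither from the original post) are an object array of the matrices or zero-padding to the largest row count; a sketch using the arrays from the question:

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])  # (3, 3)
b = np.array([[11, 12, 13], [14, 15, 16]])       # (2, 3)

# Option 1: an object array; c[0].shape == (3, 3), c[1].shape == (2, 3)
c = np.empty(2, dtype=object)
c[0], c[1] = a, b

# Option 2: zero-pad to the largest row count for a true (2, 3, 3) ndarray
rows = max(m.shape[0] for m in (a, b))
padded = np.stack([np.pad(m, ((0, rows - m.shape[0]), (0, 0))) for m in (a, b)])
print(padded.shape)  # (2, 3, 3)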

Python: Reshaping arrays and lists

I have a numpy ndarray object with the following shape:
(3, 256, 170, 256).
So, basically this represents an array of 3-dimensional vectors. The vector dimension is the first axis, which enables one to write something like array[0] for the relevant vector component.
Now, I am trying to use the scipy pdist function, which computes the distance between the entries. So, I need to reshape this array into a two-dimensional matrix with 256*170*256 rows and 3 columns; pdist should then return the matrix where each element is the squared distance between the corresponding 3-dimensional vectors (if I have interpreted the documentation correctly).
Can someone tell me how I can get a view into this numpy array so that I can generate this matrix? I do not want to copy the data (these matrices can be quite large), so I am looking for efficient solutions.
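A minimal sketch of one way to get such a view (with random stand-in data): reshaping the contiguous array to (3, N) returns a view, and so does transposing it, so no data is copied before pdist is called:

import numpy as np
from scipy.spatial.distance import pdist

arr = np.random.rand(3, 256, 170, 256)  # stand-in for the real data

# reshape on the contiguous array returns a view, and so does .T,
# so no data is copied here
vecs = arr.reshape(3, -1).T  # shape (256*170*256, 3)

# pdist returns a condensed vector of pairwise distances; 'sqeuclidean'
# gives squared distances. The full ~11M-row input would need far too
# much memory, so this demonstrates on a small slice.
d = pdist(vecs[:1000], metric="sqeuclidean")
print(d.shape)  # (499500,) == 1000*999/2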
