Python reshape list to ndim array - python

Hi, I have a list, flat, of length 2800. It contains 100 results for each of 28 variables. Below is an example of 4 results for 2 variables:
[0,
0,
1,
1,
2,
2,
3,
3]
I would like to reshape the list into a (2, 4) array so that the results for each variable are in a single row.
[[0,1,2,3],
[0,1,2,3]]

You can think of reshaping as filling the new shape row by row (the last dimension varies fastest) from the flattened original list/array.
If you want to fill an array column by column instead, an easy solution is to reshape the list into an array with reversed dimensions and then transpose it:
x = np.reshape(list_data, (100, 28)).T
The snippet above results in a 28x100 array, filled column-wise.
To illustrate, here are the two options of shaping a list into a 2x4 array:
np.reshape([0, 0, 1, 1, 2, 2, 3, 3], (4, 2)).T
# array([[0, 1, 2, 3],
# [0, 1, 2, 3]])
np.reshape([0, 0, 1, 1, 2, 2, 3, 3], (2, 4))
# array([[0, 0, 1, 1],
# [2, 2, 3, 3]])

You can specify the interpretation order of the axes using the order parameter:
np.reshape(arr, (2, -1), order='F')
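To make the effect of order='F' concrete, here is a small sketch using the 8-element example from the question (the variable names are only for illustration). It suggests that Fortran order and the reshape-then-transpose route end up with the same array:

```python
import numpy as np

flat = [0, 0, 1, 1, 2, 2, 3, 3]  # 4 results for 2 variables, interleaved

# order='F' lets the first axis vary fastest, so consecutive
# items of the list are placed down each column.
by_order = np.reshape(flat, (2, -1), order='F')

# Same result via the default row-by-row fill with reversed shape, then transpose.
by_transpose = np.reshape(flat, (-1, 2)).T

print(by_order)
# [[0 1 2 3]
#  [0 1 2 3]]
print(np.array_equal(by_order, by_transpose))  # True
```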

Step by step:
# import numpy library
import numpy as np
# create list
my_list = [0,0,1,1,2,2,3,3]
# convert list to numpy array
np_array=np.asarray(my_list)
# reshape array into 4 rows x 2 columns, and transpose the result
reshaped_array = np_array.reshape(4, 2).T
#check the result
reshaped_array
array([[0, 1, 2, 3],
[0, 1, 2, 3]])

The answers above are good. Adding a case that I used.
This is for when you don't want to use numpy and want to keep the data as a list without changing its contents.
You can run a small loop to change the dimensions from 1xN to Nx1.
tmp=[]
for b in bus:
tmp.append([b])
bus=tmp
It may not be efficient for very large lists, but it works for a small set of numbers.
Thanks
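In the same no-numpy spirit, here is a rough sketch using list comprehensions and slicing. The names n_vars, per_variable and column are made up for this example. The first comprehension regroups the interleaved list from the original question, the second is the 1xN to Nx1 loop above written in one line:

```python
flat = [0, 0, 1, 1, 2, 2, 3, 3]  # interleaved results, as in the question
n_vars = 2                        # hypothetical name: number of variables

# Every n_vars-th item belongs to the same variable,
# so a strided slice collects one variable per inner list.
per_variable = [flat[i::n_vars] for i in range(n_vars)]
print(per_variable)  # [[0, 1, 2, 3], [0, 1, 2, 3]]

# 1xN -> Nx1: wrap each item in its own list.
column = [[b] for b in flat]
print(column[:3])  # [[0], [0], [1]]
```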

Related

Python - How to remove some terms of a numpy array in specific intervals

Suppose I have the following array:
import numpy as np
x = np.array([1,2,3,4,5,
1,2,3,4,5,
1,2,3,4,5])
How can I manipulate it to remove the term in equally spaced intervals and adapt the new length for it? For example, I'd like to have:
x = [1,2,3,4,
1,2,3,4,
1,2,3,4]
Where the terms from positions 4, 9, and 14 were excluded (so every 5 terms, one gets excluded). If possible, I'd like to have a code that I could use for an array with length N. Thank you in advance!
In your case, you can simply run the code below after initializing the x array (as you did in your question):
x.reshape(3,5)[:,:4]
Output
array([[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4]])
If you want a vector rather than a matrix (such as the output above), you can call flatten on the result:
x.reshape(3,5)[:,:4].flatten()
Output
array([1, 2, 3, 4,
1, 2, 3, 4,
1, 2, 3, 4])
Explanation
Since x is a NumPy array, we can use NumPy's built-in methods such as reshape, which reshapes the array into the desired shape. x was a vector of 15 elements, so x.reshape(3,5) gives a matrix with 3 rows and 5 columns. [:, :4] then selects the first four columns, and flatten turns the matrix back into a vector.
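For an array of general length, the reshape idea could be wrapped roughly as follows. The helper name drop_last_of_each_block is hypothetical, and the sketch assumes the length is an exact multiple of the block size (reshape fails otherwise, so the check raises a clearer error):

```python
import numpy as np

def drop_last_of_each_block(x, n):
    """Remove the last item of every block of n items.

    Assumes len(x) is a multiple of n; raises ValueError otherwise.
    """
    x = np.asarray(x)
    if x.size % n != 0:
        raise ValueError("length of x must be a multiple of n")
    # -1 lets NumPy infer the number of rows; [:, :-1] drops the last column.
    return x.reshape(-1, n)[:, :-1].ravel()

x = np.array([1, 2, 3, 4, 5] * 3)
result = drop_last_of_each_block(x, 5)
print(result)  # [1 2 3 4 1 2 3 4 1 2 3 4]
```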
IIUC, you can use a boolean mask generated with the modulo (%) operator:
N = 5
mask = np.arange(len(x))%N != N-1
x[mask]
output: array([1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4])
This works even if the size of your array is not a multiple of N.
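A quick check of that claim, on a 12-element array made up for this example (12 is not a multiple of 5). As an alternative I am adding here, np.delete with the explicit positions gives the same result:

```python
import numpy as np

N = 5
x = np.arange(1, 13)  # 12 elements: 1..12, not a multiple of N

# Boolean mask: False at positions 4, 9, ... (the last slot of each block of N).
mask = np.arange(len(x)) % N != N - 1
masked = x[mask]

# Alternative: delete those positions directly.
deleted = np.delete(x, np.arange(N - 1, len(x), N))

print(masked)  # [ 1  2  3  4  6  7  8  9 11 12]
```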

NumPy - Excluding all zero 2D arrays from a 3D array

I have multiple 3D arrays with different shapes but I'm going to assume I have an array named A with shape (53, 768, 768) for an example. It consists of 53 2D arrays and some of them may be empty images. Those empty images have only 0 pixel values.
If there are N slices with all 0 values, I want to slice A into a (53 - N, 768, 768) 3D array. Is this possible with indexing?
I tried something like this a[:, ~np.all(a == 0)], but it returns an array with shape (53, 1, 768, 768).
Let's assume your data is something like this:
z = np.array([
[[1, 2, 3], [4, 5, 6]],
[[7, 8, 9], [10, 11, 12]],
[[0, 0, 0], [0, 0, 0]],
[[1, 1, 1], [1, 1, 1]]
])
The shape of z is (4, 2, 3). We therefore need a vector with shape 4, aggregating over the other dimensions. We can use the axis= parameter in most Numpy functions for this:
mask = np.any(z != 0, axis=(1, 2))
z[mask]
In this example, mask will be array([ True,  True, False,  True]), so the all-zero slice is dropped.
Axes are numbered 0, 1, 2, etc. So we use 1 and 2 to refer to the 2nd and 3rd axes.
You can also use negative numbers as in the other answer; if you write axis=(-2, -1) that refers to the last and 2nd-to-last axes, i.e. axes 1 and 2 in this example.
In general, use axis= to specify which axes are to be collapsed by aggregating. Any axis not specified in axis= will not be aggregated.
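Here is a small self-contained sketch of the masking idea on a made-up (5, 4, 4) array standing in for the (53, 768, 768) one. It uses np.any, so a slice is kept as soon as it contains at least one nonzero value:

```python
import numpy as np

# Hypothetical stand-in for A: 5 slices of 4x4, three of them all zeros.
A = np.zeros((5, 4, 4), dtype=int)
A[0, 1, 2] = 7   # slice 0 has a single nonzero pixel
A[3] = 1         # slice 3 is entirely nonzero

# Collapse axes 1 and 2: one boolean per slice along axis 0.
keep = np.any(A != 0, axis=(1, 2))
nonempty = A[keep]

print(keep)            # [ True False False  True False]
print(nonempty.shape)  # (2, 4, 4)
```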
Use:
import numpy as np
A = np.array(A) # if A is not a NumPy array
result = A[np.sum(A, axis = (-1, -2)) != 0]
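This will do, provided the values are non-negative (as pixel values usually are); with mixed signs a slice could sum to zero without being all zeros.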

Numpy: vectorize matrix creation

If I want to create a matrix, I simply call
m = np.matrix([[x00, x01],
[x10, x11]])
, where x00, x01, x10 and x11 are numbers. However, I would like to vectorize this process. For example, if the x's are one-dimensional arrays with length l, then I would like m to become an array of matrices, or a lx2x2-dimensional array. Unfortunately,
zeros = np.zeros(10)
ones = np.ones(10)
m = np.matrix([[zeros, ones],
[zeros, ones]])
raises an error ("matrix must be 2-dimensional") and
m = np.array([[zeros, ones],
[zeros, ones]])
gives a 2x2xl-dimensional array instead. To solve this, I could call np.moveaxis(m, 2, 0), but I am looking for a direct solution that doesn't need to change the order of axes of a (potentially huge) array. This also only sets the axis order right if I'm passing one-dimensional arrays as values for my matrix, not if they're higher dimensional.
Is there a general and efficient way of vectorizing the creation of matrices?
Let's try a 2d (4d after joining) case:
In [374]: ones = np.ones((3,4),int)
In [375]: arr = np.array([[ones*0, ones],[ones*2, ones*3]])
In [376]: arr
Out[376]:
array([[[[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]],
[[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]]],
[[[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2]],
[[3, 3, 3, 3],
[3, 3, 3, 3],
[3, 3, 3, 3]]]])
In [377]: arr.shape
Out[377]: (2, 2, 3, 4)
Notice that the original array elements are 'together'. arr has its own databuffer, with copies of the original arrays, but it was made with relatively efficient block copies.
We can easily transpose axes:
In [378]: arr.transpose(2,3,0,1)
Out[378]:
array([[[[0, 1],
[2, 3]],
[[0, 1],
[2, 3]],
...
[[0, 1],
[2, 3]]]])
Now it's 12 (2,2) arrays. It is a view, using arr's databuffer. It just has a different shape and strides. Doing this transpose is quite efficient, and isn't any slower when arr is very big. And a lot of math on the transposed array will be nearly as efficient as on the original arr (because of strided iteration). If there are differences in speed, they will be because of caching at a deep level.
But some actions will require a copy. For example the transposed array can't be raveled without a copy. The original 0s,1s etc are no longer together.
In [379]: arr.transpose(2,3,0,1).ravel()
Out[379]:
array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1,
2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3,
0, 1, 2, 3])
I could construct the same 1d array with
In [380]: tarr = np.empty((3,4,2,2), int)
In [381]: tarr[...,0,0] = ones*0
In [382]: tarr[...,0,1] = ones*1
In [383]: tarr[...,1,0] = ones*2
In [384]: tarr[...,1,1] = ones*3
In [385]: tarr.ravel()
Out[385]:
array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1,
2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3,
0, 1, 2, 3])
This tarr is effectively what you are trying to produce 'directly'.
Another way to look at this construction, is to assign the values to the array's .flat with strides - insert 0s at every 4th slot, 1s at the adjacent ones, etc.:
In [386]: tarr.flat[0::4] = ones*0
In [387]: tarr.flat[1::4] = ones*1
In [388]: tarr.flat[2::4] = ones*2
In [389]: tarr.flat[3::4] = ones*3
Here's another 'direct' way - use np.stack (a version of concatenate) to create a (3,4,4) array, which can then be reshaped:
np.stack((ones*0,ones*1,ones*2,ones*3),2).reshape(3,4,2,2)
That stack is, in essence:
In [397]: ones1 = ones[...,None]
In [398]: np.concatenate((ones1*0, ones1*1, ones1*2, ones1*3),axis=2)
Notice that this target (3,4,2,2) could be reshaped to (12,4) (and vice versa) at no cost. So the original problem becomes: is it easier to construct a (4,12) and transpose, or construct the (12,4) first? It's really a 2d problem, not a (m+n)d one.
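The stack-and-reshape route can be wrapped in a small helper that works for inputs of any (equal) shape. This is only a sketch: the name stack_matrices is made up, and it assumes all four inputs already have the same shape:

```python
import numpy as np

def stack_matrices(x00, x01, x10, x11):
    """Build an array of 2x2 matrices with shape x00.shape + (2, 2).

    Assumes the four inputs are arrays of identical shape.
    """
    # Stack along a new last axis: shape (..., 4).
    stacked = np.stack((x00, x01, x10, x11), axis=-1)
    # Split that last axis of length 4 into (2, 2); contiguous, so no extra copy.
    return stacked.reshape(stacked.shape[:-1] + (2, 2))

ones = np.ones((3, 4), int)
m = stack_matrices(ones * 0, ones, ones * 2, ones * 3)

print(m.shape)  # (3, 4, 2, 2)
print(m[0, 0])
# [[0 1]
#  [2 3]]
```

The values match the transposed view arr.transpose(2,3,0,1) from earlier, but here the 2x2 blocks are laid out contiguously in the new array's buffer.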
An np.matrix must be 2-dimensional. From the NumPy documentation of np.matrix:
Returns a matrix from an array-like object, or from a string of data.
A matrix is a specialized 2-D array that retains its 2-D nature
through operations. It has certain special operators, such as *
(matrix multiplication) and ** (matrix power).
Note
It is no longer recommended to use this class, even for linear
algebra. Instead use regular arrays. The class may be removed in the
future.
Is there any reason you want np.matrix? Most numpy operations should be doable in the array object as the matrix class is quasi-deprecated.
From your example I tried using the transpose (.T) method:
zeros = np.zeros(10)
ones = np.ones(10)
twos = np.ones(10) * 2
threes = np.ones(10) * 3
m = np.array([[zeros, ones], [twos, threes]]).T
>> array([[0,2],[1,3]],...)
or
m = np.transpose(np.array([[zeros, ones], [twos, threes]]), (2,0,1))
>> array([[0,1],[2,3]],...)
This yields a (10, 2, 2) array

Adding zeros to given positions in a numpy array

I have a numpy array and I want a function that takes as input the numpy array and a list of indices and returns another array with the following property: a zero has been added to the initial array just before the position of each of the indices of the original array.
Let me give a couple of examples:
If indices = [1] and the initial array is array([1, 1, 2]), then the output of the function should be array([1, 0, 1, 2]).
If indices = [0, 1, 3] and the initial array is array([1, 2, 3, 4]), then the output of the function should be array([0, 1, 0, 2, 3, 0, 4]).
I would like to do it in a vectorized manner without any for loops.
Had the same issue before. Found a solution using np.insert:
import numpy as np
np.insert([1, 1, 2], [1], 0)
>>> [1, 0, 1, 2]
I see @jdehesa has commented this already, but I am adding it as a permanent answer for future visitors.
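As a quick sketch, wrapping np.insert in a function (the name add_zeros_before is made up) reproduces both examples from the question. When several indices are passed, np.insert interprets them as positions in the original array, which is the behaviour asked for:

```python
import numpy as np

def add_zeros_before(arr, indices):
    """Insert a 0 before each given position of the original array."""
    return np.insert(np.asarray(arr), indices, 0)

first = add_zeros_before([1, 1, 2], [1])
second = add_zeros_before([1, 2, 3, 4], [0, 1, 3])

print(first)   # [1 0 1 2]
print(second)  # [0 1 0 2 3 0 4]
```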

Extracting required indices from an array of tuples

import numpy as np
from scipy import signal
y = np.array([[2, 1, 2, 3, 2, 0, 1, 0],
[2, 1, 2, 3, 2, 0, 1, 0]])
maximas = signal.argrelmax(y, axis=1)
print(maximas)
(array([0, 0, 1, 1], dtype=int64), array([3, 6, 3, 6], dtype=int64))
maximas holds the indices as a tuple of arrays: (0,3) and (0,6) are for row one [2, 1, 2, 3, 2, 0, 1, 0]; and (1,3) and (1,6) are for the other row [2, 1, 2, 3, 2, 0, 1, 0].
The following prints all the results:
>>> print(y[kk])
[3 1 3 1]
But I want to extract only the first maximum of each row, i.e., [3,3], using the tuples. So, the tuples I need are (0,3) and (1,3).
How can I extract them from the array of tuples, i.e., 'maximas'?
Given the tuple maximas, here's one possible NumPy way:
>>> a = np.column_stack(maximas)
>>> a[np.unique(a[:,0], return_index=True)[1]]
array([[0, 3],
[1, 3]], dtype=int64)
This stacks the coordinate lists returned by signal.argrelmax into an array a. The return_index parameter of np.unique is used to find the first index of each row number. We can then retrieve the relevant rows from a using these first indexes.
This returns an array, but you could turn it into a list of lists with tolist().
To return the first column index of the maximum in each row, you just need to take the indices returned by np.unique from maximas[0] and use them to index maximas[1]. In one line, it's this:
>>> maximas[1][np.unique(maximas[0], return_index=True)[1]]
array([3, 3], dtype=int64)
To retrieve the corresponding values from each row of y, you can use np.choose:
>>> cols = maximas[1][np.unique(maximas[0], return_index=True)[1]]
>>> np.choose(cols, y.T)
array([3, 3])
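Putting the pieces together, here is a sketch that runs without scipy by hard-coding maximas to the tuple printed in the question (that hard-coding is an assumption for the example). It also uses plain integer-array indexing y[rows, cols] in place of np.choose, as an alternative way to get the values:

```python
import numpy as np

y = np.array([[2, 1, 2, 3, 2, 0, 1, 0],
              [2, 1, 2, 3, 2, 0, 1, 0]])

# Hard-coded copy of the signal.argrelmax(y, axis=1) output shown in the question.
maximas = (np.array([0, 0, 1, 1]), np.array([3, 6, 3, 6]))

# return_index gives the position of the first occurrence of each row number.
rows, first = np.unique(maximas[0], return_index=True)
cols = maximas[1][first]

pairs = list(zip(rows.tolist(), cols.tolist()))
values = y[rows, cols]  # pick y[row, col] for each (row, col) pair

print(pairs)   # [(0, 3), (1, 3)]
print(values)  # [3 3]
```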
Well, a pure Python approach would be to use itertools.groupby (grouping on the row index) and a list comprehension:
>>> from itertools import groupby
>>> from operator import itemgetter
>>> [max(g, key=lambda x: y[x])
for k, g in groupby(zip(*maximas), itemgetter(0))]
[(0, 3), (1, 3)]
