I have a 3d numpy table with shape=(2,3,4) like below:
a = np.array([[[1., 2., 3., 4.],
[1., 2., 3., 4.],
[1., 2., 3., 4.]],
[[5., 6., 7., 8.],
[5., 6., 7., 8.],
[5., 6., 7., 8.]]])
I want to reshape it so that the columns of each 2D block are stacked into a single column of a new 2D matrix:
1 5
1 5
1 5
2 6
2 6
2 6
3 7
3 7
3 7
4 8
4 8
4 8
Here you go:
res = a.T.reshape((-1,2))
Output:
array([[1., 5.],
[1., 5.],
[1., 5.],
[2., 6.],
[2., 6.],
[2., 6.],
[3., 7.],
[3., 7.],
[3., 7.],
[4., 8.],
[4., 8.],
[4., 8.]])
To reshape a numpy array, use the reshape method.
Basically, it treats the array as if it were flattened and fills the new shape from that flat sequence. Note, however, that it iterates over the last index first, i.e., the innermost list is processed first, then the next one, and so on.
So both np.array([1, 2, 3, 4, 5, 6]).reshape((3, 2)) and np.array([[1, 2, 3], [4, 5, 6]]).reshape((3, 2)) will give [[1, 2], [3, 4], [5, 6]], since the two original arrays are the same when flattened.
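A quick way to check this claim is to reshape both arrays and compare the results. This is only a minimal sketch; the names flat and nested are illustrative.

```python
import numpy as np

flat = np.array([1, 2, 3, 4, 5, 6])
nested = np.array([[1, 2, 3], [4, 5, 6]])

# Both arrays hold the same elements in the same C (row-major) order,
# so reshaping either one produces the same (3, 2) array.
r1 = flat.reshape((3, 2))
r2 = nested.reshape((3, 2))

print(r1)
print(np.array_equal(r1, r2))
```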
You want a (12, 2) array; as the reshape docs explain, you can also pass (-1, 2) and numpy will figure out the other dimension.
So if you just give the new shape for your array (called x here) as is, it will start working with the first list x[0, 0] = [1, 2, 3, 4], which would become [[1, 2], [3, 4]], and so on. That's not what you want.
But note that if you transpose your array first, then you'll have the items you want in the inner lists (the fast-varying index):
In : x.T
Out:
array([[[1., 5.],
[1., 5.],
[1., 5.]],
[[2., 6.],
[2., 6.],
[2., 6.]],
[[3., 7.],
[3., 7.],
[3., 7.]],
[[4., 8.],
[4., 8.],
[4., 8.]]])
Which is almost what you want, except for the extra dimension. So now you can just reshape this and get your (12, 2) array the way you want:
In : x.T.reshape((-1, 2))
Out:
array([[1., 5.],
[1., 5.],
[1., 5.],
[2., 6.],
[2., 6.],
[2., 6.],
[3., 7.],
[3., 7.],
[3., 7.],
[4., 8.],
[4., 8.],
[4., 8.]])
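To see the difference side by side, the sketch below compares the direct reshape with the transpose-then-reshape version on the array from the question. The expected array is built with np.repeat purely for the comparison; it is not part of the original answers.

```python
import numpy as np

a = np.array([[[1., 2., 3., 4.],
               [1., 2., 3., 4.],
               [1., 2., 3., 4.]],
              [[5., 6., 7., 8.],
               [5., 6., 7., 8.],
               [5., 6., 7., 8.]]])  # shape (2, 3, 4)

# Reshaping directly walks the last axis first: [1, 2], [3, 4], ...
naive = a.reshape((-1, 2))

# Transposing first reverses the axes to (4, 3, 2), so the pairs
# (1, 5), (2, 6), ... end up along the fastest-varying axis.
res = a.T.reshape((-1, 2))

# Target layout from the question: each pair repeated three times.
expected = np.repeat([[1., 5.], [2., 6.], [3., 7.], [4., 8.]], 3, axis=0)

print(naive[:2])
print(np.array_equal(res, expected))
```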
I have a big numpy array and want to split it. I have read this solution but it could not help me. The target column can have several values, but I know which value I want to split on. In my simplified example the target column is the third one and I want to split it based on the value 2. This is my array:
import numpy as np
big_array = np.array([[0., 10., 2.],
[2., 6., 2.],
[3., 1., 7.1],
[3.3, 6., 7.8],
[4., 5., 2.],
[6., 6., 2.],
[7., 1., 2.],
[8., 5., 2.1]])
Consecutive rows that have this value (2.) form one split. The next rows (rows three and four), whose value is not 2., form another. The value 2. then appears again and forms a third split, and the remaining non-2. row (the last one) forms a fourth. The final result should look like this:
spl_array = [np.array([[0., 10., 2.],
[2., 6., 2.]]),
np.array([[3., 1., 7.1],
[3.3, 6., 7.8]]),
np.array([[4., 5., 2.],
[6., 6., 2.],
[7., 1., 2.]]),
np.array([[8., 5., 2.1]])]
In advance I do appreciate any help.
First, find which rows contain 2 and which do not. This gives an array of True and False values. Convert it to zeros and ones and check where consecutive values differ (for example, [0, 0, 1, 1, 0] gives the differences 0, 1, 0, -1).
Then use numpy where to find the indices of the nonzero differences.
Insert index 0 at the front and the length of the big array at the end, so that you can zip them into left and right slice bounds.
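The difference step can be tried on its own with the small flag list from the example. This is a minimal sketch; flags, d and starts are illustrative names.

```python
import numpy as np

flags = np.array([0, 0, 1, 1, 0])

# Difference between each element and its predecessor
d = np.diff(flags)

# A nonzero difference at position i means a new block starts at i + 1
starts = np.where(d != 0)[0] + 1

print(d)
print(starts)
```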
import numpy as np
big_array = np.array([[0., 10., 2.],
[2., 6., 2.],
[3., 1., 7.1],
[3.3, 6., 7.8],
[4., 5., 2.],
[6., 6., 2.],
[7., 1., 2.],
[8., 5., 2.1]])
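# True for every row that contains the value 2 (in any column)
idx = [2 in array for array in big_array]
# Multiplying by an array of ones turns the list of booleans into an array of 0. and 1.
idx *= np.ones(len(idx))
# A nonzero difference marks the row where a new block starts
slices = list(np.where(np.diff(idx) != 0)[0] + 1)
slices.insert(0,0)
slices.append(len(big_array))
result = list()
for left, right in zip(slices[:-1], slices[1:]):
    result.append(big_array[left:right])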
'''
[array([[ 0., 10., 2.],
[ 2., 6., 2.]]),
array([[3. , 1. , 7.1],
[3.3, 6. , 7.8]]),
array([[4., 5., 2.],
[6., 6., 2.],
[7., 1., 2.]]),
array([[8. , 5. , 2.1]])]
'''
You can do this with numpy
np.split(
big_array,
np.flatnonzero(np.diff(big_array[:,2] == 2) != 0) + 1
)
Output
[array([[ 0., 10., 2.],
[ 2., 6., 2.]]),
array([[3. , 1. , 7.1],
[3.3, 6. , 7.8]]),
array([[4., 5., 2.],
[6., 6., 2.],
[7., 1., 2.]]),
array([[8. , 5. , 2.1]])]
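The one-liner can be unpacked step by step, which makes it easier to check each intermediate result. This is a sketch under the assumption that only the third column matters; mask, changes, cut_points and parts are illustrative names. Note that == 2 is an exact floating-point comparison, so for computed values something like np.isclose may be safer (that depends on your data).

```python
import numpy as np

big_array = np.array([[0., 10., 2.],
                      [2., 6., 2.],
                      [3., 1., 7.1],
                      [3.3, 6., 7.8],
                      [4., 5., 2.],
                      [6., 6., 2.],
                      [7., 1., 2.],
                      [8., 5., 2.1]])

# True where the third column equals 2
mask = big_array[:, 2] == 2

# For a boolean array, np.diff is True wherever neighbours differ
changes = np.diff(mask)

# Row indices at which a new block starts
cut_points = np.flatnonzero(changes) + 1

# np.split cuts the array along axis 0 at those indices
parts = np.split(big_array, cut_points)

print(cut_points)
print([len(p) for p in parts])
```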
I have the following code which outputs 2 arrays in a list:
arr1 = np.array([[1.,2,3], [4,5,6], [7,8,9]])
arr_split = np.array_split(arr1,
indices_or_sections = 2,
axis = 0)
arr_split
Output:
[array([[1., 2., 3.],
[4., 5., 6.]]), array([[7., 8., 9.]])]
How do I cast these 2 arrays into PyTorch tensors and put them into a list using for (or while) loops, so that they look like this:
[tensor([[1., 2., 3.],
[4., 5., 6.]], dtype=torch.float64),
tensor([[7., 8., 9.]], dtype=torch.float64)]
Many thanks in advance!
It is better to convert the array to a tensor first and then use torch.Tensor.split:
import numpy as np
import torch
arr1 = np.array([[1.,2,3], [4,5,6], [7,8,9]])
t_arr1 = torch.from_numpy(arr1)
t_arr1.split(split_size=2)
(tensor([[1., 2., 3.],
[4., 5., 6.]], dtype=torch.float64),
tensor([[7., 8., 9.]], dtype=torch.float64))
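If you prefer to keep np.array_split and convert afterwards with a loop, as the question asks, something like the following should work. It is a minimal sketch; tensors is an illustrative name, and torch.from_numpy shares memory with the NumPy arrays rather than copying them.

```python
import numpy as np
import torch

arr1 = np.array([[1., 2, 3], [4, 5, 6], [7, 8, 9]])
arr_split = np.array_split(arr1, indices_or_sections=2, axis=0)

tensors = []
for part in arr_split:
    # from_numpy keeps the float64 dtype and shares memory with `part`
    tensors.append(torch.from_numpy(part))

print(tensors)
```

The same thing can be written as a list comprehension: [torch.from_numpy(part) for part in arr_split].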
I'd like to return a 2D numpy.array with multiple rolls of a given 1D numpy.array.
>>> multiroll(np.arange(10), [-1, 0, 1, 2])
array([[1., 0., 9., 8.],
[2., 1., 0., 9.],
[3., 2., 1., 0.],
[4., 3., 2., 1.],
[5., 4., 3., 2.],
[6., 5., 4., 3.],
[7., 6., 5., 4.],
[8., 7., 6., 5.],
[9., 8., 7., 6.],
[0., 9., 8., 7.]])
Is there some combination of numpy.roll, numpy.tile, numpy.repeat, or other functions that does this?
Here's what I've tried
def multiroll(array, rolls):
    """Create multiple rolls of 1D vector"""
    m = len(array)
    n = len(rolls)
    shape = (m, n)
    a = np.empty(shape)
    for i, roll in enumerate(rolls):
        a[:,i] = np.roll(array, roll)
    return a
I'd expect there's a more "Numpythonic" way of doing this that doesn't use the loop.
Approach #1 : For elegance
Here's one way with broadcasting -
In [44]: a
Out[44]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [45]: rolls
Out[45]: array([-1, 0, 1, 2])
In [46]: a[(np.arange(len(a))[:,None]-rolls) % len(a)]
Out[46]:
array([[1, 0, 9, 8],
[2, 1, 0, 9],
[3, 2, 1, 0],
[4, 3, 2, 1],
[5, 4, 3, 2],
[6, 5, 4, 3],
[7, 6, 5, 4],
[8, 7, 6, 5],
[9, 8, 7, 6],
[0, 9, 8, 7]])
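Wrapped in a function and checked against the loop-based np.roll version, the broadcasting approach might look like this. It is a sketch; multiroll_broadcast is a hypothetical name, not something from the question.

```python
import numpy as np

def multiroll_broadcast(a, rolls):
    """Return a 2D array whose column j is np.roll(a, rolls[j])."""
    a = np.asarray(a)
    rolls = np.asarray(rolls)
    n = len(a)
    # idx[i, j] = (i - rolls[j]) mod n, built by broadcasting (n, 1) against (k,)
    idx = (np.arange(n)[:, None] - rolls) % n
    return a[idx]

a = np.arange(10)
rolls = [-1, 0, 1, 2]
out = multiroll_broadcast(a, rolls)

# Reference result: one np.roll per column
expected = np.stack([np.roll(a, r) for r in rolls], axis=1)
print(np.array_equal(out, expected))
```

Unlike the loop version with np.empty, this keeps the integer dtype of the input.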
Approach #2 : For memory/perf-efficiency
Idea mostly borrowed from - this post.
We can leverage scikit-image's view_as_windows, which is based on np.lib.stride_tricks.as_strided, to get sliding windows. More info on use of as_strided based view_as_windows.
import numpy as np
from skimage.util.shape import view_as_windows
def multiroll_stridedview(a, r):
    r = np.asarray(r)
    # Concatenate with sliced to cover all rolls
    a_ext = np.concatenate((a,a[:-1]))
    # Get sliding windows; use advanced-indexing to select appropriate ones
    n = len(a)
    return view_as_windows(a_ext,n)[:,(n-r)%n]
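If scikit-image is not available, the same idea can be sketched with NumPy's own sliding_window_view (this assumes NumPy 1.20 or newer). It swaps in a different windowing helper from the one used above; multiroll_windows is a hypothetical name.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def multiroll_windows(a, r):
    a = np.asarray(a)
    r = np.asarray(r)
    n = len(a)
    # Extend the array so that every cyclic window of length n exists
    a_ext = np.concatenate((a, a[:-1]))
    # windows[i, j] == a_ext[i + j]; this is a read-only view, no data is copied
    windows = sliding_window_view(a_ext, n)
    # Picking column (n - r) % n gives a[(i - r) % n], i.e. np.roll(a, r)
    return windows[:, (n - r) % n]

a = np.arange(10)
rolls = [-1, 0, 1, 2]
out = multiroll_windows(a, rolls)
expected = np.stack([np.roll(a, s) for s in rolls], axis=1)
print(np.array_equal(out, expected))
```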
Approach #3 : For mathematical beauty (and efficiency ?)
Using an FFT kernel in the frequency domain, you can process a whole matrix at once. This method only works with integers.
# After the transpose, each row of A is the vector 0..9 to be rolled
A = np.array([[0., 0., 0., 0.],
[1., 1., 1., 1.],
[2., 2., 2., 2.],
[3., 3., 3., 3.],
[4., 4., 4., 4.],
[5., 5., 5., 5.],
[6., 6., 6., 6.],
[7., 7., 7., 7.],
[8., 8., 8., 8.],
[9., 9., 9., 9.]]).transpose()
m,n = A.shape
#shift vector
v=[-1,0,1,2]
#transformation kernel (shift theorem)
fftkernel = np.exp(-2*1j*np.pi/n*np.outer(v,np.arange(0,n)))
#Apply the shift
res=np.round(np.fft.ifft(np.fft.fft(A,axis = 1) * fftkernel ,axis = 1)).real.transpose()
We get:
array([[1., 0., 9., 8.],
[2., 1., 0., 9.],
[3., 2., 1., 0.],
[4., 3., 2., 1.],
[5., 4., 3., 2.],
[6., 5., 4., 3.],
[7., 6., 5., 4.],
[8., 7., 6., 5.],
[9., 8., 7., 6.],
[0., 9., 8., 7.]])
You can get more information about how this code works here.
For a left circular shift, use the same kernel without the minus sign:
fftkernel = np.exp(2*1j*np.pi/n*np.outer(v,np.arange(0,n)))
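The shift-theorem idea can be checked against np.roll on a single 1D vector. This is a minimal sketch for integer-valued data and integer shifts; x, shifts, kernel and rolled are illustrative names.

```python
import numpy as np

x = np.arange(10.0)
shifts = np.array([-1, 0, 1, 2])
n = len(x)

# Shift theorem: multiplying the spectrum by exp(-2j*pi*k*s/n)
# delays the signal by s samples, which matches np.roll(x, s).
kernel = np.exp(-2j * np.pi / n * np.outer(shifts, np.arange(n)))  # shape (4, n)

# fft(x) has shape (n,) and broadcasts against every row of the kernel
rolled = np.fft.ifft(np.fft.fft(x) * kernel, axis=1).real
rolled = np.round(rolled)  # remove floating-point noise

expected = np.array([np.roll(x, s) for s in shifts])
print(np.allclose(rolled, expected))
print(rolled.T)
```

Because of the rounding step, this only reproduces np.roll exactly when the data are integer-valued.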
I have a 3d numpy array of following form:
array([[[ 1., 5., 4.],
[ 1., 5., 4.],
[ 1., 2., 4.]],
[[ 3., 6., 4.],
[ 6., 6., 4.],
[ 6., 6., 4.]]])
Is there an efficient way to convert it to a 2d array of form:
array([[1, 1, 1, 5, 5, 2, 4, 4, 4],
[3, 6, 6, 6, 6, 6, 4, 4, 4]])
Thanks a lot!
In [54]: arr = np.array([[[ 1., 5., 4.],
[ 1., 5., 4.],
[ 1., 2., 4.]],
[[ 3., 6., 4.],
[ 6., 6., 4.],
[ 6., 6., 4.]]])
In [61]: arr.reshape((arr.shape[0], -1), order='F')
Out[61]:
array([[ 1., 1., 1., 5., 5., 2., 4., 4., 4.],
[ 3., 6., 6., 6., 6., 6., 4., 4., 4.]])
The array arr has shape (2, 3, 3). We wish to keep the first axis of length 2, and flatten the two axes of length 3.
If we call arr.reshape(h, w) then NumPy will attempt to reshape arr to shape (h, w). If we call arr.reshape(h, -1) then NumPy will replace the -1 with whatever integer is needed for the reshape to make sense -- in this case, arr.size/h.
Hence,
In [63]: arr.reshape((arr.shape[0], -1))
Out[63]:
array([[ 1., 5., 4., 1., 5., 4., 1., 2., 4.],
[ 3., 6., 4., 6., 6., 4., 6., 6., 4.]])
This is almost what we want, but notice that the values in each subarray, such as
[[ 1., 5., 4.],
[ 1., 5., 4.],
[ 1., 2., 4.]]
are being traversed by marching from left to right before going down to the next row.
We want to march down the rows before going on to the next column.
To achieve that, use order='F'.
Usually the elements in a NumPy array are visited in C-order -- where the last index moves fastest. If we visit the elements in F-order then the first index moves fastest. Since in a 2D array of shape (h, w), the first axis is associated with the rows and the last axis the columns, traversing the array in F-order marches down the rows of one column before moving on to the next column.
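The difference between the two orders is easy to see on a small 2D array. The sketch below also shows an equivalent formulation that swaps the last two axes and then reshapes in the default C-order; that alternative is my own addition, not part of the answer above.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

c_order = m.ravel(order='C')  # last index fastest: row by row
f_order = m.ravel(order='F')  # first index fastest: column by column
print(c_order)
print(f_order)

arr = np.array([[[1., 5., 4.],
                 [1., 5., 4.],
                 [1., 2., 4.]],
                [[3., 6., 4.],
                 [6., 6., 4.],
                 [6., 6., 4.]]])

via_f = arr.reshape((arr.shape[0], -1), order='F')

# Same result: swap the last two axes, then reshape in C-order
via_t = arr.transpose(0, 2, 1).reshape((arr.shape[0], -1))
print(np.array_equal(via_f, via_t))
```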
I'm working with 3-dimensional arrays (for the purpose of this example you can imagine they represent the RGB values at X, Y coordinates of the screen).
>>> import numpy as np
>>> a = np.floor(10 * np.random.random((2, 2, 3)))
>>> a
array([[[ 7., 3., 1.],
[ 9., 6., 9.]],
[[ 4., 6., 8.],
[ 8., 1., 1.]]])
What I would like to do is set the G channel to an arbitrary value for those pixels whose G channel is already below 5. I can manage to isolate the pixels I am interested in using:
>>> a[np.where(a[:, :, 1] < 5)]
array([[ 7., 3., 1.],
[ 8., 1., 1.]])
but I am struggling to understand how to assign a new value to the G channel only. I tried:
>>> a[np.where(a[:, :, 1] < 5)][1] = 9
>>> a
array([[[ 7., 3., 1.],
[ 9., 6., 9.]],
[[ 4., 6., 8.],
[ 8., 1., 1.]]])
...but it seems not to produce any effect. I also tried:
>>> a[np.where(a[:, :, 1] < 5), 1] = 9
>>> a
array([[[ 7., 3., 1.],
[ 9., 9., 9.]],
[[ 4., 6., 8.],
[ 9., 9., 9.]]])
...(failing to understand what is happening). Finally I tried:
>>> a[np.where(a[:, :, 1] < 5)][:, 1] = 9
>>> a
array([[[ 7., 3., 1.],
[ 9., 6., 9.]],
[[ 4., 6., 8.],
[ 8., 1., 1.]]])
I suspect I am missing something fundamental on how NumPy works (this is the first time I use the library). I would appreciate some help in how to achieve what I want as well as some explanation on what happened with my previous attempts.
Many thanks in advance for your help and expertise!
EDIT: The outcome I would like to get is:
>>> a
array([[[ 7., 9., 1.], # changed the second number here
[ 9., 6., 9.]],
[[ 4., 6., 8.],
[ 8., 9., 1.]]]) # changed the second number here
>>> import numpy as np
>>> a = np.array([[[ 7., 3., 1.],
... [ 9., 6., 9.]],
...
... [[ 4., 6., 8.],
... [ 8., 1., 1.]]])
>>> a
array([[[ 7., 3., 1.],
[ 9., 6., 9.]],
[[ 4., 6., 8.],
[ 8., 1., 1.]]])
>>> a[:,:,1][a[:,:,1] < 5 ] = 9
>>> a
array([[[ 7., 9., 1.],
[ 9., 6., 9.]],
[[ 4., 6., 8.],
[ 8., 9., 1.]]])
a[:,:,1] gives you the G channel. I indexed it with the boolean array a[:,:,1] < 5 and then assigned the value 9 to the selected elements.
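As for why the earlier attempts had no effect: as far as I can tell, indexing with the result of np.where (or with a boolean array) is advanced indexing, which returns a copy, so a second [...] = 9 assigns into that temporary copy. a[:,:,1] is a basic slice, which returns a view, so the assignment reaches the original array. A minimal sketch using the array from the question (mask, b, c and d are illustrative names):

```python
import numpy as np

a = np.array([[[7., 3., 1.],
               [9., 6., 9.]],
              [[4., 6., 8.],
               [8., 1., 1.]]])
mask = a[:, :, 1] < 5  # boolean array of shape (2, 2)

# Chained indexing: b[mask] is a copy, so the assignment is lost
b = a.copy()
b[mask][:, 1] = 9
print(np.array_equal(b, a))

# Single indexing operation with the index arrays from np.where
c = a.copy()
rows, cols = np.where(mask)
c[rows, cols, 1] = 9

# Single indexing operation with the boolean mask plus the channel index
d = a.copy()
d[mask, 1] = 9
print(d)
```

Both c and d end up with only the G channel of the selected pixels changed.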
There is no need to use where; you can directly index an array with the boolean array resulting from your comparison operator.
a = np.array([[[ 7., 3., 1.],
[ 9., 6., 9.]],
[[ 4., 6., 8.],
[ 8., 1., 1.]]])
>>> a[a[:, :, 1] < 5]
array([[ 7., 3., 1.],
[ 8., 1., 1.]])
>>> a[a[:, :, 1] < 5]=9
>>> a
array([[[ 9., 9., 9.],
[ 9., 6., 9.]],
[[ 4., 6., 8.],
[ 9., 9., 9.]]])
You do not list the expected output in your question, so I am not sure this is what you want.