I have a 3D NumPy array that I need to reshape and rearrange. For example:
x=np.array([np.array([np.array([1,0,1]),np.array([1,1,1]),np.array([0,1,0]),np.array([1,1,0])]),np.array([np.array([0,0,1]),np.array([0,0,0]),np.array([0,1,1]),np.array([1,0,0])]),np.array([np.array([1,0,0]),np.array([1,0,1]),np.array([1,1,1]),np.array([0,0,0])])])
It has shape (3,4,3), and printing it gives:
array([[[1, 0, 1],
[1, 1, 1],
[0, 1, 0],
[1, 1, 0]],
[[0, 0, 1],
[0, 0, 0],
[0, 1, 1],
[1, 0, 0]],
[[1, 0, 0],
[1, 0, 1],
[1, 1, 1],
[0, 0, 0]]])
Now I need to rearrange this array into shape (4,3,3) by taking the row at the same index from each subarray and grouping those rows together, to end up with something like this:
array([[[1,0,1],[0,0,1],[1,0,0]],
[[1,1,1],[0,0,0],[1,0,1]],
[[0,1,0],[0,1,1],[1,1,1]],
[[1,1,0],[1,0,0],[0,0,0]]])
I tried reshape and all kinds of stacking, but nothing arranged the array the way I need. I know I can do it manually, but for large arrays that isn't an option.
Any help will be much appreciated.
Thanks
swapaxes will do what you want. That is, if your input array is x and your desired output is y, then
np.all(y==np.swapaxes(x, 1, 0))
should give True.
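As a quick check of the claim above, here is a minimal sketch that builds the question's x and the desired y as plain nested lists (the names swapped and reshaped are only illustrative) and compares them. It also shows why reshape alone does not help: reshape keeps the elements in their original memory order, whereas swapaxes exchanges the roles of two axes.

```python
import numpy as np

# The (3, 4, 3) input from the question, written as a plain nested list.
x = np.array([
    [[1, 0, 1], [1, 1, 1], [0, 1, 0], [1, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [0, 1, 1], [1, 0, 0]],
    [[1, 0, 0], [1, 0, 1], [1, 1, 1], [0, 0, 0]],
])

# The (4, 3, 3) result the question asks for.
y = np.array([
    [[1, 0, 1], [0, 0, 1], [1, 0, 0]],
    [[1, 1, 1], [0, 0, 0], [1, 0, 1]],
    [[0, 1, 0], [0, 1, 1], [1, 1, 1]],
    [[1, 1, 0], [1, 0, 0], [0, 0, 0]],
])

# Exchange axes 0 and 1, so that swapped[i, j] == x[j, i].
swapped = np.swapaxes(x, 0, 1)
print(swapped.shape)               # (4, 3, 3)
print(np.array_equal(swapped, y))  # True

# reshape only regroups the elements in their existing order,
# so it produces the right shape but the wrong arrangement.
reshaped = x.reshape(4, 3, 3)
print(np.array_equal(reshaped, y))  # False
```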
For higher dimensional arrays, transpose will accept a tuple of axis numbers to permute the axes:
import numpy as np
foo = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]])
foo.transpose(1, 0, 2)
result:
array([[[ 1, 2],
[ 5, 6],
[ 9, 10]],
[[ 3, 4],
[ 7, 8],
[11, 12]]])
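To make the axis tuple concrete, the sketch below checks that transpose(1, 0, 2) matches swapaxes(0, 1) on foo, and then permutes a 4D array. The array bar and the chosen permutation are made up for illustration only.

```python
import numpy as np

foo = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]])

# Swapping the first two axes can be written either way.
via_transpose = foo.transpose(1, 0, 2)
via_swapaxes = np.swapaxes(foo, 0, 1)
print(np.array_equal(via_transpose, via_swapaxes))  # True

# Hypothetical 4D example: entry n of the tuple names the old axis
# that becomes new axis n.
bar = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
permuted = bar.transpose(2, 0, 3, 1)
print(permuted.shape)  # (4, 2, 5, 3)

# Element check: permuted[k, i, l, j] is bar[i, j, k, l].
print(permuted[3, 1, 4, 2] == bar[1, 2, 3, 4])  # True
```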
Hope you are doing well.
I have an extremely big NumPy array and want to split it into several smaller ones. My array has three columns, and I want to split it wherever all the columns reach their maximum values:
array = [[0, 0, 0],
[0, 0, 5],
[10, 5, 10],
[1, 1, 1],
[5, 5, 15],
[10, 8, 20],
[2, 0, 0],
[10, 10, 12],
[1, 2, 0],
[2, 5, 9]]
Now, I want to split it into four arrays:
sub_array_1=[[0, 0, 0],
[0, 0, 5],
[10, 5, 10]]
sub_array_2=[[1, 1, 1],
[5, 5, 15],
[10, 8, 20]]
sub_array_3=[[2, 0, 0],
[10, 10, 12]]
sub_array_4=[[1, 2, 0],
[2, 5, 9]]
I tried to do it in a for loop with if statements that give me an array whenever each element of a row is bigger than the corresponding elements in both the row above and the row below. I also need to figure out how to handle the last row:
import numpy as np
sub_array_1=np.array([])
for i in array:
    if array[i,:]>array[i+1,:] and array[i,:]>array[i+1,:]:
        vert_1=np.append(sub_array_1,array[0:i,:])
My code doesn't work, but it simply shows my idea.
I am quite new to Python and could not find a way to write my idea as code. So, I appreciate any help and contribution.
Cheers,
Ali
IIUC, one way using numpy.diff with numpy.array_split:
indices = np.argwhere(np.all(np.diff(array, axis=0) < 0, axis=1))
np.array_split(array, indices.ravel()+1, axis=0)
Output:
[array([[ 0, 0, 0],
[ 0, 0, 5],
[10, 5, 10]]),
array([[ 1, 1, 1],
[ 5, 5, 15],
[10, 8, 20]]),
array([[ 2, 0, 0],
[10, 10, 12]]),
array([[1, 2, 0],
[2, 5, 9]])]
np.diff and np.all find each row where every element is larger than the corresponding element in the next row, i.e. where the row-to-row difference is negative in all columns (where a peak ends).
np.array_split then splits the given array just after each of those rows.
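The two steps described above can be wrapped in a small helper. This is only a sketch: the name split_at_peaks is made up, and np.flatnonzero is used in place of np.argwhere(...).ravel(), which gives the same indices here. It assumes a peak counts only when every column drops strictly on the next row.

```python
import numpy as np

def split_at_peaks(arr):
    """Split a 2D array after every row that is followed by a drop in all columns."""
    arr = np.asarray(arr)
    # drops[i] is True when row i + 1 is smaller than row i in every column.
    drops = np.all(np.diff(arr, axis=0) < 0, axis=1)
    # A new piece starts on the row right after each drop.
    cut_points = np.flatnonzero(drops) + 1
    return np.array_split(arr, cut_points, axis=0)

data = [[0, 0, 0], [0, 0, 5], [10, 5, 10],
        [1, 1, 1], [5, 5, 15], [10, 8, 20],
        [2, 0, 0], [10, 10, 12],
        [1, 2, 0], [2, 5, 9]]

parts = split_at_peaks(data)
for part in parts:
    print(part.tolist())
# [[0, 0, 0], [0, 0, 5], [10, 5, 10]]
# [[1, 1, 1], [5, 5, 15], [10, 8, 20]]
# [[2, 0, 0], [10, 10, 12]]
# [[1, 2, 0], [2, 5, 9]]
```

If a split should also happen when only some columns drop, the np.all would need to be replaced by a different condition; that depends on how a peak is defined for the real data.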
For example, I have the 3D array below:
[[[1,2,3],
[4,5,6],
[7,8,9]],
[[1,3,5],
[2,4,6],
[5,7,9]],
[[1,4,6],
[2,4,7],
[5,8,9]]
]
My first question is how I can turn each matrix along the first axis into a strictly upper triangular matrix, i.e.
[[[0,2,3],
[0,0,6],
[0,0,0]],
[[0,3,5],
[0,0,6],
[0,0,0]],
[[0,4,6],
[0,0,7],
[0,0,0]]
]
Based on this, how can I then transpose each of them, like
[[[0,0,0],
[2,0,0],
[3,6,0]],
[[0,0,0],
[3,0,0],
[5,6,0]],
[[0,0,0],
[4,0,0],
[6,7,0]]
]
Use np.triu with k=1 to zero the diagonal and everything below it, then swap the last two axes to transpose each matrix -
In [10]: np.triu(a,1).swapaxes(1,2)
Out[10]:
array([[[0, 0, 0],
[2, 0, 0],
[3, 6, 0]],
[[0, 0, 0],
[3, 0, 0],
[5, 6, 0]],
[[0, 0, 0],
[4, 0, 0],
[6, 7, 0]]])
Swapping can also be achieved with ndarray.transpose(0,2,1).
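For reference, here is a self-contained sketch of the same computation on the array from the question, comparing the swapaxes and transpose spellings. The third variant (transposing first and then taking the strict lower triangle with np.tril) is an equivalent route added here for illustration, not something from the answer. None of these modify a.

```python
import numpy as np

a = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
              [[1, 3, 5], [2, 4, 6], [5, 7, 9]],
              [[1, 4, 6], [2, 4, 7], [5, 8, 9]]])

# np.triu works on the last two axes, so every 3x3 matrix is handled at once;
# k=1 also zeroes the main diagonal.
upper = np.triu(a, 1)

via_swapaxes = upper.swapaxes(1, 2)
via_transpose = upper.transpose(0, 2, 1)
# Equivalent route: transpose each matrix first, then keep the strict lower triangle.
via_tril = np.tril(a.transpose(0, 2, 1), -1)

print(via_swapaxes[0])
# [[0 0 0]
#  [2 0 0]
#  [3 6 0]]
print(np.array_equal(via_swapaxes, via_transpose))  # True
print(np.array_equal(via_swapaxes, via_tril))       # True
```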
You can do both your tasks in one go (a single loop). Note that this overwrites a in place:
for i in range(a.shape[0]):
    a[i,...] = np.triu(a[i,...], k=1).T
The result is:
array([[[0, 0, 0],
[2, 0, 0],
[3, 6, 0]],
[[0, 0, 0],
[3, 0, 0],
[5, 6, 0]],
[[0, 0, 0],
[4, 0, 0],
[6, 7, 0]]])
I need to remove the last arrays from a 3D numpy cube. I have:
a = np.array(
[[[1,2,3],
[4,5,6],
[7,8,9]],
[[9,8,7],
[6,5,4],
[3,2,1]],
[[0,0,0],
[0,0,0],
[0,0,0]],
[[0,0,0],
[0,0,0],
[0,0,0]]])
How do I remove the arrays with zero sub-arrays like at the bottom side of the cube, using np.delete?
(I cannot simply remove all zero values, because there will be zeros in the data on the top side)
For a 3D cube, you might check all against the last two axes
a = np.asarray(a)
a[~(a==0).all((2,1))]
array([[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]]])
Here's one way to remove only the trailing all-zero slices, since the question says all-zero slices on the top side of the data should be kept -
a[:-(a==0).all((1,2))[::-1].argmin()]
Sample run -
In [80]: a
Out[80]:
array([[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]],
[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]],
[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]]])
In [81]: a[:-(a==0).all((1,2))[::-1].argmin()]
Out[81]:
array([[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]]])
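One caveat with the one-liner above: when the cube has no trailing all-zero slice, argmin() returns 0 and a[:-0] is the same as a[:0], which is empty. The sketch below is one possible guard against that case; the helper name trim_trailing_zero_slices is made up for illustration.

```python
import numpy as np

def trim_trailing_zero_slices(cube):
    """Drop all-zero slices from the end of axis 0, keeping any others."""
    cube = np.asarray(cube)
    # nonzero[i] is True when slice i contains at least one non-zero value.
    nonzero = cube.any(axis=(1, 2))
    if not nonzero.any():
        # Every slice is zero, so nothing is kept.
        return cube[:0]
    last = np.flatnonzero(nonzero)[-1]
    return cube[: last + 1]

a = np.array([[[0, 0, 0], [0, 0, 0], [0, 0, 0]],
              [[9, 8, 7], [6, 5, 4], [3, 2, 1]],
              [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
              [[0, 0, 0], [0, 0, 0], [0, 0, 0]]])

trimmed = trim_trailing_zero_slices(a)
print(trimmed.shape)  # (2, 3, 3): the leading zero slice is kept

# A cube without trailing zero slices comes back unchanged...
full = np.ones((2, 3, 3), dtype=int)
print(trim_trailing_zero_slices(full).shape)  # (2, 3, 3)

# ...whereas the one-liner returns an empty array for it.
one_liner = full[:-(full == 0).all((1, 2))[::-1].argmin()]
print(one_liner.shape)  # (0, 3, 3)
```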
If you know where they are already, the easiest thing to do is slice them off:
a[:-2]
Results in:
array([[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]]])
Hope this helps. Note that this removes every all-zero slice, not only the trailing ones:
a_new=[] #Create an empty list
for item in a:
    if not (np.count_nonzero(item) == 0): #check whether the inner matrix is all zeros
        a_new.append(item) #append the non-zero inner matrix to the list
a_new=np.array(a_new) #build a NumPy array from the slices that were kept
Output:
array([[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[9, 8, 7],
[6, 5, 4],
[3, 2, 1]]])
Use any and select :)
a=np.array([[[1,2,3],
[4,5,6],
[7,8,9]],
[[9,8,7],
[6,5,4],
[3,2,1]],
[[0,0,0],
[0,0,0],
[0,0,0]],
[[0,0,0],
[0,0,0],
[0,0,0]]])
a[a.any(axis=2).any(axis=1)]
Given two arrays, an input array and a repeat array, I would like to get an array in which each input element is repeated along a new dimension the specified number of times, with each row padded with zeros up to the longest row.
to_repeat = np.array([1, 2, 3, 4, 5, 6])
repeats = np.array([1, 2, 2, 3, 3, 1])
# I want final array to look like the following:
#[[1, 0, 0],
# [2, 2, 0],
# [3, 3, 0],
# [4, 4, 4],
# [5, 5, 5],
# [6, 0, 0]]
The issue is that I'm operating with large datasets (10M or so) so a list comprehension is too slow - what is a fast way to achieve this?
Here's one with masking based on this idea -
m = repeats[:,None] > np.arange(repeats.max())
out = np.zeros(m.shape,dtype=to_repeat.dtype)
out[m] = np.repeat(to_repeat,repeats)
Sample output -
In [44]: out
Out[44]:
array([[1, 0, 0],
[2, 2, 0],
[3, 3, 0],
[4, 4, 4],
[5, 5, 5],
[6, 0, 0]])
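To see why the masking approach works, here is a runnable walkthrough that prints the intermediate mask. The final np.where variant with a custom padding value (the name fill is an assumption for illustration) is an extra not taken from the answer.

```python
import numpy as np

to_repeat = np.array([1, 2, 3, 4, 5, 6])
repeats = np.array([1, 2, 2, 3, 3, 1])

# Row i of the mask is True in its first repeats[i] columns.
mask = repeats[:, None] > np.arange(repeats.max())
print(mask.astype(int))
# [[1 0 0]
#  [1 1 0]
#  [1 1 0]
#  [1 1 1]
#  [1 1 1]
#  [1 0 0]]

# Boolean assignment fills the True cells in row-major order, which is
# exactly the order in which np.repeat emits the repeated values.
out = np.zeros(mask.shape, dtype=to_repeat.dtype)
out[mask] = np.repeat(to_repeat, repeats)
print(out.tolist())
# [[1, 0, 0], [2, 2, 0], [3, 3, 0], [4, 4, 4], [5, 5, 5], [6, 0, 0]]

# Variant with a padding value other than zero (hypothetical `fill`).
fill = -1
padded = np.where(mask, to_repeat[:, None], fill)
print(padded[0].tolist())  # [1, -1, -1]
```

The result has len(to_repeat) rows and repeats.max() columns, so a single very large repeat count makes the whole output wide.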
Or with broadcasted-multiplication -
In [67]: m*to_repeat[:,None]
Out[67]:
array([[1, 0, 0],
[2, 2, 0],
[3, 3, 0],
[4, 4, 4],
[5, 5, 5],
[6, 0, 0]])
For large datasets, the numexpr module can evaluate that broadcasted multiplication across multiple cores and with less temporary memory -
In [64]: import numexpr as ne
# Re-using mask `m` from previous method
In [65]: ne.evaluate('m*R',{'m':m,'R':to_repeat[:,None]})
Out[65]:
array([[1, 0, 0],
[2, 2, 0],
[3, 3, 0],
[4, 4, 4],
[5, 5, 5],
[6, 0, 0]])
I am using numpy to tally a lot of values across many large arrays, and keep track of which positions the maximum values appear in.
In particular, imagine I have a 'counts' array:
data = numpy.array([[ 5, 10, 3],
[ 6, 9, 12],
[13, 3, 9],
[ 9, 3, 1],
...
])
counts = numpy.zeros(data.shape, dtype=int)
data is going to change a lot, but I want 'counts' to reflect the number of times the max has appeared in each position:
max_value_indices = numpy.argmax(data, axis=1)
# this is now [1, 2, 0, 0, ...] representing the positions of 10, 12, 13 and 9, respectively.
From what I understand of broadcasting in numpy, I should be able to say:
counts[max_value_indices] += 1
What I expect is the array to be updated:
[[0, 1, 0],
[0, 0, 1],
[1, 0, 0],
[1, 0, 0],
...
]
But instead this increments ALL the values in counts giving me:
[[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
...
]
I also thought that perhaps if I transformed max_value_indices to a 100x1 array, it might work:
counts[max_value_indices[:,numpy.newaxis]] += 1
but this has the effect of updating just the rows in positions 0, 1, and 2:
[[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[0, 0, 0],
...
]
I'm also happy to turn the indices array into an array of 0's and 1's, and then add it to the counts array each time, but I'm not sure how to construct that.
You could use so-called advanced integer indexing (aka Multidimensional list-of-locations indexing):
In [24]: counts[np.arange(data.shape[0]),
np.argmax(data, axis=1)] += 1
In [25]: counts
Out[25]:
array([[0, 1, 0],
[0, 0, 1],
[1, 0, 0],
[1, 0, 0]])
The first array, np.arange(data.shape[0]) specifies the row. The second array, np.argmax(data, axis=1) specifies the column.
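Here is a self-contained sketch on the four rows shown in the question. It contrasts the row-only index from the question with the paired row and column index, and also builds the array of 0's and 1's that the question mentions as an alternative. The names wrong and one_hot are made up for illustration.

```python
import numpy as np

data = np.array([[5, 10, 3],
                 [6, 9, 12],
                 [13, 3, 9],
                 [9, 3, 1]])

rows = np.arange(data.shape[0])
cols = np.argmax(data, axis=1)
print(cols.tolist())  # [1, 2, 0, 0]

# A single index array selects whole ROWS: here rows 1, 2, 0 and 0 again.
# On this four-row sample, rows 0-2 are incremented and row 3 is untouched.
wrong = np.zeros(data.shape, dtype=int)
wrong[cols] += 1
print(wrong.tolist())  # [[1, 1, 1], [1, 1, 1], [1, 1, 1], [0, 0, 0]]

# Pairing a row index with each column index picks one cell per row.
counts = np.zeros(data.shape, dtype=int)
counts[rows, cols] += 1
print(counts.tolist())  # [[0, 1, 0], [0, 0, 1], [1, 0, 0], [1, 0, 0]]

# Alternative: build the 0/1 array explicitly and add it to the running counts.
one_hot = (cols[:, None] == np.arange(data.shape[1])).astype(int)
counts += one_hot
print(counts.tolist())  # [[0, 2, 0], [0, 0, 2], [2, 0, 0], [2, 0, 0]]
```

Because each (row, column) pair appears only once here, += is safe. If the same cell could appear several times in one update, np.add.at(counts, (rows, cols), 1) accumulates every occurrence.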