I would like to reduce a NumPy matrix using the vector u and the numpy.compress() method, first going across the rows and then the columns. My code currently looks like this:
import numpy as np

n = 4  # number of rows/columns
square_matrix = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
u = np.array([1,0,1,0])
v = []
for i in range(n):
    v.append(np.compress(u, square_matrix[i]))
print(v)
I get the following output:
[array([1, 3]), array([5, 7]), array([ 9, 11]), array([13, 15])]
I have two questions:
How can I now create a matrix from the output again?
How could I repeat the same process for the columns? My initial idea was to use a transpose of u, something like this:
for j in range(len(v_matrix[0]) - 1):
    w.append(np.compress(u.transpose(), v_matrix[:][j]))
compress, like many numpy reduction functions, takes an axis parameter:
In [166]: np.compress(u,square_matrix, axis=1)
Out[166]:
array([[ 1, 3],
[ 5, 7],
[ 9, 11],
[13, 15]])
In [167]: np.compress(u,square_matrix, axis=0)
Out[167]:
array([[ 1, 2, 3, 4],
[ 9, 10, 11, 12]])
Applying the compression sequentially:
In [168]: np.compress(u,np.compress(u,square_matrix, axis=0),axis=1)
Out[168]:
array([[ 1, 3],
[ 9, 11]])
I didn't realize np.compress existed, though judging from the source file it must have been there from the start. Boolean indexing does the same thing, and is more common.
In [169]: bu = u.astype(bool)
In [170]: square_matrix[bu,:]
Out[170]:
array([[ 1, 2, 3, 4],
[ 9, 10, 11, 12]])
In [171]: square_matrix[:,bu]
Out[171]:
array([[ 1, 3],
[ 5, 7],
[ 9, 11],
[13, 15]])
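The two boolean indexes can also be chained, selecting the rows first and then the columns from that result; a small sketch, equivalent to the ix_ form shown further down:

# u and square_matrix as defined in the question
bu = u.astype(bool)
square_matrix[bu][:, bu]
# array([[ 1,  3],
#        [ 9, 11]])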
Boolean indexing is equivalent to indexing with the nonzero result:
In [177]: np.nonzero(u)
Out[177]: (array([0, 2]),)
In [178]: idx = np.nonzero(u)[0]
In [179]: square_matrix[idx,:]
Out[179]:
array([[ 1, 2, 3, 4],
[ 9, 10, 11, 12]])
And that can be applied to both dimensions at once with:
In [180]: square_matrix[idx[:,None],idx]
Out[180]:
array([[ 1, 3],
[ 9, 11]])
Without the reshaping (making the first index array a column), we just pick the paired elements at (0,0) and (2,2):
In [181]: square_matrix[idx,idx]
Out[181]: array([ 1, 11])
and using the ix_ utility:
In [185]: np.ix_(bu,bu)
Out[185]:
(array([[0],
[2]]),
array([[0, 2]]))
In [186]: square_matrix[np.ix_(bu,bu)]
Out[186]:
array([[ 1, 3],
[ 9, 11]])
How can I now create a matrix from the output again?
You can do the operation in a vectorized manner; just specify the axis keyword for np.compress:
np.compress(u, square_matrix, axis=1)
Output:
array([[ 1, 3],
[ 5, 7],
[ 9, 11],
[13, 15]])
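And if you already have the list v produced by your loop, it can be stacked straight back into a 2-D array; a minimal sketch using the names from the question:

v = [np.array([1, 3]), np.array([5, 7]), np.array([9, 11]), np.array([13, 15])]
v_matrix = np.array(v)   # stacks the 1-D arrays as rows; np.vstack(v) gives the same here
# array([[ 1,  3],
#        [ 5,  7],
#        [ 9, 11],
#        [13, 15]])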
How could I repeat the same process for the columns? (My initial idea was to use a transpose of u.)
Your suggestion is on the right track, but transpose the matrix instead of u; this switches the rows and columns.
np.compress(u, square_matrix.T, axis=1)
Output:
array([[ 1, 9],
[ 2, 10],
[ 3, 11],
[ 4, 12]])
I have been going over this issue with numpy for a while and can't figure out if there is an intuitive way of converting the array while maintaining the positions of the sub-arrays. The sizes of the array will change depending on the input, so doing it manually with concatenate is not an option, but I do have the dimensions.
a= np.array([[[0,1],[2,3]],[[4,5],[6,7]],[[8,9],[10,11]],[[12,13],[14,15]]])
Reshaping just flattens the array, like
[1,2,3,4]
[5,6,7,8]
etc.
I have also tried np.block, but besides setting the positions manually I have not had any success.
The result I would like to get in this case is (4,4):
[[ 0, 1, 4, 5],
[ 2, 3, 6, 7],
[ 8, 9,12,13],
[10,11,14,15]]
Does any of you smart people know if there is something in numpy that I could use to get this result?
Your original array holds 16 consecutive values; reshape them into a 4d array:
In [67]: x=np.arange(16).reshape(2,2,2,2)
In [68]: x
Out[68]:
array([[[[ 0, 1],
[ 2, 3]],
[[ 4, 5],
[ 6, 7]]],
[[[ 8, 9],
[10, 11]],
[[12, 13],
[14, 15]]]])
Reshape to (4,4) keeps that original order - see the 0,1,2,3...
In [69]: x.reshape(4,4)
Out[69]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
You want to swap the two middle axes:
In [70]: x.transpose(0,2,1,3)
Out[70]:
array([[[[ 0, 1],
[ 4, 5]],
[[ 2, 3],
[ 6, 7]]],
[[[ 8, 9],
[12, 13]],
[[10, 11],
[14, 15]]]])
which can then be reshaped to (4,4):
In [71]: x.transpose(0,2,1,3).reshape(4,4)
Out[71]:
array([[ 0, 1, 4, 5],
[ 2, 3, 6, 7],
[ 8, 9, 12, 13],
[10, 11, 14, 15]])
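Since the question says the sizes change with the input, here is the same transpose-and-reshape idea wrapped in a general helper; blocks_to_matrix, grid_rows and grid_cols are names made up for illustration, a sketch rather than a library function:

import numpy as np

def blocks_to_matrix(blocks, grid_rows, grid_cols):
    # blocks has shape (grid_rows*grid_cols, bh, bw), blocks listed row by row
    n_blocks, bh, bw = blocks.shape
    assert n_blocks == grid_rows * grid_cols
    # group the blocks into a grid, move the block-row axis next to the grid-row axis,
    # then collapse back to 2-D
    return (blocks.reshape(grid_rows, grid_cols, bh, bw)
                  .transpose(0, 2, 1, 3)
                  .reshape(grid_rows * bh, grid_cols * bw))

a = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]],
              [[8, 9], [10, 11]], [[12, 13], [14, 15]]])
blocks_to_matrix(a, 2, 2)
# array([[ 0,  1,  4,  5],
#        [ 2,  3,  6,  7],
#        [ 8,  9, 12, 13],
#        [10, 11, 14, 15]])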
Imagine I have an n x d NumPy array, e.g. a=np.array([[1,2,3],[4,5,6], [7,8,9], [10,11,12], [13,14,15]]),
so in this case n=5, d=3. Imagine also that I have some number c which is smaller than or equal to n, and what I want to calculate is the following:
Consider every column independently and calculate the sum of every c values; e.g. if c=2, the solution would be
solution=np.array([[1+4, 2+5, 3+6], [7+10,8+11,9+12]])
The last row is skipped because 5 mod 2 = 1, so we need to leave out one row at the end.
If c=1, the solution would be the original array, and if e.g. c=3 the solution would be
solution=np.array([[1+4+7, 2+5+8, 3+6+9]]), where the last two rows are omitted.
Now what would be the most elegant and efficient solution to do that? I have searched a lot online but could not find a similar problem.
Here's one way -
def sum_in_blocks(a, c):
    # Get extent of each col for summing
    l = c*(len(a)//c)
    # Reshape to 3D considering first l rows, and "cutting" after each c rows
    # Then sum along second axis
    return a[:l].reshape(-1,c,a.shape[1]).sum(1)
More info on second step - General idea for nd to nd transformation.
Sample runs -
In [79]: a
Out[79]:
array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12],
[13, 14, 15]])
In [80]: sum_in_blocks(a, c=1)
Out[80]:
array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12],
[13, 14, 15]])
In [81]: sum_in_blocks(a, c=2)
Out[81]:
array([[ 5, 7, 9],
[17, 19, 21]])
In [82]: sum_in_blocks(a, c=3)
Out[82]: array([[12, 15, 18]])
Explanation with given sample
In [84]: a
Out[84]:
array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12],
[13, 14, 15]])
In [85]: c = 2
In [87]: l = c*(len(a)//c) # = 4; Get extent of each col for summing
In [89]: a[:l] # hence not relevant rows are skipped
Out[89]:
array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]])
# Reshape to 3D "cutting" after every c=2 rows
In [90]: a[:l].reshape(-1,c,a.shape[1])
Out[90]:
array([[[ 1, 2, 3],
[ 4, 5, 6]],
[[ 7, 8, 9],
[10, 11, 12]]])
# Sum along axis=1 for final o/p
In [91]: a[:l].reshape(-1,c,a.shape[1]).sum(axis=1)
Out[91]:
array([[ 5, 7, 9],
[17, 19, 21]])
I have a numpy array of floats with shape (x,14) and I would like to add one more value to the end of each "row" (a different value for each row), so that the end result has shape (x,15).
We can assume that I already have those values in a list, so that part of the question is settled.
How can I do this with numpy functions?
Define a 2d array and a list:
In [73]: arr = np.arange(12).reshape(4,3)
In [74]: arr
Out[74]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
In [75]: alist = [10,11,12,13]
Note their shapes:
In [76]: arr.shape
Out[76]: (4, 3)
In [77]: np.array(alist).shape
Out[77]: (4,)
To join alist to arr it needs to have the same number of dimensions and the same number of 'rows'. We can do that by adding a dimension with the None idiom:
In [78]: np.array(alist)[:,None].shape
Out[78]: (4, 1)
Now we can concatenate on the 2nd axis:
In [79]: np.concatenate((arr, np.array(alist)[:,None]),axis=1)
Out[79]:
array([[ 0, 1, 2, 10],
[ 3, 4, 5, 11],
[ 6, 7, 8, 12],
[ 9, 10, 11, 13]])
column_stack does the same thing, taking care that each input is at least 2d (I'd suggest reading its code.) In the long run you should be familiar enough with dimensions and shapes to do this with plain concatenate.
In [81]: np.column_stack((arr, alist))
Out[81]:
array([[ 0, 1, 2, 10],
[ 3, 4, 5, 11],
[ 6, 7, 8, 12],
[ 9, 10, 11, 13]])
np.c_ also does this - but note the use of [] instead of (). It's a clever use of indexing notation, convenient, but potentially confusing.
np.c_[arr, alist]
np.r_['-1,2,0', arr, alist] # for more clever obscurity
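For the arr and alist defined above, both spellings produce the same (4, 4) result as concatenate and column_stack; a quick check, as a sketch:

np.c_[arr, alist]
# array([[ 0,  1,  2, 10],
#        [ 3,  4,  5, 11],
#        [ 6,  7,  8, 12],
#        [ 9, 10, 11, 13]])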
You can use the numpy.insert function (https://numpy.org/doc/stable/reference/generated/numpy.insert.html):
a = np.array([[1, 1], [2, 2], [3, 3]])
np.insert(a, 2, 0, axis=1)
Output:
array([[1, 1, 0],
[2, 2, 0],
[3, 3, 0]])
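Since the question asks for a different value per row, note that numpy.insert also accepts a sequence of values; a minimal sketch (the appended values here are made up):

import numpy as np

a = np.array([[1, 1], [2, 2], [3, 3]])

# insert one value per row as a new last column (index a.shape[1])
np.insert(a, a.shape[1], [10, 20, 30], axis=1)
# array([[ 1,  1, 10],
#        [ 2,  2, 20],
#        [ 3,  3, 30]])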
I am looking for a way to apply a function n items at a time along an axis. E.g.
array([[ 1, 2],
[ 3, 4],
[ 5, 6],
[ 7, 8]])
If I apply sum across the rows 2 items at a time I get:
array([[ 4, 6],
[ 12, 14]])
which is the sum of the first 2 rows and the last 2 rows.
NB: I am dealing with a much larger array, and I have to apply the function to n items at a time, where n is decided at runtime.
The data also extends along the other axis. E.g.
array([[... [ 1, 2, ...],
[ 3, 4, ...],
[ 5, 6, ...],
[ 7, 8, ...],
...], ...])
This is a reduction:
numpy.add.reduceat(a, [0,2])
array([[ 4, 6],
[12, 14]], dtype=int32)
As long as by "larger" you mean longer in the "y" axis, you can extend:
a = numpy.array([[ 1, 2],
[ 3, 4],
[ 5, 6],
[ 7, 8],
[ 9, 10],
[11, 12]])
numpy.add.reduceat(a, [0,2,4])
array([[ 4, 6],
[12, 14],
[20, 22]], dtype=int32)
EDIT: actually, this works fine for "larger in both dimensions", too:
a = numpy.arange(24).reshape(6,4)
numpy.add.reduceat(a, [0,2,4])
array([[ 4, 6, 8, 10],
[20, 22, 24, 26],
[36, 38, 40, 42]], dtype=int32)
I will leave it up to you to adapt the indices to your specific case.
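If n is only known at runtime, the index list for reduceat can be built with arange; a small sketch (assuming the row count is a multiple of n, since reduceat folds any leftover rows into the last block):

import numpy

a = numpy.arange(24).reshape(6, 4)
n = 2  # block size chosen at runtime

indices = numpy.arange(0, a.shape[0], n)   # one index per block start: [0, 2, 4]
numpy.add.reduceat(a, indices)
# array([[ 4,  6,  8, 10],
#        [20, 22, 24, 26],
#        [36, 38, 40, 42]])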
Reshape to split the first axis into two axes, such that the second of them has length n, giving a 3D array, and then sum along that split axis, like so -
a.reshape(a.shape[0]//n,n,a.shape[1]).sum(1)
It should be pretty efficient, as reshaping just creates a view into the input array.
Sample run -
In [55]: a
Out[55]:
array([[2, 8, 0, 0],
[1, 5, 3, 3],
[6, 1, 4, 7],
[0, 4, 0, 7],
[8, 0, 8, 1],
[8, 3, 3, 8]])
In [56]: n = 2 # Sum every two rows
In [57]: a.reshape(a.shape[0]//n,n,a.shape[1]).sum(1)
Out[57]:
array([[ 3, 13, 3, 3],
[ 6, 5, 4, 14],
[16, 3, 11, 9]])
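The same reshape trick is not limited to sum; any reduction applied along the middle (length-n) axis works, e.g. taking the maximum of every two rows of the same sample a (a sketch, again assuming the row count is a multiple of n):

a.reshape(a.shape[0]//n, n, a.shape[1]).max(1)
# array([[2, 8, 3, 3],
#        [6, 4, 4, 7],
#        [8, 3, 8, 8]])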
How about something like this?
n = 2
# calculate the cumsum along axis 0 and take one row from every n rows
cumarr = arr.cumsum(axis = 0)[(n-1)::n]
# calculate the difference of the resulting numpy array along axis 0
np.vstack((cumarr[0][None, :], np.diff(cumarr, axis=0)))
# array([[ 4, 6],
# [12, 14]])
I am trying to understand numpy's combined slicing and indexing concept; however, I am not sure how to correctly work out the results below by hand (so that we can understand how numpy processes combined slicing and indexing, and which one is processed first):
>>> import numpy as np
>>> a=np.arange(12).reshape(3,4)
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> i=np.array([[0,1],[2,2]])
>>> a[i,:]
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7]],
[[ 8, 9, 10, 11],
[ 8, 9, 10, 11]]])
>>> j=np.array([[2,1],[3,3]])
>>> a[:,j]
array([[[ 2, 1],
[ 3, 3]],
[[ 6, 5],
[ 7, 7]],
[[10, 9],
[11, 11]]])
>>> aj=a[:,j]
>>> aj.shape
(3L, 2L, 2L)
I am a bit confused about how aj's shape becomes (3,2,2) given the above output; any detailed explanation would be very much appreciated, thanks!
Whenever you use an array of indices, the result has the same shape as the indices; for example:
>>> x = np.ones(5)
>>> i = np.array([[0, 1], [1, 0]])
>>> x[i]
array([[ 1., 1.],
[ 1., 1.]])
We've indexed with a 2x2 array, and the result is a 2x2 array.
When combined with a slice, the size of the slice is preserved. For example:
>>> x = np.ones((5, 3))
>>> x[i, :].shape
(2, 2, 3)
Where the first example was a 2x2 array of items, this example is a 2x2 array of (length-3) rows.
The same is true when you switch the order of the slice:
>>> x = np.ones((5, 3))
>>> x[:, i].shape
(5, 2, 2)
This can be thought of as a list of five 2x2 arrays.
Just remember: when any dimension is indexed with a list or array, the result has the shape of the indices, not the shape of the input.
a[:,j][0] is equivalent to a[0,j] or [0, 1, 2, 3][j], which gives you [[2, 1], [3, 3]]
a[:,j][1] is equivalent to a[1,j] or [4, 5, 6, 7][j], which gives you [[6, 5], [7, 7]]
a[:,j][2] is equivalent to a[2,j] or [8, 9, 10, 11][j], which gives you [[10, 9], [11, 11]]
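These equivalences (and the (3, 2, 2) shape) are easy to verify directly; a quick sketch using the arrays from the question:

import numpy as np

a = np.arange(12).reshape(3, 4)
j = np.array([[2, 1], [3, 3]])

aj = a[:, j]
aj.shape                           # (3, 2, 2): length-3 slice, then the (2, 2) shape of j
np.array_equal(aj[0], a[0, j])     # True
np.array_equal(aj[1], a[1, j])     # True
np.array_equal(aj[2], a[2, j])     # True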