Numpy append and normal append - python

x = [[1,2],[2,3],[10,1],[10,10]]
def duplicatingRows(x, l):
    severity = x[l][1]
    if severity == 1 or severity == 2:
        for k in range(1,6):
            x.append(x[l])
for l in range(len(x)):
    duplicatingRows(x,l)
print(x)
x = np.array([[1,2],[2,3],[10,1],[10,10]])
def duplicatingRows(x, l):
    severity = x[l][1]
    if severity == 1 or severity == 2:
        for k in range(1,6):
            x = np.append(x, x[l])
for l in range(len(x)):
    duplicatingRows(x,l)
print(x)
I would like the NumPy version to print an array with the extra rows appended, i.e. the array equivalent of the list
[[1, 2], [2, 3], [10, 1], [10, 10], [1, 2], [1, 2], [1, 2], [1, 2], [1, 2], [10, 1], [10, 1], [10, 1], [10, 1], [10, 1]]. Why does it not work? I tried different combinations with concatenate as well, but they didn't work either.

Your code has two bugs: np.append without an axis argument flattens its inputs, and assigning to x inside the function only rebinds the local name, so the caller never sees the new array. Here's a corrected and (partially) vectorized version of your code which prints your desired output.
It uses numpy.tile to repeat the row, followed by a reshape so that the copies can be appended along axis 0, which is what is needed.
In [24]: x = np.array([[1,2],[2,3],[10,1],[10,10]])
def duplicatingRows(x, l):
    severity = x[l][1]
    if severity == 1 or severity == 2:
        # replaced your `for` loop
        # 5 corresponds to `range(1, 6)`
        reps = np.tile(x[l], 5).reshape(5, -1)
        x = np.append(x, reps, axis=0)
    return x
for l in range(len(x)):
    x = duplicatingRows(x,l)
print(x)
Output:
[[ 1 2]
[ 2 3]
[10 1]
[10 10]
[ 1 2]
[ 1 2]
[ 1 2]
[ 1 2]
[ 1 2]
[10 1]
[10 1]
[10 1]
[10 1]
[10 1]]
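To see why the original NumPy version fails, here is a minimal sketch (not part of the answer above; the variable names flat and stacked are illustrative). It shows that np.append without axis flattens both inputs, that axis=0 needs a 2-D value, and that the original array is never modified in place.

```python
import numpy as np

x = np.array([[1, 2], [2, 3], [10, 1], [10, 10]])

# Without `axis`, np.append flattens both arguments into a 1-D array.
flat = np.append(x, x[0])
print(flat.shape)  # (10,)

# With axis=0 the appended value must be 2-D too, hence the slice x[0:1].
stacked = np.append(x, x[0:1], axis=0)
print(stacked.shape)  # (5, 2)

# np.append always returns a new array; the original is left untouched.
print(x.shape)  # (4, 2)
```

This is why the corrected function has to return the new array and the caller has to reassign it.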

Let's take a whole-array approach
In [140]: arr = np.array([[1,2],[2,3],[10,1],[10,10]])
In [141]: arr
Out[141]:
array([[ 1, 2],
[ 2, 3],
[10, 1],
[10, 10]])
We want to replicate the rows where the 2nd column has a 1 or 2, right? isin makes a nice 'mask' (we could also use == and any):
In [142]: np.isin(arr[:,1],[1,2])
Out[142]: array([ True, False, True, False])
In [143]: torepeat = arr[np.isin(arr[:,1],[1,2])]
In [144]: torepeat
Out[144]:
array([[ 1, 2],
[10, 1]])
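The "== and any" alternative mentioned above could look roughly like this. It is a sketch, with mask_any and mask_isin as illustrative names; the second column is kept 2-D so that it broadcasts against the list of candidate values.

```python
import numpy as np

arr = np.array([[1, 2], [2, 3], [10, 1], [10, 10]])

# Compare the second column (kept 2-D via 1:2) against each candidate value,
# then reduce along the candidate axis.
mask_any = (arr[:, 1:2] == [1, 2]).any(axis=1)
mask_isin = np.isin(arr[:, 1], [1, 2])

print(mask_any)  # [ True False  True False]
print(np.array_equal(mask_any, mask_isin))  # True
```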
np.repeat does a nice job of replicating the values, which we can simply concatenate with the original:
In [145]: repeated = np.repeat(torepeat,5, axis=0)
In [146]: np.concatenate((arr, repeated),axis=0)
Out[146]:
array([[ 1, 2],
[ 2, 3],
[10, 1],
[10, 10],
[ 1, 2],
[ 1, 2],
[ 1, 2],
[ 1, 2],
[ 1, 2],
[10, 1],
[10, 1],
[10, 1],
[10, 1],
[10, 1]])
np.append uses concatenate. It works OK for adding a single element to a 1-D array, but becomes trickier to use with higher dimensions. It's a poor imitation of the list append. Also, repeated concatenation in a loop is relatively slow, because each call copies the whole array. We usually recommend list appends, with a single array build at the end.
Another way to use repeat iteratively:
In [164]: np.concatenate([np.repeat(a[None,:], 5, axis=0) for a in arr if (a[1]==1 or a[1]==2)], axis=0)
Out[164]:
array([[ 1, 2],
[ 1, 2],
[ 1, 2],
[ 1, 2],
[ 1, 2],
[10, 1],
[10, 1],
[10, 1],
[10, 1],
[10, 1]])
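The recommendation above (append to a list, then build the array once) can be sketched as follows. This is one possible arrangement under the question's "severity is 1 or 2" rule, not the only way to write it; rows and result are illustrative names.

```python
import numpy as np

arr = np.array([[1, 2], [2, 3], [10, 1], [10, 10]])

rows = list(arr)                # start from the original rows
for row in arr:
    if row[1] in (1, 2):        # same "severity" test as in the question
        rows.extend([row] * 5)  # cheap list appends, no array copies

result = np.array(rows)         # single array build at the end
print(result.shape)  # (14, 2)
```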

Related

vectorize a function on a 3D numpy array using a specific signature

I'd like to apply a function f(x, y) to a numpy array a of shape (N,M,2), whose last axis (2) contains the variables x and y to pass as inputs to f.
Example.
a = np.array([[[1, 1],
               [2, 1],
               [3, 1]],
              [[1, 2],
               [2, 2],
               [3, 2]],
              [[1, 3],
               [2, 3],
               [3, 3]]])
def function_to_vectorize(x, y):
    # the function body is totally arbitrary and not important
    if x>2 and y-x>0:
        sum = 0
        for i in range(y):
            sum+=i
        return sum
    else:
        sum = y
        for i in range(x):
            sum-=i
        return sum
I'd like to apply function_to_vectorize in this way:
[[function_to_vectorize(element[0], element[1]) for element in vector] for vector in a]
#array([[ 1, 0, -2],
# [ 2, 1, -1],
# [ 3, 2, 0]])
How can I vectorize this function with np.vectorize?
With that function, the np.vectorize result will also expect 2 arguments. 'signature' is determined by the function, not by the array(s) you expect to supply.
In [184]: f = np.vectorize(function_to_vectorize)
In [185]: f(1,2)
Out[185]: array(2)
In [186]: a = np.array([[[1, 1],
...: [2, 1],
...: [3, 1]],
...:
...: [[1, 2],
...: [2, 2],
...: [3, 2]],
...:
...: [[1, 3],
...: [2, 3],
...: [3, 3]]])
Just supply the 2 columns of a:
In [187]: f(a[:,:,0],a[:,:,1])
Out[187]:
array([[ 1, 0, -2],
[ 2, 1, -1],
[ 3, 2, 0]])
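If you would rather pass the whole (N,M,2) array in one go, np.vectorize also accepts a signature argument. The sketch below assumes a small wrapper, pair_func (a hypothetical name, not from the question), that unpacks each (x, y) pair; the function body is a compact rewrite of the question's logic. Note that np.vectorize is a convenience that still loops in Python, so this is about readability rather than speed.

```python
import numpy as np

def function_to_vectorize(x, y):
    # same logic as in the question, written with the built-in sum()
    if x > 2 and y - x > 0:
        return sum(range(y))
    return y - sum(range(x))

def pair_func(v):
    # hypothetical wrapper: v is a length-2 1-D array holding (x, y)
    return function_to_vectorize(int(v[0]), int(v[1]))

a = np.array([[[1, 1], [2, 1], [3, 1]],
              [[1, 2], [2, 2], [3, 2]],
              [[1, 3], [2, 3], [3, 3]]])

# '(n)->()' tells np.vectorize to consume the last axis and return a scalar
g = np.vectorize(pair_func, signature='(n)->()')
result = g(a)
print(result)
```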

Combine index and value of tensor to form a new tensor

I have a tensor like a = torch.tensor([1,2,0,1,2]). I want to calculate a tensor b which has indices and values of tensor a such that:
b = tensor([ [0,1], [1,2], [2,0], [3,1], [4,2] ]).
Edit: a[i] is >= 0.
One way of doing this is:
b = torch.IntTensor(list(zip(range(0, list(a.size())[0], 1), a.numpy())))
Output:
tensor([[0, 1],
[1, 2],
[2, 0],
[3, 1],
[4, 2]], dtype=torch.int32)
Alternatively, you can also use torch.cat() as below:
a = torch.tensor([1,2,0,1,2])
indices = torch.arange(0, list(a.size())[0])
res = torch.cat([indices.view(-1, 1), a.view(-1, 1)], 1)
Output:
tensor([[0, 1],
[1, 2],
[2, 0],
[3, 1],
[4, 2]])
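Another option is torch.stack, which joins the index and value tensors along a new dimension:
a = torch.tensor([1,2,0,1,2])
print(a)
i = torch.arange(a.size(0))
print(i)
r = torch.stack((i, a), dim=1)
print(r)
Output:
tensor([1, 2, 0, 1, 2])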
tensor([0, 1, 2, 3, 4])
tensor([[0, 1],
[1, 2],
[2, 0],
[3, 1],
[4, 2]])

Create 4D upper diagonal array from 3D

Let's say that I have an array of shape (x, y, z). Now, I wish to create a new array of shape (x, y, i, i), where each (i, i) matrix is upper triangular and constructed from the values along the z-dimension. Is there some easy way of doing this in numpy without using more than 1 for-loop (looping over x)? Thanks.
EDIT
original = np.array([
    [
        [0, 1, 3],
        [4, 5, 6]
    ],
    [
        [7, 8, 9],
        [3, 2, 1]
    ],
])
new = np.array([
    [
        [
            [0, 1],
            [0, 3]
        ],
        [
            [4, 5],
            [0, 6]
        ]
    ],
    [
        [
            [7, 8],
            [0, 9]
        ],
        [
            [3, 2],
            [0, 1]
        ]
    ]
])
So, using the above we see that
original[0, 0, :] = [0 1 3]
new[0, 0, :, :] = [[0 1]
[0 3]]
Here's an approach using boolean indexing, where a is the input array (original in the question) -
n = 2 # This would depend on a.shape[-1]
out = np.zeros(a.shape[:2] + (n,n,),dtype=a.dtype)
out[:,:,np.arange(n)[:,None] <= np.arange(n)] = a
Sample run -
In [247]: a
Out[247]:
array([[[0, 1, 3],
[4, 5, 6]],
[[7, 8, 9],
[3, 2, 1]]])
In [248]: out
Out[248]:
array([[[[0, 1],
[0, 3]],
[[4, 5],
[0, 6]]],
[[[7, 8],
[0, 9]],
[[3, 2],
[0, 1]]]])
Another approach could be suggested using subscripted-indexing to replace the last step -
r,c = np.triu_indices(n)
out[:,:,r,c] = a
Note : As stated earlier, n would depend on a.shape[-1]. Here, we had a.shape[-1] as 3, so n was 2. If a.shape[-1] were 6, n would be 3 and so on. The relationship is : (n*(n+1))//2 == a.shape[-1].
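Rather than hard-coding n, it can be derived from a.shape[-1] by inverting the relationship in the note above. The following is a sketch under that assumption; k is an illustrative name, and the explicit check simply rejects lengths that are not triangular numbers.

```python
import math
import numpy as np

a = np.array([[[0, 1, 3], [4, 5, 6]],
              [[7, 8, 9], [3, 2, 1]]])

k = a.shape[-1]
# Solve n*(n+1)/2 == k for n; isqrt keeps the arithmetic exact.
n = (math.isqrt(8 * k + 1) - 1) // 2
if n * (n + 1) // 2 != k:
    raise ValueError(f"last axis length {k} is not a triangular number")

out = np.zeros(a.shape[:2] + (n, n), dtype=a.dtype)
r, c = np.triu_indices(n)   # row-major positions of the upper triangle
out[:, :, r, c] = a
print(out[0, 0])
```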

Python - Matrix outer product

Given two matrices
A: m * r
B: n * r
I want to generate another matrix C: m * n, with each entry C_ij being a matrix calculated by the outer product of A_i and B_j.
For example,
A: [[1, 2],
[3, 4]]
B: [[3, 1],
[1, 2]]
gives
C: [[[[ 3, 1],
      [ 6, 2]],
     [[ 1, 2],
      [ 2, 4]]],
    [[[ 9, 3],
      [12, 4]],
     [[ 3, 6],
      [ 4, 8]]]]
I can do it using for loops, like
C = np.empty((A.shape[0], B.shape[0], A.shape[1], B.shape[1]), dtype=A.dtype)
for i in range(A.shape[0]):
    for j in range(B.shape[0]):
        C[i, j] = np.outer(A[i], B[j])
I wonder if there is a vectorised way of doing this calculation to speed it up?
The Einstein notation expresses this problem nicely
In [85]: np.einsum('ac,bd->abcd',A,B)
Out[85]:
array([[[[ 3, 1],
[ 6, 2]],
[[ 1, 2],
[ 2, 4]]],
[[[ 9, 3],
[12, 4]],
[[ 3, 6],
[ 4, 8]]]])
temp = numpy.multiply.outer(A, B)
C = numpy.swapaxes(temp, 1, 2)
NumPy ufuncs, such as multiply, have an outer method that almost does what you want. The following:
temp = numpy.multiply.outer(A, B)
produces a result such that temp[a, b, c, d] == A[a, b] * B[c, d]. You want C[a, b, c, d] == A[a, c] * B[b, d]. The swapaxes call rearranges temp to put it in the order you want.
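As a quick sanity check, the sketch below compares the einsum and multiply.outer results against the explicit loop from the question. The broadcasting line is an extra variant added here for comparison, not something taken from the answers above.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[3, 1], [1, 2]])

# Reference result: explicit loops over the rows of A and B.
m, n = A.shape[0], B.shape[0]
C_loop = np.empty((m, n, A.shape[1], B.shape[1]), dtype=A.dtype)
for i in range(m):
    for j in range(n):
        C_loop[i, j] = np.outer(A[i], B[j])

C_einsum = np.einsum('ac,bd->abcd', A, B)
C_ufunc = np.swapaxes(np.multiply.outer(A, B), 1, 2)
# Broadcasting: index order (i, j, c, d), with size-1 axes where needed.
C_bcast = A[:, None, :, None] * B[None, :, None, :]

print(all(np.array_equal(C_loop, C) for C in (C_einsum, C_ufunc, C_bcast)))
```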
Simple Solution with Numpy Array Broadcasting
Since you want C_ij = A_i * B_j, this can be achieved with numpy broadcasting: take the element-wise product of A flattened into a column vector and B flattened into a row vector, as shown below:
# import numpy as np
# A = [[1, 2], [3, 4]]
# B = [[3, 1], [1, 2]]
A, B = np.array(A), np.array(B)
C = A.reshape(-1,1) * B.reshape(1,-1)
# same as:
# C = np.einsum('i,j->ij', A.flatten(), B.flatten())
print(C)
Output:
array([[ 3, 1, 1, 2],
[ 6, 2, 2, 4],
[ 9, 3, 3, 6],
[12, 4, 4, 8]])
You could then get your desired four sub-matrices by using numpy.dsplit() or numpy.array_split() as follows:
np.dsplit(C.reshape(2, 2, 4), 2)
# same as:
# np.array_split(C.reshape(2,2,4), 2, axis=2)
Output:
[array([[[ 3, 1],
[ 6, 2]],
[[ 9, 3],
[12, 4]]]),
array([[[1, 2],
[2, 4]],
[[3, 6],
[4, 8]]])]
Use numpy:
In [1]: import numpy as np
In [2]: A = np.array([[1, 2], [3, 4]])
In [3]: B = np.array([[3, 1], [1, 2]])
In [4]: C = np.outer(A, B)
In [5]: C
Out[5]:
array([[ 3, 1, 1, 2],
[ 6, 2, 2, 4],
[ 9, 3, 3, 6],
[12, 4, 4, 8]])
Once you have the desired result, you can use numpy.reshape() to mold it into almost any shape you want:
In [6]: C.reshape([4,2,2])
Out[6]:
array([[[ 3, 1],
[ 1, 2]],
[[ 6, 2],
[ 2, 4]],
[[ 9, 3],
[ 3, 6]],
[[12, 4],
[ 4, 8]]])
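Note that a plain reshape to (4, 2, 2) regroups the flat product row by row, so its blocks are not the outer products of A_i and B_j. If the (m, n, r, r) layout from the question is wanted, one possible route (a sketch, assuming A is m x r and B is n x r) is to split the axes first and then reorder them:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[3, 1], [1, 2]])
m, r = A.shape
n = B.shape[0]

flat = np.outer(A, B)   # shape (m*r, n*r); np.outer flattens its inputs
# Split the axes into (i, c, j, d), then reorder them to (i, j, c, d).
C = flat.reshape(m, r, n, r).transpose(0, 2, 1, 3)
print(C[0, 1])   # outer product of A[0] and B[1]
```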

Replace subarrays in numpy

Given an array,
>>> n = 2
>>> a = numpy.array([[[1,1,1],[1,2,3],[1,3,4]]]*n)
>>> a
array([[[1, 1, 1],
[1, 2, 3],
[1, 3, 4]],
[[1, 1, 1],
[1, 2, 3],
[1, 3, 4]]])
I know that it's possible to replace values in it succinctly like so,
>>> a[a==2] = 0
>>> a
array([[[1, 1, 1],
[1, 0, 3],
[1, 3, 4]],
[[1, 1, 1],
[1, 0, 3],
[1, 3, 4]]])
Is it possible to do the same for an entire row (last axis) in the array? I know that a[a==[1,2,3]] = 11 will work and replace all the elements of the matching subarrays with 11, but I'd like to substitute a different subarray. My intuition tells me to write the following, but an error results,
>>> a[a==[1,2,3]] = [11,22,33]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: array is not broadcastable to correct shape
In summary, what I'd like to get is:
array([[[1, 1, 1],
[11, 22, 33],
[1, 3, 4]],
[[1, 1, 1],
[11, 22, 33],
[1, 3, 4]]])
... and n of course is, in general, a lot larger than 2, and the other axes are also larger than 3, so I don't want to loop over them if I don't need to.
Update: The [1,2,3] (or whatever else I'm looking for) is not always at index 1. An example:
a = numpy.array([[[1,1,1],[1,2,3],[1,3,4]], [[1,2,3],[1,1,1],[1,3,4]]])
You can do this efficiently, without a Python loop: use np.all to check that every element along the last axis matches your comparison, then use the resulting mask to replace the rows:
mask = np.all(a==[1,2,3], axis=2)
a[mask] = [11, 22, 33]
print(a)
#array([[[ 1, 1, 1],
# [11, 22, 33],
# [ 1, 3, 4]],
#
# [[ 1, 1, 1],
# [11, 22, 33],
# [ 1, 3, 4]]])
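The same mask also handles the case from the question's update, where [1,2,3] is not always at index 1. A minimal sketch using that example array:

```python
import numpy as np

# Example from the question's update: [1, 2, 3] sits at different positions.
a = np.array([[[1, 1, 1], [1, 2, 3], [1, 3, 4]],
              [[1, 2, 3], [1, 1, 1], [1, 3, 4]]])

# True only where every element of the last axis matches the pattern.
mask = np.all(a == [1, 2, 3], axis=-1)   # shape (2, 3)
a[mask] = [11, 22, 33]                   # broadcast over the selected rows
print(a)
```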
You have to do something a little more complicated to achieve what you want.
You can't select slices of arrays as such, but you can select all the specific indexes you want.
So first you need to construct an array that represents the rows you wish to select, i.e.
data = numpy.array([[1,2,3],[55,56,57],[1,2,3]])
to_select = numpy.array([1,2,3]*3).reshape(3,3) # three rows of [1,2,3]
selected_indices = data == to_select
# array([[ True, True, True],
# [False, False, False],
# [ True, True, True]], dtype=bool)
data = numpy.where(selected_indices, [4,5,6], data)
# array([[4, 5, 6],
# [55, 56, 57],
# [4, 5, 6]])
# done in one step, but perhaps not very clear as to its intent
data = numpy.where(data == numpy.array([1,2,3]*3).reshape(3,3), [4,5,6], data)
numpy.where works by selecting from the second argument if true and the third argument if false.
The second and third arguments of numpy.where can each be one of three kinds of data. The first is an array that has the same shape as selected_indices. The second is a single value on its own (like 2 or 7). The third is the most complicated: an array of any shape that can be broadcast to the shape of selected_indices. In this case we provided [4,5,6], which is repeated for each row to give an array with shape 3x3.
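The kinds of replacement argument can be compared side by side in a short sketch. The data array here is made up for illustration and deliberately contains a partially matching row, [1, 9, 3], to show that the element-wise comparison replaces individual elements; the last step (an addition, not from the answer above) reduces the mask per row so that only complete matches are replaced.

```python
import numpy as np

data = np.array([[1, 2, 3], [55, 56, 57], [1, 9, 3]])

# Element-wise comparison: [1, 2, 3] is broadcast against every row.
elementwise = data == [1, 2, 3]

# Scalar replacement: one value is used wherever the condition is True.
with_scalar = np.where(elementwise, 0, data)

# Row replacement: [4, 5, 6] is broadcast to the shape of the condition.
with_row = np.where(elementwise, [4, 5, 6], data)

# To replace only rows that match completely, reduce the mask per row first.
whole_row = elementwise.all(axis=1, keepdims=True)
only_full_rows = np.where(whole_row, [4, 5, 6], data)

print(with_row)
print(only_full_rows)
```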
Not sure if this is what you want, since your code example does not create the array you say it does. But:
>>> a = np.array([[[1,1,1],[1,2,3],[1,3,4]], [[1,1,1],[1,2,3],[1,3,4]]])
>>> a
array([[[1, 1, 1],
[1, 2, 3],
[1, 3, 4]],
[[1, 1, 1],
[1, 2, 3],
[1, 3, 4]]])
>>> a[:,1,:] = [[8, 8, 8], [8,8,8]]
>>> a
array([[[1, 1, 1],
[8, 8, 8],
[1, 3, 4]],
[[1, 1, 1],
[8, 8, 8],
[1, 3, 4]]])
>>> a[:,1,:] = [88, 88, 88]
>>> a
array([[[ 1, 1, 1],
[88, 88, 88],
[ 1, 3, 4]],
[[ 1, 1, 1],
[88, 88, 88],
[ 1, 3, 4]]])
