I love numpy because it allows vectorized operations such as:
mat1 = np.array([[1,2],[3,4]])
mat2 = np.array([[10,20],[30,40]])
mat3 = (mat1 + mat2)*2.0 # vectorization way. nice.
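But I cannot find how to do this kind of operation with diagonal elements. What I'd like to do is compute, for every i and j, expressions that combine A[i,j] with the diagonal elements A[i,i] and A[j,j], such as 1/(A[i,i] + A[j,j] - 2*A[i,j]) and A[i,j]/sqrt(A[i,i]*A[j,j]).
Is it possible to do the above in a vectorized way with numpy?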
For the first example:
With:
In [3]: A
"""
array([[1, 3, 4, 0, 4],
[2, 3, 3, 3, 0],
[1, 0, 4, 1, 0],
[0, 3, 3, 2, 0],
[2, 1, 0, 3, 2]])
"""
In [4]: Aii=vstack((diag(A),)*A.shape[0])
"""
array([[1, 3, 4, 2, 2],
[1, 3, 4, 2, 2],
[1, 3, 4, 2, 2],
[1, 3, 4, 2, 2],
[1, 3, 4, 2, 2]])
"""
In [5]: Ajj=Aii.T # transpose
In [6]: B= 1/ (Aii+Ajj-2*A)
Or, with more abstract tools:
B1 = 1 / (np.add.outer(np.diag(A), np.diag(A)) - 2*A)
B2 = A / np.sqrt(np.multiply.outer(np.diag(A), np.diag(A)))
When there is no dependency between iterations, as in these cases, broadcasting is the obvious approach. Extend the 1D diagonal array to a 2D version with None/np.newaxis to stand in for A[i,i], and use the 1D version as is to stand in for A[j,j]. The two cases can then be solved like so -
Ad = np.diag(A)
case1_out_vectorized = 1/(Ad[:,None] + Ad - 2*A)
case2_out_vectorized = A/np.sqrt(Ad[:,None]*Ad)
Sample run -
In [33]: # Random input array
...: A = np.random.rand(4,4)
...:
...: # Naive loopy implementation (used here for verification)
...: m = A.shape[0]
...: case1_out_loopy = np.zeros((m,m))
...: case2_out_loopy = np.zeros((m,m))
...: for i in range(m):
...:     for j in range(m):
...:         case1_out_loopy[i,j] = 1/(A[i,i] + A[j,j] - 2*A[i,j])
...:         case2_out_loopy[i,j] = A[i,j]/np.sqrt(A[i,i]*A[j,j])
...:
In [34]: # Proposed approach
...: Ad = np.diag(A)
...: case1_out_vectorized = 1/(Ad[:,None] + Ad - 2*A)
...: case2_out_vectorized = A/np.sqrt(Ad[:,None]*Ad)
...:
In [35]: np.allclose(case1_out_loopy,case1_out_vectorized)
Out[35]: True
In [36]: np.allclose(case2_out_loopy,case2_out_vectorized)
Out[36]: True
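To see why Ad[:,None] + Ad reproduces A[i,i] + A[j,j], here is a minimal sketch that prints the shapes involved and compares the broadcast sum with np.add.outer. The 3x3 matrix is an arbitrary example made up for illustration and does not come from the answers above.

```python
import numpy as np

# Arbitrary example matrix (an assumption for illustration only)
A = np.array([[4., 1., 0.],
              [1., 5., 2.],
              [0., 2., 6.]])

Ad = np.diag(A)        # shape (3,): the diagonal entries
col = Ad[:, None]      # shape (3, 1): varies along rows, plays the role of A[i,i]
pair_sum = col + Ad    # shape (3, 3): pair_sum[i, j] == A[i,i] + A[j,j]

print(col.shape, Ad.shape, pair_sum.shape)
print(np.array_equal(pair_sum, np.add.outer(Ad, Ad)))
```

The (3, 1) and (3,) operands are stretched to a common (3, 3) shape without copying the diagonal, which is what the explicit vstack version does by hand.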
Given python code with numpy:
import numpy as np
a = np.arange(6).reshape(3, 2) # a = [[0, 1], [2, 3], [4, 5]]; a.shape = (3, 2)
b = np.arange(3) + 1 # b = [1, 2, 3] ; b.shape = (3,)
How can I multiply each value in b with each corresponding row ('vector') in a? So here, I want the result as:
result = [[0, 1], [4, 6], [12, 15]] # result.shape = (3, 2)
I can do this with a loop, but I am wondering about a vectorized approach. I found an Octave solution here. Apart from this, I didn't find anything else. Any pointers for this?
Thank you in advance.
Probably the simplest is to do the following.
import numpy as np
a = np.arange(6).reshape(3, 2) # a = [[0, 1], [2, 3], [4, 5]]; a.shape = (3, 2)
b = np.arange(3) + 1
ans = np.diag(b) @ a
Here's a method that exploits numpy multiplication broadcasting:
ans = (b*a.T).T
These two solutions take essentially the same approach:
ans = np.tile(b,(2,1)).T*a
ans = np.vstack([b for _ in range(a.shape[1])]).T*a
In [123]: a = np.arange(6).reshape(3, 2) # a = [[0, 1], [2, 3], [4, 5]]; a.shape = (3, 2)
...: b = np.arange(3) + 1 # b = [1, 2, 3] ; b.shape = (3,)
In [124]: a
Out[124]:
array([[0, 1],
[2, 3],
[4, 5]])
A (3,1) array will multiply a (3,2) array via broadcasting:
In [125]: a*b[:,None]
Out[125]:
array([[ 0, 1],
[ 4, 6],
[12, 15]])
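As a rough illustration of why the extra axis is needed, the sketch below shows that plain a * b fails for these shapes while a * b[:,None] works. The np.einsum line is an equivalent spelling added here for comparison; it is not taken from the answers.

```python
import numpy as np

a = np.arange(6).reshape(3, 2)   # shape (3, 2)
b = np.arange(3) + 1             # shape (3,)

# Shapes are compared from the trailing axis: (3, 2) vs (3,) pairs 2 with 3,
# which is incompatible, so plain a * b raises a ValueError.
try:
    a * b
    plain_ok = True
except ValueError:
    plain_ok = False

# b[:, None] has shape (3, 1); its length-1 axis stretches across the 2 columns.
scaled = a * b[:, None]

# The same row scaling written with einsum
scaled_einsum = np.einsum('i,ij->ij', b, a)

print(plain_ok)
print(scaled)
```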
I have a list of coefficients and a list of times
a = np.array([0,1,2,3])
t = np.array([1,2,3])
I would like to perform some multiplicative operation on the two where each coefficient is multiplied by each of the times to result in an array like:
array([[0, 0, 0],
[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
I can do this with a for loop like:
np.array([i * t for i in a])
However, I was wondering whether there is a more efficient, numpythonic way of performing this operation without the for loop, since in reality I have much bigger arrays and multiple sets of coefficients.
Try this (uses broadcasting).
>>> import numpy as np
>>> a = np.array([0, 1, 2, 3])
>>> t = np.array([1, 2, 3])
>>> res1 = t * a[:, None]
>>> res1
array([[0, 0, 0],
[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
My preferred way is:
a[:, None] * t
But there is also a dedicated function for that:
np.outer(a, t)
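The question also mentions multiple sets of coefficients. One possible way to handle that with the same broadcasting idea is sketched below, assuming the sets are stacked as rows of a 2D array; the name coeff_sets and its values are made up for illustration.

```python
import numpy as np

t = np.array([1, 2, 3])

# Two hypothetical sets of coefficients stacked as rows, shape (2, 4)
coeff_sets = np.array([[0, 1, 2, 3],
                       [10, 20, 30, 40]])

# Append a length-1 axis so each coefficient broadcasts against every time:
# (2, 4, 1) * (3,) -> (2, 4, 3)
products = coeff_sets[..., None] * t

print(products.shape)
print(products[0])
```

Each products[k] is the same table that np.outer(coeff_sets[k], t) would give, computed for all sets in one call.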
I have an array like this and need to replace every 1 with 2, every 3 with 4, every 4 with 1. Is there a way to do this just with np and not loops?
import numpy as np
np.random.seed(2)
arr=np.random.randint(1,5,(3,3),int)
arr
array([[1, 4, 2],
[1, 3, 4],
[3, 4, 1]])
If I apply array masks sequentially, I don't get the expected outcome:
array([[2, 1, 2],
[2, 4, 1],
[4, 1, 2]])
It is based on conditional logic and not a maths formula.
If the array values don't necessarily range between 1 and 4, you can use np.select:
import numpy as np
a = np.random.randint(1,5, (3,3))
condlist = [np.logical_or(a==1, a==2), a==3, a==4]
choicelist= [2, 4, 1]
b = np.select(condlist, choicelist)
which does not care about the order of the conditions
Here's one with np.searchsorted for performance efficiency -
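def map_values(arr, old_val, new_val):
    # Sort order of old_val, so searchsorted also works on unsorted inputs
    sidx = old_val.argsort()
    idx = np.searchsorted(old_val, arr, sorter=sidx)
    # Clip so values above old_val.max() don't index out of bounds
    pos = sidx[np.clip(idx, 0, len(old_val) - 1)]
    return np.where(old_val[pos] == arr, new_val[pos], arr)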
Sample run -
In [40]: arr
Out[40]:
array([[1, 4, 2],
[1, 3, 4],
[3, 4, 1]])
In [41]: old_val = np.array([1,3,4])
...: new_val = np.array([2,4,1])
In [42]: map_values(arr, old_val, new_val)
Out[42]:
array([[2, 1, 2],
[2, 4, 1],
[4, 1, 2]])
Could do this with a lambda function and np.vectorize():
import numpy as np
np.random.seed(2)
arr=np.random.randint(1,5,(3,3),int)
f = lambda x: x%4 + 1 if x in [1,3,4] else x
vfunc = np.vectorize(f)
Usage:
>>> vfunc(arr)
array([[2, 1, 2],
[2, 4, 1],
[4, 1, 2]])
You have to be careful about the order of assignments. For example, if you do
arr[arr == 4] = 1
arr[arr == 1] = 2
Now all of the elements that were originally 4 will be 2, not 1 as you intend.
One solution is to carefully craft the order of assignments:
arr[arr == 1] = 2
arr[arr == 4] = 1
However, this is very brittle and will fall apart as you introduce more of them. It would be better to create the masks up front from the original array:
ones = arr == 1
fours = arr == 4
arr[ones] = 2
arr[fours] = 1
Now the order of the assignments won't matter because the masks are determined before modifying the array.
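The masks-first idea extends to any number of replacements. A minimal sketch, assuming the replacements are given as two parallel lists (old_vals and new_vals are names chosen here for illustration):

```python
import numpy as np

arr = np.array([[1, 4, 2],
                [1, 3, 4],
                [3, 4, 1]])

old_vals = [1, 3, 4]
new_vals = [2, 4, 1]

# Build every mask from the untouched array first...
masks = [arr == old for old in old_vals]

# ...then assign; the masks no longer depend on the current contents of arr.
for mask, new in zip(masks, new_vals):
    arr[mask] = new

print(arr)
```

This still loops in Python, but only over the handful of replacement pairs, not over the array elements.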
You want arr % 4 + 1, except in the case of 2, which stays the same. So use np.where to find all the 2s. Then do arr % 4 + 1, then reset all the 2s.
import numpy as np
np.random.seed(2)
arr=np.random.randint(1,5,(3,3),int)
twos = np.where(arr == 2)
arr = arr % 4 + 1
arr[twos] = 2
print(arr)
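Another option, not taken from the answers above, is a lookup table indexed by the array values. This sketch assumes the values are small non-negative integers, so that a table of size arr.max() + 1 is cheap to build.

```python
import numpy as np

arr = np.array([[1, 4, 2],
                [1, 3, 4],
                [3, 4, 1]])

# Identity table: by default every value maps to itself
lut = np.arange(arr.max() + 1)
# Overwrite only the entries that should change: 1 -> 2, 3 -> 4, 4 -> 1
lut[[1, 3, 4]] = [2, 4, 1]

# Fancy indexing applies the whole mapping in one step
mapped = lut[arr]
print(mapped)
```

Because the table is built before any value is read, the order of the replacements does not matter here either.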
I came up with this question while trying to apply a Caesar cipher to a matrix with a different shift value for each row, i.e. given a matrix X
array([[1, 0, 8],
[5, 1, 4],
[2, 1, 1]])
with shift values of S = array([0, 1, 1]), the output needs to be
array([[1, 0, 8],
[1, 4, 5],
[1, 1, 2]])
This is easy to implement by the following code:
Y = []
for i in range(X.shape[0]):
    if (S[i] > 0):
        Y.append( X[i,S[i]::].tolist() + X[i,:S[i]:].tolist() )
    else:
        Y.append(X[i,:].tolist())
Y = np.array(Y)
This is a left-cycle-shift. I wonder how to do this in a more efficient way using numpy arrays?
Update: This example applies the shift to the columns of a matrix. Suppose that we have a 3D array
array([[[8, 1, 8],
[8, 6, 2],
[5, 3, 7]],
[[4, 1, 0],
[5, 9, 5],
[5, 1, 7]],
[[9, 8, 6],
[5, 1, 0],
[5, 5, 4]]])
Then, the cyclic right shift of S = array([0, 0, 1]) over the columns leads to
array([[[8, 1, 7],
[8, 6, 8],
[5, 3, 2]],
[[4, 1, 7],
[5, 9, 0],
[5, 1, 5]],
[[9, 8, 4],
[5, 1, 6],
[5, 5, 0]]])
Approach #1 : Use modulus to implement the cyclic pattern and get the new column indices and then simply use advanced-indexing to extract the elements, giving us a vectorized solution, like so -
def cyclic_slice(X, S):
    m,n = X.shape
    idx = np.mod(np.arange(n) + S[:,None],n)
    return X[np.arange(m)[:,None], idx]
Approach #2 : We can also leverage strides for a further speedup. The idea is to append a copy of the leading columns to the end of the array, create sliding windows whose length equals the number of columns, and finally index into the appropriate window for each row to get the same rolled-over effect. The implementation would be like so -
def cyclic_slice_strided(X, S):
    X2 = np.column_stack((X,X[:,:-1]))
    s0,s1 = X2.strides
    strided = np.lib.stride_tricks.as_strided
    m,n1 = X.shape
    n2 = X2.shape[1]
    X2_3D = strided(X2, shape=(m,n2-n1+1,n1), strides=(s0,s1,s1))
    return X2_3D[np.arange(len(S)),S]
Sample run -
In [34]: X
Out[34]:
array([[1, 0, 8],
[5, 1, 4],
[2, 1, 1]])
In [35]: S
Out[35]: array([0, 1, 1])
In [36]: cyclic_slice(X, S)
Out[36]:
array([[1, 0, 8],
[1, 4, 5],
[1, 1, 2]])
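The index computation from approach #1 can also be fed to np.take_along_axis (assuming NumPy 1.15 or later). The sketch below is a variant written for illustration, with a made-up helper name, and checks the result against a row-by-row np.roll.

```python
import numpy as np

def cyclic_slice_take(X, S):
    """Left-shift row i of X cyclically by S[i] (hypothetical helper name)."""
    n = X.shape[1]
    # Column to read for every output position, one row of indices per row of X
    idx = (np.arange(n) + S[:, None]) % n
    return np.take_along_axis(X, idx, axis=1)

X = np.array([[1, 0, 8],
              [5, 1, 4],
              [2, 1, 1]])
S = np.array([0, 1, 1])

out = cyclic_slice_take(X, S)
# Reference result built row by row with np.roll
ref = np.array([np.roll(row, -s) for row, s in zip(X, S)])

print(out)
print(np.array_equal(out, ref))
```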
Runtime test -
In [75]: X = np.random.rand(10000,100)
...: S = np.random.randint(0,100,(10000))
# @Moses Koledoye's soln
In [76]: %%timeit
...: Y = []
...: for i, x in zip(S, X):
...:     Y.append(np.roll(x, -i))
10 loops, best of 3: 108 ms per loop
In [77]: %timeit cyclic_slice(X, S)
100 loops, best of 3: 14.1 ms per loop
In [78]: %timeit cyclic_slice_strided(X, S)
100 loops, best of 3: 4.3 ms per loop
Adaptation for the 3D case
Adapting approach #1 for the 3D case, we would have -
shift = 'left'
axis = 1 # axis along which S is to be used (axis=1 for rows)
n = X.shape[axis]
if shift == 'left':
    Sa = S
else:
    Sa = -S
# For rows
idx = np.mod(np.arange(n)[:,None] + Sa,n)
out = X[:,idx, np.arange(len(S))]
# For columns
idx = np.mod(Sa[:,None] + np.arange(n),n)
out = X[:,np.arange(len(S))[:,None], idx]
# For axis=0
idx = np.mod(np.arange(n)[:,None] + Sa,n)
out = X[idx, np.arange(len(S))]
There could be a way to have a generic solution for a generic axis, but I will keep it to this point.
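As a sanity check, here is the "For rows" variant applied to the 3D example from the question's update, with a right shift of S = [0, 0, 1]. This is only a worked instance of the code above on that one input, not a generic-axis solution.

```python
import numpy as np

X = np.array([[[8, 1, 8], [8, 6, 2], [5, 3, 7]],
              [[4, 1, 0], [5, 9, 5], [5, 1, 7]],
              [[9, 8, 6], [5, 1, 0], [5, 5, 4]]])
S = np.array([0, 0, 1])

n = X.shape[1]
Sa = -S  # right shift, following the sign convention used above

# idx[i, j] is the source row for output row i in column j
idx = np.mod(np.arange(n)[:, None] + Sa, n)
out = X[:, idx, np.arange(len(S))]

print(out)
```

Only the third column of each 3x3 block moves, since the first two shift values are 0.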
You could shift each row using np.roll and use the new rows to build the output array:
Y = []
for i, x in zip(S, X):
    Y.append(np.roll(x, -i))
print(np.array(Y))
array([[1, 0, 8],
[1, 4, 5],
[1, 1, 2]])
I'm trying to lexicographically rank array components. The below code works fine, but I'd like to assign equal ranks to equal elements.
import numpy as np
values = np.asarray([
[1, 2, 3],
[1, 1, 1],
[2, 2, 3],
[1, 2, 3],
[1, 1, 2]
])
# need to flip, because for `np.lexsort` last
# element has highest priority.
values_reversed = np.fliplr(values)
# this returns the order, i.e. the order in
# which the elements should be in a sorted
# array (not the rank by index).
order = np.lexsort(values_reversed.T)
# convert order to ranks.
n = values.shape[0]
ranks = np.empty(n, dtype=int)
# use order to assign ranks.
ranks[order] = np.arange(n)
The ranks variable contains [2, 0, 4, 3, 1], but a rank array of [2, 0, 4, 2, 1] is required because the elements [1, 2, 3] (index 0 and 3) share the same rank. Consecutive rank numbers are ok, so [2, 0, 3, 2, 1] is also an acceptable rank array.
Here's one approach -
# Get lexsorted indices and hence sorted values by those indices
lexsort_idx = np.lexsort(values.T[::-1])
lexsort_vals = values[lexsort_idx]
# Mask of steps where rows shift (there are no duplicates in subsequent rows)
mask = np.r_[True,(lexsort_vals[1:] != lexsort_vals[:-1]).any(1)]
# Get the stepped indices (indices shift at non duplicate rows) and
# the index values are scaled corresponding to row numbers
stepped_idx = np.maximum.accumulate(mask*np.arange(mask.size))
# Re-arrange the stepped indices based on the original order of rows
# This is basically same as the original code does in last 4 steps,
# just in a concise manner
out_idx = stepped_idx[lexsort_idx.argsort()]
Sample step-by-step intermediate outputs -
In [55]: values
Out[55]:
array([[1, 2, 3],
[1, 1, 1],
[2, 2, 3],
[1, 2, 3],
[1, 1, 2]])
In [56]: lexsort_idx
Out[56]: array([1, 4, 0, 3, 2])
In [57]: lexsort_vals
Out[57]:
array([[1, 1, 1],
[1, 1, 2],
[1, 2, 3],
[1, 2, 3],
[2, 2, 3]])
In [58]: mask
Out[58]: array([ True, True, True, False, True], dtype=bool)
In [59]: stepped_idx
Out[59]: array([0, 1, 2, 2, 4])
In [60]: lexsort_idx.argsort()
Out[60]: array([2, 0, 4, 3, 1])
In [61]: stepped_idx[lexsort_idx.argsort()]
Out[61]: array([2, 0, 4, 2, 1])
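Since the question accepts consecutive rank numbers as well, a shorter alternative (not part of the answer above) is np.unique on the rows. This sketch assumes NumPy 1.13 or later for the axis argument and produces the dense variant of the ranks.

```python
import numpy as np

values = np.asarray([[1, 2, 3],
                     [1, 1, 1],
                     [2, 2, 3],
                     [1, 2, 3],
                     [1, 1, 2]])

# Unique rows come back sorted lexicographically; the inverse index says
# which unique row each original row maps to, which is a dense rank.
_, inverse = np.unique(values, axis=0, return_inverse=True)
# ravel guards against shape differences of the inverse across NumPy versions
dense_ranks = np.ravel(inverse)

print(dense_ranks)
```

For this input the result is [2, 0, 3, 2, 1], the second form the question lists as acceptable.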
Performance boost
To compute lexsort_idx.argsort() more efficiently, we could use the following function, which is identical to the last 4 lines of the original code -
def argsort_unique(idx):
    # Original idea : http://stackoverflow.com/a/41242285/3293881 by @Andras
    n = idx.size
    sidx = np.empty(n,dtype=int)
    sidx[idx] = np.arange(n)
    return sidx
Thus, lexsort_idx.argsort() could be alternatively computed with argsort_unique(lexsort_idx).
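A quick way to convince yourself of the equivalence is to compare both on a permutation. The sketch below repeats argsort_unique so that it runs on its own; the random test permutation and its seed are arbitrary choices.

```python
import numpy as np

def argsort_unique(idx):
    # Inverse permutation via scatter: no sorting needed
    n = idx.size
    sidx = np.empty(n, dtype=int)
    sidx[idx] = np.arange(n)
    return sidx

rng = np.random.default_rng(0)
perm = rng.permutation(1000)   # arbitrary test permutation

same = np.array_equal(argsort_unique(perm), perm.argsort())
print(same)
```

Note that this only holds when idx is a permutation of 0..n-1, which is always the case for the output of np.lexsort.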
Runtime test
Applying a few more optimization tricks, we would have a version like so -
def numpy_app(values):
    lexsort_idx = np.lexsort(values.T[::-1])
    lexsort_v = values[lexsort_idx]
    mask = np.concatenate(( [False],(lexsort_v[1:] == lexsort_v[:-1]).all(1) ))
    stepped_idx = np.arange(mask.size)
    stepped_idx[mask] = 0
    np.maximum.accumulate(stepped_idx, out=stepped_idx)
    return stepped_idx[argsort_unique(lexsort_idx)]
@Warren Weckesser's rankdata based method as a func for timings -
from scipy.stats import rankdata

def scipy_app(values):
    v = values.view(np.dtype(','.join([values.dtype.str]*values.shape[1])))
    return rankdata(v, method='min') - 1
Timings -
In [97]: a = np.random.randint(0,9,(10000,3))
In [98]: out1 = numpy_app(a)
In [99]: out2 = scipy_app(a)
In [100]: np.allclose(out1, out2)
Out[100]: True
In [101]: %timeit scipy_app(a)
100 loops, best of 3: 5.32 ms per loop
In [102]: %timeit numpy_app(a)
100 loops, best of 3: 1.96 ms per loop
Here's a way to do it using scipy.stats.rankdata (with method='min'), by viewing the 2-d array as a 1-d structured array:
In [15]: values
Out[15]:
array([[1, 2, 3],
[1, 1, 1],
[2, 2, 3],
[1, 2, 3],
[1, 1, 2]])
In [16]: v = values.view(np.dtype(','.join([values.dtype.str]*values.shape[1])))
In [17]: rankdata(v, method='min') - 1
Out[17]: array([2, 0, 4, 2, 1])