While trying to convert MATLAB code to Python, I am running into a multidimensional (4D) array multiplication issue.
How do I get the same result as MATLAB's using Python/NumPy?
Python 3 NumPy code & result:
A = np.arange(1,25).reshape((2, 3, 2, 2))
B = np.array([1,10,100])
A * B[np.newaxis,:, np.newaxis, np.newaxis]
array([[[[ 1, 2],
[ 3, 4]],
[[ 50, 60],
[ 70, 80]],
[[ 900, 1000],
[1100, 1200]]],
[[[ 13, 14],
[ 15, 16]],
[[ 170, 180],
[ 190, 200]],
[[2100, 2200],
[2300, 2400]]]])
MATLAB Code & result:
A = reshape(1:24, 2,3,2,2)
B = [1 10 100]
A .* B
ans(:,:,1,1) =
1 30 500
2 40 600
ans(:,:,2,1) =
7 90 1100
8 100 1200
ans(:,:,1,2) =
13 150 1700
14 160 1800
ans(:,:,2,2) =
19 210 2300
20 220 2400
NumPy uses row-major (C) ordering, as Divakar commented, whereas MATLAB uses column-major ordering; NumPy also indexes from 0.
So you can do the following:
import numpy as np
A = np.arange(1,25).reshape((2, 2, 3, 2))
B = np.array([1,10,100])
ans = A * B[np.newaxis, np.newaxis, :, np.newaxis]
ans = np.transpose(ans)
print(ans[:,:,0,0])
print(ans[:,:,1,0])
print(ans[:,:,0,1])
print(ans[:,:,1,1])
Out:
[[ 1 30 500]
[ 2 40 600]]
[[ 7 90 1100]
[ 8 100 1200]]
[[ 13 150 1700]
[ 14 160 1800]]
[[ 19 210 2300]
[ 20 220 2400]]
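Alternatively (a minimal sketch of my own, not part of the answer above), you can tell NumPy to fill the array in column-major order directly with reshape(..., order='F'), so the broadcasting axis lines up with MATLAB's second dimension without any transpose:
import numpy as np

A = np.arange(1, 25).reshape((2, 3, 2, 2), order='F')   # column-major fill, like MATLAB's reshape
B = np.array([1, 10, 100])
ans = A * B[np.newaxis, :, np.newaxis, np.newaxis]
print(ans[:, :, 0, 0])   # -> [[  1  30 500]
                         #     [  2  40 600]], matching MATLAB's ans(:,:,1,1)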
I have a 3d array and I need to set its right part to zero. For each 2d slice (n, :, :) of the array, the index of the column should be taken from the vector b. This index defines the separating point between the left and right parts, as shown in the example arrays below.
a_before = [[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]
[13 14 15 16]]
[[17 18 19 20]
[21 22 23 24]
[25 26 27 28]
[29 30 31 32]]
[[33 34 35 36]
[37 38 39 40]
[41 42 43 44]
[45 46 47 48]]]
a_before.shape = (3, 4, 4)
b = (2, 3, 1)
a_after_1 = [[[ 1 2 0 0]
[ 5 6 0 0]
[ 9 10 0 0]
[13 14 0 0]]
[[17 18 19 0]
[21 22 23 0]
[25 26 27 0]
[29 30 31 0]]
[[33 0 0 0]
[37 0 0 0]
[41 0 0 0]
[45 0 0 0]]]
After this, for each 2d slice (n, :, :) I have to take index of the column from c vector and multiply by the corresponding value taken from the vector d.
c = (1, 2, 0)
d = (50, 100, 150)
a_after_2 = [[[ 1 100 0 0]
[ 5 300 0 0]
[ 9 500 0 0]
[13 700 0 0]]
[[17 18 1900 0]
[21 22 2300 0]
[25 26 2700 0]
[29 30 3100 0]]
[[4950 0 0 0]
[5550 0 0 0]
[6150 0 0 0]
[6750 0 0 0]]]
I did it but my version looks ugly. Maybe someone can help me.
P.S. I would like to avoid for loops and use only numpy methods.
Thank You.
Here's a version without loops.
In [232]: A = np.arange(1,49).reshape(3,4,4)
In [233]: b = np.array([2,3,1])
In [234]: d = np.array([50,100,150])
In [235]: I,J = np.nonzero(b[:,None]<=np.arange(4))
In [236]: A[I,:,J]=0
In [237]: A[np.arange(3),:,b-1] *= d[:,None]
In [238]: A
Out[238]:
array([[[ 1, 100, 0, 0],
[ 5, 300, 0, 0],
[ 9, 500, 0, 0],
[ 13, 700, 0, 0]],
[[ 17, 18, 1900, 0],
[ 21, 22, 2300, 0],
[ 25, 26, 2700, 0],
[ 29, 30, 3100, 0]],
[[4950, 0, 0, 0],
[5550, 0, 0, 0],
[6150, 0, 0, 0],
[6750, 0, 0, 0]]])
Before I developed this, I wrote an iterative version. It helped me visualize the problem.
In [240]: Ac = np.arange(1,49).reshape(3,4,4)
In [241]:
In [241]: for i,v in enumerate(b):
...: Ac[i,:,v:]=0
...:
In [242]: for i,(bi,di) in enumerate(zip(b,d)):
...: Ac[i,:,bi-1]*=di
It may be easier to understand, and in that sense, less ugly!
The fact that your A has a middle dimension that is just going along for the ride complicates vectorizing the problem.
With a (3,4) 2d array, the solution is just:
In [251]: Ab = Ac[:,0,:]
In [252]: Ab[b[:,None]<=np.arange(4)]=0
In [253]: Ab[np.arange(3),b-1]*=d
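For the full (3, 4, 4) array, a fully broadcast-based variant is also possible (a sketch of my own, not part of the answers above, using np.where with boolean masks built from the column indices):
import numpy as np

a = np.arange(1, 49).reshape(3, 4, 4)
b = np.array([2, 3, 1])
c = np.array([1, 2, 0])
d = np.array([50, 100, 150])

cols = np.arange(a.shape[2])                                     # column indices 0..3
a = np.where(cols >= b[:, None, None], 0, a)                     # zero columns >= b[n] in slice n
scale = np.where(cols == c[:, None, None], d[:, None, None], 1)  # d[n] in column c[n], 1 elsewhere
a = a * scale                                                    # multiply column c[n] of slice n by d[n]
print(a)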
Here it is:
import numpy as np
a = np.arange(1,49).reshape(3,4,4)
b = np.array([2,3,1])
c = np.array([1,2,0])
d = np.array([50,100,150])
for i in range(len(b)):
    a[i,:,b[i]:] = 0
for i,j in enumerate(c):
    a[i,:,j] = a[i,:,j] * d[i]
print(a)
# Output:
[[[ 1 100 0 0]
[ 5 300 0 0]
[ 9 500 0 0]
[ 13 700 0 0]]
[[ 17 18 1900 0]
[ 21 22 2300 0]
[ 25 26 2700 0]
[ 29 30 3100 0]]
[[4950 0 0 0]
[5550 0 0 0]
[6150 0 0 0]
[6750 0 0 0]]]
I'm using skimage.morphology (scikit-image) to do an erosion on a two-dimensional array. I also need to ascertain the distance of each cell to the minimum value identified in the erosion.
Example:
np.reshape(np.arange(1,126,step=5),[5,5])
array([[ 1, 6, 11, 16, 21],
[ 26, 31, 36, 41, 46],
[ 51, 56, 61, 66, 71],
[ 76, 81, 86, 91, 96],
[101, 106, 111, 116, 121]])
erosion(np.reshape(np.arange(1,126,step=5),[5,5]),selem=disk(3))
array([[ 1, 1, 1, 1, 6],
[ 1, 1, 1, 6, 11],
[ 1, 1, 1, 6, 11],
[ 1, 6, 11, 16, 21],
[26, 31, 36, 41, 46]])
Now what I want to do is also return an array that gives me the distance to the minimum like this:
array([[ 0, 1, 2, 3, 3],
[ 1, 1, 2, 3, 3],
[ 2, 2, 3, 3, 3],
[ 3, 3, 3, 3, 3],
[ 3, 3, 3, 3, 3]])
Is there a scikit tool that can do this? If not, any tips on how to efficiently achieve this result?
You can find the distances from the centre of your footprint using scipy.ndimage.distance_transform_cdt, then use SciPy's ndimage.generic_filter to return those values:
import numpy as np
from skimage.morphology import erosion, disk
from scipy import ndimage as ndi
input_arr = np.reshape(np.arange(1,126,step=5),[5,5])
footprint = disk(3)
def distance_from_min(values, distance_values):
    d = np.inf
    min_val = np.inf
    for i in range(len(values)):
        if values[i] <= min_val:
            min_val = values[i]
            d = distance_values[i]
    return d

full_footprint = np.ones_like(footprint, dtype=float)
full_footprint[tuple(i//2 for i in footprint.shape)] = 0
# use `ndi.distance_transform_edt` instead for the euclidean distance
distance_footprint = ndi.distance_transform_cdt(
    full_footprint, metric='taxicab'
)
# set values outside footprint to 0 for pretty-printing
distance_footprint[~footprint.astype(bool)] = 0
# then, extract it into values matching the values in generic_filter
distance_values = distance_footprint[footprint.astype(bool)]

output = ndi.generic_filter(
    input_arr.astype(float),
    distance_from_min,
    footprint=footprint,
    mode='constant',
    cval=np.inf,
    extra_arguments=(distance_values,),
)
print('input:\n', input_arr)
print('footprint:\n', footprint)
print('distance_footprint:\n', distance_footprint)
print('output:\n', output)
Which gives:
input:
[[ 1 6 11 16 21]
[ 26 31 36 41 46]
[ 51 56 61 66 71]
[ 76 81 86 91 96]
[101 106 111 116 121]]
footprint:
[[0 0 0 1 0 0 0]
[0 1 1 1 1 1 0]
[0 1 1 1 1 1 0]
[1 1 1 1 1 1 1]
[0 1 1 1 1 1 0]
[0 1 1 1 1 1 0]
[0 0 0 1 0 0 0]]
distance_footprint:
[[0 0 0 3 0 0 0]
[0 4 3 2 3 4 0]
[0 3 2 1 2 3 0]
[3 2 1 0 1 2 3]
[0 3 2 1 2 3 0]
[0 4 3 2 3 4 0]
[0 0 0 3 0 0 0]]
output:
[[0. 1. 2. 3. 3.]
[1. 2. 3. 3. 3.]
[2. 3. 4. 4. 4.]
[3. 3. 3. 3. 3.]
[3. 3. 3. 3. 3.]]
This function will be very slow, however. If you want to make it faster, you will need (a) a solution like Numba or Cython for the filter function, in conjunction with SciPy LowLevelCallables and (b) to hardcode the distance array into the distance function, because for LowLevelCallables it is more difficult to pass in extra arguments. Here is a full example with llc-tools, which you can install with pip install numba llc-tools.
import numpy as np
from scipy import ndimage as ndi
from skimage.morphology import erosion, disk
import llc
def filter_func_from_footprint(footprint):
    # first, create a footprint where the values are the distance from the
    # center
    full_footprint = np.ones_like(footprint, dtype=float)
    full_footprint[tuple(i//2 for i in footprint.shape)] = 0
    # use `ndi.distance_transform_edt` instead for the euclidean distance
    distance_footprint = ndi.distance_transform_cdt(
        full_footprint, metric='taxicab'
    )
    # then, extract it into values matching the values in generic_filter
    distance_footprint[~footprint.astype(bool)] = 0
    distance_values = distance_footprint[footprint.astype(bool)]
    # finally, create a filter function with the values hardcoded
    @llc.jit_filter_function
    def distance_from_min(values):
        d = np.inf
        min_val = np.inf
        for i in range(len(values)):
            if values[i] <= min_val:
                min_val = values[i]
                d = distance_values[i]
        return d
    return distance_from_min
if __name__ == '__main__':
    input_arr = np.reshape(np.arange(1,126,step=5),[5,5])
    footprint = disk(3)
    eroded = erosion(input_arr, selem=footprint)
    filter_func = filter_func_from_footprint(footprint)
    result = ndi.generic_filter(
        # use input_arr.astype(float) when using euclidean dist
        input_arr,
        filter_func,
        footprint=disk(3),
        mode='constant',
        cval=np.inf,
    )
    print('input:\n', input_arr)
    print('output:\n', result)
Which gives:
input:
[[ 1 6 11 16 21]
[ 26 31 36 41 46]
[ 51 56 61 66 71]
[ 76 81 86 91 96]
[101 106 111 116 121]]
output:
[[0 1 2 3 3]
[1 2 3 3 3]
[2 3 4 4 4]
[3 3 3 3 3]
[3 3 3 3 3]]
For more reading on low-level callables and llc-tools, in addition to the LowLevelCallable documentation on the SciPy site (linked above, plus links therein), you can read these two blog posts I wrote a few years ago:
SciPy's new LowLevelCallable is a game-changer
Prettier LowLevelCallables with Numba JIT and decorators
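As a side note (my own sketch, not from the answer above): switching to the euclidean distance only changes how the distance footprint is built, as the comments in the code already hint; the rest of the pipeline stays the same, with the input cast to float.
import numpy as np
from skimage.morphology import disk
from scipy import ndimage as ndi

footprint = disk(3)
full_footprint = np.ones_like(footprint, dtype=float)
full_footprint[tuple(i // 2 for i in footprint.shape)] = 0    # zero at the centre pixel

# euclidean distance from the centre instead of the taxicab distance
distance_footprint = ndi.distance_transform_edt(full_footprint)
distance_footprint[~footprint.astype(bool)] = 0               # zero outside the footprint for display
distance_values = distance_footprint[footprint.astype(bool)]  # flat values to pass to generic_filter
print(np.round(distance_footprint, 2))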
I have a very simple question: how do I get a NumPy array from multiple lists of the same length and sort it along an axis?
I'm looking for something like:
a = [1,1,2,3,4,5,6]
b = [10,10,11,09,22,20,20]
c = [100,100,111,090,220,200,200]
d = np.asarray(a,b,c)
print d
>>>[[1,10,100],[1,10,100],[2,11,111].........[6,20,200]]
2nd question: if this can be achieved, can I sort it along an axis (for example, on the values of list b)?
3rd question: can the sorting be done over a range? For example, for values between b+10 and b-10, while looking at list c for further sorting, like:
[[1,11,111][1,10,122][1,09,126][1,11,154][1,11,191]
[1,20,110][1,25,122][1,21,154][1,21,155][1,21,184]]
You can zip the lists to get the array:
a = [1, 1, 2, 3, 4, 5, 6]
b = [10, 10, 11, 9, 22, 20, 20]
c = [100, 100, 111, 90, 220, 200, 200]
d = np.asarray(zip(a,b,c))
print(d)
[[ 1 10 100]
[ 1 10 100]
[ 2 11 111]
[ 3 9 90]
[ 4 22 220]
[ 5 20 200]
[ 6 20 200]]
print(d[np.argsort(d[:, 1])]) # a sorted copy
[[ 3 9 90]
[ 1 10 100]
[ 1 10 100]
[ 2 11 111]
[ 5 20 200]
[ 6 20 200]
[ 4 22 220]]
I don't know how you would do an inplace sort without doing something like:
d = np.asarray(zip(a,b,c))
d.dtype = [("0", int), ("1", int), ("2", int)]
d.shape = d.size
d.sort(order="1")
The leading 0 would make 090 invalid (Python 2 tries to read it as an octal literal, and Python 3 rejects leading zeros outright), so I removed it.
You can also sort the zipped elements before you pass them to np.asarray:
from operator import itemgetter
zipped = sorted(zip(a,b,c),key=itemgetter(1))
d = np.asarray(zipped)
print(d)
[[ 3 9 90]
[ 1 10 100]
[ 1 10 100]
[ 2 11 111]
[ 5 20 200]
[ 6 20 200]
[ 4 22 220]]
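One more aside (my own note, not part of the answers above): in Python 3, zip returns an iterator, so np.asarray(zip(a, b, c)) no longer builds the 2-D array directly; np.column_stack sidesteps that, and the sort works the same way:
import numpy as np

a = [1, 1, 2, 3, 4, 5, 6]
b = [10, 10, 11, 9, 22, 20, 20]
c = [100, 100, 111, 90, 220, 200, 200]

d = np.column_stack((a, b, c))                    # shape (7, 3), one row per element
print(d[np.argsort(d[:, 1], kind='stable')])      # rows sorted by the b column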
You can use np.dstack and np.lexsort. For example, if you want to sort based on array b (the second column), then a, then c, remember that np.lexsort treats the last key as the primary sort key:
>>> d=np.dstack((a,b,c))[0]
>>> indices=np.lexsort((d[:,2],d[:,0],d[:,1]))
>>> d[indices]
array([[ 3, 9, 90],
[ 1, 10, 100],
[ 1, 10, 100],
[ 2, 11, 111],
[ 5, 20, 200],
[ 6, 20, 200],
[ 4, 22, 220]])
I'm worried that this might be a really stupid question. However, I can't find a solution.
I want to do the following operation in python without using a loop, because I am dealing with large size arrays.
Is there any suggestion?
import numpy as np
a = np.array([1,2,3,..., N]) # arbitrary 1d array
b = np.array([[1,2,3],[4,5,6],[7,8,9]]) # arbitrary 2d array
c = np.zeros((N,3,3))
c[0,:,:] = a[0]*b
c[1,:,:] = a[1]*b
c[2,:,:] = a[2]*b
c[3,:,:] = ...
...
...
c[N-1,:,:] = a[N-1]*b
My answer uses only NumPy primitives, in particular for the array multiplication (what you want to do has a name: it is an outer product).
Because np.outer always returns a 2-D array, we have to reshape the result, but this is very cheap because the data block of the ndarray is not touched.
% python
Python 2.7.8 (default, Oct 18 2014, 12:50:18)
[GCC 4.9.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> a = np.array((1,2))
>>> b = np.array([[n*m for m in (1,2,3,4,5,6)] for n in (10,100,1000)])
>>> print b
[[ 10 20 30 40 50 60]
[ 100 200 300 400 500 600]
[1000 2000 3000 4000 5000 6000]]
>>> print np.outer(a,b)
[[ 10 20 30 40 50 60 100 200 300 400 500 600
1000 2000 3000 4000 5000 6000]
[ 20 40 60 80 100 120 200 400 600 800 1000 1200
2000 4000 6000 8000 10000 12000]]
>>> print "Almost there!"
Almost there!
>>> print np.outer(a,b).reshape(a.shape[0],b.shape[0], b.shape[1])
[[[ 10 20 30 40 50 60]
[ 100 200 300 400 500 600]
[ 1000 2000 3000 4000 5000 6000]]
[[ 20 40 60 80 100 120]
[ 200 400 600 800 1000 1200]
[ 2000 4000 6000 8000 10000 12000]]]
>>>
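The reshape can also be written with a tuple concatenation of the two shapes, which avoids hard-coding how many dimensions b has (a small variant of the line above, not part of the original answer):
import numpy as np

a = np.array((1, 2))
b = np.array([[n * m for m in (1, 2, 3, 4, 5, 6)] for n in (10, 100, 1000)])

c = np.outer(a, b).reshape(a.shape + b.shape)   # shape (2, 3, 6)
print(c.shape)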
To avoid Python-level loops, you could use np.newaxis to expand a (or None, which is the same thing):
>>> a = np.arange(1,5)
>>> b = np.arange(1,10).reshape((3,3))
>>> a[:,None,None]*b
array([[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9]],
[[ 2, 4, 6],
[ 8, 10, 12],
[14, 16, 18]],
[[ 3, 6, 9],
[12, 15, 18],
[21, 24, 27]],
[[ 4, 8, 12],
[16, 20, 24],
[28, 32, 36]]])
Or np.einsum, which is overkill here, but is often handy and makes it very explicit what you want to happen with the coordinates:
>>> c2 = np.einsum('i,jk->ijk', a, b)
>>> np.allclose(c2, a[:,None,None]*b)
True
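A closely related option (not mentioned in the answers above) is np.multiply.outer, which applies the multiplication ufunc as an outer product and keeps both operands' shapes, so no reshape or newaxis is needed:
import numpy as np

a = np.arange(1, 5)
b = np.arange(1, 10).reshape((3, 3))

c3 = np.multiply.outer(a, b)                     # shape (4, 3, 3); c3[i] == a[i] * b
print(np.allclose(c3, a[:, None, None] * b))     # True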
I didn't understand this multiplication, but here is a way to do matrix multiplication in Python using NumPy:
import numpy as np
a = np.matrix([1, 2])
b = np.matrix([[1, 2], [3, 4]])
result = a*b
print(result)
>>> result
matrix([[ 7, 10]])
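As an aside (my own note, not part of the answer above): np.matrix is discouraged in current NumPy, and the same product with plain arrays and the @ operator looks like this:
import numpy as np

a = np.array([1, 2])
b = np.array([[1, 2], [3, 4]])
print(a @ b)   # [ 7 10]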
Hi there, I have the following matrix:
[[ 47 43 51 81 54 81 52 54 31 46]
[ 35 21 30 16 37 11 35 30 39 37]
[ 8 17 11 2 5 4 11 9 17 10]
[ 5 9 4 0 1 1 0 3 9 3]
[ 2 7 2 0 0 0 0 1 2 1]
[215 149 299 199 159 325 179 249 249 199]
[ 27 49 24 4 21 8 35 15 45 25]
[100 100 100 100 100 100 100 100 100 100]]
I need to iterate over the matrix summing all elements in rows 0,1,2,3,4 only
example: I need
row_0_sum = 47+43+51+81....46
Furthermore, I need to store each row's sum in an array like this:
[row0_sum, row1_sum, row2_sum, row3_sum, row4_sum]
So far I have tried this code, but it's not doing the job:
mu = np.zeros(shape=(1,6))
#get an average
def standardize_ratings(matrix):
    sum = 0
    for i, eli in enumerate(matrix):
        for j, elj in enumerate(eli):
            if(i<5):
                sum = sum + matrix[i][j]
                if(j==elj.len -1):
                    mu[i] = sum
                    sum = 0
                    print "mu[i]="
                    print mu[i]
This just gives me an error: 'numpy.int32' object has no attribute 'len'.
So can someone help me? What's the best way to do this, and which type of array in Python should I use to store the sums? I'm new to Python but have done programming before.
Thanks
Make your data, matrix, a numpy.ndarray object, instead of a list of lists, and then just do matrix.sum(axis=1).
>>> matrix = np.asarray([[ 47, 43, 51, 81, 54, 81, 52, 54, 31, 46],
[ 35, 21, 30, 16, 37, 11, 35, 30, 39, 37],
[ 8, 17, 11, 2, 5, 4, 11, 9, 17, 10],
[ 5, 9, 4, 0, 1, 1, 0, 3, 9, 3],
[ 2, 7, 2, 0, 0, 0, 0, 1, 2, 1],
[215, 149, 299, 199, 159, 325, 179, 249, 249, 199],
[ 27, 49, 24, 4, 21, 8, 35, 15, 45, 25],
[100, 100, 100, 100, 100, 100, 100, 100, 100, 100]])
>>> print matrix.sum(axis=1)
[ 540 291 94 35 15 2222 253 1000]
To get the first five rows from the result, you can just do:
>>> row_sums = matrix.sum(axis=1)
>>> rows_0_through_4_sums = row_sums[:5]
>>> print rows_0_through_4_sums
[540 291 94 35 15]
Or, you can alternatively sub-select only those rows to begin with and only apply the summation to them:
>>> rows_0_through_4 = matrix[:5,:]
>>> print rows_0_through_4.sum(axis=1)
[540 291 94 35 15]
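If the eventual goal is a per-row average (the asker's function is called standardize_ratings and fills a mu array, so this is my guess at the intent), a short sketch of that step, reusing the matrix array defined above:
row_means = matrix[:5, :].mean(axis=1)   # mean of each of the first five rows
print(row_means)                         # -> 54.0, 29.1, 9.4, 3.5, 1.5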
Some helpful links:
NumPy for Matlab Users, if you are familiar with these things in Matlab/Octave
Slicing/Indexing in NumPy