Getting a subset of numpy array indices the easy way - python

Can this for loop be written in a simpler way?
import itertools
import numpy as np
def f(a, b, c):  # placeholder for a complex function
    print(a+b+c)
a = np.arange(12).reshape(3, 4)
for y, x in itertools.product(range(a.shape[0]-1), range(a.shape[1]-1)):
    f(a[y, x], a[y, x+1], a[y+1, x])
The other options I tried look more convoluted, e.g.:
it = np.nditer(a[:-1, :-1], flags=['multi_index'])
for e in it:
    y, x = it.multi_index
    f(a[y, x], a[y, x+1], a[y+1, x])

Posting this as an answer, and sorry if it is too obvious, but isn't this simply:
for y in range(a.shape[0]-1):
    for x in range(a.shape[1]-1):
        f(a[y, x], a[y, x+1], a[y+1, x])

Using your method, I get:
expected = [5, 8, 11, 17, 20, 23]
But you can vectorize the computation by first arranging the data in a more suitable layout:
a_stacked = np.stack([a[:-1, :-1], a[:-1, 1:], a[1:, :-1]], axis=0)
From there, there are several options:
If you already know the function will be the sum:
>>> a_stacked.sum(axis=0)
array([[ 5, 8, 11],
[17, 20, 23]])
If you know that your function is already vectorized:
>>> f(*a_stacked)
array([[ 5, 8, 11],
[17, 20, 23]])
If your function does not vectorize, you can use np.vectorize for convenience (no performance improvement):
>>> np.vectorize(f)(*a_stacked)
array([[ 5, 8, 11],
[17, 20, 23]])
You can then flatten the result if you need the values in the same order as the loop.
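As a rough check that the shifted slices match the loop, here is a minimal sketch. It assumes f returns its result (instead of printing it) and works elementwise on arrays; the names looped and vectorized are only for illustration.

```python
import numpy as np

def f(a, b, c):
    # Placeholder for a function that works elementwise on arrays.
    return a + b + c

a = np.arange(12).reshape(3, 4)

# Loop version: collect the results instead of printing them.
looped = [
    f(a[y, x], a[y, x + 1], a[y + 1, x])
    for y in range(a.shape[0] - 1)
    for x in range(a.shape[1] - 1)
]

# Vectorized version: three shifted views of the same array.
vectorized = f(a[:-1, :-1], a[:-1, 1:], a[1:, :-1])

# Flattening in C order reproduces the loop order.
assert vectorized.ravel().tolist() == looped
```

The slices are views, so no data is copied until f computes its result.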

Related

Is it possible to make this function on numpy array more efficient?

Here a is a 1D array of integer indices. For context, a holds the atom indices of the first molecule. The function returns the atom indices for n identical molecules, each containing step atoms; in other words, it applies the same atom selection to every molecule.
def f(a, step, n):
    a.shape=1,-1
    got = np.repeat(a, n, axis=0)
    blocks = np.arange(0, n*step, step)
    blocks.shape = n, -1
    return got + blocks
For example
In [52]: f(np.array([1,3,4]), 10, 4)
Out[52]:
array([[ 1, 3, 4],
[11, 13, 14],
[21, 23, 24],
[31, 33, 34]])
It looks like broadcasting should be enough:
def f(a, step, n):
    return a + np.arange(0, n*step, step)[:, None]
f(np.array([1,3,4]), 10, 4)
Output:
array([[ 1, 3, 4],
[11, 13, 14],
[21, 23, 24],
[31, 33, 34]])
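A small sketch comparing the two approaches. The names f_repeat and f_broadcast are hypothetical, and f_repeat is a variant of the question's function that reshapes a copy, since assigning to a.shape changes the caller's array in place.

```python
import numpy as np

def f_repeat(a, step, n):
    # Same idea as the question's version, without modifying the input's shape.
    row = np.asarray(a).reshape(1, -1)
    offsets = np.arange(0, n * step, step).reshape(n, 1)
    return np.repeat(row, n, axis=0) + offsets

def f_broadcast(a, step, n):
    # The (n, 1) offsets broadcast against the (len(a),) index array.
    return np.asarray(a) + np.arange(0, n * step, step)[:, None]

sel = np.array([1, 3, 4])
result = f_broadcast(sel, 10, 4)

assert np.array_equal(result, f_repeat(sel, 10, 4))
assert sel.shape == (3,)  # the input was not reshaped in place
```

Broadcasting avoids materializing the repeated copy of a before the addition.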

Iterate and replace values through a numpy array Python

I want to create a numpy array with an arbitrary number of dimensions, iterate over it, and replace every value with 10, for example.
I tried:
# Import numpy library
import numpy as np
def Iter_Replace(x):
    print(x)
    for i in range(x):
        x[i] = 10
    print(x)
def main():
    x = np.array(([1,2,2], [1,4,3]))
    Iter_Replace(x)
main()
But I'm getting this error:
TypeError: only integer scalar arrays can be converted to a scalar index
There is a numpy function for this, numpy.full or numpy.full_like:
>>> x = np.array(([1,2,2], [1,4,3]))
>>> np.full(x.shape, 10)
array([[10, 10, 10],
[10, 10, 10]])
# OR,
>>> np.full_like(x, 10)
array([[10, 10, 10],
[10, 10, 10]])
If you want to iterate you can either use itertools.product:
>>> from itertools import product
>>> def Iter_Replace(x):
...     indices = product(*map(range, x.shape))
...     for index in indices:
...         x[tuple(index)] = 10
...     return x
>>> x = np.array([[1,2,2], [1,4,3]])
>>> Iter_Replace(x)
array([[10, 10, 10],
[10, 10, 10]])
Or use np.ndindex:
>>> x = np.array([[1,2,2], [1,4,3]])
>>> for index in np.ndindex(x.shape):
...     x[index] = 10
>>> x
array([[10, 10, 10],
[10, 10, 10]])
You have two errors. There are missing parentheses in the first line of main:
x = np.array(([1,2,2], [1,4,3]))
And you must replace range(x) with range(len(x)) in the Iter_Replace function, because range expects an integer, not an array.
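If the goal is to overwrite the existing array rather than build a new one, a loop is not strictly needed. The sketch below uses ellipsis assignment; replace_all is a hypothetical helper name, not part of numpy.

```python
import numpy as np

def replace_all(x, value=10):
    # x[...] = value writes into the existing buffer, whatever the number of dimensions.
    x[...] = value
    return x

x = np.array([[1, 2, 2], [1, 4, 3]])
replace_all(x)
print(x)
```

Unlike np.full_like, this modifies x in place, so other references to the same array see the change.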

More "pythonic" way to show a 4d matrix in 2d

I would like to plot a 4d matrix as a 2d matrix with indices:
[i][j][k][l] --> [i * nj + j][ k * nl + l]
I have a working version here.
This is working as I want, but it's not very elegant. I looked into "reshape" but this is not exactly what I'm looking for, or perhaps I am using it incorrectly.
Given an array "r" of 4d points with shape (100000,4), the relevant snippet I want to replace is:
def transform(i,j,k,l, s1, s2):
    return [i * s1 + j, k * s2 + l]
nx = 5
ny = 11
iedges=np.linspace(0,100, nx)
jedges=np.linspace(0, 20, ny)
bins = ( iedges,jedges,iedges,jedges )
H, edges = np.histogramdd(r, bins=bins )
H2 = np.zeros(( (nx-1)*(ny-1),(nx-1)*(ny-1)))
for i in range(nx-1):
    for j in range(ny-1):
        for k in range(nx-1):
            for l in range(ny-1):
                x,y = transform(i,j,k,l,ny-1,ny-1)
                H2[x][y] = H[i][j][k][l]
In this case the values of H2 correspond to the values of H, but the entry i,j,k,l is displayed at row i*(ny-1) + j, column k*(ny-1) + l.
Example plot:
Are you sure reshape doesn't work?
I ran your code on a small random r. The nonzero terms of H are:
In [13]: np.argwhere(H)
Out[13]:
array([[0, 9, 3, 1],
[1, 1, 1, 2],
[1, 2, 1, 3],
[2, 2, 2, 3],
[3, 1, 1, 8]])
and for the transformed H2:
In [14]: np.argwhere(H2)
Out[14]:
array([[ 9, 31],
[11, 12],
[12, 13],
[22, 23],
[31, 18]])
And one of the H indices transforms to H2 indices with (s1 and s2 are both ny-1 = 10, as in your loop):
In [16]: transform(0,9,3,1,10,10)
Out[16]: [9, 31]
If I simply reshape H, I get the same array as H2:
In [17]: H3=H.reshape(40,40)
In [18]: np.argwhere(H3)
Out[18]:
array([[ 9, 31],
[11, 12],
[12, 13],
[22, 23],
[31, 18]])
In [19]: np.allclose(H2,H3)
Out[19]: True
So without delving into the details of your code, it looks to me like a simple reshape.
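The equivalence can be checked on synthetic data. This is a minimal sketch that assumes a C-ordered array with the same (4, 10, 4, 10) shape as H; the random integers stand in for the histogram counts.

```python
import numpy as np

rng = np.random.default_rng(0)
ni, nj, nk, nl = 4, 10, 4, 10  # same shape as H in the question
H = rng.integers(0, 5, size=(ni, nj, nk, nl))

# Explicit mapping [i][j][k][l] -> [i*nj + j][k*nl + l]
H2 = np.zeros((ni * nj, nk * nl), dtype=H.dtype)
for i, j, k, l in np.ndindex(H.shape):
    H2[i * nj + j, k * nl + l] = H[i, j, k, l]

# Merging adjacent axes with reshape gives the same layout.
H3 = H.reshape(ni * nj, nk * nl)
assert np.array_equal(H2, H3)
```

This works because reshape merges adjacent axes in row-major order, which is exactly the i*nj + j rule.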
It looks like you can calculate i, j, k, l from x, y. That would be something like:
from functools import partial
def get_ijkl(x, y, s1, s2):
    # "Reverse" of `transform`
    i, j = divmod(x, s1)
    k, l = divmod(y, s2)
    return (i, j, k, l)
def get_2d_val(x, y, s1, s2, four_dim_array):
    return four_dim_array[get_ijkl(x, y, s1, s2)]
smaller_shape = ((nx-1)*(ny-1), (nx-1)*(ny-1))
Knowing this, there are several possible approaches:
numpy.fromfunction:
H3 = np.fromfunction(
    partial(get_2d_val, s1=ny-1, s2=ny-1, four_dim_array=H),
    shape=smaller_shape,
    dtype=int,
)
assert np.all(H2 == H3)
by indexing:
indices_to_take = np.array([
    [list(get_ijkl(x, y, ny-1, ny-1)) for x in range(smaller_shape[0])] for y in range(smaller_shape[1])
]).transpose()
H4 = H[tuple(indices_to_take)]
assert np.all(H2 == H4)
As answered by @hpaulj, you can simply reshape the array, and that will be faster. But if you have a different transform and can write the appropriate "reverse" function, then fromfunction or custom indexing becomes useful.
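For the row-major mapping used here, numpy's np.unravel_index performs the same decomposition as the divmod call in get_ijkl. A small sketch, assuming the index pair sizes (4, 10) from the question:

```python
import numpy as np

nj = 10                      # size of the second index in each pair (ny-1 in the question)
rows = np.arange(4 * nj)     # every row number of the 2D array

# divmod, as in get_ijkl
i_div, j_div = np.divmod(rows, nj)

# np.unravel_index does the same split for C-ordered shapes
i_unr, j_unr = np.unravel_index(rows, (4, nj))

assert np.array_equal(i_div, i_unr)
assert np.array_equal(j_div, j_unr)
assert np.array_equal(i_unr * nj + j_unr, rows)  # round trip through the transform rule
```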

Merging rows in numpy to form new array

This is a sample of what I am trying to accomplish. I am very new to python and have searched for hours to find out what I am doing wrong. I haven't been able to find what my issue is. I am still new enough that I may be searching for the wrong phrases. If so, could you please point me in the right direction?
I want to combine n arrays into one array. The first row of x should be the first row of combined, the first row of y the second row, the first row of z the third row, the second row of x the fourth row, and so on.
So it would look something like this:
x = [x1 x2 x3]
    [x4 x5 x6]
    [x7 x8 x9]
y = [y1 y2 y3]
    [y4 y5 y6]
    [y7 y8 y9]
z = [z1 z2 z3]
    [z4 z5 z6]
    [z7 z8 z9]
combined = [x1 x2 x3]
           [y1 y2 y3]
           [z1 z2 z3]
           [x4 x5 x6]
           [...]
           [z7 z8 z9]
The best I can come up with is this:
import numpy as np
x = np.random.rand(6,3)
y = np.random.rand(6,3)
z = np.random.rand(6,3)
combined = np.zeros((9,3))
for rows in range(len(x)):
    combined[0::3] = x[rows,:]
    combined[1::3] = y[rows,:]
    combined[2::3] = z[rows,:]
print(combined)
All this does is write the last value of the input array to every third row in the output array instead of what I wanted. I am not sure if this is even the best way to do this. Any advice would help out.
I just figured out that this works, but if someone knows a higher-performance method, please let me know.
import numpy as np
x = np.random.rand(6,3)
y = np.random.rand(6,3)
z = np.random.rand(6,3)
combined = np.zeros((18,3))
for rows in range(6):
    combined[rows*3,:] = x[rows,:]
    combined[rows*3+1,:] = y[rows,:]
    combined[rows*3+2,:] = z[rows,:]
print(combined)
You can do this using a list comprehension and zip:
combined = np.array([row for row_group in zip(x, y, z) for row in row_group])
Using vectorised operations only:
A = np.vstack((x, y, z))
idx = np.arange(A.shape[0]).reshape(-1, x.shape[0]).T.flatten()
A = A[idx]
Here's a demo:
import numpy as np
x, y, z = np.random.rand(3,3), np.random.rand(3,3), np.random.rand(3,3)
print(x, y, z)
[[ 0.88259564 0.17609363 0.01067734]
[ 0.50299357 0.35075811 0.47230915]
[ 0.751129 0.81839586 0.80554345]]
[[ 0.09469396 0.33848691 0.51550685]
[ 0.38233976 0.05280427 0.37778962]
[ 0.7169351 0.17752571 0.49581777]]
[[ 0.06056544 0.70273453 0.60681583]
[ 0.57830566 0.71375038 0.14446909]
[ 0.23799775 0.03571076 0.26917939]]
A = np.vstack((x, y, z))
idx = np.arange(A.shape[0]).reshape(-1, x.shape[0]).T.flatten()
print(idx) # [0 3 6 1 4 7 2 5 8]
A = A[idx]
print(A)
[[ 0.88259564 0.17609363 0.01067734]
[ 0.09469396 0.33848691 0.51550685]
[ 0.06056544 0.70273453 0.60681583]
[ 0.50299357 0.35075811 0.47230915]
[ 0.38233976 0.05280427 0.37778962]
[ 0.57830566 0.71375038 0.14446909]
[ 0.751129 0.81839586 0.80554345]
[ 0.7169351 0.17752571 0.49581777]
[ 0.23799775 0.03571076 0.26917939]]
I have changed your code a little bit to get the desired output
import numpy as np
x = np.random.rand(6,3)
y = np.random.rand(6,3)
z = np.random.rand(6,3)
combined = np.zeros((18,3))
combined[0::3] = x
combined[1::3] = y
combined[2::3] = z
print(combined)
You had the shape of the combined matrix wrong and there is no real need for the for loop.
This might not be the most pythonic way to do it, but you could fill one block of three output rows per input row:
for block in range(len(combined)//3):
    # block is also the row number in x, y and z
    combined[block*3+0] = x[block,:]
    combined[block*3+1] = y[block,:]
    combined[block*3+2] = z[block,:]
A simple numpy solution is to stack the arrays on a new middle axis, and reshape the result to 2d:
In [5]: x = np.arange(9).reshape(3,3)
In [6]: y = np.arange(9).reshape(3,3)+10
In [7]: z = np.arange(9).reshape(3,3)+100
In [8]: np.stack((x,y,z),axis=1).reshape(-1,3)
Out[8]:
array([[ 0, 1, 2],
[ 10, 11, 12],
[100, 101, 102],
[ 3, 4, 5],
[ 13, 14, 15],
[103, 104, 105],
[ 6, 7, 8],
[ 16, 17, 18],
[106, 107, 108]])
It may be easier to see what's happening if we give each dimension a different size, e.g. two 3x4 arrays:
In [9]: x = np.arange(12).reshape(3,4)
In [10]: y = np.arange(12).reshape(3,4)+10
np.array combines them on a new 1st axis, making a 2x3x4 array. To get the interleaving you want, we can transpose the first 2 dimensions, producing a 3x2x4. Then reshape to a 6x4.
In [13]: np.array((x,y))
Out[13]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[10, 11, 12, 13],
[14, 15, 16, 17],
[18, 19, 20, 21]]])
In [14]: np.array((x,y)).transpose(1,0,2)
Out[14]:
array([[[ 0, 1, 2, 3],
[10, 11, 12, 13]],
[[ 4, 5, 6, 7],
[14, 15, 16, 17]],
[[ 8, 9, 10, 11],
[18, 19, 20, 21]]])
In [15]: np.array((x,y)).transpose(1,0,2).reshape(-1,4)
Out[15]:
array([[ 0, 1, 2, 3],
[10, 11, 12, 13],
[ 4, 5, 6, 7],
[14, 15, 16, 17],
[ 8, 9, 10, 11],
[18, 19, 20, 21]])
np.vstack produces a 6x4, but with the wrong order. We can't transpose that directly.
np.stack with default axis behaves just like np.array. But with axis=1, it creates a 3x2x4, which we can reshape:
In [16]: np.stack((x,y), 1)
Out[16]:
array([[[ 0, 1, 2, 3],
[10, 11, 12, 13]],
[[ 4, 5, 6, 7],
[14, 15, 16, 17]],
[[ 8, 9, 10, 11],
[18, 19, 20, 21]]])
The zip in the accepted answer is a list version of transpose; here it creates a list of three 2-element tuples.
In [17]: list(zip(x,y))
Out[17]:
[(array([0, 1, 2, 3]), array([10, 11, 12, 13])),
(array([4, 5, 6, 7]), array([14, 15, 16, 17])),
(array([ 8, 9, 10, 11]), array([18, 19, 20, 21]))]
np.array(list(zip(x,y))) produces the same thing as the stack, a 3x2x4 array.
As for speed, I suspect the allocate and assign (as in Ash's answer) is fastest:
In [27]: z = np.zeros((6,4),int)
    ...: for i, arr in enumerate((x,y)):
    ...:     z[i::2,:] = arr
    ...:
In [28]: z
Out[28]:
array([[ 0, 1, 2, 3],
[10, 11, 12, 13],
[ 4, 5, 6, 7],
[14, 15, 16, 17],
[ 8, 9, 10, 11],
[18, 19, 20, 21]])
For serious timings, use much larger examples than this.
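The stack-and-reshape idea generalizes to any number of equally shaped 2D arrays. The helper below is a sketch; interleave_rows is a hypothetical name, and the check against strided assignment is only there to show the two approaches agree.

```python
import numpy as np

def interleave_rows(*arrays):
    # Stack on a new middle axis: shape (rows, n_arrays, cols),
    # then merge the first two axes so the rows alternate between inputs.
    stacked = np.stack(arrays, axis=1)
    return stacked.reshape(-1, stacked.shape[-1])

x = np.arange(6).reshape(3, 2)
y = x + 10
z = x + 100
combined = interleave_rows(x, y, z)

# Same result via allocate-and-assign with strided slices.
expected = np.zeros((9, 2), dtype=x.dtype)
for offset, arr in enumerate((x, y, z)):
    expected[offset::3] = arr

assert np.array_equal(combined, expected)
```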

getting multiple array after performing subtraction operation within array elements

import numpy as np
m = []
k = []
a = np.array([[1,2,3,4,5,6],[50,51,52,40,20,30],[60,71,82,90,45,35]])
for i in range(len(a)):
    m.append(a[i, -1:])
    for j in range(len(a[i])-1):
        n = abs(m[i] - a[i,j])
        k.append(n)
    k.append(m[i])
print(k)
Expected Output in k:
[5,4,3,2,1,6],[20,21,22,10,10,30],[25,36,47,55,10,35]
which is also a numpy array.
But the output that I am getting is
[array([5]), array([4]), array([3]), array([2]), array([1]), array([6]), array([20]), array([21]), array([22]), array([10]), array([10]), array([30]), array([25]), array([36]), array([47]), array([55]), array([10]), array([35])]
How can I solve this situation?
You want to subtract the last element of each row from the other elements of that row. Why not use a vectorized approach? You can do all the subtractions at once by subtracting the last column from the remaining columns, and then use column_stack to join the result with the unchanged last column. Note that the last column needs an extra dimension in order to be subtracted from the 2D array; indexing with None adds it, and broadcasting does the rest.
In [71]: np.column_stack((abs(a[:, :-1] - a[:, None, -1]), a[:,-1]))
Out[71]:
array([[ 5, 4, 3, 2, 1, 6],
[20, 21, 22, 10, 10, 30],
[25, 36, 47, 55, 10, 35]])
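An alternative sketch of the same broadcast, assuming it is acceptable to compute the difference for the whole array and then restore the last column. The slice a[:, -1:] keeps the column axis, so no None index is needed.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5, 6],
              [50, 51, 52, 40, 20, 30],
              [60, 71, 82, 90, 45, 35]])

last = a[:, -1:]        # shape (3, 1): the slice keeps the column axis
k = np.abs(a - last)    # broadcast the last column across each row
k[:, -1] = a[:, -1]     # restore the unchanged last column
print(k)
```

The result is a single 2D numpy array, as requested in the question, rather than a list of one-element arrays.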
