Here a is a 1D array of integer indices. To give some context, a holds the atom indices of the first molecule. The return value is the atom indices for n identical molecules, each of which contains step atoms. In other words, this function applies the same atom selection to many molecules:
def f(a, step, n):
    a.shape = 1, -1
    got = np.repeat(a, n, axis=0)
    blocks = np.arange(0, n*step, step)
    blocks.shape = n, -1
    return got + blocks
For example
In [52]: f(np.array([1,3,4]), 10, 4)
Out[52]:
array([[ 1, 3, 4],
[11, 13, 14],
[21, 23, 24],
[31, 33, 34]])
It looks like broadcasting should be enough:
def f(a, step, n):
    return a + np.arange(0, n*step, step)[:, None]
f(np.array([1,3,4]), 10, 4)
output:
array([[ 1, 3, 4],
[11, 13, 14],
[21, 23, 24],
[31, 33, 34]])
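To see why the broadcasting version works, it can help to look at the shapes involved. The following is only an illustrative sketch; the name offsets is mine and does not appear in the question. Note that, unlike the original function, this version does not change a.shape in place.

```python
import numpy as np

a = np.array([1, 3, 4])          # atom indices in the first molecule, shape (3,)
step, n = 10, 4                  # atoms per molecule, number of molecules

# One starting offset per molecule, stored as a column: shape (4, 1)
offsets = np.arange(0, n * step, step)[:, None]

# (3,) + (4, 1) broadcasts to (4, 3): each offset is added to each index
result = a + offsets
print(offsets.shape, a.shape, result.shape)  # (4, 1) (3,) (4, 3)
```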
I want to create a numpy array with an arbitrary number of dimensions, iterate over it, and replace every value with 10, for example.
I tried :
# Import numpy library
import numpy as np
def Iter_Replace(x):
    print(x)
    for i in range(x):
        x[i] = 10
    print(x)
def main():
    x = np.array(([1,2,2], [1,4,3]))
    Iter_Replace(x)
main()
But I'm getting this error :
TypeError: only integer scalar arrays can be converted to a scalar index
There is a numpy function for this, numpy.full or numpy.full_like:
>>> x = np.array(([1,2,2], [1,4,3]))
>>> np.full(x.shape, 10)
array([[10, 10, 10],
[10, 10, 10]])
# OR,
>>> np.full_like(x, 10)
array([[10, 10, 10],
[10, 10, 10]])
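Note that np.full and np.full_like return a new array. If the goal is to overwrite the existing array in place, slice assignment or the ndarray.fill method should also do the job. A small sketch, where y is just a copy I made for the comparison:

```python
import numpy as np

x = np.array([[1, 2, 2], [1, 4, 3]])
y = x.copy()

x[...] = 10     # assign 10 to every element of the existing array
y.fill(10)      # same effect through the ndarray.fill method
print(x)
print(y)
```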
If you want to iterate you can either use itertools.product:
>>> from itertools import product
>>> def Iter_Replace(x):
...     indices = product(*map(range, x.shape))
...     for index in indices:
...         x[tuple(index)] = 10
...     return x
>>> x = np.array([[1,2,2], [1,4,3]])
>>> Iter_Replace(x)
array([[10, 10, 10],
[10, 10, 10]])
Or, use np.ndindex:
>>> x = np.array([[1,2,2], [1,4,3]])
>>> for index in np.ndindex(x.shape):
...     x[index] = 10
>>> x
array([[10, 10, 10],
[10, 10, 10]])
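For comparison, np.nditer iterates over the elements themselves rather than over index tuples. A minimal sketch of the same replacement, assuming a NumPy version recent enough to support nditer as a context manager:

```python
import numpy as np

x = np.array([[1, 2, 2], [1, 4, 3]])

# op_flags=['readwrite'] makes the iterated elements writable
with np.nditer(x, op_flags=['readwrite']) as it:
    for element in it:
        element[...] = 10   # write through to the underlying array x

print(x)
```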
You have two errors. First, parentheses are missing in the first line of main, which should read:
x = np.array(([1,2,2], [1,4,3]))
And you must replace range(x) by range(len(x)) in the Iter_Replace function.
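Applying the range(len(x)) fix to the function from the question gives something like the sketch below. The snake_case name iter_replace is my own. The loop runs over the first axis only, and each assignment fills a whole row.

```python
import numpy as np

def iter_replace(x):
    # range() needs an integer, so loop over the number of rows
    for i in range(len(x)):
        x[i] = 10          # assigns 10 to every element of row i
    return x

x = np.array([[1, 2, 2], [1, 4, 3]])
iter_replace(x)
print(x)
```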
I would like to plot a 4d matrix as a 2d matrix with indices:
[i][j][k][l] --> [i * nj + j][ k * nl + l]
I have a working version here.
This is working as I want, but it's not very elegant. I looked into "reshape" but this is not exactly what I'm looking for, or perhaps I am using it incorrectly.
Given an array "r" of 100000 four-dimensional points, i.e. with shape (100000,4), the relevant snippet I want to replace is:
def transform(i,j,k,l, s1, s2):
    return [i * s1 + j, k * s2 + l]
nx = 5
ny = 11
iedges=np.linspace(0,100, nx)
jedges=np.linspace(0, 20, ny)
bins = ( iedges,jedges,iedges,jedges )
H, edges = np.histogramdd(r, bins=bins )
H2 = np.zeros(( (nx-1)*(ny-1),(nx-1)*(ny-1)))
for i in range(nx-1):
    for j in range(ny-1):
        for k in range(nx-1):
            for l in range(ny-1):
                x,y = transform(i,j,k,l,ny-1,ny-1)
                H2[x][y] = H[i][j][k][l]
In this case the values of H2 will correspond to the values of H, but the entry i,j,k,l will be displayed at row i*(ny-1) + j, column k*(ny-1) + l.
Example plot:
Are you sure reshape doesn't work?
I ran your code on a small random r. The nonzero terms of H are:
In [13]: np.argwhere(H)
Out[13]:
array([[0, 9, 3, 1],
[1, 1, 1, 2],
[1, 2, 1, 3],
[2, 2, 2, 3],
[3, 1, 1, 8]])
and for the transformed H2:
In [14]: np.argwhere(H2)
Out[14]:
array([[ 9, 31],
[11, 12],
[12, 13],
[22, 23],
[31, 18]])
And one of the H indices transforms to H2 indices with:
In [16]: transform(0,9,3,1,4,10)
Out[16]: [9, 31]
If I simply reshape H, I get the same array as H2:
In [17]: H3=H.reshape(40,40)
In [18]: np.argwhere(H3)
Out[18]:
array([[ 9, 31],
[11, 12],
[12, 13],
[22, 23],
[31, 18]])
In [19]: np.allclose(H2,H3)
Out[19]: True
So without delving into the details of your code, it looks to me like a simple reshape.
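The equivalence can be checked without histogram data. The sketch below fills a small 4D array with distinct values, so any misplaced entry would show up. The names ni and nj are mine and stand in for nx-1 and ny-1.

```python
import numpy as np

def transform(i, j, k, l, s1, s2):
    return [i * s1 + j, k * s2 + l]

ni, nj = 4, 10                       # stand-ins for nx-1 and ny-1
H = np.arange(ni * nj * ni * nj).reshape(ni, nj, ni, nj)

# Loop version, as in the question
H2 = np.zeros((ni * nj, ni * nj), dtype=H.dtype)
for i, j, k, l in np.ndindex(H.shape):
    x, y = transform(i, j, k, l, nj, nj)
    H2[x, y] = H[i, j, k, l]

# Reshape version: C order merges axes (0, 1) and axes (2, 3)
H3 = H.reshape(ni * nj, ni * nj)
print(np.array_equal(H2, H3))        # True
```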
Looks like you can calculate i,j,k,l from x,y? This should be something like:
from functools import partial
def get_ijkl(x, y, s1, s2):
    # "Reverse" of `transform`
    i, j = divmod(x, s1)
    k, l = divmod(y, s2)
    return (i, j, k, l)
def get_2d_val(x, y, s1, s2, four_dim_array):
    return four_dim_array[get_ijkl(x, y, s1, s2)]
smaller_shape = ((nx-1)*(ny-1), (nx-1)*(ny-1))
Knowing this there are several approaches possible:
numpy.fromfunction:
H3 = np.fromfunction(
    partial(get_2d_val, s1=ny-1, s2=ny-1, four_dim_array=H),
    shape=smaller_shape,
    dtype=int,
)
assert np.all(H2 == H3)
by indexing:
indices_to_take = np.array([
    [list(get_ijkl(x, y, ny-1, ny-1)) for x in range(smaller_shape[0])] for y in range(smaller_shape[1])
]).transpose()
H4 = H[tuple(indices_to_take)]
assert np.all(H2 == H4)
As answered by @hpaulj, you can simply reshape the array, and that will be faster. But if you have a different transform and can write the appropriate "reverse" function, then fromfunction or custom indexing becomes useful.
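As a sanity check that get_ijkl really reverses transform, a round trip over all valid indices can be run. This is only a sketch: the sizes 4 and 10 are assumed from the nx = 5, ny = 11 example in the question, and the names ni and nk are mine.

```python
def transform(i, j, k, l, s1, s2):
    return [i * s1 + j, k * s2 + l]

def get_ijkl(x, y, s1, s2):
    # "Reverse" of `transform`, valid while j < s1 and l < s2
    i, j = divmod(x, s1)
    k, l = divmod(y, s2)
    return (i, j, k, l)

s1 = s2 = 10   # ny - 1 in the question
ni = nk = 4    # nx - 1 in the question

round_trips = all(
    get_ijkl(*transform(i, j, k, l, s1, s2), s1, s2) == (i, j, k, l)
    for i in range(ni) for j in range(s1)
    for k in range(nk) for l in range(s2)
)
print(round_trips)  # True
```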
This is a sample of what I am trying to accomplish. I am very new to python and have searched for hours to find out what I am doing wrong. I haven't been able to find what my issue is. I am still new enough that I may be searching for the wrong phrases. If so, could you please point me in the right direction?
I want to combine n arrays into one array. I want the first row of x as the first row of combined, the first row of y as the second row, the first row of z as the third row, the second row of x as the fourth row, and so on.
So it would look something like this:
x = [x1 x2 x3]
    [x4 x5 x6]
    [x7 x8 x9]
y = [y1 y2 y3]
    [y4 y5 y6]
    [y7 y8 y9]
z = [z1 z2 z3]
    [z4 z5 z6]
    [z7 z8 z9]
combined = [x1 x2 x3]
           [y1 y2 y3]
           [z1 z2 z3]
           [x4 x5 x6]
           [...]
           [z7 z8 z9]
The best I can come up with is this:
import numpy as np
x = np.random.rand(6,3)
y = np.random.rand(6,3)
z = np.random.rand(6,3)
combined = np.zeros((9,3))
for rows in range(len(x)):
    combined[0::3] = x[rows,:]
    combined[1::3] = y[rows,:]
    combined[2::3] = z[rows,:]
print(combined)
All this does is write the last value of the input array to every third row in the output array instead of what I wanted. I am not sure if this is even the best way to do this. Any advice would help out.
I just figured out that this works, but if someone knows a higher-performance method, please let me know.
import numpy as np
x = np.random.rand(6,3)
y = np.random.rand(6,3)
z = np.random.rand(6,3)
combined = np.zeros((18,3))
for rows in range(6):
    combined[rows*3,:] = x[rows,:]
    combined[rows*3+1,:] = y[rows,:]
    combined[rows*3+2,:] = z[rows,:]
print(combined)
You can do this using a list comprehension and zip:
combined = np.array([row for row_group in zip(x, y, z) for row in row_group])
Using vectorised operations only:
A = np.vstack((x, y, z))
idx = np.arange(A.shape[0]).reshape(-1, x.shape[0]).T.flatten()
A = A[idx]
Here's a demo:
import numpy as np
x, y, z = np.random.rand(3,3), np.random.rand(3,3), np.random.rand(3,3)
print(x, y, z)
[[ 0.88259564 0.17609363 0.01067734]
[ 0.50299357 0.35075811 0.47230915]
[ 0.751129 0.81839586 0.80554345]]
[[ 0.09469396 0.33848691 0.51550685]
[ 0.38233976 0.05280427 0.37778962]
[ 0.7169351 0.17752571 0.49581777]]
[[ 0.06056544 0.70273453 0.60681583]
[ 0.57830566 0.71375038 0.14446909]
[ 0.23799775 0.03571076 0.26917939]]
A = np.vstack((x, y, z))
idx = np.arange(A.shape[0]).reshape(-1, x.shape[0]).T.flatten()
print(idx) # [0 3 6 1 4 7 2 5 8]
A = A[idx]
print(A)
[[ 0.88259564 0.17609363 0.01067734]
[ 0.09469396 0.33848691 0.51550685]
[ 0.06056544 0.70273453 0.60681583]
[ 0.50299357 0.35075811 0.47230915]
[ 0.38233976 0.05280427 0.37778962]
[ 0.57830566 0.71375038 0.14446909]
[ 0.751129 0.81839586 0.80554345]
[ 0.7169351 0.17752571 0.49581777]
[ 0.23799775 0.03571076 0.26917939]]
I have changed your code a little to get the desired output:
import numpy as np
x = np.random.rand(6,3)
y = np.random.rand(6,3)
z = np.random.rand(6,3)
combined = np.zeros((18,3))
combined[0::3] = x
combined[1::3] = y
combined[2::3] = z
print(combined)
You had the shape of the combined matrix wrong and there is no real need for the for loop.
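The same strided assignment extends to any number of arrays. The sketch below wraps it in a helper; interleave_rows is a hypothetical name of mine, and it assumes all inputs are 2D with the same shape.

```python
import numpy as np

def interleave_rows(*arrays):
    """Interleave the rows of equally shaped 2D arrays (hypothetical helper)."""
    n = len(arrays)
    rows, cols = arrays[0].shape
    combined = np.empty((n * rows, cols), dtype=np.result_type(*arrays))
    for offset, arr in enumerate(arrays):
        combined[offset::n] = arr    # fills rows offset, offset+n, offset+2n, ...
    return combined

x = np.arange(6).reshape(2, 3)
y = x + 10
z = x + 100
print(interleave_rows(x, y, z))
```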
This might not be the most Pythonic way to do it, but you could fill combined one block of three rows at a time (assuming combined has 3 * len(x) rows):
for block in range(len(combined) // 3):
    combined[block*3 + 0] = x[block]
    combined[block*3 + 1] = y[block]
    combined[block*3 + 2] = z[block]
A simple numpy solution is to stack the arrays on a new middle axis, and reshape the result to 2d:
In [5]: x = np.arange(9).reshape(3,3)
In [6]: y = np.arange(9).reshape(3,3)+10
In [7]: z = np.arange(9).reshape(3,3)+100
In [8]: np.stack((x,y,z),axis=1).reshape(-1,3)
Out[8]:
array([[ 0, 1, 2],
[ 10, 11, 12],
[100, 101, 102],
[ 3, 4, 5],
[ 13, 14, 15],
[103, 104, 105],
[ 6, 7, 8],
[ 16, 17, 18],
[106, 107, 108]])
It may be easier to see what's happening if we give each dimension a different size, e.g. two 3x4 arrays:
In [9]: x = np.arange(12).reshape(3,4)
In [10]: y = np.arange(12).reshape(3,4)+10
np.array combines them on a new 1st axis, making a 2x3x4 array. To get the interleaving you want, we can transpose the first 2 dimensions, producing a 3x2x4. Then reshape to a 6x4.
In [13]: np.array((x,y))
Out[13]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[10, 11, 12, 13],
[14, 15, 16, 17],
[18, 19, 20, 21]]])
In [14]: np.array((x,y)).transpose(1,0,2)
Out[14]:
array([[[ 0, 1, 2, 3],
[10, 11, 12, 13]],
[[ 4, 5, 6, 7],
[14, 15, 16, 17]],
[[ 8, 9, 10, 11],
[18, 19, 20, 21]]])
In [15]: np.array((x,y)).transpose(1,0,2).reshape(-1,4)
Out[15]:
array([[ 0, 1, 2, 3],
[10, 11, 12, 13],
[ 4, 5, 6, 7],
[14, 15, 16, 17],
[ 8, 9, 10, 11],
[18, 19, 20, 21]])
np.vstack produces a 6x4, but with the wrong order. We can't transpose that directly.
np.stack with default axis behaves just like np.array. But with axis=1, it creates a 3x2x4, which we can reshape:
In [16]: np.stack((x,y), 1)
Out[16]:
array([[[ 0, 1, 2, 3],
[10, 11, 12, 13]],
[[ 4, 5, 6, 7],
[14, 15, 16, 17]],
[[ 8, 9, 10, 11],
[18, 19, 20, 21]]])
The list zip in the accepted answer is a list version of transpose, creating a list of 3 2-element tuples.
In [17]: list(zip(x,y))
Out[17]:
[(array([0, 1, 2, 3]), array([10, 11, 12, 13])),
(array([4, 5, 6, 7]), array([14, 15, 16, 17])),
(array([ 8, 9, 10, 11]), array([18, 19, 20, 21]))]
np.array(list(zip(x,y))) produces the same thing as the stack, a 3x2x4 array.
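The three routes to the 3x2x4 array described above can be compared directly. A short check, using the same x and y as in the session (the variable names stacked, zipped and transposed are mine):

```python
import numpy as np

x = np.arange(12).reshape(3, 4)
y = x + 10

stacked = np.stack((x, y), axis=1)                 # new middle axis
zipped = np.array(list(zip(x, y)))                 # list version of the transpose
transposed = np.array((x, y)).transpose(1, 0, 2)   # swap the first two axes

print(stacked.shape)  # (3, 2, 4)
print(np.array_equal(stacked, zipped), np.array_equal(stacked, transposed))  # True True
```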
As for speed, I suspect the allocate and assign (as in Ash's answer) is fastest:
In [27]: z = np.zeros((6,4),int)
...: for i, arr in enumerate((x,y)):
...:     z[i::2,:] = arr
...:
In [28]: z
Out[28]:
array([[ 0, 1, 2, 3],
[10, 11, 12, 13],
[ 4, 5, 6, 7],
[14, 15, 16, 17],
[ 8, 9, 10, 11],
[18, 19, 20, 21]])
For serious timings, use much larger examples than this.
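A rough way to compare the approaches on larger inputs is the standard timeit module. This is only a sketch: the array size, the number of runs, and the function names are arbitrary choices of mine, and the timings will vary by machine and NumPy version, so no results are shown.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
x, y, z = (rng.random((10_000, 3)) for _ in range(3))

def by_assignment():
    # allocate once, then fill every third row
    out = np.empty((3 * len(x), x.shape[1]))
    out[0::3], out[1::3], out[2::3] = x, y, z
    return out

def by_stack():
    # stack on a new middle axis, then flatten back to 2D
    return np.stack((x, y, z), axis=1).reshape(-1, x.shape[1])

def by_zip():
    # pure-Python interleaving of rows
    return np.array([row for group in zip(x, y, z) for row in group])

for func in (by_assignment, by_stack, by_zip):
    seconds = timeit.timeit(func, number=20)
    print(f"{func.__name__}: {seconds:.4f} s for 20 runs")
```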
import numpy as np
m = []
k = []
a = np.array([[1,2,3,4,5,6],[50,51,52,40,20,30],[60,71,82,90,45,35]])
for i in range(len(a)):
    m.append(a[i, -1:])
    for j in range(len(a[i])-1):
        n = abs(m[i] - a[i,j])
        k.append(n)
    k.append(m[i])
print(k)
Expected output in k:
[[5,4,3,2,1,6],[20,21,22,10,10,30],[25,36,47,55,10,35]]
which should also be a numpy array.
But the output that I am getting is
[array([5]), array([4]), array([3]), array([2]), array([1]), array([6]), array([20]), array([21]), array([22]), array([10]), array([10]), array([30]), array([25]), array([36]), array([47]), array([55]), array([10]), array([35])]
How can I solve this situation?
You want to subtract the last column from every other column in each row and take the absolute value. Why not use a vectorized approach? You can do all the subtractions at once by subtracting the last column from the remaining columns, and then use column_stack to reattach the unchanged last column. Also note that you need to change the dimensions of the last column in order to subtract it from the 2D array. Indexing with None gives it shape (3, 1), which broadcasting then stretches across the columns.
In [71]: np.column_stack((abs(a[:, :-1] - a[:, None, -1]), a[:,-1]))
Out[71]:
array([[ 5, 4, 3, 2, 1, 6],
[20, 21, 22, 10, 10, 30],
[25, 36, 47, 55, 10, 35]])
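The one-liner can also be split into steps, which makes the shapes easier to follow. This is an equivalent sketch rather than a different method; it uses the slice a[:, -1:] instead of None indexing to keep the last column 2D, and the names last and diffs are mine.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5, 6],
              [50, 51, 52, 40, 20, 30],
              [60, 71, 82, 90, 45, 35]])

last = a[:, -1:]                  # shape (3, 1): slicing keeps the column 2D
diffs = np.abs(a[:, :-1] - last)  # (3, 5) - (3, 1) broadcasts along each row
k = np.hstack((diffs, last))      # reattach the unchanged last column
print(k)
```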