I need some help creating a script to calculate the "pn" values automatically.
Now I have this code:
import numpy as np
from itertools import product
a=np.arange(1,4,1)
po= []
po = list(product(a, repeat =2))
array1= np.array(po)
array2= np.array([[2,40],[3,40],[4,43]])
p1=array1[0,0]*array2[:,1:]**array1[0,1]
p2=array1[1,0]*array2[:,1:]**array1[1,1]
p3=array1[2,0]*array2[:,1:]**array1[2,1]
array1 holds the ordered pairs, and array2 holds some depth values.
The equation is pn = (first element of row n of array1) * (second column of array2) ** (second element of row n of array1).
How can I calculate all the p values automatically?
Thanks a lot.
You could compute all the pi for i = 1,...,n all at once:
ps = (array1[:, 0] * (array2[:, 1:]**array1[:, 1])).T[..., None]
where
p1 equals ps[0],
p2 equals ps[1],
...
pn equals ps[n-1]
For example,
import numpy as np
from itertools import product
a = np.arange(1, 4, 1)
po = []
po = list(product(a, repeat=2))
array1 = np.array(po)
array2 = np.array([[2, 40], [3, 40], [4, 43]])
p1 = array1[0, 0] * array2[:, 1:]**array1[0, 1]
p2 = array1[1, 0] * array2[:, 1:]**array1[1, 1]
p3 = array1[2, 0] * array2[:, 1:]**array1[2, 1]
ps = (array1[:, 0] * (array2[:, 1:]**array1[:, 1])).T[..., None]
assert np.allclose(p1, ps[0])
assert np.allclose(p2, ps[1])
assert np.allclose(p3, ps[2])
This expression was found by considering the shapes of the component arrays.
In [294]: array2[:, 1:].shape
Out[294]: (3, 1)
In [295]: array1[:, 1].shape
Out[295]: (9,)
Broadcasting allows us to compute (array2[:, 1:]**array1[:, 1]), creating an array of shape (3, 9):
In [296]: (array2[:, 1:]**array1[:, 1]).shape
Out[296]: (3, 9)
Since array1[:, 0] is a 1D array of shape (9,):
In [297]: array1[:, 0].shape
Out[297]: (9,)
we can again use broadcasting to multiply the two together, resulting in an array of shape (3, 9):
In [299]: (array1[:, 0] * (array2[:, 1:]**array1[:, 1])).shape
Out[299]: (3, 9)
Since we want p1 to become ps[0], p2 to become ps[1], and so on,
we want the dimension of length 9 to be the first axis. So transpose:
In [300]: (array1[:, 0] * (array2[:, 1:]**array1[:, 1])).T.shape
Out[300]: (9, 3)
And since p1 has shape (3, 1) instead of just (3,), we need to add another dimension to the result. This is the purpose of indexing by [..., None].
In [304]: (array1[:, 0] * (array2[:, 1:]**array1[:, 1])).T[..., None].shape
Out[304]: (9, 3, 1)
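The same shape reasoning can also be written with every axis spelled out, which avoids the transpose. This is only a minimal sketch, not part of the original answer; the names coeff, expo, depth and ps_alt are made up for illustration.

```python
import numpy as np
from itertools import product

array1 = np.array(list(product(np.arange(1, 4), repeat=2)))  # shape (9, 2)
array2 = np.array([[2, 40], [3, 40], [4, 43]])               # shape (3, 2)

# Hypothetical helper names: put the "which p" axis first on purpose.
coeff = array1[:, 0][:, None, None]   # shape (9, 1, 1)
expo = array1[:, 1][:, None, None]    # shape (9, 1, 1)
depth = array2[:, 1:][None, :, :]     # shape (1, 3, 1)

# Broadcasting (9, 1, 1) against (1, 3, 1) yields shape (9, 3, 1) directly.
ps_alt = coeff * depth ** expo

# The expression from the answer, for comparison.
ps = (array1[:, 0] * (array2[:, 1:] ** array1[:, 1])).T[..., None]
print(ps_alt.shape, np.array_equal(ps_alt, ps))
```

Both forms give the same array; which one reads better is a matter of taste.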
Create a variable, n, and use it where the array index needs to change. I put it in a function for convenience and subtract 1 from n because array indices start at 0. Only the row of array1 depends on n; the depth column array2[:, 1:] is the same for every n.
def calculate_pn(n):
    pn = array1[n-1, 0] * array2[:, 1:]**array1[n-1, 1]
    return pn
> calculate_pn(n=1)
array([[40],
       [40],
       [43]], dtype=int32)
You can call this with a range of values to calculate multiple p values. Below I use a dict comprehension to make a lookup table of p values for n from 1 to the number of rows in array1 (note the + 1, since range excludes its upper bound).
> p = { n:calculate_pn(n) for n in range(1, len(array1) + 1) }
> p[1]
array([[40],
       [40],
       [43]], dtype=int32)
(You may wish to edit calculate_pn to accept array1 and array2 as parameters as well)
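A minimal sketch of that suggestion, assuming the same array layout as in the question; the parameter names pairs and depths are hypothetical and not from the original answer.

```python
import numpy as np
from itertools import product

def calculate_pn(n, pairs, depths):
    """Return p_n for 1-based n.

    pairs  : (N, 2) array of (coefficient, exponent) rows  (hypothetical name)
    depths : (M, 2) array whose second column holds depths (hypothetical name)
    """
    coeff, expo = pairs[n - 1]
    # depths[:, 1:] keeps the column as shape (M, 1), matching p1, p2, ...
    return coeff * depths[:, 1:] ** expo

array1 = np.array(list(product(np.arange(1, 4), repeat=2)))
array2 = np.array([[2, 40], [3, 40], [4, 43]])

# Lookup table for every n from 1 to len(array1), inclusive.
p = {n: calculate_pn(n, array1, array2) for n in range(1, len(array1) + 1)}
print(p[1].ravel(), p[9].ravel())
```

Passing the arrays in explicitly keeps the function independent of global state, which makes it easier to reuse with other inputs.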
Related
I want to multiply two 3D tensors in a specific way.
The two tensors have shapes T1 = (a,b,c) and T2 = (d,b,c).
What I want is to multiply T2 by each successive (b,c) slice of T1, a times in total.
In other words, I want to have the same as this code :
import numpy as np
a=2
b=3
c=4
d=5
T1 = np.random.rand(a,b,c)
T2 = np.random.rand(d,b,c)
L= []
for j in range(a):
    L += [T1[j,:,:]*T2]
L = np.array(L)
L.shape
I have this iterative solution, and I tried using the axes arguments, but I could not get that approach to work.
Ok, now I think I got the solution:
a=2
b=3
c=4
d=5
T1 = np.random.rand(a,b,c)
T2 = np.random.rand(d,b,c)
L = np.zeros(shape=(a,d,b,c))
for i1 in range(len(T1)):
    for i2 in range(len(T2)):
        L[i1,i2] = np.multiply(np.array(T1[i1]),np.array(T2[i2]))
Since the shapes:
In [26]: T1.shape, T2.shape
Out[26]: ((2, 3, 4), (5, 3, 4))
produce a result of shape:
In [27]: L.shape
Out[27]: (2, 5, 3, 4)
Let's try a broadcasted pair of arrays:
In [28]: res = T1[:,None]*T2[None,:]
Shape and values match:
In [29]: res.shape
Out[29]: (2, 5, 3, 4)
In [30]: np.allclose(L,res)
Out[30]: True
tensordot, dot, or matmul don't apply; just plain elementwise multiplication, with broadcasting.
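For readers who prefer index notation, the same elementwise product can be spelled with np.einsum. This is a sketch for comparison only; the subscripts string is my own choice, and the broadcasting form above is usually the simpler option.

```python
import numpy as np

a, b, c, d = 2, 3, 4, 5
rng = np.random.default_rng(0)
T1 = rng.random((a, b, c))
T2 = rng.random((d, b, c))

# Broadcasting: (a, 1, b, c) * (1, d, b, c) -> (a, d, b, c)
res = T1[:, None] * T2[None, :]

# einsum: no index is summed, so this is a pure outer/elementwise product.
res_einsum = np.einsum('abc,dbc->adbc', T1, T2)

print(res.shape, np.allclose(res, res_einsum))
```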
I have many small, say 5 x 5, matrices, A = numpy.random.rand(5, 5, 7, 77), with one right-hand side b = numpy.random.rand(5). I'd like to solve all 7 x 77 problems A_{ij} x = b, such that the result x has shape (5, 7, 77). I can simply loop over them,
from scipy.linalg import solve
import numpy
A = numpy.random.rand(5, 5, 7, 77)
b = numpy.random.rand(5)
x = []
for i in range(A.shape[2]):
    x.append([])
    for j in range(A.shape[3]):
        x[-1].append(solve(A[:, :, i, j], b))
x = numpy.array(x)
x = numpy.moveaxis(x, -1, 0)
print(x.shape)
but this is slow. It feels like it should be possible to vectorize by treating A not as a 5 x 5 x 7 x 77 tensor of floats, but as a 5 x 5 matrix of 7 x 77 float arrays, and performing all operations in solve on those arrays. Any hints?
(I come across this kind of problem rather often, so if there's a library handling them, I'd also be glad to hear about it.)
You can do that with np.linalg.solve if you reorder the dimensions first.
import numpy as np
# Make random problem
np.random.seed(0)
a = np.random.rand(5, 5, 7, 77)
b = np.random.rand(5)
# Put additional axes at the end
at = np.moveaxis(a, (0, 1), (2, 3))
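# Solve; b is passed as a (5, 1) column so that it broadcasts over the
# leading (7, 77) axes in both NumPy 1.x and 2.x, then the column axis is dropped
xt = np.linalg.solve(at, b[:, np.newaxis])[..., 0]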
# Put axes back in place
x = np.moveaxis(xt, 2, 0)
print(x.shape)
# (5, 7, 77)
# Test some result
print(np.allclose(a[:, :, 4, 36] @ x[:, 4, 36], b))
# True
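To check every system at once rather than a single (4, 36) sample, the residual can be computed with einsum. This is a sketch under assumptions: the diagonal shift is something I added only so the random matrices are well conditioned and the check is robust, and b is passed as a (5, 1) column so the call behaves the same across NumPy versions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Assumption: adding 5 on the diagonal makes each 5 x 5 matrix strictly
# diagonally dominant, which keeps the numerical check well conditioned.
a = rng.random((5, 5, 7, 77)) + 5 * np.eye(5)[:, :, None, None]
b = rng.random(5)

at = np.moveaxis(a, (0, 1), (2, 3))                  # shape (7, 77, 5, 5)
xt = np.linalg.solve(at, b[:, np.newaxis])[..., 0]   # shape (7, 77, 5)
x = np.moveaxis(xt, 2, 0)                            # shape (5, 7, 77)

# lhs[i, k, l] = sum_j a[i, j, k, l] * x[j, k, l], i.e. A_{kl} @ x_{kl}
lhs = np.einsum('ijkl,jkl->ikl', a, x)
all_ok = np.allclose(lhs, b[:, None, None])
print(x.shape, all_ok)
```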
Here is a 3-dimensional numpy array:
import numpy as np
m = np.array([
[
[1,2,3,2], [4,5,6,3]
],
[
[7,8,9,4], [1,2,3,5]
]
])
For each tuple, I need to multiply the first three values by the last one (divided by 10 and rounded), and then to keep only the 3 results. For example in [1,2,3,2]:
The 1 becomes: round(1 * 2 / 10) = 0
The 2 becomes: round(2 * 2 / 10) = 0
The 3 becomes: round(3 * 2 / 10) = 1
So, [1,2,3,2] becomes: [0,0,1].
And the complete result will be:
[
[
[0,0,1], [1,2,2]
],
[
[3,3,4], [1,1,2]
]
]
I tried to separate the last value of each tuple into an alpha variable, and the first 3 values into an rgb variable.
alpha = m[:, :, 3] / 10
rgb = m[:, :, :3]
But I'm a beginner in Python, and after that I really don't know how to process these arrays.
A little help from an experienced Python user would be most welcome.
Try this
n = np.rint(m[:,:,:3] * m[:,:,[-1]] / 10).astype(int)
Out[192]:
array([[[0, 0, 1],
[1, 2, 2]],
[[3, 3, 4],
[0, 1, 2]]])
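Note that the last row comes out as [0, 1, 2] rather than the [1, 1, 2] listed in the question, because np.rint rounds exact halves to the nearest even integer (0.5 becomes 0). If round-half-up is what the question intends, which is an assumption on my part, a sketch using floor(x + 0.5) reproduces the expected result; the name n_half_up is made up for illustration.

```python
import numpy as np

m = np.array([
    [[1, 2, 3, 2], [4, 5, 6, 3]],
    [[7, 8, 9, 4], [1, 2, 3, 5]],
])

scaled = m[:, :, :3] * m[:, :, [-1]] / 10   # shape (2, 2, 3), float

# Round half to even (what np.rint does): 0.5 -> 0, 1.5 -> 2
n_rint = np.rint(scaled).astype(int)

# Round half up, valid here because all values are non-negative: 0.5 -> 1
n_half_up = np.floor(scaled + 0.5).astype(int)

print(n_rint[1, 1], n_half_up[1, 1])
```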
I have to take a random integer 50x50x50 array and determine which contiguous 3x3x3 cube within it has the largest sum.
It seems like a lot of the splitting features in NumPy don't work well unless the larger cube divides evenly into the smaller ones. To work through the thought process, I made a 48x48x48 cube whose values simply run in order from 0 to 110,591. I was then thinking of reshaping it to a 4D array with the following code and assessing which of the sub-arrays had the largest sum. When I enter this code, though, it splits the array in an order that is not ideal. I want the first array to be the 3x3x3 cube that would have been in the corner of the 48x48x48 cube. Is there a syntax that I can add to make this happen?
import numpy as np
arr1 = np.arange(0,110592)
arr2=np.reshape(arr1, (48,48,48))
arr3 = np.reshape(arr2, (4096, 3,3,3))
arr3
output:
array([[[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[ 12, 13, 14],
[ 15, 16, 17]],
[[ 18, 19, 20],
[ 21, 22, 23],
[ 24, 25, 26]]],
desired output:
array([[[[ 0, 1, 2],
[ 48, 49, 50],
[ 96, 97, 98]],
etc etc
Solution
There's a simple (kind of) solution to your original problem of finding the maximum 3x3x3 subcube in a 50x50x50 cube that's based on changing the input array's strides. This solution is completely vectorized (meaning no looping), and so should get the best possible performance out of Numpy:
import numpy as np
def cubecube(arr, cshape):
    strides = (*arr.strides, *arr.strides)
    shape = (*np.array(arr.shape) - cshape + 1, *cshape)
    return np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)
def maxcube(arr, cshape):
    cc = cubecube(arr, cshape)
    ccsums = cc.sum(axis=tuple(range(-arr.ndim, 0)))
    ix = np.unravel_index(np.argmax(ccsums), ccsums.shape)[:arr.ndim]
    return ix, cc[ix]
The maxcube function takes an array and the shape of the subcubes, and returns a tuple of (first-index-of-largest-cube, largest-cube). Here's an example of how to use maxcube:
shape = (50, 50, 50)
cshape = (3, 3, 3)
# set up a 50x50x50 array
arr = np.arange(np.prod(shape)).reshape(*shape)
# set one of the subcubes as the largest
arr[37, 26, 11] = 999999
ix, cube = maxcube(arr, cshape)
print('first index of largest cube: {}'.format(ix))
print('largest cube:\n{}'.format(cube))
which outputs:
first index of largest cube: (37, 26, 11)
largest cube:
[[[999999 93812 93813]
[ 93861 93862 93863]
[ 93911 93912 93913]]
[[ 96311 96312 96313]
[ 96361 96362 96363]
[ 96411 96412 96413]]
[[ 98811 98812 98813]
[ 98861 98862 98863]
[ 98911 98912 98913]]]
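On NumPy 1.20 or newer, the same overlapping-window view can be obtained from sliding_window_view, which computes the shape and strides itself. This is a sketch of an alternative, not the answer's own code; the function name maxcube_swv is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def maxcube_swv(arr, cshape):
    # One window per valid start index, e.g. shape (48, 48, 48, 3, 3, 3)
    # for a 50x50x50 input and 3x3x3 windows. This is a view, not a copy.
    windows = sliding_window_view(arr, cshape)
    # Sum over the trailing window axes to get one sum per start index.
    sums = windows.sum(axis=tuple(range(-arr.ndim, 0)))
    ix = np.unravel_index(np.argmax(sums), sums.shape)
    return ix, windows[ix]

arr = np.arange(50 ** 3).reshape(50, 50, 50)
arr[37, 26, 11] = 999999
ix, cube = maxcube_swv(arr, (3, 3, 3))
print(tuple(int(i) for i in ix))
```

The returned windows are read-only by default, which removes one of the usual ways to corrupt memory with hand-written strides.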
In depth explanation
A cube of cubes
What you have is a 48x48x48 cube, but what you want is a cube of smaller cubes. One can be converted to the other by altering its strides. For a 48x48x48 array of dtype int64, the stride will originally be set as (48*48*8, 48*8, 8). The first value of each non-overlapping 3x3x3 subcube can be iterated over with a stride of (3*48*48*8, 3*48*8, 3*8). Combine these strides to get the strides of the cube of cubes:
# Set up a 48x48x48 array, like in OP's example
arr = np.arange(48**3).reshape(48,48,48)
shape = (16,16,16,3,3,3)
strides = (3*48*48*8, 3*48*8, 3*8, 48*48*8, 48*8, 8)
# restride into a 16x16x16 array of 3x3x3 cubes
arr2 = np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)
arr2 is a view of arr (meaning that they share data, so no copy needs to be made) with a shape of (16,16,16,3,3,3). The ijkth 3x3x3 cube in arr can be accessed by passing the indices to arr2:
i,j,k = 0,0,0
print(arr2[i,j,k])
Output:
[[[ 0 1 2]
[ 48 49 50]
[ 96 97 98]]
[[2304 2305 2306]
[2352 2353 2354]
[2400 2401 2402]]
[[4608 4609 4610]
[4656 4657 4658]
[4704 4705 4706]]]
You can get the sums of all of the subcubes by just summing across the inner axes:
sumOfSubcubes = arr2.sum(axis=(3,4,5))
This will yield a 16x16x16 array in which each value is the sum of a non-overlapping 3x3x3 subcube from your original array. This solves the specific problem about the 48x48x48 array that the OP asked about. Restriding can also be used to find all of the overlapping 3x3x3 cubes, as in the cubecube function above.
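For the non-overlapping case specifically, a reshape followed by a transpose gives the same cube of cubes without touching strides by hand. This is a minimal sketch, assuming each side length is an exact multiple of the block size as in the 48x48x48 example; the names blocks and sums are my own.

```python
import numpy as np

arr = np.arange(48 ** 3).reshape(48, 48, 48)

# Split each axis of length 48 into (16 blocks, 3 elements per block),
# then move the three block axes to the front:
# blocks[i, j, k, a, b, c] == arr[3*i + a, 3*j + b, 3*k + c]
blocks = arr.reshape(16, 3, 16, 3, 16, 3).transpose(0, 2, 4, 1, 3, 5)

# One sum per non-overlapping 3x3x3 block.
sums = blocks.sum(axis=(3, 4, 5))

print(blocks[0, 0, 0, 0])
print(sums.shape)
```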
Your thought process with the 48x48x48 cube goes in the right direction insofar that there are 48³ different contiguous 3x3x3 cubes within the 50x50x50 array, though I don't understand why you would want to reshape it.
What you could do is accumulate all 27 values of each 3x3x3 cube into a 48x48x48 array by looping over the 27 offset combinations of shifted slices, and then find the maximum of that array. The entry found gives you the index tuple coordinate_index of the cube corner that is closest to the origin of your original array.
import numpy as np
np.random.seed(0)
array_shape = np.array((50,50,50), dtype=int)
cube_dim = np.array((3,3,3), dtype=int)
# random integers in [0, 100); the value range is an arbitrary choice
original_array = np.random.randint(0, 100, size=tuple(array_shape))
reduced_shape = array_shape - cube_dim + 1
sum_array = np.zeros(tuple(reduced_shape), dtype=int)
for i in range(cube_dim[0]):
    for j in range(cube_dim[1]):
        for k in range(cube_dim[2]):
            # add the slice shifted by (i, j, k); explicit stop indices
            # avoid the empty slice that a stop of -0 would produce
            sum_array += original_array[
                i:i+reduced_shape[0], j:j+reduced_shape[1], k:k+reduced_shape[2]
            ]
flat_index = np.argmax(sum_array)
coordinate_index = np.unravel_index(flat_index, tuple(reduced_shape))
This method should be faster than looping over each of the 48³ index combinations to find the desired cube, as it uses in-place summation over whole slices, but in turn it requires more memory. I'm not sure about it, but defining a (48³, 27) array from the slices and using np.sum over the second axis could be even faster.
You can easily change the above code to find a cuboid with arbitrary side lengths instead.
This is a solution without many NumPy functions, just numpy.sum. First define a cubic array and then the size cs of the cube over which the summation is performed.
Just change cs to adjust the cube size and find other solutions. Following @Divakar's suggestion, I have used a 4x4x4 array, and I also store the location of the cube (just the vertex closest to the array's origin).
import numpy as np
np.random.seed(0)
a = np.random.randint(0,9,(4,4,4))
print(a)
cs = 2 # Cube size
my_sum = 0
idx = None
# a.shape[n]-cs+1 start positions per axis, so every cube is full-sized
for i in range(a.shape[0]-cs+1):
    for j in range(a.shape[1]-cs+1):
        for k in range(a.shape[2]-cs+1):
            cube_sum = np.sum(a[i:i+cs, j:j+cs, k:k+cs])
            print(cube_sum)
            if cube_sum > my_sum:
                my_sum = cube_sum
                idx = (i,j,k)
print(my_sum, idx) # 42 (0, 0, 0)
This 3D array a is
[[[5 0 3 3]
[7 3 5 2]
[4 7 6 8]
[8 1 6 7]]
[[7 8 1 5]
[8 4 3 0]
[3 5 0 2]
[3 8 1 3]]
[[3 3 7 0]
[1 0 4 7]
[3 2 7 2]
[0 0 4 5]]
[[5 6 8 4]
[1 4 8 1]
[1 7 3 6]
[7 2 0 3]]]
And you get my_sum = 42 and idx = (0, 0, 0) for cs = 2. And my_sum = 112 and idx = (1, 0, 0) for cs = 3
Here is a cumsum based fast solution:
import numpy as np
nd = 3
cs = 3
N = 50
# create indices [cs-1:, ...], [:, cs-1:, ...], ...
fromcsm = *zip(*np.where(np.identity(nd, bool), np.s_[cs-1:], np.s_[:])),
# create indices [cs:, ...], [:, cs:, ...], ...
fromcs = *zip(*np.where(np.identity(nd, bool), np.s_[cs:], np.s_[:])),
# create indices [:cs, ...], [:, :cs, ...], ...
tocs = *zip(*np.where(np.identity(nd, bool), np.s_[:cs], np.s_[:])),
# create indices [:-cs, ...], [:, :-cs, ...], ...
tomcs = *zip(*np.where(np.identity(nd, bool), np.s_[:-cs], np.s_[:])),
# create indices [cs-1, ...], [:, cs-1, ...], ...
atcsm = *zip(*np.where(np.identity(nd, bool), cs-1, np.s_[:])),
def windowed_sum(a):
    out = a.copy()
    for i, (fcsm, fcs, tcs, tmcs, acsm) \
            in enumerate(zip(fromcsm, fromcs, tocs, tomcs, atcsm)):
        out[fcs] -= out[tmcs]
        out[acsm] = out[tcs].sum(axis=i)
        out = out[fcsm].cumsum(axis=i)
    return out
This returns the sums over all the sub cubes. We can then use argmax and unravel_index to get the offset of the maximum cube. Example:
np.random.seed(0)
a = np.random.randint(0,9,(N,N,N))
s = windowed_sum(a)
idx = np.unravel_index(np.argmax(s,), s.shape)
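The underlying idea, that a window sum along one axis is the difference of two cumulative sums and can be applied axis by axis, may be easier to follow in a less compact form. The sketch below is my own simplified variant under that reading, not the answer's code; windowed_sum_simple is a hypothetical name.

```python
import numpy as np

def windowed_sum_simple(a, cs):
    """Sum over every cs-sized window along each axis, one axis at a time."""
    out = a
    for axis in range(a.ndim):
        c = np.cumsum(out, axis=axis)
        # Prepend a zero slice so that c[k] is the sum of the first k entries.
        zero_shape = list(c.shape)
        zero_shape[axis] = 1
        c = np.concatenate([np.zeros(zero_shape, dtype=c.dtype), c], axis=axis)
        n = c.shape[axis]
        # Window starting at k has sum c[k + cs] - c[k].
        upper = np.take(c, np.arange(cs, n), axis=axis)
        lower = np.take(c, np.arange(0, n - cs), axis=axis)
        out = upper - lower
    return out

rng = np.random.default_rng(0)
a = rng.integers(0, 9, (6, 6, 6))
s = windowed_sum_simple(a, 3)
idx = np.unravel_index(np.argmax(s), s.shape)
print(s.shape, tuple(int(i) for i in idx))
```

This version allocates a few more temporaries than the in-place variant above, in exchange for being easier to check against a brute-force loop.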
I have numpy ndarrays which could be 3 or 4 dimensional. I'd like to find maximum values and their indices in a moving subarray window with specified strides.
For example, suppose I have a 4x4 2d array and my moving subarray window is 2x2 with stride 2 for simplicity:
[[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9,10,11,12],
[13,14,15,16]].
I'd like to find
[[ 6 8],
[14 16]]
for max values and
[(1,1), (1,3),
(3,1), (3,3)]
for indices as output.
Is there a concise, efficient implementation for this for ndarray without using loops?
Here's a solution using stride_tricks:
import numpy as np
def make_panes(arr, window):
    arr = np.asarray(arr)
    r, c = arr.shape
    s_r, s_c = arr.strides
    w_r, w_c = window
    if c % w_c != 0 or r % w_r != 0:
        raise ValueError("Window doesn't fit array.")
    # integer division: shape entries must be ints in Python 3
    shape = (r // w_r, c // w_c, w_r, w_c)
    strides = (w_r*s_r, w_c*s_c, s_r, s_c)
    return np.lib.stride_tricks.as_strided(arr, shape, strides)
def max_in_panes(arr, window):
    w_r, w_c = window
    r, c = arr.shape
    panes = make_panes(arr, window)
    # one row per pane, one column per element of the pane
    v = panes.reshape((-1, w_r * w_c))
    ix = np.argmax(v, axis=1)
    max_vals = v[np.arange((r // w_r) * (c // w_c)), ix]
    # top-left corner of each pane, in row-major pane order
    i = np.repeat(np.arange(0, r, w_r), c // w_c)
    j = np.tile(np.arange(0, c, w_c), r // w_r)
    rel_i, rel_j = np.unravel_index(ix, window)
    max_ix = i + rel_i, j + rel_j
    return max_vals, max_ix
A demo:
>>> x = np.arange(1, 17).reshape(4, 4)
>>> vals, ix = max_in_panes(x, (2,2))
>>> print(vals.reshape(2, 2))
[[ 6  8]
 [14 16]]
>>> print(ix)
(array([1, 1, 3, 3]), array([1, 3, 1, 3]))
Note that this is pretty untested, and is designed to work with 2d arrays. I'll leave the generalization to n-d arrays to the reader...
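One possible n-d generalization, sketched with reshape and transpose instead of as_strided. It is not from the original answer: max_in_blocks and its internal names are hypothetical, and it assumes non-overlapping windows whose sizes divide the array shape exactly (stride equal to window size).

```python
import numpy as np

def max_in_blocks(arr, window):
    """Max value and absolute index of the max in each non-overlapping block."""
    arr = np.asarray(arr)
    window = tuple(window)
    if any(s % w for s, w in zip(arr.shape, window)):
        raise ValueError("Window doesn't fit array.")
    nd = arr.ndim
    grid = tuple(s // w for s, w in zip(arr.shape, window))
    # Interleave (blocks, window) per axis, then move block axes to the front.
    inter = [x for pair in zip(grid, window) for x in pair]
    blocks = arr.reshape(inter).transpose(*range(0, 2 * nd, 2),
                                          *range(1, 2 * nd, 2))
    flat = blocks.reshape(grid + (-1,))      # one row of values per block
    ix = flat.argmax(axis=-1)                # position within each block
    max_vals = flat.max(axis=-1)
    rel = np.unravel_index(ix, window)       # per-axis offset inside the block
    base = np.indices(grid)                  # block coordinates, shape (nd, *grid)
    abs_ix = tuple(b * w + r for b, w, r in zip(base, window, rel))
    return max_vals, abs_ix

x = np.arange(1, 17).reshape(4, 4)
vals, ix = max_in_blocks(x, (2, 2))
print(vals)
print(ix[0].ravel(), ix[1].ravel())
```

Here both the values and the index arrays keep the shape of the block grid, so x[ix] reproduces vals directly.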