I have a 2-dimensional array of integers, we'll call it "A".
I want to create a 3-dimensional array "B" of all 1s and 0s such that:
for any fixed (i,j) sum(B[i,j,:])==A[i,j], that is, B[i,j,:] contains A[i,j] 1s in it
the 1s are randomly placed in the 3rd dimension.
I know how I would do this using standard python indexing but this turns out to be very slow.
I am looking for a way to do this that takes advantage of the features that can make Numpy fast.
Here is how I would do it using standard indexing:
B = np.zeros((X, Y, Z))
indexoptions = range(Z)
# i runs over the first axis (length X), j over the second (length Y)
for i in range(X):
    for j in range(Y):
        replacedindices = np.random.choice(indexoptions, size=A[i, j], replace=False)
        B[i, j, replacedindices] = 1
Can someone please explain how I can do this in a faster way?
Edit: Here is an example "A":
A=np.array([[0,1,2,3,4],[0,1,2,3,4],[0,1,2,3,4],[0,1,2,3,4],[0,1,2,3,4]])
in this case X=Y=5 and Z>=5
Essentially the same idea as @JohnZwinck and @DSM, but with a shuffle function for shuffling a given axis:
import numpy as np

def shuffle(a, axis=-1):
    """
    Shuffle `a` in-place along the given axis.
    Apply numpy.random.shuffle to the given axis of `a`.
    Each one-dimensional slice is shuffled independently.
    """
    b = a.swapaxes(axis, -1)
    # Shuffle `b` in-place along the last axis. `b` is a view of `a`,
    # so `a` is shuffled in place, too.
    shp = b.shape[:-1]
    for ndx in np.ndindex(shp):
        np.random.shuffle(b[ndx])
    return

def random_bits(a, n):
    # b[i, j, :] starts as a[i, j] ones followed by zeros, then gets shuffled
    b = (a[..., np.newaxis] > np.arange(n)).astype(int)
    shuffle(b)
    return b

if __name__ == "__main__":
    np.random.seed(12345)
    A = np.random.randint(0, 5, size=(3,4))
    Z = 6
    B = random_bits(A, Z)
    print("A:")
    print(A)
    print("B:")
    print(B)
Output:
A:
[[2 1 4 1]
[2 1 1 3]
[1 3 0 2]]
B:
[[[1 0 0 0 0 1]
[0 1 0 0 0 0]
[0 1 1 1 1 0]
[0 0 0 1 0 0]]
[[0 1 0 1 0 0]
[0 0 0 1 0 0]
[0 0 1 0 0 0]
[1 0 1 0 1 0]]
[[0 0 0 0 0 1]
[0 0 1 1 1 0]
[0 0 0 0 0 0]
[0 0 1 0 1 0]]]
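The shuffle helper above still visits every (i, j) slice in a Python loop. As a rough sketch of a loop-free variant (not taken from the answers above; the name random_bits_vectorized and its rng parameter are made up for illustration), one can draw random keys and rank them along the last axis. It assumes every entry of A is at most Z:

```python
import numpy as np

def random_bits_vectorized(a, n, rng=None):
    """Return an array of shape a.shape + (n,) holding a[i, j] randomly placed 1s."""
    rng = np.random.default_rng() if rng is None else rng
    a = np.asarray(a)
    # argsort of i.i.d. random keys gives a uniformly random permutation of
    # 0..n-1 along the last axis of every slice.
    perm = rng.random(a.shape + (n,)).argsort(axis=-1)
    # Each permutation contains exactly a[i, j] values below a[i, j],
    # so exactly that many positions become 1.
    return (perm < a[..., np.newaxis]).astype(int)

A = np.array([[0, 1, 2, 3, 4]] * 5)
B = random_bits_vectorized(A, 6, rng=np.random.default_rng(0))
print(B.sum(axis=-1))  # should reproduce A
```

The trade-off is a sort of length Z per slice instead of a shuffle, which is usually acceptable when Z is small.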
I have a numpy array which contains vectorised data. I need to compute the Euclidean distance from each of these vectors (a row in the array) to itself and to every other row.
The vectors are of the form
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
I know I need two loops, here is what I have so far
def euclidean_distance_loop(termdoc):
    i = 0
    j = 0
    matrix = np.array([])
    while( j < (len(termdoc-1))):
        matrix = np.append(matrix,[euclidean_distance(termdoc[i],termdoc[j])])
        j = j + 1
    return np.array([matrix])
euclidean_distance_loop(termdoc)
I know this is an index problem and that I need another index, or an incremented index in another loop, but I am not sure how to construct it.
You don’t need loops.
def self_distance(x):
    return np.linalg.norm(x[:,np.newaxis] - x, axis=-1)
See also:
Numpy. Compare all vector row in one array with every other one in the same array
How can the Euclidean distance be calculated with NumPy?
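To make the broadcasting in self_distance concrete, here is a small sketch on three made-up 2-D points (the points array is only an illustration, not data from the question):

```python
import numpy as np

def self_distance(x):
    # x[:, np.newaxis] has shape (n, 1, d); subtracting x (shape (n, d))
    # broadcasts to (n, n, d), i.e. every row minus every other row.
    return np.linalg.norm(x[:, np.newaxis] - x, axis=-1)

points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
dist = self_distance(points)
print(dist)
# [[ 0.  5. 10.]
#  [ 5.  0.  5.]
#  [10.  5.  0.]]
```

Note that the intermediate array has n * n * d elements, so for a large term-document matrix this can use a lot of memory.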
I want to create a 64-component array showing all the squares to which two rooks on an otherwise empty chessboard could move from their current positions. So far I am doing it with for and while loops.
I first create a function just to better visualize the board:
import numpy as np
def from_array_to_matrix(v):
    m=np.zeros((8,8)).astype('int')
    for row in range(8):
        for column in range(8):
            m[row,column]=v[row*8+column]
    return m
and here I show how I actually build the array:
# positions of the two rooks
a=np.zeros(64).astype('int')
a[15] = 1
a[25] = 1
print(from_array_to_matrix(a))
# attack_a will be all the squares where they could move in the empty board
attack_a=np.zeros(64).astype('int')
for piece in np.where(a)[0]:
    # squares below the rook (same column)
    j=0
    square=piece+j*8
    while square<64:
        attack_a[square]=1
        j+=1
        square=piece+j*8
    # squares above the rook (same column)
    j=0
    square=piece-j*8
    while square>=0:
        attack_a[square]=1
        j+=1
        square=piece-j*8
    # squares to the right (same row)
    j=0
    square=piece+j
    while square<8*(1+piece//8):
        attack_a[square]=1
        j+=1
        square=piece+j
    # squares to the left (same row)
    j=0
    square=piece-j
    while square>=8*(piece//8):
        attack_a[square]=1
        j+=1
        square=piece-j
print(attack_a)
print(from_array_to_matrix(attack_a))
I have been advised to avoid for and while loops whenever other approaches are possible, because they tend to be time consuming. Is there any way to achieve the same result without for and while loops?
Perhaps by using the fact that the indices to which I want to assign the value 1 can be determined by a function.
There are a couple of different ways to do this. The simplest thing is of course to work with matrices.
But you can vectorize operations on the raveled array as well. For example, say you had a rook at position 0 <= n < 64 in the linear array. To set the row to one, use integer division:
array[8 * (n // 8):8 * (n // 8 + 1)] = True
To set the column, use modulo:
array[n % 8::8] = True
You can convert to a matrix using reshape:
matrix = array.reshape(8, 8)
And back using ravel:
array = matrix.ravel()
Or reshape:
array = matrix.reshape(-1)
Setting ones in a matrix is even simpler, given a specific row 0 <= m < 8 and column 0 <= n < 8:
matrix[m, :] = matrix[:, n] = True
Now the only question is how to vectorize multiple indices simultaneously. As it happens, you can use a fancy index in one axis. I.e., the expression above can be used with an m and n containing multiple elements:
m, n = np.nonzero(matrix)
matrix[m, :] = matrix[:, n] = True
You could even play games and do this with the array, also using fancy indexing:
n = np.nonzero(array)[0]
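# dtype=int: linspace returns floats by default, which cannot be used as indices
r = np.linspace(8 * (n // 8), 8 * (n // 8 + 1), 8, False, dtype=int).T.ravel()
c = np.linspace(n % 8, n % 8 + 64, 8, False, dtype=int)
array[r] = array[c] = True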
Using linspace allows you to generate multiple sequences of the same size simultaneously. Each sequence is a column, so we transpose before raveling, although this is not required.
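As a sketch of what these index arrays contain, here is the linspace approach applied to the rooks at positions 15 and 25 from the question. It assumes a NumPy version in which linspace accepts array-valued start and stop (1.16 or later), and it passes dtype=int so the results are usable as indices; the name attack is made up for the example:

```python
import numpy as np

array = np.zeros(64, dtype=bool)
array[[15, 25]] = True                 # the two rooks from the question
n = np.nonzero(array)[0]               # flat positions of the rooks

# Each result has shape (8, number_of_rooks): one column of indices per rook.
r = np.linspace(8 * (n // 8), 8 * (n // 8 + 1), 8, endpoint=False, dtype=int)
c = np.linspace(n % 8, n % 8 + 64, 8, endpoint=False, dtype=int)

attack = np.zeros(64, dtype=bool)
attack[r] = attack[c] = True           # 2-D index arrays work without raveling
print(r.T)                             # the row squares of each rook
print(c.T)                             # the column squares of each rook
print(attack.reshape(8, 8).astype(int))
```

Here r.T lists squares 8..15 and 24..31, and c.T lists every eighth square starting from 7 and from 1.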
Use reshaping to convert the 1-D array to an 8x8 2-D matrix, and then use numpy advanced indexing to select the rows and columns to set to 1:
import numpy as np
def from_array_to_matrix(v):
    return v.reshape(8,8)
# positions of the two rooks
a=np.zeros(64).astype('int')
a[15] = 1
a[25] = 1
a = from_array_to_matrix(a)
# attack_a will be all the squares where they could move in the empty board
attack_a=np.zeros(64).astype('int')
attack_a = from_array_to_matrix(attack_a)
#these two lines replace your for and while loops
attack_a[np.where(a)[0],:] = 1
attack_a[:,np.where(a)[1]] = 1
output:
a:
[[0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 1]
[0 0 0 0 0 0 0 0]
[0 1 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0]]
attack_a:
[[0 1 0 0 0 0 0 1]
[1 1 1 1 1 1 1 1]
[0 1 0 0 0 0 0 1]
[1 1 1 1 1 1 1 1]
[0 1 0 0 0 0 0 1]
[0 1 0 0 0 0 0 1]
[0 1 0 0 0 0 0 1]
[0 1 0 0 0 0 0 1]]
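Another way to phrase the same result, offered only as a sketch: a square is attacked if its row or its column contains a rook, which can be written as a broadcast OR of two boolean vectors. The names has_rook_in_row and has_rook_in_col are invented for this example:

```python
import numpy as np

a = np.zeros((8, 8), dtype=int)
a[1, 7] = a[3, 1] = 1                  # flat positions 15 and 25

has_rook_in_row = a.any(axis=1)        # shape (8,), one flag per row
has_rook_in_col = a.any(axis=0)        # shape (8,), one flag per column

# (8, 1) | (8,) broadcasts to (8, 8): True where the row or the column is hit.
attack_a = (has_rook_in_row[:, np.newaxis] | has_rook_in_col).astype(int)
print(attack_a)
```

This avoids np.where entirely and produces the same board as shown above.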
I currently have a matrix with the following values:
[[0,0,0,0,0],
[0,0,1,0,0],
[1,1,0,0,0],
[0,0,0,1,0],
[0,0,0,0,1]]
I would like to expand the values of 1 above and below by 1, resulting in a matrix like:
[[0,0,1,0,0],
[1,1,1,0,0],
[1,1,1,1,0],
[1,1,0,1,1],
[0,0,0,1,1]]
Create an empty row above and below, and then just add the array twice: shifted up one row and shifted down one row.
import numpy as np
a = np.array([[0,0,0,0,0],
[0,0,1,0,0],
[1,1,0,0,0],
[0,0,0,1,0],
[0,0,0,0,1]])
space = np.zeros((1,a.shape[1]),dtype=int)
c = np.vstack((space,a,space))
c[:a.shape[0]] += a
c[-a.shape[0]:]+= a
c = (c[1:-1]!=0).astype(int)
print(c)
Out:
[[0 0 1 0 0]
[1 1 1 0 0]
[1 1 1 1 0]
[1 1 0 1 1]
[0 0 0 1 1]]
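The same shift-and-combine idea can also be sketched without padding, by OR-ing row slices of the original array into a copy. This is an alternative illustration rather than the method above, and the name out is made up:

```python
import numpy as np

a = np.array([[0, 0, 0, 0, 0],
              [0, 0, 1, 0, 0],
              [1, 1, 0, 0, 0],
              [0, 0, 0, 1, 0],
              [0, 0, 0, 0, 1]])

out = a.copy()
out[1:] |= a[:-1]     # every 1 also marks the row below it
out[:-1] |= a[1:]     # every 1 also marks the row above it
print(out)
```

Both updates read from the untouched a, so a 1 never spreads by more than one row.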
I have a 3D image of size Deep x Weight x Height (for example, 10x20x30 means 10 images, each of size 20x30).
Given a patch size pd x pw x ph (with pd < Deep, pw < Weight, ph < Height), for example 4x4x4, the center point of the patch is at pd/2 x pw/2 x ph/2. Call the distance between the center point at time t and at time t+1 the stride, for example stride=2.
I want to split the original 3D image into patches with the size and stride given above. How can I do it in Python? Thank you.
Use np.lib.stride_tricks.as_strided. This solution does not require the strides to divide the corresponding dimensions of the input stack. It even allows for overlapping patches (Just do not write to the result in this case, or make a copy.). It therefore is more flexible than other approaches:
import numpy as np
from numpy.lib import stride_tricks
def cutup(data, blck, strd):
    sh = np.array(data.shape)
    blck = np.asanyarray(blck)
    strd = np.asanyarray(strd)
    # number of blocks that fit along each axis
    nbl = (sh - blck) // strd + 1
    strides = np.r_[data.strides * strd, data.strides]
    dims = np.r_[nbl, blck]
    data6 = stride_tricks.as_strided(data, strides=strides, shape=dims)
    return data6#.reshape(-1, *blck)
#demo
x = np.zeros((5, 6, 12), int)
y = cutup(x, (2, 2, 3), (3, 3, 5))
y[...] = 1
print(x[..., 0], '\n')
print(x[:, 0, :], '\n')
print(x[0, ...], '\n')
Output:
[[1 1 0 1 1 0]
[1 1 0 1 1 0]
[0 0 0 0 0 0]
[1 1 0 1 1 0]
[1 1 0 1 1 0]]
[[1 1 1 0 0 1 1 1 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]]
[[1 1 1 0 0 1 1 1 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]]
Explanation. Numpy arrays are organised in terms of strides, one for each dimension: the data point [x,y,z] is located in memory at address base + stridex * x + stridey * y + stridez * z.
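The address formula can be checked on a small array. The following is only an illustrative sketch: it fixes the dtype to int64 so the strides are predictable, and reads one element back from the raw bytes at the computed offset:

```python
import numpy as np

x = np.arange(5 * 6 * 12, dtype=np.int64).reshape(5, 6, 12)
sx, sy, sz = x.strides                 # bytes to step along each axis
print(x.strides)                       # (576, 96, 8) for a C-ordered int64 array

i, j, k = 2, 3, 7
offset = sx * i + sy * j + sz * k      # byte offset of x[i, j, k] from the base

# Read the 8 bytes at that offset directly from a copy of the raw buffer.
value = np.frombuffer(x.tobytes(), dtype=np.int64, count=1, offset=offset)[0]
print(value, x[i, j, k])               # both are 187
```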
The stride_tricks.as_strided factory lets you directly manipulate the strides and shape of a new array that shares its memory with a given array. Try this only if you know what you're doing, because no checks are performed, meaning you can shoot yourself in the foot by addressing out-of-bounds memory.
The code uses this function to split each of the three existing dimensions into two new ones: one for the within-block coordinate (this has the same stride as the original dimension, because adjacent points in a block correspond to adjacent points in the whole stack) and one for the block index along this axis (this has stride = original stride x block stride).
All the code does is compute the correct strides and dimensions (= block dimensions and block counts along the three axes).
Since the data are shared with the original array, when we set all points of the 6d array to 1, they are also set in the original array exposing the block structure in the demo. Note that the commented out reshape in the last line of the function breaks this link, because it forces a copy.
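If a recent NumPy is available (1.20 or later), a bounds-checked way to get the same kind of view is sliding_window_view followed by a strided slice. This is a sketch of an alternative, not the method used above; it uses the 10x20x30 image, 4x4x4 patch and stride 2 from the question, and the variable names are made up:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

image = np.arange(10 * 20 * 30).reshape(10, 20, 30)
patch, stride = (4, 4, 4), 2

# All window positions as a read-only view: shape (7, 17, 27, 4, 4, 4).
windows = sliding_window_view(image, patch)
# Keep only every `stride`-th starting position along each axis.
patches = windows[::stride, ::stride, ::stride]

print(patches.shape)                   # (4, 9, 14, 4, 4, 4)
```

patches[a, b, c] is then the block image[2*a:2*a+4, 2*b:2*b+4, 2*c:2*c+4], still without copying any data.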
The skimage module offers an integrated solution with view_as_blocks.
The source is online.
Take care to choose Deep, Weight, Height as multiples of pd, pw, ph, because as_strided does not check bounds.
I have a problem with the function np.nonzero() in Python. I want to get all the indices of a given array whose entries are non-zero. So, consider that I have the following code:
import numpy as np
from scipy.special import binom
M=4
N=3
def generate(N,nb):
    states = np.zeros((int(binom(nb+N-1, nb)), N), dtype=int)
    states[0, 0]=nb
    ni = 0 # init
    for i in range(1, states.shape[0]):
        states[i,:N-1] = states[i-1, :N-1]
        states[i,ni] -= 1
        states[i,ni+1] += 1+states[i-1, N-1]
        if ni >= N-2:
            if np.any(states[i, :N-1]):
                ni = np.nonzero(states[i, :N-1])[0][-1]
        else:
            ni += 1
    return states
base = generate(M,N)
The result of base is given by:
base = [[3 0 0 0]
[2 1 0 0]
[2 0 1 0]
[2 0 0 1]
[1 2 0 0]
[1 1 1 0]
[1 1 0 1]
[1 0 2 0]
[1 0 1 1]
[1 0 0 2]
[0 3 0 0]
[0 2 1 0]
[0 2 0 1]
[0 1 2 0]
[0 1 1 1]
[0 1 0 2]
[0 0 3 0]
[0 0 2 1]
[0 0 1 2]
[0 0 0 3]]
The point is that for given indices j,k I want to find all the rows of base that have non-zero components at both sites j and k. For example,
taking j=0,k=1 I should obtain:
result = [1 4 5 6]
which corresponds to rows 1,4,5,6 of base, the ones that satisfy this condition. On the other hand, I have used the command:
np.nonzero((base[:, j]) & (base[:, k]))[0]
but it doesn't work correctly, any idea why?
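First of all, the index syntax base[:, j] is valid as long as base is a NumPy array (it would only fail on a plain list of lists).
However:
np.nonzero((base[:, j]) & (base[:, k]))[0]
won't work as intended, because & is a bitwise AND on integers: 2 & 1 == 0 even though both entries are non-zero, so rows such as [2 1 0 0] are dropped.
You could multiply the columns instead:
b = np.array(base)
j = 0; k = 1
np.nonzero(b.T[j] * b.T[k])[0]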
which will give:
array([1, 4, 5, 6])
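To see the difference concretely, here is a small sketch using only the first eight rows of base from the question (hard-coded so the example is self-contained). It compares the bitwise AND of the raw integers with a logical AND of "is non-zero" masks; the names bitwise and logical are made up:

```python
import numpy as np

# First eight rows of `base` from the question.
base = np.array([[3, 0, 0, 0], [2, 1, 0, 0], [2, 0, 1, 0], [2, 0, 0, 1],
                 [1, 2, 0, 0], [1, 1, 1, 0], [1, 1, 0, 1], [1, 0, 2, 0]])
j, k = 0, 1

# Bitwise AND of the integers: 2 & 1 == 0 and 1 & 2 == 0, so rows 1 and 4 vanish.
bitwise = np.nonzero(base[:, j] & base[:, k])[0]
# Logical AND of boolean masks: True where both entries are non-zero.
logical = np.nonzero((base[:, j] != 0) & (base[:, k] != 0))[0]

print(bitwise)   # [5 6]
print(logical)   # [1 4 5 6]
```

Comparing with != 0 first also avoids the integer overflow that multiplying large entries could cause.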