import numpy as np
import itertools as it
SPIN_POS = np.array([[0, 0, 0], [1, 1, 0], [1, 0, 1], [0, 1, 1],
[2, 2, 0], [3, 3, 0], [3, 2, 1], [2, 3, 1],
[2, 0, 2], [3, 1, 2], [3, 0, 3], [2, 1, 3],
[0, 2, 2], [1, 3, 2], [1, 2, 3], [0, 3, 3]
]) / 4
def gen_posvecs(xdim:int, ydim:int, zdim:int):
    """
    Generates the position vectors of the sites in a lattice of size xdim, ydim, zdim.
    :param xdim,ydim,zdim: number of unit cells in the x,y,z directions
    :returns: array containing the position vectors
    """
    poss = np.zeros((xdim, ydim, zdim, 16, 3))
    for x, y, z, s in it.product(range(xdim), range(ydim), range(zdim), range(16)):
        poss[x, y, z, s] = np.array([x, y, z]) + SPIN_POS[s]
    return poss
A = gen_posvecs(4,4,4) # A.shape = (4,4,4,16,3)
B = np.subtract.outer(A[...,-1], A) # my attempt at a soln
assert all(A[1,2,0,12] - A[0,1,3,11] == B[1,2,0,12,0,1,3,11]) # should give true
Consider the above code. I have an array A of shape (4,4,4,16,3), which represents 3D position vectors in a lattice (the last axis, of length 3, holds the x,y,z coordinates). The first 4 dimensions index the site in the lattice.
What I want
I would like to generate from A an array containing all possible separation vectors between sites in the lattice. This means an output array B of shape (4,4,4,16,4,4,4,16,3): the first 4 dimensions index site i, the next 4 dimensions index site j, and the last dimension holds the (x,y,z) coordinates of the difference of their position vectors.
i.e., if A[a,b,c,d] (shape (3,)) is the (x,y,z) of the first site and A[r,s,t,u] (shape (3,)) is the (x,y,z) of the second site, then I want B[a,b,c,d,r,s,t,u] to be the (x,y,z) difference between the two.
My attempt
I know about the ufunc.outer function, as you can see in my attempt in the code. But I'm stuck on how to combine it with element-wise subtraction on the last axis (the (x,y,z)) of A.
In my attempt, B has the correct dimensions I want, but it is obviously wrong. Any hints? (barring the use of any for-loops)
I think you just need to do:
B = (A[:, :, :, :, np.newaxis, np.newaxis, np.newaxis, np.newaxis] -
A[np.newaxis, np.newaxis, np.newaxis, np.newaxis])
In your code:
import numpy as np
import itertools as it
SPIN_POS = np.array([[0, 0, 0], [1, 1, 0], [1, 0, 1], [0, 1, 1],
[2, 2, 0], [3, 3, 0], [3, 2, 1], [2, 3, 1],
[2, 0, 2], [3, 1, 2], [3, 0, 3], [2, 1, 3],
[0, 2, 2], [1, 3, 2], [1, 2, 3], [0, 3, 3]
]) / 4
def gen_posvecs(xdim:int, ydim:int, zdim:int):
    """
    Generates the position vectors of the sites in a lattice of size xdim, ydim, zdim.
    :param xdim,ydim,zdim: number of unit cells in the x,y,z directions
    :returns: array containing the position vectors
    """
    poss = np.zeros((xdim, ydim, zdim, 16, 3))
    for x, y, z, s in it.product(range(xdim), range(ydim), range(zdim), range(16)):
        poss[x, y, z, s] = np.array([x, y, z]) + SPIN_POS[s]
    return poss
A = gen_posvecs(4,4,4) # A.shape = (4,4,4,16,3)
B = A[:, :, :, :, np.newaxis, np.newaxis, np.newaxis, np.newaxis] - A[np.newaxis, np.newaxis, np.newaxis, np.newaxis]
assert all(A[1,2,0,12] - A[0,1,3,11] == B[1,2,0,12,0,1,3,11])
# Does not fail
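For what it's worth, the same broadcast can be written a bit more compactly with an Ellipsis; this is just an equivalent spelling of the answer above, not a different method:
B = A[..., np.newaxis, np.newaxis, np.newaxis, np.newaxis, :] - A
assert B.shape == (4, 4, 4, 16, 4, 4, 4, 16, 3)
assert np.allclose(A[1,2,0,12] - A[0,1,3,11], B[1,2,0,12,0,1,3,11])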
Related
Given a matrix, I want to split it into smaller matrices of size m x n. If the matrix is not evenly divisible by the given size, we just put the remainder into smaller matrices.
For example, given the matrix below and m=2 and n=2:
[[1, 0, 1],
[0, 0, 0],
[0, 1, 1]]
Result:
[[1, 0],
[0, 0]],
[[1],
[0]],
[[0, 1]],
[[1]],
I was using np.reshape but it fails to split when the numbers don't match, as in the example above.
matrix_size = matrix.shape[0] * matrix.shape[1]
n_matrix = math.ceil(matrix_size / (m * n))
matrix.reshape(n_matrix, m, n)
One way you could do this is with multiple calls to numpy.array_split:
import numpy as np
matrix = [
[1, 0, 1],
[0, 0, 0],
[0, 1, 1],
]
sub_matrices = np.array_split(matrix, 2, axis=0)
sub_matrices = [m for sub_matrix in sub_matrices for m in np.array_split(sub_matrix, 2, axis=1)]
Where the first call to array_split splits it vertically, and the second call splits it horizontally.
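If you need m x n blocks for arbitrary matrix shapes rather than the hard-coded two splits above, a sketch of a generalization of the same array_split idea (split_into_blocks is just a name I made up) is to pass explicit split indices, so every block has exactly m rows and n columns except for the remainder blocks at the edges:
import numpy as np

def split_into_blocks(matrix, m, n):
    matrix = np.asarray(matrix)
    # split at row indices m, 2m, ... and column indices n, 2n, ...;
    # the last chunk in each direction holds whatever remainder is left
    rows = np.array_split(matrix, range(m, matrix.shape[0], m), axis=0)
    return [block for row in rows
            for block in np.array_split(row, range(n, matrix.shape[1], n), axis=1)]

For the 3x3 example with m=n=2 this returns the four blocks listed in the question.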
Suppose I have a 2D tensor x of shape (n,m). How can I extend the first dimension of the tensor by appending zero rows to x, specifying the indices of where the zero rows will be located in the resulting tensor? For a concrete example:
x = torch.tensor([[1,1,1],
[2,2,2],
[3,3,3],
[4,4,4]])
And I want to append 2 zero rows such that their row indices will be 1 and 3, respectively, in the resulting tensor. I.e., in the example the result would be
X = torch.tensor([[1,1,1],
[0,0,0],
[2,2,2],
[0,0,0],
[3,3,3],
[4,4,4]])
I tried using F.pad and reshape.
You can use torch.Tensor.index_add_.
import torch
zero_index = [1, 3]
size = (6, 3)
x = torch.tensor([[1,1,1],
[2,2,2],
[3,3,3],
[4,4,4]])
t = torch.zeros(size, dtype=torch.int64)
index = torch.tensor([i for i in range(size[0]) if i not in zero_index])
# index -> tensor([0, 2, 4, 5])
t.index_add_(0, index, x)
print(t)
Output:
tensor([[1, 1, 1],
[0, 0, 0],
[2, 2, 2],
[0, 0, 0],
[3, 3, 3],
[4, 4, 4]])
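Since the target tensor starts out as all zeros, the same idea also works with plain index assignment instead of index_add_ (a minor variation on the answer above):
t = torch.zeros(size, dtype=x.dtype)
t[index] = x  # write the rows of x into the positions that should be non-zero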
You can use torch.cat:
def insert_zeros(x, all_j):
    zeros_ = torch.zeros_like(x[:1])
    pieces = []
    i = 0
    for j in all_j + [len(x)]:
        pieces.extend([x[i:j], zeros_])
        i = j
    return torch.cat(pieces[:-1], dim=0)
# insert_zeros(x, [1,2])
# tensor([[1, 1, 1],
# [0, 0, 0],
# [2, 2, 2],
# [0, 0, 0],
# [3, 3, 3],
# [4, 4, 4]])
This code is compatible with backpropagation, since the tensors are not modified in-place.
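As a quick sanity check of the backpropagation claim (my own test, not part of the original answer), gradients flow through insert_zeros back to every row of x:
x = torch.tensor([[1., 1., 1.],
                  [2., 2., 2.]], requires_grad=True)
y = insert_zeros(x, [1])  # insert one zero row after the first row
y.sum().backward()
print(x.grad)  # tensor([[1., 1., 1.],
               #         [1., 1., 1.]])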
More information:
What's the difference between torch.stack() and torch.cat()?
If I have the array [[1,0,0],[0,1,0],[0,0,1]] (let's call it So), which is created with numpy.eye(3):
How can I make the elements below the diagonal equal to 2 and 3, like this: [[1,0,0],[2,1,0],[3,2,1]]? How can I assign the diagonals of an array to a different set of values?
I know I could use numpy.concatenate to join 3 vectors and I know how to change rows/columns but I can't figure out how to change diagonals below the main diagonal.
I tried to do np.diagonal(So,-1)=2*np.diagonal(So,-1) to change the diagonal right below the main diagonal but I get the error message cannot assign to function call.
I would not start from numpy.eye but rather from numpy.ones, and use numpy.tril + cumsum to compute the increasing numbers on the lower triangle:
import numpy as np
np.tril(np.ones((3,3))).cumsum(axis=0).astype(int)
output:
array([[1, 0, 0],
[2, 1, 0],
[3, 2, 1]])
reversed output (from comment)
Assuming the array is square
n = 3
a = np.tril(np.ones((n,n)))
(a*(n+2)-np.eye(n)*n-a.cumsum(axis=0)).astype(int)
Output:
array([[1, 0, 0],
[3, 1, 0],
[2, 3, 1]])
Output for n=5:
array([[1, 0, 0, 0, 0],
[5, 1, 0, 0, 0],
[4, 5, 1, 0, 0],
[3, 4, 5, 1, 0],
[2, 3, 4, 5, 1]])
You can use np.fill_diagonal and slice the matrix so that the diagonal you want becomes the principal diagonal of the view. This is a good solution if you want to put in values other than 2 and 3:
import numpy as np
q = np.eye(3)
# if you want the first diagonal below the principal one,
# you can work on the view q[1:,:] (a 2x3 view, but fill_diagonal handles it)
val = 2
np.fill_diagonal(q[1:,:], val)
# note that you can pass a single value 'val' or
# an array of values of the corresponding size:
# np.fill_diagonal(q[1:,:], [2, 2])
# then you can do the same for the next diagonal down
np.fill_diagonal(q[2:,:], 3)
You could follow this approach:
def func(n):
    return np.array([list(range(i, 0, -1)) + [0] * (n - i) for i in range(1, n + 1)])

func(3)
Output:
array([[1, 0, 0],
[2, 1, 0],
[3, 2, 1]])
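Another loop-free way to build the same pattern (my own sketch, not from the answers above) is to compare the row and column indices directly:
import numpy as np

n = 3
i, j = np.indices((n, n))
result = np.where(i >= j, i - j + 1, 0)
# array([[1, 0, 0],
#        [2, 1, 0],
#        [3, 2, 1]])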
I'm attempting to slice a 3-D instance of ndarray representing a color image, where each element (or pixel) in the 2-D array contains an array with 3 bytes corresponding to the values of Red, Green, and Blue respectively. I want to slice out a 2-D ndarray for each color individually so that I can flatten them and concatenate them end-to-end as per our implementation requirements. The code I'm currently trying is...
red = image[:, :, 0]
green = image[:, :, 1]
blue = image[:, :, 2]
collapsed_image = numpy.concatenate((red.flatten('C'), green.flatten('C'), blue.flatten('C')), axis=0)
Where image is my numpy.ndarray object containing the 3-D array of bytes. Would this work to slice out each individual color's 2-D array and flatten/concatenate them together end-to-end?
Do you mean to achieve an output like this?
from scipy.ndimage import *   # imread was available here in older SciPy versions
import numpy as np
import matplotlib.pyplot as p
%matplotlib inline
im = imread('rgb.png')
print(np.shape(im))
p.subplot(121)
p.imshow(im)
red = im[:, :, 0]
green = im[:, :, 1]
blue = im[:, :, 2]
imchannels = np.concatenate((red, green, blue))
p.subplot(122)
p.imshow(imchannels)
Output:
(215L, 235L, 3L)
An ndarray is already a flat collection of bytes in memory, but not always in the desired order. np.rollaxis can change that order.
As a simple example:
First, a classic 2x2 image (each digit corresponds to a channel):
image=np.arange(12).reshape(2,2,3)%3
In [08]: image
Out[08]:
array([[[0, 1, 2],
[0, 1, 2]],
[[0, 1, 2],
[0, 1, 2]]], dtype=int32)
Another view, channel first:
bycolor= r,g,b = np.rollaxis(image,axis=2)
In [10]: bycolor
Out[10]:
array([[[0, 0],
[0, 0]],
[[1, 1],
[1, 1]],
[[2, 2],
[2, 2]]], dtype=int32)
and the flattened layouts:
In [11]: image.flatten()
Out[11]: array([0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2], dtype=int32)
In [12]: bycolor.flatten()
Out[12]: array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2], dtype=int32)
I think the last one is what you want: np.rollaxis(image, 2).flatten()
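If you find np.rollaxis hard to read, the same channel-first flattening can also be spelled with transpose or np.moveaxis (equivalent, just a different spelling):
image.transpose(2, 0, 1).flatten()
# array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
np.moveaxis(image, 2, 0).flatten()  # same result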
I am seeing behaviour with numpy bincount that I cannot make sense of. I want to bin the values in a 2D array in a row-wise manner and see the behaviour below. Why would it work with dbArray but fail with simarray?
>>> dbArray
array([[1, 0, 1, 0, 1],
[1, 1, 1, 1, 1],
[1, 1, 0, 1, 1],
[1, 0, 0, 0, 0],
[0, 0, 0, 1, 1],
[0, 1, 0, 1, 0]])
>>> N.apply_along_axis(N.bincount,1,dbArray)
array([[2, 3],
[0, 5],
[1, 4],
[4, 1],
[3, 2],
[3, 2]], dtype=int64)
>>> simarray
array([[2, 0, 2, 0, 2],
[2, 1, 2, 1, 2],
[2, 1, 1, 1, 2],
[2, 0, 1, 0, 1],
[1, 0, 1, 1, 2],
[1, 1, 1, 1, 1]])
>>> N.apply_along_axis(N.bincount,1,simarray)
Traceback (most recent call last):
File "<pyshell#31>", line 1, in <module>
N.apply_along_axis(N.bincount,1,simarray)
File "C:\Python27\lib\site-packages\numpy\lib\shape_base.py", line 118, in apply_along_axis
outarr[tuple(i.tolist())] = res
ValueError: could not broadcast input array from shape (2) into shape (3)
The problem is that bincount isn't always returning the same shaped objects, in particular when values are missing. For example:
>>> m = np.array([[0,0,1],[1,1,0],[1,1,1]])
>>> np.apply_along_axis(np.bincount, 1, m)
array([[2, 1],
[1, 2],
[0, 3]])
>>> [np.bincount(m[i]) for i in range(m.shape[0])]
[array([2, 1]), array([1, 2]), array([0, 3])]
works, but:
>>> m = np.array([[0,0,0],[1,1,0],[1,1,0]])
>>> m
array([[0, 0, 0],
[1, 1, 0],
[1, 1, 0]])
>>> [np.bincount(m[i]) for i in range(m.shape[0])]
[array([3]), array([1, 2]), array([1, 2])]
>>> np.apply_along_axis(np.bincount, 1, m)
Traceback (most recent call last):
File "<ipython-input-49-72e06e26a718>", line 1, in <module>
np.apply_along_axis(np.bincount, 1, m)
File "/usr/local/lib/python2.7/dist-packages/numpy/lib/shape_base.py", line 117, in apply_along_axis
outarr[tuple(i.tolist())] = res
ValueError: could not broadcast input array from shape (2) into shape (1)
won't.
You could use the minlength parameter and pass it using a lambda or partial or something:
>>> np.apply_along_axis(lambda x: np.bincount(x, minlength=2), axis=1, arr=m)
array([[3, 0],
[1, 2],
[1, 2]])
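The functools.partial version mentioned above looks like this and gives the same result as the lambda:
>>> from functools import partial
>>> np.apply_along_axis(partial(np.bincount, minlength=2), axis=1, arr=m)
array([[3, 0],
       [1, 2],
       [1, 2]])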
As @DSM has already mentioned, bincount of a 2d array cannot be done without knowing the maximum value of the array, because otherwise the per-row results would have inconsistent sizes.
But thanks to the power of numpy's indexing, it is fairly easy to make a faster implementation of 2d bincount, as it doesn't use concatenation or anything.
def bincount2d(arr, bins=None):
    if bins is None:
        bins = np.max(arr) + 1
    count = np.zeros(shape=[len(arr), bins], dtype=np.int64)
    indexing = np.arange(len(arr))
    for col in arr.T:
        count[indexing, col] += 1
    return count
t = np.array([[1,2,3],[4,5,6],[3,2,2]], dtype=np.int64)
print(bincount2d(t))
P.S.
This:
t = np.empty(shape=[10000, 100], dtype=np.int64)
s = time.time()
bincount2d(t)
e = time.time()
print(e - s)
runs about 2 times faster than this:
t = np.empty(shape=[100, 10000], dtype=np.int64)
s = time.time()
bincount2d(t)
e = time.time()
print(e - s)
because the for loop iterates over columns. So it's better to transpose your 2d array if shape[0] < shape[1].
UPD
I don't think this can be done any better (using Python alone, I mean):
def bincount2d(arr, bins=None):
    if bins is None:
        bins = np.max(arr) + 1
    count = np.zeros(shape=[len(arr), bins], dtype=np.int64)
    indexing = (np.ones_like(arr).T * np.arange(len(arr))).T
    np.add.at(count, (indexing, arr), 1)
    return count
This is a function that does exactly what you want, but without any loops.
def sub_sum_partition(a, partition):
    """
    Generalization of np.bincount(partition, a).
    Sums rows of a matrix for each value of an array of non-negative ints.

    :param a: array_like
    :param partition: array_like, 1 dimension, non-negative ints
    :return: matrix of shape (one larger than the largest value in partition, *a.shape[1:]).
             Its i-th element is the sum of rows j in 'a' s.t. partition[j] == i
    """
    assert partition.shape == (len(a),)
    n = np.prod(a.shape[1:], dtype=int)
    bins = ((np.tile(partition, (n, 1)) * n).T + np.arange(n, dtype=int)).reshape(-1)
    sums = np.bincount(bins, a.reshape(-1))
    if n > 1:
        sums = sums.reshape(-1, *a.shape[1:])
    return sums
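For example (my own quick check, not part of the original answer), summing the rows of a 2-column matrix grouped by partition labels:
a = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])
partition = np.array([0, 1, 0])
print(sub_sum_partition(a, partition))
# [[6. 8.]   <- sum of rows 0 and 2 (partition == 0)
#  [3. 4.]]  <- row 1 (partition == 1)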