Multiply tensors containing matrices, following matrix multiplication rule

Multiply tensors containing matrices, following matrix multiplication rule - python

Say I have a tensor, where A, B, C, and D are all 2x2 matrices:
M = [[A, B],
[C, D]]
How do I get to the power of n, for example with n=2, with Python or MATLAB
M^2 = [[A#A + B#C, A#B + B#D],
[C#A + D#C, C#B + D#D]]
Here the power just follows the normal matrix multiplication rule; it's just that the elements are matrices themselves. I tried matmul, matrix_power, and pagemtimes, but nothing works.

You are just computing the normal matrix product of the 4x4 block matrix created by joining the smaller matrices A through D.
In MATLAB, your expected result using some arbitrary matrices:
A = [1, 2
3, 4];
B = [5, 6
7, 8];
C = [ 9, 10
11, 12];
D = [13, 14
15, 16];
res = [A*A + B*C, A*B + B*D
C*A + D*C, C*B + D*D]
res =
118 132 174 188
166 188 254 276
310 356 494 540
358 412 574 628
The 4x4 block matrix, and its square:
M = [A, B
C, D];
res2 = M^2
res2 =
118 132 174 188
166 188 254 276
310 356 494 540
358 412 574 628

Probably not the most efficient, but here's a manual solution:
M = np.random.randint(0, 10, (2, 2, 2, 2))
def matmatmul(a, b):
output = np.zeros((a.shape[0], b.shape[1]), dtype = object)
for i in range(output.shape[0]):
for j in range(output.shape[1]):
row = a[i]
col = b[:, j]
output[i,j] = sum([r # c for r,c in zip(row, col)])
return output
def matmatpow(a, n):
if n == 1:
return a
else:
output = matmatmul(a, a)
for i in range(2, n):
output = matmatmul(output, a)
return output
M2 = matmatpow(M, 2)
print(M2)
[[A, B], [C, D]] = M
assert np.all(M2[0,0] == A#A + B#C)
assert np.all(M2[0,1] == A#B + B#D)
assert np.all(M2[1,0] == C#A + D#C)
assert np.all(M2[1,1] == C#B + D#D)

Defining a set of (2,2) arrays, and their composite:
In [45]: A,B,C,D = [np.arange(i,i+4).reshape(2,2) for i in range(4)]
In [46]: M=np.array([[A,B],[C,D]])
Your desired M^2 array:
In [47]: np.array([[A#A + B#C, A#B + B#D],
...: [C#A + D#C, C#B + D#D]])
Out[47]:
array([[[[12, 16],
[28, 40]],
[[16, 20],
[40, 52]]],
[[[28, 40],
[44, 64]],
[[40, 52],
[64, 84]]]])
The same thing using einsum. In this j and l are the sum-of-products dimensions:
In [48]: np.einsum('ijkl,jmln->imkn',M,M)
Out[48]:
array([[[[12, 16],
[28, 40]],
[[16, 20],
[40, 52]]],
[[[28, 40],
[44, 64]],
[[40, 52],
[64, 84]]]])
matmul is the equivalent of 'ijkl,ijlm->ijkm', where ij are batch dimensions, and l is the sum-of-products. Often an einsum can be reproduced with some reshape and generalized transposing. But I'll leave that for someone else to explore.
Playing around with the einsum indices and transposing and reshaping the arrays, I can get the equivalent of:
In [56]: np.matmul(M.transpose(0,2,1,3).reshape(4,4),M.transpose(0,2,1,3).reshape(4,4))
Out[56]:
array([[12, 16, 16, 20],
[28, 40, 40, 52],
[28, 40, 40, 52],
[44, 64, 64, 84]])
which with a bit more massaging becomes the desired (4,4,4,4)
In [57]: np.matmul(M.transpose(0,2,1,3).reshape(4,4),M.transpose(0,2,1,3).reshape(4,4)).reshape(2,2,2,2).transpose(0,2,1,3)
Out[57]:
array([[[[12, 16],
[28, 40]],
[[16, 20],
[40, 52]]],
[[[28, 40],
[44, 64]],
[[40, 52],
[64, 84]]]])

Related

Is it possible to make this function on numpy array more efficient?

Here a is a 1D array of integer indices. To give some context, a is the atom indices of the 1st molecules. The return is the atom indices for n identical molecules, each of which contains step atoms. This function basically applies the same atom selection to many molecules
def f(a, step, n):
a.shape=1,-1
got = np.repeat(a, n, axis=0)
blocks = np.arange(0, n*step, step)
blocks.shape = n, -1
return got + blocks
For example
In [52]: f(np.array([1,3,4]), 10, 4)
Out[52]:
array([[ 1, 3, 4],
[11, 13, 14],
[21, 23, 24],
[31, 33, 34]])

It looks like broadcasting should be enough:
def f(a, step, n):
return a + np.arange(0, n*step, step)[:, None]
f(np.array([1,3,4]), 10, 4)
output:
array([[ 1, 3, 4],
[11, 13, 14],
[21, 23, 24],
[31, 33, 34]])

Python: create a matrix with differing integers in each individual row-vector [duplicate]

I'm trying to create a 2d array (which is a six column and lots of rows) with numpy random choice with unique values between 1 and 50 for every row not all of the array
np.sort(np.random.choice(np.arange(1,50),size=(100,6),replace=False))
But this raises an error.
ValueError: Cannot take a larger sample than population when 'replace=False'
Is it possible to make this with an one liner without a loop
Edit
Okey i get the answer.
These are the results with jupyter %time cellmagic
##James' solution
np.stack([np.random.choice(np.arange(1,50),size=6,replace=False) for i in range(1_000_000)])
Wall time: 25.1 s
##Divakar's solution
np.random.rand(1_000_000, 50).argpartition(6,axis=1)[:,:6]+1
Wall time: 1.36 s
##CoryKramer's solution
np.array([np.random.choice(np.arange(1, 50), size=6, replace=False) for _ in range(1_000_000)])
Wall time: 25.5 s
I changed dtypes of np.empty and np.random.randint on #Paul Panzer's solution because it was not working on my pc.
3.6.0 |Anaconda custom (64-bit)| (default, Dec 23 2016, 11:57:41) [MSC v.1900 64 bit (AMD64)]
Fastest one is
def pp(n):
draw = np.empty((n, 6), dtype=np.int64)
# generating random numbers is expensive, so draw a large one and
# make six out of one
draw[:, 0] = np.random.randint(0, 50*49*48*47*46*45, (n,),dtype=np.uint64)
draw[:, 1:] = np.arange(50, 45, -1)
draw = np.floor_divide.accumulate(draw, axis=-1)
draw[:, :-1] -= draw[:, 1:] * np.arange(50, 45, -1)
# map the shorter ranges (:49, :48, :47) to the non-occupied
# positions; this amounts to incrementing for each number on the
# left that is not larger. the nasty bit: if due to incrementing
# new numbers on the left are "overtaken" then for them we also
# need to increment.
for i in range(1, 6):
coll = np.sum(draw[:, :i] <= draw[:, i, None], axis=-1)
collidx = np.flatnonzero(coll)
if collidx.size == 0:
continue
coll = coll[collidx]
tot = coll
while True:
draw[collidx, i] += coll
coll = np.sum(draw[collidx, :i] <= draw[collidx, i, None], axis=-1)
relidx = np.flatnonzero(coll > tot)
if relidx.size == 0:
break
coll, tot = coll[relidx]-tot[relidx], coll[relidx]
collidx = collidx[relidx]
return draw + 1
##Paul Panzer' solution
pp(1_000_000)
Wall time: 557 ms
Thank you all.

Here's a vectorized approach with rand+argsort/argpartition trick from here -
np.random.rand(rows, 50).argpartition(6,axis=1)[:,:6]+1
Sample run -
In [41]: rows = 10
In [42]: np.random.rand(rows, 50).argpartition(6,axis=1)[:,:6]+1
Out[42]:
array([[ 1, 9, 3, 26, 14, 44],
[32, 20, 27, 13, 25, 45],
[40, 12, 47, 16, 10, 29],
[ 6, 36, 32, 16, 18, 4],
[42, 46, 24, 9, 1, 31],
[15, 25, 47, 42, 34, 24],
[ 7, 16, 49, 31, 40, 20],
[28, 17, 47, 36, 8, 44],
[ 7, 42, 14, 4, 17, 35],
[39, 19, 37, 7, 8, 36]])
Just to prove the random-ness -
In [56]: rows = 1000000
In [57]: out = np.random.rand(rows, 50).argpartition(6,axis=1)[:,:6]+1
In [58]: np.bincount(out.ravel())[1:]
Out[58]:
array([120048, 120026, 119942, 119838, 119885, 119669, 119965, 119491,
120280, 120108, 120293, 119399, 119917, 119974, 120195, 119796,
119887, 119505, 120235, 119857, 119499, 120560, 119891, 119693,
120081, 120369, 120011, 119714, 120218, 120581, 120111, 119867,
119791, 120265, 120457, 120048, 119813, 119702, 120266, 120445,
120016, 120190, 119576, 119737, 120153, 120215, 120144, 120196,
120218, 119863])
Timings on one million rows of data -
In [43]: rows = 1000000
In [44]: %timeit np.random.rand(rows, 50).argpartition(6,axis=1)[:,:6]+1
1 loop, best of 3: 1.07 s per loop

This isn't pure numpy but you could wrap your solution within a list comprehension
>>> rows = 10
>>> cols = 6
>>> np.array([np.random.choice(np.arange(1, 50), size=cols, replace=False) for _ in range(rows)])
array([[ 9, 10, 21, 33, 34, 15],
[48, 46, 36, 7, 37, 45],
[21, 15, 5, 9, 31, 26],
[48, 24, 30, 18, 47, 23],
[22, 31, 19, 32, 3, 33],
[35, 44, 15, 46, 20, 43],
[11, 37, 44, 6, 16, 35],
[42, 49, 41, 28, 12, 19],
[19, 6, 32, 3, 1, 22],
[29, 33, 42, 5, 30, 43]])

You can create each row by itself and then stack them.
np.stack([np.random.choice(np.arange(1,50),size=6,replace=False) for i in range(100)])

Here is a constructive approach, draw first (50 choices), second (49 choices) etc. For large sets it's quite competitive (pp in table):
# n = 10
# pp 0.18564210 ms
# Divakar 0.01960790 ms
# James 0.20074140 ms
# CK 0.17823420 ms
# n = 1000
# pp 0.80046050 ms
# Divakar 1.31817130 ms
# James 18.93511460 ms
# CK 20.83670820 ms
# n = 1000000
# pp 655.32905590 ms
# Divakar 1352.44713990 ms
# James 18471.08987370 ms
# CK 18369.79808050 ms
# pp checking plausibility...
# var (exp obs) 208.333333333 208.363840259
# mean (exp obs) 25.5 25.5064865
# Divakar checking plausibility...
# var (exp obs) 208.333333333 208.21113972
# mean (exp obs) 25.5 25.499471
# James checking plausibility...
# var (exp obs) 208.333333333 208.313436938
# mean (exp obs) 25.5 25.4979035
# CK checking plausibility...
# var (exp obs) 208.333333333 208.169585249
# mean (exp obs) 25.5 25.49
Code including benchmarking. Algo is a bit complicated because mapping to free spots is hairy:
import numpy as np
import types
from timeit import timeit
def f_pp(n):
draw = np.empty((n, 6), dtype=int)
# generating random numbers is expensive, so draw a large one and
# make six out of one
draw[:, 0] = np.random.randint(0, 50*49*48*47*46*45, (n,))
draw[:, 1:] = np.arange(50, 45, -1)
draw = np.floor_divide.accumulate(draw, axis=-1)
draw[:, :-1] -= draw[:, 1:] * np.arange(50, 45, -1)
# map the shorter ranges (:49, :48, :47) to the non-occupied
# positions; this amounts to incrementing for each number on the
# left that is not larger. the nasty bit: if due to incrementing
# new numbers on the left are "overtaken" then for them we also
# need to increment.
for i in range(1, 6):
coll = np.sum(draw[:, :i] <= draw[:, i, None], axis=-1)
collidx = np.flatnonzero(coll)
if collidx.size == 0:
continue
coll = coll[collidx]
tot = coll
while True:
draw[collidx, i] += coll
coll = np.sum(draw[collidx, :i] <= draw[collidx, i, None], axis=-1)
relidx = np.flatnonzero(coll > tot)
if relidx.size == 0:
break
coll, tot = coll[relidx]-tot[relidx], coll[relidx]
collidx = collidx[relidx]
return draw + 1
def check_result(draw, name):
print(name[2:], ' checking plausibility...')
import scipy.stats
assert all(len(set(row)) == 6 for row in draw)
assert len(set(draw.ravel())) == 50
print(' var (exp obs)', scipy.stats.uniform(0.5, 50).var(), draw.var())
print(' mean (exp obs)', scipy.stats.uniform(0.5, 50).mean(), draw.mean())
def f_Divakar(n):
return np.random.rand(n, 50).argpartition(6,axis=1)[:,:6]+1
def f_James(n):
return np.stack([np.random.choice(np.arange(1,51),size=6,replace=False) for i in range(n)])
def f_CK(n):
return np.array([np.random.choice(np.arange(1, 51), size=6, replace=False) for _ in range(n)])
for n in (10, 1_000, 1_000_000):
print(f'n = {n}')
for name, func in list(globals().items()):
if not name.startswith('f_') or not isinstance(func, types.FunctionType):
continue
try:
print("{:16s}{:16.8f} ms".format(name[2:], timeit(
'f(n)', globals={'f':func, 'n':n}, number=10)*100))
except:
print("{:16s} apparently failed".format(name[2:]))
if(n >= 10000):
for name, func in list(globals().items()):
if name.startswith('f_') and isinstance(func, types.FunctionType):
check_result(func(n), name)

np.sort(np.random.choice(np.arange(1,50),size=(100,6),replace=False))
I think you should change replace to true as you are just consuming the numbers in the range

Fast way to remove array of specific row values from 2D numpy array

I have a 2D array like this:
a = np.array([[25, 83, 18, 71],
[75, 7, 0, 85],
[25, 83, 18, 71],
[25, 83, 18, 71],
[75, 48, 8, 43],
[ 7, 47, 96, 94],
[ 7, 47, 96, 94],
[56, 75, 50, 0],
[19, 49, 92, 57],
[52, 93, 58, 9]])
and I want to remove rows that has specific values, for example:
b = np.array([[56, 75, 50, 0], [52, 93, 58, 9], [25, 83, 18, 71]])
What is the most efficient way to do this in numpy or pandas? Expected output:
np.array([[75, 7, 0, 85],
[75, 48, 8, 43],
[ 7, 47, 96, 94],
[ 7, 47, 96, 94],
[19, 49, 92, 57]])
Update
The fastest approach is dimensionality reduction but it requires quite strict limitations of ranges of columns in general. There is my perfplot:
import pandas as pd
import numexpr as ne
import perfplot
from time import time
def remove_pd(data):
a,b = data
dfa, dfb = pd.DataFrame(a), pd.DataFrame(b)
return dfa.merge(dfb, how='left', indicator=True)\
.query('_merge == "left_only"').drop(columns='_merge').values
def remove_smalldata(data):
a,b = data
return a[(a[None,:,:] != b[:,None,:]).any(-1).all(0)]
'''def remove_nploop(data):
a, b = data
for arr in b:
a = a[np.all(~np.equal(a, arr), axis=1)]
return a'''
def remove_looped(data):
a, b = data
to_remain = [True]*len(a)
ind = 0
for vec_a in a:
for vec_b in b:
if np.array_equal(vec_a, vec_b):
to_remain[ind] = False
break
ind += 1
return a[to_remain]
def remove_looped_boost(data):
a, b = data
to_remain = [True]*len(a)
a_map = list(map(tuple, a.tolist()))
b_map = set(map(tuple, b.tolist()))
for i in range(len(a)):
to_remain[i] = not(a_map[i] in b_map)
return a[to_remain]
def remove_reducedim(data):
a,b = data
a, b = a.astype(np.int64), b.astype(np.int64) #make sure box is not too small
ma, MA = np.min(a, axis=0), np.max(a, axis=0)
mb, MB = np.min(b, axis=0), np.max(b, axis=0)
m, M = np.min([ma, mb], axis=0), np.max([MA, MB],axis=0)
ravel_a = np.ravel_multi_index((a-m).T, M - m + 1)
ravel_b = np.ravel_multi_index((b-m).T, M - m + 1)
return a[~np.isin(ravel_a, ravel_b)]
def remove_reducedim_boost(data):
a,b = data
a, b = a.astype(np.int64), b.astype(np.int64) #make sure box is not too small
ma, MA = np.min(a, axis=0), np.max(a, axis=0)
mb, MB = np.min(b, axis=0), np.max(b, axis=0)
m1,m2,m3,m4 = np.min([ma, mb], axis=0)
M1,M2,M3,M4 = np.max([MA, MB], axis=0)
s1,s2,s3,s4 = M1-m1+1, M2-m2+1, M3-m3+1, M4-m4+1
a1,a2,a3,a4 = a.T
b1,b2,b3,b4 = b.T
d = {'a1':a1, 'a2':a2, 'a3':a3, 'a4':a4, 'b1':b1, 'b2':b2, 'b3':b3, 'b4':b4,
's1':s1, 's2':s2, 's3':s3, 'm1':m1, 'm2':m2, 'm3':m3, 'm4':m4}
ravel_a = ne.evaluate('(a1-m1)+(a2-m2)*s1+(a3-m3)*s1*s2+(a4-m4)*s1*s2*s3',d)
ravel_b = ne.evaluate('(b1-m1)+(b2-m2)*s1+(b3-m3)*s1*s2+(b4-m4)*s1*s2*s3',d)
return a[~np.isin(ravel_a, ravel_b)]
def setup(x):
a1 = np.random.randint(50000, size=(x,4))
a2 = a1[np.random.randint(x, size=x)]
return a1, a2
def build_args(figure):
kernels = [remove_reducedim, remove_reducedim_boost, remove_pd, remove_looped, remove_looped_boost, remove_smalldata]
return {'setup': setup,
'kernels': {'A': kernels, 'B': kernels[:3]}[figure],
'n_range': {'A': [2 ** k for k in range(12)], 'B': [2 ** k for k in range(11, 25)]}[figure],
'xlabel': 'Remowing n rows from n rows',
'title' : {'A':'Testing removal of small dataset', 'B':'Testing removal of large dataset'}[figure],
'show_progress': False,
'equality_check': lambda x,y: np.array_equal(x, y)}
t = time()
outs = [perfplot.bench(**build_args(n)) for n in ('A','B')]
fig = plt.figure(figsize=(20, 20))
for i in range(len(outs)):
ax = fig.add_subplot(2, 1, i+1)
ax.grid(True, which="both")
outs[i].plot()
plt.show()
print('Overall testing time:', time()-t)
Output:
Overall testing time: 529.2596168518066

Here's a pandas approach doing a "anti join" using merge and query.
dfa = pd.DataFrame(a)
dfb = pd.DataFrame(b)
df = (
dfa.merge(dfb, how='left', indicator=True)
.query('_merge == "left_only"')
.drop(columns='_merge')
)
0 1 2 3
1 75 7 0 85
4 75 48 8 43
5 7 47 96 94
6 7 47 96 94
8 19 49 92 57
Note: a plain numpy solution should be faster, but this should do fine.
Plain numpy but with a single loop:
for arr in b:
a = a[np.all(~np.equal(a, arr), axis=1)]
array([[75, 7, 0, 85],
[75, 48, 8, 43],
[ 7, 47, 96, 94],
[ 7, 47, 96, 94],
[19, 49, 92, 57]])

Approach #1 : Views + searchsorted
One approach with array-views to view each row as a scalar each -
# https://stackoverflow.com/a/45313353/ #Divakar
def view1D(a, b): # a, b are arrays
a = np.ascontiguousarray(a)
b = np.ascontiguousarray(b)
void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
return a.view(void_dt).ravel(), b.view(void_dt).ravel()
A,B = view1D(a,b)
sidx = B.argsort()
idx = np.searchsorted(B, A, sorter=sidx)
idx[idx==len(B)] = 0
out = a[B[sidx[idx]] != A]
A variant of it would be with sorting B -
Bs = np.sort(B)
idx = np.searchsorted(Bs, A)
idx[idx==len(B)] = 0
out = a[Bs[idx] != A]
Another variant would using searchsorted on A given B -
sidx = A.argsort()
As = A[sidx]
idx = np.searchsorted(A, B, sorter=sidx)
id_ar = np.zeros(len(a), dtype=int)
id_accum = np.cumsum(np.r_[False, As[:-1] != As[1:]])
count = np.bincount(id_accum)
idx_end = idx + count[id_accum[idx]]
id_ar[idx] = 1
id_ar[idx_end] -= 1
out = a[sidx[id_ar.cumsum()==0]]
For the last step, if you need to maintain order, use :
a[np.sort(sidx[id_ar.cumsum()==0])]
Approach #2 : KDtree
Another with SciPy's cKDTree -
from scipy.spatial import cKDTree
dist,idx = cKDTree(b).query(a,k=1)
out = a[dist!=0]
Approach #3 : Masking with array-assignment
For smaller range of numbers, we can go for a boolean array assignment one :
s = np.maximum(a.max(0)-np.minimum(0,a.min(0)),b.max(0)-np.minimum(0,b.min(0)))+1
mask = np.empty(s, dtype=bool)
mask[tuple(a.T)] = 1
mask[tuple(b.T)] = 0
out = a[mask[tuple(a.T)]]
Note that this will initialize an array of shape (n,n,n,n) for that range of values n on 4 cols input datasets.
Approach #4 : With sorting
# https://stackoverflow.com/a/44999009/ #Divakar
def view1D_onevar(a): # a is array
a = np.ascontiguousarray(a)
void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
return a.view(void_dt).ravel()
def app4(a,b):
ba = np.vstack((b,a))
sidx = np.lexsort(ba.T)
basi = ba[sidx]
v = view1D_onevar(basi)
idar = np.r_[False,v[:-1] != v[1:]].cumsum()
mapar = np.ones(idar[-1]+1, dtype=bool)
mapar[idar[sidx<len(b)]] = 0
out = basi[mapar[idar]]
return out
Two more variants :
def app4_v2(a,b):
ba = np.vstack((b,a))
v0 = view1D_onevar(ba)
sidx = v0.argsort()
v = v0[sidx]
idar = np.r_[False,v[:-1] != v[1:]].cumsum()
mapar = np.ones(idar[-1]+1, dtype=bool)
mapar[idar[sidx<len(b)]] = 0
out = a[sidx[mapar[idar]]-len(b)]
return out
def app4_v3(a,b):
ba = np.vstack((b,a))
sidx = view1D_onevar(ba).argsort()
basi = ba[sidx]
v = view1D_onevar(basi)
idar = np.r_[False,v[:-1] != v[1:]].cumsum()
mapar = np.ones(idar[-1]+1, dtype=bool)
mapar[idar[sidx<len(b)]] = 0
out = basi[mapar[idar]]
return out
Note that order is not maintained.
That lexsort/argsort is the bottleneck, which could be improved if we are willing to go dimensionality-reduction.

If the data are not too big, broadcast is another option:
a[(a[None,:,:] != b[:,None,:]).any(-1).all(0)]
Output:
array([[75, 7, 0, 85],
[75, 48, 8, 43],
[ 7, 47, 96, 94],
[ 7, 47, 96, 94],
[19, 49, 92, 57]])

The only way I can think of is based on dimensionality reduction:
def remove(a, b):
a, b = a.astype(np.int64), b.astype(np.int64) #make sure box is not too small
ma, MA = np.min(a, axis=0), np.max(a, axis=0)
mb, MB = np.min(b, axis=0), np.max(b, axis=0)
m, M = np.min([ma, mb], axis=0), np.max([MA, MB],axis=0)
ravel_a = np.ravel_multi_index((a-m).T, M - m + 1)
ravel_b = np.ravel_multi_index((b-m).T, M - m + 1)
return a[~np.isin(ravel_a, ravel_b)]
Since we need to do a lot of elementary algebra here, some performance boost can be achieved with numexpr:
import numexpr as ne
def remove_boost(a,b):
a, b = a.astype(np.int64), b.astype(np.int64) #make sure box is not too small
ma, MA = np.min(a, axis=0), np.max(a, axis=0)
mb, MB = np.min(b, axis=0), np.max(b, axis=0)
m1,m2,m3,m4 = np.min([ma, mb], axis=0)
M1,M2,M3,M4 = np.max([MA, MB], axis=0)
s1,s2,s3,s4 = M1-m1+1, M2-m2+1, M3-m3+1, M4-m4+1
a1,a2,a3,a4 = a.T
b1,b2,b3,b4 = b.T
d = {'a1':a1, 'a2':a2, 'a3':a3, 'a4':a4, 'b1':b1, 'b2':b2, 'b3':b3, 'b4':b4,
's1':s1, 's2':s2, 's3':s3, 'm1':m1, 'm2':m2, 'm3':m3, 'm4':m4}
ravel_a = ne.evaluate('(a1-m1)+(a2-m2)*s1+(a3-m3)*s1*s2+(a4-m4)*s1*s2*s3',d)
ravel_b = ne.evaluate('(b1-m1)+(b2-m2)*s1+(b3-m3)*s1*s2+(b4-m4)*s1*s2*s3',d)
return a[~np.isin(ravel_a, ravel_b)]
It's quite unexpected to see that dimensionality reduction is the only working way in numpy suitable to do this multidimensional removal on larger data :)
Output of both remove(a, b) and remove_boost(a, b):
[[75 7 0 85]
[75 48 8 43]
[ 7 47 96 94]
[ 7 47 96 94]
[19 49 92 57]]
Disadvantage: it's capable to work only with the boxes that are not larger than 2^63 (s1*s2*s3*s4 = np.prod(np.ptp(np.r_[a,b], axis=0)+1) should be less than 2^63).

You can try my solution with a loop comparison between the vectors in the two matrices.
Code
import np
a = np.array([[25, 83, 18, 71],
[75, 7, 0, 85],
[25, 83, 18, 71],
[25, 83, 18, 71],
[75, 48, 8, 43],
[ 7, 47, 96, 94],
[ 7, 47, 96, 94],
[56, 75, 50, 0],
[19, 49, 92, 57],
[52, 93, 58, 9]])
b = np.array([[56, 75, 50, 0], [52, 93, 58, 9], [25, 83, 18, 71]])
to_remain = [True]*len(a)
ind = 0
for vec_a in a:
for vec_b in b:
if np.array_equal(vec_a, vec_b):
to_remain[ind] = False
break
ind += 1
output = a[to_remain]
print(output)
Output
[[75 7 0 85]
[75 48 8 43]
[ 7 47 96 94]
[ 7 47 96 94]
[19 49 92 57]]

Here is another way of implementing dimensionality reduction
scales = np.array([1,10,100,1000])
a_scaled = np.sum(a * scales,axis=1)[:,None]
b_scaled = np.sum(b * scales,axis=1)[None,:]
a_slice = a[~np.any(a_scaled == b_scaled,axis=1)]
as you can see, we multiply numbers of different columns with different scales, and basically casting this problem into comparing two 1d arrays.
If the order of the columns does not matter, you also need to add np.sort
scales = np.array([1,10,100,1000])
a_scaled = np.sum(np.sort(a,axis=1) * scales,axis=1)[:,None]
b_scaled = np.sum(np.sort(b,axis=1) * scales,axis=1)[None,:]
a_slice = a[~np.any(a_scaled == b_scaled,axis=1)]
I guess there exist cases where this approach fails, so I would be happy about feedback.

Benchmarking post
We will follow OP's strategy of first benchmarking of small to decent sized datasets as stage-1 and then dropping few among them for the larger datasets as stage-2. Also, we will keep it as 4 columns as used in OP's benchmarking codes.
Stage-1 : Upto 5000 sized datasets with b of length 5%-90% of length of a
Stage-2: Drop last two and then run for datasets of size upto 1000000 again for the same percentages for b.
Using benchit package (few benchmarking tools packaged together; disclaimer: I am its author) to benchmark proposed solutions.
Proposed working solutions :
import numpy as np
import pandas as pd
import numexpr as ne
def remove_pd(a,b):
dfa, dfb = pd.DataFrame(a), pd.DataFrame(b)
return dfa.merge(dfb, how='left', indicator=True)\
.query('_merge == "left_only"').drop(columns='_merge').values
def remove_smalldata(a,b):
return a[(a[None,:,:] != b[:,None,:]).any(-1).all(0)]
def remove_looped(a, b):
to_remain = [True]*len(a)
ind = 0
for vec_a in a:
for vec_b in b:
if np.array_equal(vec_a, vec_b):
to_remain[ind] = False
break
ind += 1
return a[to_remain]
def remove_looped_boost(a, b):
to_remain = [True]*len(a)
a_map = list(map(tuple, a.tolist()))
b_map = set(map(tuple, b.tolist()))
for i in range(len(a)):
to_remain[i] = not(a_map[i] in b_map)
return a[to_remain]
def remove_reducedim(a, b):
a, b = a.astype(np.int64), b.astype(np.int64) #make sure box is not too small
ma, MA = np.min(a, axis=0), np.max(a, axis=0)
mb, MB = np.min(b, axis=0), np.max(b, axis=0)
m, M = np.min([ma, mb], axis=0), np.max([MA, MB],axis=0)
ravel_a = np.ravel_multi_index((a-m).T, M - m + 1)
ravel_b = np.ravel_multi_index((b-m).T, M - m + 1)
return a[~np.isin(ravel_a, ravel_b)]
def remove_reducedim_boost(a,b):
a, b = a.astype(np.int64), b.astype(np.int64) #make sure box is not too small
ma, MA = np.min(a, axis=0), np.max(a, axis=0)
mb, MB = np.min(b, axis=0), np.max(b, axis=0)
m1,m2,m3,m4 = np.min([ma, mb], axis=0)
M1,M2,M3,M4 = np.max([MA, MB], axis=0)
s1,s2,s3,_ = M1-m1+1, M2-m2+1, M3-m3+1, M4-m4+1
a1,a2,a3,a4 = a.T
b1,b2,b3,b4 = b.T
d = {'a1':a1, 'a2':a2, 'a3':a3, 'a4':a4, 'b1':b1, 'b2':b2, 'b3':b3, 'b4':b4,
's1':s1, 's2':s2, 's3':s3, 'm1':m1, 'm2':m2, 'm3':m3, 'm4':m4}
ravel_a = ne.evaluate('(a1-m1)+(a2-m2)*s1+(a3-m3)*s1*s2+(a4-m4)*s1*s2*s3',d)
ravel_b = ne.evaluate('(b1-m1)+(b2-m2)*s1+(b3-m3)*s1*s2+(b4-m4)*s1*s2*s3',d)
return a[~np.isin(ravel_a, ravel_b)]
def view1D(a, b): # a, b are arrays
a = np.ascontiguousarray(a)
b = np.ascontiguousarray(b)
void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
return a.view(void_dt).ravel(), b.view(void_dt).ravel()
def div_app1(a, b):
A,B = view1D(a,b)
sidx = B.argsort()
idx = np.searchsorted(B, A, sorter=sidx)
idx[idx==len(B)] = 0
return a[B[sidx[idx]] != A]
def div_app1_v2(a, b):
A,B = view1D(a,b)
Bs = np.sort(B)
idx = np.searchsorted(Bs, A)
idx[idx==len(B)] = 0
return a[Bs[idx] != A]
from scipy.spatial import cKDTree
def div_app2(a, b):
dist,idx = cKDTree(b).query(a,k=1)
return a[dist!=0]
def div_app3(a, b):
s = np.maximum(a.max(0)-np.minimum(0,a.min(0)), b.max(0)-np.minimum(0,b.min(0)))+1
mask = np.empty(s, dtype=bool)
mask[tuple(a.T)] = 1
mask[tuple(b.T)] = 0
return a[mask[tuple(a.T)]]
def Darkonaut(a, b):
a_rows = a.view([('', a.dtype)] * a.shape[1]) # '' default fieldname
b_rows = b.view([('', b.dtype)] * b.shape[1])
return np.setdiff1d(a_rows, b_rows, assume_unique=True)
Benchmarking code :
def setup(array_len, b_len_percentage):
a = np.random.randint(0,100,(array_len,4))
b_len = int(len(a)*b_len_percentage/100.)
b = np.unique(a[np.random.choice(len(a), b_len, replace=False)], axis=0)
return a,b
import benchit
benchit.setparams(rep=1) # use rep=2 or bigger if you have time & patience
funcs = [remove_pd, remove_smalldata, remove_looped, remove_looped_boost,
remove_reducedim, remove_reducedim_boost, div_app1, div_app1_v2,
div_app2, div_app3, app4, app4_v2, app4_v3, Darkonaut]
percentages = np.r_[5,list(range(10,100,20))]
in_ = {(n,x):setup(n,x) for n in [100, 500, 1000, 2000, 5000] for x in percentages}
t = benchit.timings(funcs, in_, multivar=True, input_name=['Len_a', 'Len_b_as_Percentage'])
t.plot(logx=True, sp_ncols=2, legend_fontsize=8, rot=90, save='timings1.png',dpi=200)
funcs.remove(remove_smalldata)
funcs.remove(remove_looped)
in_ = {(n,x):setup(n,x) for n in np.linspace(10**4,10**6,5,dtype=int) for x in percentages}
t2 = benchit.timings(funcs, in_, multivar=True, input_name=['Len_a', 'Len_b_as_Percentage'])
t2.plot(logx=True, sp_ncols=2, legend_fontsize=8, rot=90, save='timings2.png',dpi=200)

Index multiple dimensions of a multi-dimensional array with another - NumPy/ Python

Lets say I have an tensor of the following form:
import numpy as np
a = np.array([ [[1,2],
[3,4]],
[[5,6],
[7,3]]
])
# a.shape : (2,2,2) is a tensor containing 2x2 matrices
indices = np.argmax(a, axis=2)
#print indices
for mat in a:
max_i = np.argmax(mat,axis=1)
# Not really working I would like to
# change 4 in the first matrix to -1
# and 3 in the last to -1
mat[max_i] = -1
print a
Now what I would like to do is to use indices as a mask on a to replace every max element with say -1. Is there a numpy way of doing this ? so far all I have figured out is using for loops.

Here's one way using linear indexing in 3D -
m,n,r = a.shape
offset = n*r*np.arange(m)[:,None] + r*np.arange(n)
np.put(a,indices + offset,-1)
Sample run -
In [92]: a
Out[92]:
array([[[28, 59, 26, 70],
[57, 28, 71, 49],
[33, 6, 10, 90]],
[[24, 16, 83, 67],
[96, 16, 72, 56],
[74, 4, 71, 81]]])
In [93]: indices = np.argmax(a, axis=2)
In [94]: m,n,r = a.shape
...: offset = n*r*np.arange(m)[:,None] + r*np.arange(n)
...: np.put(a,indices + offset,-1)
...:
In [95]: a
Out[95]:
array([[[28, 59, 26, -1],
[57, 28, -1, 49],
[33, 6, 10, -1]],
[[24, 16, -1, 67],
[-1, 16, 72, 56],
[74, 4, 71, -1]]])
Here's another way with linear indexing again, but in 2D -
m,n,r = a.shape
a.reshape(-1,r)[np.arange(m*n),indices.ravel()] = -1
Runtime tests and verify output -
In [156]: def vectorized_app1(a,indices): # 3D linear indexing
...: m,n,r = a.shape
...: offset = n*r*np.arange(m)[:,None] + r*np.arange(n)
...: np.put(a,indices + offset,-1)
...:
...: def vectorized_app2(a,indices): # 2D linear indexing
...: m,n,r = a.shape
...: a.reshape(-1,r)[np.arange(m*n),indices.ravel()] = -1
...:
In [157]: # Generate random 3D array and the corresponding indices array
...: a = np.random.randint(0,99,(100,100,100))
...: indices = np.argmax(a, axis=2)
...:
...: # Make copies for feeding into functions
...: ac1 = a.copy()
...: ac2 = a.copy()
...:
In [158]: vectorized_app1(ac1,indices)
In [159]: vectorized_app2(ac2,indices)
In [160]: np.allclose(ac1,ac2)
Out[160]: True
In [161]: # Make copies for feeding into functions
...: ac1 = a.copy()
...: ac2 = a.copy()
...:
In [162]: %timeit vectorized_app1(ac1,indices)
1000 loops, best of 3: 311 µs per loop
In [163]: %timeit vectorized_app2(ac2,indices)
10000 loops, best of 3: 145 µs per loop

You can use indices to index into the last dimension of a provided that you also specify index arrays into the first two dimensions as well:
import numpy as np
a = np.array([[[1, 2],
[3, 4]],
[[5, 6],
[7, 3]]])
indices = np.argmax(a, axis=2)
print(repr(a[range(a.shape[0]), range(a.shape[1]), indices]))
# array([[2, 3],
# [2, 7]])
a[range(a.shape[0]), range(a.shape[1]), indices] = -1
print(repr(a))
# array([[[ 1, -1],
# [ 3, 4]],
# [[ 5, 6],
# [-1, -1]]])

Python: conversion of a iterated assignment with an atomic assignment using numpy is not working when matrix height > 256

I'm working using numpy 1.6.2 and python 2.7.
Given an N x M x D matrix A and a matrix I that contains a list of indices.
I have to fill a zeros matrix ACopy with the sum of element of A according to the indeces found in I (see code).
Here is my code:
ACopy = zeros(A.shape)
for j in xrange(0, size(A, 0)):
i = I[j]
ACopy[j, i, :] = A[j, i, :] + A[j, i + 1, :]
Indices matrix:
I = array([2, 0, 3, 2, 1])
A matrix:
A = array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14]],
[[15, 16, 17],
[18, 19, 20],
[21, 22, 23],
[24, 25, 26],
[27, 28, 29]],
[[30, 31, 32],
[33, 34, 35],
[36, 37, 38],
[39, 40, 41],
[42, 43, 44]],
[[45, 46, 47],
[48, 49, 50],
[51, 52, 53],
[54, 55, 56],
[57, 58, 59]],
[[60, 61, 62],
[63, 64, 65],
[66, 67, 68],
[69, 70, 71],
[72, 73, 74]]])
I try to improve my code avoiding the for loop in this way:
r = r_[0:len(I)]
ACopy[r, I, :] = A[r, I, :] + A[r, I + 1, :]
I noticed that the output matrices ACopy are different and I can't understand why. Any idea?
Thank you all!
EDIT: I'm computing a lot of matrices and I try with np.array_equals(ACopy1,ACopy2), where ACopy1 is the output of the first method and ACopy2 the output of the second method. Sometimes the matrices are the same, but not everytime. The two methods output should be the same or are there any bordeline case?
EDIT2: I noticed that this strange behaviour happens only when matrix height is bigger than 256.
Here is my test suite:
from numpy import *
w = 5
h = 257
for i in xrange(1000):
Z = random.rand(w, h, 5)
I = (random.rand(w) * h - 1).astype(uint8)
r = r_[0:w]
ZCopy = zeros(Z.shape)
ZCopy2 = zeros(Z.shape)
for j in xrange(0, size(Z, 0)):
i = I[j]
ZCopy[j, i, :] = Z[j, i, :] + Z[j, i + 1, :]
ZCopy2[r, I, :] = Z[r, I, :] + Z[r, I + 1, :]
if (ZCopy - ZCopy2).any() > 0:
print(ZCopy, ZCopy2, I)
raise ValueError

I get the problem!
I cast the matrix I to uint8 and so matrix I elements are between 0 and 255.
I resolved using I = (random.rand(w) * h - 1).astype(uint32)

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Multiply tensors containing matrices, following matrix multiplication rule - python

Related

Is it possible to make this function on numpy array more efficient?

Python: create a matrix with differing integers in each individual row-vector [duplicate]

Fast way to remove array of specific row values from 2D numpy array

Index multiple dimensions of a multi-dimensional array with another - NumPy/ Python

Python: conversion of a iterated assignment with an atomic assignment using numpy is not working when matrix height > 256

Categories

Resources