How can I optimize the following code, or more specifically, how can I eliminate the for-loop?
array = np.zeros((x.shape[0], K), dtype=np.float128)
for k in range(K):
    array[:, k] = np.prod((np.power(ms[k, :], x) * np.power(1 - ms[k, :], 1 - x)).astype('float128'), axis=1)
where x is a two-dimensional array shaped like [70000, 784] and ms like [K, 784] and K=10.
Edit: the code above has been corrected after an error in it was brought to my attention.
With 2 arrays with smaller dimensions:
In [142]: K=5; x=np.arange(30*4).reshape(30,4); ms = np.arange(K*4).reshape(K,4)
Your iterative solution looks like:
In [143]: np.array([np.power(ms[k,:],x) for k in range(K)]).shape
Out[143]: (5, 30, 4)
In [144]: arr = np.array([np.power(ms[k,:],x) for k in range(K)])
In [145]: arr.shape
Out[145]: (5, 30, 4)
An equivalent solution without the loop:
In [146]: arr1 = np.power(ms[:,None,:],x)
In [147]: arr1.shape
Out[147]: (5, 30, 4)
In [148]: np.allclose(arr,arr1)
Out[148]: True
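The same broadcasting idea extends to the full expression in the question, including the product over the last axis. What follows is a minimal sketch with small made-up sizes; it assumes the values in ms lie strictly between 0 and 1 and uses float64 rather than float128 (which is not available on every platform). Note that the intermediate array has shape (N, K, D), so for the shapes in the question (70000, 10, 784) it would need several gigabytes; processing x in chunks of rows is one way to limit that.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, D = 30, 5, 4                                  # small illustrative sizes (assumption)
x = rng.integers(0, 2, size=(N, D)).astype(float)   # binary data, shape (N, D)
ms = rng.uniform(0.1, 0.9, size=(K, D))             # values strictly inside (0, 1)

# Loop version, as in the question.
looped = np.zeros((N, K))
for k in range(K):
    looped[:, k] = np.prod(
        np.power(ms[k, :], x) * np.power(1 - ms[k, :], 1 - x), axis=1
    )

# Broadcast version: x -> (N, 1, D), ms -> (1, K, D), product over the last axis.
terms = np.power(ms[None, :, :], x[:, None, :]) * np.power(
    1 - ms[None, :, :], 1 - x[:, None, :]
)
vectorized = terms.prod(axis=2)                     # shape (N, K)
```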
This is my code in Python. By the multiplication (sx)(B)(sy), the dimension of sx should be 100x4 and that of sy 100x1.
import numpy as np
B= [[-6.08066634428988e-10, -8.61023850910464e-11, 5.48222828615260e-12, -9.49229025004441e-14],
[-3.38148313553674e-11, 6.47759097087283e-12, 1.14900158474371e-13, -5.70078947874486e-15],
[-2.55893304237669e-13, -1.40941560399352e-13, 5.76510238931847e-15, -5.52980385181738e-17],
[3.39795122177475e-15, 7.95704191204353e-16, -5.31260642039813e-17, 7.83532802015832e-19]]
[X, Y] = np.meshgrid(np.arange(0, 3, 0.01*3),np.arange(0, 15, 0.01*(15)))
sx=[]
sy=[]
F=[]
for i in range(len(X)):
    for j in range(len(X)):
        for k in range(len(B)):
            sx[i,k].append(X[i,j]**k)
        for l in range(len(B)):
            sy[l].append((Y[i,j]**l))
        F[i,j] = sx*B*sy
The error:
sx[i,k].append(X[i,j]**k)
TypeError: list indices must be integers or slices, not tuple
MATLAB code copied from comment (guess as to formatting)
[x,y]=meshgrid(0:0.01*3:3,0:0.01*15:15);
for i=1:size(x)
    for j=1:size(x)
        for k=0:size(B) -1
            sx(1,k+1)=(x(i,j)^k);
        end
        for k=0:size(B) -1
            sy(k+1,1)=(y(i,j)^k);
        end
        G(i,j)=sx*B*sy;
    end
end
If sx or X is a 2D list then indices must be [i][j]. If you're trying to append to two indices i and j then it should be separate calls to append.
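For reference, a fairly literal translation of the MATLAB loop might look like the sketch below. It assumes B is converted to a NumPy array and G is preallocated; building sx and sy as small arrays for each grid point (rather than appending to lists) is one way to avoid the indexing error, not the only one. np.linspace is used for the grid so that both end points are included, as in the MATLAB ranges.

```python
import numpy as np

B = np.array([
    [-6.08066634428988e-10, -8.61023850910464e-11, 5.48222828615260e-12, -9.49229025004441e-14],
    [-3.38148313553674e-11, 6.47759097087283e-12, 1.14900158474371e-13, -5.70078947874486e-15],
    [-2.55893304237669e-13, -1.40941560399352e-13, 5.76510238931847e-15, -5.52980385181738e-17],
    [3.39795122177475e-15, 7.95704191204353e-16, -5.31260642039813e-17, 7.83532802015832e-19],
])

# Default 'xy' indexing matches MATLAB's meshgrid layout.
X, Y = np.meshgrid(np.linspace(0, 3, 101), np.linspace(0, 15, 101))

n = B.shape[0]
G = np.zeros(X.shape)
for i in range(X.shape[0]):
    for j in range(X.shape[1]):
        sx = np.array([X[i, j] ** k for k in range(n)])  # powers of x, shape (4,)
        sy = np.array([Y[i, j] ** k for k in range(n)])  # powers of y, shape (4,)
        G[i, j] = sx @ B @ sy                            # (4,) @ (4,4) @ (4,) -> scalar
```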
In an Octave session:
B =
-6.0807e-10 -8.6102e-11 5.4822e-12 -9.4923e-14
-3.3815e-11 6.4776e-12 1.1490e-13 -5.7008e-15
-2.5589e-13 -1.4094e-13 5.7651e-15 -5.5298e-17
3.3980e-15 7.9570e-16 -5.3126e-17 7.8353e-19
>>
>> [x,y]=meshgrid(0:0.01*3:3,0:0.01*15:15);
>> for i=1:size(x)
for j=1:size(x)
for k=0:size(B) -1
sx(1,k+1)=(x(i,j)^k);
end
for k=0:size(B) -1
sy(k+1,1)=(y(i,j)^k);
end
G(i,j)=sx*B*sy;
end
end
produces
x, y, G (101 x 101)
>> sx (1,4)
sx =
1 3 9 27
>> sy (4,1)
sy =
1
15
225
3375
So the G element is (1,4) * (4,4) * (4,1) => (1,1)
Looks like I should be able to do the same in numpy without the loops:
In [100]: B= [[-6.08066634428988e-10, -8.61023850910464e-11, 5.48222828615260e-12, -9.49229025004441e-14],
...: [-3.38148313553674e-11, 6.47759097087283e-12, 1.14900158474371e-13, -5.70078947874486e-15],
...: [-2.55893304237669e-13, -1.40941560399352e-13, 5.76510238931847e-15, -5.52980385181738e-17],
...: [3.39795122177475e-15, 7.95704191204353e-16, -5.31260642039813e-17, 7.83532802015832e-19]]
...:
In [101]: B = np.array(B)
In [106]: [X, Y] = np.meshgrid(np.linspace(0, 3, 101),np.linspace(0, 15, 101),indexing='ij')
In [107]: X.shape
Out[107]: (101, 101)
In [108]: k = np.arange(0,4)
In [109]: k
Out[109]: array([0, 1, 2, 3])
In [110]: SX = X[:,:,None]**k # (101,101,4)
In [111]: SY = Y[:,:,None]**k
In [114]: G = np.einsum('ijk,kl,ijl->ij',SX,B,SY)
In [115]: G.shape
Out[115]: (101, 101)
Allowing for the "F" order of MATLAB (i.e. a transpose), it looks like these results match:
>> G(1,1)
ans = -0.00000000060807
In [118]: G[0,0]
Out[118]: -6.08066634428988e-10
>> G(50,23)
ans = -0.00000000097117
In [119]: G[22,49]
Out[119]: -9.71172989297259e-10
With broadcasting I don't need to make the meshgrid arrays:
In [121]: x, y = np.linspace(0,3,101), np.linspace(0,15,101)
In [124]: sx = x[:,None]**k
In [125]: sy = y[:,None]**k
In [126]: sx.shape
Out[126]: (101, 4)
In [129]: g = sx@B@sy.T
In [130]: g.shape
Out[130]: (101, 101)
In [131]: np.allclose(G,g)
Out[131]: True
Here I'm doing a matrix product of
(101,4) (4,4) (4,101) => (101,101)
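A small self-contained check of the two vectorized forms against an explicit double loop. The random B and the deliberately unequal grid sizes (7 and 5) are made up for illustration, so that it is visible which axis comes from x and which from y.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.normal(size=(4, 4))        # stand-in coefficient matrix (assumption)
x = np.linspace(0, 3, 7)
y = np.linspace(0, 15, 5)
k = np.arange(4)

sx = x[:, None] ** k               # (7, 4): row i holds x[i]**0 .. x[i]**3
sy = y[:, None] ** k               # (5, 4): row j holds y[j]**0 .. y[j]**3

g_matmul = sx @ B @ sy.T           # (7,4) (4,4) (4,5) -> (7,5)
g_einsum = np.einsum('ik,kl,jl->ij', sx, B, sy)

# Reference: one scalar product per grid point.
g_loop = np.empty((x.size, y.size))
for i in range(x.size):
    for j in range(y.size):
        g_loop[i, j] = sx[i] @ B @ sy[j]
```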
I don't understand how tensordot works. I was reading the official documentation, but I don't understand at all what is happening there.
a = np.arange(60.).reshape(3,4,5)
b = np.arange(24.).reshape(4,3,2)
c = np.tensordot(a,b, axes=([1,0],[0,1]))
c.shape
(5, 2)
Why is the shape (5, 2)? What exactly is happening?
I also read this article but the answer is confusing me.
In [7]: A = np.random.randint(2, size=(2, 6, 5))
...: B = np.random.randint(2, size=(3, 2, 4))
...:
In [9]: np.tensordot(A, B, axes=((0),(1))).shape
Out[9]: (6, 5, 3, 4)
A : (2, 6, 5) -> reduction of axis=0
B : (3, 2, 4) -> reduction of axis=1
Output : `(2, 6, 5)`, `(3, 2, 4)` ===(2 gone)==> `(6,5)` + `(3,4)` => `(6,5,3,4)`
Why is the shape (6, 5, 3, 4)?
In [196]: a = np.arange(60.).reshape(3,4,5)
...: b = np.arange(24.).reshape(4,3,2)
...: c = np.tensordot(a,b, axes=([1,0],[0,1]))
In [197]: c
Out[197]:
array([[4400., 4730.],
[4532., 4874.],
[4664., 5018.],
[4796., 5162.],
[4928., 5306.]])
I find the einsum equivalent to be easier to "read":
In [198]: np.einsum('ijk,jil->kl',a,b)
Out[198]:
array([[4400., 4730.],
[4532., 4874.],
[4664., 5018.],
[4796., 5162.],
[4928., 5306.]])
tensordot works by transposing and reshaping the inputs to reduce the problem to a simple dot:
In [204]: a1 = a.transpose(2,1,0).reshape(5,12)
In [205]: b1 = b.reshape(12,2)
In [206]: np.dot(a1,b1) # or a1@b1
Out[206]:
array([[4400., 4730.],
[4532., 4874.],
[4664., 5018.],
[4796., 5162.],
[4928., 5306.]])
tensordot can do further manipulation to the result, but that's not needed here.
I had to try several things before I got a1/b1 right. For example a.transpose(2,0,1).reshape(5,12) produces the right shape, but different values.
Yet another version, using broadcasting and an explicit sum:
In [210]: (a.transpose(1,0,2)[:,:,:,None]*b[:,:,None,:]).sum((0,1))
Out[210]:
array([[4400., 4730.],
[4532., 4874.],
[4664., 5018.],
[4796., 5162.],
[4928., 5306.]])
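As for why the shapes come out as (5, 2) and (6, 5, 3, 4): the paired axes are summed away, and what remains is the leftover axes of the first array followed by the leftover axes of the second. The helper below is a hypothetical illustration of that rule (it is not part of numpy, and it assumes non-negative axis numbers), checked against np.tensordot for both examples in the question.

```python
import numpy as np

def tensordot_shape(a_shape, b_shape, a_axes, b_axes):
    """Hypothetical helper: predict the shape of np.tensordot's result."""
    # Each paired axis must have the same length in both arrays.
    for i, j in zip(a_axes, b_axes):
        if a_shape[i] != b_shape[j]:
            raise ValueError("paired axes must have equal length")
    # The summed axes disappear; the remaining axes keep their order.
    kept_a = [n for i, n in enumerate(a_shape) if i not in a_axes]
    kept_b = [n for j, n in enumerate(b_shape) if j not in b_axes]
    return tuple(kept_a + kept_b)

a = np.arange(60.).reshape(3, 4, 5)
b = np.arange(24.).reshape(4, 3, 2)
predicted = tensordot_shape(a.shape, b.shape, [1, 0], [0, 1])
actual = np.tensordot(a, b, axes=([1, 0], [0, 1])).shape

A = np.zeros((2, 6, 5))
B = np.zeros((3, 2, 4))
predicted2 = tensordot_shape(A.shape, B.shape, [0], [1])
actual2 = np.tensordot(A, B, axes=([0], [1])).shape
```

In the first example, axis 1 of a (length 4) pairs with axis 0 of b, and axis 0 of a (length 3) pairs with axis 1 of b, leaving 5 from a and 2 from b.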
Numpy's arange accepts only single scalar values for start/stop/step. Is there a multi version of this function that can accept array inputs for start/stop/step? E.g. given an input 2D array like:
[[1 5 1], # start/stop/step first
[3 8 2]] # start/stop/step second
it should create an array that concatenates the aranges for every row of the input (each row being a start/stop/step triple). The input above should create the 1D array
1 2 3 4 3 5 7
i.e. we need to design a function that does the following:
print(np.multi_arange(np.array([[1,5,1],[3,8,2]])))
# prints:
# array([1, 2, 3, 4, 3, 5, 7])
The function should also be efficient (pure numpy), i.e. it should process an input array of shape (10000, 3) very quickly, without pure-Python looping.
Of course it is possible to write a pure-Python loop (or list comprehension) that creates an arange for each row and concatenates the results. But I have a great many rows of start/stop/step triples and need efficient, fast code, hence I am looking for a pure-numpy function.
Why do I need it? I need this for several tasks. One of them is indexing: suppose I have a 1D array a and I need to extract many (possibly intersecting) subranges of it. If I had that multi version of arange I would just do:
values = a[np.multi_arange(starts_stops_steps)]
Maybe it is possible to build a multi-arange function from some combination of numpy functions? Can you suggest one?
Also, maybe there are more efficient solutions for the specific case of extracting subranges of a 1D array (see the last line of code above) that avoid creating all the indexes with multi_arange?
Here's a vectorized one with cumsum that accounts for positive and negative stepsizes -
def multi_arange(a):
    steps = a[:,2]
    lens = ((a[:,1]-a[:,0]) + steps-np.sign(steps))//steps
    b = np.repeat(steps, lens)
    ends = (lens-1)*steps + a[:,0]
    b[0] = a[0,0]
    b[lens[:-1].cumsum()] = a[1:,0] - ends[:-1]
    return b.cumsum()
If you need to keep only valid ranges, i.e. (start < stop when step > 0) and (start > stop when step < 0), use a pre-processing step:
a = a[((a[:,1] > a[:,0]) & (a[:,2]>0) | (a[:,1] < a[:,0]) & (a[:,2]<0))]
Sample run -
In [17]: a
Out[17]:
array([[ 1, 5, 1],
[ 3, 8, 2],
[18, 6, -2]])
In [18]: multi_arange(a)
Out[18]: array([ 1, 2, 3, 4, 3, 5, 7, 18, 16, 14, 12, 10, 8])
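A quick way to convince yourself that the cumsum trick is right is to compare it with the obvious loop. The sketch below repeats the function so that it runs on its own; multi_arange_loop is a hypothetical reference implementation added only for the comparison. It assumes every row produces at least one element, since an empty range would make two rows share the same start position in b.

```python
import numpy as np

def multi_arange(a):
    steps = a[:, 2]
    # Number of elements each (start, stop, step) row produces.
    lens = ((a[:, 1] - a[:, 0]) + steps - np.sign(steps)) // steps
    b = np.repeat(steps, lens)
    ends = (lens - 1) * steps + a[:, 0]          # last value of each range
    b[0] = a[0, 0]
    # At each range boundary, jump from the previous end to the next start.
    b[lens[:-1].cumsum()] = a[1:, 0] - ends[:-1]
    return b.cumsum()

def multi_arange_loop(a):
    # Reference version: one np.arange per row (slow for many rows).
    return np.concatenate([np.arange(start, stop, step) for start, stop, step in a])

a = np.array([[1, 5, 1],
              [3, 8, 2],
              [18, 6, -2]])
fast = multi_arange(a)
slow = multi_arange_loop(a)

# Indexing use case from the question: extract several subranges at once.
data = np.arange(100, 130)
values = data[multi_arange(np.array([[1, 5, 1], [3, 8, 2]]))]
```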
In [1]: np.r_[1:5:1, 3:8:2]
Out[1]: array([1, 2, 3, 4, 3, 5, 7])
In [2]: np.hstack((np.arange(1,5,1),np.arange(3,8,2)))
Out[2]: array([1, 2, 3, 4, 3, 5, 7])
The r_ version is nice and compact, but not faster:
In [3]: timeit np.r_[1:5:1, 3:8:2]
23.9 µs ± 34.6 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [4]: timeit np.hstack((np.arange(1,5,1),np.arange(3,8,2)))
11.2 µs ± 19.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
I've just come up with my own solution using numba. I would still prefer a numpy-only solution, if we find a good one, so as not to carry the heavy numba JIT compiler.
I've also tested @Divakar's solution in my code.
The output of the code below is:
naive_multi_arange 0.76601 sec
arty_multi_arange 0.01801 sec 42.52 speedup
divakar_multi_arange 0.05504 sec 13.92 speedup
Meaning my numba solution has a 42x speedup, and @Divakar's numpy solution has a 14x speedup.
The code below can also be run online here.
import time, random
import numpy as np, numba

@numba.jit(nopython = True)
def arty_multi_arange(a):
    starts, stops, steps = a[:, 0], a[:, 1], a[:, 2]
    pos = 0
    cnt = np.sum((stops - starts + steps - np.sign(steps)) // steps, dtype = np.int64)
    res = np.zeros((cnt,), dtype = np.int64)
    for i in range(starts.size):
        v, stop, step = starts[i], stops[i], steps[i]
        if step > 0:
            while v < stop:
                res[pos] = v
                pos += 1
                v += step
        elif step < 0:
            while v > stop:
                res[pos] = v
                pos += 1
                v += step
    assert pos == cnt
    return res

def divakar_multi_arange(a):
    steps = a[:,2]
    lens = ((a[:,1]-a[:,0]) + steps-np.sign(steps))//steps
    b = np.repeat(steps, lens)
    ends = (lens-1)*steps + a[:,0]
    b[0] = a[0,0]
    b[lens[:-1].cumsum()] = a[1:,0] - ends[:-1]
    return b.cumsum()

random.seed(0)
neg_prob = 0.5
N = 100000
minv, maxv, maxstep = -100, 300, 15
steps = [random.randrange(1, maxstep + 1) * ((1, -1)[random.random() < neg_prob]) for i in range(N)]
starts = [random.randrange(minv + 1, maxv) for i in range(N)]
stops = [random.randrange(*(((starts[i] + 1, maxv + 1), (minv, starts[i]))[steps[i] < 0])) for i in range(N)]
joined = np.array([starts, stops, steps], dtype = np.int64).T

tb = time.time()
aref = np.concatenate([np.arange(joined[i, 0], joined[i, 1], joined[i, 2], dtype = np.int64) for i in range(N)])
npt = time.time() - tb
print('naive_multi_arange', round(npt, 5), 'sec')

for func in ['arty_multi_arange', 'divakar_multi_arange']:
    globals()[func](joined)
    tb = time.time()
    a = globals()[func](joined)
    myt = time.time() - tb
    print(func, round(myt, 5), 'sec', round(npt / myt, 2), 'speedup')
    assert a.size == aref.size, (a.size, aref.size)
    assert np.all(a == aref), np.vstack((np.flatnonzero(a != aref)[:5], a[a != aref][:5], aref[a != aref][:5])).T
Consider the following variable length 2D array
[
[1, 2, 3],
[4, 5],
[6, 7, 8, 9]
]
How can I find the mean of the values along each column?
I want something like [(1+4+6)/3,(2+5+7)/3, (3+8)/2, 9/1]
So the end result would be [3.667, 4.667, 5.5, 9]
Is this possible using numpy?
I tried np.mean(x, axis=0), but numpy expects the arrays to have the same dimensions.
Right now, I am popping the elements of each column and finding the mean. Is there a better way to achieve the result?
You could use pandas:
import pandas as pd
a = [[1, 2, 3],
[4, 5],
[6, 7, 8, 9]]
df = pd.DataFrame(a)
# 0 1 2 3
# 0 1 2 3 NaN
# 1 4 5 NaN NaN
# 2 6 7 8 9
df.mean()
# 0 3.666667
# 1 4.666667
# 2 5.500000
# 3 9.000000
# dtype: float64
Here is another solution that only uses numpy:
import numpy as np
nrows = len(a)
ncols = max(len(row) for row in a)
# Pad every row with NaN up to the longest row, then ignore the NaNs in the mean.
arr = np.zeros((nrows, ncols))
arr.fill(np.nan)
for jrow, row in enumerate(a):
    for jcol, col in enumerate(row):
        arr[jrow, jcol] = col
print(np.nanmean(arr, axis=0))
# [3.66666667 4.66666667 5.5        9.        ]
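Very simple alternative approach using itertools.izip_longest() (named zip_longest() in Python 3, which is used here). The padding value None is dropped explicitly, so that genuine zeros in the data are still counted:
>>> from itertools import zip_longest
>>> mean_list = []
>>> for sub_list in zip_longest(*my_list):
...     filtered_list = [v for v in sub_list if v is not None]
...     mean_list.append(sum(filtered_list)/(len(filtered_list)*1.0))
...
>>> mean_list
[3.6666666666666665, 4.666666666666667, 5.5, 9.0]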
where my_list equals to:
[
[1, 2, 3],
[4, 5],
[6, 7, 8, 9]
]
Listed in this post is an almost vectorized approach using NumPy. We assign each element of every sub-list an ID based on its position within that sub-list (i.e. its column index). These IDs can then be fed to np.bincount, which performs ID-based summations. Finally, we divide each summation by the number of elements that share that ID to get the final average values.
Thus, we would have an implementation like so -
def variable_mean(a):
    vals = np.concatenate(a)
    lens = np.array(list(map(len,a)))  # list() is needed on Python 3, where map is lazy
    id_arr = np.ones(vals.size,dtype=int)
    id_arr[0] = 0
    id_arr[lens.cumsum()[:-1]] = -lens[:-1] + 1
    IDs = id_arr.cumsum()
    return np.bincount(IDs,vals)/np.bincount(IDs)
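To see what the IDs look like, here are the intermediate arrays for the example from the question. This is a step-by-step sketch of the same idea written for Python 3; it assumes every sub-list is non-empty.

```python
import numpy as np

a = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

vals = np.concatenate(a)                      # all values, flattened row by row
lens = np.array([len(row) for row in a])      # row lengths: [3 2 4]

# Start from steps of +1, then reset to 0 at the first element of each row.
id_arr = np.ones(vals.size, dtype=int)
id_arr[0] = 0
id_arr[lens.cumsum()[:-1]] = -lens[:-1] + 1
ids = id_arr.cumsum()                         # column index of every value

sums = np.bincount(ids, vals)                 # per-column sums
counts = np.bincount(ids)                     # per-column element counts
means = sums / counts
```

Here ids comes out as [0 1 2 0 1 0 1 2 3], so the bincount calls group the values by column.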
Runtime test -
In [298]: # Setup input
...: N = 1000 # number of elems in input list
...: minL = 3 # min len of an element (list) in input list
...: maxL = 10 # max len of an element (list) in input list
...: a = [list(np.random.randint(0,9,(i))) \
...: for i in np.random.randint(minL,maxL,(N))]
...:
In [299]: %timeit pd.DataFrame(a).mean() # @Julien Spronck's pandas soln
100 loops, best of 3: 3.33 ms per loop
In [300]: %timeit variable_mean(a)
100 loops, best of 3: 2.36 ms per loop
In [301]: # Setup input
...: N = 1000 # number of elems in input list
...: minL = 3 # min len of an element (list) in input list
...: maxL = 100 # max len of an element (list) in input list
...: a = [list(np.random.randint(0,9,(i))) \
...: for i in np.random.randint(minL,maxL,(N))]
...:
In [302]: %timeit pd.DataFrame(a).mean() # @Julien Spronck's pandas soln
10 loops, best of 3: 27.1 ms per loop
In [303]: %timeit variable_mean(a)
100 loops, best of 3: 9.58 ms per loop
If you want to do it manually, this is what I would do.
Figure out the max array length:
max_length = 0
for array in arrays:
    if len(array) > max_length:
        max_length = len(array)
Pad all arrays to that length with None:
for array in arrays:
    while len(array) < max_length:
        array.append(None)
zip will group the columns:
columns = list(zip(*arrays))
columns == [(1, 4, 6), (2, 5, 7), (3, None, 8), (None, None, 9)]
Calculate the average as you would for any list, skipping the padding:
for col in columns:
    count = 0
    total = 0.0
    for num in col:
        if num is not None:
            count += 1
            total += float(num)
    print("%s: Avg %s" % (col, total/count))
Or as a list comprehension after padding the arrays:
[sum(v for v in col if v is not None)/float(sum(v is not None for v in col)) for col in zip(*arrays)]
Output:
(1, 4, 6): Avg 3.6666666666666665
(2, 5, 7): Avg 4.666666666666667
(3, None, 8): Avg 5.5
(None, None, 9): Avg 9.0
In Py3, zip_longest takes a fillvalue parameter:
In [1208]: ll=[
...: [1, 2, 3],
...: [4, 5],
...: [6, 7, 8, 9]
...: ]
In [1209]: list(itertools.zip_longest(*ll, fillvalue=np.nan))
Out[1209]: [(1, 4, 6), (2, 5, 7), (3, nan, 8), (nan, nan, 9)]
By filling with nan, I can use np.nanmean to take the mean ignoring the nan. nanmean turns its input (here _ from the previous line) into an array:
In [1210]: np.nanmean(_, axis=1)
Out[1210]: array([ 3.66666667, 4.66666667, 5.5 , 9. ])
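Wrapped up as a function, the zip_longest plus nanmean idea might look like the sketch below. The name ragged_column_means is made up for illustration, and the function assumes the rows contain only numbers.

```python
import itertools
import numpy as np

def ragged_column_means(rows):
    """Hypothetical helper: column means of a list of unequal-length rows."""
    # zip_longest transposes the rows and pads short ones with NaN.
    columns = list(itertools.zip_longest(*rows, fillvalue=np.nan))
    # Each tuple in `columns` is one column, so average along axis 1.
    return np.nanmean(np.array(columns, dtype=float), axis=1)

means = ragged_column_means([[1, 2, 3], [4, 5], [6, 7, 8, 9]])

# Zeros are real data and are counted, unlike with filter(None, ...).
means_with_zero = ragged_column_means([[0, 2], [4]])
```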