I'm having trouble translating this operation from MATLAB to Python:
xup(1:ncomp,1)=aa(1+k:ncomp+k).';
"aa" is a vector of 1x1000 elements.
"ncomp" = 128
"k" is a variable for a loop cycle.
The problem is that I don't understand how it works.
I'm posting the whole section of the algorithm:
while(testnorm>0.0001 && epoca<maxit)
    k=0;
    xup=[];
    while(k<=npatt)
        xup(1:ncomp,1)=aa(1+k:ncomp+k).';
        if (funz==1)
            gy=tanh(alpha.*xup.'*w);
        else
            gy=sign(xup.'*w).*log(1+abs(alpha*xup.'*w));
        end
        w=w+lr.*(xup*gy-w*triu((xup.'*w).'*gy));
        w = w / norm(w);
        k=k+1;
    end
    [...]
end
Can you help?
Essentially, the line
xup(1:ncomp,1)=aa(1+k:ncomp+k).';
fetches ncomp elements from the vector aa, from index 1+k through index ncomp+k, and transposes them from a single row into a column of ncomp rows. The transposed data is then assigned to xup as ncomp rows with one element each. Similar Python code would be:
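import numpy as np
aa = np.arange(1000)
ncomp = 128
k = 0
xup = [[0] for i in range(ncomp)]
while k < 10:
    # MATLAB's aa(1+k:ncomp+k) is 1-based and inclusive; aa[k:ncomp+k] is the 0-based equivalent.
    # Transposing a 1-D NumPy array does nothing, so the slice is used directly.
    data = aa[k:ncomp+k]
    for i in range(ncomp):
        xup[i][0] = data[i]  # keep each row as a one-element list, i.e. a column
    k += 1
print(xup[:128])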
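The element-by-element copy above can also be written as a single slice plus a reshape. This is only a sketch: the helper name `column_window` and the stand-in data in `aa` are assumptions made for illustration, not part of the original algorithm.

```python
import numpy as np

aa = np.arange(1000.0)   # stand-in data; the real aa comes from the MATLAB script
ncomp = 128

def column_window(aa, k, ncomp):
    """Return aa[k : k+ncomp] as an (ncomp, 1) column (hypothetical helper)."""
    # MATLAB indices 1+k .. ncomp+k (inclusive) map to the 0-based slice k .. k+ncomp
    return aa[k:k + ncomp].reshape(-1, 1)

xup = column_window(aa, k=3, ncomp=ncomp)
print(xup.shape)
```

With xup as a real (ncomp, 1) array, an expression like xup.T @ w would correspond to MATLAB's xup.'*w, assuming w is a NumPy array with ncomp rows.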
I am trying to swap two rows in a 2D NumPy array. Unfortunately, only one element is getting swapped. Here is the code:
n = len(A)
perMatrix = np.zeros((n,n))
np.fill_diagonal(perMatrix, 1)
perMatrix = A
# swapping the row
print(perMatrix)
temp = perMatrix[switchIndex1]
print(temp)
# perMatrix[switchIndex1][0] = 14
perMatrix[switchIndex1], perMatrix[switchIndex2] = perMatrix[switchIndex2], perMatrix[switchIndex1]
print(perMatrix)
Here's what the code is outputting:
You could just add (on the line after perMatrix is created):
sigma = [switchIndex1, switchIndex2]
tau = [switchIndex2, switchIndex1]
perMatrix[sigma,:] = perMatrix[tau,:]
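A small sketch of why the tuple swap misbehaves while the index-list assignment works. Indexing a single row of a NumPy array returns a view, so the first assignment of the tuple swap overwrites one row before it has been copied. The array `A` and the indices `i`, `j` below are made up for the demonstration.

```python
import numpy as np

A = np.arange(9).reshape(3, 3)
i, j = 0, 2  # hypothetical row indices to swap

# Tuple swap: B[i] and B[j] are views, so row i is overwritten
# before its old contents reach row j.
B = A.copy()
B[i], B[j] = B[j], B[i]

# Index-list (fancy) indexing copies the selected rows first, then assigns.
C = A.copy()
C[[i, j]] = C[[j, i]]

print(B)
print(C)
```

In this example B ends up with row j duplicated into row i, while C has the two rows exchanged.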
I'm trying to optimize the function 'pw' in the following code using only NumPy functions (or perhaps list comprehensions).
from time import time
import numpy as np
def pw(x, udata):
    """
    Creates the step function
                  | 1, if d0 <= x < d1
                  | 2, if d1 <= x < d2
    pw(x,data) =  ...
                  | N, if d(N-1) <= x < dN
                  | 0, otherwise
    where di is the ith element in data.
    INPUT:  x    -- interval which the step function is defined over
            data -- an ordered set of data (without repetitions)
    OUTPUT: pw_func -- an array of size x.shape[0]
    """
    vals = np.arange(1,udata.shape[0]+1).reshape(udata.shape[0],1)
    pw_func = np.sum(np.where(np.greater_equal(x,udata)*np.less(x,np.roll(udata,-1)),vals,0),axis=0)
    return pw_func
N = 50000
x = np.linspace(0,10,N)
data = [1,3,4,5,5,7]
udata = np.unique(data)
ti = time()
pw(x,udata)
tf = time()
print(tf - ti)
import cProfile
cProfile.run('pw(x,udata)')
The cProfile.run is telling me that most of the overhead is coming from np.where (about 1 ms) but I'd like to create faster code if possible. It seems that performing the operations row-wise versus column-wise makes some difference, unless I'm mistaken, but I think I've accounted for it. I know that sometimes list comprehensions can be faster but I couldn't figure out a faster way than what I'm doing using it.
Searchsorted seems to yield better performance but that 1 ms still remains on my computer:
(modified)
def pw(xx, uu):
    """
    Creates the step function
                  | 1, if d0 <= x < d1
                  | 2, if d1 <= x < d2
    pw(x,data) =  ...
                  | N, if d(N-1) <= x < dN
                  | 0, otherwise
    where di is the ith element in data.
    INPUT:  x    -- interval which the step function is defined over
            data -- an ordered set of data (without repetitions)
    OUTPUT: pw_func -- an array of size x.shape[0]
    """
    inds = np.searchsorted(uu, xx, side='right')
    vals = np.arange(1,uu.shape[0]+1)
    pw_func = vals[inds[inds != uu.shape[0]]]
    num_mins = np.sum(xx < np.min(uu))
    num_maxs = np.sum(xx > np.max(uu))
    pw_func = np.concatenate((np.zeros(num_mins), pw_func, np.zeros(xx.shape[0]-pw_func.shape[0]-num_mins)))
    return pw_func
This answer using piecewise seems pretty close, but that's on a scalar x0 and x1. How would I do it on arrays? And would it be more efficient?
Understandably, x may be pretty big but I'm trying to put it through a stress test.
I am still learning though so some hints or tricks that can help me out would be great.
EDIT
There seems to be a mistake in the second function, since its resulting array doesn't match the one from the first function (which I'm confident works):
N1 = pw1(x,udata.reshape(udata.shape[0],1)).shape[0]
N2 = np.sum(pw1(x,udata.reshape(udata.shape[0],1)) == pw2(x,udata))
print(N1 - N2)
yields
15000
data points that are not the same. So it seems that I don't know how to use 'searchsorted'.
EDIT 2
Actually I fixed it:
pw_func = vals[inds[inds != uu.shape[0]]]
was changed to
pw_func = vals[inds[inds[(inds != uu.shape[0])*(inds != 0)]-1]]
so at least the resulting arrays match. But the question still remains on whether there's a more efficient way of going about doing this.
EDIT 3
Thanks Tin Lai for pointing out the mistake. This one should work
pw_func = vals[inds[(inds != uu.shape[0])*(inds != 0)]-1]
Maybe a more readable way of presenting it would be
non_endpts = (inds != uu.shape[0])*(inds != 0) # only consider the points in between the min/max data values
shift_inds = inds[non_endpts]-1 # searchsorted side='right' includes the left end point and not right end point so a shift is needed
pw_func = vals[shift_inds]
I think I got lost in all those brackets! I guess that's the importance of readability.
A very abstract yet interesting problem! Thanks for entertaining me, I had fun :)
P.S. I'm not sure about your pw2; I wasn't able to get it to output the same as pw1.
For reference the original pws:
def pw1(x, udata):
    vals = np.arange(1,udata.shape[0]+1).reshape(udata.shape[0],1)
    pw_func = np.sum(np.where(np.greater_equal(x,udata)*np.less(x,np.roll(udata,-1)),vals,0),axis=0)
    return pw_func
def pw2(xx, uu):
    inds = np.searchsorted(uu, xx, side='right')
    vals = np.arange(1,uu.shape[0]+1)
    pw_func = vals[inds[inds[(inds != uu.shape[0])*(inds != 0)]-1]]
    num_mins = np.sum(xx < np.min(uu))
    num_maxs = np.sum(xx > np.max(uu))
    pw_func = np.concatenate((np.zeros(num_mins), pw_func, np.zeros(xx.shape[0]-pw_func.shape[0]-num_mins)))
    return pw_func
My first attempt relied heavily on NumPy broadcasting:
def pw3(x, udata):
    # indexing with None creates a new axis
    step_bool = x >= udata[None,:].T
    # we exploit the fact that bools sum as integers (True counts as 1),
    # skipping the last value in "data"
    step_vals = np.sum(step_bool[:-1], axis=0)
    # for the step_bool row that we skipped in the previous step (last index)
    # we set step_vals to zero wherever x has reached
    # the last value in "data"
    step_vals[step_bool[-1]] = 0
    return step_vals
After looking at the searchsorted call in your pw2, I came up with a new approach that uses it and performs much better:
def pw4(x, udata):
    inds = np.searchsorted(udata, x, side='right')
    # fix up the tail if x already runs past data[-1]
    if x[-1] > udata[-1]:
        inds[inds == inds[-1]] = 0
    return inds
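pw4 relies on x being sorted, because it inspects x[-1] to decide whether the tail needs zeroing. As a sketch only, here is a variant that drops that assumption by zeroing every index equal to len(udata). The name `pw_unsorted` and the sample `x` are made up for illustration.

```python
import numpy as np

def pw_unsorted(x, udata):
    """Step function via searchsorted; hypothetical variant of pw4 for unsorted x."""
    inds = np.searchsorted(udata, x, side='right')
    # side='right' returns len(udata) for every x >= udata[-1]; map those to 0
    inds[inds == udata.shape[0]] = 0
    return inds

udata = np.unique([1, 3, 4, 5, 5, 7])          # array([1, 3, 4, 5, 7])
x = np.array([8.0, 0.5, 1.0, 2.9, 3.0, 6.9, 7.0, 4.5])
result = pw_unsorted(x, udata)
print(result)  # [0 0 1 1 2 4 0 3]
```

Values below udata[0] already get index 0 from searchsorted, so only the upper end needs the fix-up.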
Plots with:
plt.plot(pw1(x,udata.reshape(udata.shape[0],1)), label='pw1')
plt.plot(pw2(x,udata), label='pw2')
plt.plot(pw3(x,udata), label='pw3')
plt.plot(pw4(x,udata), label='pw4')
with data = [1,3,4,5,5,7]:
with data = [1,3,4,5,5,7,11]
pw1,pw3,pw4 are all identical
print(np.all(pw1(x,udata.reshape(udata.shape[0],1)) == pw3(x,udata)))
>>> True
print(np.all(pw1(x,udata.reshape(udata.shape[0],1)) == pw4(x,udata)))
>>> True
Performance (Timer.repeat ran 3 repetitions by default here; each value is the total time in seconds for number=1000 calls):
print(timeit.Timer('pw1(x,udata.reshape(udata.shape[0],1))', "from __main__ import pw1, x, udata").repeat(number=1000))
>>> [3.1938983199979702, 1.6096494779994828, 1.962694135003403]
print(timeit.Timer('pw2(x,udata)', "from __main__ import pw2, x, udata").repeat(number=1000))
>>> [0.6884554479984217, 0.6075002400029916, 0.7799002879983163]
print(timeit.Timer('pw3(x,udata)', "from __main__ import pw3, x, udata").repeat(number=1000))
>>> [0.7369808239964186, 0.7557657590004965, 0.8088172269999632]
print(timeit.Timer('pw4(x,udata)', "from __main__ import pw4, x, udata").repeat(number=1000))
>>> [0.20514375300263055, 0.20203858999957447, 0.19906871100101853]
I am trying to create a function (or series of functions) that performs the following operations:
Given an input array (A), for each cell A[i,j], extract a window (W) of custom size, where the value 'min' will be:
min = np.min(W)
The output matrix (H) will store the values as:
H[i,j] = A[i,j] - min(W)
For an easier understanding of the issue, I attached a picture (Example):
My current code is this:
def res_array(matrix, size):
    result = []
    sc.generic_filter(matrix, nothing, size, extra_arguments=(result,), mode = 'nearest')
    mat_out = result
    return mat_out
def local(window):
    H = np.empty_like(window)
    w = res_array(window, 3)
    win_min = np.apply_along_axis(min, 1, w)
    # This is where I think it's broken
    for k in win_min:
        for i in range(window.shape[0]):
            for j in range(window.shape[1]):
                H[i, j] = window[i,j] - k
                k += 1
    return H
def nothing(window, out):
    list = []
    for i in range(window.shape[0]):
        list.append(window[i])
    out.append(list)
    return 0
test = np.ones((10, 10)) * np.arange(10)
a = local(test)
test = np.ones((10, 10)) * np.arange(10)
a = local(test)
I need the code to pass to the next value in 'for k in win_min', for each cell of the input matrix A, or test.
Edit: I thought of something like directly accessing the index of the 'win_min', and increment by one, like I saw here: Increment the value inside a list element, but I don't know how to do that.
Thanks for any help!
import numpy as np
N=4 #matrix size
a=np.random.random((N,N)) #input
#--window size
wl=1 #left
wr=1 #right
wt=1 #top
wb=1 #bottom
#---
H=np.zeros((N,N)) #output
def h(k,l): #individual cell function
    #--- clamp the window so it does not run off the array
    k1=max(k-wt,0)
    k2=min(k+wb+1,N)
    l1=max(l-wl,0)
    l2=min(l+wr+1,N) #+1 because the slice end is exclusive
    #---
    return a[k,l]-np.amin(a[k1:k2,l1:l2])
H=np.array([[h(k,l) for l in range(N)] for k in range(N)]) #running over all matrix elements
print(a)
print(H)
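For larger arrays the per-cell Python loop can get slow. Below is a possible vectorized sketch of the same idea for a square, symmetric window. It assumes NumPy 1.20 or newer (for sliding_window_view), and the name `local_min_diff` and its `half` parameter are invented for this example. Padding with edge values does not change any window minimum, because the replicated values are already inside the truncated window.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_min_diff(a, half=1):
    """Return a minus the minimum of the (2*half+1)-square window around each cell."""
    size = 2 * half + 1
    # replicate the border so every cell has a full window
    padded = np.pad(a, half, mode='edge')
    # shape (rows, cols, size, size): one window per original cell
    windows = sliding_window_view(padded, (size, size))
    return a - windows.min(axis=(-2, -1))

test = np.ones((10, 10)) * np.arange(10)
H = local_min_diff(test)
print(H)
```

For the test matrix from the question, every row of H should be 0 in the first column and 1 elsewhere, since each window's minimum is the value one column to the left.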
I am trying to generate a random array of 0s and 1s, and I am getting the error: shape mismatch: objects cannot be broadcast to a single shape. The error seems to be occurring in the line randints = np.random.binomial(1,p,(n,n)). Here is the function:
import numpy as np
def rand(n,p):
    '''Generates a random array subject to parameters p and N.'''
    # Create an array using a random binomial with one trial and specified
    # parameters.
    randints = np.random.binomial(1,p,(n,n))
    # Use nested while loops to go through each element of the array
    # and assign True to 1 and False to 0.
    i = 0
    j = 0
    rand = np.empty(shape = (n,n),dtype = bool)
    while i < n:
        while j < n:
            if randints[i][j] == 0:
                rand[i][j] = False
            if randints[i][j] == 1:
                rand[i][j] = True
            j = j+1
        i = i +1
        j = 0
    # Return the new array.
    return rand
print rand
When I run it by itself, it returns <function rand at 0x1d00170>. What does this mean? How should I convert it to an array that can be worked with in other functions?
You needn't go through all of that,
randints = np.random.binomial(1,p,(n,n))
produces your array of 0 and 1 values,
rand_true_false = randints == 1
will produce another array, just with the 1s replaced with True and 0s with False.
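Put together, the two lines above might be wrapped like this. It is a sketch, and the name `rand_bool` is made up here to avoid reusing `rand` for both the function and the array. Note that printing a function's name without calling it shows the function object (something like <function rand at 0x...>); the function has to be called with arguments to get the array back.

```python
import numpy as np

def rand_bool(n, p):
    """Hypothetical helper: n-by-n boolean array, each entry True with probability p."""
    randints = np.random.binomial(1, p, (n, n))  # array of 0s and 1s
    return randints == 1                         # elementwise comparison gives bools

arr = rand_bool(5, 0.3)   # call the function to get the array
print(arr.dtype, arr.shape)
```

The returned value is an ordinary NumPy array and can be passed to other functions directly.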
Obviously, the answer by @danodonovan is the most Pythonic, but if you really want something closer to your looping code, here is an example that fixes the name conflicts and loops more simply.
import numpy as np
def my_rand(n,p):
    '''Generates a random array subject to parameters p and N.'''
    # Create an array using a random binomial with one trial and specified
    # parameters.
    randInts = np.random.binomial(1,p,(n,n))
    # Use nested for loops to go through each element of the array
    # and assign True to 1 and False to 0.
    randBool = np.empty(shape = (n,n),dtype = bool)
    for i in range(n):
        for j in range(n):
            if randInts[i][j] == 0:
                randBool[i][j] = False
            else:
                randBool[i][j] = True
    return randBool
newRand = my_rand(5,0.3)
print(newRand)